Data is proliferating inside the enterprise, and employees are using more applications than ever before to get their jobs done. In fact, according to Okta Inc., the number of software apps deployed by large firms across all industries worldwide has increased 68%, reaching an average of 129 apps per company.
As employees continue to self-serve and the number of applications they use grows, so will the likelihood that critical business information will remain hard to find or get lost between systems, negatively impacting workforce productivity and operating costs.
Amazon Kendra is an intelligent search service powered by machine learning (ML). Unlike conventional search technologies, Amazon Kendra reimagines search by unifying unstructured data across multiple data sources as part of a single searchable index. Its deep learning and natural language processing capabilities then make it easy for you to get relevant answers when you need them.
Amazon Kendra Enterprise Edition includes storage capacity for 500,000 documents (150 GB of storage) and a query capacity of 40,000 queries per day (0.5 queries per second), and allows you to adjust index capacity by increasing or decreasing your query and storage capacity units as needed.
However, usage patterns and business needs are not always predictable. In this post, we demonstrate how you can automatically scale your Amazon Kendra index on a time schedule using Amazon EventBridge and AWS Lambda. By doing this, you can increase capacity for peak usage, avoid service throttling, maintain flexibility, and control costs.
Amazon Kendra provides a dashboard that allows you to evaluate the average number of queries per second for your index. With this information, you can estimate the number of additional capacity units your workload requires at a specific point in time.
For example, the following graph shows that during business hours, a surge occurs in the average queries per second, but after hours, the number of queries reduces. We base our solution on this pattern to set up an EventBridge scheduled event that triggers the automatic scaling Lambda function.
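The estimate described above can be sketched as simple arithmetic. This is a minimal sketch, not part of the deployed solution: the base figure of 0.5 queries per second comes from this post, while the throughput added by each additional query capacity unit (`QPS_PER_EXTRA_UNIT`) is an assumption you should check against the current Amazon Kendra documentation. The function name is hypothetical.

```python
import math

# Base query throughput of an Enterprise Edition index, as quoted in this post.
BASE_QPS = 0.5
# ASSUMPTION: each additional query capacity unit adds the same throughput
# as the base index. Verify this value before relying on the result.
QPS_PER_EXTRA_UNIT = 0.5


def additional_query_units(peak_qps: float) -> int:
    """Return how many extra query capacity units are needed to cover peak_qps."""
    shortfall = peak_qps - BASE_QPS
    if shortfall <= 0:
        # The base capacity already covers the peak.
        return 0
    # Round up: capacity units are whole numbers.
    return math.ceil(shortfall / QPS_PER_EXTRA_UNIT)


# Peak value read off the dashboard graph (hypothetical number).
print(additional_query_units(1.2))
```

Under these assumptions, a peak of 1.2 queries per second leaves a shortfall of 0.7, which rounds up to two additional units.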
The following diagram illustrates our architecture.
You can deploy the solution into your account two different ways:
The scaling logic runs in a Lambda function that uses the Python runtime (for this post, we use the Python 3.8 runtime).
Use the following code as the content of your lambda_function.py file:
#
# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this
# software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#
'''
Changes the number of Amazon Kendra Enterprise Edition index capacity units

Parameters
----------
event : dict
    Lambda event

Returns
-------
The additional capacity action or an error
'''

import json
import boto3
from botocore.exceptions import ClientError

# Variable declaration
KENDRA = boto3.client("kendra")

# Define your Amazon Kendra Enterprise Edition index ID
INDEX_ID = "
You must modify the following variables to match with your environment:
# Define your Amazon Kendra Enterprise Edition index ID
INDEX_ID = "
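As an illustration of the describe-then-update pattern the function relies on, here is a minimal sketch. It is not the original function: `scale_query_capacity` and its parameters are hypothetical names, and the target number of units is passed in explicitly as an assumption. It uses the boto3 Kendra client calls `describe_index` and `update_index`, and takes the client as an argument so the logic can be exercised without AWS credentials.

```python
def scale_query_capacity(client, index_id, target_query_units):
    """Set the index's additional query capacity units to target_query_units.

    client is expected to be a boto3 Kendra client, e.g. boto3.client("kendra").
    Returns a short string describing the action taken.
    """
    index = client.describe_index(Id=index_id)
    status = index["Status"]
    units = index["CapacityUnits"]
    current = units["QueryCapacityUnits"]
    print(f"Current capacity units are: {current}")
    print(f"Current index status is: {status}")

    # Skip the update unless the index is ACTIVE; an index that is still
    # being created or updated may reject the call.
    if status != "ACTIVE":
        return f"skipped: index is {status}"
    if current == target_query_units:
        return "no change"

    client.update_index(
        Id=index_id,
        CapacityUnits={
            # Storage capacity is passed through unchanged; only query
            # capacity follows the schedule.
            "StorageCapacityUnits": units["StorageCapacityUnits"],
            "QueryCapacityUnits": target_query_units,
        },
    )
    return f"updated: {current} -> {target_query_units}"
```

Inside a Lambda handler, you would create the client once with `boto3.client("kendra")` and call this helper with your own index ID. Wrapping the call in a `try`/`except ClientError` block lets the function return an error message instead of failing.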
To query the status of your index and modify the number of query capacity units, you need to attach a policy with those permissions to your Lambda function's AWS Identity and Access Management (IAM) execution role.
The IAM console opens automatically.
A new tab opens.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "MyPolicy",
            "Effect": "Allow",
            "Action": [
                "kendra:UpdateIndex",
                "kendra:DescribeIndex"
            ],
            "Resource": "arn:aws:kendra:
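If you prefer to generate the policy document rather than edit it by hand, a small helper can fill in the resource ARN. This is a sketch under assumptions: the function name is hypothetical, the Region, account ID, and index ID shown are placeholders, and the ARN layout `arn:aws:kendra:<region>:<account-id>:index/<index-id>` is the usual Amazon Kendra index ARN format, which you should confirm for your own index in the console.

```python
import json


def kendra_scaling_policy(region, account_id, index_id):
    """Build an IAM policy document allowing DescribeIndex and UpdateIndex
    on a single Amazon Kendra index."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "MyPolicy",
                "Effect": "Allow",
                "Action": ["kendra:UpdateIndex", "kendra:DescribeIndex"],
                # ASSUMPTION: standard Kendra index ARN layout.
                "Resource": f"arn:aws:kendra:{region}:{account_id}:index/{index_id}",
            }
        ],
    }


# Placeholder values for illustration only.
policy = kendra_scaling_policy("us-east-1", "123456789012", "example-index-id")
print(json.dumps(policy, indent=4))
```

Scoping the resource to one index keeps the execution role from modifying any other index in the account.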
You can test your Lambda function by running a test event. For more information, see Invoke the Lambda function.
On the Lambda function logs, you can see the following messages:
Function Logs
START RequestId: 9b2382b7-0229-4b2b-883e-ba0f6b149513 Version: $LATEST
Checking for capacity units……
Current capacity units are: 1
Current index status is: ACTIVE
Adding capacity…
An EventBridge scheduled event is an EventBridge event that is triggered on a regular schedule. This section shows how to create an EventBridge scheduled event that runs every day at 7 AM UTC and at 8 PM UTC to trigger the kendra-index-scaler Lambda function. This allows your index to scale up with the additional query capacity units at 7 AM and scale down at 8 PM.
EventBridge scheduled events are defined in the UTC time zone, so you need to calculate the offset from your local time. For example, to run the event at 7 AM Central Standard Time (CST), you need to set the time to 1 PM UTC. To account for daylight saving time, you have to create a different rule that reflects the changed offset.
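The offset calculation can be checked with a few lines of Python. This is a minimal sketch that uses fixed offsets (UTC-6 for CST, UTC-5 for Central Daylight Time); the helper name `utc_hour` is hypothetical and the calendar date is arbitrary, because only the hour matters here.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Central time: standard and daylight saving.
CST = timezone(timedelta(hours=-6), "CST")
CDT = timezone(timedelta(hours=-5), "CDT")


def utc_hour(local_hour, tz):
    """Return the UTC hour (0-23) that corresponds to local_hour in tz."""
    local = datetime(2021, 1, 15, local_hour, 0, tzinfo=tz)  # arbitrary date
    return local.astimezone(timezone.utc).hour


print(utc_hour(7, CST))  # 13, that is 1 PM UTC
print(utc_hour(7, CDT))  # 12, that is 12 PM UTC during daylight saving time
```

The one-hour difference between the two results is why a separate rule is needed while daylight saving time is in effect.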
We use this cron expression to trigger the EventBridge event every day at 7 AM and 8 PM.
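For reference, a schedule that fires every day at 7 AM and 8 PM UTC can be written in the six-field EventBridge cron syntax (minutes, hours, day of month, month, day of week, year). The helper below is an illustrative sketch with a hypothetical name, and the expression it produces is a reconstruction from the times stated in this post rather than a copy of the original rule.

```python
def daily_cron(hours, minute=0):
    """Build an EventBridge cron expression that fires every day at the
    given UTC hours, at the given minute past the hour."""
    if not hours:
        raise ValueError("at least one hour is required")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    for hour in hours:
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range: {hour}")
    hour_field = ",".join(str(h) for h in sorted(set(hours)))
    # Day-of-week is "?" because day-of-month is already "*"; EventBridge
    # requires one of those two fields to be "?".
    return f"cron({minute} {hour_field} * * ? *)"


print(daily_cron([7, 20]))  # cron(0 7,20 * * ? *)
```

Because a single rule fires at both times, the Lambda function itself has to decide whether to add or remove capacity based on the index's current capacity units.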
You can check Amazon CloudWatch Logs for the lambda_function_kendra_index_handler function and see how it behaves depending on your index’s query capacity units.
In this post, you deployed a mechanism to automatically scale additional query processing units for your Amazon Kendra Enterprise Edition index.
As a next step, you could periodically review your usage patterns in order to plan the schedule to accommodate your query volume. To learn more about Amazon Kendra’s use cases, benefits, and how to get started with it, visit the webpage!
Juan Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.
Tapodipta Ghosh is a Senior Architect. He leads the Content And Knowledge Engineering Machine Learning team that focuses on building models related to AWS Technical Content. He also helps our customers with AI/ML strategy and implementation using our AI Language services like Amazon Kendra.
Tom McMahon is a Product Marketing Manager on the AI Services team at AWS. He’s passionate about technology and storytelling and has spent time across a wide range of industries including healthcare, retail, logistics, and ecommerce. In his spare time he enjoys spending time with family, music, playing golf, and exploring the amazing Pacific Northwest and its surrounds.