Automatically scale Amazon Kendra query capacity units with Amazon EventBridge and AWS Lambda

Data is proliferating inside the enterprise, and employees are using more applications than ever before to get their jobs done. In fact, according to Okta Inc., the number of software apps deployed by large firms across all industries worldwide has increased 68%, reaching an average of 129 apps per company.

As employees continue to self-serve and the number of applications they use grows, so will the likelihood that critical business information will remain hard to find or get lost between systems, negatively impacting workforce productivity and operating costs.

Amazon Kendra is an intelligent search service powered by machine learning (ML). Unlike conventional search technologies, Amazon Kendra reimagines search by unifying unstructured data across multiple data sources as part of a single searchable index. Its deep learning and natural language processing capabilities then make it easy for you to get relevant answers when you need them.

Amazon Kendra Enterprise Edition includes storage capacity for 500,000 documents (150 GB of storage) and a query capacity of 40,000 queries per day (0.5 queries per second), and allows you to adjust index capacity by increasing or decreasing your query and storage capacity units as needed.

However, usage patterns and business needs are not always predictable. In this post, we demonstrate how you can automatically scale your Amazon Kendra index on a time schedule using Amazon EventBridge and AWS Lambda. By doing this, you can increase capacity for peak usage, avoid service throttling, maintain flexibility, and control costs.

Solution overview

Amazon Kendra provides a dashboard that allows you to evaluate the average number of queries per second for your index. With this information, you can estimate the number of additional capacity units your workload requires at a specific point in time.
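As a rough sketch of that estimate, the following helper turns a peak queries-per-second figure read from the dashboard into a number of additional query capacity units. It uses the 0.5 queries per second of included capacity stated earlier, and it assumes that each additional unit contributes the same 0.5 queries per second; that per-unit figure is an assumption, so check the current Amazon Kendra documentation before relying on it. The function name and parameters are illustrative and are not part of the solution.

```python
import math


def additional_units_needed(peak_qps, base_qps=0.5, qps_per_unit=0.5):
    """Estimate the extra query capacity units needed to serve peak_qps.

    base_qps: capacity included with the index (0.5 per this post).
    qps_per_unit: ASSUMPTION, capacity added by one additional unit.
    """
    if peak_qps <= base_qps:
        # The included capacity already covers the peak
        return 0
    # Round up: a fraction of a unit cannot be provisioned
    return math.ceil((peak_qps - base_qps) / qps_per_unit)


if __name__ == "__main__":
    for qps in (0.3, 1.2, 2.0):
        print(qps, "->", additional_units_needed(qps))
```

Rounding up errs on the side of avoiding throttling; if cost matters more than occasional throttling, you might round down instead.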

For example, the following graph shows that during business hours, a surge occurs in the average queries per second, but after hours, the number of queries reduces. We base our solution on this pattern to set up an EventBridge scheduled event that triggers the automatic scaling Lambda function.

The following diagram illustrates our architecture.

You can deploy the solution into your account in two different ways:

  • Deploy an AWS Serverless Application Model (AWS SAM) template:
    • Clone the project from the aws-samples repository on GitHub and follow the instructions.
  • Create the resources by using the AWS Management Console. In this post, we walk you through the following steps:
    • Set up the Lambda function for scaling
    • Configure permissions for the function
    • Test the function
    • Set up an EventBridge scheduled event

Set up the Lambda function

Create the Lambda function that we use for scaling with a Python runtime (for this post, we use the Python 3.8 runtime).

Use the following code as the content of your function:

#
# Copyright 2020, Inc. or its affiliates. All Rights Reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this
# software and associated documentation files (the "Software"), to deal in the Software
# without restriction, including without limitation the rights to use, copy, modify,
# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#
'''
Changes the number of Amazon Kendra Enterprise Edition index capacity units

    Parameters
    ----------
        event : dict
            Lambda event

    Returns
    -------
        The additional capacity action or an error
'''
import json
import boto3
from botocore.exceptions import ClientError

# Variable declaration
KENDRA = boto3.client("kendra")
# Define your Amazon Kendra Enterprise Edition index ID
INDEX_ID = ""
# Define your baseline units
DEFAULT_UNITS = 0
# Define the number of query capacity units needed for increased capacity
ADDITIONAL_UNITS = 1


def add_capacity(index_id, capacity_units):
    # Scale the index up to the requested number of query capacity units.
    # StorageCapacityUnits is set to 0 here; change it if your index
    # uses additional storage capacity units.
    try:
        response = KENDRA.update_index(
            Id=index_id,
            CapacityUnits={
                'QueryCapacityUnits': int(capacity_units),
                'StorageCapacityUnits': 0
            })
        return response
    except ClientError as e:
        print("Unable to add capacity: " + str(e))
        raise


def reset_capacity(index_id, default_units):
    # Scale the index back down to the baseline query capacity units
    try:
        response = KENDRA.update_index(
            Id=index_id,
            CapacityUnits={
                'QueryCapacityUnits': int(default_units),
                'StorageCapacityUnits': 0
            })
        # Return the response so the handler can log and return it
        return response
    except ClientError as e:
        print("Unable to reset capacity: " + str(e))
        raise


def current_capacity(index_id):
    # Read the current status and capacity units of the index
    try:
        response = KENDRA.describe_index(Id=index_id)
        return response
    except ClientError as e:
        print("Unable to describe index: " + str(e))
        raise


def lambda_handler(event, context):
    print("Checking for query capacity units……")
    response = current_capacity(INDEX_ID)
    currentunits = response['CapacityUnits']['QueryCapacityUnits']
    print("Current query capacity units are: " + str(currentunits))
    status = response['Status']
    print("Current index status is: " + status)
    # If the index is in the UPDATING state, don't attempt to change the capacity
    if status == "UPDATING":
        return "Index is currently being updated. No changes have been applied"
    if status == "ACTIVE":
        # Compare against the baseline rather than a hard-coded 0 so that
        # a nonzero DEFAULT_UNITS still toggles between the two levels
        if currentunits == DEFAULT_UNITS:
            print("Adding query capacity…")
            response = add_capacity(INDEX_ID, ADDITIONAL_UNITS)
            print(response)
            return response
        else:
            print("Removing query capacity….")
            response = reset_capacity(INDEX_ID, DEFAULT_UNITS)
            print(response)
            return response
    else:
        response = "Index is not ready to modify capacity. No changes have been applied."
        return response

You must modify the following variables to match your environment:

# Define your Amazon Kendra Enterprise Edition index ID
INDEX_ID = ""
# Define your baseline units
DEFAULT_UNITS = 1
# Define the number of query capacity units needed for increased capacity
ADDITIONAL_UNITS = 4

  • INDEX_ID – The ID for your index; you can check it on the Amazon Kendra console.
  • DEFAULT_UNITS – The number of additional query capacity units that your Amazon Kendra Enterprise Edition index requires to operate at minimum capacity. This number can range from 0–20 (you can request more capacity). A value of 0 means that no extra capacity units are provisioned for your index, which leaves it with the default capacity of 0.5 queries per second.
  • ADDITIONAL_UNITS – The number of query capacity units you need during periods that require additional capacity. This value can range from 1–20 (you can request additional capacity).
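The way these variables drive the scaling decision can be sketched as a pure function that needs no AWS access, which makes the toggle easy to reason about and to unit test. The function name is hypothetical. This sketch compares the current units with the baseline (DEFAULT_UNITS), which is an assumption about the intended behavior when the baseline is not 0.

```python
def target_query_units(status, current_units, default_units, additional_units):
    """Return the query capacity units to request, or None to leave the index alone."""
    if status != "ACTIVE":
        # The index is updating or otherwise not ready: make no changes
        return None
    if current_units == default_units:
        # At the baseline: scale up
        return additional_units
    # Any other value: scale back down to the baseline
    return default_units


if __name__ == "__main__":
    print(target_query_units("ACTIVE", 1, 1, 4))    # scale up
    print(target_query_units("ACTIVE", 4, 1, 4))    # scale down
    print(target_query_units("UPDATING", 1, 1, 4))  # no change
```

Because the decision depends only on the current state of the index, each scheduled invocation flips the index between the two capacity levels. If a scheduled run is skipped (for example, because the index was updating), the following runs are inverted until you correct the capacity manually.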

Configure function permissions

To query the status of your index and to modify the number of query capacity units, you need to attach a policy with those permissions to your Lambda function's AWS Identity and Access Management (IAM) execution role.

  1. On the Lambda console, navigate to your function.
  2. On the Permissions tab, choose the execution role.

The IAM console opens automatically.

  3. On the Permissions tab, choose Attach policies.
  4. Choose Create policy.

A new tab opens.

  5. On the JSON tab, add the following content (make sure to provide your Region, account ID, and index ID in the resource ARN):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "MyPolicy",
            "Effect": "Allow",
            "Action": [
                "kendra:UpdateIndex",
                "kendra:DescribeIndex"
            ],
            "Resource": "arn:aws:kendra:::index/"
        }
    ]
}

  6. Choose Next: Tags.
  7. Choose Next: Review.
  8. For Name, enter a policy name (for this post, we use AmazonKendra_UpdateIndex).
  9. Choose Create policy.
  10. On the Attach permissions page, choose the refresh icon.
  11. Filter to find the policy you created.
  12. Select the policy and choose Attach policy.
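If you prefer to generate the policy document rather than edit it by hand, the following sketch builds the same statement with the resource ARN filled in. The helper names are hypothetical, and the Region, account ID, and index ID in the usage example are placeholders, not real identifiers.

```python
import json


def kendra_index_arn(region, account_id, index_id):
    """Build the ARN of an Amazon Kendra index."""
    return f"arn:aws:kendra:{region}:{account_id}:index/{index_id}"


def build_scaling_policy(region, account_id, index_id):
    """Return the policy document that lets the function read and update one index."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "MyPolicy",
                "Effect": "Allow",
                "Action": ["kendra:UpdateIndex", "kendra:DescribeIndex"],
                "Resource": kendra_index_arn(region, account_id, index_id),
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only
    policy = build_scaling_policy("us-east-1", "123456789012", "example-index-id")
    print(json.dumps(policy, indent=4))
```

Scoping the resource to a single index keeps the function from modifying any other index in the account.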

Test the function

You can test your Lambda function by running a test event. For more information, see Invoke the Lambda function.

  1. On the Lambda console, navigate to your function.
  2. Create a new test event by choosing Test.
  3. Select Create new test event.
  4. For Event template, because your function doesn’t require any input from the event, you can choose the hello-world event template.
  5. Choose Create.
  6. Choose Test.

In the Lambda function logs, you can see the following messages:

Function Logs
START RequestId: 9b2382b7-0229-4b2b-883e-ba0f6b149513 Version: $LATEST
Checking for capacity units……
Current capacity units are: 1
Current index status is: ACTIVE
Adding capacity…
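You can also run the same test from a script. The following is a minimal sketch that invokes the function synchronously with an empty event (the function ignores its input) and returns the decoded result. The helper name is hypothetical, and the Lambda client is passed in as a parameter so the sketch can be exercised without AWS credentials.

```python
import json


def invoke_scaler(lambda_client, function_name):
    """Invoke the scaling function synchronously and return its decoded result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function to finish
        Payload=json.dumps({}).encode("utf-8"),
    )
    # The Payload is a stream containing the JSON-encoded return value
    return json.loads(response["Payload"].read())


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   print(invoke_scaler(boto3.client("lambda"),
#                       "lambda_function_kendra_index_handler"))
```

Keep in mind that each invocation toggles the capacity of the index, so a test run changes the state that the next scheduled run sees.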

Set up an EventBridge scheduled event

An EventBridge scheduled event is an EventBridge event that is triggered on a regular schedule. This section shows how to create an EventBridge scheduled event that runs every day at 7 AM UTC and at 8 PM UTC to trigger the scaling Lambda function. This allows your index to scale up with the additional query capacity units at 7 AM and scale down at 8 PM.

When you set up EventBridge scheduled events, you do so for the UTC time zone, so you need to calculate the time offset. For example, to run the event at 7 AM Central Standard Time (CST), you need to set the time to 1 PM UTC. If you want to account for daylight saving time, you have to create a different rule that covers the difference.
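The offset arithmetic can be sketched as follows. The helper names are hypothetical, and the sketch uses fixed UTC offsets (CST is UTC-6, Central Daylight Time is UTC-5) instead of a time zone database, so you choose the offset that applies to the period the rule covers.

```python
def to_utc_hour(local_hour, utc_offset_hours):
    """Convert a local wall-clock hour to the UTC hour for a fixed UTC offset."""
    return (local_hour - utc_offset_hours) % 24


def daily_cron(utc_hours, minute=0):
    """Build a cron expression that fires every day at the given UTC hours."""
    hours = ",".join(str(h) for h in sorted(set(utc_hours)))
    return f"{minute} {hours} * * ? *"


CST = -6  # Central Standard Time, UTC-6
CDT = -5  # Central Daylight Time, UTC-5

if __name__ == "__main__":
    # 7 AM and 8 PM local time, expressed in UTC
    print(daily_cron([to_utc_hour(7, CST), to_utc_hour(20, CST)]))  # 0 2,13 * * ? *
    print(daily_cron([to_utc_hour(7, CDT), to_utc_hour(20, CDT)]))  # 0 1,12 * * ? *
```

Note that 8 PM CST falls on the next calendar day in UTC (2 AM). For a rule that runs every day this makes no difference, but it matters if you restrict the rule to specific days of the week.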

  1. On the EventBridge console, in the navigation pane, under Events, choose Rules.
  2. Choose Create rule.
  3. For Name, enter a name for your rule (for this post, we use kendra-index-scaler).
  4. In the Define pattern section, select Schedule.
  5. Select Cron expression and enter 0 7,20 * * ? *.

We use this cron expression to trigger the EventBridge event every day at 7 AM and 8 PM UTC.

  6. In the Select event bus section, select AWS default event bus.
  7. In the Select targets section, for Target, choose Lambda function.
  8. For Function, enter the function you created earlier (lambda_function_kendra_index_handler).
  9. Choose Create.
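The console steps above can also be sketched with the AWS SDK for Python (Boto3). This is an illustrative outline, not the solution's deployment code: the helper name and the statement ID are hypothetical, and the clients are passed in as parameters so the sketch can be exercised without AWS credentials. Unlike the console, the API does not grant EventBridge permission to invoke the function automatically, so the sketch adds that permission explicitly.

```python
def schedule_scaler(events_client, lambda_client, function_arn,
                    rule_name="kendra-index-scaler",
                    schedule="cron(0 7,20 * * ? *)"):
    """Create the scheduled rule, allow it to invoke the function, and attach the target."""
    # Create (or update) the rule on the default event bus
    rule_arn = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
    )["RuleArn"]
    # Allow EventBridge to invoke the function on behalf of this rule
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",  # hypothetical statement ID
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    # Point the rule at the Lambda function
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule_arn


# Usage (requires boto3 and AWS credentials; the function ARN is yours to supply):
#   import boto3
#   schedule_scaler(boto3.client("events"), boto3.client("lambda"), function_arn)
```

Through the API, the schedule expression is wrapped in cron(...), whereas the console field takes only the six cron fields.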

You can check Amazon CloudWatch Logs for the lambda_function_kendra_index_handler function and see how it behaves depending on your index’s query capacity units.


In this post, you deployed a mechanism to automatically scale additional query capacity units for your Amazon Kendra Enterprise Edition index.

As a next step, you could periodically review your usage patterns in order to plan the schedule to accommodate your query volume. To learn more about Amazon Kendra’s use cases, benefits, and how to get started with it, visit the webpage!

About the Authors

Juan Bustos is an AI Services Specialist Solutions Architect at Amazon Web Services, based in Dallas, TX. Outside of work, he loves spending time writing and playing music as well as trying random restaurants with his family.




Tapodipta Ghosh is a Senior Architect. He leads the Content And Knowledge Engineering Machine Learning team that focuses on building models related to AWS Technical Content. He also helps our customers with AI/ML strategy and implementation using our AI Language services like Amazon Kendra.




Tom McMahon is a Product Marketing Manager on the AI Services team at AWS. He’s passionate about technology and storytelling and has spent time across a wide range of industries including healthcare, retail, logistics, and ecommerce. In his spare time he enjoys spending time with family, music, playing golf, and exploring the amazing Pacific Northwest and its surrounds.