[]Amazon Kendra is an intelligent search service powered by machine learning (ML). It indexes the documents stored in a wide range of repositories and finds the most relevant document based on the keywords or natural language questions the user has searched for. In some scenarios, you need the search results to be filtered based on the context of the user making the search. Additional refinement is needed to find the documents specific to that user or user group as the top search result.
[]In this blog post, we focus on retrieving custom search results that apply to a specific user or user group. For instance, faculty members in an educational institution belong to different departments. If a professor from the computer science department signs in to the application and searches with the keywords “faculty courses,” then documents relevant to that department should come up as the top results, based on data source availability.
[]To solve this problem, you can identify one or more metadata attributes that are associated with the documents being indexed and searched. When the user signs in to an Amazon Lex chatbot, user context information can be derived from Amazon Cognito. The Amazon Lex chatbot can be integrated with Amazon Kendra using a direct integration or via an AWS Lambda function. Using an AWS Lambda function gives you fine-grained control of the Amazon Kendra API calls, which allows you to pass contextual information from the Amazon Lex chatbot to Amazon Kendra to fine-tune the search queries.
[]In Amazon Kendra, you provide document metadata attributes using custom attributes. To customize the document metadata during the ingestion process, refer to the Amazon Kendra Developer Guide. After completing the document metadata generation and indexing steps, you need to focus on refining the search results using the metadata attributes. Based on this, for example, you can ensure that users from the computer science department get search results ranked according to their relevance to the department. That is, if there’s a document relevant to that department, it should be at the top of the search-result list, preceding any document that has no department information or a nonmatching department.
[]Let’s now explore how to build this solution in more detail.
[]Figure 1: Architecture diagram of proposed solution
[]The sample architecture used in this blog to demonstrate the use case is shown in Figure 1. You will set up an Amazon Kendra document index that consumes data from an Amazon Simple Storage Service (Amazon S3) bucket. You will set up a simple chatbot using Amazon Lex that will connect to the Amazon Kendra index via an AWS Lambda function. Users will rely on Amazon Cognito to authenticate and gain access to the Amazon Lex chatbot user interface. For the purposes of the demo, you will have two different users in Amazon Cognito belonging to two different departments. Using this setup, when you sign in as User 1 in Department A, search results will be filtered to documents belonging to Department A, and likewise for users in Department B.
[]Before you can try to integrate the Amazon Lex chatbot with an Amazon Kendra index, you need to set up the basic building blocks for the solution. At a high level, you need to perform the following steps to enable this demo:
{
  "DocumentId": "Faculty Certification Course-Computer Science",
  "Attributes": {
    "_category": "dosanddonts",
    "department": "Computer Science",
    "document_type": "job aid"
  },
  "Title": "Faculty Certification Course-Computer Science",
  "ContentType": "PDF"
}
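If you have many documents, a metadata file like the one above could be generated rather than written by hand. The following is a minimal sketch, assuming the attribute names from the sample metadata; the helper names and the document file name `faculty_course_cs.pdf` are hypothetical, and the `.metadata.json` suffix follows the naming convention used by the Amazon Kendra S3 data source.

```python
import json


def build_metadata(document_id, title, department, document_type, category):
    """Return the metadata for one document as a plain dict (hypothetical helper)."""
    return {
        "DocumentId": document_id,
        "Attributes": {
            "_category": category,
            "department": department,
            "document_type": document_type,
        },
        "Title": title,
        "ContentType": "PDF",
    }


def metadata_file_name(document_file_name):
    """Name the metadata file after the document it describes."""
    return f"{document_file_name}.metadata.json"


metadata = build_metadata(
    document_id="Faculty Certification Course-Computer Science",
    title="Faculty Certification Course-Computer Science",
    department="Computer Science",
    document_type="job aid",
    category="dosanddonts",
)

# Serialize the metadata; the resulting text would be uploaded next to the document.
metadata_json = json.dumps(metadata, indent=2)
target_name = metadata_file_name("faculty_course_cs.pdf")  # hypothetical file name
```

Keeping the attribute values in one place like this makes it easier to keep the “department” values consistent with the custom attribute stored for each Amazon Cognito user.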
[]Once the basic building blocks are in place, your next step will be to create the AWS Lambda function that will tie together the Amazon Lex chatbot intent fulfillment with the Amazon Kendra index. The rest of this blog will specifically focus on this step and provide details on how to achieve this integration.
[]Now that the prerequisites are in place, you can start working on integrating your Amazon Lex chatbot with the Amazon Kendra index. As part of the integration, you will need to perform the following tasks:
[]Let’s look at these steps in more detail below.
[]The first thing you need to do is write and set up the Lambda function that acts as a bridge between the Amazon Lex chatbot intent and the Amazon Kendra index. The input event format documentation describes the full structure of the JavaScript Object Notation (JSON) input event. If the authentication system provides the user ID as an HTTP POST request to Amazon Lex, then the value will be available in the “userId” key of the JSON object. When the authentication is performed using Amazon Cognito, the “sessionState”.“sessionAttributes”.“idtokenjwt” key will contain a JSON Web Token (JWT). If you are programming the AWS Lambda function in Python, the two lines of code to read the attributes from the event object will be as follows:
userid = event['userId']
token = event['sessionState']['sessionAttributes']['idtokenjwt']
[]The JWT is encoded. Once you’ve decoded it, you will be able to read the value of the custom attribute associated with the Amazon Cognito user. Refer to How can I decode and verify the signature of an Amazon Cognito JSON Web Token to understand how to decode the JWT, verify it, and retrieve the custom values. Once you have the claims from the token, you can extract a custom attribute, such as “department”, in Python as follows:
userDept = claims['custom:department']
[]When using a third-party identity provider (IDP) to authenticate against the chatbot, you need to ensure that the IDP sends a token with the required attributes. The token should include the data needed for the custom attributes, such as department and group memberships. This will be passed to the Amazon Lex chatbot in the session context variables. If you are using the lex-web-ui as the chatbot interface, then refer to the credential management section of the lex-web-ui readme documentation to understand how Amazon Cognito is integrated with lex-web-ui. To understand how you can integrate third-party identity providers with an Amazon Cognito identity pool, refer to the documentation on Identity pools (federated identities) external identity providers.
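To illustrate what the claims look like once the token is decoded, here is a minimal sketch that reads the payload segment of a JWT using only the Python standard library. It deliberately does not verify the signature, so it is suitable only for exploring the token structure; a real Lambda function should verify the token as described in the guidance referenced above. The helper names and the sample token are hypothetical.

```python
import base64
import json


def decode_jwt_claims_unverified(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected header, payload, and signature segments")
    payload = segments[1]
    # JWT segments are base64url-encoded with the padding removed; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def get_department(claims, default=None):
    """Read the custom department attribute, falling back to a default."""
    return claims.get("custom:department", default)


def _b64url(data):
    """Encode a dict the way a JWT segment is encoded (sample data only)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# A fabricated, unsigned token that stands in for the value of 'idtokenjwt'.
sample_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"custom:department": "Computer Science"}),
    "signature-placeholder",
])

claims = decode_jwt_claims_unverified(sample_token)
userDept = get_department(claims)
```

Returning a default from `get_department` lets the function decide how to handle users whose token carries no department, for example by running an unfiltered query.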
[]You can extract the user’s query topic from the event object by reading the values of the slots identified by Amazon Lex. The value of a slot can be read from the key “sessionState”.“intent”.“slots”.“slot name”.“value”.“interpretedValue” and interpreted based on the identified data type. In the example in this blog, using Python, you could use the following line of code to read the query value:
slotValue = event['sessionState']['intent']['slots']['elective_year']['value']['interpretedValue']
[]As described in the documentation for input event format, the slots value is an object that can have multiple entries of different data types. The data type for any given value is indicated by “sessionState”.“intent”.“slots”.“slot name”.“shape”. If the attribute is empty or missing, then the data type is a string. In the example in this blog, using Python, you could use the following line of code to read the data type of the slot:
# Use .get() because the 'shape' attribute can be missing from the slot.
slotType = event['sessionState']['intent']['slots']['elective_year'].get('shape')
[]Once you know the data format for the slot, you can interpret the value of ‘slotValue’ based on the data type identified in ‘slotType’.
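The slot handling described above can be sketched as a small helper that returns the interpreted values regardless of the slot’s shape. This is a sketch under assumptions: it treats a missing shape as a single value, and it assumes that a slot whose shape is “List” carries its entries under a “values” key, as described in the Amazon Lex V2 input event format. The function name and the sample event are hypothetical.

```python
def read_slot_values(slots, slot_name):
    """Return the interpreted value(s) of a slot as a list of strings."""
    slot = (slots or {}).get(slot_name)
    if not slot:
        return []  # the slot has not been filled
    if slot.get("shape") == "List":
        # Assumed layout: list-shaped slots hold one entry per value.
        entries = slot.get("values") or []
    else:
        # Missing or scalar shape: treat the slot itself as the only entry.
        entries = [slot]
    values = []
    for entry in entries:
        interpreted = (entry.get("value") or {}).get("interpretedValue")
        if interpreted is not None:
            values.append(interpreted)
    return values


# Hypothetical event fragment with a single-value slot.
event = {
    "sessionState": {
        "intent": {
            "name": "SearchCourses",
            "slots": {
                "elective_year": {
                    "shape": "Scalar",
                    "value": {"originalValue": "second year", "interpretedValue": "2"},
                }
            },
        }
    }
}

slot_values = read_slot_values(event["sessionState"]["intent"]["slots"], "elective_year")
query_text = " ".join(slot_values)
```

Returning a list in every case means the calling code does not need separate branches for single-value and multi-value slots.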
[]Now that you’ve extracted all the relevant information from the input event object, you need to construct an Amazon Kendra query within the Lambda function. Amazon Kendra lets you filter queries via specific attributes. When you submit a query to Amazon Kendra using the Query API, you can provide a document attribute as an attribute filter so that your users’ search results will be based on values matching that filter. Filters can be logically combined when you need to query on a hierarchy of attributes. A sample filtered query will look as follows:
import boto3

kendra = boto3.client('kendra')

# query, index, and attribute_filter are placeholders for your own values;
# attribute_filter is the filter conditions object.
response = kendra.query(
    QueryText=query,
    IndexId=index,
    AttributeFilter=attribute_filter
)
[]To understand filtering queries in Amazon Kendra in more detail, you can refer to the Filtering queries topic in the AWS documentation. Based on the above query, search results from Amazon Kendra will be scoped to include only documents where the metadata attribute named in the filter, “department” in this example, matches the value provided for the filter. In Python, this will look as follows:
# kendra is a boto3 Amazon Kendra client, for example boto3.client('kendra').
response = kendra.query(
    QueryText=slotValue,
    IndexId=index_id,
    QueryResultTypeFilter='ANSWER',
    AttributeFilter={
        'AndAllFilters': [
            {'EqualsTo': {'Key': 'department', 'Value': {'StringValue': userDept}}}
        ]
    }
)
[]As highlighted earlier, please refer to the Amazon Kendra Query API documentation to understand all the attributes that can be provided in the query, including complex filter conditions for filtering the user search.
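Because filters can be combined, it can help to build the filter object in a small function instead of inline. The following is a minimal sketch that combines the department filter with an optional second condition; the function names are hypothetical, and the “document_type” attribute is taken from the sample metadata shown earlier.

```python
def equals_filter(key, value):
    """Build one EqualsTo condition for a string attribute."""
    return {"EqualsTo": {"Key": key, "Value": {"StringValue": value}}}


def build_attribute_filter(department, document_type=None):
    """Combine conditions so that a document must satisfy all of them."""
    conditions = [equals_filter("department", department)]
    if document_type is not None:
        conditions.append(equals_filter("document_type", document_type))
    return {"AndAllFilters": conditions}


# Filter on department only, then on department and document type together.
department_only = build_attribute_filter("Computer Science")
department_and_type = build_attribute_filter("Computer Science", "job aid")
```

The resulting dictionary can be passed as the `AttributeFilter` argument of the query call. Building it separately also makes the filter logic easy to unit test without calling Amazon Kendra.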
[]Upon a successful query against the Amazon Kendra index, you will receive a JSON object back as a response from the Query API. The full structure of the response object, including details of all its attributes, is listed in the Amazon Kendra Query API documentation. You can read “TotalNumberOfResults” to check the total number of results returned for the query you submitted. Do note that the SDK will only let you retrieve up to a maximum of 100 items. The query results are returned in the “ResultItems” attribute as an array of “QueryResultItem” objects. From the “QueryResultItem”, the attributes of immediate interest are “DocumentTitle”, “DocumentExcerpt”, and “DocumentURI”. In Python, you can use the code below to extract these values from the first item in “ResultItems” in the Amazon Kendra response:
docTitle = response['ResultItems'][0]['DocumentTitle']['Text']
docURI = response['ResultItems'][0]['DocumentURI']
docExcerpt = response['ResultItems'][0]['DocumentExcerpt']['Text']
[]Ideally, you should check the value of “TotalNumberOfResults” and iterate through the “ResultItems” array to retrieve all the results of interest. You then need to pack the results into a valid AWS Lambda response object to be sent to the Amazon Lex chatbot. The structure of the expected Amazon Lex v2 chatbot response is documented in the Response format section. At a minimum, you need to populate the following attributes in the response object before returning it to the chatbot:
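# Named lexResponse so that it does not overwrite the Amazon Kendra response.
lexResponse = {
    'sessionState': {
        'activeContexts': [],
        'dialogAction': {'type': 'Close'},
        'intent': {
            'name': 'SearchCourses',
            # Return the slots object received in the event, not the slot name.
            'slots': event['sessionState']['intent']['slots'],
            'state': 'Fulfilled'
        }
    }
}
# 'messages' must be a list of message objects.
lexResponse.update({
    'messages': [
        {'contentType': 'PlainText', 'content': docTitle}
    ]
})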
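The advice above to iterate through “ResultItems” rather than read only the first item can be sketched as follows. This is a minimal illustration under assumptions: the helper names, the three-result limit, the message wording, and the sample data are all hypothetical, and the code operates on plain dictionaries so that it can be tried without calling Amazon Kendra or Amazon Lex.

```python
def extract_results(kendra_response, limit=3):
    """Collect the title, URI, and excerpt of the first few result items."""
    results = []
    for item in kendra_response.get("ResultItems", [])[:limit]:
        results.append({
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "uri": item.get("DocumentURI", ""),
            "excerpt": item.get("DocumentExcerpt", {}).get("Text", ""),
        })
    return results


def build_close_response(event, results):
    """Build a Close response that echoes the incoming intent and its slots."""
    intent = dict(event["sessionState"]["intent"])  # shallow copy of name and slots
    intent["state"] = "Fulfilled"
    if results:
        content = "\n".join(f"{r['title']}: {r['uri']}" for r in results)
    else:
        content = "No matching documents were found."
    return {
        "sessionState": {
            "dialogAction": {"type": "Close"},
            "intent": intent,
        },
        "messages": [{"contentType": "PlainText", "content": content}],
    }


# Hypothetical inputs that stand in for the Lex event and the Kendra response.
sample_event = {"sessionState": {"intent": {"name": "SearchCourses", "slots": {}}}}
sample_kendra_response = {
    "TotalNumberOfResults": 1,
    "ResultItems": [{
        "DocumentTitle": {"Text": "Faculty Certification Course-Computer Science"},
        "DocumentURI": "https://example.com/faculty-course.pdf",
        "DocumentExcerpt": {"Text": "Certification steps for faculty."},
    }],
}

lex_reply = build_close_response(sample_event, extract_results(sample_kendra_response))
```

Copying the intent from the incoming event keeps the intent name and slots in the reply consistent with what Amazon Lex sent, which avoids hard-coding them in the function.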
[]At this point, you have a complete AWS Lambda function in place that can extract the user context from the incoming event, perform a filtered query against Amazon Kendra based on user context, and respond back to the Amazon Lex chatbot. The next step is to configure the Amazon Lex chatbot to use this AWS Lambda function as part of the intent fulfillment process. You can accomplish this by following the documented steps at Attaching a Lambda function to a bot alias. Once this is done, you have a fully functioning Amazon Lex chatbot integrated with the Amazon Kendra index that can perform contextual queries based on the user interacting with the chatbot.
[]In our example, we have two users, User 1 and User 2. User 1 is from the computer science department and User 2 is from the civil engineering department. Figure 2 shows side-by-side screenshots of two chatbot interactions, illustrating how the same conversation returns different results based on each user’s department:
[]Figure 2: Side-by-side comparison of multiple user chat sessions
[]If you followed along the example setup, then you should clean up any resources you created to avoid additional charges in the long run. To perform a cleanup of the resources, you need to:
[]Amazon Kendra is a highly accurate enterprise search service. Combining its natural language processing feature with an intelligent chatbot creates a solution that is robust for any use case needing custom outputs based on user context. Here we considered a sample use case of an organization with multiple departments, but this mechanism can be applied to any other relevant use cases with minimal changes.
[]Ready to get started? The Accenture AWS Business Group (AABG) helps customers accelerate their pace of digital innovation and realize incremental business value from cloud adoption and transformation. Connect with our team at accentureaws@amazon.com to learn how to build intelligent chatbot solutions for your customers.
[]Rohit Satyanarayana is a Partner Solutions Architect at AWS in Singapore and is part of the AWS GSI team working with Accenture globally. His hobbies are reading fantasy and science fiction, watching movies and listening to music.
[]Leo An is a Senior Solutions Architect who has demonstrated the ability to design and deliver cost-effective, high-performance infrastructure solutions in private and public clouds. He enjoys helping customers use cloud technologies to address their business challenges. He specializes in machine learning and focuses on helping customers leverage AI/ML for their business outcomes.
[]Hemalatha Katari is a Solution Architect at Accenture. She is part of the rapid prototyping team within the Accenture AWS Business Group (AABG). She helps organizations migrate and run their businesses in the AWS cloud. She enjoys growing ornamental indoor plants and loves going for long nature trail walks.
[]Sruthi Mamidipalli is an AWS solutions architect at Accenture, where she is helping clients with successful adoption of cloud native architecture. Outside of work, she loves gardening, cooking, and spending time with her toddler.