Powered by Amazon Lex, the QnABot on AWS solution is an open-source, multi-channel, multi-language conversational chatbot. QnABot allows you to quickly deploy self-service conversational AI into your contact center, websites, and social media channels, reducing costs, shortening hold times, and improving customer experience and brand sentiment. Customers now want to apply the power of large language models (LLMs) to further improve the customer experience with generative AI capabilities. This includes automatically generating accurate answers from existing company documents and knowledge bases, and making their self-service chatbots more conversational.
Our latest QnABot releases (v5.4.0 and later) can now use an LLM to disambiguate customer questions by taking conversational context into account, and to dynamically generate answers from relevant FAQs or from Amazon Kendra search results and document passages. They also provide attribution and transparency by displaying links to the reference documents and context passages that the LLM used to construct the answers.
When you deploy QnABot, you can choose to automatically deploy a state-of-the-art open-source LLM (Falcon-40B-instruct) on an Amazon SageMaker endpoint. The LLM landscape is constantly evolving: new models are released frequently, and our customers want to experiment with different models and providers to see what works best for their use cases. This is why QnABot also integrates with any other LLM through an AWS Lambda function that you provide. To help you get started, we’ve also released a set of sample one-click deployable Lambda functions (plugins) to integrate QnABot with your choice of leading LLM providers, including our own Amazon Bedrock service and APIs from third-party providers, Anthropic and AI21.
In this post, we introduce the new generative AI features for QnABot and walk through a tutorial to create, deploy, and customize QnABot to use these features. We also discuss some relevant use cases.
Using the LLM, QnABot now has two new important features, which we discuss in this section.
QnABot can now generate concise answers to questions from document extracts provided by an Amazon Kendra search, or text passages created or imported directly. This provides the following advantages:
For example, when asked “What is Amazon Lex?”, QnABot can retrieve relevant passages from an Amazon Kendra index (containing AWS documentation). QnABot then asks (prompts) the LLM to answer the question based on the context of the passages (which can also optionally be viewed in the web client). The following screenshot shows an example.
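To make the prompting step concrete, here is a minimal Python sketch of how retrieved passages and a user question might be combined into a single prompt. The template wording and the `build_qa_prompt` helper are illustrative assumptions, not QnABot's actual prompt template, which is configurable in the QnABot settings.

```python
from typing import List

# Hypothetical prompt template for illustration only. QnABot's real template
# is configurable and its wording may differ.
QA_TEMPLATE = (
    "Use the following passages to answer the question. "
    "If the passages do not contain the answer, say \"Sorry, I don't know\".\n\n"
    "{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_qa_prompt(question: str, passages: List[str]) -> str:
    """Join retrieved passages into one context block and fill the template."""
    context = "\n\n".join(
        f"Passage {i}: {text.strip()}" for i, text in enumerate(passages, start=1)
    )
    return QA_TEMPLATE.format(context=context, question=question.strip())


# Example passage, as it might be returned by a document search.
passages = [
    "Amazon Lex is an AWS service for building conversational interfaces "
    "for applications using voice and text.",
]
prompt = build_qa_prompt("What is Amazon Lex?", passages)
print(prompt)
```

Keeping the passages in the prompt, rather than relying on what the model already knows, is what lets the answer be traced back to specific source documents.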
Understanding the direction and context of an ever-evolving conversation is key to building natural, human-like conversational interfaces. User queries often require a bot to interpret requests based on conversation memory and context. Now QnABot will ask the LLM to generate a disambiguated question based on the conversation history. This can then be used as a search query to retrieve the FAQs, passages, or Amazon Kendra results to answer the user’s question. The following is an example chat history:
Human: What is Amazon Lex?
AI: “Amazon Lex is an AWS service for building conversational interfaces for applications using voice and text…”
Human: Can it integrate with my CRM?
QnABot uses the LLM to rewrite the follow-up question to make “it” unambiguous, for example, “Can Amazon Lex integrate with my CRM system?” This allows users to interact like they would in a human conversation, and QnABot generates clear search queries to find the relevant FAQs or document passages that have the information to answer the user’s question.
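The rewrite step can be pictured as one more prompt sent to the LLM. The following Python sketch assembles such a prompt from the chat history; the instruction wording and the `build_disambiguation_prompt` helper are assumptions made for illustration and do not reproduce QnABot's own template.

```python
from typing import List, Tuple


def build_disambiguation_prompt(history: List[Tuple[str, str]], follow_up: str) -> str:
    """Format prior turns plus the follow-up into a 'rewrite as standalone' prompt."""
    transcript = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    return (
        "Given the chat history and a follow-up question, rewrite the follow-up "
        "as a standalone question.\n\n"
        f"Chat history:\n{transcript}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


# Chat history from the example above, as (speaker, text) pairs.
history = [
    ("Human", "What is Amazon Lex?"),
    ("AI", "Amazon Lex is an AWS service for building conversational "
           "interfaces for applications using voice and text."),
]
prompt = build_disambiguation_prompt(history, "Can it integrate with my CRM?")
print(prompt)
```

The LLM's completion of this prompt, not the user's original wording, is then what would be used as the search query.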
These new features make QnABot more conversational and provide the ability to dynamically generate responses based on a knowledge base. They are still experimental, but they have tremendous potential. We strongly encourage users to experiment to find the best LLM and corresponding prompts and model parameters to use. QnABot makes it straightforward to experiment!
Time to try it! Let’s deploy the latest QnABot (v5.4.0 or later) and enable the new generative AI features. The high-level steps are as follows:
Download and use the following AWS CloudFormation template to create a new Amazon Kendra index.
This template includes sample data containing AWS online documentation for Amazon Kendra, Amazon Lex, and SageMaker. Deploying the stack takes about 30 minutes, followed by about 15 minutes to synchronize the index and ingest the data.
When the Amazon Kendra index stack is successfully deployed, navigate to the stack’s Outputs tab and note the Index Id, which you will use later when deploying QnABot.
Alternatively, if you already have an Amazon Kendra index with your own content, you can use it instead with your own example questions for the tutorial.
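If you prefer to read stack outputs programmatically rather than from the console, the following Python sketch shows one way to do it. It assumes you pass in a boto3 CloudFormation client; the `get_stack_outputs` helper is hypothetical, and the exact output key that holds the index ID should be checked against your stack's Outputs tab.

```python
from typing import Any, Dict


def get_stack_outputs(cfn_client: Any, stack_name: str) -> Dict[str, str]:
    """Return a stack's outputs as a {OutputKey: OutputValue} dictionary.

    cfn_client is expected to behave like boto3.client("cloudformation").
    """
    response = cfn_client.describe_stacks(StackName=stack_name)
    # A stack with no outputs may omit the "Outputs" key entirely.
    outputs = response["Stacks"][0].get("Outputs", [])
    return {item["OutputKey"]: item["OutputValue"] for item in outputs}
```

With boto3 installed and AWS credentials configured, you could call `get_stack_outputs(boto3.client("cloudformation"), "<your-stack-name>")` and look up the index ID in the returned dictionary.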
QnABot can deploy a built-in LLM (Falcon-40B-instruct on SageMaker) or use Lambda functions to call any other LLMs of your choice. In this section, we show you how to use the Lambda option with a pre-built sample Lambda function. Skip to the next step if you want to use the built-in LLM instead.
First, choose the plugin LLM you want to use. Review your options in the qnabot-on-aws-plugin-samples repository README. As of this writing, plugins are available for Amazon Bedrock (in preview), and for AI21 and Anthropic third-party APIs. We expect to add more sample plugins over time.
Deploy your chosen plugin by choosing Launch Stack in the Deploy a new Plugin stack section, which will deploy into the us-east-1 Region by default (to deploy in other Regions, see Build and Publish QnABot Plugins CloudFormation artifacts).
When the Plugin stack is successfully deployed, navigate to the stack’s Outputs tab (see the following screenshot) and inspect its contents, which you will use in the following steps to deploy and configure QnABot. Keep this tab open in your browser.
Choose Launch Solution from the QnABot implementation guide to deploy the latest QnABot template via AWS CloudFormation. Provide the following parameters:
For all other parameters, accept the defaults (see the implementation guide for parameter definitions), and proceed to launch the QnABot stack.
If you deployed QnABot using a sample LLM Lambda plugin to access a different LLM, update the QnABot model parameters and prompt template settings as recommended for your chosen plugin. For more information, see Update QnABot Settings. If you used the SageMaker (built-in) LLM option, skip to the next step, because the settings are already configured for you.
On the AWS CloudFormation console, choose the Outputs tab of the QnABot CloudFormation stack and choose the ClientURL link. Alternatively, launch the client by choosing QnABot on AWS Client from the Content Designer tools menu.
Now, try to ask questions related to AWS services, for example:
Then you can ask follow-up questions without specifying the previously mentioned services or context, for example:
You can customize many settings on the QnABot Content Designer Settings page—see README – LLM Settings for a full list of relevant settings. For example, try the following:
QnABot can, of course, continue to answer questions based on curated Q&As. It can also use the LLM to generate answers from text passages created or imported directly into QnABot, in addition to using an Amazon Kendra index.
QnABot attempts to find a good answer to the disambiguated user question in the following sequence:
Let’s try some examples.
On the QnABot Content Designer tools menu, choose Import, then load the two example packages:
QnABot can use text embeddings to provide semantic search capability (using QnABot’s built-in OpenSearch index as a vector store), which improves accuracy and reduces question tuning compared to standard OpenSearch keyword-based matching. To illustrate this, try questions like the following:
These should ideally match the sample Q&A items you imported, even though the words used to ask the question are poor keyword matches (but good semantic matches) for the configured items: Alexa.001 (What is an Amazon Echo Show) and FireTV.001 (What is an Amazon Fire TV).
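To see why semantic matching can succeed where keyword matching fails, consider the following Python sketch of similarity search over embeddings. The three-dimensional vectors are toy values invented for illustration (real embeddings have hundreds or thousands of dimensions and come from an embeddings model), and the helper functions are hypothetical rather than part of QnABot.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query_vec: List[float], items: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the (item_id, score) pair with the highest cosine similarity."""
    scored = ((item_id, cosine_similarity(query_vec, vec)) for item_id, vec in items.items())
    return max(scored, key=lambda pair: pair[1])


# Toy vectors standing in for the embeddings of the two imported items.
items = {
    "Alexa.001": [0.9, 0.1, 0.2],
    "FireTV.001": [0.1, 0.9, 0.3],
}
# Toy vector standing in for the embedding of a user question.
query = [0.8, 0.2, 0.1]

item_id, score = best_match(query, items)
print(item_id, round(score, 3))
```

The query shares no words with either item in this sketch. The match is decided only by how closely the vectors point in the same direction.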
Even if you are not (yet) using Amazon Kendra (and you should!), QnABot can also answer questions based on passages created or imported into Content Designer. The following questions (and follow-up questions) are all answered from an imported text passage item, 0.HumptyDumpty, which contains the nursery rhyme:
When using embeddings, a good answer is an answer that returns a similarity score above the threshold defined by the corresponding threshold setting. See Semantic question matching, using Large Language Model Text Embeddings for more details on how to test and tune the threshold settings.
If there are no good answers, or if the LLM’s response matches the regular expression defined in LLM_QA_NO_HITS_REGEX, then QnABot invokes the configurable Custom Don’t Know (no_hits) behavior, which, by default, returns a message saying “You stumped me.”
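The threshold and no-hits logic described above can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the threshold value, the regular expression, and the `respond` helper are hypothetical and are not QnABot's defaults or implementation. Whether the real comparison is strict or inclusive at the threshold is also assumed.

```python
import re
from typing import Optional

# Hypothetical values for illustration only, not QnABot defaults.
SCORE_THRESHOLD = 0.8
NO_HITS_REGEX = r"(?i)sorry,? i don'?t know"
NO_HITS_MESSAGE = "You stumped me."


def is_good_answer(score: float, threshold: float = SCORE_THRESHOLD) -> bool:
    """Treat a match as good only when its score is above the threshold."""
    return score > threshold


def respond(curated_answer: Optional[str], score: float, llm_answer: Optional[str]) -> str:
    """Prefer a good curated match, then an LLM answer, then the no-hits message."""
    if curated_answer is not None and is_good_answer(score):
        return curated_answer
    # An LLM reply that matches the no-hits pattern counts as "no answer".
    if llm_answer is not None and not re.search(NO_HITS_REGEX, llm_answer):
        return llm_answer
    return NO_HITS_MESSAGE


print(respond("An Echo Show is a smart display.", 0.92, None))
print(respond(None, 0.0, "Sorry, I don't know."))
```

Raising the threshold makes the bot fall through to generated answers (or the no-hits message) more often, while lowering it makes curated items match more loosely.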
Try some experiments by creating Q&As or text passage items in QnABot, as well as using an Amazon Kendra index for fallback generative answers. Experiment (using the TEST tab in the designer) to find the best values to use for the embedding threshold settings to get the behavior you want. It’s hard to get the perfect balance, but see if you can find a good enough balance that results in useful answers most of the time.
You can, of course, leave QnABot running to experiment with it and show it to your colleagues! But it does incur some cost—see Plan your deployment – Cost for more details. To remove the resources and avoid costs, delete the following CloudFormation stacks:
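If you prefer to script the cleanup, the following Python sketch requests deletion of a list of stacks. It assumes you pass in a boto3 CloudFormation client and the names of the stacks you actually created; the `delete_stacks` helper is hypothetical.

```python
from typing import Any, List


def delete_stacks(cfn_client: Any, stack_names: List[str]) -> List[str]:
    """Request deletion of each named stack and return the names requested.

    cfn_client is expected to behave like boto3.client("cloudformation").
    Deletion is asynchronous: this only submits the requests.
    """
    requested = []
    for name in stack_names:
        cfn_client.delete_stack(StackName=name)
        requested.append(name)
    return requested
```

With boto3 installed, you could call `delete_stacks(boto3.client("cloudformation"), [...])` with your own stack names, then confirm on the AWS CloudFormation console that each deletion completes.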
These new features make QnABot relevant for many customer use cases such as self-service customer service and support bots and automated web-based Q&A bots. We discuss two such use cases in this section.
QnABot’s automated question answering capabilities deliver effective self-service for inbound voice calls in contact centers, with compelling outcomes. For example, see how Kentucky Transportation Cabinet reduced call hold time and improved customer experience with self-service virtual agents using Amazon Connect and Amazon Lex. Integrating the new generative AI features strengthens this value proposition further by dynamically generating reliable answers from existing content such as documents, knowledge bases, and websites. This eliminates the need for bot designers to anticipate and manually curate responses to every possible question that a user might ask. To integrate QnABot with Amazon Connect, see Connecting QnABot on AWS to an Amazon Connect call center. To integrate with other contact centers, see how Amazon Chime SDK can be used to connect Amazon Lex voice bots with third-party contact centers via SIPREC and Build an AI-powered virtual agent for Genesys Cloud using QnABot and Amazon Lex.
The LLM-powered QnABot can also play a pivotal role as an automated real-time agent assistant. In this solution, QnABot passively listens to the conversation and uses the LLM to generate real-time suggestions for the human agents based on certain cues. It’s straightforward to set up and try, so give it a go! This solution can be used with Amazon Connect as well as with other on-premises and cloud contact centers. For more information, see Live call analytics and agent assist for your contact center with Amazon language AI services.
Embedding QnABot in your websites and applications allows users to get automated assistance with natural dialogue. For more information, see Deploy a Web UI for your Chatbot. For curated Q&A content, use markdown syntax and UI buttons and incorporate links, images, videos, and other dynamic elements that inform and delight your users. Integrate the QnABot Amazon Lex web UI with Amazon Connect live chat to facilitate quick escalation to human agents when the automated assistant cannot fully address a user’s inquiry on its own.
As shown in this post, QnABot v5.4.0+ not only offers built-in support for embeddings and LLMs hosted on SageMaker, but it also offers the ability to easily integrate with any other LLM by using Lambda functions. You can author your own custom Lambda functions or get started faster with one of the samples we have provided in our new qnabot-on-aws-plugin-samples repository.
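As a rough picture of what a custom LLM Lambda function could look like, here is a minimal Python sketch. The event and response shapes shown (a `prompt` string and `parameters` in, a `generated_text` string out) are assumptions that you should verify against the plugin samples repository README, and `call_llm` is a hypothetical placeholder for your provider's SDK or API call.

```python
import json
from typing import Any, Dict


def call_llm(prompt: str, parameters: Dict[str, Any]) -> str:
    """Hypothetical placeholder: replace with a call to your LLM provider."""
    return f"(stub answer for a prompt of {len(prompt)} characters)"


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, str]:
    # Assumed request shape: {"prompt": "...", "parameters": {...}, "settings": {...}}
    prompt = event["prompt"]
    parameters = event.get("parameters") or {}
    generated_text = call_llm(prompt, parameters)
    # Assumed response shape: {"generated_text": "..."}
    return {"generated_text": generated_text}


if __name__ == "__main__":
    # Local smoke test with a sample event; no AWS resources are needed.
    sample_event = {"prompt": "What is Amazon Lex?", "parameters": {"temperature": 0}}
    print(json.dumps(lambda_handler(sample_event, None)))
```

Keeping the provider call behind a single function makes it easier to swap models while leaving the handler's contract with QnABot unchanged.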
This repository includes a ready-to-deploy plugin for Amazon Bedrock, which supports both embeddings and text generation requests. At the time of writing, Amazon Bedrock is available through private preview—you can request preview access. When Amazon Bedrock is generally available, we expect to integrate it directly with QnABot, but why wait? Apply for preview access and use our sample plugin to start experimenting!
Today’s LLM innovation cycle is driving a breakneck pace of new model releases, each aiming to surpass the last. This repository will expand to include additional QnABot plugin samples over time. As of this writing, we have support for two third-party model providers: Anthropic and AI21. We plan to add integrations for more LLMs, embeddings, and potentially common use case examples involving Lambda hooks and knowledge bases. These plugins are offered as-is without warranty, for your convenience—users are responsible for supporting and maintaining them once deployed.
We hope that the QnABot plugins repository will mature into a thriving open-source community project. Watch the qnabot-on-aws-plugin-samples GitHub repo to receive updates on new plugins and features, use the Issues forum to report problems or provide feedback, and contribute improvements via pull requests. Contributions are welcome!
In this post, we introduced the new generative AI features for QnABot and walked through a solution to create, deploy, and customize QnABot to use these features. We also discussed some relevant use cases. Automating repetitive inquiries frees up human workers and boosts productivity. Rich responses create engaging experiences. Deploying the LLM-powered QnABot can help you elevate the self-service experience for customers and employees.
Don’t miss this opportunity—get started today and revolutionize the user experience on your QnABot deployment!
Clevester Teo is a Senior Partner Solutions Architect at AWS, focused on the Public Sector partner ecosystem. He enjoys building prototypes, staying active outdoors, and experiencing new cuisines. Clevester is passionate about experimenting with emerging technologies and helping AWS partners innovate and better serve public sector customers.
Windrich is a Solutions Architect at AWS who works with customers in industries such as finance and transport to help accelerate their cloud adoption journey. He is especially interested in serverless technologies and how customers can use them to bring value to their business. Outside of work, Windrich enjoys playing and watching sports, as well as exploring different cuisines around the world.
Bob Strahan is a Principal Solutions Architect in the AWS Language AI Services team.