Organizations are continuing to evaluate remote working arrangements and explore moving to a hybrid workforce model. Emerging trends suggest that both the number of online meetings employees attend each day and the number of attendees per meeting have increased. One of the key challenges with online meetings is ensuring that information reaches all the attendees efficiently after the meeting. Information can be lost, either due to ad hoc, overlapping communication between the attendees, or due to technical challenges, like network disruption or bandwidth constraints. You can overcome such challenges by using AWS artificial intelligence (AI) and machine learning (ML) technologies to generate meeting artifacts automatically, such as summaries, call-to-action items, and meeting transcriptions.
In this post, we demonstrate a solution that uses the Amazon Chime SDK, Amazon Transcribe, Amazon Comprehend, and AWS Step Functions to record, process, and generate meeting artifacts. Our proposed solution is based on a Step Functions workflow that starts when the meeting bot stores the recorded file in an Amazon Simple Storage Service (Amazon S3) bucket. The workflow contains steps that transcribe and derive insights from the meeting recording. Lastly, it compiles the data into an email template and sends it to meeting attendees. You can easily adapt this workflow for different use cases, such as web conferencing solutions.
The application is primarily divided into two parts: the conferencing solution built using the Amazon Chime SDK, and the AI/ML-based processing workflow implemented using Amazon Transcribe and Amazon Comprehend. The following diagram illustrates the architecture.
The conferencing application is a web-based application built using the Amazon Chime JS SDK and hosted using a combination of Amazon Elastic Container Service (Amazon ECS), AWS Lambda, and Amazon API Gateway. Session information for the meetings is stored in Amazon DynamoDB tables. During a conference call, the session information is captured using an Amazon EventBridge connector for the Amazon Chime SDK, and written to the DynamoDB tables. The following features are available on the web application:
The preceding features allow users to start, attend, and record conference calls. The call recording generates a video file that is delivered to an S3 bucket. The S3 bucket is configured with an Amazon S3 event notification for the s3:ObjectCreated:Put event, which initiates the AI/ML processing workflow. These solutions are available as demos on the Amazon Chime JS SDK page on GitHub.
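To make the hand-off from Amazon S3 to Step Functions more concrete, the following is a minimal sketch of a Lambda function that could sit between the event notification and the workflow. It is not the code from the demo. The function names, the shape of the workflow input, and the STATE_MACHINE_ARN environment variable are assumptions made for illustration.

```python
import json
import os
import urllib.parse


def build_execution_input(s3_event):
    """Turn an S3 event notification into one workflow input per new object."""
    inputs = []
    for record in s3_event.get("Records", []):
        # Inside the event record, the event name has no "s3:" prefix.
        if record.get("eventName") != "ObjectCreated:Put":
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        inputs.append({"bucket": bucket, "key": key})
    return inputs


def handler(event, context, sfn_client=None):
    """Lambda entry point: start one state machine execution per recording."""
    if sfn_client is None:
        import boto3  # imported lazily so the parsing logic can be tested offline

        sfn_client = boto3.client("stepfunctions")
    # STATE_MACHINE_ARN is a hypothetical environment variable (assumption).
    state_machine_arn = os.environ["STATE_MACHINE_ARN"]
    started = []
    for item in build_execution_input(event):
        response = sfn_client.start_execution(
            stateMachineArn=state_machine_arn,
            input=json.dumps(item),
        )
        started.append(response["executionArn"])
    return started
```

Passing the Step Functions client as an optional argument keeps the parsing logic easy to exercise with a stub, without AWS credentials.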
The AI/ML processing workflow built with Step Functions uses Amazon Transcribe and Amazon Comprehend. The output of this processing workflow is a well-crafted email that is sent to the conference call owner using Amazon Simple Email Service (Amazon SES). The following sequence of steps is involved in the AI/ML workflow:
The following sample code uses the Boto3 SDK to start a transcription job with a custom vocabulary:
import boto3

client = boto3.client("transcribe")

response = client.start_transcription_job(
    TranscriptionJobName=job_name,  # Name of the job
    LanguageCode=language_code,  # Language code for the language in the media file
    MediaFormat=media_format,  # Format of the input media file
    Media={
        "MediaFileUri": file_uri  # S3 object location of the input media file
    },
    Settings={
        "VocabularyName": vocab_name  # Name of the custom vocabulary to use
    },
)
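When the transcription job completes, Amazon Transcribe writes a JSON document whose results.transcripts list holds the full text. The following sketch shows one way to pull that text out before passing it to Amazon Comprehend. The helper name and the sample document are illustrative assumptions, and fetching the JSON from its storage location is left out.

```python
import json


def extract_transcript_text(transcribe_output):
    """Return the full transcript text from a Transcribe result document."""
    transcripts = transcribe_output.get("results", {}).get("transcripts", [])
    # Join defensively in case more than one transcript entry is present.
    return " ".join(t["transcript"] for t in transcripts).strip()


# A trimmed, made-up example of the Transcribe output format.
sample_output = json.loads(
    """
    {
      "jobName": "example-job",
      "results": {
        "transcripts": [{"transcript": "Welcome to the weekly sync."}],
        "items": []
      },
      "status": "COMPLETED"
    }
    """
)

print(extract_transcript_text(sample_output))
```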
The following sample code uses the Boto3 SDK to start an asynchronous entity detection job on the transcribed output:
import boto3

client = boto3.client("comprehend")

response = client.start_entities_detection_job(
    InputDataConfig={
        "S3Uri": input_path,  # Location of the transcribed output
        "InputFormat": "ONE_DOC_PER_FILE",  # or "ONE_DOC_PER_LINE"
    },
    OutputDataConfig={
        "S3Uri": output_path  # Location of the Amazon Comprehend output
    },
    EntityRecognizerArn=cer_arn,  # ARN that identifies the custom entity recognizer
    LanguageCode=language_code,  # Language code for the transcribed output
    DataAccessRoleArn=role,  # IAM role that grants Amazon Comprehend access to the data
    JobName=job_name,  # Name of the job
)
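Because the entity detection job is asynchronous, something has to wait for it to finish before the results can be read. The following is a minimal polling sketch built on the describe_entities_detection_job call. The helper name, polling interval, and retry limit are assumptions for illustration, not part of the original solution.

```python
import time

# Job statuses after which polling should stop.
TERMINAL_STATES = {"COMPLETED", "FAILED", "STOPPED"}


def wait_for_entities_job(client, job_id, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll Amazon Comprehend until the entity detection job reaches a terminal state."""
    for _ in range(max_polls):
        response = client.describe_entities_detection_job(JobId=job_id)
        props = response["EntitiesDetectionJobProperties"]
        if props["JobStatus"] in TERMINAL_STATES:
            return props
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")
```

You could call it as wait_for_entities_job(client, response["JobId"]) with the response returned by start_entities_detection_job. Inside a Step Functions workflow, the same wait is often expressed with Wait and Choice states instead of a loop in code.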
The entire AI/ML processing workflow is shown in the following figure.
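As a rough companion to the figure, the following sketch builds a simplified Amazon States Language definition with the three stages described in this post: transcribe the recording, detect entities, and send the email. The state names and the use of one Lambda function per stage are assumptions. The actual workflow likely contains additional states, such as waits and error handling.

```python
import json


def build_state_machine_definition(transcribe_fn_arn, comprehend_fn_arn, email_fn_arn):
    """Return a simplified three-step workflow definition as a Python dict."""
    return {
        "Comment": "Simplified sketch of the meeting artifact workflow",
        "StartAt": "TranscribeRecording",
        "States": {
            "TranscribeRecording": {
                "Type": "Task",
                "Resource": transcribe_fn_arn,  # hypothetical Lambda function ARN
                "Next": "DetectEntities",
            },
            "DetectEntities": {
                "Type": "Task",
                "Resource": comprehend_fn_arn,  # hypothetical Lambda function ARN
                "Next": "SendSummaryEmail",
            },
            "SendSummaryEmail": {
                "Type": "Task",
                "Resource": email_fn_arn,  # hypothetical Lambda function ARN
                "End": True,
            },
        },
    }


# Placeholder ARNs for illustration only.
definition = build_state_machine_definition(
    "arn:aws:lambda:us-east-1:111122223333:function:transcribe-step",
    "arn:aws:lambda:us-east-1:111122223333:function:comprehend-step",
    "arn:aws:lambda:us-east-1:111122223333:function:email-step",
)
print(json.dumps(definition, indent=2))
```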
The following figure shows a sample email that is sent out to the meeting attendees by the AI/ML processing workflow. The email provides details such as the meeting title, attendees, key discussion points, and the action items.
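The following sketch shows one way the final step could assemble such an email and send it with the Amazon SES send_email call. The HTML layout, function names, and parameters are illustrative assumptions and do not reproduce the email template used in the solution.

```python
import html


def build_email(title, attendees, key_points, action_items):
    """Compose a subject line and a simple HTML body for the meeting summary."""

    def bullet_list(items):
        # Escape each item so transcript text cannot inject markup.
        return "<ul>" + "".join(f"<li>{html.escape(item)}</li>" for item in items) + "</ul>"

    subject = f"Meeting summary: {title}"
    body = (
        f"<h2>{html.escape(title)}</h2>"
        f"<h3>Attendees</h3>{bullet_list(attendees)}"
        f"<h3>Key discussion points</h3>{bullet_list(key_points)}"
        f"<h3>Action items</h3>{bullet_list(action_items)}"
    )
    return subject, body


def send_summary(ses_client, sender, recipients, subject, body_html):
    """Send the summary through Amazon SES; sender must be a verified identity."""
    return ses_client.send_email(
        Source=sender,
        Destination={"ToAddresses": recipients},
        Message={
            "Subject": {"Data": subject, "Charset": "UTF-8"},
            "Body": {"Html": {"Data": body_html, "Charset": "UTF-8"}},
        },
    )
```

Here ses_client would typically be created with boto3.client("ses"). Keeping it as a parameter lets you test the composition logic with a stub.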
In this post, we demonstrated how you can use AWS AI services such as Amazon Transcribe and Amazon Comprehend along with the Amazon Chime SDK to generate high-quality meeting artifacts. We demonstrated the custom vocabulary feature of Amazon Transcribe and the custom entities feature of Amazon Comprehend that allow you to customize the artifacts based on your business requirements.
Learn more about AWS AI services and get started building your own custom processing workflow using AWS Step Functions and the Amazon Chime SDK.
Rajdeep Tarat is a Senior Solutions Architect at AWS. He lives in Bengaluru, India, and helps customers architect and optimize applications on AWS. In his spare time, he enjoys music, programming, and reading.
Venugopal Pai is a Solutions Architect at AWS. He lives in Bengaluru, India, and helps digital native customers scale and optimize their applications on AWS.