Amazon Comprehend is a managed AI service that uses natural language processing (NLP) with ready-made intelligence to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document. The ability to train custom models through the Custom classification and Custom entity recognition features of Comprehend has enabled customers to explore out-of-the-box NLP capabilities tied to their requirements without having to take the approach of building classification and entity recognition models from scratch.
Today, users invest a significant amount of resources to build, train, and maintain custom models. However, these models are sensitive to changes in the real world. For example, since 2020, COVID has become a new entity type that businesses need to extract from documents. In order to do so, customers have to retrain their existing entity extraction models with new training data that includes COVID. Custom Comprehend users need to manually monitor model performance to assess drifts, maintain data to retrain models, and select the right models that improve performance.
Comprehend flywheel is a new Amazon Comprehend resource that simplifies the process of improving a custom model over time. You can use a flywheel to orchestrate the tasks associated with training and evaluating new custom model versions. You can create a flywheel to use an existing trained model, or Amazon Comprehend can create and train a new model for the flywheel. Flywheel creates a data lake (in Amazon S3) in your account where all the training and test data for all versions of the model are managed and stored. Periodically, the new labeled data (to retrain the model) can be made available to flywheel by creating datasets. To incorporate the new datasets into your custom model, you create and run a flywheel iteration. A flywheel iteration is a workflow that uses the new datasets to evaluate the active model version and to train a new model version.
Based on the quality metrics for the existing and new model versions, you set the active model version to be the version of the flywheel model that you want to use for inference jobs. You can use the flywheel active model version to run custom analysis (real-time or asynchronous jobs). To use the flywheel model for real-time analysis, you must create an endpoint for the flywheel.
This post demonstrates how you can build a custom text classifier (no prior ML knowledge needed) that can assign a specific label to a given text. We will also illustrate how flywheel can be used to orchestrate the training of a new model version and improve the accuracy of the model using new labeled data.
To complete this walkthrough, you need an AWS account and access to create resources in AWS Identity and Access Management (IAM), Amazon S3 and Amazon Comprehend within the account.
For information about creating IAM policies for Amazon Comprehend, see Permissions to perform Amazon Comprehend actions.
In this post, we use the Yahoo corpus from Text Understanding from scratch by Xiang Zhang and Yann LeCun. The data can be accessed from AWS Open Data Registry. Please refer to section 4, “Preparing data,” from the post Building a custom classifier using Amazon Comprehend for the script and detailed information on data preparation and structure.
Alternatively, for even more convenience, you can download the prepared data by entering the following two command lines:
Admin:~/environment $ aws s3 cp s3://aws-blogs-artifacts-public/artifacts/ML-13607/custom-classifier-partial-dataset.csv . Admin:~/environment $ aws s3 cp s3://aws-blogs-artifacts-public/artifacts/ML-13607/custom-classifier-complete-dataset.csv .
We will be using the custom-classifier-partial-dataset.csv (about 15,000 documents) dataset to create the initial version of the custom classifier. Next, we will create a flywheel to orchestrate the retraining of the initial version of the model using the complete dataset custom-classifier-complete-dataset.csv (about 100,000 documents). Upon retraining the model by triggering a flywheel iteration, we evaluate the model performance metrics of the two versions of the custom model and choose the better-performing one as the active model version and demonstrate real-time custom classification using the same.
Please find the following steps to set up the environment and the data lake to create a Comprehend flywheel iteration to retrain the custom models.
You can interact with Amazon Comprehend via the AWS Management Console, AWS Command Line Interface (AWS CLI), or Amazon Comprehend API. For more information, refer to Getting started with Amazon Comprehend.
In this post, we use AWS CLI to create and manage the resources. AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug your code. It includes a code editor, debugger, and terminal. AWS Cloud9 comes prepackaged with AWS CLI.
Please refer to Creating an environment in AWS Cloud9 to set up the environment.
Use the following command to create a custom classifier: yahoo-answers-version1 using the dataset: custom-classifier-partial-dataset.csv. Replace the data access role ARN and the S3 bucket locations with your own.
$ aws comprehend create-document-classifier –document-classifier-name “yahoo-answers-version1” –data-access-role-arn arn:aws:iam::123456789012:role/comprehend-data-access-role –input-data-config S3Uri=s3://123456789012-comprehend/custom-classifier-partial-dataset.csv –output-data-config S3Uri=s3://123456789012-comprehend/TrainingOutput/ –language-code en
The above API call results in the following output:
{ “DocumentClassifierArn”: “arn:aws:comprehend:us-east-1:123456789012:document-classifier/yahoo-answers-version1”}
CreateDocumentClassifier starts the training of the custom classifier model. In order to further track the progress of the training, use DescribeDocumentClassifier.
$ aws comprehend describe-document-classifier –document-classifier-arn arn:aws:comprehend:us-east-1:123456789012:document-classifier/yahoo-answers-version1 { “DocumentClassifierProperties”: { “DocumentClassifierArn”: “arn:aws:comprehend:us-east-1:123456789012:document-classifier/yahoo-answers-version1”, “LanguageCode”: “en”, “Status”: “TRAINED”, “SubmitTime”: “2022-09-22T21:17:53.380000+05:30”, “EndTime”: “2022-09-22T23:04:52.243000+05:30”, “TrainingStartTime”: “2022-09-22T21:21:55.670000+05:30”, “TrainingEndTime”: “2022-09-22T23:04:17.057000+05:30”, “InputDataConfig”: { “DataFormat”: “COMPREHEND_CSV”, “S3Uri”: “s3://123456789012-comprehend/custom-classifier-partial-dataset.csv” }, “OutputDataConfig”: { “S3Uri”: “s3://123456789012-comprehend/TrainingOutput/333997476486-CLR-4ea35141e42aa6b2eb2b3d3aadcbe731/output/output.tar.gz” }, “ClassifierMetadata”: { “NumberOfLabels”: 10, “NumberOfTrainedDocuments”: 13501, “NumberOfTestDocuments”: 1500, “EvaluationMetrics”: { “Accuracy”: 0.6827, “Precision”: 0.7002, “Recall”: 0.6906, “F1Score”: 0.693, “MicroPrecision”: 0.6827, “MicroRecall”: 0.6827, “MicroF1Score”: 0.6827, “HammingLoss”: 0.3173 } }, “DataAccessRoleArn”: “arn:aws:iam::123456789012:role/comprehend-data-access-role”, “Mode”: “MULTI_CLASS” }}
Console view of the initial version of the custom classifier as a result of the create-document-classifier command previously described
Model Performance
Once Status shows TRAINED, the classifier is ready to use. The initial version of the model has an F1-score of 0.69. F1-score is an important evaluation metric in machine learning. It sums up the predictive performance of a model by combining two otherwise competing metrics—precision and recall.
As the next step, create a new version of the model with the updated dataset (custom-classifier-complete-dataset.csv). For retraining, we will be using Comprehend flywheel to help orchestrate and simplify the process of retraining the model.
You can create a flywheel for an existing trained model (as in our case) or train a new model for the flywheel. When you create a flywheel, Amazon Comprehend creates a data lake to hold all the data that the flywheel needs, such as the training data and test data for each version of the model. When Amazon Comprehend creates the data lake, it sets up the following folder structure in the Amazon S3 location.
Datasets Annotations pool Model datasets (data for each version of the model) VersionID-1 Training Test ModelStats VersionID-2 Training Test ModelStats
Warning: Amazon Comprehend manages the data lake folder organization and contents. If you modify the datalake folders, your flywheel may not operate correctly.
Note: If you create a flywheel for an existing trained model version, the model type and model configuration are preconfigured.
Be sure to replace the model ARN, data access role, and data lake S3 URI with your resource’s ARNs. Use the second S3 bucket 123456789012-comprehend-flywheel-datalake created in the “Setting up S3 buckets” step as the data lake for flywheel.
$ aws comprehend create-flywheel –flywheel-name custom-model-flywheel-test –active-model-arn arn:aws:comprehend:us-east-1:123456789012:document-classifier/yahoo-answers-version1 — data-access-role-arn arn:aws:iam::123456789012:role/comprehend-data-access-role –data-lake-s3-uri s3://123456789012-comprehend-flywheel-datalake/
The above API call results in a FlyWheelArn.
{ “FlywheelArn”: “arn:aws:comprehend:us-east-1:123456789012:flywheel/custom-model-flywheel-test”}
Console view of the flywheel
To add labeled training or test data to a flywheel, use the Amazon Comprehend console or API to create a dataset.
Use flywheel iterations to help you create and manage new model versions. Users can also view per-dataset metrics in the “model stats” folder in the data lake in S3 bucket. Run the following command to start the flywheel iteration:
$ aws comprehend start-flywheel-iteration –flywheel-arn “arn:aws:comprehend:us-east-1:123456789012:flywheel/custom-model-flywheel-test”
The response contains the following content :
{ “FlywheelArn”: “arn:aws:comprehend:us-east-1:123456789012:flywheel/custom-model-flywheel-test”, “FlywheelIterationId”: “20220922T192911Z”}
When you run the flywheel, it creates a new iteration that trains and evaluates a new model version with the updated dataset. You can promote the new model version if its performance is superior to the existing active model version.
Result of the flywheel iteration
We notice that the model performance has improved as a result of the recent iteration (highlighted above). To promote the new model version as the active model version for inferences, use UpdateFlywheel API call:
$ aws comprehend update-flywheel –flywheel-arn arn:aws:comprehend:us-east-1:123456789012:flywheel/custom-model-flywheel-test –active-model-arn “arn:aws:comprehend:us-east-1:123456789012:document-classifier/yahoo-answers-version1/version/Comprehend-Generated-v1-1b235dd0”
The response contains the following contents, which shows that the newly trained model is being promoted as the active version:
{“FlywheelProperties”: {“FlywheelArn”: “arn:aws:comprehend:us-east-1:123456789012:flywheel/custom-model-flywheel-test”,”ActiveModelArn”: “arn:aws:comprehend:us-east-1:123456789012:document-classifier/yahoo-answers-version1/version/Comprehend-Generated-v1-1b235dd0″,”DataAccessRoleArn”: “arn:aws:iam::123456789012:role/comprehend-data-access-role”,”TaskConfig”: {“LanguageCode”: “en”,”DocumentClassificationConfig”: {“Mode”: “MULTI_CLASS”}},”DataLakeS3Uri”: “s3://123456789012-comprehend-flywheel-datalake/custom-model-flywheel-test/schemaVersion=1/20220922T175848Z/”,”Status”: “ACTIVE”,”ModelType”: “DOCUMENT_CLASSIFIER”,”CreationTime”: “2022-09-22T23:28:48.959000+05:30″,”LastModifiedTime”: “2022-09-23T07:05:54.826000+05:30″,”LatestFlywheelIteration”: “20220922T192911Z”}}
You can use the flywheel’s active model version to run analysis jobs for custom classification. This can be for both real-time analysis or for asynchronous classification jobs.
Run the following command to create the endpoint:
$ aws comprehend —endpoint-name custom-classification-endpoint —model-arn arn:aws:comprehend:us-east-1:123456789012:flywheel/custom-model-flywheel-test —desired-inference-units 1
Warning: You will be charged for this endpoint from the time it is created until it is deleted. Ensure you delete the endpoint when not in use to avoid charges.
For API, use the ClassifyDocument API operation. Provide the endpoint of the flywheel for the EndpointArn parameter OR use the console to classify documents in real time.
Flywheel APIs are free of charge. However, you will be billed for custom model training and management. You are charged $3 per hour for model training (billed by the second) and $0.50 per month for custom model management. For synchronous custom classification and entities inference requests, you provision an endpoint with the appropriate throughput. For more details, please visit Comprehend Pricing.
As discussed, you are charged from the time that you start your endpoint until it is deleted. Once you no longer need your endpoint, you should delete it so that you stop incurring costs from it. You can easily create another endpoint whenever you need it from the Endpoints section. For more information, refer to Deleting endpoints.
In this post, we walked through the capabilities of Comprehend flywheel and how it simplifies the process of retraining and improving custom models over time. As part of the next steps, you can explore the following:
There are many possibilities, and we are excited to see how you use Amazon Comprehend for your NLP use cases. Happy learning and experimentation!
Supreeth S Angadi is a Greenfield Startup Solutions Architect at AWS and a member of AI/ML technical field community. He works closely with ML Core , SaaS and Fintech startups to help accelerate their journey to the cloud. Supreeth likes spending his time with family and friends, loves playing football and follows the sport immensely. His day is incomplete without a walk and playing fetch with his ‘DJ’ (Golden Retriever).