In Part 1 of this series, we walked through a continuous model improvement machine learning (ML) workflow with Amazon Rekognition Custom Labels and Amazon Augmented AI (Amazon A2I). We explained how we use AWS Step Functions to orchestrate model training and deployment, and custom label detection backed by a human labeling private workforce. We described how we use AWS Systems Manager Parameter Store to parameterize the ML workflow to provide flexibility without the need for development rework.
In this post, we provide step-by-step instructions to deploy the solution with AWS CloudFormation.
You need to complete the following prerequisites before deploying the solution:
Now that we have explained how this solution works, we show you how to use AWS CloudFormation to deploy the required and optional AWS resources for this solution.
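If you prefer to launch the stack programmatically instead of through the console, a minimal boto3 sketch might look like the following. The stack name, template URL, and parameter names here are placeholders; use the values from the solution's actual CloudFormation template.

import boto3

cloudformation = boto3.client("cloudformation", region_name="us-east-1")

# Placeholder template URL and parameters -- take these from the solution repository.
response = cloudformation.create_stack(
    StackName="rekognition-custom-labels-a2i",
    TemplateURL="https://example-bucket.s3.amazonaws.com/template.yaml",
    Capabilities=["CAPABILITY_NAMED_IAM"],
    Parameters=[
        {"ParameterKey": "NotificationEmail", "ParameterValue": "you@example.com"},
    ],
)
print(response["StackId"])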
The CloudFormation stack takes about 5 minutes to complete. You should receive an email “AWS Notification – Subscription Confirmation” asking you to confirm a subscription. Choose Confirm subscription in the email. Additionally, if AWS CloudFormation deployed the private work team for you, you should receive an email “Your temporary password” with a username and temporary password. You need this information later to access the Amazon A2I web UI to perform the human labeling tasks.
In the following sections, we walk you through the end-to-end process of using this solution. The process includes the following high-level steps:
In this post, we use Amazon and AWS logos for our initial training images.
Amazon Rekognition Custom Labels supports two methods of labeling datasets: object-level labeling and image-level labeling. This solution uses the image-level labeling method. Specifically, we use the folder name as the label such that images in the amazon_logo and aws_logo folders are labeled amazon_logo and aws_logo, respectively.
In this section, we walk you through the steps to add training images to Amazon S3 and explain how the automatic model training process works.
You should see five folders, as in the following screenshot.
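You can upload the training images through the Amazon S3 console or programmatically. The following boto3 sketch shows the idea; the bucket name and local file names are placeholders, and the key prefixes follow the folder-as-label convention described earlier.

import boto3

s3 = boto3.client("s3")
bucket = "your-solution-bucket"  # placeholder: the images bucket created by the stack

# The folder name becomes the image-level label, so Amazon and AWS logo images
# go under amazon_logo/ and aws_logo/ respectively.
for local_path, label in [("amazon_logo_1.png", "amazon_logo"), ("aws_logo_1.png", "aws_logo")]:
    s3.upload_file(local_path, bucket, f"images_labeled_by_folder/{label}/{local_path}")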
You should now have two folders and 10 images, as shown in the S3 images_labeled_by_folder folder.
At this point, we have 10 training images, which meet the minimum of 10 new images (as currently set in Parameter Store) to qualify for new model training. When the Amazon EventBridge schedule rule fires and invokes the Step Functions state machine to check for new training images, it initiates a full model training and deployment process. Because the current schedule rule polling frequency is 600 minutes, the rule doesn’t fire within the time frame of this demo, so we trigger it manually solely for the purpose of this demo.
This resets the schedule rule, which sends an event to Step Functions to check the conditions for new model training. You should also receive three separate emails: two indicating that you updated a parameter and one indicating that the automatic training check ran.
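Programmatically, the same reset can be done by updating the polling-frequency value in Parameter Store, which the solution uses to recreate the schedule rule. The following is a minimal sketch; the parameter name is illustrative, so check Parameter Store in your account for the actual name created by the stack.

import boto3

ssm = boto3.client("ssm")

# Illustrative parameter name -- look up the actual polling-frequency parameter
# created by the CloudFormation stack in your account.
param_name = "/rekognition-custom-labels/training-polling-frequency-in-minutes"

# Temporarily shorten the polling frequency so the schedule rule fires soon,
# then restore the original 600-minute value afterward.
ssm.put_parameter(Name=param_name, Value="1", Type="String", Overwrite=True)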
Wait for the model training and deployment to complete. This should take approximately 1 hour. You should receive periodic emails on model training and deployment status. The model is deployed when you receive an email with the message “Status: RUNNING.”
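Instead of waiting for the status emails, you can also poll the model status directly with the Amazon Rekognition API. The project ARN below is a placeholder; use the project created by the solution in your account.

import boto3

rekognition = boto3.client("rekognition")

# Placeholder ARN -- copy the project ARN from the Amazon Rekognition console.
project_arn = "arn:aws:rekognition:us-east-1:111122223333:project/your-project/1234567890"

response = rekognition.describe_project_versions(ProjectArn=project_arn)
for version in response["ProjectVersionDescriptions"]:
    # Status moves through TRAINING_IN_PROGRESS and TRAINING_COMPLETED,
    # then RUNNING once the model is deployed.
    print(version["ProjectVersionArn"], version["Status"])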
If you’re redirected to a different Region, switch to the correct Region.
You should see a list of runs and associated statuses.
The state machine records every run so you can always refer back to troubleshoot issues.
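If you prefer the API to the console, you can list recent runs and their statuses with boto3. The state machine ARN below is a placeholder; copy the actual ARN from the CloudFormation stack outputs or the Step Functions console.

import boto3

sfn = boto3.client("stepfunctions")

# Placeholder ARN -- use the state machine created by the CloudFormation stack.
state_machine_arn = "arn:aws:states:us-east-1:111122223333:stateMachine:your-state-machine"

executions = sfn.list_executions(stateMachineArn=state_machine_arn, maxResults=10)
for execution in executions["executions"]:
    print(execution["name"], execution["status"], execution["startDate"])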
Now that your initial model has been deployed, let’s test out the model with some images.
In this section, we walk you through the steps to detect custom logos and explain how an Amazon A2I human labeling task is created.
The upload process triggers an S3 PutObject event, which invokes an AWS Lambda function to run the Amazon Rekognition Custom Labels detection process. You should receive an email with a detection result indicating a confidence score of at least 70, which is as expected because you’re using the same image you used for training.
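Under the hood, the Lambda function calls the DetectCustomLabels API against the deployed model version. The following is a simplified sketch of that call; the model ARN, bucket, and key are placeholders that the solution derives from Parameter Store and the S3 event.

import boto3

rekognition = boto3.client("rekognition")

# Placeholder values -- in the solution these come from Parameter Store and the S3 event.
model_arn = "arn:aws:rekognition:us-east-1:111122223333:project/your-project/version/your-version/123"
bucket, key = "your-solution-bucket", "images_for_detection/aws_logo_test.png"

response = rekognition.detect_custom_labels(
    ProjectVersionArn=model_arn,
    Image={"S3Object": {"Bucket": bucket, "Name": key}},
    MinConfidence=0,  # return all labels so low-confidence results can be routed to human review
)
for label in response["CustomLabels"]:
    print(label["Name"], label["Confidence"])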
You should receive emails with detection results with varying confidence scores, and some should indicate that Amazon A2I human loops have been created. When a detection result has a confidence score less than the acceptance level, which is currently set at 70, a human labeling task is created to label that image.
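When the top confidence score falls below the acceptance level, the function routes the image to Amazon A2I by starting a human loop. The following sketch shows the shape of that call; the flow definition ARN, human loop name, and input payload are illustrative and depend on the worker task template used by the solution.

import json
import boto3

a2i = boto3.client("sagemaker-a2i-runtime")

# Placeholder values -- the flow definition ARN comes from the CloudFormation stack,
# and bucket/key/top_confidence come from the detection step.
flow_definition_arn = "arn:aws:sagemaker:us-east-1:111122223333:flow-definition/your-flow-definition"
bucket, key, top_confidence = "your-solution-bucket", "images_for_detection/aws_logo_test.png", 55.0

if top_confidence < 70:  # acceptance level, as currently set in Parameter Store
    a2i.start_human_loop(
        HumanLoopName="logo-review-example-0001",
        FlowDefinitionArn=flow_definition_arn,
        HumanLoopInput={
            "InputContent": json.dumps({"taskObject": f"s3://{bucket}/{key}"})
        },
    )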
In the previous section, we staged some human labeling tasks. In this section, we walk you through the steps to complete a human labeling task and explain how the human labeling results are processed for new model training.
At this stage, you should have two folders and a total of 10 images.
If you’re redirected to a different Region, switch to the correct Region.
This opens the Amazon A2I web portal for the human labeling task.
If the private team was deployed for you, the username and temporary password are in the email that was sent. If you created your own private team, you should have the information already.
For each image you labeled, a Lambda function makes a copy of the original image, prefixes the original S3 object key with the human loop name and a UUID, and adds the copy to the corresponding labeled folder. If that folder doesn’t exist, the Lambda function creates it. For images labeled None of the Above, nothing is done. This is how newly captured and labeled images are added to the training dataset.
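A simplified sketch of that copy step follows. The bucket, key, human loop name, and label are hypothetical stand-ins for values that the solution reads from the A2I human loop output.

import uuid
import boto3

s3 = boto3.client("s3")

# Placeholder values -- in the solution these come from the A2I human loop output.
bucket = "your-solution-bucket"
source_key = "images_for_detection/aws_logo_test.png"
human_loop_name = "logo-review-example-0001"
label = "aws_logo"  # label chosen by the worker; "None of the Above" is skipped

if label != "None of the Above":
    file_name = source_key.split("/")[-1]
    # The new key places the image in the labeled folder, prefixed with the human loop
    # name and a UUID so repeated reviews of the same image don't collide.
    new_key = f"images_labeled_by_folder/{label}/{human_loop_name}-{uuid.uuid4()}-{file_name}"
    s3.copy_object(Bucket=bucket, Key=new_key, CopySource={"Bucket": bucket, "Key": source_key})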
In the previous sections, you uploaded 10 training images and manually invoked model training for the purpose of this demo only. The model was trained and automatically deployed because the F1 score was greater than the minimum of 0.8 as set in Parameter Store. In addition, the model was deployed with a minimum inference unit of 1, as set in Parameter Store.
Next, you uploaded some inference images for detection. Based on the minimum detection confidence score of 70% (as set in Parameter Store), images detected with a confidence score below 70% required human review. For each image requiring review, a human review task was created because the Amazon A2I workflow was enabled in Parameter Store. You completed the human review tasks, and some labeled images were added to the total training dataset.
For new model training to begin, it needs to meet three conditions in the following order:
Complete the following steps to clean up the AWS resources that we created as part of this post to avoid potential recurring charges.
aws sagemaker delete-workforce --workforce-name default --region your-region-x
In Part 1 of this series, we provided the overview of a continuous model improvement ML workflow with Amazon Rekognition Custom Labels and Amazon A2I.
In this post, we walked through the steps to deploy the solution with AWS CloudFormation and completed an end-to-end process to train and deploy an Amazon Rekognition Custom Labels model, perform custom label detection, and create and complete Amazon A2I human labeling tasks.
For an in-depth explanation and the code sample in this post, see the AWS Samples GitHub repository.
Les Chan is a Sr. Partner Solutions Architect at Amazon Web Services. He helps AWS Partners develop their AWS technical capabilities and build solutions around AWS services. His expertise spans application architecture, DevOps, serverless, and machine learning.
Daniel Duplessis is a Sr. Partner Solutions Architect at Amazon Web Services, based out of Toronto. He helps AWS Partners and customers in enterprise segments build solutions using AWS services. His favorite technical domains are serverless and machine learning.