To help you fast-track your company’s adoption of machine learning (ML), AWS offers educational solutions for developers to get hands-on experience. We like to think of these programs as a fun way for developers to build their skills using ML technologies in real-world scenarios. In this post, we walk you through how to prepare for and run an AI music competition using AWS DeepComposer. Through AWS DeepComposer, you can experience generative AI in action and learn how to harness the latest in ML and AI. We provide an end-to-end kit that contains tools, techniques, processes, and best practices to run the event.
Since AWS re:Invent 2017, our team has launched three thought-leading products aimed at getting developers hands-on with the latest machine learning technologies in a fun way:
Designed specifically to educate developers on generative AI, AWS DeepComposer includes tutorials, sample code, and training data in an immersive platform that can be used to build ML models with music as the medium of instruction. Developers, regardless of their background in ML or music, can get started with applying AI techniques including Generative Adversarial Networks (GANs), Autoregressive Convolutional Neural Networks (AR-CNNs), and Transformers to generate new musical notes and accompaniments. With AWS DeepComposer, even non-musicians can use one of the pre-built models we provide to embellish a simple eight-bar tune into a unique composition, while more advanced users can train and optimize ML models to create original music.
We have been running AI music events using AWS DeepComposer with customers, our partners, and internally within AWS, and these events have drawn strong interest and engagement from both technical and non-technical users. Participants enjoy the opportunity to learn the latest in ML, compose music without the need for any knowledge of music theory, and win prizes in a talent-show-style competition. As a logical next step, we have packaged the resources required for you to run your own AI music challenge for employees, partners, and customers at corporate or meet-up style events. This not only scales out the educational experience but also helps accelerate ML adoption in your organization.
We walk you through the following topics in this post:
To run this event, we suggest that you, as the organizer, start planning and preparing 3 weeks in advance to ensure you have everything in place. To participate, all that is needed is access to the AWS Management Console, and specifically to AWS DeepComposer.
The purpose of the competition is to make learning fun and engaging by presenting a real use case of generative AI. You can organize the event multiple ways, for example:
You can choose a format that fits your participant schedule. The more time you give participants, the more time they have to experiment with different options to compose some really cool music. In this section, we provide a sample agenda encompassing the options we discussed; feel free to modify them based on how you organize your event. We go through some of these items in more detail in the rest of this post.
Note on the judging process: As the following agenda and the rest of this post show, we have two options for judging the musical tracks to decide the winners. We show you how to organize a team of human judges, and we also provide an ML model (the AI Judge) that can identify similarity with Bach’s music. In our events, we select winners using the human judges for the main event and the AI Judge as a special category.
Because the following agenda example runs for several hours, make sure you add a 15-minute break every 2 hours or so. We have included buffer time in the session durations to accommodate the breaks throughout the day.
| Time | Session | Type | Description |
|---|---|---|---|
| 9 AM – 9:15 AM | Agenda Walkthrough | Presentation | Walk participants through the agenda and logistics for the day. |
| 9:15 AM – 9:45 AM | Introduction to ML at AWS | Presentation | Introduce the three levels of the ML stack at AWS. This helps to set context for participants for the event. For more information, see Machine Learning on AWS. |
| 9:45 AM – 10 AM | AWS DeepComposer Kahoot | Game | The competition is all about making learning fun, so we have a quick icebreaker to get everyone’s game face on. Kahoot is an interactive and engaging way to take a fun quiz on AWS DeepComposer. You can collect questions from the learning capsules in the Music Studio, or you can make this an organization-specific exercise. We typically present the winner of the Kahoot game with Amazon swag, such as stickers. We announce this upfront to encourage participation. |
| 10 AM – 10:15 AM | AWS DeepComposer Music Studio Walkthrough | Demo | For many of the participants, this is the first time they have been exposed to the AWS DeepComposer Music Studio, so we spend 10–15 minutes walking through the steps necessary for creating their own compositions. Facilitators demonstrate how to navigate the Music Studio to generate compositions. For a video demonstrating the Music Studio walkthrough, see “AWS DeepComposer – Music Studio Walkthrough”. |
| 10:15 AM – 10:45 AM | ML Algorithms in AWS DeepComposer | Presentation | We introduce participants to the ML algorithms that power AWS DeepComposer, such as GANs, AR-CNNs, and Transformers. |
| 10:45 AM – 11:30 AM | Composing Music in AWS DeepComposer | Lab | In this interactive lab, participants try out AWS DeepComposer features, play with the ML algorithms, and generate AI music, all under the prescriptive guidance of the organizers. This session serves as a precursor to the competition. |
| 11:30 AM – 12:30 PM | AWS DeepComposer Got Talent Kickoff | Presentation | Organizers walk participants through competition guidelines, rules, best practices, SoundCloud setup, the scoring process, and leaderboards, and introduce the judges. |
| 12:30 PM – 1 PM | Lunch | | |
| 1 PM – 3 PM | AWS DeepComposer Got Talent Competition | Activity | Game on! The latest submission time for compositions is 3 PM. |
| 3 PM – 4 PM | Create Playlists for Judges | Activity | Organizers create separate SoundCloud playlists for each of the judges. Depending on the number of participants and size of the event, you can have each of the judges listen to all of the tracks and take a weighted average score. |
| End of Day 1 | | | |
| 9 AM – 10:30 AM | Judge Music Compositions: Qualifiers | Activity | Judges listen to all the submissions and score them. A leaderboard keeps track of the scores and the top composers. |
| 10:30 AM – 12 PM | Judge Music Compositions: Finals | Activity | Judges listen to the top 15 (or more) tracks from the leaderboard and score these submissions again. From the revised leaderboard, the top three submissions are selected as winning submissions. |
| 11:30 AM – 12 PM | Run AI Judge | Activity | The ML model trained on Bach’s music is run against the compositions to determine the top three submissions that are closest in similarity. |
| 12 PM – 1 PM | Lunch | | |
| 1 PM – 2 PM | Awards Ceremony | Presentation | The top three composers for the main challenge and the top three composers selected by the AI Judge receive their awards. |
| End of Day 2 | | | |
The AWS DeepComposer Got Talent competition is intended to provide a fun backdrop for spending a day learning more about generative AI using AWS DeepComposer through the AWS DeepComposer console. Prizes may be at stake, so in the interest of fairness, we have created a short list of suggested rules for contestants to follow. You can come up with your own list or customize this list as needed.
Adapt the competition rules and format to the number of judges and contestants you expect. Each submission takes several minutes to judge, so you can scale the competition by increasing the number of judges, or by allocating contestants to specific judges so that each judge listens to fewer submissions in the preliminary rounds.
Event participants should either have or be provided the following before the start of the event:
Many of your participants may be unfamiliar with the AWS DeepComposer Music Studio, so spend 10–15 minutes walking through the steps necessary to create compositions. The Music Studio isn’t difficult to operate, but an overview of its functionality gives participants a fast path to creating compositions of their own.
The basic steps for creating a composition in the Music Studio that require a demonstration are:
Read more about getting started with the AWS DeepComposer Music Studio.
We recommend providing a 100–200 level overview of each of the three algorithms utilized within AWS DeepComposer. With a basic understanding of the use case and output for GANs, AR-CNNs, and Transformers, your participants can more effectively utilize AWS DeepComposer to create an AWSome composition.
We recommend delivering this section as a brief 15-minute presentation that walks the participants through a conceptual and visual overview of each algorithm. In our event, we condensed each of the learning modules included in the AWS DeepComposer console into a short slide deck that also included example musical samples as input and the generated composition from the model. In preparing this material, we recommend watching the re:Invent 2020 tech talk Getting Started with Generative AI with AWS DeepComposer.
Generative Adversarial Networks (GANs) are one of the most popular generative AI techniques available today. Their core concept is to pit two neural networks against each other to generate new creative content: one neural network generates new music, and the other discriminates and critiques the generated music to quantify how realistic it is.
In AWS DeepComposer, we use GANs to generate accompanying tracks to an existing melody. The generator produces a new arrangement and sends it to the discriminator to receive feedback to create better music in the next iteration. Over time, the generator gets better at creating realistic music and eventually fools the discriminator. At this point, we retrain the discriminator using the new data from the generator so that it becomes even better at directing the generator.
Autoregressive Convolutional Neural Networks (AR-CNNs) work to sequentially and iteratively modify a melody to match a specific style more closely, or to make it more harmonious by removing wrong notes. At every step of an AR-CNN, we add or remove a single note from the composition, and each note that is added or removed depends on the notes around that step. The key idea is that each edit made depends on the other notes that have been added or removed. As part of the training process, the piano roll is modified and compared against the complete piano roll to more closely analyze the dataset it’s trying to replicate. Over iterations of training, the model learns to add or remove notes from a song so it can convert an out-of-tune song into a more correct song, or add more chords and melodies of a different style into a composition.
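The one-edit-per-step loop described above can be illustrated with a toy sketch. This is not the AWS DeepComposer model: the hand-written rule in `propose_edit` is a hypothetical stand-in for the trained network, and it only ever proposes removing notes that fall outside the C major scale. The point is the shape of the loop, where each step changes a single note based on the current state of the melody.

```python
# Toy illustration only: a hand-written rule stands in for the trained CNN.
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}  # pitch classes of the C major scale


def propose_edit(notes):
    """Return ("remove", note) for the first out-of-scale note, or None.

    `notes` is a set of (time_step, midi_pitch) pairs. A real AR-CNN would
    instead score every possible add/remove given the surrounding notes.
    """
    for note in sorted(notes):
        if note[1] % 12 not in C_MAJOR:
            return ("remove", note)
    return None


def refine(notes, max_steps=100):
    """Apply one single-note edit per step until no edit is proposed."""
    notes = set(notes)
    for _ in range(max_steps):
        edit = propose_edit(notes)
        if edit is None:
            break
        action, note = edit
        if action == "remove":
            notes.discard(note)
        else:  # "add" is kept for the general shape; this toy rule never uses it
            notes.add(note)
    return notes


melody = {(0, 60), (1, 61), (2, 64), (3, 66), (4, 67)}
print(sorted(refine(melody)))  # [(0, 60), (2, 64), (4, 67)]
```

In the example, the C sharp (61) and F sharp (66) are removed one at a time, leaving C, E, and G.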
Transformers are state-of-the-art models for generating a complete set of sequential data such as text, music, and other time series data in a single operation. In contrast to sequential models like AR-CNNs, transformers take an entire sequential dataset such as a series of notes as a single input and operate on it all at once. Because of their ability to process an entire dataset as a single input, transformers stand out in their ability to capture near-term and long-term dependencies in a dataset. This enables the transformers within AWS DeepComposer to generate a longer version of an existing composition that is thematically and musically coherent with the original piece.
The ML algorithms used in AWS DeepComposer have already been applied to innovative and compelling use cases. Within the medical field, GANs are used to optimize the design process of new pharmaceutical compounds, and to improve cancer detection screening rates. Transformers are at the core of natural language processing, and power conversational AI, language translation, and text summarization.
In this section, we discuss what participants need to get started.
We recommend that each participant be provided a separate AWS account in order to compose their scores. For smaller events, manual account creation via the console and distribution via existing team communication channels or email is acceptable, but for large-scale events, more automated measures are preferred. For very small-scale events with fewer than 10 participants, you can also set up individual users in a single AWS account. This allows users to log in and compose within a single account without colliding with each other’s compositions.
If your account is already set up with AWS Organizations, a service that helps you centrally manage and govern your environment as you grow and scale your AWS resources, you can create an Organizational Unit dedicated to the AWS DeepComposer event, which allows you to restrict usage of those accounts to the Regions where AWS DeepComposer is available and the associated resources permitted to be created within those accounts. If your enterprise uses AWS Control Tower, a service that provides the easiest way to set up and govern a secure, multi-account AWS environment, you can also use Account Factory to provision the accounts. For more details about large-scale account provisioning and governance, see Automate account creation, and resource provisioning using AWS Service Catalog, AWS Organizations, and AWS Lambda.
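As a rough illustration of the Region restriction mentioned above, the following sketch builds a service control policy (SCP) document that denies requests made outside a list of allowed Regions, using the `aws:RequestedRegion` condition key. The function name and the `us-east-1` value are placeholders chosen for this example; check which Regions AWS DeepComposer is available in before using anything like this.

```python
import json


def region_restriction_scp(allowed_regions):
    """Build an SCP document that denies requests outside allowed_regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideEventRegions",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


# "us-east-1" is a placeholder Region for illustration.
print(json.dumps(region_restriction_scp(["us-east-1"]), indent=2))
```

A policy this broad can also block global services, so production policies usually exempt those services with a `NotAction` list. You would attach the resulting document to the Organizational Unit created for the event.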
If your enterprise allocates these accounts as part of your organization, you can use AWS Single Sign-On, which centrally manages access to multiple AWS accounts and business applications, or use your federation provider to allow users to access these accounts with a specific role created for the event. If you’re not using AWS Organizations, you need a list of account IDs, user names, and passwords.
For distribution, you can set up an application to allow users to self-obtain accounts, user names, and passwords for the event. AWS Amplify provides a set of tools and services that can be used together or on their own, to help front-end web and mobile developers build scalable full stack applications. Amplify also provides a simple method of creating a website that is accessible to specific users either via access controls like IP range, or by authentication like Amazon Cognito, an AWS service that provides simple and secure user signup, sign-in, and access control. When the application is ready, the user can log in to retrieve their AWS account credentials, access the console, and go to the AWS DeepComposer console to start composing.
This section is an interactive hands-on lab where participants explore the AWS DeepComposer Music Studio and create their first compositions on their own. Prior to this lab, ensure that all participants have lab accounts or AWS accounts to access the AWS DeepComposer console. We recommend that this section have at least two moderators, in addition to the presenter, to monitor the chats and guide the participants. At a high level, the lab is split into the following sections:
For detailed instructions on the workshop, see AWS DeepComposer Lab. Throughout the lab, encourage participants to try out the various settings on both input and model inference pages to make their composition unique.
Share the following best practices with the participants:
AWS DeepComposer provides built-in integration with SoundCloud, a cloud-based music playlist app, which participants use to submit their compositions for the event.
Before the competition begins, your participants should be directed to SoundCloud.com to create an account. If your participants already have SoundCloud accounts, they can use those instead.
Prior to the start of your event, decide upon and communicate a naming convention for your participant submissions. This allows you to easily search for and manage the submissions for judgment. The suggested naming convention is #eventName-userName-compositionName. For example, for an AWS Event called AWS Composition Jam and a user with the internal user name of joshr, the submitted track name would be #aws-composition-jam-joshr-BaroqueBeatz05.
Make sure participants don’t inadvertently submit their composition to the Chartbusters challenge.
You can decide if your participants are permitted to submit more than one composition for judgment, but we suggest having a defined upper limit of submissions. All users and staff can then use SoundCloud’s built-in search tool to locate submissions. For example, using the preceding naming convention, we can search based on the name prefix.
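The naming convention and prefix search described above can be sketched in a few lines. The helper names are hypothetical, and the rule for turning an event name into a lowercase, hyphenated prefix is inferred from the single example in this post, so adjust it to whatever convention you announce.

```python
import re


def event_prefix(event_name):
    """Turn an event name into the lowercase, hyphenated track prefix."""
    slug = re.sub(r"[^a-z0-9]+", "-", event_name.lower()).strip("-")
    return f"#{slug}-"


def track_name(event_name, user_name, composition_name):
    """Build a track name as #eventName-userName-compositionName."""
    return f"{event_prefix(event_name)}{user_name}-{composition_name}"


def filter_event_tracks(titles, event_name):
    """Keep only the track titles that start with the event prefix."""
    prefix = event_prefix(event_name)
    return [title for title in titles if title.startswith(prefix)]


print(track_name("AWS Composition Jam", "joshr", "BaroqueBeatz05"))
# #aws-composition-jam-joshr-BaroqueBeatz05
```

Organizers can run `filter_event_tracks` over a list of track titles copied from SoundCloud to separate event submissions from unrelated tracks.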
We wanted an easy, repeatable, and accessible mechanism for our judges to score compositions. To that end, we built an application in Amazon Honeycode, which lets you build mobile and web applications without writing any code. As our judges submit scores, the spreadsheet and filtering functionality in Honeycode allows for real-time aggregation and ranking of compositions.
We have included sample composers and scores in your table to help validate your Honeycode tables and application. Follow the instructions provided in the README for Honeycode tables in our GitHub repository to complete the build.
The compositions submitted by the participants are queued for judging. There are two types of judges: human and AI. The human judges are a team of leaders from your organization that you select and train to participate in the competition. The AI Judge is an ML model that we provide for you to run inference on the musical tracks submitted by participants.
Human judges listen to the tracks and use the scoring tool to record their ratings. When selecting candidates for the human judging team, make sure they have a musical disposition. Judges don’t need to have musical talent, but should be open to new experiences, enjoy what they’re doing, and carry a sense of objectivity. In short, select folks that subscribe to the following tenets:
The judges use the scoring tool that lists the evaluation criteria and an option to select one of several scores for each. The music is assessed based on the following criteria:
For the judging process, we use a five-point scale. For each criterion, the judges pick a score of 1–5 based on their listening experience. The cumulative score drives the composer’s position on the leaderboard. The scores are titled as follows:
The scoring process involves the following steps:
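To make the arithmetic behind the leaderboard concrete, here is a minimal sketch of how per-criterion scores on the five-point scale could be rolled up into a ranking. It is an illustration rather than the Honeycode application itself: the criterion names are placeholders, and averaging each composer's cumulative score across judges is an assumption made for this example.

```python
from collections import defaultdict
from statistics import mean


def leaderboard(score_sheets):
    """Rank composers by their average cumulative score across judges.

    score_sheets is a list of (composer, judge, scores) tuples, where
    scores maps a criterion name to a rating from 1 to 5.
    """
    totals = defaultdict(list)
    for composer, _judge, scores in score_sheets:
        if not all(1 <= value <= 5 for value in scores.values()):
            raise ValueError(f"scores for {composer} must be between 1 and 5")
        # Cumulative score for one judge: the sum over all criteria.
        totals[composer].append(sum(scores.values()))
    ranked = [(composer, mean(sums)) for composer, sums in totals.items()]
    return sorted(ranked, key=lambda row: row[1], reverse=True)


# Placeholder criteria and made-up scores, for illustration only.
sheets = [
    ("ana", "judge1", {"criterion_a": 5, "criterion_b": 4}),
    ("ana", "judge2", {"criterion_a": 3, "criterion_b": 4}),
    ("bo", "judge1", {"criterion_a": 2, "criterion_b": 2}),
]
for composer, score in leaderboard(sheets):
    print(composer, score)
```

Averaging keeps the ranking fair when judges listen to different numbers of tracks. If every judge scores every track, a plain sum gives the same order.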
AWS DeepComposer creates a completely original melody in seconds, powered by AI. It seems only appropriate that the composition is also judged by an AI model. Data scientists used samples of Bach and non-Bach input melodies to train an ML model that can provide a score of how Bach-like a composition is. MIDI (Musical Instrument Digital Interface) is a file format that is popularly used to store music, and the input melodies are provided to the model in the MIDI format. A Python library called Pypianoroll is used to read the MIDI files into a multitrack object, which allows us to easily wrangle and manipulate the music files.
The dataset for training is prepared with both positive examples (original Bach music MIDI files) and negative samples (non-Bach MIDI files). A CNN-based classifier is then trained on these samples to classify whether an input melody is in Bach style or not. To get the final score for each composition, the confidence score of the CNN model’s output is combined with the distribution similarity on pitch frequency of the input melody to original Bach music.
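The post does not spell out how the two signals are combined, so the following is only a sketch of one plausible approach. It assumes the pitch-frequency similarity is a cosine similarity between normalized histograms of MIDI pitch numbers, and that the final score is a weighted average of that similarity and the classifier confidence. The function names and the `weight` parameter are hypothetical.

```python
import math


def pitch_histogram(pitches):
    """Normalized histogram over the 128 MIDI pitch numbers."""
    counts = [0] * 128
    for pitch in pitches:
        if 0 <= pitch < 128:
            counts[pitch] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else counts


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is empty)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bach_score(cnn_confidence, melody_pitches, reference_pitches, weight=0.5):
    """Blend classifier confidence with pitch-distribution similarity.

    The weighted average is an assumption for illustration; the actual
    AI Judge may combine the two signals differently.
    """
    similarity = cosine_similarity(
        pitch_histogram(melody_pitches), pitch_histogram(reference_pitches)
    )
    return weight * cnn_confidence + (1 - weight) * similarity
```

For example, `bach_score(0.8, melody, reference)` with `melody` and `reference` given as lists of MIDI pitch numbers returns a value between 0 and 1, with higher values meaning more Bach-like under these assumptions.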
To run the AI judge, the input files need to be in the MIDI format, and melodies can’t be more than eight bars long.
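A quick way to screen uploads for the format requirement is to check the file header. Standard MIDI Files begin with a 14-byte header chunk: the ID `MThd`, a 4-byte length, and three 2-byte fields for format, track count, and time division, all big-endian. The sketch below parses that header. The function name is hypothetical, and it does not check the eight-bar limit, which requires reading the note data.

```python
import struct


def read_midi_header(data):
    """Parse the header chunk of a Standard MIDI File.

    Raises ValueError if the bytes do not start with a valid MThd chunk.
    """
    if len(data) < 14:
        raise ValueError("too short to be a MIDI file")
    chunk_id, length, midi_format, tracks, division = struct.unpack(
        ">4sIHHH", data[:14]
    )
    if chunk_id != b"MThd" or length < 6:
        raise ValueError("missing MThd header chunk")
    return {"format": midi_format, "tracks": tracks, "division": division}


# A hand-built header: format 1, two tracks, 480 ticks per quarter note.
header = b"MThd" + struct.pack(">IHHH", 6, 1, 2, 480)
print(read_midi_header(header))
```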
Before you run the judge, deploy a web application to collect input melodies. In this first step, we use the AWS Serverless Application Model (AWS SAM) to deploy an application that lets users upload files to an S3 bucket using Amazon S3 presigned URLs.
Make sure that the AWS Identity and Access Management (IAM) role you use to access the AWS account has sufficient permissions to deploy an AWS CloudFormation template, Lambda functions, and Amazon S3.
The new file lets users upload only MIDI files.
We update this function to receive a file name along with the file as input, and upload the incoming file under a specific name.
Refer to the AWS Samples repository for more detailed instructions on deploying the template. This step might take several minutes.
It should look similar to https://abcd123345.execute-api.us-west-2.amazonaws.com/.
The upload URL is your endpoint with the /uploads route added, and looks similar to https://abcd123345.execute-api.us-west-2.amazonaws.com/uploads.
You can use the same bucket that hosts the leaderboard.
We recommend that the unique name for each composition be the participant’s work alias, email, or a firstname_lastname format to ensure that the composition can be traced back to the participant. If they fail to provide a user name, the application stores the melody with a random file name and they lose their chance of winning.
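From a participant script, the upload is typically a two-step exchange: request a presigned URL from the `/uploads` route, then send the MIDI bytes to that URL with an HTTP PUT. The sketch below only builds the two requests and does not send them. The `name` query parameter and the `audio/midi` content type are assumptions for illustration, so check the deployed application for the values it actually expects.

```python
from urllib.parse import urlencode
from urllib.request import Request


def presign_request_url(upload_endpoint, file_name):
    """URL that asks the API for a presigned upload URL.

    The "name" query parameter is a hypothetical way of passing the
    unique composition name to the application.
    """
    return f"{upload_endpoint}?{urlencode({'name': file_name})}"


def put_request(presigned_url, midi_bytes):
    """Build (but do not send) the PUT request that uploads the file."""
    return Request(
        presigned_url,
        data=midi_bytes,
        method="PUT",
        # Assumed content type; it must match what the URL was signed with.
        headers={"Content-Type": "audio/midi"},
    )


endpoint = "https://abcd123345.execute-api.us-west-2.amazonaws.com/uploads"
print(presign_request_url(endpoint, "firstname_lastname"))
```

Passing the built request to `urllib.request.urlopen` would perform the upload once the presigned URL has been obtained from the API response.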
You’re now ready to run the AI judge. The inference code to run the AI Judge is provided in the repository. A Jupyter notebook is provided with the repository to run the inference through an interactive session. This can be run either on your local machine, if you have Jupyter installed, or on a SageMaker notebook instance. If you use SageMaker, we recommend a p3 instance for faster processing of the input melodies.
The folder contains three files: a pre-trained model, inference code, and the IPython notebook file that you can use to run the AI judge.
The notebook downloads all input files from the S3 bucket, runs the inference script to generate scores based on the similarity to Bach music, and prints the results.
The last cell of the notebook outputs the scores for each composition in descending order.
A good way to drive engagement for the event is to select a senior leader (or multiple senior leaders) to present the awards and facilitate the ceremony. We provide the following recommendations based on our experience.
We have three different award categories:
We recommend the following order when the senior leaders present the awards:
In this post, we presented a detailed walkthrough on how to prepare for and run your own AI-powered musical challenge for your employees, customers, and partners: AWS DeepComposer Got Talent.
To get you started with AWS DeepComposer we provide a Free Tier. Please refer to https://aws.amazon.com/deepcomposer/pricing/ for more details.
Music is always fun, and when combined with the potential to learn the latest in ML, and also win prizes, you are sure to drive a lot of engagement. We look forward to hearing your feedback. Have fun with your event!
Maryam Rezapoor is a Senior Product Manager with the AWS DeepLabs team based in Santa Clara, CA. She works on developing products to put Machine Learning in the hands of everyone. She loves hiking through the US national parks and is currently training for a 1-day Grand Canyon Rim to Rim hike. She is a fan of Metallica and Evanescence. The drummer, Lars Ulrich, has inspired her to pick up those sticks and play the drums while singing “Nothing Else Matters.”
Jana Gnanachandran is an Enterprise Solutions Architect at AWS based in Houston, Texas, focusing on Data Analytics, AI/ML, and Serverless platforms. In his spare time, he enjoys playing tennis, learning new songs on his keyboard, and photography.
Durga Sury is a Data Scientist with AWS Professional Services, based in Houston, TX. Her current interests are Natural Language Processing, AutoML and MLOps. Outside of work, she loves binging on crime shows, motorcycle rides, and hiking with her husky. She is trained in Indian classical music and hopes to play Metallica on the guitar one day.
Josh Schairbaum is an Enterprise Solutions Architect based in Austin, TX. When he’s not helping customers derive and implement the technical strategy they need to meet their business goals, Josh spends a lot of time with his kids, exercising, and diving deeper into Hip Hop, both old and new.
David Coon is a Senior Technical Account Manager based out of Houston, TX. He loves to dive deep into his customers’ technical and operational issues, and has a passion for music (especially by Snarky Puppy), machine learning, and automation. When he’s not on the clock, David is usually walking his dogs, tinkering with a DIY project at home, or re-watching The Expanse.
David Bounds is an Enterprise Solutions Architect based in Atlanta, GA. When he isn’t obsessing over his customers, David is endlessly tinkering with his 3D Printers, running, cooking, or discovering a new, expensive, and time-consuming hobby. David is usually accompanied by a very worried Boxer who quite likes anything by the Wu-Tang Clan.
Henry Wang is a Data Scientist at Amazon Machine Learning Solutions Lab where he helps AWS customers across industries to adopt Cloud and AI to tackle various challenges. In his spare time, he enjoys playing tennis and golf, watching StarCraft II tournaments, and reading while listening to New Age music.
Prem Ranga is an Enterprise Solutions Architect based out of Houston, Texas. He is part of the Machine Learning Technical Field Community and loves working with customers on their ML and AI journey. Prem can moonwalk just like MJ, has recently become enamored with the deep sounds of the Liquid Mind collection, Max Richter, and Nils Frahm, and considers the soundtrack of Ad Astra profound.