Amazon Personalize is excited to announce the new Trending-Now recipe to help you recommend items gaining popularity at the fastest pace among your users.
Amazon Personalize is a fully managed machine learning (ML) service that makes it easy for developers to deliver personalized experiences to their users. It enables you to improve customer engagement by powering personalized product and content recommendations in websites, applications, and targeted marketing campaigns. You can get started without any prior ML experience, using APIs to easily build sophisticated personalization capabilities in a few clicks. All your data is encrypted to be private and secure, and is only used to create recommendations for your users.
User interests can change based on a variety of factors, such as external events or the interests of other users. It’s critical for websites and apps to tailor their recommendations to these changing interests to improve user engagement. With Trending-Now, you can surface items from your catalog that are rising in popularity faster than other items, such as trending news, popular social content, or newly released movies, which helps users discover items that are engaging their peers. Amazon Personalize also lets you define the time period over which trends are calculated to suit your business context, with options of 30 minutes, 1 hour, 3 hours, or 1 day, based on the most recent interactions data from users.
In this post, we show how to use this new recipe to recommend top trending items to your users.
Trending-Now identifies the top trending items by calculating the increase in interactions that each item has over configurable intervals of time. The items with the highest rate of increase are considered trending items. The time is based on timestamp data in your interactions dataset. You can specify the time interval by providing a trend discovery frequency when you create your solution.
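To make the idea concrete, the following is a rough sketch of ranking items by their increase in interactions between two consecutive intervals. It is not Amazon Personalize's actual implementation, which this post doesn't describe. The function name `top_trending`, the sample data, and the use of a plain count difference as the "rate of increase" are all assumptions made for illustration.

```python
import pandas as pd


def top_trending(interactions: pd.DataFrame, now: int,
                 interval_seconds: int, top_k: int = 3) -> pd.DataFrame:
    """Rank items by the increase in interaction count between
    the previous interval and the most recent interval."""
    current_start = now - interval_seconds
    previous_start = now - 2 * interval_seconds

    ts = interactions["TIMESTAMP"]
    current = interactions[(ts > current_start) & (ts <= now)]
    previous = interactions[(ts > previous_start) & (ts <= current_start)]

    # Align per-item counts from both intervals; items missing from one
    # interval get a count of 0.
    counts = pd.DataFrame({
        "current": current.groupby("ITEM_ID").size(),
        "previous": previous.groupby("ITEM_ID").size(),
    }).fillna(0).astype(int)
    counts["increase"] = counts["current"] - counts["previous"]
    return counts.sort_values("increase", ascending=False).head(top_k)


# Hypothetical sample: item A gains interactions, item B loses them.
sample = pd.DataFrame({
    "USER_ID": ["u1", "u2", "u3", "u1", "u2", "u3", "u4"],
    "ITEM_ID": ["A", "A", "A", "B", "B", "A", "B"],
    "TIMESTAMP": [9_900, 9_800, 9_700, 9_950, 8_500, 8_600, 8_700],
})
trending = top_trending(sample, now=10_000, interval_seconds=1_000)
print(trending)
```

In this sample, item A ranks first because its count rises from 1 to 3, while item B's count falls from 2 to 1.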
The Trending-Now recipe requires an interactions dataset, which contains a record of the individual user and item events (such as clicks, watches, or purchases) on your website or app along with the event timestamps. You can use the parameter Trend discovery frequency to define the time intervals over which trends are calculated and refreshed. For example, if you have a high traffic website with rapidly changing trends, you can specify 30 minutes as the trend discovery frequency. Every 30 minutes, Amazon Personalize looks at the interactions that have been ingested successfully and refreshes the trending items. This recipe also allows you to capture and surface any new content that has been introduced in the last 30 minutes and has seen a higher degree of interest from your user base than any preexisting catalog items. For any parameter values that are greater than 2 hours, Amazon Personalize automatically refreshes the trending item recommendations every 2 hours to account for new interactions and new items.
Datasets that have low traffic but use a 30-minute value can see poor recommendation accuracy due to sparse or missing interactions data. The Trending-Now recipe requires that you provide interaction data for at least two past time periods (this time period is your desired trend discovery frequency). If interaction data doesn’t exist for the last 2 time periods, Amazon Personalize will replace the trending items with popular items until the required minimum data is available.
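Before choosing a short trend discovery frequency, you can check locally whether your data actually covers the two most recent periods. The following is a minimal sketch of such a pre-check. The function name `has_two_periods_of_data` and the "at least one interaction per period" criterion are assumptions; the exact criteria Amazon Personalize applies aren't described here.

```python
import pandas as pd


def has_two_periods_of_data(timestamps: pd.Series, now: int,
                            period_seconds: int) -> bool:
    """Return True if each of the two most recent periods
    contains at least one interaction timestamp."""
    current_start = now - period_seconds
    previous_start = now - 2 * period_seconds
    in_current = ((timestamps > current_start) & (timestamps <= now)).any()
    in_previous = ((timestamps > previous_start)
                   & (timestamps <= current_start)).any()
    return bool(in_current and in_previous)


# Hypothetical timestamps (UNIX epoch seconds) with a 1,000-second period.
covered = has_two_periods_of_data(pd.Series([9_500, 8_500]), 10_000, 1_000)
sparse = has_two_periods_of_data(pd.Series([9_500, 9_600]), 10_000, 1_000)
print(covered, sparse)
```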
The Trending-Now recipe is available for both custom dataset groups and video-on-demand domain dataset groups. In this post, we demonstrate how to tailor your recommendations to fast-changing trends in user interest with the new Trending-Now recipe, using a media use case with a custom dataset group. The following diagram illustrates the solution workflow.
For example, in video-on-demand applications, you can use this feature to show what movies are trending in the last 1 hour by specifying 1 hour for your trend discovery frequency. For every 1 hour of data, Amazon Personalize identifies the items with the greatest rate of increase in interactions since the last evaluation. Available frequencies include 30 minutes, 1 hour, 3 hours, and 1 day.
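If you use the SDK rather than the console, the trend discovery frequency is passed when you create the solution. The following sketch builds the request for the boto3 `create_solution` call. The helper name and the ARN placeholder are hypothetical, and the `trend_discovery_frequency` key under `featureTransformationParameters` reflects our understanding of the API, so verify it against the Amazon Personalize Developer Guide before relying on it.

```python
# Frequencies listed in this post.
FREQUENCIES = ("30 minutes", "1 hour", "3 hours", "1 day")


def build_trending_now_solution_request(name: str, dataset_group_arn: str,
                                        frequency: str = "1 hour") -> dict:
    """Build keyword arguments for personalize.create_solution()."""
    if frequency not in FREQUENCIES:
        raise ValueError(f"frequency must be one of {FREQUENCIES}")
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "recipeArn": "arn:aws:personalize:::recipe/aws-trending-now",
        "solutionConfig": {
            # Assumed parameter location; check the developer guide.
            "featureTransformationParameters": {
                "trend_discovery_frequency": frequency,
            },
        },
    }


request = build_trending_now_solution_request(
    name="trending-now-solution",  # hypothetical name
    dataset_group_arn="arn:aws:personalize:REGION:ACCOUNT_ID:dataset-group/NAME",
    frequency="30 minutes",
)

# To submit the request (requires AWS credentials):
# import boto3
# personalize = boto3.client("personalize")
# response = personalize.create_solution(**request)
```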
To use the Trending-Now recipe, you first need to set up Amazon Personalize resources on the Amazon Personalize console. Create your dataset group, import your data, train a solution version, and deploy a campaign. For full instructions, see Getting started.
For this post, we have followed the console approach to deploy a campaign using the new Trending-Now recipe. Alternatively, you can build the entire solution using the SDK approach with this provided notebook. For both approaches, we use the MovieLens public dataset.
Complete the following steps to prepare your dataset:
For the interactions data, we use ratings history from the movies review dataset, MovieLens.
Use the following Python code to curate the interactions dataset from the MovieLens public dataset.
    import pandas as pd

    # Download and extract the MovieLens 25M dataset (notebook shell commands).
    data_dir = "blog_data"
    !mkdir $data_dir
    !cd $data_dir && wget http://files.grouplens.org/datasets/movielens/ml-25m.zip
    !cd $data_dir && unzip ml-25m.zip

    dataset_dir = data_dir + "/ml-25m/"

    # Load the ratings history and keep only the columns needed for training.
    interactions_df = pd.read_csv(dataset_dir + "ratings.csv")
    interactions_df = interactions_df.drop(columns=["rating"])

    # Rename the columns to the names Amazon Personalize expects.
    interactions_df = interactions_df.rename(
        columns={"userId": "USER_ID", "movieId": "ITEM_ID", "timestamp": "TIMESTAMP"}
    )

    interactions_file = "curated_interactions_training_data.csv"
    interactions_df.to_csv(interactions_file, index=False)
The MovieLens ratings data records interactions between users and items: each row contains a user ID, an item (movie) ID, a rating, and the time the interaction took place (a timestamp given as UNIX epoch time). The dataset also contains movie title information to map the movie ID to the actual title and genres. The following table is a sample of the dataset.
USER_ID | ITEM_ID | TIMESTAMP | TITLE | GENRES |
116927 | 1101 | 1105210919 | Top Gun (1986) | Action|Romance |
158267 | 719 | 974847063 | Multiplicity (1996) | Comedy |
55098 | 186871 | 1526204585 | Heal (2017) | Documentary |
159290 | 59315 | 1485663555 | Iron Man (2008) | Action|Adventure|Sci-Fi |
108844 | 34319 | 1428229516 | Island, The (2005) | Action|Sci-Fi|Thriller |
85390 | 2916 | 953264936 | Total Recall (1990) | Action|Adventure|Sci-Fi|Thriller |
103930 | 18 | 839915700 | Four Rooms (1995) | Comedy |
104176 | 1735 | 985295513 | Great Expectations (1998) | Drama|Romance |
97523 | 1304 | 1158428003 | Butch Cassidy and the Sundance Kid (1969) | Action|Western |
87619 | 6365 | 1066077797 | Matrix Reloaded, The (2003) | Action|Adventure|Sci-Fi|Thriller|IMAX |
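The TIMESTAMP values in the sample are UNIX epoch seconds. If you want to sanity-check them, the following minimal sketch converts one to a readable UTC time using only the Python standard library. The helper name `epoch_to_utc` is hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds: int) -> str:
    """Format UNIX epoch seconds as a UTC date and time string."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# Timestamp of the first row in the sample table (Top Gun).
readable = epoch_to_utc(1105210919)
print(readable)
```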
The curated dataset includes USER_ID, ITEM_ID (movie ID), and TIMESTAMP to train the Amazon Personalize model. These are the required fields to train a model with the Trending-Now recipe. The following table is a sample of the curated dataset.
USER_ID | ITEM_ID | TIMESTAMP |
48953 | 529 | 841223587 |
23069 | 1748 | 1092352526 |
117521 | 26285 | 1231959564 |
18774 | 457 | 848840461 |
58018 | 179819 | 1515032190 |
9685 | 79132 | 1462582799 |
41304 | 6650 | 1516310539 |
152634 | 2560 | 1113843031 |
57332 | 3387 | 986506413 |
12857 | 6787 | 1356651687 |
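When you import this data, Amazon Personalize needs an Avro-format schema that describes the columns. The following sketch defines a schema for the three required fields and checks a CSV header against it. The helper `missing_columns` is hypothetical, and the schema shown is a minimal one that we assume matches the curated file; add fields only if your data contains them.

```python
import json

# Minimal interactions schema for the three required fields.
INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}


def missing_columns(csv_header: str) -> list:
    """Return the schema fields that are absent from a CSV header line."""
    required = [field["name"] for field in INTERACTIONS_SCHEMA["fields"]]
    present = [column.strip() for column in csv_header.split(",")]
    return [name for name in required if name not in present]


schema_json = json.dumps(INTERACTIONS_SCHEMA)
print(missing_columns("USER_ID,ITEM_ID,TIMESTAMP"))

# To register the schema (requires AWS credentials):
# import boto3
# boto3.client("personalize").create_schema(
#     name="trending-now-interactions-schema",  # hypothetical name
#     schema=schema_json,
# )
```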
After the dataset import job is complete, you’re ready to train your model.
In Amazon Personalize, you use a campaign to make recommendations for your users. In this step, you create a campaign using the solution you created in the previous step and get the Trending-Now recommendations:
After you create or update your campaign, you can get a recommended list of items that are trending, sorted from highest to lowest. On the campaign (trending-now-campaign) Personalization API tab, choose Get recommendations.
The following screenshot shows the campaign detail page with results from a GetRecommendations call that includes the recommended items and the recommendation ID.
The results from the GetRecommendations call include the IDs of recommended items. The following table is a sample after mapping the IDs to the actual movie titles for readability. The code to perform the mapping is provided in the attached notebook.
ITEM_ID | TITLE |
356 | Forrest Gump (1994) |
318 | Shawshank Redemption, The (1994) |
58559 | Dark Knight, The (2008) |
33794 | Batman Begins (2005) |
44191 | V for Vendetta (2006) |
48516 | Departed, The (2006) |
195159 | Spider-Man: Into the Spider-Verse (2018) |
122914 | Avengers: Infinity War – Part II (2019) |
91974 | Underworld: Awakening (2012) |
204698 | Joker (2019) |
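The notebook's mapping code isn't reproduced here. As one possible approach, the following sketch joins the `itemList` of a GetRecommendations response to the `movieId` and `title` columns of the MovieLens `movies.csv` file. The function name and the small in-memory samples are hypothetical stand-ins for the real response and file.

```python
import pandas as pd


def map_item_titles(response: dict, movies: pd.DataFrame) -> pd.DataFrame:
    """Attach movie titles to the item IDs in a GetRecommendations response,
    preserving the recommendation order."""
    item_ids = [item["itemId"] for item in response["itemList"]]
    # Item IDs come back as strings, so index the titles by string movie ID.
    titles = movies.astype({"movieId": str}).set_index("movieId")["title"]
    return pd.DataFrame({
        "ITEM_ID": item_ids,
        "TITLE": [titles.get(item_id) for item_id in item_ids],
    })


# Stand-ins for the API response and for pd.read_csv(".../movies.csv").
sample_response = {"itemList": [{"itemId": "356"}, {"itemId": "318"}]}
sample_movies = pd.DataFrame({
    "movieId": [318, 356],
    "title": ["Shawshank Redemption, The (1994)", "Forrest Gump (1994)"],
})
mapped = map_item_titles(sample_response, sample_movies)
print(mapped)
```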
After you create a solution version using the aws-trending-now recipe, Amazon Personalize will identify the top trending items by calculating the increase in interactions that each item has over configurable intervals of time. The items with the highest rate of increase are considered trending items. The time is based on timestamp data in your interactions dataset.
Now let’s provide the latest interactions to Amazon Personalize to calculate the trending items. We can provide the latest interactions using real-time ingestion by creating an event tracker or through a bulk data upload with a dataset import job in incremental mode. In the notebook, we have provided sample code to individually import the latest real-time interactions data into Amazon Personalize using the event tracker.
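For the real-time path, each interaction is sent through the event tracker with the `PutEvents` API. The following sketch builds the request for the boto3 `personalize-events` client. The helper name, the tracking ID placeholder, the session ID, and the `watch` event type are hypothetical; this is not the notebook's code.

```python
import time


def build_put_events_request(tracking_id: str, user_id: str, session_id: str,
                             item_id: str, event_type: str = "watch",
                             sent_at: int = None) -> dict:
    """Build keyword arguments for personalize_events.put_events()
    describing a single interaction."""
    return {
        "trackingId": tracking_id,
        "userId": user_id,
        "sessionId": session_id,
        "eventList": [{
            "eventType": event_type,
            "itemId": item_id,
            # Event time in UNIX epoch seconds; defaults to the current time.
            "sentAt": int(time.time()) if sent_at is None else sent_at,
        }],
    }


event_request = build_put_events_request(
    tracking_id="YOUR_TRACKING_ID",  # returned when you create the event tracker
    user_id="20371",
    session_id="session-1",
    item_id="260",
    sent_at=1700000000,
)

# To send the event (requires AWS credentials):
# import boto3
# boto3.client("personalize-events").put_events(**event_request)
```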
For this post, we provide the latest interactions as a bulk data upload with a dataset import job in incremental mode. Use the following Python code to generate dummy incremental interactions, which you then upload using a dataset import job.
    import time

    import pandas as pd

    # Randomly selected USER_ID values for generating incremental interactions.
    users_list = ['20371', '63409', '54535', '119138', '58953', '82982', '19044',
                  '139171', '98598', '23822', '112012', '121380', '2660', '46948',
                  '5656', '68919', '152414', '31234', '88240', '40395', '49296',
                  '80280', '150179', '138474', '124489', '145218', '141810', '82607']

    # Randomly selected ITEM_ID values for generating incremental interactions.
    items_list = ['153', '2459', '1792', '3948', '2363', '260', '61248', '6539',
                  '2407', '8961']

    # Start one hour in the past and advance one second per interaction.
    time_epoch = int(time.time()) - 3600

    rows = []
    for _ in range(10):
        for user_id in users_list:
            for item_id in items_list:
                time_epoch += 1
                rows.append([user_id, item_id, time_epoch])

    inc_df = pd.DataFrame(rows, columns=["USER_ID", "ITEM_ID", "TIMESTAMP"])

    incremental_file = 'interactions_incremental_data.csv'
    inc_df.to_csv(incremental_file, index=False)
We have synthetically generated these interactions by randomly selecting a few values for USER_ID and ITEM_ID, and generating interactions between those users and items with latest timestamps. The following table contains the randomly selected ITEM_ID values that are used for generating incremental interactions.
ITEM_ID | TITLE |
153 | Batman Forever (1995) |
260 | Star Wars: Episode IV – A New Hope (1977) |
1792 | U.S. Marshals (1998) |
2363 | Godzilla (Gojira) (1954) |
2407 | Cocoon (1985) |
2459 | Texas Chainsaw Massacre, The (1974) |
3948 | Meet the Parents (2000) |
6539 | Pirates of the Caribbean: The Curse of the Bla… |
8961 | Incredibles, The (2004) |
61248 | Death Race (2008) |
Upload the incremental interactions data by selecting Append to current dataset (or use incremental mode if using APIs), as shown in the following screenshot.
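With the SDK, the equivalent of Append to current dataset is a dataset import job with `importMode` set to `INCREMENTAL`. The following sketch builds the request for the boto3 `create_dataset_import_job` call. The helper name, job name, bucket, and ARN placeholders are hypothetical, and the CSV must already be uploaded to Amazon S3.

```python
def build_incremental_import_request(job_name: str, dataset_arn: str,
                                     s3_uri: str, role_arn: str) -> dict:
    """Build keyword arguments for personalize.create_dataset_import_job()
    that append records to an existing dataset."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with 's3://'")
    return {
        "jobName": job_name,
        "datasetArn": dataset_arn,
        "dataSource": {"dataLocation": s3_uri},
        "roleArn": role_arn,
        # INCREMENTAL appends to the dataset; FULL replaces its contents.
        "importMode": "INCREMENTAL",
    }


import_request = build_incremental_import_request(
    job_name="trending-now-incremental-import",
    dataset_arn="arn:aws:personalize:REGION:ACCOUNT_ID:dataset/NAME/INTERACTIONS",
    s3_uri="s3://YOUR_BUCKET/interactions_incremental_data.csv",
    role_arn="arn:aws:iam::ACCOUNT_ID:role/YOUR_PERSONALIZE_ROLE",
)

# To start the job (requires AWS credentials):
# import boto3
# boto3.client("personalize").create_dataset_import_job(**import_request)
```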
After the import job for the incremental interactions dataset is complete, wait for one interval of the trend discovery frequency that you configured to elapse so that the new recommendations reflect the imported interactions.
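As a rough guide to how long that wait is, the following sketch converts a trend discovery frequency to seconds and applies the 2-hour automatic refresh described earlier for longer frequencies. The names are hypothetical, and the result is an estimate of the refresh interval, not a guarantee of when a specific import becomes visible.

```python
# Trend discovery frequencies from this post, in seconds.
FREQUENCY_SECONDS = {
    "30 minutes": 30 * 60,
    "1 hour": 60 * 60,
    "3 hours": 3 * 60 * 60,
    "1 day": 24 * 60 * 60,
}

# Frequencies longer than 2 hours are refreshed every 2 hours.
MAX_REFRESH_SECONDS = 2 * 60 * 60


def refresh_interval_seconds(frequency: str) -> int:
    """Estimate how often trending recommendations are refreshed."""
    return min(FREQUENCY_SECONDS[frequency], MAX_REFRESH_SECONDS)


for frequency in FREQUENCY_SECONDS:
    print(frequency, refresh_interval_seconds(frequency))
```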
Choose Get recommendations on the campaign API page to get the latest recommended list of items that are trending.
Now we see the latest list of recommended items. The following table contains the data after mapping the IDs to the actual movie titles for readability. The code to perform the mapping is provided in the attached notebook.
ITEM_ID | TITLE |
260 | Star Wars: Episode IV – A New Hope (1977) |
6539 | Pirates of the Caribbean: The Curse of the Bla… |
153 | Batman Forever (1995) |
3948 | Meet the Parents (2000) |
1792 | U.S. Marshals (1998) |
2459 | Texas Chainsaw Massacre, The (1974) |
2363 | Godzilla (Gojira) (1954) |
61248 | Death Race (2008) |
8961 | Incredibles, The (2004) |
2407 | Cocoon (1985) |
The response from the preceding GetRecommendations call contains the IDs of the recommended items. The recommended ITEM_ID values all come from the incremental interactions dataset that we provided to Amazon Personalize. This is not surprising, because these are the only items in our synthetic dataset that gained interactions in the most recent 30 minutes.
You have now successfully trained a Trending-Now model that recommends items that are becoming popular with your users, so your recommendations follow current user interest. Going forward, you can adapt this code to create other recommenders.
You can also use filters along with the Trending-Now recipe to differentiate the trends between different types of content, like long vs. short videos, or apply promotional filters to explicitly recommend specific items based on rules that align with your business goals.
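As an illustration, the following sketch builds a filter that restricts trending results to one type of content. It assumes an Items dataset with a `VIDEO_TYPE` column, which this post does not create. The helper name, column name, and values are hypothetical.

```python
def build_include_filter_request(name: str, dataset_group_arn: str,
                                 column: str, values: list) -> dict:
    """Build keyword arguments for personalize.create_filter() that keep
    only items whose metadata column matches one of the given values."""
    quoted_values = ", ".join(f'"{value}"' for value in values)
    return {
        "name": name,
        "datasetGroupArn": dataset_group_arn,
        "filterExpression":
            f"INCLUDE ItemID WHERE Items.{column} IN ({quoted_values})",
    }


filter_request = build_include_filter_request(
    name="short-videos-only",
    dataset_group_arn="arn:aws:personalize:REGION:ACCOUNT_ID:dataset-group/NAME",
    column="VIDEO_TYPE",  # assumed Items dataset column
    values=["short"],
)
print(filter_request["filterExpression"])

# To create the filter (requires AWS credentials):
# import boto3
# boto3.client("personalize").create_filter(**filter_request)
```

You would then pass the ARN of the created filter when requesting recommendations.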
Make sure you clean up any unused resources you created in your account while following the steps outlined in this post. You can delete filters, recommenders, datasets, and dataset groups via the AWS Management Console or using the Python SDK.
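With the Python SDK, resources need to be deleted in dependency order: the campaign first, then the solution, the dataset, and finally the dataset group. The following sketch lists those calls for the resources created in this post. The function names and ARN placeholders are hypothetical. Deletions are asynchronous, so in practice you need to wait for each one to finish before starting the next; the sketch omits that polling.

```python
def deletion_plan(campaign_arn: str, solution_arn: str,
                  dataset_arn: str, dataset_group_arn: str) -> list:
    """Return (operation, kwargs) pairs in the order they should run."""
    return [
        ("delete_campaign", {"campaignArn": campaign_arn}),
        ("delete_solution", {"solutionArn": solution_arn}),
        ("delete_dataset", {"datasetArn": dataset_arn}),
        ("delete_dataset_group", {"datasetGroupArn": dataset_group_arn}),
    ]


def run_plan(personalize_client, plan: list) -> None:
    """Call each operation on a boto3 'personalize' client.
    Does not wait for a deletion to finish before the next call."""
    for operation, kwargs in plan:
        getattr(personalize_client, operation)(**kwargs)


plan = deletion_plan(
    campaign_arn="arn:aws:personalize:REGION:ACCOUNT_ID:campaign/trending-now-campaign",
    solution_arn="arn:aws:personalize:REGION:ACCOUNT_ID:solution/NAME",
    dataset_arn="arn:aws:personalize:REGION:ACCOUNT_ID:dataset/NAME/INTERACTIONS",
    dataset_group_arn="arn:aws:personalize:REGION:ACCOUNT_ID:dataset-group/NAME",
)
for operation, kwargs in plan:
    print(operation, kwargs)

# To execute the plan (requires AWS credentials):
# import boto3
# run_plan(boto3.client("personalize"), plan)
```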
The new aws-trending-now recipe from Amazon Personalize helps you identify the items that are rapidly becoming popular with your users and tailor your recommendations for the fast-changing trends in user interest.
For more information about Amazon Personalize, see the Amazon Personalize Developer Guide.
Vamshi Krishna Enabothala is a Sr. Applied AI Specialist Architect at AWS. He works with customers from different sectors to accelerate high-impact data, analytics, and machine learning initiatives. He is passionate about recommendation systems, NLP, and computer vision areas in AI and ML. Outside of work, Vamshi is an RC enthusiast, building RC equipment (planes, cars, and drones), and also enjoys gardening.
Anchit Gupta is a Senior Product Manager for Amazon Personalize. She focuses on delivering products that make it easier to build machine learning solutions. In her spare time, she enjoys cooking, playing board/card games, and reading.
Abhishek Mangal is a Software Engineer for Amazon Personalize and works on architecting software systems to serve customers at scale. In his spare time, he likes to watch anime and believes ‘One Piece’ is the greatest piece of story-telling in recent history.