Recommend top trending items to your users using the new Amazon Personalize recipe

Amazon Personalize is excited to announce the new Trending-Now recipe to help you recommend items gaining popularity at the fastest pace among your users.

Amazon Personalize is a fully managed machine learning (ML) service that makes it easy for developers to deliver personalized experiences to their users. It enables you to improve customer engagement by powering personalized product and content recommendations in websites, applications, and targeted marketing campaigns. You can get started without any prior ML experience, using APIs to easily build sophisticated personalization capabilities in a few clicks. All your data is encrypted to be private and secure, and is only used to create recommendations for your users.

User interests can change based on a variety of factors, such as external events or the interests of other users. It’s critical for websites and apps to tailor their recommendations to these changing interests to improve user engagement. With Trending-Now, you can surface items from your catalog that are rising in popularity faster than other catalog items, such as trending news, popular social content, or newly released movies, to help users discover items that are engaging their peers. Amazon Personalize also allows you to define the time period over which trends are calculated depending on your unique business context, with options of 30 minutes, 1 hour, 3 hours, or 1 day, based on the most recent interactions data from users.

In this post, we show how to use this new recipe to recommend top trending items to your users.

Solution overview

Trending-Now identifies the top trending items by calculating the increase in interactions that each item has over configurable intervals of time. The items with the highest rate of increase are considered trending items. The time is based on timestamp data in your interactions dataset. You can specify the time interval by providing a trend discovery frequency when you create your solution.
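To make the idea concrete, the following is a minimal sketch of ranking items by their increase in interactions between the two most recent time periods. It is only an illustration of the concept described above, not the algorithm that Amazon Personalize runs internally; the function name `top_trending`, the tuple layout of `interactions`, and the use of a simple count difference as the "rate of increase" are all our assumptions.

```python
from collections import Counter


def top_trending(interactions, now, period_seconds, k=3):
    """Rank items by how much their interaction count grew between the
    previous period and the current period.

    interactions: iterable of (item_id, unix_timestamp) tuples (assumed layout).
    now: end of the current period, as a UNIX epoch timestamp.
    period_seconds: length of one period, e.g. 1800 for 30 minutes.
    """
    current_start = now - period_seconds
    previous_start = now - 2 * period_seconds
    current, previous = Counter(), Counter()
    for item_id, timestamp in interactions:
        if current_start < timestamp <= now:
            current[item_id] += 1
        elif previous_start < timestamp <= current_start:
            previous[item_id] += 1
    # Increase in interactions from the previous period to the current one.
    # Only items seen in the current period can be trending in this sketch.
    growth = {item: current[item] - previous[item] for item in current}
    # Highest growth first; ties are broken by item ID for a stable order.
    ranked = sorted(growth, key=lambda item: (-growth[item], item))
    return ranked[:k]
```

For a 30-minute trend discovery frequency, you would call this with `period_seconds=1800`. An item with many interactions overall can still rank low here if its count did not grow between the two periods.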

The Trending-Now recipe requires an interactions dataset, which contains a record of the individual user and item events (such as clicks, watches, or purchases) on your website or app along with the event timestamps. You can use the parameter Trend discovery frequency to define the time intervals over which trends are calculated and refreshed. For example, if you have a high traffic website with rapidly changing trends, you can specify 30 minutes as the trend discovery frequency. Every 30 minutes, Amazon Personalize looks at the interactions that have been ingested successfully and refreshes the trending items. This recipe also allows you to capture and surface any new content that has been introduced in the last 30 minutes and has seen a higher degree of interest from your user base than any preexisting catalog items. For any parameter values that are greater than 2 hours, Amazon Personalize automatically refreshes the trending item recommendations every 2 hours to account for new interactions and new items.

Datasets that have low traffic but use a 30-minute value can see poor recommendation accuracy due to sparse or missing interactions data. The Trending-Now recipe requires that you provide interaction data for at least two past time periods (where a time period is your chosen trend discovery frequency). If interaction data doesn’t exist for the last two time periods, Amazon Personalize replaces the trending items with popular items until the required minimum data is available.
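Before choosing a short frequency, it can help to check your own data for this condition. The following sketch tests whether a list of interaction timestamps covers both of the two most recent periods. The helper name `covers_last_two_periods` and the way the period boundaries are computed are our assumptions; Amazon Personalize may determine the periods differently.

```python
def covers_last_two_periods(timestamps, now, period_seconds):
    """Return True if at least one interaction falls in each of the two
    most recent periods ending at `now` (all values are UNIX epoch seconds)."""
    timestamps = list(timestamps)  # allow generators; we iterate twice
    current_start = now - period_seconds
    previous_start = now - 2 * period_seconds
    has_current = any(current_start < ts <= now for ts in timestamps)
    has_previous = any(previous_start < ts <= current_start for ts in timestamps)
    return has_current and has_previous
```

If this returns False for a 30-minute period on typical traffic, a longer frequency such as 1 hour or 3 hours is likely a better fit.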

The Trending-Now recipe is available for both custom dataset groups and video-on-demand domain dataset groups. In this post, we demonstrate how to tailor your recommendations for the fast-changing trends in user interest with this new Trending-Now feature for a media use case with a custom dataset group. The following diagram illustrates the solution workflow.

[Figure: solution workflow]

For example, in video-on-demand applications, you can use this feature to show what movies are trending in the last 1 hour by specifying 1 hour for your trend discovery frequency. For every 1 hour of data, Amazon Personalize identifies the items with the greatest rate of increase in interactions since the last evaluation. Available frequencies include 30 minutes, 1 hour, 3 hours, and 1 day.

Prerequisites

To use the Trending-Now recipe, you first need to set up Amazon Personalize resources on the Amazon Personalize console. Create your dataset group, import your data, train a solution version, and deploy a campaign. For full instructions, see Getting started.

For this post, we have followed the console approach to deploy a campaign using the new Trending-Now recipe. Alternatively, you can build the entire solution using the SDK approach with this provided notebook. For both approaches, we use the MovieLens public dataset.

Prepare the dataset

Complete the following steps to prepare your dataset:

  1. Create a dataset group.
  2. Create an interactions dataset using the following schema:

        {
          "type": "record",
          "name": "Interactions",
          "namespace": "com.amazonaws.personalize.schema",
          "fields": [
            { "name": "USER_ID", "type": "string" },
            { "name": "ITEM_ID", "type": "string" },
            { "name": "TIMESTAMP", "type": "long" }
          ],
          "version": "1.0"
        }

  3. Import the interactions data to Amazon Personalize from Amazon Simple Storage Service (Amazon S3).
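
If you prefer the SDK to the console, the three steps above can be sketched as follows. This is a minimal sketch, assuming you pass in a boto3 `personalize` client (for example, `boto3.client("personalize")`); the function names, the `name_prefix` naming scheme, and the polling interval are our own choices, and the S3 URI and IAM role ARN are values you must supply.

```python
import json
import time

INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}


def create_interactions_dataset(personalize, name_prefix, poll_seconds=15):
    """Create the schema, dataset group, and interactions dataset; return their ARNs."""
    schema_arn = personalize.create_schema(
        name=f"{name_prefix}-schema", schema=json.dumps(INTERACTIONS_SCHEMA)
    )["schemaArn"]
    dataset_group_arn = personalize.create_dataset_group(
        name=f"{name_prefix}-dataset-group"
    )["datasetGroupArn"]

    # The dataset group has to be ACTIVE before a dataset can be added to it.
    while True:
        status = personalize.describe_dataset_group(
            datasetGroupArn=dataset_group_arn
        )["datasetGroup"]["status"]
        if status == "ACTIVE":
            break
        if status == "CREATE FAILED":
            raise RuntimeError("Dataset group creation failed")
        time.sleep(poll_seconds)

    dataset_arn = personalize.create_dataset(
        name=f"{name_prefix}-interactions",
        schemaArn=schema_arn,
        datasetGroupArn=dataset_group_arn,
        datasetType="Interactions",
    )["datasetArn"]
    return schema_arn, dataset_group_arn, dataset_arn


def import_interactions(personalize, dataset_arn, s3_uri, role_arn, job_name):
    """Start a bulk import of the interactions CSV from Amazon S3."""
    response = personalize.create_dataset_import_job(
        jobName=job_name,
        datasetArn=dataset_arn,
        dataSource={"dataLocation": s3_uri},
        roleArn=role_arn,
    )
    return response["datasetImportJobArn"]
```

The IAM role must allow Amazon Personalize to read the S3 location; setting up that role is outside the scope of this sketch.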

For the interactions data, we use ratings history from the movies review dataset, MovieLens.

Use the following Python code to curate the interactions dataset from the MovieLens public dataset.

    import pandas as pd

    # Lines that start with "!" are shell commands and assume a Jupyter notebook.
    data_dir = "blog_data"
    !mkdir $data_dir
    !cd $data_dir && wget http://files.grouplens.org/datasets/movielens/ml-25m.zip
    !cd $data_dir && unzip ml-25m.zip

    dataset_dir = data_dir + "/ml-25m/"

    # Load the ratings and keep only the fields that the Trending-Now recipe requires.
    interactions_df = pd.read_csv(dataset_dir + "ratings.csv")
    interactions_df = interactions_df.drop(columns=["rating"])
    interactions_df = interactions_df.rename(
        columns={"userId": "USER_ID", "movieId": "ITEM_ID", "timestamp": "TIMESTAMP"}
    )

    # Write the curated interactions file to upload to Amazon S3.
    interactions_file = "curated_interactions_training_data.csv"
    interactions_df.to_csv(interactions_file, index=False)

The MovieLens dataset records interactions between users and items. Each record contains the user ID, the item (movie) ID, the rating, and the time the interaction took place (a timestamp, given as UNIX epoch time). The dataset also contains movie title information to map the movie ID to the actual title and genres. The following table is a sample of the dataset.

USER_ID  ITEM_ID  TIMESTAMP   TITLE                                      GENRES
116927   1101     1105210919  Top Gun (1986)                             Action|Romance
158267   719      974847063   Multiplicity (1996)                        Comedy
55098    186871   1526204585  Heal (2017)                                Documentary
159290   59315    1485663555  Iron Man (2008)                            Action|Adventure|Sci-Fi
108844   34319    1428229516  Island, The (2005)                         Action|Sci-Fi|Thriller
85390    2916     953264936   Total Recall (1990)                        Action|Adventure|Sci-Fi|Thriller
103930   18       839915700   Four Rooms (1995)                          Comedy
104176   1735     985295513   Great Expectations (1998)                  Drama|Romance
97523    1304     1158428003  Butch Cassidy and the Sundance Kid (1969)  Action|Western
87619    6365     1066077797  Matrix Reloaded, The (2003)                Action|Adventure|Sci-Fi|Thriller|IMAX

The curated dataset includes USER_ID, ITEM_ID (movie ID), and TIMESTAMP to train the Amazon Personalize model. These are the required fields to train a model with the Trending-Now recipe. The following table is a sample of the curated dataset.

USER_ID  ITEM_ID  TIMESTAMP
48953    529      841223587
23069    1748     1092352526
117521   26285    1231959564
18774    457      848840461
58018    179819   1515032190
9685     79132    1462582799
41304    6650     1516310539
152634   2560     1113843031
57332    3387     986506413
12857    6787     1356651687
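
Before uploading the curated file, a quick sanity check can catch missing columns or malformed timestamps. The following is a small sketch using only the Python standard library; the function name `validate_interactions` and the specific checks are our own and are not part of Amazon Personalize.

```python
import csv

REQUIRED_COLUMNS = ("USER_ID", "ITEM_ID", "TIMESTAMP")


def validate_interactions(lines):
    """Check that CSV lines have the required columns, no empty values,
    and integer timestamps. Returns the number of data rows checked.

    lines: any iterable of CSV lines, such as an open file object.
    Raises ValueError on the first problem found.
    """
    reader = csv.DictReader(lines)
    header = reader.fieldnames or []
    missing = [column for column in REQUIRED_COLUMNS if column not in header]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    count = 0
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        for column in REQUIRED_COLUMNS:
            if not row[column]:
                raise ValueError(f"Line {line_number}: empty {column}")
        try:
            int(row["TIMESTAMP"])
        except ValueError:
            raise ValueError(f"Line {line_number}: TIMESTAMP is not an integer")
        count += 1
    return count
```

For example, `validate_interactions(open("curated_interactions_training_data.csv", newline=""))` checks the file written by the curation code.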

Train a model

After the dataset import job is complete, you’re ready to train your model.

  1. On the Solutions tab, choose Create solution.
  2. Choose the new aws-trending-now recipe.
  3. In the Advanced configuration section, set Trend discovery frequency to 30 minutes.
  4. Choose Create solution to start training.
    [Screenshot: Create solution]
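
The same training step can be sketched with the SDK. This assumes a boto3 `personalize` client is passed in. To the best of our understanding, the trend discovery frequency is passed as the `trend_discovery_frequency` feature transformation parameter, but verify the parameter name and accepted values against the current Amazon Personalize documentation; the function name and validation are our own.

```python
TRENDING_NOW_RECIPE_ARN = "arn:aws:personalize:::recipe/aws-trending-now"
VALID_FREQUENCIES = ("30 minutes", "1 hour", "3 hours", "1 day")


def create_trending_now_solution(personalize, dataset_group_arn, name,
                                 frequency="30 minutes"):
    """Create a Trending-Now solution and start training a solution version.

    Returns (solution_arn, solution_version_arn).
    """
    if frequency not in VALID_FREQUENCIES:
        raise ValueError(f"frequency must be one of {VALID_FREQUENCIES}")
    solution_arn = personalize.create_solution(
        name=name,
        datasetGroupArn=dataset_group_arn,
        recipeArn=TRENDING_NOW_RECIPE_ARN,
        solutionConfig={
            # Assumed parameter name for the trend discovery frequency.
            "featureTransformationParameters": {
                "trend_discovery_frequency": frequency
            }
        },
    )["solutionArn"]
    solution_version_arn = personalize.create_solution_version(
        solutionArn=solution_arn
    )["solutionVersionArn"]
    return solution_arn, solution_version_arn
```

Training runs asynchronously, so the solution version is not usable until its status is ACTIVE.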

Create a campaign

In Amazon Personalize, you use a campaign to make recommendations for your users. In this step, you create a campaign using the solution you created in the previous step and get the Trending-Now recommendations:

  1. On the Campaigns tab, choose Create campaign.
  2. For Campaign name, enter a name.
  3. For Solution, choose the solution trending-now-solution.
  4. For Solution version ID, choose the solution version that uses the aws-trending-now recipe.
  5. For Minimum provisioned transactions per second, leave it at the default value.
  6. Choose Create campaign to start creating your campaign.
    [Screenshot: Create new campaign]
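
As an alternative to the console, the following sketch creates the campaign through the SDK and waits for it to become active. It assumes a boto3 `personalize` client and a solution version that has already finished training; the function name, the default of 1 for the minimum provisioned TPS, and the polling interval are our assumptions.

```python
import time


def create_campaign_and_wait(personalize, name, solution_version_arn,
                             min_tps=1, poll_seconds=60):
    """Create a campaign and block until it is ACTIVE; return its ARN."""
    campaign_arn = personalize.create_campaign(
        name=name,
        solutionVersionArn=solution_version_arn,
        minProvisionedTPS=min_tps,
    )["campaignArn"]
    while True:
        status = personalize.describe_campaign(
            campaignArn=campaign_arn
        )["campaign"]["status"]
        if status == "ACTIVE":
            return campaign_arn
        if status == "CREATE FAILED":
            raise RuntimeError("Campaign creation failed")
        time.sleep(poll_seconds)
```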

Get recommendations

After you create or update your campaign, you can get a recommended list of items that are trending, sorted from highest to lowest. On the campaign (trending-now-campaign) Personalization API tab, choose Get recommendations.

[Screenshot: Get recommendations]

The following screenshot shows the campaign detail page with results from a GetRecommendations call that includes the recommended items and the recommendation ID.

[Screenshot: campaign detail page with results]
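
The same request can be made programmatically. The following sketch assumes a boto3 `personalize-runtime` client is passed in (for example, `boto3.client("personalize-runtime")`); the function name is ours, and `user_id` is left optional on the assumption that trending results are not user-specific.

```python
def get_trending_item_ids(personalize_runtime, campaign_arn, num_results=10,
                          user_id=None):
    """Return the recommended item IDs in the order the campaign ranks them."""
    request = {"campaignArn": campaign_arn, "numResults": num_results}
    if user_id is not None:
        request["userId"] = user_id
    response = personalize_runtime.get_recommendations(**request)
    return [item["itemId"] for item in response["itemList"]]
```

The response also carries a `recommendationId`, which this sketch ignores.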

The results from the GetRecommendations call include the IDs of recommended items. The following table is a sample after mapping the IDs to the actual movie titles for readability. The code to perform the mapping is provided in the attached notebook.

ITEM_ID  TITLE
356      Forrest Gump (1994)
318      Shawshank Redemption, The (1994)
58559    Dark Knight, The (2008)
33794    Batman Begins (2005)
44191    V for Vendetta (2006)
48516    Departed, The (2006)
195159   Spider-Man: Into the Spider-Verse (2018)
122914   Avengers: Infinity War – Part II (2019)
91974    Underworld: Awakening (2012)
204698   Joker (2019)
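
The notebook contains the actual mapping code; the following is only a rough sketch of one way to do it with the standard library. It assumes the MovieLens `movies.csv` file with `movieId` and `title` columns, and the function names are hypothetical.

```python
import csv


def load_titles(lines):
    """Build a {movie ID: title} lookup from the lines of movies.csv."""
    return {row["movieId"]: row["title"] for row in csv.DictReader(lines)}


def with_titles(item_ids, titles):
    """Pair each recommended item ID with its title, keeping the ranking order."""
    return [(item_id, titles.get(item_id, "(unknown title)")) for item_id in item_ids]
```

For example, `with_titles(item_ids, load_titles(open("blog_data/ml-25m/movies.csv", newline="", encoding="utf-8")))` produces ID and title pairs like the rows in the preceding table.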

Get trending recommendations

After you create a solution version using the aws-trending-now recipe, Amazon Personalize will identify the top trending items by calculating the increase in interactions that each item has over configurable intervals of time. The items with the highest rate of increase are considered trending items. The time is based on timestamp data in your interactions dataset.

Now let’s provide the latest interactions to Amazon Personalize to calculate the trending items. We can provide the latest interactions using real-time ingestion by creating an event tracker or through a bulk data upload with a dataset import job in incremental mode. In the notebook, we have provided sample code to individually import the latest real-time interactions data into Amazon Personalize using the event tracker.
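
For reference, sending a single real-time interaction through an event tracker might look like the following sketch. It assumes a boto3 `personalize-events` client is passed in and that you already created an event tracker and have its tracking ID; the function name and the `"watch"` event type are illustrative choices, not values from the notebook.

```python
from datetime import datetime, timezone


def send_interaction(personalize_events, tracking_id, user_id, session_id,
                     item_id, event_type="watch"):
    """Record one user-item interaction, timestamped with the current time."""
    personalize_events.put_events(
        trackingId=tracking_id,
        userId=user_id,
        sessionId=session_id,
        eventList=[
            {
                "eventType": event_type,  # hypothetical event type
                "itemId": item_id,
                "sentAt": datetime.now(timezone.utc),
            }
        ],
    )
```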

For this post, we provide the latest interactions as a bulk data upload with a dataset import job in incremental mode. Use the following Python code to generate dummy incremental interactions, and then upload the incremental interactions data using a dataset import job.

    import time

    import pandas as pd

    # Randomly selected USER_IDs for generating incremental interactions.
    users_list = [
        '20371', '63409', '54535', '119138', '58953', '82982', '19044',
        '139171', '98598', '23822', '112012', '121380', '2660', '46948',
        '5656', '68919', '152414', '31234', '88240', '40395', '49296',
        '80280', '150179', '138474', '124489', '145218', '141810', '82607',
    ]

    # Randomly selected ITEM_IDs for generating incremental interactions.
    items_list = ['153', '2459', '1792', '3948', '2363', '260', '61248', '6539', '2407', '8961']

    # Start one hour in the past and advance one second per interaction,
    # so every interaction has a recent, unique timestamp.
    time_epoch = int(time.time()) - 3600

    rows = []
    for _ in range(10):
        for user_id in users_list:
            for item_id in items_list:
                time_epoch += 1
                rows.append([user_id, item_id, time_epoch])

    inc_df = pd.DataFrame(rows, columns=["USER_ID", "ITEM_ID", "TIMESTAMP"])
    incremental_file = 'interactions_incremental_data.csv'
    inc_df.to_csv(incremental_file, index=False)

We have synthetically generated these interactions by randomly selecting a few values for USER_ID and ITEM_ID, and generating interactions between those users and items with latest timestamps. The following table contains the randomly selected ITEM_ID values that are used for generating incremental interactions.

ITEM_ID  TITLE
153      Batman Forever (1995)
260      Star Wars: Episode IV – A New Hope (1977)
1792     U.S. Marshals (1998)
2363     Godzilla (Gojira) (1954)
2407     Cocoon (1985)
2459     Texas Chainsaw Massacre, The (1974)
3948     Meet the Parents (2000)
6539     Pirates of the Caribbean: The Curse of the Bla…
8961     Incredibles, The (2004)
61248    Death Race (2008)

Upload the incremental interactions data by selecting Append to current dataset (or use incremental mode if you’re using the APIs), as shown in the following screenshot.

[Screenshot: Upload the incremental interactions data by selecting Append to current dataset]
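
If you use the APIs instead, the incremental import can be sketched as follows. This assumes a boto3 `personalize` client is passed in and that the incremental CSV is already in Amazon S3; the function name and polling interval are our own, and the S3 URI and IAM role ARN are values you must supply.

```python
import time


def import_incremental_and_wait(personalize, dataset_arn, s3_uri, role_arn,
                                job_name, poll_seconds=60):
    """Append new interactions to an existing dataset and wait for the job."""
    job_arn = personalize.create_dataset_import_job(
        jobName=job_name,
        datasetArn=dataset_arn,
        dataSource={"dataLocation": s3_uri},
        roleArn=role_arn,
        importMode="INCREMENTAL",  # append instead of replacing existing data
    )["datasetImportJobArn"]
    while True:
        status = personalize.describe_dataset_import_job(
            datasetImportJobArn=job_arn
        )["datasetImportJob"]["status"]
        if status == "ACTIVE":
            return job_arn
        if status == "CREATE FAILED":
            raise RuntimeError("Incremental import job failed")
        time.sleep(poll_seconds)
```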

After the import job for the incremental interactions dataset is complete, wait for the duration of the trend discovery frequency that you configured so that the new recommendations are reflected.

Choose Get recommendations on the campaign API page to get the latest recommended list of items that are trending.

Now we see the latest list of recommended items. The following table contains the data after mapping the IDs to the actual movie titles for readability. The code to perform the mapping is provided in the attached notebook.

ITEM_ID  TITLE
260      Star Wars: Episode IV – A New Hope (1977)
6539     Pirates of the Caribbean: The Curse of the Bla…
153      Batman Forever (1995)
3948     Meet the Parents (2000)
1792     U.S. Marshals (1998)
2459     Texas Chainsaw Massacre, The (1974)
2363     Godzilla (Gojira) (1954)
61248    Death Race (2008)
8961     Incredibles, The (2004)
2407     Cocoon (1985)

The preceding GetRecommendations call again returns the IDs of recommended items, and the recommended ITEM_ID values now come from the incremental interactions dataset that we provided to the Amazon Personalize model. This is not surprising, because these are the only items in our synthetic dataset that gained interactions in the most recent 30 minutes.

You have now successfully trained a Trending-Now model to recommend items that are becoming popular with your users and to tailor the recommendations to changing user interest. Going forward, you can adapt this code to create other recommenders.

You can also use filters along with the Trending-Now recipe to differentiate the trends between different types of content, like long vs. short videos, or apply promotional filters to explicitly recommend specific items based on rules that align with your business goals.
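
As a rough sketch of how such a filter could be set up with the SDK, the following code builds an include-style filter expression and applies it to a recommendation request. It assumes boto3 `personalize` and `personalize-runtime` clients are passed in, and that your dataset group also has an Items dataset with the metadata column you filter on (for example, a hypothetical `GENRES` column); this post's walkthrough only creates an interactions dataset, so the example does not apply to it as is.

```python
def build_include_expression(field, values):
    """Build a filter expression that keeps only items whose metadata
    field matches one of the given values."""
    quoted = ", ".join(f'"{value}"' for value in values)
    return f"INCLUDE ItemID WHERE Items.{field} IN ({quoted})"


def create_metadata_filter(personalize, dataset_group_arn, name, field, values):
    """Create a filter on an Items dataset column; return the filter ARN."""
    return personalize.create_filter(
        name=name,
        datasetGroupArn=dataset_group_arn,
        filterExpression=build_include_expression(field, values),
    )["filterArn"]


def get_filtered_trending(personalize_runtime, campaign_arn, filter_arn,
                          num_results=10):
    """Get trending item IDs restricted by the filter."""
    response = personalize_runtime.get_recommendations(
        campaignArn=campaign_arn, filterArn=filter_arn, numResults=num_results
    )
    return [item["itemId"] for item in response["itemList"]]
```

A filter must finish creating (status ACTIVE) before you can use it in a request.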

Clean up

Make sure you clean up any unused resources you created in your account while following the steps outlined in this post. You can delete filters, recommenders, datasets, and dataset groups via the AWS Management Console or using the Python SDK.
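
With the Python SDK, cleanup for the resources created in this walkthrough could be sketched as follows. It assumes a boto3 `personalize` client is passed in. The deletion order (campaign, solution, dataset, schema, dataset group) and the wait-until-gone loop reflect our understanding that a resource cannot be deleted while others still depend on it; the helper names are our own, and you should add event trackers or filters if you created them.

```python
import time


def delete_and_wait(delete_call, describe_call, not_found_error, arn_kwargs,
                    poll_seconds=30):
    """Delete a resource, then poll until describing it reports not found."""
    delete_call(**arn_kwargs)
    while True:
        try:
            describe_call(**arn_kwargs)
        except not_found_error:
            return
        time.sleep(poll_seconds)


def clean_up(personalize, campaign_arn, solution_arn, dataset_arn, schema_arn,
             dataset_group_arn, poll_seconds=30):
    """Delete the walkthrough resources, dependents first."""
    not_found = personalize.exceptions.ResourceNotFoundException
    steps = [
        (personalize.delete_campaign, personalize.describe_campaign,
         {"campaignArn": campaign_arn}),
        (personalize.delete_solution, personalize.describe_solution,
         {"solutionArn": solution_arn}),
        (personalize.delete_dataset, personalize.describe_dataset,
         {"datasetArn": dataset_arn}),
        (personalize.delete_schema, personalize.describe_schema,
         {"schemaArn": schema_arn}),
        (personalize.delete_dataset_group, personalize.describe_dataset_group,
         {"datasetGroupArn": dataset_group_arn}),
    ]
    for delete_call, describe_call, arn_kwargs in steps:
        delete_and_wait(delete_call, describe_call, not_found, arn_kwargs,
                        poll_seconds)
```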

Summary

The new aws-trending-now recipe from Amazon Personalize helps you identify the items that are rapidly becoming popular with your users and tailor your recommendations for the fast-changing trends in user interest.

For more information about Amazon Personalize, see the Amazon Personalize Developer Guide.

About the authors

Vamshi Krishna Enabothala is a Sr. Applied AI Specialist Architect at AWS. He works with customers from different sectors to accelerate high-impact data, analytics, and machine learning initiatives. He is passionate about recommendation systems, NLP, and computer vision areas in AI and ML. Outside of work, Vamshi is an RC enthusiast, building RC equipment (planes, cars, and drones), and also enjoys gardening.

Anchit Gupta is a Senior Product Manager for Amazon Personalize. She focuses on delivering products that make it easier to build machine learning solutions. In her spare time, she enjoys cooking, playing board/card games, and reading.

Abhishek Mangal is a Software Engineer for Amazon Personalize and works on architecting software systems to serve customers at scale. In his spare time, he likes to watch anime and believes ‘One Piece’ is the greatest piece of story-telling in recent history.


