Recommending movies from implicit feedback

Louis de Bruijn | Aug. 23, 2019 | #Movielens 1m #collaborative filtering #recommender #alternating least squares #implicit

Let's kick off with some descriptive statistics of our data. The model includes 6040 users and 3706 movies. Each user has rated at least 20 and at most 2314 movies, with a median of 96 and a mean of 165 ratings. Each user was recommended 10 movies, and across all users these lists contain 2199 unique movies (56.6% of all movies).
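As an illustration of how such per-user statistics can be derived, here is a minimal sketch. It assumes the ratings are available as a list of (user_id, movie_id) pairs; the function name and the toy data are made up for the example and are not taken from the app.

```python
from collections import Counter
from statistics import mean, median


def rating_count_summary(ratings):
    """Summarise how many movies each user rated.

    `ratings` is a list of (user_id, movie_id) pairs. This input
    format is an assumption made for the example.
    """
    per_user = Counter(user_id for user_id, _ in ratings)
    counts = list(per_user.values())
    return {
        "users": len(counts),
        "movies": len({movie_id for _, movie_id in ratings}),
        "min": min(counts),
        "max": max(counts),
        "median": median(counts),
        "mean": mean(counts),
    }


# Toy data (made up): three users, four movies.
toy_ratings = [
    (1, "a"), (1, "b"), (1, "c"),
    (2, "a"),
    (3, "b"), (3, "d"),
]
summary = rating_count_summary(toy_ratings)
print(summary)
```

Run on the full ratings file, the same summary would yield the user count, movie count, minimum, maximum, median and mean quoted above.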

The same alternating least squares model was also trained on a different, much larger dataset of over one million users and about 136 thousand items, in which every user had implicitly liked at least one item (instead of 20). Across all users, the recommendations covered only 4542 unique items (about 3% of all items). In other words, the model narrowed the range of 'taste' by 97%: every 'personal' recommendation was a combination of 10 items drawn from the same pool of only 4542.
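The share of the catalogue that ever gets recommended can be checked with a few lines of code. The sketch below is only an illustration: the function name and the dictionary layout are hypothetical, and "about 136 thousand" is taken as exactly 136,000 for the arithmetic.

```python
def catalogue_coverage(recommendations, n_items):
    """Fraction of the catalogue that appears in at least one top-N list.

    `recommendations` maps a user id to that user's list of recommended
    item ids; `n_items` is the catalogue size. Both names are hypothetical.
    """
    unique_items = set()
    for items in recommendations.values():
        unique_items.update(items)
    return len(unique_items) / n_items


# Toy example (made up): 3 users, top-2 lists, catalogue of 10 items.
toy_recs = {"u1": [1, 2], "u2": [1, 3], "u3": [2, 3]}
toy_coverage = catalogue_coverage(toy_recs, n_items=10)

# The figures quoted in the text, assuming a catalogue of 136,000 items.
quoted_coverage = 4542 / 136_000
print(f"{toy_coverage:.0%} of the toy catalogue is recommended")
print(f"{quoted_coverage:.1%} of the large catalogue is recommended")
```

Under that assumption the quoted figures give a coverage of roughly 3.3%, which matches the reduction of about 97% mentioned above.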

The graph above also shows that the first (top-ranked) recommendation slot is filled by only 763 unique movies across all users, while the last slot is filled by 1496 unique movies. The number of unique recommendations therefore shrinks toward the top of the list: the more likely a recommendation is, the smaller the pool of items it is drawn from.
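The per-rank counts behind such a graph can be computed by collecting, for every rank position, the set of items that appear there across all users. The following is a minimal sketch with a hypothetical function name and made-up toy lists, not the code used for the graph.

```python
def unique_items_per_rank(recommendations):
    """Count distinct items seen at each rank position across all users.

    `recommendations` maps a user id to a ranked list of item ids,
    best recommendation first.
    """
    seen_at_rank = []
    for items in recommendations.values():
        for rank, item in enumerate(items):
            # Grow the list of sets the first time a rank is reached.
            if rank == len(seen_at_rank):
                seen_at_rank.append(set())
            seen_at_rank[rank].add(item)
    return [len(items_at_rank) for items_at_rank in seen_at_rank]


# Toy ranked lists (made up): the top slot is shared by most users.
toy_ranked = {
    "u1": ["x", "a", "d"],
    "u2": ["x", "b", "e"],
    "u3": ["x", "a", "f"],
    "u4": ["y", "c", "g"],
}
per_rank = unique_items_per_rank(toy_ranked)
print(per_rank)
```

In the toy data the top slot holds the fewest distinct items and the last slot the most, the same pattern as described for the real recommendations.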

The interactive sections below are built on functions from the implicit library.

recalculate_user()

Provides on-the-fly recommendations for users who are not included in the pre-calculated model. Add movies to your own watched/clicked/liked list. Under the hood, your items are appended to the sparse matrix that was originally used to build the model; that matrix is then used to recalculate your user vector and show your top-10 recommendations.
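For readers curious about what "recalculating a user vector" means mathematically, the standard implicit-feedback ALS formulation solves one regularized least-squares problem for the new user while keeping the item factors fixed. The notation below is an assumption introduced for this sketch and does not appear in the app: Y is the item-factor matrix (one row per movie), p(u) is the user's binary preference vector, C^u is a diagonal confidence matrix with entries c_ui = 1 + alpha * r_ui, and lambda is the regularization strength.

```latex
% Fold-in of a new user u with the item factors Y held fixed.
x_u = \left( Y^{\top} C^{u} Y + \lambda I \right)^{-1} Y^{\top} C^{u} \, p(u),
\qquad
Y^{\top} C^{u} Y = Y^{\top} Y + Y^{\top} \left( C^{u} - I \right) Y
```

Because Y^T Y can be computed once and C^u - I is non-zero only for the movies the user actually added, this update is cheap enough to run on every change to the list.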

similar_items()

Search for a movie and find the top-10 most similar ones based on the item vectors in our model.
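A common way to compare item vectors is cosine similarity. The sketch below assumes that measure and uses made-up two-dimensional vectors; the function name and the data are hypothetical, and a real model would have many more dimensions.

```python
import math


def top_similar(item_vectors, query, n=10):
    """Rank items by cosine similarity to `query`'s vector, best first.

    `item_vectors` maps an item name to its latent vector (a list of
    floats, none of them all-zero). The query itself is part of the
    ranking and normally comes first.
    """
    def norm(vector):
        return math.sqrt(sum(x * x for x in vector))

    query_vector = item_vectors[query]
    query_norm = norm(query_vector)
    scores = []
    for name, vector in item_vectors.items():
        dot = sum(a * b for a, b in zip(query_vector, vector))
        scores.append((dot / (query_norm * norm(vector)), name))
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scores[:n]]


# Toy item vectors (made up).
toy_vectors = {
    "Movie A": [1.0, 0.0],
    "Movie B": [0.9, 0.1],
    "Movie C": [0.0, 1.0],
    "Movie D": [-1.0, 0.0],
}
neighbours = top_similar(toy_vectors, "Movie A", n=3)
print(neighbours)
```

Movies whose vectors point in nearly the same direction as the query rank highest, regardless of how long the vectors are.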

recommend_all()

Shows top-5 recommendations for 30 randomly selected user profiles. This gives a glimpse of the distribution of recommendations per age, gender or occupation.

This app uses the MovieLens 1M dataset and generates suggestions with an alternating least squares matrix factorization algorithm, implemented with the implicit library. Special thanks to Ben Frederickson for his articles and library, Babu Thomas for the URLs and posters of the movies, and Victor Kohler for his extensive article on the implementation.