Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Lecture 16: Recommender Systems

Lecture 16: Recommender Systems

UBC 2025-26

Imports

import os
import random
import sys
import time

import numpy as np
sys.path.append(os.path.join(os.path.abspath(".."), "code"))

import matplotlib.pyplot as plt
from plotting_functions import *
from plotting_functions_unsup import *
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_validate, train_test_split
from sklearn.preprocessing import StandardScaler

plt.rcParams["font.size"] = 16
import matplotlib.cm as cm

# plt.style.use("seaborn")
%matplotlib inline
pd.set_option("display.max_colwidth", 0)
DATA_DIR = os.path.join(os.path.abspath(".."), "data/")
<Figure size 640x480 with 2 Axes>

Learning outcomes

From this lecture, students are expected to be able to:

  • State the problem of recommender systems.

  • Describe components of a utility matrix.

  • Create a utility matrix given ratings data.

  • Describe a common approach to evaluate recommender systems.

  • Implement some baseline approaches to complete the utility matrix.

  • Explain the idea of collaborative filtering.

  • Formulate the rating prediction problem as a supervised machine learning problem.

  • Create a content-based filter given ratings data and item features to predict missing ratings in the utility matrix.

  • Explain some serious consequences of recommendation systems.



❓❓ Questions for you

What percentage of watch time on YouTube do you think comes from recommendations?

  • (A) 50%

  • (B) 60%

  • (C) 20%

  • (D) 90%

This question is based on this source. The statistics might have changed now.

Recommender systems intro and motivation

What is a recommender system?

  • A recommender or a recommendation system recommends a particular product or service to users they are likely to consume.

Example: Recommender Systems

  • A user goes to Amazon to buy products.

  • Amazon has some information about the user. They also have information about other users buying similar products.

  • What should they recommend to the user, so that they buy more products?

  • There’s no “right” answer (no label).

  • The whole idea is to understand user behavior in order to recommend them products they are likely to consume.



Why should we care about recommendation systems?

  • Almost everything we buy or consume today is in some way or the other influenced by recommendation systems.

    • Music (Spotify), videos (YouTube), news, books and products (Amazon), movies (Netflix), jokes, restaurants, dating , friends (Facebook), professional connections (LinkedIn)

  • Recommendation systems are at the core of the success of many companies such as Amazon and Netflix.

  • Recommendation systems are often presented as powerful tools that significantly reduce the effort users need to put in finding items, effectively mitigating the problem of information overload.

  • This is more or less true in many contexts. Consider, for instance, the experience of shopping an umbrella on Amazon without the help of recommendations or any specific ranking of products.

  • In the absence of a recommendation system, users would be faced with the daunting task of sifting through thousands of available products to find the one that best suits their needs.

  • That said, we should always be mindful of the drawbacks of relying heavily on recommender systems.

  • For example, these systems tend to recommend articles and products that are similar to those a user has previously interacted with or those their friends like.

  • While this can improve user experience by presenting more of what the system predicts the user will like, it can also lead to a phenomenon known as “filter bubbles”.

  • This effect narrows a user’s exposure to diverse viewpoints and information. Such a narrowing of perspective can be detrimental, particularly in the context of scientific research or political discourse, where exposure to a wide range of perspectives is crucial.



Data and main approaches

What kind of data we need to build recommendation systems?

  • Customer purchase history data (We worked with it last week.)

  • User-item interactions (e.g., ratings or clicks) (most common)

  • Features related to items or users

Main approaches

  • Collaborative filtering

    • “Unsupervised” learning

    • We only have labels yijy_{ij} (rating of user ii for item jj).

    • We learn latent features.

  • Content-based recommenders (today’s focus)

    • Supervised learning

    • Extract features xix_i of users and/or items building a model to predict rating yiy_i given xix_i.

    • Apply model to predict for new users/items.

  • Hybrid

    • Combining collaborative filtering with content-based filtering





Recommender systems problem

Problem formulation

  • Most often the data for recommender systems come in as interactions between a set of items and a set of users.

  • We have two entities: NN users and MM items.

  • Users are consumers.

  • Items are the products or services offered.

    • E.g., movies (Netflix), books (Amazon), songs (spotify), people (tinder)

  • A utility matrix is the matrix that captures interactions between NN users and MM items.

  • The interaction may come in different forms:

    • ratings, clicks, purchases

  • Below is a toy utility matrix. Here NN = 6 and MM = 5.

  • Each entry yijy_{ij} (ithi^{th} row and jthj^{th} column) denotes the rating given by the user ii to item jj.

  • We represent users in terms of items and items in terms of users.

  • The utility matrix is very sparse because usually users only interact with a few items.

  • For example:

    • all Netflix users will have rated only a small percentage of content available on Netflix

    • all amazon clients will have rated only a small fraction of items among all items available on Amazon

What do we predict?

Given a utility matrix of NN users and MM items, complete the utility matrix. In other words, predict missing values in the matrix.

  • Once we have predicted ratings, we can recommend items to users they are likely to rate higher.

  • Note: rating prediction \neq Classification or regression

In classification or regression:

  • We have XX and targets for some rows in XX.

  • We want to predict the last column (target column).

[???]\begin{bmatrix} \checkmark & \checkmark & \checkmark & \checkmark & \checkmark\\ \checkmark & \checkmark & \checkmark & \checkmark & \checkmark\\ \checkmark & \checkmark & \checkmark & \checkmark & \checkmark\\ \checkmark & \checkmark & \checkmark & \checkmark & ?\\ \checkmark & \checkmark & \checkmark & \checkmark & ?\\ \checkmark & \checkmark & \checkmark & \checkmark & ?\\ \end{bmatrix}

In rating prediction

  • Ratings data has many missing values in the utility matrix. There is no special target column. We want to predict the missing entries in the matrix.

  • Since our goal is to predict ratings, usually the utility matrix is referred to as YY matrix.

[????????????????????]\begin{bmatrix} ? & ? & \checkmark & ? & \checkmark\\ \checkmark & ? & ? & ? & ?\\ ? & \checkmark & \checkmark & ? & \checkmark\\ ? & ? & ? & ? & ?\\ ? & ? & ? & \checkmark & ?\\ ? & \checkmark & \checkmark & ? & \checkmark \end{bmatrix}



Creating utility matrix

Let’s work with the following toy example.

toy_ratings = pd.read_csv(DATA_DIR + "toy_ratings.csv")
toy_ratings
Loading...
user_key = "user_id"
item_key = "movie_id"
def get_stats(ratings, item_key="movie_id", user_key="user_id"):
    print("Number of ratings:", len(ratings))
    print("Average rating:  %0.3f" % (np.mean(ratings["rating"])))
    N = len(np.unique(ratings[user_key]))
    M = len(np.unique(ratings[item_key]))
    print("Number of users (N): %d" % N)
    print("Number of items (M): %d" % M)
    print("Fraction non-nan ratings: %0.3f" % (len(ratings) / (N * M)))
    return N, M


N, M = get_stats(toy_ratings)
Number of ratings: 21
Average rating:  3.524
Number of users (N): 4
Number of items (M): 12
Fraction non-nan ratings: 0.438
  • Let’s construct utility matrix with number of users rows and number of items columns from the ratings data.

Note we are constructing a non-sparse matrix for demonstration purpose here. In real life it’s recommended that you work with sparse matrices.

user_mapper = dict(zip(np.unique(toy_ratings[user_key]), list(range(N))))
item_mapper = dict(zip(np.unique(toy_ratings[item_key]), list(range(M))))
user_inverse_mapper = dict(zip(list(range(N)), np.unique(toy_ratings[user_key])))
item_inverse_mapper = dict(zip(list(range(M)), np.unique(toy_ratings[item_key])))
  • Why do we need all these mappers?

    • We want to store the rating for user ii and item jj at Y[i,j]Y[i,j] location in the utility matrix.

    • So we define user_mapper and item_mapper which map user and item ids to indices.

    • Once we have predicted ratings for users and items, we want to be able to map it to the original user and item ids so that we recommend the right product to the right user.

    • So we have user_inverse_mapper and item_inverse_mapper which map indices to original user and item ids.

user_key = "user_id"
item_key = "movie_id"
user_mapper = dict(zip(np.unique(toy_ratings[user_key]), list(range(N))))
item_mapper = dict(zip(np.unique(toy_ratings[item_key]), list(range(M))))
user_inverse_mapper = dict(zip(list(range(N)), np.unique(toy_ratings[user_key])))
item_inverse_mapper = dict(zip(list(range(M)), np.unique(toy_ratings[item_key])))

def create_Y_from_ratings(data, N, M):
    Y = np.zeros((N, M))
    Y.fill(np.nan)
    for index, val in data.iterrows():
        n = user_mapper[val[user_key]]
        m = item_mapper[val[item_key]]
        Y[n, m] = val["rating"]

    return Y
Y_mat = create_Y_from_ratings(toy_ratings, N, M)
Y_mat.shape
(4, 12)
pd.DataFrame(Y_mat)
Loading...
Y = create_Y_from_ratings(toy_ratings, N, M)
utility_mat = pd.DataFrame(Y, columns=item_mapper.keys(), index=user_mapper.keys())
utility_mat
Loading...
  • Rows represent users.

  • Columns represent items (movies in our case).

  • Each cell gives the rating given by the user to the corresponding movie.

  • Users are features for movies and movies are features for users.

  • Our goal is to predict missing entries in the utility matrix.



Evaluation

  • We’ll try a number of methods to fill in the missing entries in the utility matrix.

  • Although there is no notion of “accurate” recommendations, we need a way to evaluate our predictions so that we’ll be able to compare different methods.

  • Although we are doing unsupervised learning, we’ll split the data and evaluate our predictions as follows.

Data splitting

  • We split the ratings into train and validation sets.

  • It’s easier to split the ratings data instead of splitting the utility matrix.

  • Don’t worry about y; we’re not really going to use it.

X = toy_ratings.copy()
y = toy_ratings[user_key]
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=42
)

X_train.shape, X_valid.shape
((16, 3), (5, 3))

Now we will create utility matrices for train and validation splits.

train_mat = create_Y_from_ratings(X_train, N, M)
valid_mat = create_Y_from_ratings(X_valid, N, M)
train_mat.shape, valid_mat.shape
((4, 12), (4, 12))
(len(X_train) / (N * M)) # Fraction of non-nan entries in the train set
0.3333333333333333
(len(X_valid) / (N * M)) # Fraction of non-nan entries in the valid set
0.10416666666666667
pd.DataFrame(train_mat)
Loading...
pd.DataFrame(valid_mat)
Loading...
  • train_mat has only ratings from the train set and valid_mat has only ratings from the valid set.

  • During training we assume that we do not have access to some of the available ratings. We predict these ratings and evaluate them against ratings in the validation set.

Questions for you

  • How do train and validation utility matrices differ?

  • Why are utility matrices for train and validation sets are of the same shape?

  • Now that we have train and validation sets, how do we evaluate our predictions?

  • You can calculate the error between actual ratings and predicted ratings with metrics of your choice.

    • Most common ones are MSE or RMSE.

  • The error function below calculates RMSE and evaluate function prints train and validation RMSE.

  • Lower RMSE \rightarrow predicted ratings are closer to the actual ratings

def error(X1, X2):
    """
    Returns the root mean squared error.
    """
    return np.sqrt(np.nanmean((X1 - X2) ** 2))


def evaluate(pred_X, train_X, valid_X, model_name="Global average"):
    print("%s train RMSE: %0.2f" % (model_name, error(pred_X, train_X)))
    print("%s valid RMSE: %0.2f" % (model_name, error(pred_X, valid_X)))





Baseline Approaches

Let’s first try some simple approaches to predict missing entries.

  1. Global average baseline

  2. Per-user average baseline

  3. Per-item average baseline

  4. Average of 2 and 3

    • Take an average of per-user and per-item averages.

  5. kk-Nearest Neighbours imputation

Global average baseline

  • Let’s examine RMSE of the global average baseline.

  • In this baseline we predict everything as the global average rating.

avg = np.nanmean(train_mat)
pred_g = np.zeros(train_mat.shape) + avg
pd.DataFrame(pred_g).head()
Loading...
evaluate(pred_g, train_mat, valid_mat, model_name="Global average")
Global average train RMSE: 1.41
Global average valid RMSE: 0.83



kk-nearest neighbours imputation

  • Can we try kk-nearest neighbours type imputation?

  • Impute missing values using the mean value from kk nearest neighbours found in the training set.

  • Calculate distances between examples using features where neither value is missing.

pd.DataFrame(train_mat)
Loading...
from sklearn.impute import KNNImputer

imputer = KNNImputer(n_neighbors=2, keep_empty_features=True)
train_mat_imp = imputer.fit_transform(train_mat)
pd.DataFrame(train_mat_imp)
Loading...
evaluate(train_mat_imp, train_mat, valid_mat, model_name="KNN imputer")
KNN imputer train RMSE: 0.00
KNN imputer valid RMSE: 1.48
  • We can look at the nearest neighbours of a query item.



pd.DataFrame(train_mat_imp).head()
Loading...

Question

  • Instead of imputation, what would be the consequences if we replace NaN with zeros so that we can calculate distances between vectors?

Once you have predictions, you can sort them based on ratings and recommend items with highest ratings.





❓❓ Questions for you

Select all of the following statements which are True (iClicker)

  • (A) In the context of recommendation systems, the shapes of validation utility matrix and train utility matrix are the same.

  • (B) RMSE perfectly captures what we want to measure in the context of recommendation systems.

  • (C) It would be reasonable to impute missing values in the utility matrix by taking the average of the ratings given to an item by similar users.

  • (D) In KNN type imputation, if a user has not rated any items yet, a reasonable strategy would be recommending them the most popular item.





Questions for class discussion

Discuss how clustering might be applied to the problem of item recommendation:





Content-based filtering


  • What if a new item or a new user shows up?

    • You won’t have any ratings information for that item or user

  • Content-based filtering is suitable to predict ratings for new items and new users.

  • Content-based filtering is a supervised machine learning approach to recommender systems.

  • In KNN imputation (an example of collaborative filtering) we assumed that we only have ratings data.

  • Usually, there is some information available about items and users.

  • Examples

    • Netflix can describe movies as action, romance, comedy, documentaries.

    • Netflix has some demographic and preference information on users.

    • Amazon could describe books according to topics: math, languages, history.

    • Tinder could describe people according to age, location, employment.

  • Can we use this information to predict ratings in the utility matrix?

    • Yes! Using content-based filtering!

Overview

In content-based filtering,

  • We assume that we are given item or user feature.

  • Given movie information, for instance, we create user profile for each user.

  • We treat ratings prediction problem as a set of regression problems and build regression model for each user.

  • Once we have trained regression models for each user, we complete the utility matrix by predicting ratings for each user using their corresponding models.

Let’s look into each of these steps one by one with a toy example.

Movie features

  • Suppose we also have movie features. In particular, suppose we have information about the genre of each movie.

movie_feats_df = pd.read_csv(DATA_DIR + "toy_movie_feats.csv", index_col=0)
movie_feats_df
Loading...
Z = movie_feats_df.to_numpy()
Z.shape
(12, 6)
  • How can we use these features to predict missing ratings?

  • Using the ratings data and movie features:

    • Build profiles for different users.

    • Train a supervised machine learning model for each user.

    • Predict ratings using the trained models

  • Let’s consider an example user Pat.

  • We don’t know anything about Pat but we know her ratings to movies.

utility_mat.loc["Pat"]
A Beautiful Mind 3.0 Bambi 4.0 Cast Away 3.0 Downfall 2.0 Inception NaN Jerry Maguire 5.0 Lion King 4.0 Malcolm x NaN Man on Wire NaN Roman Holidays NaN The Social Dilemma NaN Titanic 3.0 Name: Pat, dtype: float64
  • We also know about movies and their features.

  • If Pat gave a high rating to Lion King, it means that she liked the features of the movie.

movie_feats_df.loc["Lion King"]
Action 0 Romance 0 Drama 1 Comedy 0 Children 1 Documentary 0 Name: Lion King, dtype: int64



Building user profiles

For each user ii create a user profile as follows.

  • Consider all movies rated by ii and create X and y for the user:

    • Each row in X contains the movie features of movie jj rated by ii.

    • Each value in y is the corresponding rating given to the movie jj by user ii.

  • Fit a regression model using X and y.

  • Apply the model to predict ratings for new items!

As an example, let’s build a profile for pat.

# Which movies are rated by Pat? 

movies_rated_by_pat = toy_ratings[toy_ratings['user_id']=='Pat'][['movie_id', 'rating']]
movies_rated_by_pat
Loading...
Z.shape
(12, 6)
# Get feature vectors of movies rated by Pat. 

pat_X = []
pat_y = []
for (index, val) in movies_rated_by_pat.iterrows():
    # Get the id of this movie rated by Pat       
    m = item_mapper[val['movie_id']]
    
    # Get the feature vector for the movie 
    pat_X.append(Z[m])
    
    # Get the rating for the movie
    pat_y.append(val['rating'])
pd.DataFrame(pat_X, index=movies_rated_by_pat['movie_id'].tolist(), columns = movie_feats_df.columns)
Loading...
pat_y
[3, 4, 4, 3, 5, 2, 3]

Similar to how we created X and y for Pat above, the function below builds X and y for all users.

from collections import defaultdict

def get_lr_data_per_user(ratings_df, d):
    lr_y = defaultdict(list)
    lr_X = defaultdict(list)
    lr_items = defaultdict(list)

    for index, val in ratings_df.iterrows():
        n = user_mapper[val[user_key]]
        m = item_mapper[val[item_key]]
        lr_X[n].append(Z[m])
        lr_y[n].append(val["rating"])
        lr_items[n].append(m)

    for n in lr_X:
        lr_X[n] = np.array(lr_X[n])
        lr_y[n] = np.array(lr_y[n])

    return lr_X, lr_y, lr_items
d = movie_feats_df.shape[1]
X_train_usr, y_train_usr, rated_items = get_lr_data_per_user(toy_ratings, d)
X_train_usr
defaultdict(list, {3: array([[0, 0, 1, 0, 1, 0], [0, 1, 1, 1, 0, 0], [0, 1, 1, 1, 0, 0], [0, 0, 0, 0, 0, 1]]), 0: array([[0, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0], [1, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1]]), 2: array([[0, 1, 1, 0, 0, 0], [0, 0, 1, 0, 1, 0], [0, 0, 1, 0, 1, 0], [0, 1, 1, 0, 0, 0], [0, 1, 1, 1, 0, 0], [0, 0, 0, 0, 0, 1], [0, 1, 1, 0, 0, 0]]), 1: array([[0, 1, 1, 0, 0, 0], [0, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 0, 1]])})

Do you think the shape of X and y for all users would be the same?

Examining user profiles

  • Let’s examine some user profiles.

def get_user_profile(user_name):
    X = X_train_usr[user_mapper[user_name]]
    y = y_train_usr[user_mapper[user_name]]
    items = rated_items[user_mapper[user_name]]
    movie_names = [item_inverse_mapper[item] for item in items]
    print("Profile for user: ", user_name)
    profile_df = pd.DataFrame(X, columns=movie_feats_df.columns, index=movie_names)
    profile_df["ratings"] = y
    return profile_df
get_user_profile("Pat")
Profile for user:  Pat
Loading...
  • Pat seems to like Children’s movies and movies with Comedy.

  • Seems like she’s not so much into romantic movies.

get_user_profile("Eva")
Profile for user:  Eva
Loading...
  • Eva hasn’t rated many movies. There are not many rows.

  • Eva seems to like documentaries and action movies.

  • Seems like she’s not so much into romantic movies.



Supervised approach to rating prediction

Given X and y for each user, we can now build a regression model for each user.

from sklearn.linear_model import Ridge


def train_for_usr(user_name, model=Ridge()):
    X = X_train_usr[user_mapper[user_name]]
    y = y_train_usr[user_mapper[user_name]]
    model.fit(X, y)
    return model


def predict_for_usr(model, movie_names):
    feat_vecs = movie_feats_df.loc[movie_names].values
    preds = model.predict(feat_vecs)
    return preds

A regression model for Pat

user_name = "Pat"
pat_model = train_for_usr(user_name)
  • Since we are training ridge model, we can examine the coefficients

  • What are the regression weights learned for Pat?

col = "Coefficients for %s" % user_name
pd.DataFrame(pat_model.coef_, index=movie_feats_df.columns, columns=[col])
Loading...
  • How would Pat rate some movies she hasn’t seen?

movies_to_pred = ["Roman Holidays", "Malcolm x"]
pred_df = movie_feats_df.loc[movies_to_pred]
pred_df
Loading...
user_name = "Pat"
preds = predict_for_usr(pat_model, movies_to_pred)
pred_df[user_name + "'s predicted ratings"] = preds
pred_df
Loading...

A regression model for Eva

user_name = "Eva"
eva_model = train_for_usr(user_name)
col = "Coefficients for %s" % user_name
pd.DataFrame(eva_model.coef_, index=movie_feats_df.columns, columns=[col])
Loading...
  • What are the predicted ratings for Eva for a list of movies?

user_name = "Eva"
preds = predict_for_usr(eva_model, movies_to_pred)
pred_df[user_name + "'s predicted ratings"] = preds
pred_df
Loading...



Completing the utility matrix with content-based filtering

Here is the original utility matrix.

utility_mat
Loading...
  • Using predictions per user, we can fill in missing entries in the utility matrix.

from sklearn.linear_model import Ridge

models = dict()
pred_lin_reg = np.zeros((N, M))

for n in range(N):
    models[n] = Ridge()
    models[n].fit(X_train_usr[n], y_train_usr[n])
    pred_lin_reg[n] = models[n].predict(Z)
pd.DataFrame(pred_lin_reg, columns=item_mapper.keys(), index=user_mapper.keys())
Loading...
  • In this toy example, we assumed access to item features. Frequently, we also have access to user features, including demographic information.

  • With this data, we can construct item profiles similar to user profiles and train a unique regression model for each item.

  • These models enable us to predict ratings for each item individually.

  • Typically, the final rating is derived from a weighted average that combines the ratings suggested by both item features and user features.

  • In this toy example, we assumed that we had item features. Often we also have access to user features such as their demographic information.

  • When such information is available, we can create item profiles similar to user profiles and train a regression model per item.

  • We can then predict ratings for each item using these models.

  • Often a weighted average of ratings given by item features and user features is used as the final rating.



Miscellaneous comments on content-based filtering

Fine-tuning your regression models

  • The feature matrix for movies can contain different types of features.

    • Example: Plot of the movie (text features), actors (categorical features), year of the movie, budget and revenue of the movie (numerical features).

    • You’ll apply our usual preprocessing techniques to these features.

  • If you have enough data, you could also carry out hyperparameter tuning with cross-validation for each model.

  • Finally, although we have been talking about linear models above, you can use any regression model of your choice.

Advantages of content-based filtering

  • We don’t need many users to provide ratings for an item.

  • Each user is modeled separately, so you might be able to capture uniqueness of taste.

  • Since you can obtain the features of the items, you can immediately recommend new items.

    • This would not have been possible with collaborative filtering.

  • Recommendations are more interpretable (if you use linear models)

    • You can explain to the user why you are recommending an item because you have learned weights.

Disadvantages of content-based filtering

  • Feature acquisition and feature engineering

    • What features should we use to explain the difference in ratings?

    • Obtaining those features for each item might be very expensive.

  • Less diversity: hardly recommend an item outside the user’s profile.

❓❓ Questions for you

Select all of the following statements which are True (iClicker)

  • (A) In content-based filtering we leverage available item features in addition to similarity between users.

  • (B) In content-based filtering you represent each user in terms of known features of items.

  • (C) In the set up of content-based filtering we discussed, if you have a new movie, you would have problems predicting ratings for that movie.

  • (D) In content-based filtering if a user has a number of ratings in the training utility matrix but does not have any ratings in the validation utility matrix then we won’t be able to calculate RMSE for the validation utility matrix.





Beyond error rate in recommendation systems

  • If a system gives the best RMSE it doesn’t necessarily mean that it’s going to give best recommendations.

  • In recommendation systems we do not have ground truth in the sense that there is no notion of “perfect” recommendations.

  • Training your model and evaluating it offline is not ideal.

  • Other aspects such as simplicity, interpretation, code maintainability are equally (if not more) important than best validation error.

  • Winning system of Netflix Challenge was never adopted.

    • Big mess of ensembles was not really maintainable

  • There are other considerations.

Diversity

Are these good recommendations?

You are looking at Education Solar Robot Toy, are these good recommendations?

Now suppose you’ve recently bought Education Solar Robot Toy and rated them highly. Are these good recommendations now?

  • Not really. Even though you really liked the item you don’t need similar items anymore.

  • Diversity is about how different are the recommendations.

    • Another example: Even if you really really like Star Wars, you might want non-Star-Wars suggestions.

  • But be careful. We need a balance here.

Freshness

Are these good recommendations?

  • Some of these books don’t have many ratings but it might be a good idea to recommend “fresh” things.

  • Freshness: people tend to get more excited about new/surprising things.

Trust

  • But again you need a balance here. What would happen if you keep surprising users all the time?

  • There might be trust issues.

  • Another aspect of trust is explaining your recommendation, i.e., telling the user why you made a recommendation. This gives the user an opportunity to understand why your recommendations could be interesting to them.

  • Injecting GPT-4’s reasoning into recommendation systems

Persistence:

  • How long should recommendations last?

  • If the user does not click on a recommendation for a while, should it remain a recommendation?

Social recommendation:

  • What did your friends watch?

  • Many recommenders are now connected to social networks.

  • “Login using you Facebook account”.

  • Often, people like similar movies to their friends.

  • If we get a new user, then recommendations are based on friend’s preferences.





Final comments and summary

Formulating the problem of recommender systems

  • We are given ratings data.

  • We use this data to create utility matrix which encodes interactions between users and items.

  • The utility matrix has many missing entries.

  • We defined recommendation systems problem as matrix completion problem.

What did we cover?

  • There is a big world of recommendation systems out there. We talked about a basic traditional approache to recommender systems.

    • content-based filtering

  • Another common approach we did not cover is collaborative filtering.

If you want to know more advanced approaches to recommender systems, watch this 4-hour summer school tutorial by Xavier Amatriain, Research/Engineering Director @ Netflix.

Evaluation

  • We split the data similar to supervised systems.

  • We evaluate recommendation systems using traditional regression metrics such as MSE or RMSE.

  • But real evaluation of recommender system can be very tricky because there is no ground truth.

  • We have been using RMSE due to the lack of a better measure.

  • What we actually want to measure is the interest that our user has in the recommended items.

Reminder

  • Recommendation systems can have terrible consequences, especially in the context of politics and extremism.

  • They can cause the phenomenon called “filter bubbles”.

  • Ask hard and uncomfortable questions to yourself (and to your employer if possible) before implementing and deploying a recommendation system.