Open In App

Python | Implementation of Movie Recommender System

Recommender System is a system that seeks to predict or filter preferences according to the user’s choices. Recommender systems are utilized in a variety of areas including movies, music, news, books, research articles, search queries, social tags, and products in general. 
Recommender systems produce a list of recommendations in any of the two ways – 
 

Let’s develop a basic recommendation system using Python and Pandas. 
Let’s focus on providing a basic recommendation system by suggesting items that are most similar to a particular item, in this case, movies. It just tells what movies/items are most similar to the user’s movie choice.
To download the files, click on the links – .tsv file, Movie_Id_Titles.csv.
Import dataset with delimiter “\t” as the file is a tsv file (tab-separated file). 
 






# import pandas library
import pandas as pd
  
# Get the data
column_names = ['user_id', 'item_id', 'rating', 'timestamp']
  
  
df = pd.read_csv(path, sep='\t', names=column_names)
  
# Check the head of the data
df.head()

 
 






# Check out all the movies and their respective IDs
movie_titles.head()

  
 




data = pd.merge(df, movie_titles, on='item_id')
data.head()

 
 




# Calculate mean rating of all movies
data.groupby('title')['rating'].mean().sort_values(ascending=False).head()

  
 




# Calculate count rating of all movies
data.groupby('title')['rating'].count().sort_values(ascending=False).head()

  
 




# creating dataframe with 'rating' count values
ratings = pd.DataFrame(data.groupby('title')['rating'].mean()) 
  
ratings['num of ratings'] = pd.DataFrame(data.groupby('title')['rating'].count())
  
ratings.head()

 
  
Visualization imports: 
 




import matplotlib.pyplot as plt
import seaborn as sns
  
sns.set_style('white')
%matplotlib inline

 
 




# plot graph of 'num of ratings column'
plt.figure(figsize =(10, 4))
  
ratings['num of ratings'].hist(bins = 70)

 
 




# plot graph of 'ratings' column
plt.figure(figsize =(10, 4))
  
ratings['rating'].hist(bins = 70)

 
 




# Sorting values according to 
# the 'num of rating column'
moviemat = data.pivot_table(index ='user_id',
              columns ='title', values ='rating')
  
moviemat.head()
  
ratings.sort_values('num of ratings', ascending = False).head(10)

  
 




# analysing correlation with similar movies
starwars_user_ratings = moviemat['Star Wars (1977)']
liarliar_user_ratings = moviemat['Liar Liar (1997)']
  
starwars_user_ratings.head()

 
 




# analysing correlation with similar movies
similar_to_starwars = moviemat.corrwith(starwars_user_ratings)
similar_to_liarliar = moviemat.corrwith(liarliar_user_ratings)
  
corr_starwars = pd.DataFrame(similar_to_starwars, columns =['Correlation'])
corr_starwars.dropna(inplace = True)
  
corr_starwars.head()

  
 

 




# Similar movies like starwars
corr_starwars.sort_values('Correlation', ascending = False).head(10)
corr_starwars = corr_starwars.join(ratings['num of ratings'])
  
corr_starwars.head()
  
corr_starwars[corr_starwars['num of ratings']>100].sort_values('Correlation', ascending = False).head()




# Similar movies as of liarliar
corr_liarliar = pd.DataFrame(similar_to_liarliar, columns =['Correlation'])
corr_liarliar.dropna(inplace = True)
  
corr_liarliar = corr_liarliar.join(ratings['num of ratings'])
corr_liarliar[corr_liarliar['num of ratings']>100].sort_values('Correlation', ascending = False).head()


Article Tags :