User-Based Collaborative Filtering

Last Updated : 22 Jan, 2023

User-Based Collaborative Filtering is a technique for predicting the items a user might like from the ratings given to those items by other users whose tastes are similar to the target user's. Many websites use collaborative filtering to build their recommendation systems.

Steps for User-Based Collaborative Filtering:

Step 1: Finding the similarity of users to the target user U. The similarity between any two users ‘a’ and ‘b’ can be calculated with the Pearson correlation formula,

$Sim(a,b)=\frac {\sum_p (r_{ap}-\bar r_a)(r_{bp}-\bar r_b)}{\sqrt{\sum_p (r_{ap}-\bar r_a)^2} \sqrt{\sum_p (r_{bp}-\bar r_b)^2}}$

where $r_{up}$ is the rating of user u for item p, $\bar r_u$ is the average rating of user u, and p ranges over the items rated by both users.
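The similarity formula above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `pearson_similarity`, the dict-based rating representation, and the choice to return 0.0 when the similarity is undefined are all assumptions made for this sketch. The averages are taken over the items both users rated, which matches how the worked example in this article treats the unrated item.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson similarity between two users.

    ratings_a, ratings_b: dicts mapping item -> rating (hypothetical layout).
    Only items rated by both users are used.
    """
    common = [p for p in ratings_a if p in ratings_b]
    if not common:
        return 0.0  # assumption: no overlap means no evidence of similarity

    mean_a = sum(ratings_a[p] for p in common) / len(common)
    mean_b = sum(ratings_b[p] for p in common) / len(common)

    # Numerator: sum of products of mean-centered ratings.
    num = sum((ratings_a[p] - mean_a) * (ratings_b[p] - mean_b) for p in common)
    # Denominator: product of the two centered vector lengths.
    den_a = sqrt(sum((ratings_a[p] - mean_a) ** 2 for p in common))
    den_b = sqrt(sum((ratings_b[p] - mean_b) ** 2 for p in common))

    if den_a == 0 or den_b == 0:
        return 0.0  # assumption: a user with constant ratings gets similarity 0
    return num / (den_a * den_b)


# Hypothetical users: b always rates twice as high as a, c rates in reverse order.
a = {"x": 1, "y": 2, "z": 3}
b = {"x": 2, "y": 4, "z": 6}
c = {"x": 3, "y": 2, "z": 1}
print(pearson_similarity(a, b), pearson_similarity(a, c))
```

Users whose ratings rise and fall together score close to 1, and users with opposite preferences score close to -1, regardless of how generous each user is overall.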

Step 2: Prediction of the missing rating of an item. The target user may be very similar to some users and much less similar to others. Hence, the ratings given to a particular item by more similar users should carry more weight than those given by less similar users. This is handled with a weighted average: each user's rating, taken as a deviation from that user's own average rating, is multiplied by the similarity calculated with the formula above, and the total is normalized by the sum of the absolute similarities. The missing rating can be calculated as,

$r_{up}=\bar r_u+\frac{\sum_{i\in users}sim(u,i)*(r_{ip}-\bar r_i)}{\sum_{i\in users}|sim(u,i)|}$
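A short Python sketch of this weighted average is given below. It uses the mean-centered deviations $r_{ip}-\bar r_i$, which is the form the worked example in this article applies. The function name `predict_rating`, the tuple layout of `neighbours`, and the fallback to the target user's average when all similarities are zero are assumptions made for illustration.

```python
def predict_rating(target_mean, neighbours):
    """Predict a missing rating for the target user.

    target_mean: the target user's average rating.
    neighbours: list of (similarity, neighbour_rating, neighbour_mean) tuples,
                one per other user who rated the item (hypothetical layout).
    """
    # Weighted sum of each neighbour's deviation from their own average.
    num = sum(sim * (rating - mean) for sim, rating, mean in neighbours)
    # Normalize by the total absolute similarity.
    den = sum(abs(sim) for sim, _, _ in neighbours)
    if den == 0:
        return target_mean  # assumption: fall back to the user's own average
    return target_mean + num / den


# Hypothetical case: one similar user rated the item 1 above their average,
# one dissimilar user rated it 1 below theirs. Both push the prediction up.
print(predict_rating(3.0, [(0.5, 4, 3), (-0.5, 2, 3)]))
```

A negatively correlated user who dislikes an item counts as mild evidence that the target user will like it, which is why the numerator keeps the sign of the similarity while the denominator uses its absolute value.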

Example: Consider a matrix of the ratings that four users, Alice, U1, U2 and U3, gave to different news apps. Ratings range from 1 to 5 according to how much the user likes the app. A ‘?’ indicates that the user has not rated the app.

Name	Inshorts(I1)	HT(I2)	NYT(I3)	TOI(I4)	BBC(I5)
Alice	5	4	1	4	?
U1	3	1	2	3	3
U2	4	3	4	3	5
U3	3	3	1	5	4

Step 1: Calculating the similarity between Alice and all the other users. First, we calculate each user's average rating, excluding I5 because Alice has not rated it. The average is calculated as,

$\bar r_i= \frac {1}{n}\sum_p r_{ip}$

where n is the number of items included (here, n = 4). Therefore, we have $\bar r_{Alice}=3.5$, $\bar r_{U1}=2.25$, $\bar r_{U2}=3.5$ and $\bar r_{U3}=3$. We then calculate the mean-centered ratings as,

$r_{ip}'=r_{ip}-\bar r_i$

Hence, we get the following matrix,

Name	Inshorts(I1)	HT(I2)	NYT(I3)	TOI(I4)
Alice	1.5	0.5	-2.5	0.5
U1	0.75	-1.25	-0.25	0.75
U2	0.5	-0.5	0.5	-0.5
U3	0	0	-2	2
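The averages and the mean-centered matrix above can be reproduced with a few lines of Python. This is only a sketch: the variable names `ratings`, `means` and `centered` are chosen for illustration, and the lists hold the I1 to I4 ratings from the example table (I5 is left out because Alice has not rated it).

```python
# Ratings for I1..I4 taken from the example table.
ratings = {
    "Alice": [5, 4, 1, 4],
    "U1": [3, 1, 2, 3],
    "U2": [4, 3, 4, 3],
    "U3": [3, 3, 1, 5],
}

# Average rating per user over the four co-rated items.
means = {user: sum(r) / len(r) for user, r in ratings.items()}

# Subtract each user's own average from each of their ratings.
centered = {user: [x - means[user] for x in r] for user, r in ratings.items()}

for user in ratings:
    print(user, means[user], centered[user])
```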

Now, we calculate the similarity between Alice and all the other users.

$Sim(Alice,U1)=\frac {(1.5*0.75)+(0.5*(-1.25))+((-2.5)*(-0.25))+(0.5*0.75)}{\sqrt{1.5^2+0.5^2+2.5^2+0.5^2} \sqrt{0.75^2+1.25^2+0.25^2+0.75^2}}=0.301$

$Sim(Alice,U2)=\frac {(1.5*0.5)+(0.5*(-0.5))+((-2.5)*0.5)+(0.5*(-0.5))}{\sqrt{1.5^2+0.5^2+2.5^2+0.5^2} \sqrt{0.5^2+0.5^2+0.5^2+0.5^2}}=-0.33$

$Sim(Alice,U3)=\frac {(1.5*0)+(0.5*0)+((-2.5)*(-2))+(0.5*2)}{\sqrt{1.5^2+0.5^2+2.5^2+0.5^2} \sqrt{0^2+0^2+2^2+2^2}}=0.707$
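As a quick check of this arithmetic, the three similarities can be recomputed from the mean-centered rows. The sketch below hard-codes those rows from the matrix above; the helper name `similarity` and the dictionary layout are assumptions made for illustration.

```python
from math import sqrt

# Mean-centered ratings for I1..I4, copied from the matrix above.
centered = {
    "Alice": [1.5, 0.5, -2.5, 0.5],
    "U1": [0.75, -1.25, -0.25, 0.75],
    "U2": [0.5, -0.5, 0.5, -0.5],
    "U3": [0.0, 0.0, -2.0, 2.0],
}


def similarity(x, y):
    """Dot product of two centered rows divided by the product of their lengths."""
    num = sum(a * b for a, b in zip(x, y))
    den = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return num / den if den else 0.0


sims = {u: similarity(centered["Alice"], centered[u]) for u in ("U1", "U2", "U3")}
for user, value in sims.items():
    print(user, round(value, 3))
```

U3 comes out as the user most similar to Alice, while U2 is negatively correlated with her.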

Step 2: Predicting the rating of the app not rated by Alice. Now, we predict Alice’s rating for the BBC News app,

$r_{(Alice,I5)}=\bar r_{Alice}+ \frac {(sim(Alice,U1)*(r_{U1,I5}-\bar r_{U1}))+(sim(Alice,U2)*(r_{U2,I5}-\bar r_{U2}))+(sim(Alice,U3)*(r_{U3,I5}-\bar r_{U3}))}{|sim(Alice,U1)|+|sim(Alice,U2)|+|sim(Alice,U3)|}$

$r_{(Alice,I5)}=3.5 + \frac {(0.301*0.75)+(-0.33*1.5)+(0.707*1)}{|0.301|+|-0.33|+|0.707|}=3.83$

This small example illustrates how user-based collaborative filtering works.
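The final step can also be checked numerically. The sketch below plugs the rounded similarities from this example, together with each user's I5 rating and average, into the weighted-average formula. The variable names and the tuple layout are assumptions made for illustration.

```python
alice_mean = 3.5

# (similarity to Alice, rating of I5, user's average over I1..I4)
neighbours = {
    "U1": (0.301, 3, 2.25),
    "U2": (-0.33, 5, 3.5),
    "U3": (0.707, 4, 3.0),
}

# Similarity-weighted sum of each neighbour's deviation from their own average.
num = sum(sim * (rating - mean) for sim, rating, mean in neighbours.values())
# Sum of absolute similarities used for normalization.
den = sum(abs(sim) for sim, _, _ in neighbours.values())

prediction = alice_mean + num / den
print(round(prediction, 2))  # 3.83
```

With unrounded similarities the same formula gives roughly 3.82, so the second decimal place here depends on the rounding of the inputs.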
