User-Based Collaborative Filtering

User-based collaborative filtering predicts the items a user might like from the ratings given to those items by other users whose tastes are similar to the target user's.
Many websites use collaborative filtering to build their recommendation systems.

Steps for User-Based Collaborative Filtering:

Step 1: Finding the similarity of users to the target user U.

The similarity between any two users ‘a’ and ‘b’ can be calculated with the following formula (the Pearson correlation of their ratings):

Sim(a,b)=\frac {\sum_p (r_{ap}-\bar r_a)(r_{bp}-\bar r_b)}{\sqrt{\sum_p (r_{ap}-\bar r_a)^2} \sqrt{\sum_p (r_{bp}-\bar r_b)^2}}

where r_{up} is the rating given by user u to item p, \bar r_u is the mean rating of user u, and p ranges over the items rated by both users.
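The similarity formula translates almost directly into code. The following is a minimal sketch, not a reference implementation: the function name `pearson_similarity` and the choice to return 0.0 when a user has no variance in their ratings are assumptions made for illustration.

```python
import math

def pearson_similarity(ratings_a, ratings_b):
    """Pearson similarity of two users over the items both have rated.

    ratings_a[k] and ratings_b[k] must refer to the same item.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")

    mean_a = sum(ratings_a) / len(ratings_a)
    mean_b = sum(ratings_b) / len(ratings_b)

    # Deviation of each rating from the user's own mean.
    dev_a = [r - mean_a for r in ratings_a]
    dev_b = [r - mean_b for r in ratings_b]

    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = (math.sqrt(sum(x * x for x in dev_a))
                   * math.sqrt(sum(y * y for y in dev_b)))

    # Assumption: a user who gave every item the same rating has zero
    # variance, so the formula is undefined; 0.0 is returned instead.
    if denominator == 0:
        return 0.0
    return numerator / denominator
```

Two users whose ratings rise and fall together, such as `[1, 2, 3]` and `[2, 4, 6]`, get a similarity close to 1, while opposite patterns give a value close to -1.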

Step 2: Prediction of missing rating of an item
The target user may be very similar to some users and not very similar to others. Ratings given to an item by more similar users should therefore carry more weight than ratings given by less similar users. A weighted average achieves this: each user's rating of the item, taken as a deviation from that user's mean rating, is multiplied by the similarity computed with the formula above, and the weighted sum is normalised by the sum of the absolute similarities.

The missing rating can be calculated as,

r_{up}=\bar r_u+\frac{\sum_{i\in users}sim(u,i)*(r_{ip}-\bar r_i)}{\sum_{i\in users}|sim(u,i)|}
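The weighted average can be sketched as a small function. This version uses the mean-centred form, in which each neighbour's rating is taken as a deviation from that neighbour's own mean, which is the form the worked example in this article applies. The name `predict_rating`, the tuple layout of `neighbours`, and the fallback to the target's mean are assumptions for illustration.

```python
def predict_rating(target_mean, neighbours):
    """Predict a missing rating as the target's mean plus a
    similarity-weighted average of the neighbours' deviations.

    neighbours: iterable of (similarity, rating, neighbour_mean) tuples,
    one per user who has rated the item.
    """
    neighbours = list(neighbours)
    weighted_sum = sum(sim * (rating - mean) for sim, rating, mean in neighbours)
    total_weight = sum(abs(sim) for sim, _, _ in neighbours)

    # Assumption: with no usable neighbours, fall back to the target's mean.
    if total_weight == 0:
        return target_mean
    return target_mean + weighted_sum / total_weight
```

Dividing by the sum of absolute similarities keeps the correction on the same scale as the ratings even when some similarities are negative.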

Example: Consider a matrix showing the ratings that four users, Alice, U1, U2 and U3, have given to different news apps. Ratings range from 1 to 5 according to how much the user likes the app. A ‘?’ indicates that the user has not rated the app.

Name    Inshorts(I1)  HT(I2)  NYT(I3)  TOI(I4)  BBC(I5)
Alice   5             4       1        4        ?
U1      3             1       2        3        3
U2      4             3       4        3        5
U3      3             3       1        5        4

Step 1: Calculating the similarity between Alice and all the other users
First, we calculate each user's average rating, excluding I5 because Alice has not rated it. The average for user i is

\bar r_i= \frac {\sum_p r_{ip}}{n}

where n is the number of items included in the sum (here n = 4).

Therefore, we have
\bar r_{Alice}=3.5
\bar r_{U1}=2.25
\bar r_{U2}=3.5
\bar r_{U3}=3

and calculate the new ratings as,
r_{ip}'=r_{ip}-\bar r_i

Hence, we get the following matrix,

Name    Inshorts(I1)  HT(I2)  NYT(I3)  TOI(I4)
Alice   1.5           0.5     -2.5     0.5
U1      0.75          -1.25   -0.25    0.75
U2      0.5           -0.5    0.5      -0.5
U3      0             0       -2       2
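The mean-centring step can be reproduced in a few lines. This sketch hard-codes the example's ratings for I1 to I4; the variable names `ratings`, `means` and `centred` are chosen for illustration.

```python
# Ratings for the four apps that Alice and the other users all rated (I1..I4).
ratings = {
    "Alice": [5, 4, 1, 4],
    "U1": [3, 1, 2, 3],
    "U2": [4, 3, 4, 3],
    "U3": [3, 3, 1, 5],
}

# Each user's mean rating, then each rating's deviation from that mean.
means = {user: sum(row) / len(row) for user, row in ratings.items()}
centred = {user: [r - means[user] for r in row] for user, row in ratings.items()}

for user, row in centred.items():
    print(user, means[user], row)
```

Each centred row sums to zero, which is a quick sanity check on the hand-computed matrix.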

Now, we calculate the similarity between Alice and all the other users.

Sim(Alice,U1)=\frac {(1.5*0.75)+(0.5*(-1.25))+((-2.5)*(-0.25))+(0.5*0.75)}{\sqrt{1.5^2+0.5^2+(-2.5)^2+0.5^2} \sqrt{0.75^2+(-1.25)^2+(-0.25)^2+0.75^2}}=0.301

Sim(Alice,U2)=\frac {(1.5*0.5)+(0.5*(-0.5))+((-2.5)*0.5)+(0.5*(-0.5))}{\sqrt{1.5^2+0.5^2+(-2.5)^2+0.5^2} \sqrt{0.5^2+(-0.5)^2+0.5^2+(-0.5)^2}}=-0.33

Sim(Alice,U3)=\frac {(1.5*0)+(0.5*0)+((-2.5)*(-2))+(0.5*2)}{\sqrt{1.5^2+0.5^2+(-2.5)^2+0.5^2} \sqrt{0^2+0^2+(-2)^2+2^2}}=0.707

Step 2: Predicting the rating of the app not rated by Alice

Now, we predict Alice’s rating for BBC News App,

r_{(Alice,I5)}=\bar r_{Alice}+ \frac {sim(Alice,U1)*(r_{U1,I5}-\bar r_{U1})+sim(Alice,U2)*(r_{U2,I5}-\bar r_{U2})+sim(Alice,U3)*(r_{U3,I5}-\bar r_{U3})}{|sim(Alice,U1)|+|sim(Alice,U2)|+|sim(Alice,U3)|}

r_{(Alice,I5)}=3.5 + \frac {(0.301*0.75)+(-0.33*1.5)+(0.707*1)}{|0.301|+|-0.33|+|0.707|}=3.83
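The whole example can also be run end to end from the raw rating matrix. The sketch below is one possible arrangement, not a general-purpose recommender: it assumes every other user has rated all the items Alice has rated (true for this matrix), and the helper names `mean` and `similarity` are illustrative.

```python
import math

# Rating matrix from the example; None marks the app Alice has not rated.
ratings = {
    "Alice": {"I1": 5, "I2": 4, "I3": 1, "I4": 4, "I5": None},
    "U1": {"I1": 3, "I2": 1, "I3": 2, "I4": 3, "I5": 3},
    "U2": {"I1": 4, "I2": 3, "I3": 4, "I4": 3, "I5": 5},
    "U3": {"I1": 3, "I2": 3, "I3": 1, "I4": 5, "I5": 4},
}
target, item = "Alice", "I5"

# Items the target has rated; means are taken over these, as in the text.
rated = [i for i, r in ratings[target].items() if r is not None]

def mean(user):
    return sum(ratings[user][i] for i in rated) / len(rated)

def similarity(a, b):
    dev_a = [ratings[a][i] - mean(a) for i in rated]
    dev_b = [ratings[b][i] - mean(b) for i in rated]
    num = sum(x * y for x, y in zip(dev_a, dev_b))
    den = math.sqrt(sum(x * x for x in dev_a)) * math.sqrt(sum(y * y for y in dev_b))
    return num / den if den else 0.0

# Neighbours are the other users who have rated the target item.
neighbours = [u for u in ratings if u != target and ratings[u][item] is not None]
sims = {u: similarity(target, u) for u in neighbours}

weighted = sum(sims[u] * (ratings[u][item] - mean(u)) for u in neighbours)
prediction = mean(target) + weighted / sum(abs(s) for s in sims.values())

print(round(prediction, 2))
```

With unrounded similarities the script prints 3.82. The hand calculation gives 3.83 because it rounds each similarity to two or three decimals before combining them.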

This small example illustrates how user-based collaborative filtering works.



