Mathematical explanation of K-Nearest Neighbour
KNN stands for K-nearest neighbours. It is a supervised learning algorithm, mostly used for classification, that labels a data point on the basis of how its neighbours are classified. KNN stores all available cases and classifies new cases based on a similarity measure. K in KNN is a parameter that refers to the number of nearest neighbours to include in the majority voting process.
How do we choose K?
A common rule of thumb is k = sqrt(n), where n is the total number of data points. If the resulting value is even, make it odd by adding or subtracting 1; with two classes, an odd k helps avoid ties in the majority vote.
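The rule of thumb above can be sketched in a few lines of Python. The function name choose_k is only illustrative, and rounding sqrt(n) to the nearest integer and then adding 1 (rather than subtracting 1) are assumptions made here, since the rule allows either.

```python
import math

def choose_k(n: int) -> int:
    """Rule-of-thumb k: round sqrt(n), then force the result to be odd."""
    k = max(1, round(math.sqrt(n)))  # at least one neighbour
    if k % 2 == 0:
        k += 1  # assumption: add 1; subtracting 1 would also satisfy the rule
    return k

print(choose_k(10))   # 3  (sqrt(10) is about 3.16, already odd after rounding)
print(choose_k(100))  # 11 (sqrt(100) = 10 is even, so it becomes 11)
```

This is only a starting point; in practice several values of k are usually compared on held-out data.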
When to use KNN?
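We can use KNN when the dataset is labelled and noise-free, and it must be small because KNN is a “lazy learner”: it does not build a model in advance but defers the computation to prediction time, when the new case is compared against all stored cases. Let’s understand the KNN algorithm with the help of an example.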
|NAME||AGE||GENDER||CLASS OF SPORTS|
Here male is denoted by the numeric value 0 and female by 1. Let’s find which class Angelina, whose age is 5, falls into, taking k = 3. To do so, we use the Euclidean distance formula
d=√((x2-x1)²+(y2-y1)²) to find the distance between any two points, where x is the age and y is the gender code.
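The distance formula translates directly into code. The sketch below assumes each person is represented as an (age, gender code) pair; the function name euclidean_distance and the second point in the usage lines are hypothetical and are not taken from the table.

```python
import math

def euclidean_distance(p, q):
    """Distance between two 2-D points p = (x1, y1) and q = (x2, y2)."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)

# Sanity check with the classic 3-4-5 right triangle.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0

# Angelina: age 5, gender code 1 (assumed from the female = 1 encoding).
angelina = (5, 1)
hypothetical_person = (8, 1)  # made-up point, for illustration only
print(euclidean_distance(angelina, hypothetical_person))  # 3.0
```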
So let’s find the distance between Ajay and Angelina using this formula.
Similarly, we find all the other distances one by one.
|Distance between Angelina and||Distance|
The value of k for Angelina is 3, so we take the three smallest distances: 9, 10 and 10.5. The three nearest neighbours of Angelina are therefore Zaira, Michael and Smith.
Zaira 9 cricket
Michael 10 cricket
Smith 10.5 football
So, according to the KNN algorithm, Angelina belongs to the class of people who like cricket, because two of her three nearest neighbours like cricket and only one likes football. This is how the KNN algorithm works.
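The final step, picking the k smallest distances and taking a majority vote, can be sketched as follows. The function name knn_vote is illustrative, and the list contains only the three neighbours whose distances are quoted above; in a full run it would hold one row per person in the dataset.

```python
from collections import Counter

def knn_vote(distances, k):
    """Pick the k rows with the smallest distance and return the majority label.

    distances: list of (name, distance, label) tuples.
    Returns (predicted_label, nearest_rows).
    """
    nearest = sorted(distances, key=lambda row: row[1])[:k]
    votes = Counter(label for _, _, label in nearest)
    return votes.most_common(1)[0][0], nearest

# Distances from Angelina, as given in the example.
neighbours = [
    ("Zaira", 9.0, "cricket"),
    ("Michael", 10.0, "cricket"),
    ("Smith", 10.5, "football"),
]

predicted, nearest = knn_vote(neighbours, k=3)
print(predicted)  # cricket (2 votes against 1)
```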