
ML | Implement Face recognition using k-NN with scikit-learn

Last Updated : 15 Mar, 2019

k-Nearest Neighbors:

k-NN is one of the most basic classification algorithms in machine learning. It belongs to the supervised learning category. k-NN is often used in search applications where you are looking for “similar” items. Similarity is measured by creating a vector representation of each item and then comparing the vectors with an appropriate distance metric (the Euclidean distance, for example).
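
The idea of comparing vectors by distance can be sketched in a few lines of NumPy. This is only a minimal illustration on made-up 2-D points; the function name `knn_predict` and the toy data are assumptions for the example and are not part of the face-recognition scripts.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training vector
    distances = np.linalg.norm(X_train - query, axis=1)
    # indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # most common label among those neighbours
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# toy 2-D feature vectors (hypothetical data, two well-separated groups)
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
y_train = np.array(["a", "a", "a", "b", "b", "b"])

label_near_a = knn_predict(X_train, y_train, np.array([1.1, 1.0]))
label_near_b = knn_predict(X_train, y_train, np.array([5.1, 5.0]))
print(label_near_a, label_near_b)
```

For face recognition the same logic applies, except that each vector holds the pixel values of a face image instead of two coordinates.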

It is generally used in data mining, pattern recognition, recommender systems and intrusion detection.

Libraries used are: OpenCV (cv2), NumPy, pandas and scikit-learn.


Dataset used:
We used the haarcascade_frontalface_default.xml file, a pre-trained Haar cascade for frontal-face detection rather than a dataset of images. It is easily available online and is also distributed with OpenCV.

scikit-learn provides a range of supervised and unsupervised learning algorithms via a consistent interface in Python.
The library is built upon SciPy, which must be installed in order to use scikit-learn.
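
As a rough illustration of that consistent interface, the sketch below trains a k-NN classifier on a handful of made-up one-dimensional points. The data and labels are assumptions chosen for the example; only `KNeighborsClassifier`, `fit` and `predict` come from scikit-learn.

```python
from sklearn.neighbors import KNeighborsClassifier

# toy data (hypothetical): one feature, two classes
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = ["low", "low", "low", "high", "high", "high"]

# scikit-learn classifiers are configured in the constructor,
# trained with fit() and queried with predict()
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

predictions = model.predict([[0.05], [1.05]])
print(list(predictions))
```

Other scikit-learn classifiers can be swapped in by changing only the constructor line, which is what makes the interface convenient for experiments.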

Face recognition:
The implementation consists of three Python files: the first detects the face and stores it in a list, the second saves that data to a ‘.csv’ file, and the third recognizes the face.
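
Before the scripts themselves, it may help to see what "storing the face in a list" amounts to. The sketch below assumes, as the capture script does, that each face is resized to 100x100 grayscale pixels; random arrays stand in for real webcam crops, and the name `captures` is hypothetical.

```python
import numpy as np

# stand-in for ten captured 100x100 grayscale face crops
# (random pixels here; the real script takes them from the webcam)
rng = np.random.default_rng(0)
captures = [rng.integers(0, 256, size=(100, 100), dtype=np.uint8)
            for _ in range(10)]

# flatten each 2-D image into a 1-D vector of 100 * 100 = 10000 pixels
f_list = [img.reshape(-1) for img in captures]

# stacking the list gives one row per capture
samples = np.array(f_list)
print(samples.shape)  # (10, 10000)
```

Each row of `samples` is one training vector, which is why the CSV helper uses 10000 pixel columns.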

# this file is used to detect face 
# and then store the data of the face
import cv2
import numpy as np
# import the file where data is
# stored in a csv file format
import npwriter
name = input("Enter your name: ")
# this is used to access the web-cam
# in order to capture frames
cap = cv2.VideoCapture(0)
# this classifier detects faces using the pre-trained
# haarcascade_frontalface_default.xml file
classifier = cv2.CascadeClassifier("../dataset/haarcascade_frontalface_default.xml")
# list that collects the flattened face images
f_list = []
while True:
    ret, frame = cap.read()
    # skip this iteration if no frame could be read
    if not ret:
        continue
    # converting the image into gray
    # scale as it is easy for detection
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # detectMultiScale detects the faces and their coordinates
    faces = classifier.detectMultiScale(gray, 1.5, 5)
    # sort by area so that the largest face, i.e. the one
    # closest to the web-cam, comes first
    faces = sorted(faces, key = lambda x: x[2]*x[3],
                                     reverse = True)
    # only the first detected face is used
    faces = faces[:1]
    # len(faces) is the number of
    # faces kept for this frame
    if len(faces) == 1:
        # taking the single face out of the list
        face = faces[0]
        # storing the coordinates of the
        # face in different variables
        x, y, w, h = face
        # this will show the face
        # that is being detected
        im_face = frame[y:y + h, x:x + w]
        cv2.imshow("face", im_face)
    cv2.imshow("full", frame)
    key = cv2.waitKey(1)
    # pressing 'q' ends the program and
    # pressing 'c' captures the current face
    if key & 0xFF == ord('q'):
        break
    elif key & 0xFF == ord('c'):
        if len(faces) == 1:
            gray_face = cv2.cvtColor(im_face, cv2.COLOR_BGR2GRAY)
            gray_face = cv2.resize(gray_face, (100, 100))
            print(len(f_list), type(gray_face), gray_face.shape)
            # this will append the flattened face pixels to f_list
            f_list.append(gray_face.reshape(-1))
        else:
            print("face not found")
        # stop after 10 captures of the face; several
        # samples are stored in order to increase accuracy
        if len(f_list) == 10:
            break
# write() is declared in npwriter; nothing is
# written if no face was captured
if f_list:
    npwriter.write(name, np.array(f_list))
cap.release()
cv2.destroyAllWindows()

Create/update the ‘.csv’ file (npwriter.py):

import pandas as pd
import numpy as np
import os.path
f_name = "face_data.csv"
# storing the data into a csv file
def write(name, data):
    # one column per pixel; the data is already flattened
    # to 10000 values per face when it is stored in f_list
    columns = [str(i) for i in range(10000)]
    if os.path.isfile(f_name):
        # append the new rows to the existing file
        df = pd.read_csv(f_name, index_col = 0)
        latest = pd.DataFrame(data, columns = columns)
        latest["name"] = name
        df = pd.concat((df, latest), ignore_index = True, sort = False)
    else:
        # first run: create the table from scratch
        df = pd.DataFrame(data, columns = columns)
        df["name"] = name
    df.to_csv(f_name)

Face recognizer:

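The recognizer reads face_data.csv back and slices it with `data[:, 1:-1]` and `data[:, -1]`. The sketch below shows why those slices work, using a tiny table with 4 pixel columns instead of 10000 and an in-memory buffer instead of a file. The label "alice" and the pixel values are made up for illustration.

```python
import io
import numpy as np
import pandas as pd

# two tiny "faces" of 4 pixels each instead of 10000 (hypothetical data)
pixels = np.array([[10, 20, 30, 40],
                   [50, 60, 70, 80]])
df = pd.DataFrame(pixels, columns=[str(i) for i in range(4)])
df["name"] = "alice"  # hypothetical label

# write to an in-memory buffer instead of face_data.csv
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)

# read it back the way the recognizer does: the saved index
# becomes the first column, the name is the last one
data = pd.read_csv(buffer).values
X, Y = data[:, 1:-1], data[:, -1]
print(X.shape, list(Y))
```

Column 0 is the row index that `to_csv` saves by default, so it is skipped; everything between it and the final name column is pixel data.
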
# this file recognizes the face after
# training a k-NN model on the data
# stored in the csv file
import cv2
import numpy as np
import pandas as pd
from npwriter import f_name
from sklearn.neighbors import KNeighborsClassifier
# reading the data
data = pd.read_csv(f_name).values
# data partition
X, Y = data[:, 1:-1], data[:, -1]
print(X, Y)
# Knn function calling with k = 5
model = KNeighborsClassifier(n_neighbors = 5)
# training of model
model.fit(X, Y)
cap = cv2.VideoCapture(0)
classifier = cv2.CascadeClassifier("../dataset/haarcascade_frontalface_default.xml")
f_list = []
while True:
    ret, frame = cap.read()
    # skip this iteration if no frame could be read
    if not ret:
        continue
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = classifier.detectMultiScale(gray, 1.5, 5)
    X_test = []
    # Testing data
    for face in faces:
        x, y, w, h = face
        im_face = gray[y:y + h, x:x + w]
        im_face = cv2.resize(im_face, (100, 100))
        # flatten the face to match the training rows
        X_test.append(im_face.reshape(-1))
    if len(faces)>0:
        # prediction of result using knn
        response = model.predict(np.array(X_test))
        for i, face in enumerate(faces):
            x, y, w, h = face
            # drawing a rectangle on the detected face
            cv2.rectangle(frame, (x, y), (x + w, y + h),
                                         (255, 0, 0), 3)
            # adding detected/predicted name for the face
            cv2.putText(frame, str(response[i]), (x-50, y-50),
                              cv2.FONT_HERSHEY_DUPLEX, 2,
                                         (0, 255, 0), 3)
    cv2.imshow("full", frame)
    key = cv2.waitKey(1)
    if key & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
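
The training and prediction steps can be tried without a webcam. The sketch below is an assumption-laden stand-in: `fake_faces` is a hypothetical helper that produces noisy flat images of a given brightness, and the names "alice" and "bob" are invented. Only the shape of the data (10 flattened 100x100 samples per person) and the k = 5 setting mirror the scripts above.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)

def fake_faces(brightness, count):
    """Hypothetical stand-in for captured faces: noisy flat images."""
    noise = rng.normal(0, 5, size=(count, 100 * 100))
    return np.clip(brightness + noise, 0, 255)

# ten flattened 100x100 "faces" per person, as the capture script stores
X = np.vstack([fake_faces(60, 10), fake_faces(180, 10)])
Y = np.array(["alice"] * 10 + ["bob"] * 10)  # hypothetical names

model = KNeighborsClassifier(n_neighbors=5)
model.fit(X, Y)

# a new frame may contain several faces; predict() takes one row per face
X_test = np.vstack([fake_faces(62, 1), fake_faces(178, 1)])
response = model.predict(X_test)
print(list(response))
```

Note that k-NN always returns one of the stored names, so a face that was never captured is still labelled as the closest known person.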

