Yawn Detection using OpenCV and Dlib

Last Updated: 07 Mar, 2022
In this article, we'll cover all the steps required to build a yawn detection program using the OpenCV and dlib packages. Before starting this project, you should be familiar with the basics of OpenCV and know how to perform face detection and landmark detection with the dlib module.


Prerequisites:

  1. The dlib library installed in your environment
  2. The dlib face landmark '.dat' file. Optional: the XML file for the Haar cascade classifier, if you want to use a Haar cascade classifier for face detection
  3. The OpenCV package installed in your environment


Steps:

  1. Initialize the video capture object using OpenCV's VideoCapture class
  2. Convert each frame to a grayscale image
  3. Instantiate model objects for both face detection and landmark detection
  4. Detect faces, then pass each face as input to the landmark detection model
  5. Calculate the distance between the upper and lower lip (or whatever metric you want to use for yawn detection)
  6. Create an if condition that compares the lip distance against a threshold
  7. Show the frame/image
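
Step 5 leaves the choice of metric open. A raw pixel distance grows and shrinks as the user moves toward or away from the camera, so one possible alternative is to divide the lip opening by the mouth width. The sketch below is an illustration, not part of the original program: the function name `mouth_open_ratio` is hypothetical, and it assumes the standard 68-point landmark layout in which points 48 and 54 are the mouth corners and points 62 and 66 are the centres of the inner upper and lower lip.

```python
import numpy as np


def mouth_open_ratio(shape):
    """Return inner-lip opening divided by mouth width.

    `shape` is a (68, 2) array of landmark coordinates. The indices used
    here assume the standard 68-point layout (an assumption, see lead-in).
    """
    shape = np.asarray(shape, dtype=float)
    opening = np.linalg.norm(shape[62] - shape[66])  # inner upper lip to inner lower lip
    width = np.linalg.norm(shape[48] - shape[54])    # left mouth corner to right mouth corner
    if width == 0:
        return 0.0  # degenerate landmarks: avoid dividing by zero
    return opening / width


# Synthetic landmarks, only the four points the function reads are filled in
landmarks = np.zeros((68, 2))
landmarks[48] = (100, 200)
landmarks[54] = (160, 200)
landmarks[62] = (130, 195)
landmarks[66] = (130, 225)

ratio = mouth_open_ratio(landmarks)
ratio_scaled = mouth_open_ratio(landmarks * 2)  # same face, twice as large in the frame
```

Because both the opening and the width scale together, the ratio stays the same when the face appears larger or smaller in the frame. A threshold on this ratio would still need to be tuned by experiment, just like a pixel threshold.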



Implementation:

import numpy as np
import cv2
import dlib
import time 
from scipy.spatial import distance as dist
from imutils import face_utils
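def cal_yawn(shape):
    # Collect upper-lip points from the outer (50-52) and inner (61-63) contours
    top_lip = shape[50:53]
    top_lip = np.concatenate((top_lip, shape[61:64]))
    # Collect lower-lip points from the outer (56-58) and inner (65-67) contours
    low_lip = shape[56:59]
    low_lip = np.concatenate((low_lip, shape[65:68]))
    # Distance in pixels between the mean upper-lip point and the mean lower-lip point
    top_mean = np.mean(top_lip, axis=0)
    low_mean = np.mean(low_lip, axis=0)
    distance = dist.euclidean(top_mean,low_mean)
    return distance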
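# The video source is empty in this listing; pass a device index (such as 0) or a file/stream path
cam = cv2.VideoCapture('')
face_model = dlib.get_frontal_face_detector()
# Forward slash avoids the invalid '\s' escape sequence and also works on Windows
landmark_model = dlib.shape_predictor('Model/shape_predictor_68_face_landmarks.dat')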
yawn_thresh = 35
ptime = 0
while True:
    # Grab the next frame; stop when the source returns nothing
    suc,frame = cam.read()
    if not suc : 
        break
    ctime = time.time() 
    fps= int(1/(ctime-ptime))
    ptime = ctime
    #------Detecting face------#
    img_gray = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
    faces = face_model(img_gray)
    for face in faces:
        # #------Uncomment the following lines if you also want to detect the face ----------#
        # x1 = face.left()
        # y1 = face.top()
        # x2 = face.right()
        # y2 = face.bottom()
        # # print(
        # cv2.rectangle(frame,(x1,y1),(x2,y2),(200,0,00),2)
        #----------Detect Landmarks-----------#
        shapes = landmark_model(img_gray,face)
        shape = face_utils.shape_to_np(shapes)
        #-------Detecting/Marking the lower and upper lip--------#
        lip = shape[48:60]
        cv2.drawContours(frame,[lip],-1,(0, 165, 255),thickness=3)
        #-------Calculating the lip distance-----#
        lip_dist = cal_yawn(shape)
        # print(lip_dist)
        if lip_dist > yawn_thresh : 
            cv2.putText(frame, f'User Yawning!',(frame.shape[1]//2 - 170 ,frame.shape[0]//2),cv2.FONT_HERSHEY_SIMPLEX,2,(0,0,200),2)  
    cv2.imshow('Webcam' , frame)
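    if cv2.waitKey(1) & 0xFF == ord('q') : 
        break
# Release the video source and close the display window
cam.release()
cv2.destroyAllWindows()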
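
The loop above flags a yawn as soon as a single frame's lip distance exceeds yawn_thresh, so talking or a brief landmark glitch can trigger the message. One possible refinement, sketched below, is to require the distance to stay above the threshold for several consecutive frames. The class name `YawnDebouncer` and the `min_frames` parameter are hypothetical and not part of the original program, and the default of 15 frames is an arbitrary starting point rather than a tuned value.

```python
class YawnDebouncer:
    """Report a yawn only after the lip distance has stayed above
    `threshold` for `min_frames` consecutive frames (hypothetical helper)."""

    def __init__(self, threshold=35, min_frames=15):
        self.threshold = threshold
        self.min_frames = min_frames
        self.count = 0  # consecutive frames seen above the threshold

    def update(self, lip_dist):
        # Extend the streak while the mouth stays open, reset it otherwise
        if lip_dist > self.threshold:
            self.count += 1
        else:
            self.count = 0
        return self.count >= self.min_frames


# Simulated per-frame lip distances; a short streak of 3 frames is used for the demo
debouncer = YawnDebouncer(threshold=35, min_frames=3)
readings = [40, 20, 40, 41, 42, 43, 10]
flags = [debouncer.update(d) for d in readings]
```

In the main loop, the call `debouncer.update(lip_dist)` would take the place of the direct comparison `lip_dist > yawn_thresh`, with one debouncer per tracked face.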


What next?

You can try combining this program with an eye blink detection or liveness detection program to predict the user's state. This can serve as the basis for a real-world application that detects the user's state and sets alarms or reminders accordingly.
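
As a starting point for the blink detection side, the sketch below computes the widely used eye aspect ratio (EAR) from six eye landmarks. This article does not prescribe a blink detection method, so treat it as one possible choice. The function name `eye_aspect_ratio` is hypothetical, and the point ordering (corner, two upper-lid points, corner, two lower-lid points) is assumed to match slices such as shape[36:42] and shape[42:48] in the standard 68-point layout.

```python
import numpy as np


def eye_aspect_ratio(eye):
    """Return the eye aspect ratio for six (x, y) eye landmarks.

    Assumed ordering: eye[0] and eye[3] are the eye corners, eye[1] and
    eye[2] lie on the upper lid, eye[5] and eye[4] on the lower lid.
    """
    eye = np.asarray(eye, dtype=float)
    vertical_1 = np.linalg.norm(eye[1] - eye[5])
    vertical_2 = np.linalg.norm(eye[2] - eye[4])
    horizontal = np.linalg.norm(eye[0] - eye[3])
    if horizontal == 0:
        return 0.0  # degenerate landmarks: avoid dividing by zero
    return (vertical_1 + vertical_2) / (2.0 * horizontal)


# Synthetic examples: an open eye and a fully closed eye
open_eye = [(0, 0), (1, -1), (2, -1), (3, 0), (2, 1), (1, 1)]
closed_eye = [(0, 0), (1, 0), (2, 0), (3, 0), (2, 0), (1, 0)]

ear_open = eye_aspect_ratio(open_eye)
ear_closed = eye_aspect_ratio(closed_eye)
```

The ratio drops toward zero as the eyelids close, so a blink shows up as a short dip below a threshold chosen by experiment. Combined with the yawn check, a long run of low values alongside frequent yawns could be treated as a sign of drowsiness.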

