Python – Facial and hand recognition using MediaPipe Holistic

Last Updated : 04 Jan, 2023

What is MediaPipe:  

Object detection is one of the most popular use cases in computer vision. Many object detection models are in use worldwide, but most of them are independent solutions to a single computer vision task with one fixed application. Combining several of these tasks into a single end-to-end, real-time solution is exactly what MediaPipe does.

MediaPipe is an open-source, cross-platform machine learning framework for building complex, multimodal applied machine learning pipelines. It can be used to build solutions such as face detection, multi-hand tracking, and object detection and tracking. MediaPipe acts as a mediator that handles running the models on any supported platform, which lets the developer focus on experimenting with models rather than on the underlying system.

 Possibilities with MediaPipe:

  1. Human Pose Detection and Tracking: High-fidelity human body pose tracking, inferring a minimum of 25 2D upper-body landmarks from RGB video frames
  2. Face Mesh: 468 face landmarks in 3D with multi-face support
  3. Hand Tracking: 21 landmarks in 3D with multi-hand support, based on high-performance palm detection and a hand landmark model
  4. Holistic Tracking: Simultaneous and semantically consistent tracking of 33 pose, 21 per-hand, and 468 facial landmarks
  5. Hair Segmentation: Super realistic real-time hair recoloring
  6. Object Detection and Tracking: Detection and tracking of objects in video in a single pipeline
  7. Face Detection: Ultra-lightweight face detector with 6 landmarks and multi-face support
  8. Iris Tracking and Depth Estimation: Accurate human iris tracking and metric depth estimation without specialized hardware. Tracks iris, pupil, and eye contour landmarks.
  9. 3D Object Detection: Detection and 3D pose estimation of everyday objects like shoes and chairs
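
The Holistic Tracking figures in the list (33 pose, 21 per-hand, and 468 facial landmarks) determine how much data one fully detected frame produces. The sketch below just adds them up; the dictionary and function names are illustrative and are not part of MediaPipe, and it assumes three values (x, y, z) are stored per landmark.

```python
# Landmark counts per component, as listed for Holistic Tracking above.
LANDMARK_COUNTS = {
    "pose": 33,
    "left_hand": 21,
    "right_hand": 21,
    "face": 468,
}


def total_landmarks(counts=LANDMARK_COUNTS):
    """Return the number of landmarks in one fully detected frame."""
    return sum(counts.values())


def flat_vector_length(counts=LANDMARK_COUNTS, values_per_landmark=3):
    """Length of a flat feature vector, assuming (x, y, z) per landmark."""
    return total_landmarks(counts) * values_per_landmark


print(total_landmarks())     # 543
print(flat_vector_length())  # 1629
```

Knowing this size up front is useful when the key points are later fed to a model that expects a fixed-length input.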

MediaPipe Holistic:

MediaPipe Holistic is a pipeline that combines optimized face, hand, and pose components for holistic tracking, so that hand and body poses can be detected simultaneously along with face landmarks. One of the main uses of MediaPipe Holistic is to detect the face and hands and extract key points to pass on to a computer vision model.

Detect face and hands using Holistic and extract key points

The following code snippet reads frames from the system webcam using OpenCV, detects pose, hand, and facial landmarks with MediaPipe Holistic, and draws them on each frame. The detected key points are available in the results object returned for every frame.

Python3

'''
Install dependencies 
pip install opencv-python 
pip install mediapipe
'''
# Import packages
import cv2
import mediapipe as mp
  
#Build Keypoints using MP Holistic
mp_holistic = mp.solutions.holistic # Holistic model
mp_drawing = mp.solutions.drawing_utils # Drawing utilities
  
def mediapipe_detection(image, model):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # COLOR CONVERSION BGR 2 RGB
    image.flags.writeable = False                  # Mark the image read-only (NumPy flag is "writeable")
    results = model.process(image)                 # Make prediction
    image.flags.writeable = True                   # Image is writable again
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR) # COLOR CONVERSION RGB 2 BGR
    return image, results
    
def draw_landmarks(image, results):
    # Note: newer MediaPipe releases expose FACEMESH_CONTOURS / FACEMESH_TESSELATION;
    # older releases used mp_holistic.FACE_CONNECTIONS instead.
    mp_drawing.draw_landmarks(
      image, results.face_landmarks, mp_holistic.FACEMESH_CONTOURS) # Draw face connections
    mp_drawing.draw_landmarks(
      image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS) # Draw pose connections
    mp_drawing.draw_landmarks(
      image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS) # Draw left hand connections
    mp_drawing.draw_landmarks(
      image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS) # Draw right hand connections
      
def draw_styled_landmarks(image, results):
    # Draw face connections
    # (newer MediaPipe releases use FACEMESH_CONTOURS; older ones used FACE_CONNECTIONS)
    mp_drawing.draw_landmarks(
      image, results.face_landmarks, mp_holistic.FACEMESH_CONTOURS,
      mp_drawing.DrawingSpec(color=(80,110,10), thickness=1, circle_radius=1), 
      mp_drawing.DrawingSpec(color=(80,255,121), thickness=1, circle_radius=1)) # Channel values are 0-255
    # Draw pose connections
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS,
                             mp_drawing.DrawingSpec(color=(80,22,10), thickness=2, circle_radius=4), 
                             mp_drawing.DrawingSpec(color=(80,44,121), thickness=2, circle_radius=2))

    # Draw left hand connections
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS, 
                             mp_drawing.DrawingSpec(color=(121,22,76), thickness=2, circle_radius=4), 
                             mp_drawing.DrawingSpec(color=(121,44,250), thickness=2, circle_radius=2))

    # Draw right hand connections  
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS, 
                             mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=4), 
                             mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2))

#Main function
cap = cv2.VideoCapture(0)
# Set mediapipe model 
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():

        # Read feed
        ret, frame = cap.read()
        if not ret:  # Stop if the camera returned no frame
            break

        # Make detections
        image, results = mediapipe_detection(frame, holistic)
        print(results)

        # Draw landmarks
        draw_styled_landmarks(image, results)

        # Show to screen
        cv2.imshow('OpenCV Feed', image)

        # Break gracefully
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()
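
The code above draws the landmarks but does not yet turn them into the key points mentioned earlier. The following is a minimal sketch of one way to do that, not the only one: it flattens a Holistic result into a single list of x, y, z values. The function name extract_keypoints and the COMPONENTS table are illustrative and not part of MediaPipe. The sketch relies on two properties of the Holistic result: a component that was not detected in the frame is None, and a detected component carries a landmark list whose items have x, y, and z fields. A small stand-in object replaces a real result so the example runs without a camera.

```python
from types import SimpleNamespace

# Result attribute names and expected landmark counts per component.
COMPONENTS = (
    ("pose_landmarks", 33),
    ("face_landmarks", 468),
    ("left_hand_landmarks", 21),
    ("right_hand_landmarks", 21),
)


def extract_keypoints(results):
    """Flatten a Holistic result into one list of x, y, z values.

    Components that were not detected (attribute is None) are zero-filled,
    so the output has the same length for every frame.
    """
    keypoints = []
    for attr, count in COMPONENTS:
        component = getattr(results, attr, None)
        if component is None:
            keypoints.extend([0.0] * (count * 3))
        else:
            for lm in component.landmark:
                keypoints.extend([lm.x, lm.y, lm.z])
    return keypoints


# Stand-in for a real result (assumption for the demo): only the left hand is detected.
fake_hand = SimpleNamespace(
    landmark=[SimpleNamespace(x=0.5, y=0.25, z=0.0) for _ in range(21)]
)
fake_results = SimpleNamespace(
    pose_landmarks=None,
    face_landmarks=None,
    left_hand_landmarks=fake_hand,
    right_hand_landmarks=None,
)

vector = extract_keypoints(fake_results)
print(len(vector))  # 1629
```

In the webcam loop, the same function could be called as extract_keypoints(results) right after mediapipe_detection. Zero-filling missing components is a design choice that keeps the vector length fixed, which most downstream models require.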


Landmarks that can be detected using MediaPipe Holistic

1. Pose landmarks

Credit: MediaPipe

2. Hand Landmarks

Credit: MediaPipe

3. Face Landmarks

Image generated using the MediaPipe function posted above and plotted on a graphical plane using matplotlib. Source: Author
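
MediaPipe reports landmark x and y values normalized to the range 0 to 1 relative to the image width and height, so plotting or drawing them by hand requires scaling back to pixels. The helper below is a small sketch of that conversion; the name to_pixel_coords is hypothetical, and clamping to the image bounds is an assumption made here so that points predicted slightly outside the frame stay drawable.

```python
def to_pixel_coords(x_norm, y_norm, image_width, image_height):
    """Map normalized landmark coordinates to integer pixel coordinates."""
    # Scale by the image size, then clamp into the valid pixel range.
    px = min(int(x_norm * image_width), image_width - 1)
    py = min(int(y_norm * image_height), image_height - 1)
    return max(px, 0), max(py, 0)


print(to_pixel_coords(0.5, 0.25, 640, 480))  # (320, 120)
print(to_pixel_coords(1.0, 1.0, 640, 480))   # (639, 479)
```

When plotting with matplotlib, keep in mind that image coordinates grow downward, so the y axis usually needs to be inverted to match the camera view.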
