Python – Facial and hand recognition using MediaPipe Holistic

  • Difficulty Level : Expert
  • Last Updated : 03 Nov, 2021

What is MediaPipe:  

Object detection is one of the most popular use cases in computer vision. Many object detection models are in use worldwide, but most are deployed as independent solutions to a single computer vision task with a fixed application. MediaPipe combines several of these tasks into a single end-to-end, real-time solution.

MediaPipe is an open-source, cross-platform machine learning framework for building complex, multimodal applied machine learning pipelines. It can be used to build pipelines for tasks such as face detection, multi-hand tracking, and object detection and tracking. MediaPipe acts as a mediator that handles running models on any supported platform, which lets the developer focus on experimenting with models rather than on the system.


 Possibilities with MediaPipe:

  1. Human Pose Detection and Tracking: High-fidelity human body pose tracking, inferring a minimum of 25 2D upper-body landmarks from RGB video frames.
  2. Face Mesh: 468 face landmarks in 3D with multi-face support.
  3. Hand Tracking: 21 landmarks in 3D with multi-hand support, based on high-performance palm detection and a hand landmark model.
  4. Holistic Tracking: Simultaneous and semantically consistent tracking of 33 pose, 21 per-hand, and 468 facial landmarks.
  5. Hair Segmentation: Super realistic real-time hair recoloring.
  6. Object Detection and Tracking: Detection and tracking of objects in video in a single pipeline.
  7. Face Detection: Ultra-lightweight face detector with 6 landmarks and multi-face support.
  8. Iris Tracking and Depth Estimation: Accurate human iris tracking and metric depth estimation without specialized hardware. Tracks iris, pupil, and eye contour landmarks.
  9. 3D Object Detection: Detection and 3D pose estimation of everyday objects like shoes and chairs.

MediaPipe Holistic:

MediaPipe Holistic is a pipeline that combines optimized face, hand, and pose components, which allows hand and body poses to be detected simultaneously with face landmarks. One of the main uses of MediaPipe Holistic is to detect the face and hands and extract key points that are then passed on to a computer vision model.



Detect face and hands using Holistic and extract key points

The following code snippet defines functions that read image input from the system webcam using OpenCV, detect hand and facial landmarks with MediaPipe Holistic, and draw the detected key points on each frame.

Python3




'''
Install dependencies
pip install opencv-python
pip install mediapipe
'''
# Import packages
import cv2
import mediapipe as mp
 
#Build Keypoints using MP Holistic
mp_holistic = mp.solutions.holistic # Holistic model
mp_drawing = mp.solutions.drawing_utils # Drawing utilities
 
def mediapipe_detection(image, model):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # COLOR CONVERSION BGR 2 RGB
    image.flags.writeable = False                  # Mark read-only (NumPy spells the flag "writeable")
    results = model.process(image)                 # Make prediction
    image.flags.writeable = True                   # Image is writeable again
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR) # COLOR CONVERSION RGB 2 BGR
    return image, results
   
def draw_landmarks(image, results):
    mp_drawing.draw_landmarks(
      image, results.face_landmarks, mp_holistic.FACEMESH_CONTOURS) # Draw face connections (named FACE_CONNECTIONS in older MediaPipe releases)
    mp_drawing.draw_landmarks(
      image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS) # Draw pose connections
    mp_drawing.draw_landmarks(
      image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS) # Draw left hand connections
    mp_drawing.draw_landmarks(
      image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS) # Draw right hand connections
     
def draw_styled_landmarks(image, results):
    # Draw face connections
    mp_drawing.draw_landmarks(
      image, results.face_landmarks, mp_holistic.FACEMESH_CONTOURS, # named FACE_CONNECTIONS in older MediaPipe releases
      mp_drawing.DrawingSpec(color=(80,110,10), thickness=1, circle_radius=1),
      mp_drawing.DrawingSpec(color=(80,255,121), thickness=1, circle_radius=1)) # 8-bit channel values stop at 255
    # Draw pose connections
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS,
                             mp_drawing.DrawingSpec(color=(80,22,10), thickness=2, circle_radius=4),
                             mp_drawing.DrawingSpec(color=(80,44,121), thickness=2, circle_radius=2)
                             )
    # Draw left hand connections
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                             mp_drawing.DrawingSpec(color=(121,22,76), thickness=2, circle_radius=4),
                             mp_drawing.DrawingSpec(color=(121,44,250), thickness=2, circle_radius=2)
                             )
    # Draw right hand connections 
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
                             mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=4),
                             mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2)
                             )
#Main function
cap = cv2.VideoCapture(0)
# Set mediapipe model
with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
 
        # Read feed
        ret, frame = cap.read()
        if not ret:  # Stop if no frame could be read (camera missing or busy)
            break
 
        # Make detections
        image, results = mediapipe_detection(frame, holistic)
        print(results)
         
        # Draw landmarks
        draw_styled_landmarks(image, results)
 
        # Show to screen
        cv2.imshow('OpenCV Feed', image)
 
        # Break gracefully
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()
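
The landmarks in `results` carry `x` and `y` values normalized by the frame width and height, and a body part that was not detected is left as `None`. The sketch below shows one way to turn a hand's landmarks into pixel coordinates under those two facts. The helper names `landmark_to_pixel` and `hand_pixels` are hypothetical and not part of MediaPipe, and the `SimpleNamespace` objects only stand in for real results so the example runs without a camera.

```python
from types import SimpleNamespace


def landmark_to_pixel(landmark, image_width, image_height):
    """Convert one normalized landmark to integer pixel coordinates.

    Returns None when the point lies outside the frame.
    """
    if not (0.0 <= landmark.x <= 1.0 and 0.0 <= landmark.y <= 1.0):
        return None
    # Clamp so that a value of exactly 1.0 maps to the last valid pixel.
    px = min(int(landmark.x * image_width), image_width - 1)
    py = min(int(landmark.y * image_height), image_height - 1)
    return px, py


def hand_pixels(hand_landmarks, image_width, image_height):
    """Pixel coordinates for every landmark of one hand, or [] if undetected."""
    if hand_landmarks is None:  # nothing was detected for this hand
        return []
    return [
        landmark_to_pixel(lm, image_width, image_height)
        for lm in hand_landmarks.landmark
    ]


# Stand-in for results.left_hand_landmarks with a single landmark.
fake_hand = SimpleNamespace(landmark=[SimpleNamespace(x=0.5, y=0.25, z=0.0)])
print(hand_pixels(fake_hand, 640, 480))  # [(320, 120)]
print(hand_pixels(None, 640, 480))       # []
```

Inside the capture loop, the same helper could be called as `hand_pixels(results.left_hand_landmarks, frame.shape[1], frame.shape[0])`.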

Landmarks that can be detected using Mediapipe Holistic

1. Pose landmarks

Credit: MediaPipe

2. Hand Landmarks

Credit: MediaPipe

3. Face Landmarks

Image generated using the MediaPipe function posted above and plotted on a graphical plane using matplotlib. Source: Author
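
To pass the pose, hand, and face landmarks on to another model, they are commonly flattened into one fixed-length feature vector. The following is a minimal sketch of that idea, not the article's own method. It assumes the default Holistic settings (33 pose, 468 face, and 21 per-hand landmarks), that pose landmarks also expose a `visibility` value, and that missing parts should be zero-filled. The names `flatten` and `extract_keypoints` are hypothetical.

```python
from types import SimpleNamespace

# Landmark counts for the default Holistic configuration (assumption).
POSE_COUNT, FACE_COUNT, HAND_COUNT = 33, 468, 21


def flatten(landmark_list, count, fields):
    """Flatten one landmark group; zero-fill when the group was not detected."""
    if landmark_list is None:
        return [0.0] * (count * len(fields))
    return [getattr(lm, f) for lm in landmark_list.landmark for f in fields]


def extract_keypoints(results):
    """Concatenate pose, face, left-hand and right-hand values into one list."""
    pose = flatten(results.pose_landmarks, POSE_COUNT, ("x", "y", "z", "visibility"))
    face = flatten(results.face_landmarks, FACE_COUNT, ("x", "y", "z"))
    left = flatten(results.left_hand_landmarks, HAND_COUNT, ("x", "y", "z"))
    right = flatten(results.right_hand_landmarks, HAND_COUNT, ("x", "y", "z"))
    return pose + face + left + right


# Stand-in for a frame in which nothing was detected.
empty = SimpleNamespace(
    pose_landmarks=None,
    face_landmarks=None,
    left_hand_landmarks=None,
    right_hand_landmarks=None,
)
print(len(extract_keypoints(empty)))  # 1662 = 33*4 + 468*3 + 21*3 + 21*3
```

Zero-filling keeps the vector length constant from frame to frame, which is usually what a downstream model expects. The resulting list can be wrapped in a NumPy array if the model needs one.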

 



