Python – Facial and hand recognition using MediaPipe Holistic
What is MediaPipe:
Object detection is one of the most popular use cases in computer vision. Many object detection models exist, but most of them are used as standalone solutions to a single computer vision task with a fixed application. Combining several of these tasks into a single end-to-end, real-time solution is exactly what MediaPipe does.
MediaPipe is an open-source, cross-platform machine learning framework for building complex, multimodal applied machine learning pipelines. It can be used to build solutions such as face detection, multi-hand tracking, and object detection and tracking, among many others. MediaPipe acts as a mediator that handles running models on systems across platforms, which lets the developer focus on experimenting with models rather than on the system.
Possibilities with MediaPipe:
- Human Pose Detection and Tracking: High-fidelity human body pose tracking, inferring a minimum of 25 2D upper-body landmarks from RGB video frames
- Face Mesh: 468 face landmarks in 3D with multi-face support
- Hand Tracking: 21 landmarks in 3D with multi-hand support, based on high-performance palm detection and a hand landmark model
- Holistic Tracking: Simultaneous and semantically consistent tracking of 33 pose, 21 per-hand, and 468 facial landmarks
- Hair Segmentation: Super realistic real-time hair recoloring
- Object Detection and Tracking: Detection and tracking of objects in video in a single pipeline
- Face Detection: Ultra-lightweight face detector with 6 landmarks and multi-face support
- Iris Tracking and Depth Estimation: Accurate human iris tracking and metric depth estimation without specialized hardware. Tracks iris, pupil, and eye contour landmarks.
- 3D Object Detection: Detection and 3D pose estimation of everyday objects like shoes and chairs
MediaPipe Holistic:
MediaPipe Holistic is a pipeline that combines optimized face, hand, and pose components, enabling the model to detect hand and body poses together with face landmarks simultaneously. One of its main uses is to detect the face and hands and extract key points to pass on to a downstream computer vision model.
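As a minimal sketch of that key point extraction, the function below flattens one frame's result into a fixed-length list of floats. It assumes a result object shaped like the one MediaPipe Holistic returns: attributes pose_landmarks, face_landmarks, left_hand_landmarks and right_hand_landmarks, each either None or holding a .landmark list of points with x, y, z (and visibility for pose). The name extract_keypoints, the helper _flatten, and the choice to zero-fill undetected parts are illustrative assumptions, not part of MediaPipe.

```python
from types import SimpleNamespace

# Landmark counts listed above for Holistic tracking.
N_POSE, N_FACE, N_HAND = 33, 468, 21


def _flatten(landmark_list, count, fields):
    """Return count * len(fields) floats; zeros if the part was not detected."""
    if landmark_list is None:
        return [0.0] * (count * len(fields))
    return [float(getattr(lm, f)) for lm in landmark_list.landmark for f in fields]


def extract_keypoints(results):
    """Concatenate pose, face, and both hands into one flat feature list."""
    xyz = ("x", "y", "z")
    return (
        _flatten(results.pose_landmarks, N_POSE, xyz + ("visibility",))
        + _flatten(results.face_landmarks, N_FACE, xyz)
        + _flatten(results.left_hand_landmarks, N_HAND, xyz)
        + _flatten(results.right_hand_landmarks, N_HAND, xyz)
    )


# Stand-in for a frame where only the left hand was detected.
hand = SimpleNamespace(landmark=[SimpleNamespace(x=0.5, y=0.25, z=0.0)] * N_HAND)
fake_results = SimpleNamespace(
    pose_landmarks=None,
    face_landmarks=None,
    left_hand_landmarks=hand,
    right_hand_landmarks=None,
)
features = extract_keypoints(fake_results)
print(len(features))  # 33*4 + 468*3 + 21*3 + 21*3 = 1662
```

Keeping the length constant regardless of what was detected is one possible design choice; it lets every frame feed a model that expects a fixed input size.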
Detect face and hands using Holistic and extract key points
The following code snippet reads image input from the system web camera using OpenCV, detects hand, pose, and facial landmarks with MediaPipe Holistic, draws them on each frame, and makes the detected key points available in the results object.
Python3
# Install dependencies:
#   pip install opencv-python
#   pip install mediapipe

# Import packages
import cv2
import mediapipe as mp

# Build keypoints using MP Holistic
mp_holistic = mp.solutions.holistic      # Holistic model
mp_drawing = mp.solutions.drawing_utils  # Drawing utilities

# Older MediaPipe releases expose FACE_CONNECTIONS; newer ones expose
# FACEMESH_CONTOURS instead, so use whichever is available.
FACE_CONNECTIONS = getattr(mp_holistic, 'FACE_CONNECTIONS',
                           getattr(mp_holistic, 'FACEMESH_CONTOURS', None))


def mediapipe_detection(image, model):
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Color conversion BGR to RGB
    image.flags.writeable = False                   # Image is no longer writeable
    results = model.process(image)                  # Make prediction
    image.flags.writeable = True                    # Image is now writeable
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)  # Color conversion RGB to BGR
    return image, results


def draw_landmarks(image, results):
    # Draw face, pose, left hand, and right hand connections with default styling
    mp_drawing.draw_landmarks(image, results.face_landmarks, FACE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)


def draw_styled_landmarks(image, results):
    # Draw face connections (colour channels must stay within 0-255)
    mp_drawing.draw_landmarks(
        image, results.face_landmarks, FACE_CONNECTIONS,
        mp_drawing.DrawingSpec(color=(80, 110, 10), thickness=1, circle_radius=1),
        mp_drawing.DrawingSpec(color=(80, 255, 121), thickness=1, circle_radius=1))
    # Draw pose connections
    mp_drawing.draw_landmarks(
        image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS,
        mp_drawing.DrawingSpec(color=(80, 22, 10), thickness=2, circle_radius=4),
        mp_drawing.DrawingSpec(color=(80, 44, 121), thickness=2, circle_radius=2))
    # Draw left hand connections
    mp_drawing.draw_landmarks(
        image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
        mp_drawing.DrawingSpec(color=(121, 22, 76), thickness=2, circle_radius=4),
        mp_drawing.DrawingSpec(color=(121, 44, 250), thickness=2, circle_radius=2))
    # Draw right hand connections
    mp_drawing.draw_landmarks(
        image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS,
        mp_drawing.DrawingSpec(color=(245, 117, 66), thickness=2, circle_radius=4),
        mp_drawing.DrawingSpec(color=(245, 66, 230), thickness=2, circle_radius=2))


# Main function
cap = cv2.VideoCapture(0)
# Set mediapipe model
with mp_holistic.Holistic(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        # Read feed
        ret, frame = cap.read()
        if not ret:  # Stop if no frame could be read from the camera
            break

        # Make detections
        image, results = mediapipe_detection(frame, holistic)
        print(results)

        # Draw landmarks
        draw_styled_landmarks(image, results)

        # Show to screen
        cv2.imshow('OpenCV Feed', image)

        # Break gracefully when 'q' is pressed
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()
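Printing the results object only shows its type, so to use the landmarks you need to read their fields. MediaPipe reports landmark x and y as values normalized by the image width and height, so they have to be scaled before being used as pixel positions. The helper below is a minimal sketch of that conversion; the name to_pixel_coords and the clamping to the frame are illustrative assumptions, and the stand-in objects only mimic the shape of a real result.

```python
from types import SimpleNamespace


def to_pixel_coords(landmark_list, image_width, image_height):
    """Map normalized (x, y) landmarks to integer pixel positions.

    Returns an empty list when the body part was not detected (None).
    Points are clamped to the image, since estimates can fall slightly
    outside the frame.
    """
    if landmark_list is None:
        return []
    points = []
    for lm in landmark_list.landmark:
        px = min(max(int(lm.x * image_width), 0), image_width - 1)
        py = min(max(int(lm.y * image_height), 0), image_height - 1)
        points.append((px, py))
    return points


# Stand-in for results.right_hand_landmarks on a 640x480 frame.
fake_hand = SimpleNamespace(landmark=[
    SimpleNamespace(x=0.5, y=0.5, z=0.0),
    SimpleNamespace(x=1.2, y=-0.1, z=0.0),  # slightly outside the frame
])
print(to_pixel_coords(fake_hand, 640, 480))  # [(320, 240), (639, 0)]
print(to_pixel_coords(None, 640, 480))       # []
```

Inside the capture loop, the frame size could be taken from the image itself, for example height, width = image.shape[:2].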
Landmarks that can be detected using MediaPipe Holistic
1. Pose landmarks

Credit: MediaPipe
2. Hand Landmarks

Credit: MediaPipe
3. Face Landmarks

Image generated using the MediaPipe function posted above and plotted on a graphical plane using matplotlib. Source: Author
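The plotting code behind that image is not shown, so the following is only one possible sketch of how such a plot could be produced. It assumes matplotlib is installed and that the input has the shape of results.face_landmarks; the function names landmarks_to_xy and plot_landmarks and the output file name are hypothetical.

```python
from types import SimpleNamespace


def landmarks_to_xy(landmark_list):
    """Split landmarks into x and y lists, flipping y so the face is upright.

    MediaPipe's normalized y grows downward (image convention), while a
    default matplotlib axis grows upward.
    """
    xs = [lm.x for lm in landmark_list.landmark]
    ys = [1.0 - lm.y for lm in landmark_list.landmark]
    return xs, ys


def plot_landmarks(landmark_list, path="face_landmarks.png"):
    """Scatter-plot the landmarks and save the figure (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render off-screen; no display needed
    import matplotlib.pyplot as plt

    xs, ys = landmarks_to_xy(landmark_list)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys, s=2)
    ax.set_aspect("equal")
    fig.savefig(path)
    plt.close(fig)


# Stand-in for results.face_landmarks with two points.
demo = SimpleNamespace(landmark=[
    SimpleNamespace(x=0.25, y=0.25, z=0.0),
    SimpleNamespace(x=0.75, y=1.0, z=0.0),
])
print(landmarks_to_xy(demo))  # ([0.25, 0.75], [0.75, 0.0])
```

With a real detection, plot_landmarks(results.face_landmarks) would be called once a face has been found, that is, when results.face_landmarks is not None.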