OpenPose : Human Pose Estimation Method

OpenPose is the first real-time multi-person system to jointly detect human body, hand, facial, and foot key-points (135 key-points in total) on single images. It was proposed by researchers at Carnegie Mellon University, who have released it as Python code, a C++ implementation, and a Unity plugin. These resources can be downloaded from the OpenPose repository.

Architecture: 



Confidence Maps and Part Affinity Fields

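  • Confidence Maps: A confidence map is a 2D representation of the belief that a particular body part is located at a given pixel. The network predicts a set S = (S_1, S_2, ..., S_J) of such maps, one per body part, where J is the number of body part locations.
  • Part Affinity Fields: A Part Affinity Field (PAF) is a set of 2D vector fields that encodes the location and orientation of limbs of different people in the image. It encodes the data in the form of pairwise connections between body parts.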
  • Multi Stage CNN:



    The above multi-CNN architecture has three major steps:

    Loss functions:

    An L2 loss function is used to measure the difference between the predicted confidence maps and Part Affinity Fields and the corresponding ground truth maps and fields. At each stage, the squared error at every pixel p is weighted by W(p) and summed over all pixels and over all body parts (for confidence maps) or limbs (for PAFs),
    where Lc* is the ground truth part affinity field, Sj* is the ground truth part confidence map, and W is a binary mask with W(p) = 0 when the annotation is missing at the pixel p. The mask prevents the network from being penalized for correct predictions at locations that were never annotated. The intermediate supervision at each stage addresses the vanishing gradient problem by replenishing the gradient periodically.
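The masked L2 loss described above can be sketched in a few lines. This is a minimal pure-Python illustration and not OpenPose's training code; the function name `masked_l2_loss` and the nested-list layout (channels, then rows, then columns) are assumptions made for the example. For PAFs, the x and y components of each vector field would simply be treated as two separate channels.

```python
def masked_l2_loss(pred, target, mask):
    """Sum over channels and pixels of W(p) * (pred(p) - target(p))^2.

    pred, target: lists of 2D maps, one per body part or PAF channel.
    mask: 2D binary map W, with 0 where the annotation is missing.
    """
    total = 0.0
    for pred_map, target_map in zip(pred, target):
        for y, row in enumerate(mask):
            for x, w in enumerate(row):
                diff = pred_map[y][x] - target_map[y][x]
                total += w * diff * diff
    return total


# One channel on a 2x2 image; the bottom-right pixel is unannotated,
# so the large error there does not contribute to the loss.
pred = [[[0.9, 0.1], [0.2, 0.8]]]
target = [[[1.0, 0.0], [0.0, 0.0]]]
mask = [[1, 1], [1, 0]]
loss = masked_l2_loss(pred, target, mask)
```

In a multi-stage network this quantity would be computed at every stage and the per-stage values added together, which is what provides the intermediate supervision.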

    Confidence Maps: 

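    The ground truth confidence map for each person k and each body part j is defined by S*_{j,k}(p) = exp(-||p - x_{j,k}||^2 / sigma^2), where x_{j,k} is the annotated position of body part j for person k.
    It is a Gaussian peak centered on the annotated key-point, with gradual changes around it, where sigma controls the spread of the peak. The ground truth map that the network learns to predict for part j aggregates the individual per-person maps with a max operator, so that peaks of nearby people remain distinct.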
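As a rough illustration of the Gaussian peaks and the max aggregation, the sketch below builds a ground truth map for one body part from the key-point positions of several people. The function name `confidence_map`, the grid size, and the key-point coordinates are hypothetical and chosen only for the example.

```python
import math


def confidence_map(keypoints, width, height, sigma):
    """Ground truth map for one body part: max over people of a Gaussian peak.

    keypoints: list of (x, y) positions of this part, one per person.
    """
    cmap = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for kx, ky in keypoints:
                dist_sq = (x - kx) ** 2 + (y - ky) ** 2
                value = math.exp(-dist_sq / sigma ** 2)
                # Max (not sum or average) keeps each person's peak at 1.0.
                cmap[y][x] = max(cmap[y][x], value)
    return cmap


# Two people whose right shoulders sit at (2, 2) and (6, 2).
cmap = confidence_map([(2, 2), (6, 2)], width=9, height=5, sigma=1.0)
```

Each annotated location keeps a peak value of 1.0, and the value midway between the two people stays low, so the two peaks are not merged into one.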

    Part Affinity Fields:

    Part affinity fields are needed especially in multi-person pose detection, where each detected body part must be assigned to the correct person. With multiple people in the image there are multiple heads, hands, shoulders, and so on, and it becomes difficult to tell them apart when people are closely grouped together. A PAF provides a connection between different body parts that belong to the same person. A stronger PAF link between two body parts indicates a higher chance that they belong to the same person.
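One way to quantify the strength of a PAF link is to sample the field along the straight segment between two candidate key-points and average how well the sampled vectors align with the segment direction. The following is a simplified sketch under that assumption; the function name `paf_score`, the nearest-pixel sampling, and the toy field are illustrative and not taken from the OpenPose code base.

```python
import math


def paf_score(paf, start, end, num_samples=10):
    """Average alignment between a PAF and the segment start -> end.

    paf: 2D grid where paf[y][x] is a (vx, vy) vector.
    start, end: (x, y) candidate key-point locations.
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length  # unit vector along the candidate limb
    total = 0.0
    for i in range(num_samples):
        t = i / max(num_samples - 1, 1)
        x = int(round(start[0] + t * dx))  # nearest-pixel sampling
        y = int(round(start[1] + t * dy))
        vx, vy = paf[y][x]
        total += vx * ux + vy * uy  # dot product with the limb direction
    return total / num_samples


# Toy 5x5 field: a horizontal "limb" along row 1, zero everywhere else.
paf = [[(0.0, 0.0)] * 5 for _ in range(5)]
paf[1] = [(1.0, 0.0)] * 5

same_person = paf_score(paf, (0, 1), (4, 1))   # follows the limb
other_person = paf_score(paf, (0, 1), (4, 3))  # leaves the limb quickly
```

Here the candidate pair that follows the field scores higher than the pair that cuts across it, which is the signal used to decide which detected parts should be connected.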

    Changes from CMUPose:

    CMUPose is the earlier version of OpenPose. It is the architecture that won the COCO 2016 Keypoint Detection Challenge.

    Foot Detection:

    OpenPose also introduces foot key-point detection, providing the first combined body and foot key-point dataset and detector. With the foot key-points added, it is able to detect the ankle more accurately.

    Vehicle Detection: 

    Similar to body pose detection, the authors of OpenPose also applied this algorithm to vehicle key-point detection, where it records high Average Precision and Recall.

    Results:

    Caveats:

     
     
