Python OpenCV – Connected Component Labeling and Analysis

Last Updated : 03 Jan, 2023

In this article, we’ll learn to implement connected component labeling and analysis using OpenCV in Python.

Connected component labeling

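Connected component labeling (also called connected component analysis) assigns the same label to every group of foreground pixels in a binary image that touch each other. Whether two pixels are “connected” is decided using graph theory: each pixel is treated as a node joined to its neighbouring pixels, and each connected set of foreground nodes becomes one component.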
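To make the idea of “connected” concrete, here is a minimal pure-Python sketch that labels a tiny binary grid with a breadth-first search. It is not how OpenCV implements labeling; the function name label_components and the sample grid are made up for illustration. Note that this sketch counts only foreground components, whereas the OpenCV functions also count the background as label 0.

```python
from collections import deque


def label_components(grid, connectivity=4):
    """Label non-zero cells of a 2D list; returns (count, labels). 0 means background."""
    rows, cols = len(grid), len(grid[0])
    if connectivity == 4:
        # Only up, down, left and right neighbours count as connected
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        # 8-connectivity also includes the four diagonal neighbours
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]

    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or labels[r][c] != 0:
                continue
            # Found an unlabeled foreground cell: start a new component
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return count, labels


grid = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
print(label_components(grid, 4)[0])  # 3: the diagonal neighbours are separate
print(label_components(grid, 8)[0])  # 1: the diagonal neighbours are joined
```

The same pixels form three components or one depending on the connectivity, which is why the connectivity argument of the OpenCV functions matters.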

OpenCV provides us with the following 4 functions for this task:

cv2.connectedComponents
cv2.connectedComponentsWithStats
cv2.connectedComponentsWithAlgorithm
cv2.connectedComponentsWithStatsWithAlgorithm

The last two functions take one extra argument that lets you choose the labeling algorithm explicitly; otherwise they behave like the first two. OpenCV can run some of these algorithms in parallel when it is built with a parallel framework enabled, so if you are unsure about your build it’s wiser to stick to the first two. The first and the second functions are the same except that the second, as the name suggests, also returns stats for each of the components. We’ll use the second function because in most cases you’re going to need those stats.

In this program, we’re going to use a banner image and extract its text components. The following image shows the final output of our program:

Installing Dependencies

Let’s start by installing the necessary packages:

$ pip install opencv-contrib-python

Step 1: Image Loading and Preprocessing

Let’s first load our image and convert it to grayscale, because the connected component functions expect a single-channel 8-bit image. After this we’ll also apply a 7×7 Gaussian blur, which smooths out noise and unwanted edges and gives a much cleaner segmentation in the next step.

Python3

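# Loading the image 
img = cv2.imread('Images/img5.png') 

# Converting to grayscale and applying a 7x7 Gaussian blur 
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) 
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0)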

Step 2: Thresholding

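Thresholding is a basic image segmentation technique that separates the foreground objects that interest us from the background. After applying the blur, we’ll use the cv2.threshold function for image segmentation. The THRESH_OTSU flag makes OpenCV choose the threshold value automatically, and THRESH_BINARY_INV inverts the result so that pixels at or below the threshold become white (255) in the mask.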

Python3

# Applying threshold 
threshold = cv2.threshold(blurred, 0, 255, 
cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

Step 3: Applying the Component Analysis Method

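We first apply cv2.connectedComponentsWithStats and unpack the values it returns into separate variables, which we will use in the following steps: the number of labels (including the background, which is label 0), an array holding the label of each pixel, the statistics of each component, and the centroid of each component. Let’s also create a new array to store all the components that we keep.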

Python3

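# Apply the Component analysis function 
# (connectivity and ltype are passed by keyword because they come 
# after the optional output arrays in the Python signature) 
analysis = cv2.connectedComponentsWithStats(threshold, 
                                            connectivity=4, 
                                            ltype=cv2.CV_32S) 
(totalLabels, label_ids, values, centroid) = analysis 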
  
# Initialize a new image to 
# store all the output components 
output = np.zeros(gray_img.shape, dtype="uint8")

Now that we have our components and analysis, let’s loop through each of the components and filter out the useful components.

Step 4: Filter Out Useful Components

Let’s loop through each of the components and use the statistics we got in the last step to keep only the useful ones. For example, here I have used the Area value to keep only the characters in the image. After filtering, we’ll use the label_ids variable to create a mask for the component that we’re looping through, and use the bitwise_or operation to merge that mask into our final output. The loop starts at 1 because label 0 is the background. It sounds hard, but you’ll understand it better after implementing the code yourself.

Python3

# Loop through each component 
for i in range(1, totalLabels): 
    area = values[i, cv2.CC_STAT_AREA]   
  
    if (area > 140) and (area < 400): 
        
        # label_ids stores the component ID of every pixel 
        # and has the same dimensions as the threshold image, 
        # so we select the pixels of the current component 
        # and set them to 255 to mark them white 
        componentMask = (label_ids == i).astype("uint8") * 255
          
        # Creating the Final output mask 
        output = cv2.bitwise_or(output, componentMask) 
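The same filter can also be written without an explicit Python loop. The following is only a sketch of an alternative using NumPy’s np.isin, not the method used in this article; the helper name filter_by_area and the tiny hand-made arrays are assumptions for illustration.

```python
import numpy as np

AREA_COL = 4  # same value as cv2.CC_STAT_AREA


def filter_by_area(label_ids, values, min_area, max_area):
    """Return a 0/255 mask of the components whose area lies strictly between the limits."""
    areas = values[:, AREA_COL]
    keep = np.flatnonzero((areas > min_area) & (areas < max_area))
    keep = keep[keep != 0]  # never keep label 0, the background
    return np.isin(label_ids, keep).astype("uint8") * 255


# Hand-made stand-ins for label_ids and values (rows: left, top, width, height, area)
label_ids = np.array([[0, 1, 1, 0],
                      [0, 0, 0, 2],
                      [3, 3, 3, 2]], dtype=np.int32)
values = np.array([[0, 0, 4, 3, 5],
                   [1, 0, 2, 1, 2],
                   [3, 1, 1, 2, 2],
                   [0, 2, 3, 1, 3]], dtype=np.int32)

# Only component 3 (area 3) satisfies 2 < area < 4
print(filter_by_area(label_ids, values, 2, 4))
```

With the article’s variables, the equivalent call would be filter_by_area(label_ids, values, 140, 400).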

How to select the value for Area (or any other condition like width or height) for filtering?

Add a print statement to print out the value of the statistic that you want to use as a condition, and then for the useful components note down the range of values and use them to create the filter condition.
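As a sketch of that print statement, the helper below formats one line per component so the ranges are easy to read off. The function name describe_components and the sample numbers are made up for illustration; with the article’s code you would pass the values array returned by cv2.connectedComponentsWithStats.

```python
import numpy as np

# Column order of the stats array; these indices match
# cv2.CC_STAT_LEFT, CC_STAT_TOP, CC_STAT_WIDTH, CC_STAT_HEIGHT, CC_STAT_AREA
LEFT, TOP, WIDTH, HEIGHT, AREA = range(5)


def describe_components(values):
    """Return one text line per component, skipping the background (label 0)."""
    lines = []
    for i in range(1, len(values)):
        row = values[i]
        lines.append(
            f"label={i} area={row[AREA]} width={row[WIDTH]} height={row[HEIGHT]}")
    return lines


# Made-up statistics for a background and two components
values = np.array([[0, 0, 50, 30, 1250],
                   [5, 5, 20, 10, 200],
                   [30, 20, 10, 5, 50]], dtype=np.int32)

for line in describe_components(values):
    print(line)
```

Reading the printed areas next to the displayed components makes it easier to pick limits such as the 140 and 400 used in this article.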

Step 5: Visualize The Final Output

Now our final step is to simply display our original image and the final mask that we obtained.

Python3

cv2.imshow("Image", img) 
cv2.imshow("Filtered Components", output) 
cv2.waitKey(0)

Below is the complete implementation:

Python3

import cv2 
import numpy as np 
  
  
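# Loading the image (imread returns None if the file cannot be read) 
img = cv2.imread('Images/img5.png') 
if img is None: 
    raise FileNotFoundError("Could not read 'Images/img5.png'") 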
  
# preprocess the image 
gray_img = cv2.cvtColor(img , cv2.COLOR_BGR2GRAY) 
  
# Applying 7x7 Gaussian Blur 
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0) 
  
# Applying threshold 
threshold = cv2.threshold(blurred, 0, 255, 
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]  
  
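# Apply the Component analysis function 
# (connectivity and ltype are passed by keyword because they come 
# after the optional output arrays in the Python signature) 
analysis = cv2.connectedComponentsWithStats(threshold, 
                                            connectivity=4, 
                                            ltype=cv2.CV_32S) 
(totalLabels, label_ids, values, centroid) = analysis 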
  
# Initialize a new image to store  
# all the output components 
output = np.zeros(gray_img.shape, dtype="uint8") 
  
# Loop through each component 
for i in range(1, totalLabels): 
    
    # Area of the component 
    area = values[i, cv2.CC_STAT_AREA]  
      
    if (area > 140) and (area < 400): 
        componentMask = (label_ids == i).astype("uint8") * 255
        output = cv2.bitwise_or(output, componentMask) 
  
  
cv2.imshow("Image", img) 
cv2.imshow("Filtered Components", output) 
cv2.waitKey(0)

Output:

The output mask and the original image.

Note: If you run the program on a number of images, big and small, without the area filter, you will see that the output contains a lot of “noise”. That is why we applied the filter: with it, the final output contains only the text characters that we wanted.

OpenCV Connected Component Labeling and Analysis:

Here is another implementation where I have demonstrated the whole process for each component so that it is easier for you to visualize:

Python3

import cv2 
import numpy as np 
  
  
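# Loading the image (imread returns None if the file cannot be read) 
img = cv2.imread('Images/img5.png') 
if img is None: 
    raise FileNotFoundError("Could not read 'Images/img5.png'") 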
  
# preprocess the image 
gray_img = cv2.cvtColor(img ,  
                        cv2.COLOR_BGR2GRAY) 
  
# Applying 7x7 Gaussian Blur 
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0) 
  
# Applying threshold 
threshold = cv2.threshold(blurred, 0, 255, 
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]  
  
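# Apply the Component analysis function 
# (connectivity and ltype are passed by keyword because they come 
# after the optional output arrays in the Python signature) 
analysis = cv2.connectedComponentsWithStats(threshold, 
                                            connectivity=4, 
                                            ltype=cv2.CV_32S) 
(totalLabels, label_ids, values, centroid) = analysis 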
  
# Initialize a new image to 
# store all the output components 
output = np.zeros(gray_img.shape, dtype="uint8") 
  
# Loop through each component 
for i in range(1, totalLabels): 
    
    # Area of the component 
    area = values[i, cv2.CC_STAT_AREA]  
      
    if (area > 140) and (area < 400): 
        # Create a new image for bounding boxes 
        new_img=img.copy() 
          
        # Now extract the coordinate points 
        x1 = values[i, cv2.CC_STAT_LEFT] 
        y1 = values[i, cv2.CC_STAT_TOP] 
        w = values[i, cv2.CC_STAT_WIDTH] 
        h = values[i, cv2.CC_STAT_HEIGHT] 
          
        # Coordinate of the bounding box 
        pt1 = (x1, y1) 
        pt2 = (x1+ w, y1+ h) 
        (X, Y) = centroid[i] 
          
        # Bounding boxes for each component 
        cv2.rectangle(new_img,pt1,pt2, 
                      (0, 255, 0), 3) 
        cv2.circle(new_img, (int(X), 
                             int(Y)),  
                   4, (0, 0, 255), -1) 
  
        # Create a new array to show individual component 
        component = np.zeros(gray_img.shape, dtype="uint8") 
        componentMask = (label_ids == i).astype("uint8") * 255
  
        # Apply the mask using the bitwise operator 
        component = cv2.bitwise_or(component,componentMask) 
        output = cv2.bitwise_or(output, componentMask) 
          
        # Show the final images 
        cv2.imshow("Image", new_img) 
        cv2.imshow("Individual Component", component) 
        cv2.imshow("Filtered Components", output) 
        cv2.waitKey(0)

Output:
