
Python OpenCV – Connected Component Labeling and Analysis

Last Updated : 03 Jan, 2023

In this article, we’ll learn to implement connected component labeling and analysis using OpenCV in Python.

Connected component labeling

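Connected component labeling assigns a distinct label to each group of foreground pixels in a binary image that touch one another. Whether two pixels count as “connected” is decided with graph theory: each foreground pixel is a node, edges join neighboring foreground pixels (4-connectivity uses the horizontal and vertical neighbors, 8-connectivity also uses the diagonals), and every connected group of nodes becomes one component.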
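To make the graph idea concrete, here is a minimal sketch of labeling with a breadth-first flood fill. It is an illustration only, not how OpenCV implements labeling; the function name `label_components` and the tiny test grid are made up for this example. Note that it counts foreground components only, whereas the OpenCV functions also count the background as label 0.

```python
from collections import deque

import numpy as np


def label_components(binary, connectivity=4):
    """Label non-zero pixels with a breadth-first flood fill (illustrative only)."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:  # 8-connectivity also links diagonal neighbours
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]

    rows, cols = binary.shape
    labels = np.zeros((rows, cols), dtype=np.int32)  # 0 means background
    next_label = 0

    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already have a label
            if binary[r, c] == 0 or labels[r, c] != 0:
                continue
            next_label += 1
            labels[r, c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny, nx] != 0 and labels[ny, nx] == 0):
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
    return next_label, labels


# Hypothetical 4x4 grid: an L-shaped blob plus two diagonal pixels
grid = np.array([
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=np.uint8)

count4, labels4 = label_components(grid, connectivity=4)
count8, labels8 = label_components(grid, connectivity=8)
print("4-connectivity:", count4, "components")
print("8-connectivity:", count8, "components")
```

The diagonal pixels are separate components under 4-connectivity but join the first blob under 8-connectivity, which is why the choice of connectivity matters for thin or diagonal strokes.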

OpenCV provides us with the following 4 functions for this task:

  • cv2.connectedComponents
  • cv2.connectedComponentsWithStats
  • cv2.connectedComponentsWithAlgorithm
  • cv2.connectedComponentsWithStatsWithAlgorithm

The bottom two take one extra argument that selects which labeling algorithm is used. According to the OpenCV documentation, parallel versions of the algorithms are used when OpenCV is built with a parallel framework, so unless you need to pick the algorithm yourself, it's simpler to stick to the first two. The first and the second methods are the same, except that the second, as the name suggests, also returns stats for each of the components. We'll use the second method because in most cases you're going to need those stats.

In this program, we're going to use a banner image and extract its text components. The following image shows the final output of our program:

Installing Dependencies

Let’s start by installing the necessary packages:

$ pip install opencv-contrib-python

Step 1: Image Loading and Preprocessing

Let’s first load our image and convert it to grayscale, since thresholding and connected component analysis operate on a single-channel image. After this, we’ll apply a 7×7 Gaussian blur, which suppresses noise and unwanted edges and leads to a cleaner segmentation in the next step.

Python3




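import cv2
import numpy as np

# Loading the image
img = cv2.imread('Images/img5.png')

# Convert the image to grayscale
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Applying 7x7 Gaussian Blur
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0)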


Step 2: Thresholding

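Thresholding is a basic image segmentation technique that separates the foreground objects of interest from the background. After applying the blur, we’ll use the cv2.threshold function with Otsu’s method, which picks the threshold value automatically (that is why 0 is passed as the threshold), together with the inverse binary flag, so that pixels at or below the threshold become white (255) and the rest become black (0).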

Python3




# Applying threshold
threshold = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]


Step 3: Applying the Component Analysis Method

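We first apply cv2.connectedComponentsWithStats with 4-connectivity and a 32-bit integer label type, and then unpack the four values it returns: the total number of labels (including the background, which is label 0), the label image, the per-component statistics, and the centroids. We’ll also create a new array to store all the components that we find.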

Python3




# Apply the Component analysis function
analysis = cv2.connectedComponentsWithStats(threshold,
                                            4,
                                            cv2.CV_32S)
(totalLabels, label_ids, values, centroid) = analysis
  
# Initialize a new image to
# store all the output components
output = np.zeros(gray_img.shape, dtype="uint8")


Now that we have our components and analysis, let’s loop through each of the components and filter out the useful components.

Step 4: Filter Out Useful Components

Here we use the statistics from the last step to keep only the useful components. For example, I have used the area value to keep only the characters in the image. The loop starts at 1 because label 0 is the background. For each component that passes the filter, we use the label_ids array to build a mask for that component and merge it into the output with the bitwise_or operation. It sounds hard, but you’ll understand it better after implementing the code yourself.

Python3




# Loop through each component
for i in range(1, totalLabels):
    area = values[i, cv2.CC_STAT_AREA]  
  
    if (area > 140) and (area < 400):
        
        # label_ids holds the component ID of every pixel
        # and has the same dimensions as the threshold image,
        # so we select the pixels that belong to component i
        # and set them to 255 to mark them white
        componentMask = (label_ids == i).astype("uint8") * 255
          
        # Creating the Final output mask
        output = cv2.bitwise_or(output, componentMask)


How to select the value for Area (or any other condition like width or height) for filtering?

Add a print statement to print out the value of the statistic that you want to use as a condition, and then for the useful components note down the range of values and use them to create the filter condition.

Step 5: Visualize The Final Output

Now our final step is to simply display our original image and the final mask that we obtained.

Python3




cv2.imshow("Image", img)
cv2.imshow("Filtered Components", output)
cv2.waitKey(0)


Below is the implementation:

Python3




import cv2
import numpy as np
  
  
# Loading the image
img = cv2.imread('Images/img5.png')
  
# preprocess the image
gray_img = cv2.cvtColor(img , cv2.COLOR_BGR2GRAY)
  
# Applying 7x7 Gaussian Blur
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0)
  
# Applying threshold
threshold = cv2.threshold(blurred, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
  
# Apply the Component analysis function
analysis = cv2.connectedComponentsWithStats(threshold,
                                            4,
                                            cv2.CV_32S)
(totalLabels, label_ids, values, centroid) = analysis
  
# Initialize a new image to store 
# all the output components
output = np.zeros(gray_img.shape, dtype="uint8")
  
# Loop through each component
for i in range(1, totalLabels):
    
    # Area of the component
    area = values[i, cv2.CC_STAT_AREA]
      
    if (area > 140) and (area < 400):
        componentMask = (label_ids == i).astype("uint8") * 255
        output = cv2.bitwise_or(output, componentMask)
  
  
cv2.imshow("Image", img)
cv2.imshow("Filtered Components", output)
cv2.waitKey(0)


Output:

This is the output mask and the original image.

Note: If you run the program on a number of images, big and small, without the area filter, you will see that the output contains a lot of “noise”. That is why we applied the filter in the last step, so that the final output contains only the text characters that we wanted.

OpenCV Connected Component Labeling and Analysis

Here is another implementation where I have demonstrated the whole process for each component so that it is easier for you to visualize: 

Python3




import cv2
import numpy as np
  
  
# Loading the image 
img = cv2.imread('Images/img5.png')
  
# preprocess the image
gray_img = cv2.cvtColor(img , 
                        cv2.COLOR_BGR2GRAY)
  
# Applying 7x7 Gaussian Blur
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0)
  
# Applying threshold
threshold = cv2.threshold(blurred, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
  
# Apply the Component analysis function
analysis = cv2.connectedComponentsWithStats(threshold, 
                                            4,
                                            cv2.CV_32S)
(totalLabels, label_ids, values, centroid) = analysis
  
# Initialize a new image to
# store all the output components
output = np.zeros(gray_img.shape, dtype="uint8")
  
# Loop through each component
for i in range(1, totalLabels):
    
    # Area of the component
    area = values[i, cv2.CC_STAT_AREA]

    if (area > 140) and (area < 400):
        # Create a new image for bounding boxes
        new_img = img.copy()

        # Now extract the coordinate points
        x1 = values[i, cv2.CC_STAT_LEFT]
        y1 = values[i, cv2.CC_STAT_TOP]
        w = values[i, cv2.CC_STAT_WIDTH]
        h = values[i, cv2.CC_STAT_HEIGHT]

        # Coordinates of the bounding box and the centroid
        pt1 = (x1, y1)
        pt2 = (x1 + w, y1 + h)
        (X, Y) = centroid[i]

        # Draw the bounding box and the centroid of the component
        cv2.rectangle(new_img, pt1, pt2, (0, 255, 0), 3)
        cv2.circle(new_img, (int(X), int(Y)), 4, (0, 0, 255), -1)

        # Create a new array to show the individual component
        component = np.zeros(gray_img.shape, dtype="uint8")
        componentMask = (label_ids == i).astype("uint8") * 255

        # Apply the mask using the bitwise operator
        component = cv2.bitwise_or(component, componentMask)
        output = cv2.bitwise_or(output, componentMask)
          
        # Show the final images
        cv2.imshow("Image", new_img)
        cv2.imshow("Individual Component", component)
        cv2.imshow("Filtered Components", output)
        cv2.waitKey(0)


Output:

 


