Python OpenCV – Connected Component Labeling and Analysis
Last Updated: 03 Jan, 2023
In this article, we’ll learn to implement connected component labeling and analysis using OpenCV in Python.
Connected component labeling
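Connected component labeling groups the foreground pixels of a binary image into regions ("components") in which every pixel can be reached from every other pixel through a chain of neighboring foreground pixels. The idea comes from graph theory: pixels are the nodes, adjacent foreground pixels share an edge, and each connected component of that graph receives its own integer label.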
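To make the graph idea concrete, here is a minimal pure-Python sketch that labels a tiny grid with a breadth-first search. It is only an illustration: the function name label_components is made up for this example, and this is not the algorithm OpenCV uses internally. Also note that this sketch counts only foreground components, while the OpenCV functions include the background in their count.

```python
from collections import deque


def label_components(grid, connectivity=4):
    """Label non-zero cells of a 2D list; cells equal to 0 stay background."""
    if connectivity == 4:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        steps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dy, dx) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background cells and cells that already have a label
            if grid[r][c] == 0 or labels[r][c] != 0:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            # Flood the whole component starting from this seed cell
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] != 0 and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return next_label, labels


# Two blobs that touch only diagonally, at (1, 1) and (2, 2)
grid = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]
count4, labels4 = label_components(grid, connectivity=4)
count8, labels8 = label_components(grid, connectivity=8)
print("4-connectivity:", count4, "components")
print("8-connectivity:", count8, "components")
```

With 4-connectivity only horizontal and vertical neighbors count, so the two blobs stay separate; with 8-connectivity the diagonal contact joins them.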
OpenCV provides us with the following 4 functions for this task:
- cv2.connectedComponents
- cv2.connectedComponentsWithStats
- cv2.connectedComponentsWithAlgorithm
- cv2.connectedComponentsWithStatsWithAlgorithm
The last two take an extra ccltype argument that lets you choose which labeling algorithm OpenCV uses. Some of these algorithms can run in parallel when OpenCV is built with a parallel framework, so they mainly matter when performance is critical; otherwise it is simpler to stick to the first two. The first and second functions are the same, except that the second, as the name suggests, also returns statistics for each component. We'll use the second function because in most cases you are going to need those statistics.
In this program, we'll extract the text components from a banner image. The following image shows the final output of our program:
Installing Dependencies
Let’s start by installing the necessary packages:
$ pip install opencv-contrib-python
Step 1: Image Loading and Preprocessing
Let's first load our image and convert it to grayscale, since Otsu thresholding and the connected component functions both work on single-channel images. After this we'll apply a 7×7 Gaussian blur, which smooths out noise and fine detail so that the segmentation in the next step is cleaner.
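Python3
import cv2
import numpy as np
# Load the image (OpenCV reads it in BGR channel order)
img = cv2.imread('Images/img5.png')
# Convert to a single-channel grayscale image
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Smooth with a 7x7 Gaussian kernel
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0)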
Step 2: Thresholding
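Thresholding is a very basic image segmentation technique that helps us separate the foreground objects that interest us from the background. After applying the blur we'll use the cv2.threshold function for this. The cv2.THRESH_OTSU flag lets OpenCV choose the threshold value automatically, and cv2.THRESH_BINARY_INV inverts the result so that dark text becomes white, because the connected component functions treat non-zero pixels as foreground.
Python3
# With THRESH_OTSU the threshold is computed automatically, so the 0 passed here is ignored
threshold = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]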
Step 3: Applying the Component Analysis Method
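We first call cv2.connectedComponentsWithStats on the thresholded image, using 4-connectivity and 32-bit integer labels (cv2.CV_32S). It returns four values, which we unpack for use in the following steps: the total number of labels (including the background, which is label 0), an array of the same size as the image holding the label of each pixel, a table of statistics for each label, and the centroid of each label. Let's also create a new blank array to accumulate the components that we keep.
Python3
# Label the components using 4-connectivity and 32-bit integer labels
analysis = cv2.connectedComponentsWithStats(threshold,
                                            4,
                                            cv2.CV_32S)
(totalLabels, label_ids, values, centroid) = analysis
# Blank single-channel mask that will collect the components we keep
output = np.zeros(gray_img.shape, dtype="uint8")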
Now that we have the labels and their statistics, we can decide which components to keep.
Step 4: Filter Out Useful Components
Let's loop through each of the components and use the statistics from the last step to keep only the useful ones. For example, here the area value (cv2.CC_STAT_AREA) is used to keep only the characters in the image. The loop starts at 1 because label 0 is the background. For each component that passes the filter, we use the label_ids array to build a mask of that component and merge it into the final output with the bitwise_or operation. It sounds hard, but you'll understand it better after implementing the code yourself.
Python3
# Label 0 is the background, so start from 1
for i in range(1, totalLabels):
    area = values[i, cv2.CC_STAT_AREA]
    # Keep only components whose area is in the range of a text character
    if (area > 140) and (area < 400):
        # Mask of the current component: 255 where label_ids equals i
        componentMask = (label_ids == i).astype("uint8") * 255
        # Merge the component into the output mask
        output = cv2.bitwise_or(output, componentMask)
How to select the value for Area (or any other condition like width or height) for filtering?
Add a print statement that prints the statistic you want to filter on for every component. Then note the range of values taken by the components you want to keep, and use that range as the filter condition.
Step 5: Visualize The Final Output
Now our final step is to simply display our original image and the final mask that we obtained.
Python3
cv2.imshow("Image", img)
cv2.imshow("Filtered Components", output)
# Block until a key is pressed
cv2.waitKey(0)
Below is the implementation:
Python3
import cv2
import numpy as np
# Load the image
img = cv2.imread('Images/img5.png')
# Preprocess: grayscale conversion followed by a 7x7 Gaussian blur
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0)
# Inverted Otsu threshold so that dark text becomes white foreground
threshold = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
# Apply the component analysis function
analysis = cv2.connectedComponentsWithStats(threshold,
                                            4,
                                            cv2.CV_32S)
(totalLabels, label_ids, values, centroid) = analysis
# Blank mask that will collect the components we keep
output = np.zeros(gray_img.shape, dtype="uint8")
# Loop through each component, skipping the background (label 0)
for i in range(1, totalLabels):
    area = values[i, cv2.CC_STAT_AREA]
    # Keep only components whose area is in the range of a text character
    if (area > 140) and (area < 400):
        componentMask = (label_ids == i).astype("uint8") * 255
        output = cv2.bitwise_or(output, componentMask)
cv2.imshow("Image", img)
cv2.imshow("Filtered Components", output)
cv2.waitKey(0)
Output:
The original image and the output mask.
Note: If you run the program without the filter on a number of images, big and small, you will see that the output contains a lot of "noise". That is why we applied the area filter: the final output then contains only the text characters that we wanted.
OpenCV Connected Component Labeling and Analysis:
Here is another implementation where I have demonstrated the whole process for each component so that it is easier for you to visualize:
Python3
import cv2
import numpy as np
# Load the image
img = cv2.imread('Images/img5.png')
# Preprocess: grayscale conversion followed by a 7x7 Gaussian blur
gray_img = cv2.cvtColor(img,
                        cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray_img, (7, 7), 0)
# Inverted Otsu threshold so that dark text becomes white foreground
threshold = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
# Apply the component analysis function
analysis = cv2.connectedComponentsWithStats(threshold,
                                            4,
                                            cv2.CV_32S)
(totalLabels, label_ids, values, centroid) = analysis
# Blank mask that will collect the components we keep
output = np.zeros(gray_img.shape, dtype="uint8")
# Loop through each component, skipping the background (label 0)
for i in range(1, totalLabels):
    area = values[i, cv2.CC_STAT_AREA]
    if (area > 140) and (area < 400):
        # Draw on a fresh copy so each step shows a single component
        new_img = img.copy()
        # Bounding box of the component
        x1 = values[i, cv2.CC_STAT_LEFT]
        y1 = values[i, cv2.CC_STAT_TOP]
        w = values[i, cv2.CC_STAT_WIDTH]
        h = values[i, cv2.CC_STAT_HEIGHT]
        pt1 = (x1, y1)
        pt2 = (x1 + w, y1 + h)
        (X, Y) = centroid[i]
        # Green bounding box and filled red dot at the centroid
        cv2.rectangle(new_img, pt1, pt2,
                      (0, 255, 0), 3)
        cv2.circle(new_img, (int(X),
                             int(Y)),
                   4, (0, 0, 255), -1)
        # Mask containing only the current component
        component = np.zeros(gray_img.shape, dtype="uint8")
        componentMask = (label_ids == i).astype("uint8") * 255
        component = cv2.bitwise_or(component, componentMask)
        output = cv2.bitwise_or(output, componentMask)
        # Show the current component; press a key to move to the next one
        cv2.imshow("Image", new_img)
        cv2.imshow("Individual Component", component)
        cv2.imshow("Filtered Components", output)
        cv2.waitKey(0)