Image Segmentation with Watershed Algorithm – OpenCV Python

Last Updated : 12 Feb, 2024

Image segmentation is a fundamental computer vision task that involves partitioning an image into meaningful and semantically homogeneous regions. The goal is to simplify the representation of an image or make it more meaningful for further analysis. These segments typically correspond to objects or regions of interest within the image.

Watershed Algorithm

The Watershed Algorithm is a classical image segmentation technique based on the watershed transformation. The segmentation process uses the similarity between adjacent pixels as its main criterion, grouping pixels that are close in position and similar in gray value.

When do I use the watershed algorithm?

The Watershed Algorithm is used when segmenting images with touching or overlapping objects. It excels in scenarios with irregular object shapes, gradient-based segmentation requirements, and when marker-guided segmentation is feasible.

Working of Watershed Algorithm

The watershed algorithm divides an image into segments using topographic information. It treats the image as a topographic surface, identifying catchment basins based on pixel intensity. Local minima are marked as starting points, and flooding with colors fills catchment basins until object boundaries are reached. The resulting segmentation assigns unique colors to regions, aiding object recognition and image analysis.

The whole process of the watershed algorithm can be summarized in the following steps:

Marker placement: The first step is to place markers on the local minima, or the lowest points, in the image. These markers serve as the starting points for the flooding process.
Flooding: The algorithm then floods the image with different colors, starting from the markers. As the color spreads, it fills up the catchment basins until it reaches the boundaries of the objects or regions in the image.
Catchment basin formation: As the color spreads, the catchment basins are gradually filled, creating a segmentation of the image. The resulting segments or regions are assigned unique colors, which can then be used to identify different objects or features in the image.
Boundary identification: The watershed algorithm uses the boundaries between the different colored regions to identify the objects or regions in the image. The resulting segmentation can be used for object recognition, image analysis, and feature extraction tasks.
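
The four steps above can be sketched on a toy height map. This is only a minimal illustration of marker-based flooding with a priority queue, not OpenCV's implementation; the function name flood_from_markers, the 4-connected neighborhood, and the tie-breaking rule are assumptions made for this example.

```python
import heapq


def flood_from_markers(height, markers):
    """Grow labelled markers over a height map, lowest pixels first.

    height  : 2D list of numbers (the "topographic surface")
    markers : 2D list of ints, 0 = unlabelled, >0 = seed label
    Returns a new 2D list in which every reachable pixel carries a seed label.
    """
    rows, cols = len(height), len(height[0])
    labels = [row[:] for row in markers]
    heap = []
    order = 0  # tie-breaker: equal heights are processed first-in, first-out

    # Marker placement: every seed pixel starts in the queue
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] > 0:
                heapq.heappush(heap, (height[r][c], order, r, c))
                order += 1

    # Flooding: always expand from the lowest pixel reached so far
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (height[nr][nc], order, nr, nc))
                order += 1
    return labels


# Two basins (low values) separated by a ridge (value 9) in column 3
height = [
    [1, 2, 3, 9, 3, 2, 1],
    [1, 2, 3, 9, 3, 2, 1],
]
markers = [
    [1, 0, 0, 0, 0, 0, 2],
    [0, 0, 0, 0, 0, 0, 0],
]
result = flood_from_markers(height, markers)
for row in result:
    print(row)
```

Each seed fills its own basin and the two labels meet at the ridge; which label the ridge pixels themselves receive depends on the tie-breaking order chosen here.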

Implementing the watershed algorithm using OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. OpenCV contains hundreds of computer vision algorithms, including object detection, face recognition, image processing, and machine learning.

Here are the implementation steps for the watershed Algorithm using OpenCV:

Import the required libraries

Python3

import cv2
import numpy as np
from IPython.display import Image, display
from matplotlib import pyplot as plt

Loading the image

We define a helper function “imshow” to display the processed image. The code then loads an image named “Coins.png”.

Python

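# Display an image inline (no axis given) or on a Matplotlib axis
def imshow(img, ax=None):
    if ax is None:
        ret, encoded = cv2.imencode(".jpg", img)
        display(Image(data=encoded.tobytes()))
    elif img.ndim == 2:
        # Single-channel images (grayscale, masks, distance maps)
        ax.imshow(img, cmap='gray')
        ax.axis('off')
    else:
        # OpenCV stores color images as BGR; Matplotlib expects RGB
        ax.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        ax.axis('off')

# Image loading
img = cv2.imread("Coins.png")
# cv2.imread returns None if the file cannot be read
if img is None:
    raise FileNotFoundError("Coins.png could not be loaded")
# Show image
imshow(img)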

Input coin

Converting to Grayscale image

We convert the image to grayscale using OpenCV’s “cvtColor” method. The grayscale image is stored in a variable “gray”.

Python3

#image grayscale conversion
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imshow(gray)

Output:

Grayscale image

The cv2.cvtColor() function takes two arguments: the image and the conversion flag cv2.COLOR_BGR2GRAY, which specifies the conversion from BGR color space to grayscale.
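
To make the conversion concrete, the sketch below applies the standard luma weights that OpenCV documents for RGB-to-gray conversion (Y = 0.299 R + 0.587 G + 0.114 B) to a few hand-picked BGR pixels using NumPy only. The helper name bgr_to_gray is an assumption for this example, and OpenCV's own fixed-point arithmetic may differ from this floating-point version by one gray level on some inputs.

```python
import numpy as np


def bgr_to_gray(img_bgr):
    """Weighted sum of the B, G and R channels, rounded to uint8."""
    b = img_bgr[..., 0].astype(np.float64)
    g = img_bgr[..., 1].astype(np.float64)
    r = img_bgr[..., 2].astype(np.float64)
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


# One row of four pixels in BGR order: pure blue, pure green, pure red, white
pixels = np.array(
    [[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)
gray = bgr_to_gray(pixels)
print(gray)
```

Green contributes most to the gray value and blue least, which is why a pure green pixel ends up much brighter than a pure blue one.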

Implementing thresholding

Thresholding is a crucial step in image segmentation: it converts a grayscale image into a binary image, which separates the objects of interest from the background.

In the code below, the cv2.THRESH_OTSU flag is added to cv2.THRESH_BINARY_INV so that OpenCV applies Otsu’s binarization process. With this flag set, the threshold value passed to cv2.threshold (0 here) is ignored, and the threshold chosen by Otsu’s method is returned as the first value (“ret”). Otsu’s method automatically determines an optimal threshold by maximizing the variance between two classes of pixels in the image, which is equivalent to minimizing the variance within each class, effectively separating the image into two groups of pixels with distinct characteristics. Because the mode is inverted, pixels at or below the threshold become white (255) and the rest become black (0).

Otsu’s binarization process

Otsu’s binarization is a technique used in image processing to separate the foreground and background of an image into two distinct classes. This is done by finding the optimal threshold value that maximizes the variance between the two classes. Otsu’s method is known for its simplicity and computational efficiency, making it a popular choice in applications such as document analysis, object recognition, and medical imaging.

Python

#Threshold Processing
ret, bin_img = cv2.threshold(gray,
                             0, 255, 
                             cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
imshow(bin_img)

Output:

Threshold image
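
The threshold search that Otsu’s method performs can be written out by hand. The following NumPy sketch is an illustration of the idea rather than OpenCV's code; the function name otsu_threshold and the tiny six-pixel image are assumptions for this example.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0 = prob[: t + 1].sum()  # weight of the dark class
        w1 = 1.0 - w0             # weight of the bright class
        if w0 == 0.0 or w1 == 0.0:
            continue              # one class is empty, nothing to compare
        mu0 = (levels[: t + 1] * prob[: t + 1]).sum() / w0
        mu1 = (levels[t + 1:] * prob[t + 1:]).sum() / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# A dark cluster (30, 40, 50) and a bright cluster (190, 200, 210)
gray = np.array([[30, 40, 50], [190, 200, 210]], dtype=np.uint8)
t = otsu_threshold(gray)

# Same convention as cv2.THRESH_BINARY_INV: values above t become 0
binary_inv = np.where(gray > t, 0, 255).astype(np.uint8)
print(t)
print(binary_inv)
```

Any threshold that falls in the gap between the two clusters separates them equally well; this sketch simply keeps the first such value it finds.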

Noise Removal

To clean the objects’ outlines (boundary lines), noise is removed using morphological opening (cv2.MORPH_OPEN).

Morphological Opening

Morphological opening is an erosion followed by a dilation with the same structuring element. Erosion shrinks bright regions in an image, which deletes small white specks, and dilation then expands the surviving regions back to roughly their original size. The code below applies opening with a 3×3 rectangular kernel and iterations=2. A related operation, the morphological gradient, subtracts the erosion of an image from its dilation to emphasize the edges and boundaries of objects; it is useful in detection and segmentation tasks but is not what this step uses.

Python

# noise removal
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
bin_img = cv2.morphologyEx(bin_img, 
                           cv2.MORPH_OPEN,
                           kernel,
                           iterations=2)
imshow(bin_img)

Output:

Image after noise removal
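
Morphological opening, the operation selected by cv2.MORPH_OPEN in the code above, is an erosion followed by a dilation. The NumPy sketch below shows the effect on a tiny 0/1 image with a 3×3 square kernel. The helper names are assumptions for this example, and the zero padding at the image border is a simplification that differs from OpenCV's default border handling.

```python
import numpy as np


def _neighborhood_stack(binary):
    """Stack the nine 3x3-shifted copies of a zero-padded 0/1 image."""
    padded = np.pad(binary, 1, mode="constant", constant_values=0)
    rows, cols = binary.shape
    return np.stack([
        padded[dr:dr + rows, dc:dc + cols]
        for dr in range(3)
        for dc in range(3)
    ])


def erode(binary):
    # A pixel survives only if its whole 3x3 neighborhood is set
    return _neighborhood_stack(binary).min(axis=0)


def dilate(binary):
    # A pixel is set if any pixel in its 3x3 neighborhood is set
    return _neighborhood_stack(binary).max(axis=0)


def opening(binary):
    return dilate(erode(binary))


# A 3x3 object plus one isolated noise pixel
img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 1:4] = 1   # the object
img[1, 5] = 1       # the noise

opened = opening(img)
print(opened)
```

The isolated pixel disappears during erosion and cannot be restored by the dilation, while the square is shrunk to its center and then grown back to its original extent.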

Detecting the black background and foreground of the image

Next, we need to identify the black area, which is the background of this image. If the white regions are the objects of interest and are well filled, the rest is background.

We apply several morphological operations on our binary image:

The first operation is dilation using “cv2.dilate” which expands the bright regions of the image, creating the “sure_bg” variable representing the sure background area. This result is displayed using the “imshow” function.
The next operation is “cv2.distanceTransform” which calculates the distance of each white pixel in the binary image to the closest black pixel. The result is stored in the “dist” variable and displayed using “imshow”.
Then, the foreground area is obtained by applying a threshold on the “dist” variable using “cv2.threshold”. The threshold is set to 0.5 times the maximum value of “dist”.
Finally, the unknown area is calculated as the difference between the sure background and sure foreground areas using “cv2.subtract”. The result is stored in the “unknown” variable and displayed using “imshow”.

Python

# Create subplots with 2 rows and 2 columns
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))
# sure background area
sure_bg = cv2.dilate(bin_img, kernel, iterations=3)
imshow(sure_bg, axes[0,0])
axes[0, 0].set_title('Sure Background')
 
# Distance transform
dist = cv2.distanceTransform(bin_img, cv2.DIST_L2, 5)
imshow(dist, axes[0,1])
axes[0, 1].set_title('Distance Transform')
 
#foreground area
ret, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = sure_fg.astype(np.uint8)  
imshow(sure_fg, axes[1,0])
axes[1, 0].set_title('Sure Foreground')
 
# unknown area
unknown = cv2.subtract(sure_bg, sure_fg)
imshow(unknown, axes[1,1])
axes[1, 1].set_title('Unknown')
 
plt.show()

Output:

Remove background
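
The distance transform and the 0.5 × maximum threshold used above can be reproduced on a small array. The sketch below computes exact Euclidean distances by brute force with NumPy; it is for illustration only, since cv2.distanceTransform with a 5×5 mask uses a faster approximation. The function name distance_to_background is an assumption, and for brevity the unknown band is taken from the binary image itself rather than from a dilated sure background.

```python
import numpy as np


def distance_to_background(binary):
    """Euclidean distance from each non-zero pixel to the nearest zero pixel."""
    fg = np.argwhere(binary > 0)
    bg = np.argwhere(binary == 0)
    dist = np.zeros(binary.shape, dtype=np.float64)
    if len(fg) == 0 or len(bg) == 0:
        return dist
    # Pairwise offsets between every foreground and every background pixel
    diff = fg[:, None, :] - bg[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    dist[fg[:, 0], fg[:, 1]] = nearest
    return dist


# A 5x5 white square inside a 7x7 black image
binary = np.zeros((7, 7), dtype=np.uint8)
binary[1:6, 1:6] = 255

dist = distance_to_background(binary)

# Keep only pixels that lie deep inside the object
sure_fg = np.where(dist > 0.5 * dist.max(), 255, 0).astype(np.uint8)

# Pixels that are part of the object but not "sure" foreground
unknown = binary - sure_fg

print(dist)
print(sure_fg)
```

Pixels near the edge of the square have small distances and fall below the threshold, so only the inner core is kept as sure foreground. This shrinking is what separates touching objects into distinct seeds.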

Creating marker image

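Between the sure background and the sure foreground lies a band of pixels that cannot be assigned to either with certainty. This is the unknown region, computed above by subtracting the sure foreground from the sure background. So we now build a marker image in which this region is left unlabeled for the watershed algorithm to decide.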

Here are the steps:

First, the “connectedComponents” method from OpenCV is used to find the connected components in the sure foreground image “sure_fg”. The result is stored in “markers”.
To distinguish the background and foreground, the values in “markers” are incremented by 1.
The unknown region, represented by pixels with a value of 255 in “unknown”, is labeled with 0 in “markers”.
Finally, the “markers” image is displayed using Matplotlib’s “imshow” method with a color map of “tab20b”. The result is shown in a figure of size 6×6.

Python

# Marker labelling
# sure foreground 
ret, markers = cv2.connectedComponents(sure_fg)
 
# Add one to all labels so that background is not 0, but 1
markers += 1
# mark the region of unknown with zero
markers[unknown == 255] = 0
 
fig, ax = plt.subplots(figsize=(6, 6))
ax.imshow(markers, cmap="tab20b")
ax.axis('off')
plt.show()

Output:

Marker Labelling

This marker image, created by labeling the sure foreground and marking the unknown region, serves as input to the Watershed Algorithm. It guides the algorithm in segmenting the image based on these labeled regions. Each distinct color or label represents a separate segment or region in the image.
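
The marker convention can be checked on a toy mask. The sketch below labels 8-connected blobs with a small breadth-first search and then applies the same two adjustments as the code above (add 1, then set the unknown region to 0). The function label_components is a hypothetical stand-in written for this example; unlike cv2.connectedComponents, it returns the number of objects without counting the background.

```python
from collections import deque

import numpy as np


def label_components(binary):
    """Label 8-connected groups of non-zero pixels 1, 2, ...; background is 0."""
    rows, cols = binary.shape
    labels = np.zeros((rows, cols), dtype=np.int32)
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r, c] == 0 or labels[r, c] != 0:
                continue
            count += 1
            labels[r, c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr, nc] != 0
                                and labels[nr, nc] == 0):
                            labels[nr, nc] = count
                            queue.append((nr, nc))
    return count, labels


# Two sure-foreground blobs and a strip of unknown pixels between them
sure_fg = np.zeros((5, 7), dtype=np.uint8)
sure_fg[1:3, 1:3] = 255   # blob A
sure_fg[2:4, 5] = 255     # blob B
unknown = np.zeros((5, 7), dtype=np.uint8)
unknown[0:4, 3] = 255

count, markers = label_components(sure_fg)
markers += 1                  # background becomes 1, blobs become 2 and 3
markers[unknown == 255] = 0   # unknown pixels are left for watershed to decide
print(markers)
```

After these two steps, 0 means "undecided", 1 means sure background, and every value from 2 upward identifies one seed object.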

Applying Watershed Algorithm to Markers

Applying watershed() function. Steps taken:

The “cv2.watershed” function is applied to the original image “img” and the markers image obtained in the previous step to perform the Watershed algorithm. The result is stored in “markers”.
The “markers” image is displayed using Matplotlib’s “imshow” method with a color map of “tab20b”.
A loop iterates over the unique labels starting from the third one, which skips the boundary label -1 and the background label 1, to extract the contours of each object.
The contours of the binary image are found using OpenCV’s “findContours” function, and the first contour is appended to a list of coins.
Finally, the objects’ outlines are drawn on the original image using “cv2.drawContours”. The result is displayed using the “imshow” function.

Python

# watershed Algorithm
markers = cv2.watershed(img, markers)
 
fig, ax = plt.subplots(figsize=(5, 5))
ax.imshow(markers, cmap="tab20b")
ax.axis('off')
plt.show()
 
 
labels = np.unique(markers)
 
coins = []
for label in labels[2:]:
    # Create a binary image in which only the area of the label is in the
    # foreground and the rest of the image is in the background
    target = np.where(markers == label, 255, 0).astype(np.uint8)

    # Perform contour extraction on the created binary image
    contours, hierarchy = cv2.findContours(
        target, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )
    coins.append(contours[0])
 
# Draw the outline
img = cv2.drawContours(img, coins, -1, color=(0, 23, 223), thickness=2)
imshow(img)

Output:

Marker

Output Image

The outline of each object is drawn in red in the image.
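
The label selection in the loop above relies on the sorted order returned by np.unique. The sketch below demonstrates it on a hand-made marker array, assuming (as in this tutorial) that cv2.watershed has marked boundary pixels with -1 and that the background marker is 1. The array is a toy layout, not real watershed output.

```python
import numpy as np

# Toy result: -1 = boundary, 1 = background, 2 and 3 = two objects
markers = np.array([
    [-1, -1, -1, -1, -1, -1],
    [-1,  1,  1,  1,  1, -1],
    [-1,  1,  2, -1,  3, -1],
    [-1,  1,  1,  1,  1, -1],
    [-1, -1, -1, -1, -1, -1],
], dtype=np.int32)

labels = np.unique(markers)           # sorted unique values
object_labels = labels[labels > 1]    # explicit form of labels[2:]

# One binary mask per object, as used for contour extraction
masks = {
    int(label): np.where(markers == label, 255, 0).astype(np.uint8)
    for label in object_labels
}

print(labels)
print(object_labels)
```

Filtering with labels > 1 gives the same result as labels[2:] here, but it does not depend on both -1 and 1 being present in the array.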

Hence, the code implements the watershed algorithm using OpenCV to segment an image into separate objects or regions. It loads the image and converts it to grayscale, thresholds it and removes noise, derives the sure background, sure foreground, and unknown regions, builds a marker image from them, and runs cv2.watershed to flood outward from the markers. The boundaries between the resulting regions are then drawn on the original image, which is displayed.

Conclusion

In conclusion, the watershed algorithm is a powerful image segmentation technique that uses topographic information to divide an image into multiple segments or regions. Because its boundaries follow the intensity ridges between regions, the results often agree well with how a viewer would separate the objects. It is widely used in medical imaging and computer vision applications and is a common step in image processing pipelines. Its main limitation is a tendency to over-segment noisy images when no markers are supplied, which is why the pipeline above removes noise and builds markers first. With that preparation, the watershed algorithm remains a popular choice for separating touching objects and irregular shapes.
