Holistically-Nested Edge Detection with OpenCV and Deep Learning

Last Updated : 21 Mar, 2024

Holistically-nested edge detection (HED) is a deep learning model that uses fully convolutional neural networks and deeply-supervised nets to perform image-to-image prediction. HED automatically learns rich hierarchical representations (guided by deep supervision on side responses) that are critical for resolving ambiguity in edge and object boundary detection.

Why Holistically-Nested Edge Detection (HED)?

The proposed holistically nested edge detector (HED) tackles two critical issues:

- Holistic image training and prediction, inspired by fully convolutional neural networks for image-to-image classification (the system takes an image as input and directly produces the edge map image as output).
- Nested multi-scale feature learning, inspired by deeply-supervised nets that perform deep layer supervision to "guide" early classification results.

By combining FCNs with deep supervision, HED produces accurate and detailed edge predictions in images.

Model Architecture of Holistically-Nested Edge Detection

The model is VGGNet with a few modifications:

- A side-output layer is connected to the last convolutional layer in each stage, namely conv1_2, conv2_2, conv3_3, conv4_3, and conv5_3. The receptive field size of each of these convolutional layers is identical to that of the corresponding side-output layer.
- The last stage of VGGNet is removed, including the 5th pooling layer and all the fully connected layers. Without these layers, HED relies only on the convolutional layers for feature extraction and hierarchical representation learning, which is what edge detection needs.

The final HED network architecture has 5 stages, with strides 1, 2, 4, 8, and 16, respectively, and with different receptive field sizes, all nested in the VGGNet. HED modifies VGGNet by connecting side output layers to specific convolutional layers and removing the last stage, resulting in a model architecture that is tailored for holistic and hierarchical edge detection.
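The strides and receptive field sizes mentioned above follow from the layer layout of VGG-16. The sketch below works them out, assuming the standard VGG-16 configuration (3x3 convolutions with stride 1, and a 2x2 max-pooling layer with stride 2 between stages). The function name `side_output_geometry` is made up for this illustration.

```python
# Number of 3x3 conv layers in each of the five VGG-16 stages
# (conv1_1..conv1_2, conv2_1..conv2_2, conv3_1..conv3_3, ...).
VGG16_STAGES = [2, 2, 3, 3, 3]

def side_output_geometry(stages=VGG16_STAGES, kernel=3, pool=2):
    """Return (stride, receptive_field) at the last conv layer of each stage."""
    results = []
    stride = 1   # distance in input pixels between neighbouring outputs
    rf = 1       # receptive field size in input pixels
    for i, n_convs in enumerate(stages):
        if i > 0:
            # A 2x2 pooling layer with stride 2 sits between stages
            rf += (pool - 1) * stride
            stride *= pool
        for _ in range(n_convs):
            rf += (kernel - 1) * stride
        results.append((stride, rf))
    return results

names = ["conv1_2", "conv2_2", "conv3_3", "conv4_3", "conv5_3"]
for name, (stride, rf) in zip(names, side_output_geometry()):
    print(f"{name}: stride {stride}, receptive field {rf}x{rf}")
```

Under these assumptions the strides come out as 1, 2, 4, 8, and 16, and the receptive field grows from 5x5 at conv1_2 to 196x196 at conv5_3, which is why the deeper side outputs respond to coarser, more object-level boundaries.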

Steps Required for Implementing HED

Import all the libraries required.

We will use OpenCV for reading the input image, resizing it, and loading the parameters of the trained network.

import cv2

Read the image.

img = cv2.imread("path")

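Create the blob. Here W and H are the width and height of the image (for example, from img.shape[:2]), and the mean values are the per-channel BGR means used in OpenCV's HED sample.

blob = cv2.dnn.blobFromImage(img, scalefactor=1.0, size=(W, H), mean=(104.00698793, 116.66876762, 122.67891434), swapRB=False, crop=False)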
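To make the blob less of a black box, here is a rough NumPy equivalent of what the call above produces when no resizing or cropping takes place. It is only a sketch of the data layout, not OpenCV's implementation; the helper name `to_blob` and its `mean` argument are assumptions for illustration.

```python
import numpy as np

def to_blob(img, scalefactor=1.0, mean=(0.0, 0.0, 0.0)):
    """Mimic the layout of a DNN input blob for a single H x W x 3 image."""
    # Work in float32, subtract the per-channel mean, then scale
    data = (img.astype(np.float32) - np.asarray(mean, dtype=np.float32)) * scalefactor
    # H x W x C -> C x H x W, then add a leading batch axis: 1 x C x H x W
    return data.transpose(2, 0, 1)[np.newaxis, ...]

# A dummy 4 x 6 BGR image stands in for the result of cv2.imread
dummy = np.full((4, 6, 3), 128, dtype=np.uint8)
blob = to_blob(dummy)
print(blob.shape)
```

The point to take away is the axis order: the network expects batch, channels, height, width, whereas an image loaded by OpenCV is stored as height, width, channels.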

Load the pre-trained Caffe model
- HED is built on top of publicly available implementations of FCN and DSN and is implemented using the publicly available Caffe library. The network is initialized from the pre-trained VGG-16 model, and the entire HED network is then fine-tuned.
- The Caffe model is stored in two files:
  1. A prototxt file: a plain-text file (Protocol Buffers text format, not JSON) that contains the deploy model definition (i.e. layers, expected input, and so on).
  2. A caffemodel file: the pre-trained neural network weights.

These files can be downloaded from this link. We need them to load the pre-trained network and run predictions on our input image; no training is required.

net = cv2.dnn.readNetFromCaffe("path to prototxt file", "path to model weights file")
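One caveat when loading this model: the HED prototxt contains Caffe `Crop` layers, and OpenCV's own edge-detection sample registers a custom Python implementation of that layer before reading the network. The sketch below follows the same idea (a centre crop of the first input to the spatial size of the second). Whether it is needed may depend on the OpenCV version, so treat it as an assumption to check against your installation. `register_crop_layer` is a hypothetical helper name.

```python
import numpy as np

class CropLayer:
    """Centre-crop the first input blob to the spatial size of the second."""

    def __init__(self, params, blobs):
        self.ystart = self.yend = 0
        self.xstart = self.xend = 0

    def getMemoryShapes(self, inputs):
        # inputs[0] is the shape to crop, inputs[1] the reference shape (N, C, H, W)
        input_shape, target_shape = inputs[0], inputs[1]
        batch, channels = input_shape[0], input_shape[1]
        height, width = target_shape[2], target_shape[3]
        self.ystart = (input_shape[2] - height) // 2
        self.xstart = (input_shape[3] - width) // 2
        self.yend = self.ystart + height
        self.xend = self.xstart + width
        return [[batch, channels, height, width]]

    def forward(self, inputs):
        return [inputs[0][:, :, self.ystart:self.yend, self.xstart:self.xend]]

def register_crop_layer():
    """Register the layer with OpenCV; call this before readNetFromCaffe."""
    import cv2  # imported here so the class itself only needs NumPy
    cv2.dnn_registerLayer("Crop", CropLayer)

# Quick check of the cropping logic on dummy data (no model files needed)
layer = CropLayer(None, None)
out_shapes = layer.getMemoryShapes([[1, 1, 10, 12], [1, 3, 6, 8]])
cropped = layer.forward([np.zeros((1, 1, 10, 12)), np.zeros((1, 3, 6, 8))])
print(out_shapes, cropped[0].shape)
```

If the edge map looks shifted or the forward pass fails on the crop layers, calling `register_crop_layer()` before loading the network is a reasonable first thing to try.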

Pass the blob of the image to the model and find the output.

net.setInput(blob)

hed = net.forward()

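Convert the output to a displayable format (if required). The network returns a 1 x 1 x H x W array of edge probabilities, so we take the single channel, resize it to the image size, and scale it to 8-bit values.

hed = cv2.resize(hed[0, 0], (W, H))

hed = (255 * hed).astype("uint8")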
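The scaling step assumes the network output is a 1 x 1 x H x W array of edge probabilities between 0 and 1. A small NumPy sketch on synthetic data shows the conversion. The optional threshold for a binary edge mask is an illustrative addition rather than part of the pipeline described here, and the function name is hypothetical.

```python
import numpy as np

def edge_map_to_uint8(output, threshold=None):
    """Convert a 1 x 1 x H x W float edge map in [0, 1] to an 8-bit image."""
    edges = np.clip(output[0, 0], 0.0, 1.0)   # drop batch and channel axes
    if threshold is not None:
        # Optional: binarise, so pixels at or above the threshold become white
        edges = (edges >= threshold).astype(np.float32)
    return (255 * edges).astype("uint8")

# Synthetic 2 x 2 "edge map" standing in for net.forward()
fake_output = np.array([[[[0.0, 0.25], [0.5, 1.0]]]], dtype=np.float32)
gray = edge_map_to_uint8(fake_output)
mask = edge_map_to_uint8(fake_output, threshold=0.5)
print(gray)
print(mask)
```

The grayscale version keeps the confidence of each edge pixel, while the thresholded version is easier to pass to later steps such as contour extraction.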

Display the output

cv2.imshow("HED", hed)

cv2.waitKey(0)

Python Implementation of HED

We will use a pre-trained HED model to detect the edges in our input image.

Input Image:

Input image for HED

Python3

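import cv2
# Load the input image (BGR) and record its size
img = cv2.imread("input.webp")
if img is None:
    raise FileNotFoundError("Could not read input.webp")
(H, W) = img.shape[:2]
# Build a 1x3xHxW blob; the mean is the per-channel BGR mean used in OpenCV's HED sample
blob = cv2.dnn.blobFromImage(img, scalefactor=1.0, size=(W, H),
    mean=(104.00698793, 116.66876762, 122.67891434),
    swapRB=False, crop=False)
# Load the pre-trained HED network and run a forward pass
net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "hed_pretrained_bsds.caffemodel")
net.setInput(blob)
hed = net.forward()
# Convert the 1x1xHxW float edge map to an 8-bit image of the original size
hed = cv2.resize(hed[0, 0], (W, H))
hed = (255 * hed).astype("uint8")
# Show the input and the edge map until a key is pressed
cv2.imshow("Input", img)
cv2.imshow("HED", hed)
cv2.waitKey(0)
cv2.destroyAllWindows()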

Output:

Output

Using a Different Input Image

Input image:

Input image for HED

Python3

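import cv2
# Load the input image (BGR) and record its size
img = cv2.imread("pexels-ylanite-koppens-2343170(1).jpg")
if img is None:
    raise FileNotFoundError("Could not read pexels-ylanite-koppens-2343170(1).jpg")
(H, W) = img.shape[:2]
# Build a 1x3xHxW blob; the mean is the per-channel BGR mean used in OpenCV's HED sample
blob = cv2.dnn.blobFromImage(img, scalefactor=1.0, size=(W, H),
    mean=(104.00698793, 116.66876762, 122.67891434),
    swapRB=False, crop=False)
# Load the pre-trained HED network and run a forward pass
net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "hed_pretrained_bsds.caffemodel")
net.setInput(blob)
hed = net.forward()
# Convert the 1x1xHxW float edge map to an 8-bit image of the original size
hed = cv2.resize(hed[0, 0], (W, H))
hed = (255 * hed).astype("uint8")
# Show the input and the edge map until a key is pressed
cv2.imshow("Input", img)
cv2.imshow("HED", hed)
cv2.waitKey(0)
cv2.destroyAllWindows()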
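Both runs feed the image to the network at its full resolution. For large photographs the forward pass can be slow, so one option is to pass a smaller `size` to `blobFromImage` while keeping the aspect ratio. The helper below is a sketch of that idea; its name and the 512-pixel default limit are arbitrary choices for illustration, not values tied to the HED model.

```python
def fit_within(width, height, max_side=512):
    """Return (W, H) scaled down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height   # already small enough, leave unchanged
    scale = max_side / longest
    # Round to the nearest pixel and never go below 1
    return max(1, round(width * scale)), max(1, round(height * scale))

large = fit_within(4000, 3000)
small = fit_within(300, 200)
print(large, small)
```

The returned pair can be used as the `size` argument when building the blob, while the original width and height are kept for the final `cv2.resize` call so that the edge map is scaled back to the size of the input image.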