
YOLOv5: Object Tracker in Videos

Last Updated : 15 Jan, 2024

Introduction 

Let’s first understand object detection and object tracking.

Object detection: It is the process of locating and recognizing objects in an image or a single video frame. The presence of multiple objects is determined in the input data, and bounding boxes are drawn around them to represent their locations.

Along with the bounding box coordinates that give an object’s position, a class label such as “cat”, “crosswalk”, or “bird” is predicted, and a confidence score expresses the algorithm’s level of assurance about each detection.
It is vital to keep in mind that object detection is usually performed on individual frames or images and does not consider the movement or trajectory of objects across successive frames. A minimal sketch of what a single detection looks like is given below.
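
To make this concrete, here is a purely illustrative sketch (the numbers and labels are made up) of how a single detection is commonly represented: a bounding box, a class label, and a confidence score.

Python3

# Illustrative only: one detection with a bounding box,
# a predicted class and a confidence score (values are made up).
detection = {
    "bbox": [120, 45, 310, 280],   # [x_min, y_min, x_max, y_max] in pixels
    "class": "cat",                # predicted category
    "confidence": 0.91,            # detector's certainty, between 0 and 1
}

print(f"Found a {detection['class']} at {detection['bbox']} "
      f"with confidence {detection['confidence']:.2f}")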

Object tracking: It is the task of following a specific object’s movement across a number of video frames. The goal is to keep the tracked object’s identity constant even as it moves through the video, as sketched below. This technique is particularly useful in scenarios such as surveillance, self-driving cars, and video analysis.
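
As a rough illustration (the IDs and boxes below are made up), the key difference from plain detection is that each object keeps the same ID from frame to frame:

Python3

# Illustrative only: a tracker assigns persistent IDs across frames,
# so ID 1 refers to the same physical object in every frame.
tracked_frames = [
    {"frame": 1, "tracks": [{"id": 1, "bbox": [100, 50, 180, 200]}]},
    {"frame": 2, "tracks": [{"id": 1, "bbox": [108, 52, 188, 202]}]},
    {"frame": 3, "tracks": [{"id": 1, "bbox": [115, 55, 195, 205]},
                            {"id": 2, "bbox": [300, 40, 360, 190]}]},  # new object, new ID
]

for frame in tracked_frames:
    ids = [t["id"] for t in frame["tracks"]]
    print(f"Frame {frame['frame']}: track IDs {ids}")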

What is YOLO?

YOLO simply stands for ‘You Only Look Once’. It is an object detection technique that locates numerous objects within an image or a video in a single pass. In contrast to typical detection algorithms that scan an image repeatedly, YOLO divides the image into a grid and simultaneously predicts bounding boxes, class probabilities, and confidence scores for each grid cell. A few example use cases of YOLOv5 are face mask detection, object recognition, speed calculation, vehicle tracking, and so on.
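
Before moving on to the tracking pipeline, here is a minimal, self-contained sketch of plain YOLOv5 detection on a single image using torch.hub (the ‘ultralytics/yolov5’ hub entry, the ‘yolov5s’ weights and the sample image URL are assumptions, separate from the tracking repository used below):

Python3

import torch

# A minimal sketch: load a small pre-trained YOLOv5 model from torch.hub
# (downloads the weights on first run).
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Run a single forward pass on an image path or URL.
img = 'https://ultralytics.com/images/zidane.jpg'
results = model(img)

results.print()                   # summary of detected classes and counts
print(results.pandas().xyxy[0])   # boxes, confidences and class names as a DataFrame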

To know more about the YOLO models, refer to this link.

In this article, we will study how to use YOLOV5 for object tracking in videos.

The steps to be followed are:

Importing necessary libraries 

Python3




import torch
from IPython.display import Image, clear_output
from IPython.display import HTML
from base64 import b64encode


Let’s now begin by cloning the required repository for this project.

Cloning Repository  

First, we need to clone the GitHub repository of YOLOv5 with the DeepSort tracker using the command below.

Python3




!git clone --recurse-submodules https://github.com/mikel-brostrom/Yolov5_DeepSort_Pytorch.git


After this step, we need to change the directory according to the cloned repository. 

Python3




%cd Yolov5_DeepSort_Pytorch


Now, we will install the dependencies. 

Python3




%pip install -qr requirements.txt


Now, we will get some system information to run this model efficiently.

Python3




# clear the output
clear_output()
# system information
print(f"Setup complete. Using torch {torch.__version__} "
      f"({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")


Now, we will use a pre-trained YOLOv5 model that has been trained on the CrowdHuman dataset.

Python3




# download the pre-trained model 
!wget -nc https://github.com/mikel-brostrom/Yolov5_DeepSort_Pytorch/releases/download/v.2.0/crowdhuman_yolov5m.pt -O /content/Yolov5_DeepSort_Pytorch/yolov5/weights/crowdhuman_yolov5m.pt


Now, let’s get a video and test it.

Python3




# getting test video
!wget -nc https://github.com/mikel-brostrom/Yolov5_DeepSort_Pytorch/releases/download/v.2.0/test.avi


After this step, we will extract just a few seconds of the starting portion of the video. 

Python3




# extracting 2 seconds of video
!ffmpeg -y -ss 00:00:00 -i test.avi -t 00:00:02 -c copy out.avi


Now, let’s run object tracking on the extracted clip using the track.py script.

Python3




!python track.py --yolo_model /content/Yolov5_DeepSort_Pytorch/yolov5/weights/yolov5n.pt --source out.avi --save-vid


In order to display the tracked video, we first need to convert it to the MP4 format. We are using ‘ffmpeg’ for this task.

Python3




!ffmpeg -i /content/Yolov5_DeepSort_Pytorch/runs/track/exp3/out.avi output.mp4


Now, to display the video, we use an HTML player. We first read the binary content of the MP4 video file, encode it with base64, and build a data URL that is passed to an HTML video element.

Python3




mp4 = open('output.mp4','rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
# display with HTML
HTML("""
<video controls>
      <source src="%s" type="video/mp4">
</video>
""" % data_url)


Output: 

Conclusion

YOLOv5 is one of the most efficient models for object detection and tracking and plays a significant role in real-world applications such as surveillance and security, autonomous vehicles, and sports analytics. For further enhancement, the model can also be trained on a custom dataset.


