Microsoft Azure – Computer Vision Service

Last Updated : 30 Mar, 2023

Computer vision is a branch of artificial intelligence that enables a system to see and recognize the world and make logical sense of it. To a computer, an image or video is just an array of pixels whose values define their colors. Machine learning models can use these numeric values as features to find meaningful relationships and make predictions about the image and its contents. This is useful in several cases, such as:

Organization of content: Identifying people and objects in photos or videos so they can be organized or annotated with relevant information. Many social media websites use this to help tag a particular person.
Extracting text: Analyzing text in different formats (handwritten, PDF) and extracting the relevant text in a structured form. This is used in form and bill processing.
Analyzing a given space: Analyzing a particular space to get information about the people and objects in it and their movements over time.
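
To make the "array of pixels" idea from the introduction concrete, here is a minimal sketch in plain Python. The 2x2 image and the helper names are made up for illustration, and the channel average is a deliberately simple grayscale conversion rather than what any particular model uses.

```python
# A hypothetical 2x2 RGB image: each pixel is a (red, green, blue) triple in 0..255.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def to_grayscale(img):
    """Average the three channels of every pixel (simple, non-perceptual)."""
    return [[sum(pixel) // 3 for pixel in row] for row in img]


def flatten(img):
    """Turn a 2-D grid of values into a flat feature vector."""
    return [value for row in img for value in row]


features = flatten(to_grayscale(image))
print(features)  # [85, 85, 85, 255]
```

A model never sees "a red pixel"; it only sees numbers like these, which is why pre-trained services are convenient for turning them back into descriptions.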

Microsoft Azure – Computer Vision Service

Microsoft Azure Computer Vision Service is a cognitive service that analyzes images and returns detailed information about them. Because it offers pre-trained computer vision capabilities, we do not need to train our own models, and an application can get relevant information about an image in seconds.

Resources Required to Use the Service

To use this service we need to create some resources in Microsoft Azure. We can use either of the following:

A. Computer Vision: A separate resource in Microsoft Azure dedicated to computer vision tasks. Use this resource when the product does not need any other cognitive services provided by Azure, or when we want to track the cost and utilization of our computer vision tasks separately.

B. Cognitive Service: A general resource that includes all the cognitive services provided by Microsoft Azure, such as computer vision, translator text, text analytics, form processing, and many others. Use this resource when the product relies on several cognitive services and we want them under one resource to simplify administration and development.

Whenever we create a resource (either of the two mentioned above), it comes with two pieces of information that allow us to use it.

Key: It will be used to authenticate the client applications and requests.
Endpoint: The HTTP address at which we can access our resource.
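
The sketch below shows how the key and endpoint typically fit into a REST request. It only builds the request and does not send it. The path `vision/v3.2/analyze` and the header name `Ocp-Apim-Subscription-Key` are assumptions based on the v3.2 REST API and may differ for other API versions; the function name is hypothetical.

```python
import json
import urllib.request

# Assumed REST path for the v3.2 analyze operation; check your API version.
API_PATH = "vision/v3.2/analyze"


def build_analyze_request(endpoint, key, image_url):
    """Build (but do not send) a POST request that asks the service to analyze an image URL."""
    # The endpoint is the base address; the operation path is appended to it.
    url = endpoint.rstrip("/") + "/" + API_PATH
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        # The key authenticates the client application.
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before any call is billed against the resource.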

Analyzing images with the Service

The computer vision service has many pre-trained machine learning models to help us analyze images and gain information from them effectively.

Prerequisites: You need an Azure subscription. If you do not have one, you can sign up for a free 12-month subscription.

Creating a Cognitive Service Resource

We can use this service by creating either a computer vision resource or a cognitive service resource. Follow the steps below to create the cognitive service resource.

Step 1: Navigate to the Azure portal

Step 2: Sign in to your Microsoft Account and navigate to the subscription where you wish to create the resource.

Step 3: Click the “+Create a resource” button and search for cognitive services by typing it in the search bar.

Step 4: Click on the ‘Create’ option.

Step 5: In the panel that opens, enter the following details:

Subscription: The Azure subscription in which you are creating the resource.
Resource group: Select an existing resource group, or create a new one by clicking the Create New option and entering a unique name.
Region: Choose any available region from the list.
Name: Enter a unique name for your resource.
Pricing: Choose a pricing tier. For now, we will go with S0.
Select the check box next to ‘I confirm I have read and understood the notices’.

Step 6: Click on Review and Create and check all the details. If everything is correct and validation passes, click on Create.

Step 7: The deployment process will begin. Wait for it to finish.

Step 8: After the resource has been deployed successfully, go to it and open Keys and Endpoint under Resource Management. Note down the key and endpoint, which will be used to connect client applications to the service.

Now, we can use the key and endpoint to integrate the created resource in our application to submit images to the service and perform a wide range of analytical tasks on images and videos.
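
As a rough sketch of that integration, the code below reads the key and endpoint from environment variables and submits an image URL for analysis. The variable names `VISION_ENDPOINT` and `VISION_KEY`, the function names, and the v3.2 path and header are assumptions, not something the portal mandates. The `opener` parameter is only there so the network call can be replaced in tests.

```python
import json
import os
import urllib.request


def load_settings(env=None):
    """Read the endpoint and key from environment variables (names are assumptions)."""
    env = os.environ if env is None else env
    try:
        return env["VISION_ENDPOINT"], env["VISION_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"Set {missing.args[0]} before running") from None


def analyze_image(endpoint, key, image_url, features="Description,Tags",
                  opener=urllib.request.urlopen):
    """POST an image URL to the (assumed) v3.2 analyze operation and return the parsed JSON."""
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?visualFeatures={features}"
    request = urllib.request.Request(
        url,
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (performs a real, billable call, so it is left commented out):
# endpoint, key = load_settings()
# print(analyze_image(endpoint, key, "https://example.com/city.jpg"))
```

Reading the key from the environment keeps it out of source control, which matters because anyone holding the key can use the resource.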

Tasks That Can Be Performed Using the Service

1. Describe an Image: Computer Vision Service can analyze an image, evaluate the objects it detects, and generate human-readable phrases that describe the image. Depending on the input image, it may return several phrases, each with a confidence score, sorted from highest to lowest confidence. For example:

A black and white photo of a large city.
A black and white photo of a city.

Tag features

Computer vision service suggests tags for the image based on the objects it detects. Tags summarize the attributes of an image and are useful for indexing images with key terms, so that images with specific attributes or contents can be searched for. For the image described above, the tags might be:

Tower
Skyscraper
Building
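
The following sketch shows how captions and tags might be read from a response. The JSON shape (`description.captions` and a top-level `tags` list, each entry with a `confidence`) is an assumption based on the v3.2 analyze response, and the confidence values are invented for illustration.

```python
import json

# Illustrative response only: the shape is assumed and the scores are made up.
sample_response = json.loads("""
{
  "description": {
    "captions": [
      {"text": "A black and white photo of a city", "confidence": 0.61},
      {"text": "A black and white photo of a large city", "confidence": 0.64}
    ]
  },
  "tags": [
    {"name": "tower", "confidence": 0.98},
    {"name": "skyscraper", "confidence": 0.95},
    {"name": "building", "confidence": 0.93},
    {"name": "fog", "confidence": 0.30}
  ]
}
""")


def ranked_captions(response):
    """Return (text, confidence) pairs, highest confidence first."""
    captions = response.get("description", {}).get("captions", [])
    ordered = sorted(captions, key=lambda c: c["confidence"], reverse=True)
    return [(c["text"], c["confidence"]) for c in ordered]


def tag_names(response, min_confidence=0.5):
    """Return tag names whose confidence is at least min_confidence."""
    return [t["name"] for t in response.get("tags", [])
            if t["confidence"] >= min_confidence]


print(ranked_captions(sample_response)[0][0])  # A black and white photo of a large city
print(tag_names(sample_response))              # ['tower', 'skyscraper', 'building']
```

Sorting on the client side is a small safeguard in case the order of the returned captions is not what the application expects.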

2. Object detection: Computer vision service detects the objects present in the image and returns a bounding box for each one, given as the left and top coordinates of the box together with its width and height.
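
Drawing libraries often expect corner coordinates rather than left, top, width, and height, so a small conversion is usually needed. The sketch below assumes each detected object carries a `rectangle` with `x`, `y`, `w`, and `h` keys, as in the v3.2 response; the sample objects are invented.

```python
# Illustrative detection results; key names are assumed, values are made up.
sample_objects = [
    {"object": "building", "confidence": 0.82,
     "rectangle": {"x": 10, "y": 20, "w": 300, "h": 400}},
    {"object": "tower", "confidence": 0.71,
     "rectangle": {"x": 150, "y": 0, "w": 60, "h": 380}},
]


def to_corners(rectangle):
    """Convert left/top/width/height into (left, top, right, bottom)."""
    left, top = rectangle["x"], rectangle["y"]
    return (left, top, left + rectangle["w"], top + rectangle["h"])


for item in sample_objects:
    print(item["object"], to_corners(item["rectangle"]))
# building (10, 20, 310, 420)
# tower (150, 0, 210, 380)
```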

3. Brand detection: Computer vision service analyzes images and identifies commercial brands. It compares logos in the image against an existing database of thousands of globally recognized brand logos. If a brand matches, the service returns the brand name, a confidence score from 0 to 1 (0 meaning 0% confidence and 1 meaning 100% confidence), and a bounding box around the area of the image where the brand was detected.
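
Since every match comes with a confidence score, an application will usually keep only the matches above some threshold. Here is a minimal sketch, assuming a `brands` list whose entries have `name`, `confidence`, and `rectangle` keys; the brand names and the 0.5 threshold are placeholders chosen for illustration.

```python
# Illustrative response fragment; shape assumed, names and scores made up.
sample_response = {
    "brands": [
        {"name": "ExampleBrand", "confidence": 0.75,
         "rectangle": {"x": 20, "y": 30, "w": 100, "h": 50}},
        {"name": "OtherBrand", "confidence": 0.25,
         "rectangle": {"x": 200, "y": 40, "w": 80, "h": 40}},
    ]
}


def confident_brands(response, threshold=0.5):
    """Return (name, percent) pairs for brands at or above the threshold."""
    return [
        (brand["name"], f"{brand['confidence']:.0%}")
        for brand in response.get("brands", [])
        if brand["confidence"] >= threshold
    ]


print(confident_brands(sample_response))  # [('ExampleBrand', '75%')]
```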

4. Face detection: Computer vision service detects faces in the image and returns a bounding box around each face it finds.
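
Face results may use different key names from object results. The sketch below assumes each face has a `faceRectangle` with `left`, `top`, `width`, and `height` keys, as in the v3.2 response, and converts it to the `x`/`y`/`w`/`h` form so the same drawing code can handle both. The sample data is invented.

```python
# Illustrative response fragment; key names assumed, values made up.
sample_response = {
    "faces": [
        {"faceRectangle": {"left": 120, "top": 80, "width": 100, "height": 100}},
        {"faceRectangle": {"left": 400, "top": 95, "width": 90, "height": 90}},
    ]
}


def face_boxes(response):
    """Return each face's bounding box as a dict with x, y, w, h keys."""
    boxes = []
    for face in response.get("faces", []):
        rect = face["faceRectangle"]
        boxes.append({"x": rect["left"], "y": rect["top"],
                      "w": rect["width"], "h": rect["height"]})
    return boxes


boxes = face_boxes(sample_response)
print(len(boxes), "faces detected")  # 2 faces detected
```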

Tip: There are many other features as follows:

Recognizing landmarks and celebrities: Recognizes well-known landmarks and celebrities in the image.
Image categorization: Assigns the image to categories based on the objects present in it.
Content moderation: Detects images that contain adult content or violent or gory scenes.
Detecting the image type: Detects the type of image, for example whether it is clip art or a line drawing.
Thumbnail generation: Creates a small version of the input image.
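
In the REST API these features are usually selected through query parameters. The sketch below assumes the v3.2 names (`visualFeatures` values such as `Categories`, `Adult`, and `ImageType`, a `details` parameter for landmarks and celebrities, and a separate `generateThumbnail` operation); verify them against the API version you use. The helper names are hypothetical.

```python
from urllib.parse import urlencode


def analyze_query(features, details=()):
    """Build the query string that selects which analyses to run."""
    params = {"visualFeatures": ",".join(features)}
    if details:
        params["details"] = ",".join(details)
    # safe="," keeps the comma-separated lists readable in the URL.
    return urlencode(params, safe=",")


def thumbnail_url(endpoint, width, height, smart_cropping=True):
    """Build the URL for the (assumed) generateThumbnail operation."""
    query = urlencode({
        "width": width,
        "height": height,
        "smartCropping": "true" if smart_cropping else "false",
    })
    return f"{endpoint.rstrip('/')}/vision/v3.2/generateThumbnail?{query}"


print(analyze_query(["Categories", "Adult", "ImageType"], ["Landmarks", "Celebrities"]))
# visualFeatures=Categories,Adult,ImageType&details=Landmarks,Celebrities
```

Requesting only the features an application needs keeps responses small and avoids paying for analyses that are never used.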