Building a Custom Image Recognition Solution with Azure Cognitive Services

Computer vision has advanced rapidly in recent years, and image recognition is now used across many industries. Deep learning makes it possible to train neural networks that recognize objects, faces, and other patterns in images. Azure Cognitive Services offers pre-built APIs that simplify building a custom image recognition solution. In this tutorial, we will use Azure Cognitive Services to build a solution that detects and classifies objects in a set of images.

Prerequisites

To follow along with this tutorial, you should have a few things already set up:

An Azure account with an active subscription
A Microsoft Azure Storage account
Basic knowledge of a programming language (Python, in this case)
Latest version of Visual Studio Code installed
A Python environment

Set up Azure Cognitive Services

Azure Cognitive Services is a suite of pre-built APIs and SDKs that enable developers to integrate intelligent features into their applications. These APIs can be used to recognize faces, analyze text, and extract insights from images.

To set up Azure Cognitive Services, follow these steps:

Go to the Azure Portal (portal.azure.com) and sign in with your credentials.
Create a new resource group and give it a name.
Click on “Create a resource” and type “Cognitive Services” in the search bar. Click on the result that appears.
Click on “Create” to create a new Cognitive Services resource.
Fill in the required information, such as Subscription, Resource group, Name, Pricing tier (S0 is sufficient for this tutorial), and Location. Click on “Create” when done.
Once the resource is created, navigate to it and select “Keys and Endpoint” from the left-hand menu. Copy the Key1 and Endpoint values; we will need both to call the API later in this tutorial.
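Rather than pasting the key into source files, one option is to keep both values in environment variables and read them at startup. The sketch below assumes two hypothetical variable names, VISION_KEY and VISION_ENDPOINT; Azure does not define them, so pick whatever names suit your project.

```python
import os

def load_credentials(env=None):
    """Return (key, endpoint) read from environment variables.

    VISION_KEY and VISION_ENDPOINT are hypothetical names chosen for this
    sketch; they are not defined by Azure.
    """
    env = os.environ if env is None else env
    key = env.get("VISION_KEY", "").strip()
    endpoint = env.get("VISION_ENDPOINT", "").strip()
    if not key or not endpoint:
        raise RuntimeError("Set VISION_KEY and VISION_ENDPOINT before running.")
    # Drop any trailing slash so that paths can be appended consistently.
    return key, endpoint.rstrip("/")
```

Calling load_credentials() with no arguments reads from the real environment; passing a dictionary makes the function easy to try out without touching your shell settings.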

Set up Azure Storage

Azure Storage is Microsoft's cloud storage service for storing and managing data. We will use it to store the images that we will use to train our model.

To set up Azure Storage, follow these steps:

Go to the Azure Portal (portal.azure.com) and sign in with your credentials.
Create a new resource group and give it a name, or reuse the one you created for Cognitive Services.
Click on “Create a resource” and search for “Storage Account”. Click on the result that appears.
Click on “Create” to create a new Storage Account.
Fill in the required information, such as Subscription, Resource group, Name, and Location. You can keep the rest of the settings as default. Click on “Review + create”.
Review the details and click on “Create” to create the storage account.
Once the storage account is created, click on “Containers” on the left-hand menu, and create a new container to hold the images that we will use to train the model.
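Container names are restricted: they must be 3 to 63 characters long, use only lowercase letters, digits, and hyphens, start and end with a letter or digit, and contain no consecutive hyphens. A minimal sketch of a check you could run before creating the container follows; the function name is hypothetical and not part of any Azure SDK.

```python
import re

# Lowercase letters, digits and hyphens; 3-63 characters; starts and ends
# with a letter or digit; no two hyphens in a row.
_CONTAINER_NAME = re.compile(r"(?!.*--)[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")

def is_valid_container_name(name):
    """Return True if name satisfies the container naming rules above."""
    return _CONTAINER_NAME.fullmatch(name) is not None
```

For example, "training-images" passes the check, while "Training_Images" does not because of the uppercase letters and the underscore.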

Upload the Images to Azure Storage

Now that we have set up Azure Storage, we can upload the images that we will use to train our model. For this tutorial, we will be using the “Pascal VOC” dataset that contains images of objects from 20 different categories. You can download the dataset from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/.
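The dataset also ships an XML annotation file for each image (in its Annotations folder) that records the image size and a pixel bounding box for every labeled object. As a possible starting point for inspecting those labels, the sketch below parses one annotation and converts each box to left, top, width, and height values expressed as fractions of the image size. The function name voc_boxes is hypothetical, and the code assumes the standard size and object/bndbox elements of the VOC format.

```python
import xml.etree.ElementTree as ET

def voc_boxes(xml_text):
    """Parse one Pascal VOC annotation.

    Returns a list of (label, left, top, width, height) tuples where the
    four numbers are fractions of the image width or height (0 to 1).
    """
    root = ET.fromstring(xml_text)
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    boxes = []
    for obj in root.findall("object"):
        label = obj.findtext("name")
        box = obj.find("bndbox")
        xmin = float(box.findtext("xmin"))
        ymin = float(box.findtext("ymin"))
        xmax = float(box.findtext("xmax"))
        ymax = float(box.findtext("ymax"))
        boxes.append((
            label,
            xmin / img_w,
            ymin / img_h,
            (xmax - xmin) / img_w,
            (ymax - ymin) / img_h,
        ))
    return boxes
```

To use it, read an annotation file as text and pass the contents to voc_boxes; each returned tuple describes one labeled object.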

To upload the images to Azure Storage, follow these steps:

Download the “Pascal VOC” dataset and extract the files to a folder.
Open Visual Studio Code and create a new Python file.
Install the Azure Storage Blob package by opening a new terminal window and running the following command:

pip install azure-storage-blob

Import the necessary libraries in your Python file:

import os
from azure.storage.blob import BlobServiceClient

Set up the BlobServiceClient to access your Azure Storage account by replacing the connection string placeholders in the code below:

connect_str = "DefaultEndpointsProtocol=https;AccountName=[Name];AccountKey=[Key];EndpointSuffix=core.windows.net"
blob_service_client = BlobServiceClient.from_connection_string(connect_str)

Create a ContainerClient object by replacing the container name placeholder in the code below:

container_client = blob_service_client.get_container_client("[Container name]")

Upload the images to your Azure Storage account by running the following code:

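path_to_images = "path/to/images/folder"
for filename in os.listdir(path_to_images):
    file_path = os.path.join(path_to_images, filename)
    # Skip subfolders; os.listdir returns directories as well as files
    if not os.path.isfile(file_path):
        continue
    with open(file_path, "rb") as data:
        container_client.upload_blob(name=filename, data=data)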
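A variation on the loop above restricts the upload to common image extensions and can be re-run without failing on blobs that already exist. This is a sketch under assumptions: the extension list and both function names are illustrative, and container_client is expected to be the ContainerClient created earlier.

```python
import os

# Extensions treated as images in this sketch; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def list_image_files(folder):
    """Return the sorted names of regular files in folder with an image extension."""
    names = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        extension = os.path.splitext(name)[1].lower()
        if os.path.isfile(path) and extension in IMAGE_EXTENSIONS:
            names.append(name)
    return sorted(names)

def upload_images(container_client, folder):
    """Upload every image in folder and return the list of uploaded names."""
    uploaded = []
    for name in list_image_files(folder):
        with open(os.path.join(folder, name), "rb") as data:
            # overwrite=True replaces an existing blob that has the same name
            container_client.upload_blob(name=name, data=data, overwrite=True)
        uploaded.append(name)
    return uploaded
```

Calling upload_images(container_client, path_to_images) would then replace the plain loop.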

Train the Custom Vision Model

Azure Cognitive Services offers a Custom Vision API that allows you to train your own custom image recognition model. To train the model, we will use the images that we uploaded to Azure Storage in the previous step.

To train the Custom Vision Model, follow these steps:

Go to the Custom Vision website (https://www.customvision.ai/) and sign in with your Azure account credentials.
Click on “Create new project” and give it a name and description.
Choose “Object Detection” as the project type and “General (compact)” as the domain.
Select the Azure resource that the project will use. You can pick the Cognitive Services resource created earlier or create a new one, which asks for a subscription, resource group, and location.
You will now be taken to the “Images” tab. Click on “Add images” and select the training images from your local folder. The dialog browses files on your computer, so pick the same files that you uploaded to Azure Storage.
After the images have been uploaded, you can start tagging them by drawing bounding boxes around the objects in the images and assigning them to the appropriate categories.
Once you have tagged the images, click on the “Train” button to start training the model.
The Custom Vision API will now start training the model using the tagged images. This may take several minutes depending on the number of images and the complexity of the model.
Once the model has finished training, you can test it by clicking on the “Quick Test” button and uploading an image to see if the model can detect and classify the objects in the image.

Using the Model in Your Application

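Now that we have trained the custom image recognition model, we can use it in our application through the Custom Vision prediction API. The model must be published before it can be called: on the “Performance” tab, select the trained iteration and click “Publish”. The “Prediction URL” button then shows the endpoint and key for the published iteration.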

To use the model, follow these steps:

Create a new Python file in Visual Studio Code.
Import the necessary libraries:

import requests
import json

Set up the API endpoint and subscription key by replacing the placeholders in the code below with the values that you copied from the Azure portal:

subscription_key = "[Subscription key]"
endpoint = "[Endpoint]"

Create a function that will make a POST request to the Custom Vision API, passing in the image file. The API will return a JSON response containing the predicted objects and their coordinates.

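# Placeholders: both values appear in the prediction URL shown by the Custom Vision portal after publishing
project_id = "[Project ID]"
published_name = "[Published iteration name]"

def predict_image(image_path):
    # Custom Vision object detection endpoint for a published iteration
    url = (endpoint.rstrip("/") + "/customvision/v3.0/Prediction/" + project_id
           + "/detect/iterations/" + published_name + "/image")
    headers = {"Prediction-Key": subscription_key, "Content-Type": "application/octet-stream"}
    with open(image_path, "rb") as image_file:
        data = image_file.read()
    response = requests.post(url, headers=headers, data=data)
    response.raise_for_status()
    return response.json()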
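An object detection response usually lists many candidate boxes, most of them with a low score. The sketch below keeps only the confident ones. It assumes that each entry in "predictions" carries a "probability" field between 0 and 1, as the Custom Vision prediction response does; the function name and the 0.5 default threshold are arbitrary choices made for illustration.

```python
def confident_predictions(results, threshold=0.5):
    """Return predictions with probability >= threshold, highest probability first."""
    kept = [
        prediction
        for prediction in results.get("predictions", [])
        if prediction.get("probability", 0.0) >= threshold
    ]
    return sorted(kept, key=lambda prediction: prediction["probability"], reverse=True)
```

A higher threshold gives fewer false detections at the cost of missing some objects, so it is worth trying a few values on your own images.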

Create a function that will display the predicted objects and their coordinates on the original image.

import matplotlib.pyplot as plt
import matplotlib.patches as patches

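def display_results(image_path, results):
    img = plt.imread(image_path)
    img_height, img_width = img.shape[:2]
    fig, ax = plt.subplots(1)
    ax.imshow(img)

    for prediction in results["predictions"]:
        bbox = prediction["boundingBox"]
        # Bounding box values are fractions of the image size (0 to 1); scale them to pixels
        left, top = bbox["left"] * img_width, bbox["top"] * img_height
        width, height = bbox["width"] * img_width, bbox["height"] * img_height
        rect = patches.Rectangle((left, top), width, height, linewidth=2, edgecolor='r', facecolor='none')
        ax.add_patch(rect)
        ax.text(left, top, prediction["tagName"], fontsize=8, color='w')

    plt.show()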

Call the functions to make a prediction on an image and display the results.

image_path = "path/to/image.jpg"
results = predict_image(image_path)
display_results(image_path, results)
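When a plot is not convenient, for example in a script that runs on a server, the same results can be printed as text. The sketch below assumes that the bounding box values are fractions of the image size, as returned by the Custom Vision prediction API, and that the image width and height in pixels are known; both function names are hypothetical.

```python
def to_pixel_box(bbox, image_width, image_height):
    """Convert a normalized boundingBox dict to pixel (left, top, width, height)."""
    return (
        round(bbox["left"] * image_width),
        round(bbox["top"] * image_height),
        round(bbox["width"] * image_width),
        round(bbox["height"] * image_height),
    )

def summarize(results, image_width, image_height):
    """Return one human-readable line per prediction."""
    lines = []
    for prediction in results.get("predictions", []):
        left, top, width, height = to_pixel_box(
            prediction["boundingBox"], image_width, image_height
        )
        probability = prediction.get("probability", 0.0)
        lines.append(
            f"{prediction['tagName']}: {probability:.0%} "
            f"at ({left}, {top}) size {width}x{height}"
        )
    return lines
```

Printing the lines returned by summarize(results, image_width, image_height) gives a quick overview of what the model detected.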

Conclusion

In this tutorial, we learned how to build a custom image recognition solution with Azure Cognitive Services. We set up Azure Cognitive Services and Azure Storage, trained a custom image recognition model with the Custom Vision API, and used the model in our application with the Azure Cognitive Services API. With the help of Azure Cognitive Services, we can quickly and easily build intelligent features into our applications without having to start from scratch.