Real-time analytics with Azure Data Explorer

Azure Data Explorer (ADX) is a fast, reliable, and highly scalable real-time analytics platform provided by Microsoft. It’s designed to collect, analyze, and visualize massive volumes of data in real time. This tutorial walks you through the steps to set up a simple real-time analytics pipeline using Azure Data Explorer.

Prerequisites

Before you begin, you’ll need to have the following:

  • An Azure account with sufficient permissions to create resources. You can sign up for a free trial account on the Azure website.
  • The Azure CLI installed on your machine. You can install it by following Microsoft’s Azure CLI installation instructions.
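
Depending on your Azure CLI version, the az kusto command group may ship as a separate extension rather than being built in. Before running the commands below, sign in and, if the kusto commands aren’t available, install the extension:

az login
az extension add --name kusto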

Create an Azure Data Explorer Cluster

The first step in setting up your real-time analytics pipeline is to create an Azure Data Explorer cluster. A cluster is a collection of Azure Data Explorer resources that work together to enable real-time data processing.

To create a new cluster, start by logging in to your Azure account and opening a new terminal window. Then, use the following command to create a new Azure Data Explorer cluster.

az kusto cluster create --resource-group <resource-group> --name <cluster-name> --sku D13_v2 --capacity 2
  • Replace <resource-group> with the name of the resource group in which you want to create the cluster.
  • Replace <cluster-name> with the name you want to give your new cluster.

In this example, we’re creating a cluster with the D13_v2 SKU, which is suitable for moderate workloads, and the capacity parameter determines how many nodes are provisioned for the cluster. Data retention isn’t set at the cluster level; it’s configured per database with a retention policy, which we’ll come back to in the next section. Depending on your Azure CLI version, the kusto command group and the exact --sku syntax can vary, so check az kusto cluster create --help if the command is rejected.

It can take several minutes for the cluster to be provisioned. Once the cluster is created, you can use the following command to check its status.

az kusto cluster show --resource-group <resource-group> --name <cluster-name>
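
If you only want the provisioning state, you can filter the output with a JMESPath query. The property name below is an assumption about the command’s output shape, so adjust it if your CLI version reports it differently:

az kusto cluster show --resource-group <resource-group> --name <cluster-name> --query "provisioningState" --output tsv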

Create an Azure Data Explorer Database

The next step is to create a database within your Azure Data Explorer cluster. A database is a logical container for your data and enables you to manage access, retention, and querying of your data.

To create a new database, use the following command.

az kusto database create --resource-group <resource-group> --cluster-name <cluster-name> --name <database-name>
  • Replace <resource-group> with the name of the resource group that contains your cluster.
  • Replace <cluster-name> with the name of your Azure Data Explorer cluster.
  • Replace <database-name> with the name you want to give your new database.

Once the database is created, you can verify its status with the following command.

az kusto database show --resource-group <resource-group> --cluster-name <cluster-name> --name <database-name>
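
As noted earlier, data retention is configured per database rather than per cluster. One way to set it is with a retention policy, run from the Azure Data Explorer web UI (https://dataexplorer.azure.com) or the query page covered later in this tutorial. A minimal sketch that keeps ingested data for 30 days:

.alter-merge database <database-name> policy retention softdelete = 30d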

Ingest Data into Azure Data Explorer

With the Azure Data Explorer cluster and database created, let’s start ingesting data. There are several ways you can ingest data into Azure Data Explorer, including Azure Event Hubs, Azure Blob Storage, and Azure Data Factory. For this tutorial, we’ll use Event Grid to trigger an Azure Function, which will then ingest data into our Azure Data Explorer database.
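
Before any data can be ingested, the target table has to exist in the database. The sketch below creates a simple telemetry table whose schema matches the query we’ll write later in this tutorial; the column names and types are assumptions, so adjust them to the shape of your own events. Run it from the Azure Data Explorer query window:

.create table telemetry (vehicleId: string, eventTime: datetime, payload: dynamic)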

Create an Azure Function

First, we need to create a function app that will ingest data into our Azure Data Explorer database.

To create a new function app, follow these steps:

  1. Open a new terminal window and navigate to a local directory where your Function code will be stored.
  2. Use the following command to create a new function app.
az functionapp create --name <function-name> --resource-group <resource-group> --storage-account <storage-account> --consumption-plan-location eastus --runtime dotnet
  • Replace <function-name> with the name of your new Function app.
  • Replace <resource-group> with the name of the resource group in which you want to create the Function app.
  • Replace <storage-account> with the name of an existing storage account in that resource group; every Function app requires one for its internal state.

This command creates a new Function app in the eastus region with the dotnet runtime. It can take several minutes for the Function app to be provisioned.
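
If you don’t already have a storage account to pass to --storage-account, you can create one first, for example:

az storage account create --name <storage-account> --resource-group <resource-group> --location eastus --sku Standard_LRS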

  3. Once the Function app is provisioned, create the function project and an HTTP-triggered function locally. Individual functions are typically created in your project rather than through the az CLI, so this step uses the Azure Functions Core Tools (func); install them first if you don’t have them.
func init --worker-runtime dotnet
func new --name <http-function-name> --template "HTTP trigger"
  • Replace <http-function-name> with the name of your new HTTP-triggered function.

These commands scaffold a .NET function project in the current directory and add a new function that’s triggered by an HTTP request. This is the function that will receive events and ingest them into Azure Data Explorer.

  4. Next, open your Function app in the Azure Portal, go to the “Deployment Center” option, and follow the prompts to deploy your code to Azure.
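
Alternatively, if you scaffolded the project with the Core Tools as shown above, you can deploy it straight from the same terminal:

func azure functionapp publish <function-name>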

Create an Azure Event Grid Topic

With our Function app created, let’s create an Event Grid topic to trigger our Function when new events are published.

To create a new Event Grid topic, follow these steps:

  1. Open a new terminal window (or use Cloud Shell in the Azure Portal).
  2. Use the following command to create a new Event Grid topic.

az eventgrid topic create --name <topic-name> --resource-group <resource-group>
  • Replace <topic-name> with the name of your new Event Grid topic.
  • Replace <resource-group> with the name of the resource group in which you want to create the Event Grid topic.

This command creates a new Event Grid topic.

  3. Next, use the following command to retrieve the endpoint URL for your new Event Grid topic.
az eventgrid topic show --name <topic-name> --resource-group <resource-group> --query "endpoint" --output tsv
  4. Copy the endpoint URL to your clipboard.
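
If you later want to publish test events to the topic yourself, you’ll also need one of its access keys, which you can retrieve with:

az eventgrid topic key list --name <topic-name> --resource-group <resource-group> --query "key1" --output tsv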

Create an Azure Event Grid Subscription

The last step in setting up our Event Grid pipeline is to create a new subscription that triggers our Function app when new events are published to our Event Grid topic.

To create a new subscription, follow these steps:

  1. Navigate back to your Function app, open the function you created, and go to its integration settings (the “Integrate” or “Integration” tab, depending on the portal version).
  2. Add a new trigger and select “Azure Event Grid” from the list of available triggers; Event Grid events arrive through a trigger, not an input binding.
  3. Enter the name of your new subscription in the “Subscription Name” field.
  4. Paste the endpoint URL for your Event Grid topic into the “Event Grid Topic Endpoint Url” field.
  5. Choose the desired options for your subscription. For this tutorial, we’ll use the default options.
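
If you prefer to stay in the CLI, the same subscription can be created with az eventgrid event-subscription create. This is a sketch: <function-endpoint-url> stands for the public URL of your HTTP-triggered function (including its function key), which is an assumption about how you exposed the function, and a plain webhook endpoint must answer Event Grid’s subscription validation handshake before the subscription becomes active.

az eventgrid event-subscription create --name <subscription-name> --source-resource-id $(az eventgrid topic show --name <topic-name> --resource-group <resource-group> --query "id" --output tsv) --endpoint <function-endpoint-url> --endpoint-type webhook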

Query Data in Azure Data Explorer

With our real-time analytics pipeline fully operational, let’s take a look at how we can query data in Azure Data Explorer. Azure Data Explorer provides a powerful query language called KQL (Kusto Query Language) that enables us to perform complex queries on our data.

Navigate to the Query Portal

To navigate to the Azure Data Explorer query portal, follow these steps:

  1. Navigate to the Azure Portal and open your Azure Data Explorer cluster.
  2. Go to the “Query” tab (labeled “Data Explorer” in some portal versions).
  3. Write queries there directly, or open the standalone web UI at https://dataexplorer.azure.com and connect it to your cluster.

Create a Simple Query

Once you’re in the query portal, you can start writing queries to retrieve and analyze your data.

For example, let’s say our Function app is ingesting telemetry data from a fleet of vehicles into a table named telemetry. We can use the following query to retrieve the number of messages received from each vehicle in the fleet.

telemetry
| summarize count() by vehicleId
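
You can build on this with other KQL operators. For example, to count messages per vehicle over the last hour only (assuming the table has an eventTime column, as in the table sketched earlier) and sort by the busiest vehicles:

telemetry
| where eventTime > ago(1h)
| summarize messages = count() by vehicleId
| order by messages desc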

Analyze Data with Visualizations

Azure Data Explorer also enables you to create interactive visualizations of your data from the query results pane in the web UI. To create a new visualization, follow these steps:

  1. Write your query and ensure that it returns the data you want to visualize.
  2. Run the query, then open the visualization options on the results pane (the exact button label varies between UI versions).
  3. Select the desired chart type for your visualization and configure the chart properties as desired.
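
You can also produce a chart directly from KQL with the render operator, without going through the UI at all. For example, to chart message counts per vehicle as a column chart:

telemetry
| summarize messages = count() by vehicleId
| render columnchart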

Conclusion

In this tutorial, we’ve seen how to set up a simple real-time analytics pipeline using Azure Data Explorer. We started by creating an Azure Data Explorer cluster and database, and then ingested data into our database using Azure Functions and Event Grid. Finally, we explored how to retrieve and analyze our data using Azure Data Explorer’s powerful query language and visualization tools.

With Azure Data Explorer, you can easily scale your real-time analytics to handle millions of data points per second, so you can make critical business decisions in real time.
