Create an Azure Data Explorer cluster and database by using Python
In this article, you create an Azure Data Explorer cluster and database by using Python. Azure Data Explorer is a fast, fully managed data analytics service for real-time analysis on large volumes of data streaming from applications, websites, IoT devices, and more. To use Azure Data Explorer, first create a cluster, and create one or more databases in that cluster. Then ingest, or load, data into a database so that you can run queries against it.
An Azure account with an active subscription. Create one for free.
An Azure AD application and service principal that can access resources. Get values for the Directory (tenant) ID, Application ID, and Client Secret.
Install Python package
To install the Python package for Azure Data Explorer (Kusto), open a command prompt that has Python in its path. Run this command:
    pip install azure-common
    pip install azure-mgmt-kusto
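Before running the later examples, it can help to confirm that both packages are importable from the interpreter you plan to use. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, and the module names are the ones imported later in this article.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example `azure`) is itself missing.
        return False


for name in ("azure.common", "azure.mgmt.kusto"):
    print(name, "installed" if is_installed(name) else "missing")
```

If either line reports `missing`, rerun the `pip install` commands in the same environment.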
To run the examples in this article, you need an Azure AD application and service principal that can access resources. See create an Azure AD application to create a free Azure AD application and add a role assignment at the subscription scope. That article also shows how to get the Directory (tenant) ID, Application ID, and Client Secret.
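The examples in this article place the tenant ID, application ID, and client secret directly in the script. One possible alternative, sketched below, reads them from environment variables so the secret stays out of source files. The variable names and the `load_settings` helper are assumptions chosen for illustration; they are not part of the Azure SDK.

```python
import os

# Hypothetical variable names; any consistent naming scheme would work.
REQUIRED_VARS = {
    "tenant_id": "AZURE_TENANT_ID",
    "client_id": "AZURE_CLIENT_ID",
    "client_secret": "AZURE_CLIENT_SECRET",
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
}


def load_settings(environ=None):
    """Collect the four values from a mapping (defaults to os.environ)."""
    environ = os.environ if environ is None else environ
    missing = [var for var in REQUIRED_VARS.values() if not environ.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[var] for key, var in REQUIRED_VARS.items()}
```

The returned dictionary can then supply `tenant_id`, `client_id`, `client_secret`, and `subscription_id` in place of the hard-coded strings.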
Create the Azure Data Explorer cluster
Create your cluster by using the following command:
    from azure.mgmt.kusto import KustoManagementClient
    from azure.mgmt.kusto.models import Cluster, AzureSku
    from azure.common.credentials import ServicePrincipalCredentials

    # Directory (tenant) ID
    tenant_id = "xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxx"
    # Application ID
    client_id = "xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxx"
    # Client Secret
    client_secret = "xxxxxxxxxxxxxx"
    subscription_id = "xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxx"
    credentials = ServicePrincipalCredentials(
        client_id=client_id,
        secret=client_secret,
        tenant=tenant_id
    )

    location = 'Central US'
    sku_name = 'Standard_D13_v2'
    capacity = 5
    tier = "Standard"
    resource_group_name = 'testrg'
    cluster_name = 'mykustocluster'
    cluster = Cluster(location=location, sku=AzureSku(name=sku_name, capacity=capacity, tier=tier))

    kustoManagementClient = KustoManagementClient(credentials, subscription_id)

    cluster_operations = kustoManagementClient.clusters

    # Starts the long-running create operation and returns a poller
    poller = cluster_operations.create_or_update(resource_group_name, cluster_name, cluster)
| Setting | Suggested value | Field description |
|---|---|---|
| cluster_name | mykustocluster | The desired name of your cluster. |
| sku_name | Standard_D13_v2 | The SKU that will be used for your cluster. |
| tier | Standard | The SKU tier. |
| capacity | number | The number of instances of the cluster. |
| resource_group_name | testrg | The resource group name where the cluster will be created. |
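A cluster name that the service rejects only fails after the request is sent, so a local check can save a round trip. The sketch below assumes the naming rule is 4 to 22 characters, lowercase letters and digits only, starting with a letter; that rule is not stated in this article, so verify it against the current Azure naming documentation. `is_valid_cluster_name` is a hypothetical helper.

```python
import re

# Assumed rule: starts with a lowercase letter, then 3-21 lowercase letters or digits.
_CLUSTER_NAME = re.compile(r"[a-z][a-z0-9]{3,21}")


def is_valid_cluster_name(name):
    """Return True if `name` satisfies the assumed cluster naming rule."""
    return _CLUSTER_NAME.fullmatch(name) is not None


print(is_valid_cluster_name("mykustocluster"))
```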
Creating a cluster is a long-running operation. The create_or_update method returns an instance of LROPoller; see the LROPoller class for more information.
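Because the call returns before the cluster exists, a script usually needs to block until the operation finishes. The following sketch waits on any poller that exposes `done()`, `wait(timeout)`, and `result()`, which `LROPoller` provides. The helper name `wait_for_operation` and the default timeout and interval are assumptions, not recommended values.

```python
import time


def wait_for_operation(poller, timeout_seconds=1800, interval_seconds=30):
    """Block until the poller reports completion, then return its result.

    Raises TimeoutError if the operation is still running after timeout_seconds.
    """
    deadline = time.monotonic() + timeout_seconds
    while not poller.done():
        if time.monotonic() >= deadline:
            raise TimeoutError("Operation did not finish within the timeout")
        # wait() returns after interval_seconds or when the operation completes.
        poller.wait(interval_seconds)
    return poller.result()
```

With the `poller` returned by `create_or_update`, a call such as `cluster = wait_for_operation(poller)` would return the created cluster resource once provisioning completes.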
Run the following command to check whether your cluster was successfully created:
    cluster_operations.get(resource_group_name=resource_group_name, cluster_name=cluster_name, custom_headers=None, raw=False)
If the result contains provisioningState with the Succeeded value, then the cluster was successfully created.
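The same check can be done in code rather than by reading the printed result. This sketch assumes the object returned by `cluster_operations.get` exposes the state as a `provisioning_state` attribute (the Python-style spelling of provisioningState); `is_cluster_ready` is a hypothetical helper.

```python
def is_cluster_ready(cluster):
    """Return True if the cluster resource reports a Succeeded provisioning state."""
    # Assumption: the model exposes the state as `provisioning_state`.
    return getattr(cluster, "provisioning_state", None) == "Succeeded"
```

For example, `is_cluster_ready(cluster_operations.get(resource_group_name, cluster_name))` would evaluate to `True` only after creation has finished successfully.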
Create the database in the Azure Data Explorer cluster
Create your database by using the following command:
    from azure.mgmt.kusto.models import ReadWriteDatabase
    from datetime import timedelta

    softDeletePeriod = timedelta(days=3650)
    hotCachePeriod = timedelta(days=3650)
    databaseName = "mykustodatabase"

    database_operations = kustoManagementClient.databases
    _database = ReadWriteDatabase(location=location, soft_delete_period=softDeletePeriod, hot_cache_period=hotCachePeriod)

    # Returns an instance of LROPoller, see https://docs.microsoft.com/python/api/msrest/msrest.polling.lropoller?view=azure-python
    poller = database_operations.create_or_update(resource_group_name=resource_group_name, cluster_name=cluster_name, database_name=databaseName, parameters=_database)
[!NOTE] If you are using version 0.4.0 or earlier of the azure-mgmt-kusto Python package, use Database instead of ReadWriteDatabase.
| Setting | Suggested value | Field description |
|---|---|---|
| cluster_name | mykustocluster | The name of your cluster where the database will be created. |
| database_name | mykustodatabase | The name of your database. |
| resource_group_name | testrg | The resource group name where the cluster will be created. |
| soft_delete_period | 3650 days, 0:00:00 | The amount of time that data will be kept available to query. |
| hot_cache_period | 3650 days, 0:00:00 | The amount of time that data will be kept in cache. |
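The suggested values for the two periods are the string form of a Python `timedelta`. The short example below shows how `timedelta(days=3650)` produces that text and how to read the period back in days; it uses only the standard library.

```python
from datetime import timedelta

soft_delete_period = timedelta(days=3650)
hot_cache_period = timedelta(days=3650)

print(soft_delete_period)        # 3650 days, 0:00:00
print(hot_cache_period.days)     # 3650
```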
Run the following command to see the database that you created:
    database_operations.get(resource_group_name=resource_group_name, cluster_name=cluster_name, database_name=databaseName)
You now have a cluster and a database.
Clean up resources
If you plan to follow our other articles, keep the resources you created.
To clean up resources, delete the cluster. When you delete a cluster, it also deletes all the databases in it. Use the following command to delete your cluster:
    cluster_operations.delete(resource_group_name=resource_group_name, cluster_name=cluster_name)
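Deletion is likely to take some time as well. The sketch below assumes that `delete`, like `create_or_update`, returns an LROPoller-style object with `wait()` and `done()`; confirm this for the package version you have installed. `delete_cluster_and_wait` is a hypothetical helper.

```python
def delete_cluster_and_wait(cluster_operations, resource_group_name, cluster_name, timeout_seconds=None):
    """Start the delete operation, wait for it, and report whether it finished."""
    # Assumption: delete() returns a poller with wait() and done().
    poller = cluster_operations.delete(
        resource_group_name=resource_group_name,
        cluster_name=cluster_name,
    )
    poller.wait(timeout_seconds)
    return poller.done()
```

A `False` return value means the wait timed out while the deletion was still in progress.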