Create an Azure Data Explorer cluster and database by using Python

In this article, you create an Azure Data Explorer cluster and database by using Python. Azure Data Explorer is a fast, fully managed data analytics service for real-time analysis of large volumes of data streaming from applications, websites, IoT devices, and more. To use Azure Data Explorer, first create a cluster, and then create one or more databases in that cluster. Then ingest (load) data into a database so that you can run queries against it.

Prerequisites

Install Python package

To install the Python packages for Azure Data Explorer (Kusto), open a command prompt that has Python in its path. Run these commands:

pip install azure-common
pip install azure-mgmt-kusto
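
To confirm that both packages were installed, and to see which version pip resolved, a short check along these lines may help. This is a minimal sketch that uses only the standard library's `importlib.metadata` (Python 3.8 or later); the helper name `installed_version` is hypothetical and not part of any Azure package.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the two packages that this article installs.
    for name in ("azure-common", "azure-mgmt-kusto"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

Knowing the installed version of azure-mgmt-kusto is also useful later, because the model class used to create a database depends on the package version.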

Authentication

To run the examples in this article, you need an Azure AD application and service principal that can access resources. See "create an Azure AD application" to create a free Azure AD application and add a role assignment at the subscription scope. That article also shows how to get the Directory (tenant) ID, Application ID, and Client Secret.
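
Rather than pasting the tenant ID, application ID, and client secret into a script, you may prefer to read them from environment variables. The sketch below assumes the variable names `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, and `AZURE_SUBSCRIPTION_ID`; these names are a convention chosen for illustration, and the helper `load_service_principal_settings` is hypothetical.

```python
import os

# Assumed variable names; the article itself does not define them.
REQUIRED_VARIABLES = {
    "tenant_id": "AZURE_TENANT_ID",
    "client_id": "AZURE_CLIENT_ID",
    "client_secret": "AZURE_CLIENT_SECRET",
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
}


def load_service_principal_settings(environ=None):
    """Read the service principal settings, failing fast if any value is missing."""
    environ = os.environ if environ is None else environ
    missing = [var for var in REQUIRED_VARIABLES.values() if not environ.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    # Map each setting name to the value of its environment variable.
    return {key: environ[var] for key, var in REQUIRED_VARIABLES.items()}


# Usage (requires the variables above to be set):
# settings = load_service_principal_settings()
# tenant_id = settings["tenant_id"]
```

The returned values can then be assigned to `tenant_id`, `client_id`, `client_secret`, and `subscription_id` in the cluster creation example.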

Create the Azure Data Explorer cluster

  1. Create your cluster by using the following command:

    from azure.mgmt.kusto import KustoManagementClient
    from azure.mgmt.kusto.models import Cluster, AzureSku
    from azure.common.credentials import ServicePrincipalCredentials
    
    #Directory (tenant) ID
    tenant_id = "xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxx"
    #Application ID
    client_id = "xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxx"
    #Client Secret
    client_secret = "xxxxxxxxxxxxxx"
    subscription_id = "xxxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxx"
    credentials = ServicePrincipalCredentials(
        client_id=client_id,
        secret=client_secret,
        tenant=tenant_id
    )
    
    location = 'Central US'
    sku_name = 'Standard_D13_v2'
    capacity = 5
    tier = "Standard"
    resource_group_name = 'testrg'
    cluster_name = 'mykustocluster'
    cluster = Cluster(location=location, sku=AzureSku(name=sku_name, capacity=capacity, tier=tier))
    
    kustoManagementClient = KustoManagementClient(credentials, subscription_id)
    
    cluster_operations = kustoManagementClient.clusters
    
    poller = cluster_operations.create_or_update(resource_group_name, cluster_name, cluster)
    
    | Setting | Suggested value | Field description |
    |---|---|---|
    | cluster_name | mykustocluster | The desired name of your cluster. |
    | sku_name | Standard_D13_v2 | The SKU that will be used for your cluster. |
    | tier | Standard | The SKU tier. |
    | capacity | number | The number of instances of the cluster. |
    | resource_group_name | testrg | The resource group name where the cluster will be created. |

    Note

    Creating a cluster is a long-running operation. The create_or_update method returns an instance of LROPoller; see the LROPoller class for more information.

  2. Run the following command to check whether your cluster was successfully created:

    cluster_operations.get(resource_group_name=resource_group_name, cluster_name=cluster_name, custom_headers=None, raw=False)
    

If the result contains provisioningState with the Succeeded value, the cluster was successfully created.
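
Because create_or_update returns before the cluster exists, a script usually has to wait on the poller before it checks the provisioning state. The following is a minimal sketch of that wait-then-check pattern. It assumes only that the poller offers `done()` and `result()`, as the msrest LROPoller does, and that the Python model exposes the state as a `provisioning_state` attribute. The helper names `wait_for_poller` and `is_provisioned` are hypothetical.

```python
import time


def wait_for_poller(poller, poll_interval=30.0, max_wait=3600.0, sleep=time.sleep):
    """Poll until the long-running operation finishes, then return its result."""
    waited = 0.0
    while not poller.done():
        if waited >= max_wait:
            raise TimeoutError(f"Operation did not finish within {max_wait} seconds")
        sleep(poll_interval)
        waited += poll_interval
    return poller.result()


def is_provisioned(resource):
    """Return True when the resource reports a provisioning state of 'Succeeded'."""
    # Assumption: the SDK model names the attribute provisioning_state.
    return getattr(resource, "provisioning_state", None) == "Succeeded"


# Usage with the poller returned by cluster_operations.create_or_update:
# cluster = wait_for_poller(poller)
# print(is_provisioned(cluster))
```

Calling `poller.result()` directly also blocks until the operation completes; the explicit loop is shown only because it makes it easy to add a time limit or progress logging.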

Create the database in the Azure Data Explorer cluster

  1. Create your database by using the following command:

    from azure.mgmt.kusto.models import ReadWriteDatabase
    from datetime import timedelta
    
    # How long data stays queryable, and how long it stays in the hot cache
    softDeletePeriod = timedelta(days=3650)
    hotCachePeriod = timedelta(days=3650)
    databaseName = "mykustodatabase"
    
    # Reuses kustoManagementClient, location, resource_group_name, and cluster_name from the cluster example
    database_operations = kustoManagementClient.databases
    _database = ReadWriteDatabase(location=location,
                        soft_delete_period=softDeletePeriod,
                        hot_cache_period=hotCachePeriod)
    
    #Returns an instance of LROPoller, see https://docs.microsoft.com/python/api/msrest/msrest.polling.lropoller?view=azure-python
    poller = database_operations.create_or_update(resource_group_name=resource_group_name, cluster_name=cluster_name, database_name=databaseName, parameters=_database)
    
    Note

    If you are using version 0.4.0 or below of the Python package, use Database instead of ReadWriteDatabase.
    
    | Setting | Suggested value | Field description |
    |---|---|---|
    | cluster_name | mykustocluster | The name of your cluster where the database will be created. |
    | database_name | mykustodatabase | The name of your database. |
    | resource_group_name | testrg | The resource group name where the cluster will be created. |
    | soft_delete_period | 3650 days, 0:00:00 | The amount of time that data will be kept available to query. |
    | hot_cache_period | 3650 days, 0:00:00 | The amount of time that data will be kept in cache. |

  2. Run the following command to see the database that you created:

    database_operations.get(resource_group_name=resource_group_name, cluster_name=cluster_name, database_name=databaseName)
    

You now have a cluster and a database.
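
The soft_delete_period and hot_cache_period values in the settings table are `datetime.timedelta` objects, which is why they display as `3650 days, 0:00:00`. As a small illustration, the sketch below builds both values from day counts; the helper name `retention_settings` is hypothetical, and the 31-day hot cache is only an example value, not a recommendation from this article.

```python
from datetime import timedelta


def retention_settings(soft_delete_days, hot_cache_days):
    """Build the two retention keyword arguments as timedelta values."""
    return {
        "soft_delete_period": timedelta(days=soft_delete_days),
        "hot_cache_period": timedelta(days=hot_cache_days),
    }


if __name__ == "__main__":
    settings = retention_settings(soft_delete_days=3650, hot_cache_days=31)
    # str() of a timedelta gives the "N days, H:MM:SS" form seen in the table.
    print(settings["soft_delete_period"])  # 3650 days, 0:00:00
    print(settings["hot_cache_period"])    # 31 days, 0:00:00
```

The resulting dictionary can be unpacked into the database model's constructor alongside `location`, for example `ReadWriteDatabase(location=location, **settings)`.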

Clean up resources

  • If you plan to follow our other articles, keep the resources you created.

  • To clean up resources, delete the cluster. Deleting a cluster also deletes all the databases in it. Use the following command to delete your cluster:

    cluster_operations.delete(resource_group_name=resource_group_name, cluster_name=cluster_name)
    

Next steps