Manage Azure Data Lake Analytics using a .NET app

This article describes how to manage Azure Data Lake Analytics accounts, data sources, users, and jobs from an app written with the Azure .NET SDK.

Prerequisites

  • Visual Studio 2015, Visual Studio 2013 Update 4, or Visual Studio 2012 with Visual C++ installed.
  • Microsoft Azure SDK for .NET version 2.5 or later. Install it using the Web Platform Installer.
  • Required NuGet packages

Install NuGet packages

Package                                              Version
Microsoft.Rest.ClientRuntime.Azure.Authentication    2.3.1
Microsoft.Azure.Management.DataLake.Analytics        3.0.0
Microsoft.Azure.Management.DataLake.Store            2.2.0
Microsoft.Azure.Management.ResourceManager           1.6.0-preview
Microsoft.Azure.Graph.RBAC                           3.4.0-preview

You can install these packages via the NuGet command line with the following commands:

Install-Package -Id Microsoft.Rest.ClientRuntime.Azure.Authentication  -Version 2.3.1
Install-Package -Id Microsoft.Azure.Management.DataLake.Analytics  -Version 3.0.0
Install-Package -Id Microsoft.Azure.Management.DataLake.Store  -Version 2.2.0
Install-Package -Id Microsoft.Azure.Management.ResourceManager  -Version 1.6.0-preview
Install-Package -Id Microsoft.Azure.Graph.RBAC -Version 3.4.0-preview

Common variables

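string subid = "<Subscription ID>"; // Subscription ID (a GUID)
string tenantid = "<Tenant ID>"; // AAD tenant ID or domain. For example, "contoso.onmicrosoft.com"
string rg = "<Resource group name>"; // Resource group name
string clientid = "1950a258-227b-4e31-a9cf-717495945fc2"; // Sample client ID (this will work, but you should pick your own)
string adls = "<ADLS account name>"; // Data Lake Store account name used by the snippets below
string adla = "<ADLA account name>"; // Data Lake Analytics account name used by the snippets below
string location = "<Azure region>"; // Region in which new resources are created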

Authentication

You have multiple options for logging on to Azure Data Lake Analytics. The following snippet shows interactive user authentication with a pop-up window.

using System;
using System.IO;
using System.Threading;
using System.Security.Cryptography.X509Certificates;

using Microsoft.Rest;
using Microsoft.Rest.Azure.Authentication;
using Microsoft.Azure.Management.DataLake.Analytics;
using Microsoft.Azure.Management.DataLake.Analytics.Models;
using Microsoft.Azure.Management.DataLake.Store;
using Microsoft.Azure.Management.DataLake.Store.Models;
using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Microsoft.Azure.Graph.RBAC;

public static class Program
{
   public static string TENANT = "microsoft.onmicrosoft.com";
   public static string CLIENTID = "1950a258-227b-4e31-a9cf-717495945fc2";
   public static System.Uri ARM_TOKEN_AUDIENCE = new System.Uri( @"https://management.core.windows.net/");
   public static System.Uri ADL_TOKEN_AUDIENCE = new System.Uri( @"https://datalake.azure.net/" );
   public static System.Uri GRAPH_TOKEN_AUDIENCE = new System.Uri( @"https://graph.windows.net/" );

   static void Main(string[] args)
   {
      string MY_DOCUMENTS= System.Environment.GetFolderPath( System.Environment.SpecialFolder.MyDocuments);
      string TOKEN_CACHE_PATH = System.IO.Path.Combine(MY_DOCUMENTS, "my.tokencache");

      var tokenCache = GetTokenCache(TOKEN_CACHE_PATH);
      var armCreds = GetCreds_User_Popup(TENANT, ARM_TOKEN_AUDIENCE, CLIENTID, tokenCache);
      var adlCreds = GetCreds_User_Popup(TENANT, ADL_TOKEN_AUDIENCE, CLIENTID, tokenCache);
      var graphCreds = GetCreds_User_Popup(TENANT, GRAPH_TOKEN_AUDIENCE, CLIENTID, tokenCache);
   }
}

The source code for GetCreds_User_Popup and the code for other authentication options are covered in Data Lake Analytics .NET authentication options.

Create the client management objects

var resourceManagementClient = new ResourceManagementClient(armCreds) { SubscriptionId = subid };

var adlaAccountClient = new DataLakeAnalyticsAccountManagementClient(armCreds);
adlaAccountClient.SubscriptionId = subid;

var adlsAccountClient = new DataLakeStoreAccountManagementClient(armCreds);
adlsAccountClient.SubscriptionId = subid;

var adlaCatalogClient = new DataLakeAnalyticsCatalogManagementClient(adlCreds);
var adlaJobClient = new DataLakeAnalyticsJobManagementClient(adlCreds);

var adlsFileSystemClient = new DataLakeStoreFileSystemManagementClient(adlCreds);

var graphClient = new GraphRbacManagementClient(graphCreds);
graphClient.TenantID = tenantid; // AAD tenant ID or domain from the common variables

Manage accounts

Create an Azure Resource Group

You need an Azure Resource Group to create your Data Lake Analytics components. If you haven't already created one, you can create it with your authentication credentials, subscription ID, and a location. The following code shows how to create a resource group:

var resourceGroup = new ResourceGroup { Location = location };
// rg is the resource group name; resourceGroup carries its settings
resourceManagementClient.ResourceGroups.CreateOrUpdate(rg, resourceGroup);

For more information, see Azure Resource Groups and Data Lake Analytics.

Create a Data Lake Store account

Every ADLA account requires an ADLS account. If you don't already have one to use, you can create one with the following code:

var new_adls_params = new DataLakeStoreAccount(location: location);
adlsAccountClient.Account.Create(rg, adls, new_adls_params);

Create a Data Lake Analytics account

The following code creates an ADLA account that uses the ADLS account as its default store:

var new_adla_params = new DataLakeAnalyticsAccount()
{
   DefaultDataLakeStoreAccount = adls,
   Location = location
};

adlaAccountClient.Account.Create(rg, adla, new_adla_params);

List Data Lake Store accounts

// ToList() requires: using System.Linq;
var adlsAccounts = adlsAccountClient.Account.List().ToList();
foreach (var adlsAccount in adlsAccounts)
{
   Console.WriteLine($"ADLS: {adlsAccount.Name}");
}

List Data Lake Analytics accounts

var adlaAccounts = adlaAccountClient.Account.List().ToList();

foreach (var adlaAccount in adlaAccounts)
{
   Console.WriteLine($"ADLA: {adlaAccount.Name}");
}

Check if an account exists

bool exists = adlaAccountClient.Account.Exists(rg, adla);

Get information about an account

bool exists = adlaAccountClient.Account.Exists(rg, adla);
if (exists)
{
   var adla_accnt = adlaAccountClient.Account.Get(rg, adla);
}

Delete an account

if (adlaAccountClient.Account.Exists(rg, adla))
{
   adlaAccountClient.Account.Delete(rg, adla);
}

Get the default Data Lake Store account

Every Data Lake Analytics account requires a default Data Lake Store account. Use this code to determine the default Store account for an Analytics account.

if (adlaAccountClient.Account.Exists(rg, adla))
{
  var adla_accnt = adlaAccountClient.Account.Get(rg, adla);
  string def_adls_account = adla_accnt.DefaultDataLakeStoreAccount;
}

Manage data sources

Data Lake Analytics currently supports the following data sources: Data Lake Store accounts and Azure Storage accounts.

You can create links to Azure Storage accounts.

string storage_key = "xxxxxxxxxxxxxxxxxxxx";
string storage_account = "mystorageaccount";
var addParams = new AddStorageAccountParameters(storage_key);
adlaAccountClient.StorageAccounts.Add(rg, adla, storage_account, addParams);

List Azure Storage data sources

var stg_accounts = adlaAccountClient.StorageAccounts.ListByAccount(rg, adla);

if (stg_accounts != null)
{
  foreach (var stg_account in stg_accounts)
  {
      Console.WriteLine($"Storage account: {stg_account.Name}");
  }
}

List Data Lake Store data sources

var adls_accounts = adlsAccountClient.Account.List();

if (adls_accounts != null)
{
  foreach (var adls_accnt in adls_accounts)
  {
      Console.WriteLine($"ADLS account: {adls_accnt.Name}");
  }
}

Upload and download folders and files

You can use the Data Lake Store file system client management object to upload individual files or folders from your local computer to Azure, and to download them from Azure to your local computer, using the following methods:

  • UploadFolder
  • UploadFile
  • DownloadFolder
  • DownloadFile

The first parameter for these methods is the name of the Data Lake Store account, followed by parameters for the source path and the destination path.

The following example shows how to download a folder in the Data Lake Store.

adlsFileSystemClient.FileSystem.DownloadFolder(adls, sourcePath, destinationPath);
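
Uploading follows the same parameter order. The sketch below assumes the file system client and account name created earlier in this article; the local and remote paths are hypothetical placeholders, and only the three parameters described above are passed.

```csharp
using Microsoft.Azure.Management.DataLake.Store;

public static class UploadSample
{
    // Uploads one local file to a Data Lake Store account.
    // Parameter order: account name, source path, destination path.
    public static void UploadOneFile(
        DataLakeStoreFileSystemManagementClient adlsFileSystemClient,
        string adls)
    {
        string localPath = @"C:\local\data\SearchLog.tsv";  // hypothetical local file
        string remotePath = "/Samples/Data/SearchLog.tsv";   // hypothetical store path

        adlsFileSystemClient.FileSystem.UploadFile(adls, localPath, remotePath);
    }
}
```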

Create a file in a Data Lake Store account

using (var memstream = new MemoryStream())
{
   using (var sw = new StreamWriter(memstream, Encoding.UTF8)) // requires: using System.Text;
   {
      sw.WriteLine("Hello World");
      sw.Flush();
      
      memstream.Position = 0;

      adlsFileSystemClient.FileSystem.Create(adls, "/Samples/Output/randombytes.csv", memstream);
   }
}
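
When the content is already available as a string, the stream can also be built directly from its UTF-8 bytes. The following is a minimal sketch; `StreamHelper` and `ToUtf8Stream` are hypothetical names introduced for illustration and are not part of the SDK.

```csharp
using System.IO;
using System.Text;

public static class StreamHelper
{
    // Encodes the text as UTF-8 (no byte-order mark) and wraps the bytes
    // in a read-only MemoryStream positioned at the start.
    public static MemoryStream ToUtf8Stream(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        return new MemoryStream(bytes, writable: false);
    }
}
```

The returned stream can be passed as the third argument to the Create call shown above, and should be disposed by the caller afterwards.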

Verify Azure Storage account paths

The following code checks whether an Azure Storage account (storage_account) is linked to a Data Lake Analytics account (adla), and whether a container (storage_container) exists in that Azure Storage account.

string storage_account = "mystorageaccount";
string storage_container = "mycontainer";
bool accountExists = adlaAccountClient.Account.StorageAccountExists(rg, adla, storage_account);
bool containerExists = adlaAccountClient.Account.StorageContainerExists(rg, adla, storage_account, storage_container);

Manage catalog and jobs

The DataLakeAnalyticsCatalogManagementClient object provides methods for managing the SQL database provided for each Azure Data Lake Analytics account. The DataLakeAnalyticsJobManagementClient object provides methods to submit and manage jobs that run on the database with U-SQL scripts.

List databases and schemas

Among the several things you can list, the most common are databases and their schemas. The following code obtains a collection of databases, and then enumerates the schemas for each database.

var databases = adlaCatalogClient.Catalog.ListDatabases(adla);
foreach (var db in databases)
{
  Console.WriteLine($"Database: {db.Name}");
  Console.WriteLine(" - Schemas:");
  var schemas = adlaCatalogClient.Catalog.ListSchemas(adla, db.Name);
  foreach (var schm in schemas)
  {
      Console.WriteLine($"\t{schm.Name}");
  }
}

List table columns

The following code shows how to access the database with a Data Lake Analytics catalog management client to list the columns in a specified table.

var tbl = adlaCatalogClient.Catalog.GetTable(adla, "master", "dbo", "MyTableName");
IEnumerable<USqlTableColumn> columns = tbl.ColumnList;

foreach (USqlTableColumn utc in columns)
{
  Console.WriteLine($"\t{utc.Name}");
}

Submit a U-SQL job

The following code shows how to use a Data Lake Analytics job management client to submit a job.

string scriptPath = "/Samples/Scripts/SearchResults_Wikipedia_Script.txt";
Stream scriptStrm = adlsFileSystemClient.FileSystem.Open(adls, scriptPath); // read the script from the ADLS account
string scriptTxt = string.Empty;
using (StreamReader sr = new StreamReader(scriptStrm))
{
    scriptTxt = sr.ReadToEnd();
}

var jobName = "SR_Wikipedia";
var jobId = Guid.NewGuid();
var properties = new USqlJobProperties(scriptTxt);
var parameters = new JobInformation(jobName, JobType.USql, properties, priority: 1, degreeOfParallelism: 1, jobId: jobId);
var jobInfo = adlaJobClient.Job.Create(adla, jobId, parameters);
Console.WriteLine($"Job {jobName} submitted.");
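
Submitting a job returns before the job has finished, so a caller that needs the job's output typically polls until the job has ended. The following is a minimal sketch of a generic polling helper; `Poller` and `WaitUntil` are hypothetical names introduced here, not part of the SDK, and the check for job completion is passed in as a delegate.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class Poller
{
    // Calls isDone repeatedly, sleeping for `interval` between calls.
    // Returns true as soon as isDone returns true, or false once
    // `timeout` has elapsed without isDone returning true.
    public static bool WaitUntil(Func<bool> isDone, TimeSpan interval, TimeSpan timeout)
    {
        if (isDone == null) throw new ArgumentNullException(nameof(isDone));

        var clock = Stopwatch.StartNew();
        while (true)
        {
            if (isDone()) return true;
            if (clock.Elapsed >= timeout) return false;
            Thread.Sleep(interval);
        }
    }
}
```

With the job client, the condition could be something like `() => adlaJobClient.Job.Get(adla, jobId).State == JobState.Ended`. Treat that expression as an assumption about the SDK's job model and check it against the package version you install.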

List failed jobs

The following code lists information about jobs that failed.

var odq = new ODataQuery<JobInformation> { Filter = "result eq 'Failed'" };
var jobs = adlaJobClient.Job.List(adla, odq);
foreach (var j in jobs)
{
   Console.WriteLine($"{j.Name}\t{j.JobId}\t{j.Type}\t{j.StartTime}\t{j.EndTime}");
}

List pipelines

The following code lists information about each pipeline of jobs submitted to the account.

var pipelines = adlaJobClient.Pipeline.List(adla);
foreach (var p in pipelines)
{
   Console.WriteLine($"Pipeline: {p.Name}\t{p.PipelineId}\t{p.LastSubmitTime}");
}

List recurrences

The following code lists information about each recurrence of jobs submitted to the account.

var recurrences = adlaJobClient.Recurrence.List(adla);
foreach (var r in recurrences)
{
   Console.WriteLine($"Recurrence: {r.Name}\t{r.RecurrenceId}\t{r.LastSubmitTime}");
}

Common graph scenarios

Look up a user in the AAD directory

var userinfo = graphClient.Users.Get( "bill@contoso.com" );

Get the ObjectId of a user in the AAD directory

var userinfo = graphClient.Users.Get( "bill@contoso.com" );
Console.WriteLine( userinfo.ObjectId );

Manage compute policies

The DataLakeAnalyticsAccountManagementClient object provides methods for managing the compute policies for a Data Lake Analytics account.

List compute policies

The following code retrieves a list of compute policies for a Data Lake Analytics account.

var policies = adlaAccountClient.ComputePolicies.ListByAccount(rg, adla);
foreach (var p in policies)
{
   Console.WriteLine($"Name: {p.Name}\tType: {p.ObjectType}\tMax AUs / job: {p.MaxDegreeOfParallelismPerJob}\tMin priority / job: {p.MinPriorityPerJob}");
}

Create a new compute policy

The following code creates a new compute policy for a Data Lake Analytics account, setting the maximum AUs available to the specified user to 50, and the minimum job priority to 250.

var userAadObjectId = "3b097601-4912-4d41-b9d2-78672fc2acde";
var newPolicyParams = new ComputePolicyCreateOrUpdateParameters(userAadObjectId, "User", 50, 250);
adlaAccountClient.ComputePolicies.CreateOrUpdate(rg, adla, "GaryMcDaniel", newPolicyParams);

Next steps