Process events from Azure Event Hubs with Apache Storm on HDInsight (C#)

Learn how to work with Azure Event Hubs from Apache Storm on HDInsight. This document uses a C# Storm topology to read data from and write data to Event Hubs.

SCP.NET

The steps in this document use SCP.NET, a NuGet package that makes it easy to create C# topologies and components for use with Storm on HDInsight.

Important

While the steps in this document rely on a Windows development environment with Visual Studio, the compiled project can be submitted to a Storm on HDInsight cluster that uses Linux. Only Linux-based clusters created after October 28, 2016, support SCP.NET topologies.

HDInsight 3.4 and later use Mono to run C# topologies. The example used in this document works with HDInsight 3.6. If you plan to create your own .NET solutions for HDInsight, check the Mono compatibility document for potential incompatibilities.

Cluster versioning

The Microsoft.SCP.Net.SDK NuGet package you use for your project must match the major version of Storm installed on HDInsight. HDInsight versions 3.5 and 3.6 use Storm 1.x, so you must use SCP.NET version 1.0.x.x with these clusters.

Important

The example in this document expects an HDInsight 3.5 or 3.6 cluster.

Linux is the only operating system used on HDInsight version 3.4 or later.

C# topologies must also target .NET 4.5.

How to work with Event Hubs

Microsoft provides a set of Java components that can be used to communicate with Event Hubs from a Storm topology. You can find the Java archive (JAR) file that contains an HDInsight 3.6 compatible version of these components at https://github.com/hdinsight/mvn-repo/raw/master/org/apache/storm/storm-eventhubs/1.1.0.1/storm-eventhubs-1.1.0.1.jar.

Important

While the components are written in Java, you can easily use them from a C# topology.

The following components are used in this example:

  • EventHubSpout: Reads data from Event Hubs.
  • EventHubBolt: Writes data to Event Hubs.
  • EventHubSpoutConfig: Used to configure EventHubSpout.
  • EventHubBoltConfig: Used to configure EventHubBolt.

Example spout usage

SCP.NET provides methods for adding an EventHubSpout to your topology. These methods make it easier to add a spout than the generic methods for adding a Java component. The following example demonstrates how to create a spout by using the SetEventHubSpout method and the EventHubSpoutConfig class provided by SCP.NET:

 topologyBuilder.SetEventHubSpout(
    "EventHubSpout",
    new EventHubSpoutConfig(
        ConfigurationManager.AppSettings["EventHubSharedAccessKeyName"],
        ConfigurationManager.AppSettings["EventHubSharedAccessKey"],
        ConfigurationManager.AppSettings["EventHubNamespace"],
        ConfigurationManager.AppSettings["EventHubEntityPath"],
        eventHubPartitions),
    eventHubPartitions);

The previous example creates a new spout component named EventHubSpout and configures it to communicate with an event hub. The parallelism hint for the component is set to the number of partitions in the event hub. This setting allows Storm to create an instance of the component for each partition.
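
The spout example above relies on an eventHubPartitions value that is not defined in the snippet. The following is a minimal sketch of one way to obtain it, assuming the value is stored as text in the EventHubPartitionCount setting that the App.config tables later in this article list. The PartitionSettings class and ParsePartitionCount method are hypothetical helpers for illustration, not part of SCP.NET.

```csharp
using System;

public static class PartitionSettings
{
    // Hypothetical helper: converts the raw setting text into a positive integer,
    // and fails early with a clear message if the setting is missing or invalid.
    public static int ParsePartitionCount(string rawValue)
    {
        int count;
        if (!int.TryParse(rawValue, out count) || count < 1)
        {
            throw new ArgumentException(
                "EventHubPartitionCount must be a positive integer, but was: " +
                (rawValue ?? "(missing)"));
        }
        return count;
    }

    public static void Main()
    {
        // Assumption: in a topology project the raw value would come from
        // ConfigurationManager.AppSettings["EventHubPartitionCount"].
        int eventHubPartitions = ParsePartitionCount("8");
        Console.WriteLine(eventHubPartitions);
    }
}
```

Validating the value before building the topology surfaces a misconfigured partition count at submission time rather than as a runtime failure on the cluster.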

Example bolt usage

Use the JavaComponentConstructor class to create an instance of the bolt. The following example demonstrates how to create and configure a new instance of the EventHubBolt:

// Java constructor for the Event Hub Bolt
JavaComponentConstructor constructor = JavaComponentConstructor.CreateFromClojureExpr(
    String.Format(@"(org.apache.storm.eventhubs.bolt.EventHubBolt. (org.apache.storm.eventhubs.bolt.EventHubBoltConfig. " +
        @"""{0}"" ""{1}"" ""{2}"" ""{3}"" ""{4}"" {5}))",
        ConfigurationManager.AppSettings["EventHubPolicyName"],
        ConfigurationManager.AppSettings["EventHubPolicyKey"],
        ConfigurationManager.AppSettings["EventHubNamespace"],
        "servicebus.windows.net",
        ConfigurationManager.AppSettings["EventHubName"],
        "true"));

// Set the bolt to subscribe to data from the spout
topologyBuilder.SetJavaBolt(
    "eventhubbolt",
    constructor,
    partitionCount)
        .shuffleGrouping("Spout");

Note

This example builds the EventHubBoltConfig from a Clojure expression passed as a string, rather than constructing a configuration object directly in C# as the spout example did with EventHubSpoutConfig. Either method works. Use the method that feels best to you.
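
To see what the bolt example actually passes to CreateFromClojureExpr, it can help to print the formatted string. The sketch below reuses the format string from the bolt example with placeholder values. The BoltExpressionDemo class and the sample values are assumptions for illustration only.

```csharp
using System;

public static class BoltExpressionDemo
{
    // Builds the Clojure expression with the same format string as the bolt example.
    // The five quoted arguments are emitted as Clojure string literals; the final
    // argument is emitted unquoted as a Clojure boolean.
    public static string Build(string policyName, string policyKey,
        string eventHubNamespace, string eventHubName, bool booleanArgument)
    {
        return String.Format(
            @"(org.apache.storm.eventhubs.bolt.EventHubBolt. (org.apache.storm.eventhubs.bolt.EventHubBoltConfig. " +
            @"""{0}"" ""{1}"" ""{2}"" ""{3}"" ""{4}"" {5}))",
            policyName,
            policyKey,
            eventHubNamespace,
            "servicebus.windows.net",
            eventHubName,
            booleanArgument ? "true" : "false");
    }

    public static void Main()
    {
        // Placeholder values; real values would come from App.config.
        Console.WriteLine(Build("writer", "PLACEHOLDER_KEY", "mynamespace", "myhub", true));
    }
}
```

Because the values are spliced into the expression without escaping, a value that contains a double quote or a backslash would produce an invalid expression. This sketch assumes the settings contain neither.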

Download the completed project

You can download a complete version of the project created in this article from GitHub. However, you still need to provide configuration settings by following the steps in this article.

Prerequisites

  • An Apache Storm cluster on HDInsight. See Create Apache Hadoop clusters using the Azure portal and select Storm for Cluster type.

    Warning

    The example used in this document requires Storm on HDInsight version 3.5 or 3.6. It does not work with older versions of HDInsight because of breaking class name changes. For a version of this example that works with older clusters, see GitHub.

  • An Azure event hub.

  • The Azure .NET SDK.

  • The HDInsight tools for Visual Studio.

  • Java JDK 1.8 or later on your development environment. JDK downloads are available from Oracle.

    • The JAVA_HOME environment variable must point to the directory that contains Java.
    • The %JAVA_HOME%/bin directory must be in the PATH environment variable.
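
The two Java requirements above can be checked before you open Visual Studio. The following is a minimal sketch of such a check. The JavaEnvironmentCheck class and its method name are hypothetical, and the comparison is a simple case-insensitive match on path entries rather than a full path normalization.

```csharp
using System;
using System.IO;
using System.Linq;

public static class JavaEnvironmentCheck
{
    // Returns true when pathValue contains the bin directory under javaHome.
    public static bool PathContainsJavaBin(string javaHome, string pathValue)
    {
        if (string.IsNullOrEmpty(javaHome) || string.IsNullOrEmpty(pathValue))
        {
            return false;
        }

        string javaBin = Path.Combine(javaHome, "bin").TrimEnd('\\', '/');
        return pathValue
            .Split(new[] { Path.PathSeparator }, StringSplitOptions.RemoveEmptyEntries)
            .Select(entry => entry.Trim().TrimEnd('\\', '/'))
            .Any(entry => string.Equals(entry, javaBin, StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        string javaHome = Environment.GetEnvironmentVariable("JAVA_HOME");
        string path = Environment.GetEnvironmentVariable("PATH");

        Console.WriteLine("JAVA_HOME: " + (javaHome ?? "(not set)"));
        Console.WriteLine("JAVA_HOME bin on PATH: " + PathContainsJavaBin(javaHome, path));
    }
}
```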

Download the Event Hubs components

Download the Event Hubs spout and bolt components from https://github.com/hdinsight/mvn-repo/raw/master/org/apache/storm/storm-eventhubs/1.1.0.1/storm-eventhubs-1.1.0.1.jar.

Create a directory named eventhubspout, and save the file into the directory.
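
If you prefer to script this step, the following sketch downloads the JAR into an eventhubspout directory under the current working directory. The EventHubsJarDownload class is a hypothetical helper, and the choice of the current directory as the base location is an assumption; any directory works as long as you select it later when you submit the topology.

```csharp
using System;
using System.IO;
using System.Net;

public static class EventHubsJarDownload
{
    public const string JarUrl =
        "https://github.com/hdinsight/mvn-repo/raw/master/org/apache/storm/storm-eventhubs/1.1.0.1/storm-eventhubs-1.1.0.1.jar";

    // Builds the local target path: <baseDirectory>/eventhubspout/<JAR file name>.
    public static string GetTargetPath(string baseDirectory)
    {
        string fileName = Path.GetFileName(new Uri(JarUrl).LocalPath);
        return Path.Combine(baseDirectory, "eventhubspout", fileName);
    }

    public static void Main()
    {
        // .NET 4.5 does not enable TLS 1.2 by default, so opt in explicitly
        // before making the HTTPS request.
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        string target = GetTargetPath(Directory.GetCurrentDirectory());
        Directory.CreateDirectory(Path.GetDirectoryName(target));

        using (var client = new WebClient())
        {
            client.DownloadFile(JarUrl, target);
        }

        Console.WriteLine("Saved " + target);
    }
}
```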

Configure Event Hubs

Event Hubs is the data source for this example. To create an event hub, use the information in the "Create an event hub" section of Get started with Event Hubs.

  1. After the event hub has been created, view the event hub settings in the Azure portal, and select Shared access policies. Select + Add to add the following policies:

    Name      Permissions
    writer    Send
    reader    Listen

    Screenshot of the Shared access policies window

  2. Select the reader and writer policies. Copy and save the primary key value for both policies, as these values are used later.

Configure the EventHubWriter

  1. If you have not already installed the latest version of the HDInsight tools for Visual Studio, see Get started using HDInsight tools for Visual Studio.

  2. Download the solution from eventhub-storm-hybrid.

  3. In the EventHubWriter project, open the App.config file. Use the information from the event hub that you configured earlier to fill in the value for the following keys:

    Key                       Value
    EventHubPolicyName        writer (If you used a different name for the policy with Send permission, use it instead.)
    EventHubPolicyKey         The key for the writer policy.
    EventHubNamespace         The namespace that contains your event hub.
    EventHubName              Your event hub name.
    EventHubPartitionCount    The number of partitions in your event hub.
  4. Save and close the App.config file.
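
As a rough illustration of the keys listed above, the appSettings section of App.config might look like the following fragment. This is a sketch that assumes the standard .NET appSettings layout. The values are placeholders, and the App.config in the downloaded project may contain additional sections that must be left in place.

```xml
<configuration>
  <appSettings>
    <!-- Placeholder values: replace each one with the details of your own event hub. -->
    <add key="EventHubPolicyName" value="writer" />
    <add key="EventHubPolicyKey" value="YOUR_WRITER_POLICY_KEY" />
    <add key="EventHubNamespace" value="YOUR_NAMESPACE" />
    <add key="EventHubName" value="YOUR_EVENT_HUB_NAME" />
    <add key="EventHubPartitionCount" value="YOUR_PARTITION_COUNT" />
  </appSettings>
</configuration>
```

The EventHubReader project uses the same keys, with the reader policy name and key in place of the writer values.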

Configure the EventHubReader

  1. Open the EventHubReader project.

  2. Open the App.config file for the EventHubReader. Use the information from the event hub that you configured earlier to fill in the value for the following keys:

    Key                       Value
    EventHubPolicyName        reader (If you used a different name for the policy with Listen permission, use it instead.)
    EventHubPolicyKey         The key for the reader policy.
    EventHubNamespace         The namespace that contains your event hub.
    EventHubName              Your event hub name.
    EventHubPartitionCount    The number of partitions in your event hub.
  3. Save and close the App.config file.

Deploy the topologies

  1. From Solution Explorer, right-click the EventHubReader project, and select Submit to Storm on HDInsight.

    Screenshot of Solution Explorer, with Submit to Storm on HDInsight highlighted

  2. In the Submit Topology dialog box, select your Storm Cluster. Expand Additional Configurations, select Java File Paths, select ..., and select the directory that contains the JAR file that you downloaded earlier. Finally, click Submit.

    Screenshot of the Submit Topology dialog box

  3. When the topology has been submitted, the Storm Topologies Viewer appears. To view information about the topology, select the EventHubReader topology in the left pane.

    Screenshot of the Storm Topologies Viewer

  4. From Solution Explorer, right-click the EventHubWriter project, and select Submit to Storm on HDInsight.

  5. In the Submit Topology dialog box, select your Storm Cluster. Expand Additional Configurations, select Java File Paths, select ..., and select the directory that contains the JAR file that you downloaded earlier. Finally, click Submit.

  6. When the topology has been submitted, refresh the topology list in the Storm Topologies Viewer to verify that both topologies are running on the cluster.

  7. In the Storm Topologies Viewer, select the EventHubReader topology.

  8. To open the component summary for the bolt, double-click the LogBolt component in the diagram.

  9. In the Executors section, select one of the links in the Port column. This displays information logged by the component. The logged information is similar to the following text:

     2017-03-02 14:51:29.255 m.s.p.TaskHost [INFO] Received C# STDOUT: 2017-03-02 14:51:29,255 [1] INFO  EventHubReader_LogBolt [(null)] - Received data: {"deviceValue":1830978598,"deviceId":"8566ccbc-034d-45db-883d-d8a31f34068e"}
     2017-03-02 14:51:29.283 m.s.p.TaskHost [INFO] Received C# STDOUT: 2017-03-02 14:51:29,283 [1] INFO  EventHubReader_LogBolt [(null)] - Received data: {"deviceValue":1756413275,"deviceId":"647a5eff-823d-482f-a8b4-b95b35ae570b"}
     2017-03-02 14:51:29.313 m.s.p.TaskHost [INFO] Received C# STDOUT: 2017-03-02 14:51:29,312 [1] INFO  EventHubReader_LogBolt [(null)] - Received data: {"deviceValue":1108478910,"deviceId":"206a68fa-8264-4d61-9100-bfdb68ee8f0a"}
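
If you want to inspect the logged events programmatically, the payload can be pulled out of each line with a regular expression. The sketch below assumes that log lines follow the exact shape shown above, with deviceValue before deviceId. The LogLineParser class is a hypothetical helper and is not part of the sample project.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogLineParser
{
    // Matches the JSON payload that follows "Received data: " in the sample log lines.
    private static readonly Regex Payload = new Regex(
        @"Received data: \{""deviceValue"":(?<value>\d+),""deviceId"":""(?<id>[0-9a-f-]+)""\}");

    // Extracts deviceId and deviceValue; returns false if the line has a different shape.
    public static bool TryParse(string line, out string deviceId, out long deviceValue)
    {
        deviceId = null;
        deviceValue = 0;

        Match match = Payload.Match(line);
        if (!match.Success)
        {
            return false;
        }

        deviceId = match.Groups["id"].Value;
        deviceValue = long.Parse(match.Groups["value"].Value);
        return true;
    }

    public static void Main()
    {
        // First sample line from the log output above.
        string line = @"2017-03-02 14:51:29.255 m.s.p.TaskHost [INFO] Received C# STDOUT: 2017-03-02 14:51:29,255 [1] INFO  EventHubReader_LogBolt [(null)] - Received data: {""deviceValue"":1830978598,""deviceId"":""8566ccbc-034d-45db-883d-d8a31f34068e""}";

        string deviceId;
        long deviceValue;
        if (TryParse(line, out deviceId, out deviceValue))
        {
            Console.WriteLine(deviceId + " -> " + deviceValue);
        }
    }
}
```

A regular expression keeps the sketch free of extra dependencies. For anything beyond a quick check, a JSON parser applied to the text after "Received data: " would be more robust.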
    

Stop the topologies

To stop the topologies, select each topology in the Storm Topologies Viewer, then click Kill.

Screenshot of the Storm Topologies Viewer, with the Kill button highlighted

Delete your cluster

Warning

Billing for HDInsight clusters is prorated per minute, whether you use them or not. Be sure to delete your cluster after you finish using it. See how to delete an HDInsight cluster.

Next steps

In this document, you have learned how to use the Java Event Hubs spout and bolt from a C# topology to work with data in Azure Event Hubs. To learn more about creating C# topologies, see the following: