
Configure storage and scalability for Apache Kafka on HDInsight

Learn how to configure the number of managed disks used by Apache Kafka on HDInsight.

Kafka on HDInsight uses the local disk of the virtual machines in the HDInsight cluster. Because Kafka is very I/O heavy, Azure Managed Disks is used to provide high throughput and more storage per node. If traditional virtual hard drives (VHDs) were used for Kafka, each node would be limited to 1 TB. With managed disks, you can use multiple disks to reach 16 TB for each node in the cluster.

The following diagram compares Kafka on HDInsight before managed disks with Kafka on HDInsight with managed disks:

[Diagram: Kafka with managed disks architecture]

Configure managed disks: Azure portal

  1. Follow the steps in Create an HDInsight cluster to learn the common steps for creating a cluster in the portal, but don't complete the portal creation process.

  2. From the Configuration & Pricing section, use the Number of Nodes field to configure the number of disks.

    Note

    The type of managed disk can be either Standard (HDD) or Premium (SSD). Premium disks are used with DS and GS series VMs. All other VM types use Standard.

    [Screenshot: Cluster size section with disks per worker node highlighted]

Configure managed disks: Resource Manager template

To control the number of disks used by the worker nodes in a Kafka cluster, use the following section of the template:

"dataDisksGroups": [
    {
        "disksPerNode": "[variables('disksPerWorkerNode')]"
    }
],

You can find a complete template that demonstrates how to configure managed disks at https://hditutorialdata.blob.core.windows.net/armtemplates/create-linux-based-kafka-mirror-cluster-in-vnet-v2.1.json.
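
For orientation, the following is a minimal sketch of where the dataDisksGroups fragment might sit in a template. It is trimmed and not deployable as written. The value 2 for disksPerWorkerNode and the targetInstanceCount of 4 are illustrative assumptions, not values taken from this article. The surrounding nesting (a Microsoft.HDInsight/clusters resource whose computeProfile.roles list contains a workernode role) is abbreviated from the usual HDInsight template layout, so verify it against the complete template linked above.

```json
{
    "variables": {
        "disksPerWorkerNode": 2
    },
    "resources": [
        {
            "type": "Microsoft.HDInsight/clusters",
            "properties": {
                "computeProfile": {
                    "roles": [
                        {
                            "name": "workernode",
                            "targetInstanceCount": 4,
                            "dataDisksGroups": [
                                {
                                    "disksPerNode": "[variables('disksPerWorkerNode')]"
                                }
                            ]
                        }
                    ]
                }
            }
        }
    ]
}
```

In this sketch, changing the single disksPerWorkerNode variable changes the number of managed disks attached to every worker node, because the role reads the value through the variables() template expression.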

Next steps

For more information on working with Apache Kafka on HDInsight, see the following documents: