
Dynamically add partitions to an event hub (Apache Kafka topic) in Azure Event Hubs

Event Hubs provides message streaming through a partitioned consumer pattern in which each consumer reads only a specific subset, or partition, of the message stream. This pattern enables horizontal scale for event processing and provides other stream-focused features that are unavailable in queues and topics. A partition is an ordered sequence of events that is held in an event hub. As newer events arrive, they're added to the end of this sequence. For more information about partitions in general, see Partitions.

You can specify the number of partitions when you create an event hub. In some scenarios, you may need to add partitions after the event hub has been created. This article describes how to dynamically add partitions to an existing event hub.

Important

Dynamic addition of partitions is available only on dedicated Event Hubs clusters.

Note

For Apache Kafka clients, an event hub maps to a Kafka topic. For more mappings between Azure Event Hubs and Apache Kafka, see Kafka and Event Hubs conceptual mapping.

Update the partition count

This section shows you how to update the partition count of an event hub in different ways (PowerShell, CLI, and so on).

PowerShell

Use the Set-AzureRmEventHub PowerShell command to update partitions in an event hub.

Set-AzureRmEventHub -ResourceGroupName MyResourceGroupName -Namespace MyNamespaceName -Name MyEventHubName -partitionCount 12

CLI

Use the az eventhubs eventhub update CLI command to update partitions in an event hub.

az eventhubs eventhub update --resource-group MyResourceGroupName --namespace-name MyNamespaceName --name MyEventHubName --partition-count 12

Resource Manager template

Update the value of the partitionCount property in the Resource Manager template and redeploy the template to update the resource.

    {
        "apiVersion": "2017-04-01",
        "type": "Microsoft.EventHub/namespaces/eventhubs",
        "name": "[concat(parameters('namespaceName'), '/', parameters('eventHubName'))]",
        "location": "[parameters('location')]",
        "dependsOn": [
            "[resourceId('Microsoft.EventHub/namespaces', parameters('namespaceName'))]"
        ],
        "properties": {
            "messageRetentionInDays": 7,
            "partitionCount": 12
        }
    }

Apache Kafka

Use the AlterTopics API (for example, via the kafka-topics CLI tool) to increase the partition count. For details, see Modifying Kafka topics.
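
As a sketch, the kafka-topics invocation might look like the following. The namespace endpoint, topic name, and client.properties file are placeholders; the properties file is assumed to hold your SASL credentials for the Event Hubs Kafka endpoint.

```shell
# Increase the partition count of an existing topic (event hub) to 12.
# mynamespace.servicebus.windows.net and my-event-hub are placeholders;
# --command-config points at a properties file with your SASL settings.
kafka-topics.sh --alter \
  --bootstrap-server mynamespace.servicebus.windows.net:9093 \
  --command-config client.properties \
  --topic my-event-hub \
  --partitions 12
```

Note that kafka-topics can only increase the partition count; attempting to lower it fails.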

Event Hubs clients

Let's look at how Event Hubs clients behave when the partition count is updated on an event hub.

When you add a partition to an existing event hub, the event hub client receives a MessagingException from the service, informing the clients that entity metadata (the entity is your event hub, and the metadata is the partition information) has been altered. The clients automatically reopen the AMQP links, which then pick up the changed metadata information. The clients then operate normally.

Sender/producer clients

Event Hubs provides three sender options:

  • Partition sender – In this scenario, clients send events directly to a partition. Although partitions are identifiable and events can be sent directly to them, we don't recommend this pattern. Adding partitions doesn't impact this scenario. We recommend that you restart applications so that they can detect newly added partitions.
  • Partition key sender – In this scenario, clients send events with a key so that all events belonging to that key end up in the same partition. In this case, the service hashes the key and routes the event to the corresponding partition. Because the hashing changes, a partition count update can cause out-of-order issues. So, if you care about ordering, ensure that your application consumes all events from the existing partitions before you increase the partition count.
  • Round-robin sender (default) – In this scenario, the Event Hubs service distributes events across partitions in a round-robin fashion. The service is aware of partition count changes and sends to new partitions within seconds of the change.
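
To see why key hashing can break ordering when the partition count changes, consider this minimal sketch. It uses CRC32 purely for illustration; it is not the actual hash function Event Hubs uses, and the key names are made up.

```python
import zlib

def assign_partition(key: str, partition_count: int) -> int:
    """Map a partition key to a partition index by hashing (illustrative only)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count

keys = [f"device-{i}" for i in range(10)]

# Partition assignments before and after scaling from 4 to 8 partitions.
before = {k: assign_partition(k, 4) for k in keys}
after = {k: assign_partition(k, 8) for k in keys}

# Keys whose partition changed: new events for these keys land in a
# different partition than their earlier events, so per-key ordering
# across the change is no longer guaranteed.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys now hash to a different partition")
```

Any key that moves ends up with events split across two partitions, which is why the guidance above says to drain existing partitions first if ordering matters.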

Receiver/consumer clients

Event Hubs provides direct receivers and an easy consumer library called the Event Processor Host (old SDK) or Event Processor (new SDK).

  • Direct receivers – Direct receivers listen to specific partitions. Their runtime behavior isn't affected when partitions are scaled out for an event hub. The application that uses direct receivers needs to take care of picking up the new partitions and assigning the receivers accordingly.

  • Event Processor Host – This client doesn't automatically refresh the entity metadata, so it doesn't pick up on the partition count increase. Recreating an event processor instance causes an entity metadata fetch, which in turn creates new blobs for the newly added partitions. Pre-existing blobs aren't affected. We recommend restarting all event processor instances to ensure that all instances are aware of the newly added partitions, and that load balancing is handled correctly among consumers.

    If you're using the old version of the .NET SDK (WindowsAzure.ServiceBus), the event processor host removes an existing checkpoint upon restart if the partition count in the checkpoint doesn't match the partition count fetched from the service. This behavior may have an impact on your application.

Apache Kafka clients

This section describes how Apache Kafka clients that use the Kafka endpoint of Azure Event Hubs behave when the partition count is updated for an event hub.

Kafka clients that use Event Hubs over the Apache Kafka protocol behave differently from event hub clients that use the AMQP protocol. Kafka clients update their metadata once every metadata.max.age.ms milliseconds; you specify this value in the client configuration. The librdkafka libraries use the same configuration. Metadata updates inform the clients of service changes, including partition count increases. For a list of configurations, see Apache Kafka configurations for Event Hubs.
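
For example, a hypothetical configuration for a librdkafka-based client (such as confluent-kafka) that refreshes metadata every 3 minutes might look like this. The namespace and connection string are placeholders.

```python
# Hypothetical client configuration for a librdkafka-based Kafka client.
# The namespace endpoint and connection string below are placeholders.
client_config = {
    "bootstrap.servers": "mynamespace.servicebus.windows.net:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "$ConnectionString",
    "sasl.password": "<event-hubs-connection-string>",
    # Refresh topic metadata every 3 minutes so newly added
    # partitions are picked up within that window.
    "metadata.max.age.ms": 180000,
}
```

Lowering metadata.max.age.ms shortens the delay before clients notice new partitions, at the cost of more frequent metadata requests.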

Sender/producer clients

Producers always dictate that send requests contain the partition destination for each set of produced records. So, all produce partitioning is done on the client side with the producer's view of the broker's metadata. Once the new partitions are added to the producer's metadata view, they're available for producer requests.

Consumer/receiver clients

When a consumer group member performs a metadata refresh and picks up the newly created partitions, that member initiates a group rebalance. Consumer metadata is then refreshed for all group members, and the new partitions are assigned by the allotted rebalance leader.

Recommendations

  • If you use a partition key with your producer applications and depend on key hashing to ensure ordering in a partition, dynamically adding partitions isn't recommended.

    Important

    While the existing data preserves ordering, partition hashing is broken for messages that are hashed after the partition count changes due to the addition of partitions.

  • Adding partitions to an existing topic or event hub instance is recommended in the following cases:

    • When you use the round-robin (default) method of sending events
    • When you use Kafka default partitioning strategies, for example, the Sticky Assignor strategy

Next steps

For more information about partitions, see Partitions.