Job fails due to cluster manager core instance request limit

Problem

An Azure Databricks notebook or Jobs API request returns the following error:

Unexpected failure while creating the cluster for the job. Cause REQUEST_LIMIT_EXCEEDED: Your request was rejected due to API rate limit. Please retry your request later, or choose a larger node type instead.

Cause

The error indicates that the Cluster Manager Service core instance request limit was exceeded.

A Cluster Manager core instance can support a maximum of 1000 requests.

Solution

Contact Azure Databricks Support to increase the limit set in the core instance.

Azure Databricks can increase the job limits maxBurstyUpsizePerOrg up to 2000 and upsizeTokenRefillRatePerMin up to 120. Currently running jobs are affected when the limit is increased.

Increasing these values can stop the throttling issue, but can also cause high CPU utilization.
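
This article does not say how the two limits are enforced. Their names suggest a token bucket, where maxBurstyUpsizePerOrg would act as the burst capacity and upsizeTokenRefillRatePerMin as the refill rate per minute. Under that assumption only, the following sketch shows how the two values would interact. The TokenBucket class and its methods are hypothetical and are not part of any Azure Databricks API.

```python
class TokenBucket:
    """Generic token bucket; an illustrative assumption, not Databricks code."""

    def __init__(self, capacity, refill_per_min):
        self.capacity = capacity              # assumed analogue of maxBurstyUpsizePerOrg
        self.refill_per_min = refill_per_min  # assumed analogue of upsizeTokenRefillRatePerMin
        self.tokens = float(capacity)         # the bucket starts full

    def advance(self, seconds):
        """Simulate elapsed time: add refilled tokens, capped at capacity."""
        refilled = self.refill_per_min * seconds / 60.0
        self.tokens = min(self.capacity, self.tokens + refilled)

    def try_acquire(self, n=1):
        """Consume n tokens if available; otherwise the request is throttled."""
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False


# With the raised limits from this article (2000 and 120):
bucket = TokenBucket(capacity=2000, refill_per_min=120)
burst_accepted = sum(bucket.try_acquire() for _ in range(2500))

bucket.advance(60)  # one minute later
next_minute_accepted = sum(bucket.try_acquire() for _ in range(500))
```

In this model, a burst larger than the capacity is throttled once the bucket is empty, and the sustained rate afterwards is bounded by the refill rate. That is one way to read why raising both values relieves throttling while also raising the load on the core instance.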

The best solution for this issue is to replace the Cluster Manager core instance with a larger instance that can support maximum data transmission rates.

Azure Databricks Support can change the current Cluster Manager instance type to a larger one.
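
While waiting for Support, the error message itself suggests retrying the request later. The following is a minimal client-side sketch of a retry with exponential backoff and jitter. It assumes the failure surfaces as an exception whose text contains REQUEST_LIMIT_EXCEEDED. The helper names (is_rate_limited, retry_with_backoff) and the submit callable are hypothetical placeholders for however you submit the job, not Azure Databricks APIs.

```python
import random
import time


def is_rate_limited(message):
    """Return True if the error text carries the rate-limit cause code."""
    return "REQUEST_LIMIT_EXCEEDED" in message


def retry_with_backoff(submit, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call submit() and retry only on rate-limit errors.

    submit is a hypothetical zero-argument callable that submits the job
    and raises an exception on failure.
    """
    for attempt in range(max_attempts):
        try:
            return submit()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if not is_rate_limited(str(exc)) or last_attempt:
                raise  # unrelated error, or retries exhausted
            # Exponential backoff, capped, with jitter to avoid retry bursts.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

Backing off reduces pressure on the request limit but does not raise it. If jobs are throttled routinely, the limit increase or larger instance described above is still needed.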