Autoscaling

Autoscaling is the process of dynamically allocating resources to match performance requirements. As the volume of work grows, an application may need additional resources to maintain the desired performance levels and satisfy service-level agreements (SLAs). As demand slackens and the additional resources are no longer needed, they can be deallocated to minimize costs.

Autoscaling takes advantage of the elasticity of cloud-hosted environments while easing management overhead. It reduces the need for an operator to continually monitor the performance of a system and make decisions about adding or removing resources.

There are two main ways that an application can scale:

  • Vertical scaling, also called scaling up and down, means changing the capacity of a resource. For example, you could move an application to a larger VM size. Vertical scaling often requires making the system temporarily unavailable while it is being redeployed. Therefore, it's less common to automate vertical scaling.
  • Horizontal scaling, also called scaling out and in, means adding or removing instances of a resource. The application continues running without interruption as new resources are provisioned. When the provisioning process is complete, the solution is deployed on these additional resources. If demand drops, the additional resources can be shut down cleanly and deallocated.

Many cloud-based systems, including Microsoft Azure, support automatic horizontal scaling. The rest of this article focuses on horizontal scaling.

Note

Autoscaling mostly applies to compute resources. While it's possible to horizontally scale a database or message queue, this usually involves data partitioning, which is generally not automated.

Overview

An autoscaling strategy typically involves the following pieces:

  • Instrumentation and monitoring systems at the application, service, and infrastructure levels. These systems capture key metrics, such as response times, queue lengths, CPU utilization, and memory usage.
  • Decision-making logic that evaluates these metrics against predefined thresholds or schedules, and decides whether to scale.
  • Components that scale the system.
  • Testing, monitoring, and tuning of the autoscaling strategy to ensure that it functions as expected.

Azure provides built-in autoscaling mechanisms that address common scenarios. If a particular service or technology does not have built-in autoscaling functionality, or if you have specific autoscaling requirements beyond its capabilities, you might consider a custom implementation. A custom implementation would collect operational and system metrics, analyze the metrics, and then scale resources accordingly.
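
The collect, analyze, and scale steps of a custom implementation can be sketched as a single evaluation pass. This is a minimal illustration, not an Azure API: the names `decide` and `autoscale_once`, the callbacks `collect` and `apply_scale`, and the threshold parameters are all hypothetical, and the metric is assumed to be a non-empty list of numeric samples.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass
class ScalingDecision:
    delta: int   # instances to add (positive) or remove (negative)
    reason: str  # human-readable explanation, useful for logging


def decide(samples: Sequence[float], high: float, low: float) -> ScalingDecision:
    """Analyze step: compare the average of the samples with two thresholds."""
    average = mean(samples)  # assumes at least one sample was collected
    if average > high:
        return ScalingDecision(1, f"average {average:.1f} is above {high}")
    if average < low:
        return ScalingDecision(-1, f"average {average:.1f} is below {low}")
    return ScalingDecision(0, f"average {average:.1f} is within thresholds")


def autoscale_once(collect: Callable[[], Sequence[float]],
                   current_instances: int,
                   apply_scale: Callable[[int], None],
                   high: float,
                   low: float,
                   minimum_instances: int = 1) -> int:
    """One pass of collect -> analyze -> scale. Returns the new instance count."""
    decision = decide(collect(), high, low)
    target = max(minimum_instances, current_instances + decision.delta)
    if target != current_instances:
        apply_scale(target)  # hypothetical hook that resizes the deployment
    return target


# Usage: the lambda stands in for a real metrics source.
applied = []
new_count = autoscale_once(lambda: [82.0, 91.5, 77.0], 3, applied.append,
                           high=70.0, low=50.0)
```

In a real system this pass would run on a timer, and `apply_scale` would call whatever management interface the platform exposes.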

Configure autoscaling for an Azure solution

Azure provides built-in autoscaling for most compute options.

Several of these compute options, including virtual machine scale sets, Azure App Service, and Azure Cloud Services, use Azure Monitor autoscale to provide a common set of autoscaling functionality.

  • Azure Functions differs from the previous compute options, because you don't need to configure any autoscale rules. Instead, Azure Functions automatically allocates compute power when your code is running, scaling out as necessary to handle load. For more information, see Choose the correct hosting plan for Azure Functions.

Finally, a custom autoscaling solution can sometimes be useful. For example, you could use Azure diagnostics and application-based metrics, along with custom code to monitor and export the application metrics. Then you could define custom rules based on these metrics, and use Resource Manager REST APIs to trigger autoscaling. However, a custom solution is not simple to implement, and should be considered only if none of the previous approaches can fulfill your requirements.

Use the built-in autoscaling features of the platform, if they meet your requirements. If not, carefully consider whether you really need more complex scaling features. Examples of additional requirements may include more granularity of control, different ways to detect trigger events for scaling, scaling across subscriptions, and scaling other types of resources.

Use Azure Monitor autoscale

Azure Monitor autoscale provides a common set of autoscaling functionality for VM Scale Sets, Azure App Service, and Azure Cloud Services. Scaling can be performed on a schedule, or based on a runtime metric, such as CPU or memory usage. Examples:

  • Scale out to 10 instances on weekdays, and scale in to 4 instances on Saturday and Sunday.
  • Scale out by one instance if average CPU usage is above 70%, and scale in by one instance if CPU usage falls below 50%.
  • Scale out by one instance if the number of messages in a queue exceeds a certain threshold.
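
The three example rules above can be written as plain functions to make their logic concrete. This is only a sketch of the rule semantics, not how Azure Monitor evaluates them; the function names are hypothetical, and the queue threshold of 100 messages is an assumed value because the example leaves it unspecified.

```python
from datetime import date


def scheduled_instance_count(day: date, weekday_count: int = 10,
                             weekend_count: int = 4) -> int:
    """Schedule rule: 10 instances Monday to Friday, 4 on Saturday and Sunday."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return weekend_count if day.weekday() >= 5 else weekday_count


def metric_adjustment(average_cpu: float, queue_length: int,
                      queue_threshold: int = 100) -> int:
    """Metric rules: return 1 to scale out, -1 to scale in, 0 to hold."""
    # Scale-out conditions are checked first, so a long queue wins
    # even when CPU usage alone would suggest scaling in.
    if average_cpu > 70 or queue_length > queue_threshold:
        return 1
    if average_cpu < 50:
        return -1
    return 0


# Usage: 8 January 2024 was a Monday, 6 January 2024 a Saturday.
monday_count = scheduled_instance_count(date(2024, 1, 8))
saturday_count = scheduled_instance_count(date(2024, 1, 6))
```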

For a list of built-in metrics, see Azure Monitor autoscaling common metrics. You can also implement custom metrics by using Application Insights.

You can configure autoscaling by using PowerShell, the Azure CLI, an Azure Resource Manager template, or the Azure portal. For more detailed control, use the Azure Resource Manager REST API. The Azure Monitoring Service Management Library and the Microsoft Insights Library (in preview) are SDKs that let you collect metrics from different resources and perform autoscaling by using the REST APIs. For resources where Azure Resource Manager support isn't available, or if you are using Azure Cloud Services, the Service Management REST API can be used for autoscaling. In all other cases, use Azure Resource Manager.
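
As an illustration of the Resource Manager template route, the following sketch builds the body of an autoscale setting as a Python dictionary and prints it as JSON. The field names are intended to follow the Microsoft.Insights/autoscaleSettings resource schema, but treat them as an assumption and check them against the current template reference before use. The setting name, the resource ID placeholder, the 10-minute window, and the 5-minute cooldown are hypothetical values.

```python
import json
from typing import Any, Dict


def cpu_rule(resource_id: str, operator: str, threshold: float,
             direction: str) -> Dict[str, Any]:
    """Build one CPU-based rule that changes the instance count by one."""
    return {
        "metricTrigger": {
            "metricName": "Percentage CPU",
            "metricResourceUri": resource_id,
            "timeGrain": "PT1M",       # sampling granularity (ISO 8601 duration)
            "statistic": "Average",
            "timeWindow": "PT10M",     # hypothetical aggregation window
            "timeAggregation": "Average",
            "operator": operator,
            "threshold": threshold,
        },
        "scaleAction": {
            "direction": direction,
            "type": "ChangeCount",
            "value": "1",
            "cooldown": "PT5M",        # hypothetical wait between scale actions
        },
    }


def build_autoscale_setting(resource_id: str, minimum: int, maximum: int,
                            default: int) -> Dict[str, Any]:
    """Assemble a setting with one profile: scale out above 70% CPU, in below 50%."""
    return {
        "type": "Microsoft.Insights/autoscaleSettings",
        "name": "example-autoscale",   # hypothetical name
        "properties": {
            "enabled": True,
            "targetResourceUri": resource_id,
            "profiles": [{
                "name": "default",
                "capacity": {"minimum": str(minimum),
                             "maximum": str(maximum),
                             "default": str(default)},
                "rules": [
                    cpu_rule(resource_id, "GreaterThan", 70, "Increase"),
                    cpu_rule(resource_id, "LessThan", 50, "Decrease"),
                ],
            }],
        },
    }


# Usage: the resource ID is a placeholder, not a real resource.
setting = build_autoscale_setting("<scale-set-resource-id>", 2, 10, 2)
print(json.dumps(setting, indent=2))
```

This is a fragment, not a complete template: a deployable template would also need an API version, a location, and the surrounding template structure.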

Consider the following points when using Azure autoscale:

  • Consider whether you can predict the load on the application well enough to use scheduled autoscaling, adding and removing instances to meet anticipated peaks in demand. If this isn't possible, use reactive autoscaling based on runtime metrics, in order to handle unpredictable changes in demand. Typically, you can combine these approaches. For example, create a strategy that adds resources based on a schedule of the times when you know the application is most busy. This helps to ensure that capacity is available when required, without any delay from starting new instances. For each scheduled rule, define metrics that allow reactive autoscaling during that period to ensure that the application can handle sustained but unpredictable peaks in demand.
  • It's often difficult to understand the relationship between metrics and capacity requirements, especially when an application is initially deployed. Provision a little extra capacity at the beginning, and then monitor and tune the autoscaling rules to bring the capacity closer to the actual load.
  • Configure the autoscaling rules, and then monitor the performance of your application over time. Use the results of this monitoring to adjust the way in which the system scales if necessary. However, keep in mind that autoscaling is not an instantaneous process. It takes time to react to a metric such as average CPU utilization exceeding (or falling below) a specified threshold.
  • Autoscaling rules that use a detection mechanism based on a measured trigger attribute (such as CPU usage or queue length) use a value aggregated over time, rather than instantaneous values, to trigger an autoscaling action. By default, the aggregate is an average of the values. This prevents the system from reacting too quickly, or causing rapid oscillation. It also allows time for new instances that are auto-started to settle into running mode, preventing additional autoscaling actions from occurring while the new instances are starting up. For Azure Cloud Services and Azure Virtual Machines, the default period for the aggregation is 45 minutes, so it can take up to this period of time for the metric to trigger autoscaling in response to spikes in demand. You can change the aggregation period by using the SDK, but be aware that periods shorter than 25 minutes may cause unpredictable results (for more information, see Auto Scaling Cloud Services on CPU Percentage with the Azure Monitoring Services Management Library). For Web Apps, the averaging period is much shorter, allowing new instances to be available in about five minutes after a change to the average trigger measure.
  • If you configure autoscaling using the SDK rather than the portal, you can specify a more detailed schedule during which the rules are active. You can also create your own metrics and use them with or without any of the existing ones in your autoscaling rules. For example, you may wish to use alternative counters, such as the number of requests per second or the average memory availability, or use custom counters that measure specific business processes.
  • When autoscaling Service Fabric, the node types in your cluster are made of VM scale sets at the backend, so you need to set up autoscale rules for each node type. Take into account the number of nodes that you must have before you set up autoscaling. The minimum number of nodes that you must have for the primary node type is driven by the reliability level you have chosen. For more information, see Scale a Service Fabric cluster in or out using auto-scale rules.
  • You can use the portal to link resources such as SQL Database instances and queues to a Cloud Service instance. This allows you to more easily access the separate manual and automatic scaling configuration options for each of the linked resources. For more information, see How to: Link a resource to a cloud service.
  • When you configure multiple policies and rules, they could conflict with each other. Autoscale uses the following conflict resolution rules to ensure that there is always a sufficient number of instances running:
    • Scale-out operations always take precedence over scale-in operations.
    • When scale-out operations conflict, the rule that initiates the largest increase in the number of instances takes precedence.
    • When scale-in operations conflict, the rule that initiates the smallest decrease in the number of instances takes precedence.
  • In an App Service Environment, any worker pool or front-end metrics can be used to define autoscale rules. For more information, see Autoscaling and App Service Environment.
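
Two of the points above, aggregating a metric over a period and resolving conflicting rules, can be illustrated with a short sketch. It models the behavior described in this list and is not the Azure Monitor implementation. The class `AggregatedMetric` and the function `resolve_conflicts` are hypothetical names, and timestamps are assumed to be non-decreasing numbers of seconds.

```python
from collections import deque
from typing import Deque, Iterable, Optional, Tuple


class AggregatedMetric:
    """Keeps samples from the most recent period and reports their average."""

    def __init__(self, period_seconds: float) -> None:
        self.period_seconds = period_seconds
        self._samples: Deque[Tuple[float, float]] = deque()

    def add(self, timestamp: float, value: float) -> None:
        self._samples.append((timestamp, value))
        # Drop samples that have fallen out of the aggregation period.
        cutoff = timestamp - self.period_seconds
        while self._samples and self._samples[0][0] <= cutoff:
            self._samples.popleft()

    def average(self) -> Optional[float]:
        if not self._samples:
            return None
        return sum(value for _, value in self._samples) / len(self._samples)


def resolve_conflicts(proposed_deltas: Iterable[int]) -> int:
    """Pick one instance-count change from the changes proposed by several rules."""
    deltas = list(proposed_deltas)
    scale_out = [d for d in deltas if d > 0]
    scale_in = [d for d in deltas if d < 0]
    if scale_out:
        return max(scale_out)  # scale out wins; largest increase takes precedence
    if scale_in:
        return max(scale_in)   # closest to zero, that is, the smallest decrease
    return 0


# Usage: one spike to 100 inside a 5-minute period does not push the average over 70.
cpu = AggregatedMetric(period_seconds=300)
cpu.add(0, 20.0)
cpu.add(60, 20.0)
cpu.add(120, 100.0)
spike_average = cpu.average()

chosen = resolve_conflicts([1, -2, 3])  # the scale-out by 3 is selected
```

Averaging over a period is what makes the rule slow to react to a short spike, which is the trade-off the aggregation bullet describes.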

Application design considerations

Autoscaling isn't an instant solution. Simply adding resources to a system or running more instances of a process doesn't guarantee that the performance of the system will improve. Consider the following points when designing an autoscaling strategy:

  • The system must be designed to be horizontally scalable. Avoid making assumptions about instance affinity; do not design solutions that require that the code is always running in a specific instance of a process. When scaling a cloud service or web site horizontally, don't assume that a series of requests from the same source will always be routed to the same instance. For the same reason, design services to be stateless to avoid requiring a series of requests from an application to always be routed to the same instance of a service. When designing a service that reads messages from a queue and processes them, don't make any assumptions about which instance of the service handles a specific message. Autoscaling could start additional instances of a service as the queue length grows. The Competing Consumers Pattern describes how to handle this scenario.
  • If the solution implements a long-running task, design this task to support both scaling out and scaling in. Without due care, such a task could prevent an instance of a process from being shut down cleanly when the system scales in, or it could lose data if the process is forcibly terminated. Ideally, refactor a long-running task and break up the processing that it performs into smaller, discrete chunks. The Pipes and Filters Pattern provides an example of how you can achieve this.
  • Alternatively, you can implement a checkpoint mechanism that records state information about the task at regular intervals, and saves this state in durable storage that can be accessed by any instance of the process running the task. In this way, if the process is shut down, the work that it was performing can be resumed from the last checkpoint by using another instance.
  • When background tasks run on separate compute instances, such as in worker roles of a cloud services hosted application, you may need to scale different parts of the application using different scaling policies. For example, you may need to deploy additional user interface (UI) compute instances without increasing the number of background compute instances, or the opposite of this. If you offer different levels of service (such as basic and premium service packages), you may need to scale out the compute resources for premium service packages more aggressively than those for basic service packages in order to meet SLAs.
  • Consider using the length of the queue over which UI and background compute instances communicate as a criterion for your autoscaling strategy. This is the best indicator of an imbalance or difference between the current load and the processing capacity of the background task.
  • If you base your autoscaling strategy on counters that measure business processes, such as the number of orders placed per hour or the average execution time of a complex transaction, ensure that you fully understand the relationship between the results from these types of counters and the actual compute capacity requirements. It may be necessary to scale more than one component or compute unit in response to changes in business process counters.
  • To prevent a system from attempting to scale out excessively, and to avoid the costs associated with running many thousands of instances, consider limiting the maximum number of instances that can be automatically added. Most autoscaling mechanisms allow you to specify the minimum and maximum number of instances for a rule. In addition, consider gracefully degrading the functionality that the system provides if the maximum number of instances has been deployed, and the system is still overloaded.
  • Keep in mind that autoscaling might not be the most appropriate mechanism to handle a sudden burst in workload. It takes time to provision and start new instances of a service or add resources to a system, and the peak demand may have passed by the time these additional resources have been made available. In this scenario, it may be better to throttle the service. For more information, see the Throttling Pattern.
  • Conversely, if you do need the capacity to process all requests when the volume fluctuates rapidly, and cost isn't a major contributing factor, consider using an aggressive autoscaling strategy that starts additional instances more quickly. You can also use a scheduled policy that starts a sufficient number of instances to meet the maximum load before that load is expected.
  • The autoscaling mechanism should monitor the autoscaling process, and log the details of each autoscaling event (what triggered it, what resources were added or removed, and when). If you create a custom autoscaling mechanism, ensure that it incorporates this capability. Analyze the information to help measure the effectiveness of the autoscaling strategy, and tune it if necessary. You can tune both in the short term, as the usage patterns become more obvious, and over the long term, as the business expands or the requirements of the application evolve. If an application reaches the upper limit defined for autoscaling, the mechanism might also alert an operator who could manually start additional resources if necessary. Note that, under these circumstances, the operator may also be responsible for manually removing these resources after the workload eases.
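
The checkpoint mechanism mentioned in this list can be sketched as follows. This is a minimal illustration under stated assumptions: a local JSON file stands in for durable storage that every instance can reach, the task is a simple list of items, and the names `load_checkpoint`, `save_checkpoint`, `process_items`, and `should_stop` are hypothetical.

```python
import json
import os
import tempfile
from typing import Callable, Sequence


def load_checkpoint(path: str) -> int:
    """Return the index of the next item to process, or 0 if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path: str, next_index: int) -> None:
    """Write the checkpoint to a temporary file, then swap it into place."""
    temp_path = path + ".tmp"
    with open(temp_path, "w") as f:
        json.dump({"next_index": next_index}, f)
    os.replace(temp_path, path)  # avoids leaving a half-written checkpoint


def process_items(items: Sequence[str],
                  handle: Callable[[str], None],
                  checkpoint_path: str,
                  should_stop: Callable[[], bool] = lambda: False) -> int:
    """Process items in small chunks, recording progress after each one."""
    index = load_checkpoint(checkpoint_path)
    while index < len(items) and not should_stop():
        handle(items[index])
        index += 1
        save_checkpoint(checkpoint_path, index)
    return index


# Usage: the first "instance" is stopped after two items, as if scaled in;
# a second "instance" resumes from the checkpoint instead of starting over.
checkpoint = os.path.join(tempfile.mkdtemp(), "task.checkpoint.json")
first, second = [], []
process_items(list("abcde"), first.append, checkpoint,
              should_stop=lambda: len(first) >= 2)
process_items(list("abcde"), second.append, checkpoint)
```

If an instance is terminated after handling an item but before saving the checkpoint, that item is processed again on resume, so `handle` should be safe to repeat.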

The following patterns and guidance may also be relevant to your scenario when implementing autoscaling:

  • Throttling Pattern. This pattern describes how an application can continue to function and meet SLAs when an increase in demand places an extreme load on resources. Throttling can be used with autoscaling to prevent a system from being overwhelmed while the system scales out.
  • Competing Consumers Pattern. This pattern describes how to implement a pool of service instances that can handle messages from any application instance. Autoscaling can be used to start and stop service instances to match the anticipated workload. This approach enables a system to process multiple messages concurrently to optimize throughput, improve scalability and availability, and balance the workload.
  • Monitoring and diagnostics. Instrumentation and telemetry are vital for gathering the information that can drive the autoscaling process.
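
The idea behind the Competing Consumers Pattern can be illustrated in a single process, with threads standing in for service instances and an in-memory queue standing in for a message queue. This is a simplified sketch, not a distributed implementation; the function name `run_consumers` and its parameters are hypothetical. Changing `worker_count` plays the role of scaling out or in.

```python
import queue
import threading
from typing import Callable, Iterable, List, Tuple


def run_consumers(messages: Iterable[int],
                  worker_count: int,
                  handle: Callable[[int], int]) -> List[Tuple[int, int]]:
    """Let worker_count consumers drain one shared queue; return (worker_id, result) pairs."""
    pending: "queue.Queue[int]" = queue.Queue()
    for message in messages:
        pending.put(message)

    results: List[Tuple[int, int]] = []
    results_lock = threading.Lock()

    def worker(worker_id: int) -> None:
        while True:
            try:
                message = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, so this consumer stops
            output = handle(message)
            with results_lock:
                results.append((worker_id, output))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Usage: four consumers share twenty messages. Which consumer handles which
# message is not fixed, so the code makes no assumption about it.
results = run_consumers(range(20), worker_count=4, handle=lambda x: x * x)
processed = sorted(output for _, output in results)
```

Each message is handled exactly once regardless of the number of consumers, which is why the consumers can be added or removed without changing the producer.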