Create an automatic scaling formula for scaling compute nodes in a Batch pool

Azure Batch can automatically scale pools based on parameters that you define. With automatic scaling, Batch dynamically adds nodes to a pool as task demands increase, and removes compute nodes as demands decrease. Automatically adjusting the number of compute nodes used by your Batch application can save you both time and money.

You enable automatic scaling on a pool of compute nodes by associating the pool with an autoscale formula that you define. The Batch service uses the autoscale formula to determine the number of compute nodes that are needed to execute your workload. Compute nodes may be dedicated nodes or low-priority nodes. Batch responds to service metrics data that is collected periodically. Using this metrics data, Batch adjusts the number of compute nodes in the pool based on your formula, at a configurable interval.

You can enable automatic scaling either when a pool is created, or on an existing pool. You can also change an existing formula on a pool that is configured for autoscaling. Batch enables you to evaluate your formulas before assigning them to pools and to monitor the status of automatic scaling runs.

This article discusses the various entities that make up your autoscale formulas, including variables, operators, operations, and functions. We discuss how to obtain various compute resource and task metrics within Batch. You can use these metrics to adjust your pool's node count based on resource usage and task status. We then describe how to construct a formula and enable automatic scaling on a pool by using both the Batch REST and .NET APIs. Finally, we finish up with a few example formulas.

Important

When you create a Batch account, you can specify the account configuration, which determines whether pools are allocated in a Batch service subscription (the default) or in your user subscription. If you created your Batch account with the default Batch Service configuration, your account is limited to a maximum number of cores that can be used for processing. The Batch service scales compute nodes only up to that core limit. For this reason, the Batch service may not reach the target number of compute nodes specified by an autoscale formula. See Quotas and limits for the Azure Batch service for information on viewing and increasing your account quotas.

If you created your account with the User Subscription configuration, your account shares in the core quota for the subscription. For more information, see Virtual Machines limits in Azure subscription and service limits, quotas, and constraints.

Automatic scaling formulas

An automatic scaling formula is a string value that you define that contains one or more statements. The autoscale formula is assigned to a pool's autoScaleFormula element (Batch REST) or CloudPool.AutoScaleFormula property (Batch .NET). The Batch service uses your formula to determine the target number of compute nodes in the pool for the next interval of processing. The formula string cannot exceed 8 KB, can include up to 100 statements that are separated by semicolons, and can include line breaks and comments.

You can think of automatic scaling formulas as a Batch autoscale "language." Formula statements are free-form expressions that can include both service-defined variables (variables defined by the Batch service) and user-defined variables (variables that you define). They can perform various operations on these values by using built-in types, operators, and functions. For example, a statement might take the following form:

$myNewVariable = function($ServiceDefinedVariable, $myCustomVariable);

Formulas generally contain multiple statements that perform operations on values that are obtained in previous statements. For example, first we obtain a value for variable1, then pass it to a function to populate variable2:

$variable1 = function1($ServiceDefinedVariable);
$variable2 = function2($OtherServiceDefinedVariable, $variable1);

Include these statements in your autoscale formula to arrive at a target number of compute nodes. Dedicated nodes and low-priority nodes each have their own target settings, so you can define a target for each type of node. An autoscale formula can include a target value for dedicated nodes, a target value for low-priority nodes, or both.

The target number of nodes may be higher, lower, or the same as the current number of nodes of that type in the pool. Batch evaluates a pool's autoscale formula at a specific interval (see automatic scaling intervals). Batch adjusts the target number of each type of node in the pool to the number that your autoscale formula specifies at the time of evaluation.

Sample autoscale formulas

Below are two example autoscale formulas, which can be adjusted to work for most scenarios. The variables startingNumberOfVMs and maxNumberofVMs in the example formulas can be adjusted to your needs.

Pending tasks

startingNumberOfVMs = 1;
maxNumberofVMs = 25;
pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(180 * TimeInterval_Second);
pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(180 * TimeInterval_Second));
$TargetDedicatedNodes = min(maxNumberofVMs, pendingTaskSamples);

With this autoscale formula, the pool is initially created with a single VM. The $PendingTasks metric defines the number of tasks that are running or queued. The formula finds the average number of pending tasks in the last 180 seconds and sets the $TargetDedicatedNodes variable accordingly. If fewer than 70 percent of the samples for that period are available, the formula falls back to startingNumberOfVMs instead. The formula ensures that the target number of dedicated nodes never exceeds 25 VMs. As new tasks are submitted, the pool automatically grows. As tasks complete, VMs become free one by one and the autoscaling formula shrinks the pool.

This formula scales dedicated nodes, but can be modified to scale low-priority nodes as well.
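To make the decision logic of the Pending tasks formula easier to follow, here is a minimal Python sketch of a single evaluation. It is an illustration only, not Batch code: the function name pending_tasks_target and its parameters are hypothetical, and it assumes that a complete 180-second window holds six samples (one every 30 seconds).

```python
def pending_tasks_target(samples, expected_samples=6, starting_vms=1, max_vms=25):
    """Mirror one evaluation of the 'Pending tasks' formula (illustrative only).

    samples: the $PendingTasks values that are actually available for the
             last 180 seconds.
    expected_samples: how many samples a complete window would hold
                      (assumed: 180 s / 30 s = 6).
    """
    # Stand-in for $PendingTasks.GetSamplePercent(180 * TimeInterval_Second).
    sample_percent = 100.0 * len(samples) / expected_samples

    if sample_percent < 70:
        # Too little data: fall back to the starting size.
        pending = starting_vms
    else:
        # Stand-in for avg($PendingTasks.GetSample(180 * TimeInterval_Second)).
        pending = sum(samples) / len(samples)

    # $TargetDedicatedNodes = min(maxNumberofVMs, pendingTaskSamples)
    return min(max_vms, pending)
```

For example, six samples that average 6 pending tasks give a target of 6, a sustained backlog of 40 tasks is capped at 25, and a window with only two of six samples falls back to 1.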

Preempted nodes

maxNumberofVMs = 25;
$TargetDedicatedNodes = min(maxNumberofVMs, $PreemptedNodeCount.GetSample(180 * TimeInterval_Second));
$TargetLowPriorityNodes = min(maxNumberofVMs, maxNumberofVMs - $TargetDedicatedNodes);

This example creates a pool that starts with 25 low-priority nodes. Every time a low-priority node is preempted, it is replaced with a dedicated node. As with the first example, the maxNumberofVMs variable prevents the pool from exceeding 25 VMs. This example is useful for taking advantage of low-priority VMs while also ensuring that only a fixed number of preemptions occur over the lifetime of the pool.
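The arithmetic of the Preempted nodes formula can be sketched in Python as follows. The function name preempted_nodes_targets is hypothetical, and the sketch assumes that min() over a sample vector returns the smallest value among the cap and all samples, as the description of list arguments later in this article suggests.

```python
def preempted_nodes_targets(preempted_samples, max_vms=25):
    """Return (target_dedicated, target_low_priority) for one evaluation.

    preempted_samples: the $PreemptedNodeCount values from the last
                       180 seconds (illustrative input).
    """
    # min(maxNumberofVMs, $PreemptedNodeCount.GetSample(...)):
    # the smallest value among the cap and every sample.
    target_dedicated = min([max_vms, *preempted_samples])

    # Whatever is not covered by dedicated nodes stays low-priority.
    target_low_priority = min(max_vms, max_vms - target_dedicated)
    return target_dedicated, target_low_priority
```

With no preemptions the result is 0 dedicated and 25 low-priority nodes. Once 3 nodes have been preempted, the split becomes 3 and 22, so the total stays at 25.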

Variables

You can use both service-defined and user-defined variables in your autoscale formulas. The service-defined variables are built in to the Batch service. Some service-defined variables are read-write, and some are read-only. User-defined variables are variables that you define. In the example formulas shown in the previous section, $TargetDedicatedNodes and $PendingTasks are service-defined variables. The variables startingNumberOfVMs and maxNumberofVMs are user-defined variables.

Note

Service-defined variables are always preceded by a dollar sign ($). For user-defined variables, the dollar sign is optional.

The following tables show both read-write and read-only variables that are defined by the Batch service.

You can get and set the values of these service-defined variables to manage the number of compute nodes in a pool:

Read-write service-defined variable | Description
$TargetDedicatedNodes | The target number of dedicated compute nodes for the pool. The number of dedicated nodes is specified as a target because a pool may not always achieve the desired number of nodes. For example, if the target number of dedicated nodes is modified by an autoscale evaluation before the pool has reached the initial target, then the pool may not reach the target. A pool in an account created with the Batch Service configuration may not achieve its target if the target exceeds a Batch account node or core quota. A pool in an account created with the User Subscription configuration may not achieve its target if the target exceeds the shared core quota for the subscription.
$TargetLowPriorityNodes | The target number of low-priority compute nodes for the pool. The number of low-priority nodes is specified as a target because a pool may not always achieve the desired number of nodes. For example, if the target number of low-priority nodes is modified by an autoscale evaluation before the pool has reached the initial target, then the pool may not reach the target. A pool may also not achieve its target if the target exceeds a Batch account node or core quota. For more information on low-priority compute nodes, see Use low-priority VMs with Batch (Preview).
$NodeDeallocationOption | The action that occurs when compute nodes are removed from a pool. Possible values are:
  • requeue: Terminates tasks immediately and puts them back on the job queue so that they are rescheduled.
  • terminate: Terminates tasks immediately and removes them from the job queue.
  • taskcompletion: Waits for currently running tasks to finish and then removes the node from the pool.
  • retaineddata: Waits for all the local task-retained data on the node to be cleaned up before removing the node from the pool.

You can get the value of these service-defined variables to make adjustments that are based on metrics from the Batch service:

Read-only service-defined variable | Description
$CPUPercent | The average percentage of CPU usage.
$WallClockSeconds | The number of seconds consumed.
$MemoryBytes | The average number of megabytes used.
$DiskBytes | The average number of gigabytes used on the local disks.
$DiskReadBytes | The number of bytes read.
$DiskWriteBytes | The number of bytes written.
$DiskReadOps | The count of read disk operations performed.
$DiskWriteOps | The count of write disk operations performed.
$NetworkInBytes | The number of inbound bytes.
$NetworkOutBytes | The number of outbound bytes.
$SampleNodeCount | The count of compute nodes.
$ActiveTasks | The number of tasks that are ready to execute but are not yet executing. The $ActiveTasks count includes all tasks that are in the active state and whose dependencies have been satisfied. Any tasks that are in the active state but whose dependencies have not been satisfied are excluded from the $ActiveTasks count.
$RunningTasks | The number of tasks in a running state.
$PendingTasks | The sum of $ActiveTasks and $RunningTasks.
$SucceededTasks | The number of tasks that finished successfully.
$FailedTasks | The number of tasks that failed.
$CurrentDedicatedNodes | The current number of dedicated compute nodes.
$CurrentLowPriorityNodes | The current number of low-priority compute nodes, including any nodes that have been preempted.
$PreemptedNodeCount | The number of nodes in the pool that are in a preempted state.

Tip

The read-only, service-defined variables shown in the previous table are objects that provide various methods to access the data associated with each. For more information, see Obtain sample data later in this article.

Types

These types are supported in a formula:

  • double

  • doubleVec

  • doubleVecList

  • string

  • timestamp: a compound structure that contains the following members:

    • year
    • month (1-12)
    • day (1-31)
    • weekday (as a number; for example, 1 for Monday)
    • hour (in 24-hour number format; for example, 13 means 1 PM)
    • minute (00-59)
    • second (00-59)
  • timeinterval

    • TimeInterval_Zero
    • TimeInterval_100ns
    • TimeInterval_Microsecond
    • TimeInterval_Millisecond
    • TimeInterval_Second
    • TimeInterval_Minute
    • TimeInterval_Hour
    • TimeInterval_Day
    • TimeInterval_Week
    • TimeInterval_Year
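As a rough analogy for readers who know Python, the timestamp and timeinterval types behave much like datetime and timedelta from the standard library. The sketch below is only an analogy: the constant names are hypothetical Python stand-ins, and it covers only a few of the timeinterval constants listed above.

```python
from datetime import datetime, timedelta

# Hypothetical Python stand-ins for some of the timeinterval constants.
TIME_INTERVAL_SECOND = timedelta(seconds=1)
TIME_INTERVAL_MINUTE = timedelta(minutes=1)
TIME_INTERVAL_HOUR = timedelta(hours=1)
TIME_INTERVAL_DAY = timedelta(days=1)
TIME_INTERVAL_WEEK = timedelta(weeks=1)

# Multiplying a number by an interval yields an interval, which is how
# formulas express look-back windows such as 180 * TimeInterval_Second.
look_back = 180 * TIME_INTERVAL_SECOND

# A timestamp exposes members such as year, month, day, weekday, and hour.
# In Python, isoweekday() also uses 1 for Monday, and hour is 24-hour based.
ts = datetime(2024, 1, 1, 13, 0, 0)  # Monday, 1 January 2024, 1 PM
members = (ts.year, ts.month, ts.day, ts.isoweekday(), ts.hour)
```

Here look_back equals three minutes, and members is (2024, 1, 1, 1, 13).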

Operations

These operations are allowed on the types that are listed in the previous section.

Operation | Supported operators | Result type
double operator double | +, -, *, / | double
double operator timeinterval | * | timeinterval
doubleVec operator double | +, -, *, / | doubleVec
doubleVec operator doubleVec | +, -, *, / | doubleVec
timeinterval operator double | *, / | timeinterval
timeinterval operator timeinterval | +, - | timeinterval
timeinterval operator timestamp | + | timestamp
timestamp operator timeinterval | + | timestamp
timestamp operator timestamp | - | timeinterval
operator double (unary) | -, ! | double
operator timeinterval (unary) | - | timeinterval
double operator double | <, <=, ==, >=, >, != | double
string operator string | <, <=, ==, >=, >, != | double
timestamp operator timestamp | <, <=, ==, >=, >, != | double
timeinterval operator timeinterval | <, <=, ==, >=, >, != | double
double operator double | &&, || | double

When testing a double with a ternary operator (double ? statement1 : statement2), nonzero is true, and zero is false.
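Because comparisons produce a double and the ternary operator treats any nonzero double as true, the two compose naturally. The following Python sketch illustrates that rule. The helper names are hypothetical, and the choice of 1.0 and 0.0 as the results of a comparison is an assumption made for illustration; the table above states only that the result type is double.

```python
def less_than(a, b):
    """Comparison that yields a double (assumed: 1.0 for true, 0.0 for false)."""
    return 1.0 if a < b else 0.0


def ternary(condition, if_true, if_false):
    """Mimic 'double ? statement1 : statement2': nonzero is true, zero is false."""
    return if_true if condition != 0 else if_false


# Equivalent in spirit to: samplePercent < 70 ? startingNumberOfVMs : average
starting_vms, average = 1, 12
target_sparse = ternary(less_than(50, 70), starting_vms, average)
target_full = ternary(less_than(100, 70), starting_vms, average)
```

In this sketch target_sparse is 1 and target_full is 12. Any nonzero condition, including a negative value such as -2.5, selects the first branch.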

Functions

These predefined functions are available for you to use in defining an automatic scaling formula.

Function | Return type | Description
avg(doubleVecList) | double | Returns the average value for all values in the doubleVecList.
len(doubleVecList) | double | Returns the length of the vector that is created from the doubleVecList.
lg(double) | double | Returns the log base 2 of the double.
lg(doubleVecList) | doubleVec | Returns the component-wise log base 2 of the doubleVecList. A vec(double) must be explicitly passed for the parameter. Otherwise, the double lg(double) version is assumed.
ln(double) | double | Returns the natural log of the double.
ln(doubleVecList) | doubleVec | Returns the component-wise natural log of the doubleVecList. A vec(double) must be explicitly passed for the parameter. Otherwise, the double ln(double) version is assumed.
log(double) | double | Returns the log base 10 of the double.
log(doubleVecList) | doubleVec | Returns the component-wise log base 10 of the doubleVecList. A vec(double) must be explicitly passed for the single double parameter. Otherwise, the double log(double) version is assumed.
max(doubleVecList) | double | Returns the maximum value in the doubleVecList.
min(doubleVecList) | double | Returns the minimum value in the doubleVecList.
norm(doubleVecList) | double | Returns the two-norm of the vector that is created from the doubleVecList.
percentile(doubleVec v, double p) | double | Returns the percentile element of the vector v.
rand() | double | Returns a random value between 0.0 and 1.0.
range(doubleVecList) | double | Returns the difference between the min and max values in the doubleVecList.
std(doubleVecList) | double | Returns the sample standard deviation of the values in the doubleVecList.
stop() | | Stops evaluation of the autoscaling expression.
sum(doubleVecList) | double | Returns the sum of all the components of the doubleVecList.
time(string dateTime="") | timestamp | Returns the time stamp of the current time if no parameters are passed, or the time stamp of the dateTime string if it is passed. Supported dateTime formats are W3C-DTF and RFC 1123.
val(doubleVec v, double i) | double | Returns the value of the element that is at location i in vector v, with a starting index of zero.

Some of the functions that are described in the previous table can accept a list as an argument. The comma-separated list is any combination of double and doubleVec. For example:

doubleVecList := ( (double | doubleVec)+(, (double | doubleVec) )* )?

The doubleVecList value is converted to a single doubleVec before evaluation. For example, if v = [1,2,3], then calling avg(v) is equivalent to calling avg(1,2,3). Calling avg(v, 7) is equivalent to calling avg(1,2,3,7).
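The flattening rule can be sketched in Python as follows. The helpers flatten and avg are hypothetical stand-ins written for this illustration, not part of any Batch SDK. They treat a Python list or tuple as a doubleVec and a plain number as a double.

```python
def flatten(*args):
    """Convert a mix of numbers and vectors into one flat list of floats."""
    flat = []
    for arg in args:
        if isinstance(arg, (list, tuple)):
            # A doubleVec contributes each of its components.
            flat.extend(float(x) for x in arg)
        else:
            # A plain double contributes itself.
            flat.append(float(arg))
    return flat


def avg(*args):
    """Average over the flattened argument list, as avg(doubleVecList) does."""
    values = flatten(*args)
    return sum(values) / len(values)


v = [1, 2, 3]
```

With these definitions, avg(v) and avg(1, 2, 3) both return 2.0, and avg(v, 7) and avg(1, 2, 3, 7) both return 3.25.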

Obtain sample data

Autoscale formulas act on metrics data (samples) that is provided by the Batch service. A formula grows or shrinks pool size based on the values that it obtains from the service. The service-defined variables that were described previously are objects that provide various methods to access data that is associated with that object. For example, the following expression shows a request to get the last five minutes of CPU usage:

$CPUPercent.GetSample(TimeInterval_Minute * 5)

Method | Description
GetSample() | The GetSample() method returns a vector of data samples.

A sample is 30 seconds worth of metrics data. In other words, samples are obtained every 30 seconds. But as noted below, there is a delay between when a sample is collected and when it is available to a formula. As such, not all samples for a given time period may be available for evaluation by a formula.
  • doubleVec GetSample(double count)
    Specifies the number of samples to obtain from the most recent samples that were collected.

    GetSample(1) returns the last available sample. For metrics like $CPUPercent, however, this should not be used because it is impossible to know when the sample was collected. It might be recent, or, because of system issues, it might be much older. It is better in such cases to use a time interval as shown below.
  • doubleVec GetSample((timestamp or timeinterval) startTime [, double samplePercent])
    Specifies a time frame for gathering sample data. Optionally, it also specifies the percentage of samples that must be available in the requested time frame.

    $CPUPercent.GetSample(TimeInterval_Minute * 10) would return 20 samples if all samples for the last 10 minutes are present in the CPUPercent history. If the last minute of history was not available, however, only 18 samples would be returned. In this case:

    $CPUPercent.GetSample(TimeInterval_Minute * 10, 95) would fail because only 90 percent of the samples are available.

    $CPUPercent.GetSample(TimeInterval_Minute * 10, 80) would succeed.
  • doubleVec GetSample((timestamp or timeinterval) startTime, (timestamp or timeinterval) endTime [, double samplePercent])
    Specifies a time frame for gathering data, with both a start time and an end time.

    As mentioned above, there is a delay between when a sample is collected and when it is available to a formula. Consider this delay when you use the GetSample method. See GetSamplePercent below.
GetSamplePeriod() | Returns the period of samples that were taken in a historical sample data set.
Count() | Returns the total number of samples in the metric history.
HistoryBeginTime() | Returns the time stamp of the oldest available data sample for the metric.
GetSamplePercent() | Returns the percentage of samples that are available for a given time interval. For example:

doubleVec GetSamplePercent( (timestamp or timeinterval) startTime [, (timestamp or timeinterval) endTime] )

Because the GetSample method fails if the percentage of samples returned is less than the samplePercent specified, you can use the GetSamplePercent method to check first. Then you can perform an alternate action if insufficient samples are present, without halting the automatic scaling evaluation.

Samples, sample percentage, and the GetSample() method

The core operation of an autoscale formula is to obtain task and resource metric data and then adjust pool size based on that data. As such, it is important to have a clear understanding of how autoscale formulas interact with metrics data (samples).

Samples

The Batch service periodically takes samples of task and resource metrics and makes them available to your autoscale formulas. These samples are recorded every 30 seconds by the Batch service. However, there is typically a delay between when those samples were recorded and when they are made available to (and can be read by) your autoscale formulas. Additionally, due to various factors such as network or other infrastructure issues, samples may not be recorded for a particular interval.

Sample percentage

When samplePercent is passed to the GetSample() method or the GetSamplePercent() method is called, percent refers to a comparison between the total possible number of samples recorded by the Batch service and the number of samples that are available to your autoscale formula.

Let's look at a 10-minute timespan as an example. Because samples are recorded every 30 seconds, the maximum number of samples that Batch records within a 10-minute timespan is 20 (2 per minute). However, due to the inherent latency of the reporting mechanism and other issues within Azure, there may be only 15 samples available to your autoscale formula for reading. So, for that 10-minute period, only 75 percent of the total number of samples recorded may be available to your formula.
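The arithmetic in this example can be written out as a short Python sketch. The function names are hypothetical; the only input taken from the text is the 30-second recording period.

```python
SAMPLE_PERIOD_SECONDS = 30  # samples are recorded every 30 seconds


def expected_sample_count(window_seconds):
    """Maximum number of samples that can be recorded in the window."""
    return window_seconds // SAMPLE_PERIOD_SECONDS


def sample_percent(available_samples, window_seconds):
    """Share of the possible samples that are actually available, in percent."""
    return 100.0 * available_samples / expected_sample_count(window_seconds)
```

For a 10-minute (600-second) window, expected_sample_count(600) is 20, sample_percent(15, 600) is 75.0, and sample_percent(18, 600) is 90.0, which matches the 18-of-20 case described for GetSample() earlier.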

GetSample() and sample ranges

Your autoscale formulas grow and shrink your pools by adding or removing nodes. Because nodes cost you money, you want to ensure that your formulas use an intelligent method of analysis that is based on sufficient data. Therefore, we recommend that you use a trending-type analysis in your formulas. This type of analysis grows and shrinks your pools based on a range of collected samples.

To do so, use GetSample(interval look-back start, interval look-back end) to return a vector of samples:

$runningTasksSample = $RunningTasks.GetSample(1 * TimeInterval_Minute, 6 * TimeInterval_Minute);

When the above line is evaluated by Batch, it returns a range of samples as a vector of values. For example:

$runningTasksSample=[1,1,1,1,1,1,1,1,1,1];

Once you've collected the vector of samples, you can then use functions like min(), max(), and avg() to derive meaningful values from the collected range.
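As an illustration of deriving values from a collected range, the following Python sketch reduces a vector of samples to its minimum, maximum, and average. The summarize helper is hypothetical and simply mirrors what min(), max(), and avg() would return for the same vector.

```python
def summarize(samples):
    """Reduce a vector of samples to the values a formula typically uses."""
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": sum(samples) / len(samples),
    }


# The vector from the example above: ten samples, each reporting 1 running task.
running_tasks_sample = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
summary = summarize(running_tasks_sample)
```

Here all three values equal 1. For a vector with a trend, such as [2, 4, 9], the minimum is 2, the maximum is 9, and the average is 5.0, so the choice of function determines how aggressively the formula reacts to spikes.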

As an additional safeguard, you can force a formula evaluation to fail if less than a certain sample percentage is available for a particular time period. When you force a formula evaluation to fail, you instruct Batch to cease further evaluation of the formula if the specified percentage of samples is not available. In this case, no change is made to the pool size. To specify a required percentage of samples for the evaluation to succeed, specify it as the third parameter to GetSample(). Here, a requirement of 75 percent of samples is specified:

$runningTasksSample = $RunningTasks.GetSample(60 * TimeInterval_Second, 120 * TimeInterval_Second, 75);

Because there may be a delay in sample availability, it is important to always specify a time range with a look-back start time that is older than one minute. It takes approximately one minute for samples to propagate through the system, so samples in the range (0 * TimeInterval_Second, 60 * TimeInterval_Second) may not be available. Again, you can use the percentage parameter of GetSample() to force a particular sample percentage requirement.
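To illustrate how a required sample percentage interacts with a look-back range, here is a minimal Python model. It is a sketch under stated assumptions, not the service's implementation: get_sample and InsufficientSamplesError are hypothetical names, the history is modeled as (age in seconds, value) pairs, and the window is treated as half-open, which is a simplification chosen for this example.

```python
class InsufficientSamplesError(Exception):
    """Raised when too few samples are available in the requested window."""


def get_sample(history, start_age, end_age, required_percent=0.0, period=30):
    """Return the values whose age falls inside the look-back window.

    history: list of (age_seconds, value) pairs (illustrative model).
    start_age, end_age: look-back bounds in seconds, in either order.
    required_percent: minimum share of expected samples that must be present.
    """
    newest, oldest = sorted((start_age, end_age))
    expected = (oldest - newest) // period
    if expected <= 0:
        raise ValueError("window must span at least one sample period")

    values = [value for age, value in history if newest <= age < oldest]
    available_percent = 100.0 * len(values) / expected
    if available_percent < required_percent:
        # Mirrors the documented behavior: the evaluation fails and the
        # pool size is left unchanged.
        raise InsufficientSamplesError(
            f"only {available_percent:.0f}% of samples available"
        )
    return values


# Samples older than one minute have propagated; newer ones have not yet.
history = [(age, 5.0) for age in range(60, 360, 30)]
```

With this history, get_sample(history, 60, 120, 75) returns two samples, while get_sample(history, 0, 60, 75) raises InsufficientSamplesError because nothing from the most recent minute is available yet.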

Important

We strongly recommend that you avoid relying only on GetSample(1) in your autoscale formulas. This is because GetSample(1) essentially says to the Batch service, "Give me the last sample you have, no matter how long ago you retrieved it." Since it is only a single sample, and it may be an older sample, it may not be representative of the larger picture of recent task or resource state. If you do use GetSample(1), make sure that it's part of a larger statement and not the only data point that your formula relies on.

Metrics

You can use both resource and task metrics when you're defining a formula. You adjust the target number of dedicated nodes in the pool based on the metrics data that you obtain and evaluate. See the Variables section above for more information on each metric.

Metric | Description
Resource

Resource metrics are based on the CPU, the bandwidth, the memory usage of compute nodes, and the number of nodes.

These service-defined variables are useful for making adjustments based on node count:

  • $TargetDedicatedNodes
  • $TargetLowPriorityNodes
  • $CurrentDedicatedNodes
  • $CurrentLowPriorityNodes
  • $PreemptedNodeCount
  • $SampleNodeCount

These service-defined variables are useful for making adjustments based on node resource usage:

  • $CPUPercent
  • $WallClockSeconds
  • $MemoryBytes
  • $DiskBytes
  • $DiskReadBytes
  • $DiskWriteBytes
  • $DiskReadOps
  • $DiskWriteOps
  • $NetworkInBytes
  • $NetworkOutBytes

Task

Task metrics are based on the status of tasks, such as Active, Pending, and Completed. The following service-defined variables are useful for making pool-size adjustments based on task metrics:

  • $ActiveTasks
  • $RunningTasks
  • $PendingTasks
  • $SucceededTasks
  • $FailedTasks

Write an autoscale formula

You build an autoscale formula by forming statements that use the above components, then combining those statements into a complete formula. In this section, we create an example autoscale formula that can perform some real-world scaling decisions.

First, let's define the requirements for our new autoscale formula. The formula should:

  1. Increase the target number of dedicated compute nodes in a pool if CPU usage is high.
  2. Decrease the target number of dedicated compute nodes in a pool when CPU usage is low.
  3. Always restrict the maximum number of dedicated nodes to 400.

To increase the number of nodes during high CPU usage, define the statement that populates a user-defined variable ($totalDedicatedNodes) with a value that is 110 percent of the current number of dedicated nodes ($CurrentDedicatedNodes), but only if the minimum CPU usage sampled during the last 10 minutes was above 70 percent. Otherwise, use the value for the current number of dedicated nodes.

$totalDedicatedNodes =
    (min($CPUPercent.GetSample(TimeInterval_Minute * 10)) > 0.7) ?
    ($CurrentDedicatedNodes * 1.1) : $CurrentDedicatedNodes;

To decrease the number of dedicated nodes during low CPU usage, the next statement in our formula sets the same $totalDedicatedNodes variable to 90 percent of the current number of dedicated nodes ($CurrentDedicatedNodes) if the average CPU usage in the past 60 minutes was under 20 percent. Otherwise, it keeps the current value of $totalDedicatedNodes that we populated in the statement above.

$totalDedicatedNodes =
    (avg($CPUPercent.GetSample(TimeInterval_Minute * 60)) < 0.2) ?
    ($CurrentDedicatedNodes * 0.9) : $totalDedicatedNodes;

Now limit the target number of dedicated compute nodes to a maximum of 400:

$TargetDedicatedNodes = min(400, $totalDedicatedNodes)

Here's the complete formula:

$totalDedicatedNodes =
    (min($CPUPercent.GetSample(TimeInterval_Minute * 10)) > 0.7) ?
    ($CurrentDedicatedNodes * 1.1) : $CurrentDedicatedNodes;
$totalDedicatedNodes =
    (avg($CPUPercent.GetSample(TimeInterval_Minute * 60)) < 0.2) ?
    ($CurrentDedicatedNodes * 0.9) : $totalDedicatedNodes;
$TargetDedicatedNodes = min(400, $totalDedicatedNodes)
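
To see how the three statements interact, the following Python sketch mirrors the arithmetic of the complete formula. It is a plain simulation, not something the Batch service runs: the function name and the idea of passing CPU samples as lists of fractions are assumptions made for illustration.

```python
def simulate_formula(cpu_samples_10min, cpu_samples_60min,
                     current_dedicated_nodes, max_nodes=400):
    """Mirror the three formula statements using plain Python numbers."""
    # Statement 1: grow by 10 percent if even the lowest recent sample is high.
    total = current_dedicated_nodes
    if min(cpu_samples_10min) > 0.7:
        total = current_dedicated_nodes * 1.1

    # Statement 2: shrink by 10 percent if the hourly average is low.
    average_60min = sum(cpu_samples_60min) / len(cpu_samples_60min)
    if average_60min < 0.2:
        total = current_dedicated_nodes * 0.9

    # Statement 3: never exceed the cap.
    return min(max_nodes, total)
```

Because the second statement runs after the first, a low 60-minute average takes precedence over a high 10-minute minimum if both conditions happen to hold.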

Create an autoscale-enabled pool with Batch SDKs

Pool autoscaling can be configured using any of the Batch SDKs, the Batch REST API, the Batch PowerShell cmdlets, or the Batch CLI. In this section, you can see examples for both .NET and Python.

.NET

To create a pool with autoscaling enabled in .NET, follow these steps:

  1. Create the pool with BatchClient.PoolOperations.CreatePool.
  2. Set the CloudPool.AutoScaleEnabled property to true.
  3. Set the CloudPool.AutoScaleFormula property with your autoscale formula.
  4. (Optional) Set the CloudPool.AutoScaleEvaluationInterval property (default is 15 minutes).
  5. Commit the pool with CloudPool.Commit or CommitAsync.

The following code snippet creates an autoscale-enabled pool in .NET. The pool's autoscale formula sets the target number of dedicated nodes to 5 on Mondays, and 1 on every other day of the week. The automatic scaling interval is set to 30 minutes. In this and the other C# snippets in this article, myBatchClient is a properly initialized instance of the BatchClient class.

CloudPool pool = myBatchClient.PoolOperations.CreatePool(
                    poolId: "mypool",
                    virtualMachineSize: "standard_d1_v2",
                    cloudServiceConfiguration: new CloudServiceConfiguration(osFamily: "5"));    
pool.AutoScaleEnabled = true;
pool.AutoScaleFormula = "$TargetDedicatedNodes = (time().weekday == 1 ? 5:1);";
pool.AutoScaleEvaluationInterval = TimeSpan.FromMinutes(30);
await pool.CommitAsync();

Important

When you create an autoscale-enabled pool, do not specify the targetDedicatedNodes parameter or the targetLowPriorityNodes parameter on the call to CreatePool. Instead, specify the AutoScaleEnabled and AutoScaleFormula properties on the pool. The values for these properties determine the target number of each type of node. Also, to manually resize an autoscale-enabled pool (for example, with BatchClient.PoolOperations.ResizePoolAsync), first disable automatic scaling on the pool, then resize it.

Automatic scaling interval

By default, the Batch service adjusts a pool's size according to its autoscale formula every 15 minutes. This interval is configurable through the pool's autoscale evaluation interval property (CloudPool.AutoScaleEvaluationInterval in Batch .NET).

The minimum interval is five minutes, and the maximum is 168 hours. If an interval outside this range is specified, the Batch service returns a Bad Request (400) error.
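
Since an out-of-range interval is rejected by the service, it can be convenient to check the value on the client first. The following is a minimal Python sketch of such a check; the constant and function names are hypothetical and not part of any Batch SDK.

```python
import datetime

# Bounds stated above: five minutes to 168 hours, inclusive.
MIN_INTERVAL = datetime.timedelta(minutes=5)
MAX_INTERVAL = datetime.timedelta(hours=168)


def is_valid_evaluation_interval(interval):
    """Return True if the interval falls inside the range Batch accepts."""
    return MIN_INTERVAL <= interval <= MAX_INTERVAL
```

For example, a 30-minute interval passes the check, while a one-minute interval does not and would produce the Bad Request (400) error if sent.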

Note

Autoscaling is not currently intended to respond to changes in less than a minute, but rather is intended to adjust the size of your pool gradually as you run a workload.

Python

Similarly, you can make an autoscale-enabled pool with the Python SDK by:

  1. Creating a pool and specifying its configuration.
  2. Adding the pool to the service client.
  3. Enabling autoscale on the pool with a formula you write.

import datetime

import azure.batch.models as batchmodels

# batch_service_client is assumed to be an initialized
# azure.batch.BatchServiceClient instance.
pool_id = "autoscale-enabled-pool"

# Create a pool; specify configuration
new_pool = batchmodels.PoolAddParameter(
    id=pool_id,
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(
            publisher="Canonical",
            offer="UbuntuServer",
            sku="18.04-LTS",
            version="latest"
        ),
        node_agent_sku_id="batch.node.ubuntu 18.04"),
    vm_size="STANDARD_D1_v2",
    target_dedicated_nodes=0,
    target_low_priority_nodes=0
)
batch_service_client.pool.add(new_pool)  # Add the pool to the service client

# Use 20 dedicated nodes during weekday working hours, 10 otherwise
formula = """$curTime = time();
             $workHours = $curTime.hour >= 8 && $curTime.hour < 18;
             $isWeekday = $curTime.weekday >= 1 && $curTime.weekday <= 5;
             $isWorkingWeekdayHour = $workHours && $isWeekday;
             $TargetDedicatedNodes = $isWorkingWeekdayHour ? 20:10;"""

# Enable autoscale; specify the formula
response = batch_service_client.pool.enable_auto_scale(
    pool_id,
    auto_scale_formula=formula,
    auto_scale_evaluation_interval=datetime.timedelta(minutes=10),
    pool_enable_auto_scale_options=None,
    custom_headers=None, raw=False)

Tip

More examples of using the Python SDK can be found in the Batch Python Quickstart repository on GitHub.

Enable autoscaling on an existing pool

Each Batch SDK provides a way to enable autoscaling. For example, Batch .NET provides BatchClient.PoolOperations.EnableAutoScaleAsync, and the Python SDK provides pool.enable_auto_scale.

When you enable autoscaling on an existing pool, keep in mind the following points:

  • If automatic scaling is currently disabled on the pool when you issue the request to enable autoscaling, you must specify a valid autoscale formula when you issue the request. You can optionally specify an autoscale evaluation interval. If you do not specify an interval, the default value of 15 minutes is used.

  • If autoscale is currently enabled on the pool, you can specify an autoscale formula, an evaluation interval, or both. You must specify at least one of these properties.

    • If you specify a new autoscale evaluation interval, then the existing evaluation schedule is stopped and a new schedule is started. The new schedule's start time is the time at which the request to enable autoscaling was issued.
    • If you omit either the autoscale formula or evaluation interval, the Batch service continues to use the current value of that setting.
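
The rules above can be summarized as a small decision function. The Python sketch below is only a model of those rules for reasoning about a request before sending it; the function name, its parameters, and the use of ValueError are assumptions and do not reflect how the service reports errors.

```python
DEFAULT_INTERVAL_MINUTES = 15


def resolve_enable_request(autoscale_enabled, current_formula,
                           current_interval_minutes,
                           new_formula=None, new_interval_minutes=None):
    """Return the (formula, interval) that would be in effect afterwards."""
    if not autoscale_enabled:
        # Autoscaling is off: a formula is mandatory, the interval optional.
        if new_formula is None:
            raise ValueError("a formula is required to enable autoscaling")
        interval = (new_interval_minutes if new_interval_minutes is not None
                    else DEFAULT_INTERVAL_MINUTES)
        return new_formula, interval

    # Autoscaling is on: at least one of the two settings must be supplied.
    if new_formula is None and new_interval_minutes is None:
        raise ValueError("specify a formula, an interval, or both")
    formula = new_formula if new_formula is not None else current_formula
    interval = (new_interval_minutes if new_interval_minutes is not None
                else current_interval_minutes)
    return formula, interval
```

Omitted settings fall through to their current values, which matches the last bullet above.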

Note

If you specified values for the targetDedicatedNodes or targetLowPriorityNodes parameters of the CreatePool method when you created the pool in .NET, or for the comparable parameters in another language, then those values are ignored when the automatic scaling formula is evaluated.

This C# code snippet uses the Batch .NET library to enable autoscaling on an existing pool:

// Define the autoscaling formula. This formula sets the target number of nodes
// to 5 on Mondays, and 1 on every other day of the week
string myAutoScaleFormula = "$TargetDedicatedNodes = (time().weekday == 1 ? 5:1);";

// Set the autoscale formula on the existing pool
await myBatchClient.PoolOperations.EnableAutoScaleAsync(
    "myexistingpool",
    autoscaleFormula: myAutoScaleFormula);

Update an autoscale formula

To update the formula on an existing autoscale-enabled pool, call the operation to enable autoscaling again with the new formula. For example, if autoscaling is already enabled on myexistingpool when the following .NET code is executed, its autoscale formula is replaced with the contents of myNewFormula.

await myBatchClient.PoolOperations.EnableAutoScaleAsync(
    "myexistingpool",
    autoscaleFormula: myNewFormula);

Update the autoscale interval

To update the autoscale evaluation interval of an existing autoscale-enabled pool, call the operation to enable autoscaling again with the new interval. For example, to set the autoscale evaluation interval to 60 minutes for a pool that's already autoscale-enabled in .NET:

await myBatchClient.PoolOperations.EnableAutoScaleAsync(
    "myexistingpool",
    autoscaleEvaluationInterval: TimeSpan.FromMinutes(60));
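
The same two updates can be sketched with the Python SDK by calling pool.enable_auto_scale again, using the keyword arguments shown in the Python example earlier in this article. The wrapper function below and its parameter names are hypothetical, and batch_service_client is assumed to be an initialized Batch service client.

```python
import datetime


def update_autoscale(batch_service_client, pool_id,
                     formula=None, interval_minutes=None):
    """Re-enable autoscaling with a new formula, a new interval, or both."""
    kwargs = {}
    if formula is not None:
        kwargs["auto_scale_formula"] = formula
    if interval_minutes is not None:
        kwargs["auto_scale_evaluation_interval"] = datetime.timedelta(
            minutes=interval_minutes)
    if not kwargs:
        raise ValueError("specify a formula, an interval, or both")
    # Settings that are omitted keep their current values on the pool.
    return batch_service_client.pool.enable_auto_scale(pool_id, **kwargs)
```

Calling update_autoscale(client, "myexistingpool", interval_minutes=60) corresponds to the .NET snippet above.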

Evaluate an autoscale formula

You can evaluate a formula before applying it to a pool. In this way, you can test the formula to see how its statements evaluate before you put the formula into production.

To evaluate an autoscale formula, you must first enable autoscaling on the pool with a valid formula. To test a formula on a pool that doesn't yet have autoscaling enabled, use the one-line formula $TargetDedicatedNodes = 0 when you first enable autoscaling. Then evaluate the formula you want to test, for example with BatchClient.PoolOperations.EvaluateAutoScaleAsync in Batch .NET.

In this Batch .NET code snippet, we evaluate an autoscale formula. If the pool does not have autoscaling enabled, we enable it first.

// First obtain a reference to an existing pool
CloudPool pool = await batchClient.PoolOperations.GetPoolAsync("myExistingPool");

// If autoscaling isn't already enabled on the pool, enable it.
// You can't evaluate an autoscale formula on non-autoscale-enabled pool.
if (pool.AutoScaleEnabled == false)
{
    // We need a valid autoscale formula to enable autoscaling on the
    // pool. This formula is valid, but won't resize the pool:
    await pool.EnableAutoScaleAsync(
        autoscaleFormula: "$TargetDedicatedNodes = $CurrentDedicatedNodes;",
        autoscaleEvaluationInterval: TimeSpan.FromMinutes(5));

    // Batch limits EnableAutoScaleAsync calls to once every 30 seconds.
    // Because we want to apply our new autoscale formula below if it
    // evaluates successfully, and we *just* enabled autoscaling on
    // this pool, we pause here to ensure we pass that threshold.
    Thread.Sleep(TimeSpan.FromSeconds(31));

    // Refresh the properties of the pool so that we've got the
    // latest value for AutoScaleEnabled
    await pool.RefreshAsync();
}

// We must ensure that autoscaling is enabled on the pool prior to
// evaluating a formula
if (pool.AutoScaleEnabled == true)
{
    // The formula to evaluate - adjusts target number of nodes based on
    // day of week and time of day
    string myFormula = @"
        $curTime = time();
        $workHours = $curTime.hour >= 8 && $curTime.hour < 18;
        $isWeekday = $curTime.weekday >= 1 && $curTime.weekday <= 5;
        $isWorkingWeekdayHour = $workHours && $isWeekday;
        $TargetDedicatedNodes = $isWorkingWeekdayHour ? 20:10;
    ";

    // Perform the autoscale formula evaluation. Note that this code does not
    // actually apply the formula to the pool.
    AutoScaleRun eval =
        await batchClient.PoolOperations.EvaluateAutoScaleAsync(pool.Id, myFormula);

    if (eval.Error == null)
    {
        // Evaluation success - print the results of the AutoScaleRun.
        // This will display the values of each variable as evaluated by the
        // autoscale formula.
        Console.WriteLine("AutoScaleRun.Results: " +
            eval.Results.Replace("$", "\n    $"));

        // Apply the formula to the pool since it evaluated successfully
        await batchClient.PoolOperations.EnableAutoScaleAsync(pool.Id, myFormula);
    }
    else
    {
        // Evaluation failed, output the message associated with the error
        Console.WriteLine("AutoScaleRun.Error.Message: " +
            eval.Error.Message);
    }
}

Successful evaluation of the formula shown in this code snippet produces results similar to:

AutoScaleRun.Results:
    $TargetDedicatedNodes=10;
    $NodeDeallocationOption=requeue;
    $curTime=2016-10-13T19:18:47.805Z;
    $isWeekday=1;
    $isWorkingWeekdayHour=0;
    $workHours=0
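
The results string shown above is a semicolon-separated list of name=value pairs, which is why the C# snippet inserts a line break before each "$". If you want to inspect individual variables, a parser along the following lines may help. This is a Python sketch with a hypothetical function name, and it assumes values never contain ";" or "=".

```python
def parse_autoscale_results(results):
    """Split an AutoScaleRun results string into a {name: value} dict."""
    values = {}
    for pair in results.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon or blank segment
        name, _, value = pair.partition("=")
        values[name] = value
    return values
```

Applied to the sample output above, the dictionary maps "$TargetDedicatedNodes" to "10" and "$isWorkingWeekdayHour" to "0". All values stay strings; convert them as needed.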

Get information about autoscale runs

To ensure that your formula is performing as expected, we recommend that you periodically check the results of the autoscaling runs that Batch performs on your pool. To do so, get (or refresh) a reference to the pool, and examine the properties of its last autoscale run.

In Batch .NET, the CloudPool.AutoScaleRun property has several properties that provide information about the latest automatic scaling run performed on the pool, including AutoScaleRun.Timestamp, AutoScaleRun.Results, and AutoScaleRun.Error.

In the REST API, the Get information about a pool request returns information about the pool, which includes the latest automatic scaling run information in the autoScaleRun property.

The following C# code snippet uses the Batch .NET library to print information about the last autoscaling run on pool myPool:

CloudPool pool = await myBatchClient.PoolOperations.GetPoolAsync("myPool");
Console.WriteLine("Last execution: " + pool.AutoScaleRun.Timestamp);
Console.WriteLine("Result:" + pool.AutoScaleRun.Results.Replace("$", "\n  $"));
Console.WriteLine("Error: " + pool.AutoScaleRun.Error);

Sample output of the preceding snippet:

Last execution: 10/14/2016 18:36:43
Result:
  $TargetDedicatedNodes=10;
  $NodeDeallocationOption=requeue;
  $curTime=2016-10-14T18:36:43.282Z;
  $isWeekday=1;
  $isWorkingWeekdayHour=0;
  $workHours=0
Error:
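
A comparable check with the Python SDK might look like the sketch below. It assumes that pool.get returns a pool object whose auto_scale_run attribute exposes timestamp, results, and error, mirroring the .NET properties above; verify these names against the SDK version you use. The function name is hypothetical.

```python
def describe_last_autoscale_run(batch_service_client, pool_id):
    """Format the last autoscale run of a pool as printable text."""
    pool = batch_service_client.pool.get(pool_id)
    run = pool.auto_scale_run  # assumed attribute name, see lead-in
    if run is None:
        return "No autoscale run has been recorded for this pool."
    # Put each formula variable on its own indented line.
    results = (run.results or "").replace("$", "\n  $")
    return "\n".join([
        f"Last execution: {run.timestamp}",
        f"Result:{results}",
        f"Error: {run.error}",
    ])
```

Printing the returned text gives output laid out like the C# sample above.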

Example autoscale formulas

Let's look at a few formulas that show different ways to adjust the amount of compute resources in a pool.

Example 1: Time-based adjustment

Suppose you want to adjust the pool size based on the day of the week and time of day. This example shows how to increase or decrease the number of nodes in the pool accordingly.

The formula first obtains the current time. If it's a weekday (1-5) and within working hours (8 AM to 6 PM), the target pool size is set to 20 nodes. Otherwise, it's set to 10 nodes.

$curTime = time();
$workHours = $curTime.hour >= 8 && $curTime.hour < 18;
$isWeekday = $curTime.weekday >= 1 && $curTime.weekday <= 5;
$isWorkingWeekdayHour = $workHours && $isWeekday;
$TargetDedicatedNodes = $isWorkingWeekdayHour ? 20:10;
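
To check what this formula would yield at a given moment without calling the service, the same logic can be mirrored in Python. This sketch assumes that weekday 1 in the formula means Monday (as the earlier Monday example suggests), which lines up with Python's isoweekday(); the function name is hypothetical.

```python
import datetime


def time_based_target(now):
    """Mirror Example 1: 20 nodes in weekday working hours, else 10."""
    work_hours = 8 <= now.hour < 18
    is_weekday = 1 <= now.isoweekday() <= 5  # Monday=1 ... Friday=5
    return 20 if (work_hours and is_weekday) else 10
```

For the timestamp in the sample evaluation output earlier (2016-10-13T19:18:47, a Thursday evening), this returns 10, which agrees with the $TargetDedicatedNodes=10 shown there.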

Example 2: Task-based adjustment

In this example, the pool size is adjusted based on the number of tasks in the queue. Both comments and line breaks are acceptable in formula strings.

// Get pending tasks for the past 15 minutes.
$samples = $ActiveTasks.GetSamplePercent(TimeInterval_Minute * 15);
// If we have fewer than 70 percent data points, we use the last sample point,
// otherwise we use the maximum of last sample point and the history average.
$tasks = $samples < 70 ? max(0,$ActiveTasks.GetSample(1)) : max( $ActiveTasks.GetSample(1), avg($ActiveTasks.GetSample(TimeInterval_Minute * 15)));
// If number of pending tasks is not 0, set targetVM to pending tasks, otherwise
// half of current dedicated.
$targetVMs = $tasks > 0? $tasks:max(0, $TargetDedicatedNodes/2);
// The pool size is capped at 20, if target VM value is more than that, set it
// to 20. This value should be adjusted according to your use case.
$TargetDedicatedNodes = max(0, min($targetVMs, 20));
// Set node deallocation mode - keep nodes active only until tasks finish
$NodeDeallocationOption = taskcompletion;
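
The branching in this formula is easier to follow with concrete numbers. The Python sketch below reproduces the arithmetic only; the inputs stand in for the values the formula reads from $ActiveTasks and $TargetDedicatedNodes, and the function and parameter names are hypothetical. How the service converts a fractional result to a node count is not modeled here.

```python
def task_based_target(sample_percent, last_sample, avg_15min,
                      current_target, cap=20):
    """Mirror Example 2 using plain numbers instead of Batch samples."""
    if sample_percent < 70:
        # Too little history: fall back to the most recent sample.
        tasks = max(0, last_sample)
    else:
        tasks = max(last_sample, avg_15min)
    # No tasks: shrink to half the current target; otherwise match tasks.
    target_vms = tasks if tasks > 0 else max(0, current_target / 2)
    return max(0, min(target_vms, cap))
```

With no active tasks and a current target of 8, the result is 4, so an idle pool halves its target on each evaluation rather than dropping to zero at once.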

Example 3: Accounting for parallel tasks

This example adjusts the pool size based on the number of tasks. This formula also takes into account the MaxTasksPerComputeNode value that has been set for the pool. This approach is useful in situations where parallel task execution has been enabled on your pool.

// Determine whether 70 percent of the samples have been recorded in the past
// 15 minutes; if not, use last sample
$samples = $ActiveTasks.GetSamplePercent(TimeInterval_Minute * 15);
$tasks = $samples < 70 ? max(0,$ActiveTasks.GetSample(1)) : max( $ActiveTasks.GetSample(1),avg($ActiveTasks.GetSample(TimeInterval_Minute * 15)));
// Set the number of nodes to add to one-fourth the number of active tasks (the
// MaxTasksPerComputeNode property on this pool is set to 4, adjust this number
// for your use case)
$cores = $TargetDedicatedNodes * 4;
$extraVMs = (($tasks - $cores) + 3) / 4;
$targetVMs = ($TargetDedicatedNodes + $extraVMs);
// Attempt to grow the number of compute nodes to match the number of active
// tasks, with a maximum of 3
$TargetDedicatedNodes = max(0,min($targetVMs,3));
// Keep the nodes active until the tasks finish
$NodeDeallocationOption = taskcompletion;
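
The node arithmetic in this formula can be traced with a short Python sketch. It takes the task count as an input rather than sampling it, and the function and parameter names are hypothetical; the defaults of 4 tasks per node and a cap of 3 nodes come from the formula above.

```python
def parallel_task_target(tasks, current_target, tasks_per_node=4, cap=3):
    """Mirror the node arithmetic of Example 3 with plain numbers."""
    # Task slots already provided by the current target.
    cores = current_target * tasks_per_node
    # Extra nodes needed for the tasks that do not fit; adding
    # (tasks_per_node - 1) before dividing biases the result upward.
    extra_vms = ((tasks - cores) + (tasks_per_node - 1)) / tasks_per_node
    target_vms = current_target + extra_vms
    return max(0, min(target_vms, cap))
```

For 10 active tasks and a current target of 1 node, the uncapped value is 3.25 and the cap brings it down to 3.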

Example 4: Setting an initial pool size

This example shows a C# code snippet with an autoscale formula that sets the pool size to a specified number of nodes for an initial time period. Then it adjusts the pool size based on the number of running and active tasks after the initial time period has elapsed.

The formula in the following code snippet:

  • Sets the initial pool size to four nodes.
  • Does not adjust the pool size within the first 10 minutes of the pool's lifecycle.
  • After 10 minutes, obtains the max value of the number of running and active tasks within the past 60 minutes.
    • If both values are 0 (indicating that no tasks were running or active in the last 60 minutes), the pool size is set to 0.
    • If either value is greater than zero, no change is made.
string now = DateTime.UtcNow.ToString("r");
string formula = string.Format(@"
    $TargetDedicatedNodes = {1};
    lifespan         = time() - time(""{0}"");
    span             = TimeInterval_Minute * 60;
    startup          = TimeInterval_Minute * 10;
    ratio            = 50;

    $TargetDedicatedNodes = (lifespan > startup ? (max($RunningTasks.GetSample(span, ratio), $ActiveTasks.GetSample(span, ratio)) == 0 ? 0 : $TargetDedicatedNodes) : {1});
    ", now, 4);

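If you build the formula from Python instead of C#, the same string can be assembled as in the sketch below. The function name is hypothetical. It formats the pool creation time with email.utils.format_datetime, which produces the same RFC 1123 layout as the C# "r" format specifier, and it requires a timezone-aware UTC datetime.

```python
import datetime
from email.utils import format_datetime


def build_initial_size_formula(pool_created_utc, initial_nodes):
    """Build the Example 4 formula text for a given start time and size."""
    # e.g. "Thu, 13 Oct 2016 19:18:47 GMT"
    created = format_datetime(pool_created_utc, usegmt=True)
    return f"""
    $TargetDedicatedNodes = {initial_nodes};
    lifespan         = time() - time("{created}");
    span             = TimeInterval_Minute * 60;
    startup          = TimeInterval_Minute * 10;
    ratio            = 50;

    $TargetDedicatedNodes = (lifespan > startup ? (max($RunningTasks.GetSample(span, ratio), $ActiveTasks.GetSample(span, ratio)) == 0 ? 0 : $TargetDedicatedNodes) : {initial_nodes});
    """
```

A typical call passes datetime.datetime.now(datetime.timezone.utc) and 4, matching the values used in the C# snippet.
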
Next steps

  • Maximize Azure Batch compute resource usage with concurrent node tasks contains details about how you can execute multiple tasks simultaneously on the compute nodes in your pool. In addition to autoscaling, this feature may help to lower job duration for some workloads, saving you money.
  • For another efficiency booster, ensure that your Batch application queries the Batch service efficiently. See Query the Azure Batch service efficiently to learn how to limit the amount of data that crosses the wire when you query the status of potentially thousands of compute nodes or tasks.