# BanditPolicy class

## Definition

Defines an early termination policy based on slack criteria, and a frequency and delay interval for evaluation.

`BanditPolicy(evaluation_interval=1, slack_factor=None, slack_amount=None, delay_evaluation=0)`

#### Parameters

- `slack_factor` (float): The ratio used to calculate the allowed distance from the best performing experiment run.

- `slack_amount` (float): The absolute distance allowed from the best performing run.

- `evaluation_interval` (int): The frequency for applying the policy.

- `delay_evaluation` (int): The number of intervals for which to delay the first policy evaluation. If specified, the policy applies every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.

## Remarks

The Bandit policy takes the following configuration parameters:

- `slack_factor`: The amount of slack allowed with respect to the best performing training run, specified as a ratio.

- `slack_amount`: The amount of slack allowed with respect to the best performing training run, specified as an absolute amount.

- `evaluation_interval`: Optional. The frequency for applying the policy. Each time the training script logs the primary metric counts as one interval.

- `delay_evaluation`: Optional. The number of intervals to delay policy evaluation. Use this parameter to avoid premature termination of training runs. If specified, the policy applies every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.
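The scheduling rule for `evaluation_interval` and `delay_evaluation` can be sketched in plain Python. This is only an illustration of the rule as described here, not SDK code: the helper name `is_evaluation_point` is hypothetical, and it assumes intervals are counted from 1, one per logged primary metric value.

```python
def is_evaluation_point(interval, evaluation_interval=1, delay_evaluation=0):
    """Return True if the policy would be applied at this interval.

    Hypothetical helper, not part of the SDK. Assumes intervals start at 1.
    """
    # The policy applies at every multiple of evaluation_interval
    # that is greater than or equal to delay_evaluation.
    return interval % evaluation_interval == 0 and interval >= delay_evaluation


# With evaluation_interval=100 and delay_evaluation=250, the first
# multiple of 100 that is >= 250 is 300.
points = [i for i in range(1, 501) if is_evaluation_point(i, 100, 250)]
print(points)  # [300, 400, 500]
```

Note that `delay_evaluation` does not shift the schedule; it only suppresses the evaluation points that fall before it.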

Any run that doesn't fall within the slack factor or slack amount of the evaluation metric with respect to the best performing run will be terminated.

Consider a Bandit policy with `slack_factor` = 0.2 and `evaluation_interval` = 100.
Assume that run X is the currently best performing run with an AUC (performance metric) of 0.8 after 100
intervals. Further, assume that the best AUC reported for another run is Y. The policy compares the value
(Y + Y * 0.2) to 0.8 and, if it is smaller, cancels that run. If `delay_evaluation` = 200, then the
policy is first applied at interval 200.
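The slack factor comparison can be written out as a short sketch. The function name `cancelled_by_slack_factor` is hypothetical and the code assumes a metric where higher is better, such as AUC; it mirrors the comparison stated in the text rather than the SDK's internal implementation.

```python
def cancelled_by_slack_factor(run_best, overall_best, slack_factor):
    """Return True if a run would be cancelled under the slack factor rule.

    Hypothetical helper, not part of the SDK. Assumes higher metric values
    are better.
    """
    # Cancel when Y + Y * slack_factor is smaller than the best run's metric.
    return run_best + run_best * slack_factor < overall_best


# Best run has AUC 0.8; slack_factor is 0.2.
for y in (0.60, 0.70):
    print(y, cancelled_by_slack_factor(y, overall_best=0.8, slack_factor=0.2))
# 0.6 True   (0.6 + 0.12 = 0.72 < 0.8)
# 0.7 False  (0.7 + 0.14 = 0.84 >= 0.8)
```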

Now, consider a Bandit policy with `slack_amount` = 0.2 and `evaluation_interval` = 100.
If run 3 is the currently best performing run with an AUC (performance metric) of 0.8 after 100 intervals,
then any run with an AUC less than 0.6 (0.8 - 0.2) after 100 intervals will be terminated.
Similarly, `delay_evaluation` can be used to delay the first termination policy
evaluation for a specific number of intervals.
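The slack amount rule is a plain subtraction, sketched here under the same caveats: `cancelled_by_slack_amount` is a hypothetical name, the metric is assumed to be one where higher is better, and the code only illustrates the arithmetic described in the text.

```python
def cancelled_by_slack_amount(run_best, overall_best, slack_amount):
    """Return True if a run would be terminated under the slack amount rule.

    Hypothetical helper, not part of the SDK. Assumes higher metric values
    are better.
    """
    # Terminate when the run's metric is below (best - slack_amount).
    return run_best < overall_best - slack_amount


# Best run has AUC 0.8; slack_amount is 0.2, so the threshold is about 0.6.
for auc in (0.55, 0.65):
    print(auc, cancelled_by_slack_amount(auc, overall_best=0.8, slack_amount=0.2))
# 0.55 True
# 0.65 False
```

Unlike the slack factor, the allowed distance here does not scale with the size of the best metric, which may matter when the metric's range is very small or very large.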

For more information about applying early termination policies, see Tune hyperparameters for your model.

## Attributes

### delay_evaluation

Return the number of intervals for which the first evaluation is delayed.

#### Returns

The delay evaluation.

#### Return type

int

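### evaluation_interval

Return the frequency for applying the policy.

#### Returns

The evaluation interval.

#### Return type

int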

### slack_factor

Return the slack factor with respect to the best performing training run.

#### Returns

The slack factor.

#### Return type

float

### POLICY_NAME

`POLICY_NAME = 'Bandit'`