ReinforcementLearningEstimator class

Definition

Represents an estimator for training Reinforcement Learning experiments.

ReinforcementLearningEstimator(source_directory, *, environment=None, entry_script=None, script_params=None, compute_target=None, use_gpu=None, pip_packages=None, conda_packages=None, environment_variables=None, rl_framework=Ray(version='0.8.0'), cluster_coordination_timeout_seconds=None, max_run_duration_seconds=None, worker_configuration=None, simulator_configuration=None, conda_dependencies_file=None, pip_requirements_file=None, shm_size=None, inputs=None)
Inheritance
azureml.contrib.train.rl._rl_base_estimator._RLBaseEstimator
ReinforcementLearningEstimator

Parameters

source_directory
str

A local directory containing experiment configuration files.

environment
Environment

The environment definition for the experiment. It includes PythonSection, DockerSection, and environment variables. Any environment option not directly exposed through other parameters to the Estimator construction can be set using this parameter. If this parameter is specified, it will be used as a base upon which packages specified in pip_packages and conda_packages will be added.

entry_script
str

The relative path to the file containing the training script.

script_params
dict

A dictionary of command-line arguments to pass to the training script specified in entry_script.
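The dictionary keys are the flag names and the values are the arguments passed after each flag. As a minimal sketch (the flag names '--env' and '--episodes' are illustrative, not part of the API), the pairs end up as consecutive tokens on the entry script's command line, where they can be read with argparse:

```python
# Illustrative script_params dictionary; flag names are hypothetical.
script_params = {
    '--env': 'CartPole-v0',
    '--episodes': 500,
}

# Each key/value pair becomes two consecutive command-line tokens,
# which the entry script can parse with argparse.
args = [str(token) for pair in script_params.items() for token in pair]
print(args)  # ['--env', 'CartPole-v0', '--episodes', '500']
```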

compute_target
AbstractComputeTarget or str

The compute target where the head script will run. This can either be an object or the compute target name. AmlWindowsCompute supports only Azure Files as mounted storage and does not support environment definition.

use_gpu
bool

Specifies whether the environment should support GPUs. If True, a GPU-based default Docker image will be used in the environment. If False, a CPU-based image will be used. Default Docker images (CPU or GPU) will be used only if the environment parameter is not set.

conda_packages
list

A list of strings representing conda packages to be added to the head's Python environment for the experiment.

pip_packages
list

A list of strings representing pip packages to be added to the head's Python environment for the experiment.

rl_framework
RLFramework

The orchestration framework to be used in the experiment. The default is Ray version 0.8.0.

cluster_coordination_timeout_seconds
int

The maximum time in seconds that the job can take to start once it has passed the queued state.

max_run_duration_seconds
int

The maximum allowed time for the run in seconds. Azure ML will attempt to automatically cancel the job if it takes longer than this value.

worker_configuration
WorkerConfiguration

The configuration for the workers.

simulator_configuration
SimulatorConfiguration

The configuration for the simulators.

pip_requirements_file
str

The relative path to the head's pip requirements text file. This can be provided in combination with the pip_packages parameter.

conda_dependencies_file
str

The relative path to the head's conda dependencies yaml file.

environment_variables
dict

A dictionary of environment variable names and values. These environment variables are set on the head process, where the entry_script is executed.

shm_size
str

The size of the Docker container's shared memory block. If not set, the default azureml.core.environment._DEFAULT_SHM_SIZE is used. For more information, see Docker run reference.
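Docker expresses shared-memory sizes as a positive number followed by a unit of b (bytes), k (kilobytes), m (megabytes), or g (gigabytes), for example '512m' or '2g'. A small sketch of validating such a string before passing it as shm_size (the helper name is hypothetical, not part of the Azure ML SDK):

```python
import re

# Matches Docker's size format: a positive integer plus a unit b/k/m/g.
_SHM_RE = re.compile(r'^(\d+)([bkmg])$')

def parse_shm_size(value):
    """Return the size in bytes, or raise ValueError if malformed."""
    match = _SHM_RE.match(value.lower())
    if not match:
        raise ValueError(f'invalid shm_size: {value!r}')
    number, unit = match.groups()
    return int(number) * {'b': 1, 'k': 1024, 'm': 1024**2, 'g': 1024**3}[unit]

print(parse_shm_size('2g'))  # 2147483648
```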

inputs
list

A list of DataReference or DatasetConsumptionConfig objects to use as input.

Remarks

When submitting a training job, Azure ML runs your script in a conda environment within a Docker container. The Reinforcement Learning containers have the following dependencies installed:

Dependency                              Ray 0.8.0   Ray 0.8.3
Python                                  3.6.2       3.6.2
CUDA (GPU image only)                   10.0        10.0
cuDNN (GPU image only)                  7.5         7.5
NCCL (GPU image only)                   2.4.2       2.4.2
azureml-defaults                        Latest      Latest
azureml-contrib-reinforcementlearning   Latest      Latest
ray[rllib,dashboard]                    0.8.0       0.8.3
tensorflow                              1.14.0      1.14.0
psutil                                  Latest      Latest
setproctitle                            Latest      Latest
gym[atari]                              Latest      Latest

The Docker images extend Ubuntu 16.04.

To install additional dependencies in the head Docker container, you can use either the pip_packages or conda_packages parameter. Alternatively, you can build your own image and pass it in the environment parameter as part of an Environment object.

The Reinforcement Learning estimator supports distributed training across CPU and GPU clusters using Ray, an open-source framework for handling distributed training.
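A minimal submission sketch, assuming an existing workspace configuration and two provisioned compute targets named 'head-cluster' and 'cpu-cluster'; the experiment name, entry script, and script arguments are illustrative:

```python
from azureml.core import Workspace, Experiment
from azureml.contrib.train.rl import (
    ReinforcementLearningEstimator, Ray, WorkerConfiguration)

ws = Workspace.from_config()
experiment = Experiment(workspace=ws, name='rl-cartpole')

# Ray workers run on a separate (typically CPU) cluster.
worker_config = WorkerConfiguration(
    compute_target=ws.compute_targets['cpu-cluster'],
    node_count=2)

estimator = ReinforcementLearningEstimator(
    source_directory='files',
    entry_script='cartpole_training.py',      # illustrative script name
    script_params={'--run': 'PPO'},           # illustrative arguments
    compute_target=ws.compute_targets['head-cluster'],
    rl_framework=Ray(version='0.8.3'),
    worker_configuration=worker_config,
    max_run_duration_seconds=3600)

run = experiment.submit(estimator)
run.wait_for_completion(show_output=True)
```

The head node runs the entry script and coordinates the Ray cluster, while the worker nodes defined by WorkerConfiguration execute the distributed rollouts.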

Attributes

DEFAULT_FRAMEWORK

DEFAULT_FRAMEWORK = Ray(version='0.8.0')