Assess trained brains in Bonsai


  • Total time to complete: 10 minutes
  • Active time: 5 minutes

In Machine Teaching, assessment means evaluating how well your AI enacts a policy across a set of scenarios (episode configurations) while minimizing trial and error (exploration noise).

You can use the assessment feature in Bonsai to evaluate how well your brain has learned the provided curriculum before exporting. Assessment lets you determine the quality and robustness of the learned behavior in simulation with live visualizations and data logging. Assessment functionality is available in the Bonsai CLI and the Bonsai UI.

Before you start

  • You must have a simulator. You cannot train or assess a brain without a simulator. Your simulator can be local (unmanaged) or packaged in Bonsai (managed).
  • You must have a trained (or partially trained) brain. You cannot assess a brain version that is untrained or in the process of training. You can stop training manually when you are satisfied with the performance metrics or wait for training to stop automatically when the brain reaches your no progress iteration limit.

Assess brains with Bonsai CLI

The Bonsai CLI is particularly useful when you want to specify the episode configurations to assess a brain version against. It is also useful when assessing brains with unmanaged simulators that support visualizations on a local display. For instructions on installing the CLI or information about the commands, reference the Bonsai CLI documentation.

Assess custom scenarios

  1. Create a JSON assessment configuration file.
  2. Run the assessment with the configuration file:
    bonsai brain version assessment start             \
      --brain-name=BRAIN_NAME                         \
      --file=PATH_TO_ASSESSMENT_CONFIG_FILE           \
      --concept-name=CONCEPT_NAME                     \
      --simulator-package-name=SIMULATOR_PACKAGE_NAME \
      --instance-count=INSTANCE_COUNT
    
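Before starting the assessment, it can help to sanity-check that the configuration file parses as JSON and defines at least one episode. A minimal Python sketch; the helper name and file name are assumptions, not part of the Bonsai tooling:

```python
import json

def check_assessment_config(path):
    """Verify the file parses as JSON and defines at least one episode configuration."""
    with open(path) as f:
        config = json.load(f)  # raises an error on malformed JSON
    episodes = config.get("episodeConfigurations")
    if not isinstance(episodes, list) or not episodes:
        raise ValueError("episodeConfigurations must be a non-empty array")
    return len(episodes)

# Example (hypothetical file name):
# print(check_assessment_config("assess_config.json"))
```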

Assess with unmanaged simulators

  1. Start the simulator locally.
  2. Confirm the simulator is idling (unset):
    bonsai simulator unmanaged list
    
    | Name       | Session Id              | Action   |
    |------------+-------------------------+----------|
    | moab-py-v5 | 367420541_10.244.21.230 | Unset    |
    
  3. Connect the simulator for assessment:
    bonsai simulator unmanaged connect \
      --simulator-name=SIMULATOR_NAME  \
      --brain-name=BRAIN_NAME          \
      --brain-version=VERSION          \
      --concept-name=CONCEPT_NAME      \
      --action Assess
    
    Simulators Found: 1
    Simulators Connected: 1
    Simulators Not Connected: 0
    
  4. Confirm the simulator is ready for assessment:
    bonsai simulator unmanaged list
    
    | Name       | Session Id              | Action   |
    |------------+-------------------------+----------|
    | moab-py-v5 | 367420541_10.244.21.230 | Assess   |
    
  5. Start assessment:
    bonsai brain version assessment start \
      --brain-name=BRAIN_NAME \
      --file=PATH_TO_FILE \
      --concept-name=CONCEPT_NAME
    

Assess brains with Bonsai UI

  1. Open the Bonsai UI.
  2. Select the brain version you want to assess.
  3. Click on the Train tab.
  4. Wait for training to end, or stop training.
  5. Click the Start assessment button in the data panel.

Tip

Bonsai automatically stops training and marks training complete if learning does not progress within the NoProgressIterationLimit window set for the AI curriculum.

If you are assessing the brain with a managed simulator, the assessment starts automatically with the package configured in the Inkling file. For example:

source simulator MoabSim(Action: SimAction, Config: SimConfig): ObservableState {
  package "Moab"
}

To assess your brain with a local simulator in the Bonsai UI, you need to connect it with the Bonsai CLI before clicking Start assessment:

  1. Comment out the package statement in your Inkling file.
  2. Start the simulator locally.
  3. Connect the simulator for assessment:
    bonsai simulator unmanaged connect \
      --simulator-name=SIMULATOR_NAME  \
      --brain-name=BRAIN_NAME          \
      --brain-version=VERSION          \
      --concept-name=CONCEPT_NAME      \
      --action Assess
    
    Simulators Found: 1
    Simulators Connected: 1
    Simulators Not Connected: 0
    

Now, when you click the Start assessment button, you can select the unmanaged simulator you want to use from the available list.

Animation: the mouse cursor clicks "Start Assessment", then selects an unmanaged simulator from the list of options presented.

Assess partially trained brains

You can use the interactive, streaming performance charts in the assessment panel to evaluate a partially trained brain.

The streaming charts show how the various data points tracked in your simulation change over training sessions. Use the data to determine how well (or poorly) your AI responds to environmental changes and makes progress toward your stated objectives.

If your simulator tracks multiple data points, you can click the + Add Chart button to replicate the data chart. Toggle the available fields on or off in each chart to make analysis easier.

Animation: with a brain that has not finished training, the mouse cursor clicks "Start Assessment", then the screen scrolls down to the performance chart of the Moab demo.

Tip

Assessing a partially trained brain is a good way to judge how well your curriculum reflects the problem you actually want the AI to solve.

Create an assessment configuration file

An assessment configuration is a JSON file that defines the target brain version and episode configurations for the assessment. You define the assessment episode configuration as an array of configuration variables and values:

{
  "version": "1.0.0",
  "recipe": {},
  "episodeConfigurations": [
    {
      "configuration_field_1": "value_1",
      "configuration_field_2": "value_2",
      "configuration_field_N": "value_N"
    }
  ]
}
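If you want to sweep many scenarios, writing the file by hand gets tedious; you can generate it with a short script instead. A hedged Python sketch that emits the format shown above (the helper name, field names, and output path are assumptions):

```python
import json

def write_assessment_config(path, episode_configurations):
    """Write episode configurations in the assessment configuration file format."""
    config = {
        "version": "1.0.0",
        "recipe": {},
        "episodeConfigurations": episode_configurations,
    }
    with open(path, "w") as f:
        json.dump(config, f, indent=2)

# Hypothetical sweep: vary one configuration field across episodes.
episodes = [{"configuration_field_1": v} for v in (0.1, 0.2, 0.3)]
# write_assessment_config("assess_config.json", episodes)
```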

For example, a custom Moab assessment configuration file might look like:

{
  "version": "1.0.0",
  "recipe": {},
  "episodeConfigurations": [
    {
      "initial_x": -0.00278,
      "initial_y": 0.0321,
      "initial_vel_x": 0.00824,
      "initial_vel_y": 0.0108,
      "initial_pitch": 0.108,
      "initial_roll": -0.0993
    },
    {
      "initial_x":-0.00248,
      "initial_y": 0.0325,
      "initial_vel_x": 0.00530,
      "initial_vel_y": 0.00623,
      "initial_pitch": 0.0437,
      "initial_roll": 0.0947
    },
    {
      "initial_x": -0.00213,
      "initial_y": 0.0200,
      "initial_vel_x": -0.00387,
      "initial_vel_y": -0.00836,
      "initial_pitch": -0.0101,
      "initial_roll": -0.0347
    },
    {
      "initial_x": -0.0112,
      "initial_y": 0.00295,
      "initial_vel_x": 0.00189,
      "initial_vel_y": 0.0144,
      "initial_pitch": 0.0383,
      "initial_roll": 0.0382
    }
  ]
}
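Rather than hand-picking initial conditions like the four above, you can sample them. A sketch that generates Moab-style episode configurations with random initial conditions; the sampling ranges are assumptions loosely based on the magnitudes in the example, not values from the Moab sample:

```python
import json
import random

# Assumed ranges for each configuration field (not from the Moab sample).
RANGES = {
    "initial_x": (-0.02, 0.02),
    "initial_y": (-0.02, 0.02),
    "initial_vel_x": (-0.01, 0.01),
    "initial_vel_y": (-0.01, 0.01),
    "initial_pitch": (-0.1, 0.1),
    "initial_roll": (-0.1, 0.1),
}

def sample_episodes(n, seed=0):
    """Draw n episode configurations; a fixed seed keeps assessments repeatable."""
    rng = random.Random(seed)
    return [
        {field: round(rng.uniform(lo, hi), 5) for field, (lo, hi) in RANGES.items()}
        for _ in range(n)
    ]

config = {"version": "1.0.0", "recipe": {}, "episodeConfigurations": sample_episodes(20)}
# print(json.dumps(config, indent=2))
```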

The exact structure of the episode configuration object depends on the SimConfig section of your Inkling file. For example, in Moab, the SimConfig section includes a dictionary of values, so each episode configuration object is defined similarly.

If your configuration definition only includes simple types (no dictionaries), like:

type SimConfig {
	position: string<"Right", "Left", "Top", "Bottom">
}

Your episodeConfigurations array contains simple values rather than objects. For example:

"episodeConfigurations": [
  "Right",
  "Right",
  "Left",
  "Bottom",
  "Top"
]

If your configuration definition contains nested dictionaries like:

type SimConfig {
	x_position: number,
	pole: {
		velocity: number,
		friction: number
	}
}

Your episode configuration object will also include nested objects. For example:

"episodeConfigurations": [
  {
    "x_position": 3.26,
    "pole": {
      "velocity": 4.55,
      "friction": 0.20 
    }
  },
  {
    "x_position": 2.90,
    "pole": {
      "velocity": 3.67,
      "friction": 0.30 
    }
  },
  {
    "x_position": 4.00,
    "pole": {
      "velocity": -4.80,
      "friction": 0.50 
    }
  }
]
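Because the field names in each episode configuration must match SimConfig exactly, a quick structural check before starting the assessment can catch typos. A sketch that compares each episode configuration against an expected schema (the schema dictionary is an assumption you write to mirror your own SimConfig):

```python
def validate_episode(config, schema, path=""):
    """Recursively check that an episode configuration has exactly the schema's fields."""
    errors = []
    if set(config) != set(schema):
        errors.append(f"{path or 'episode'}: fields {sorted(config)} != {sorted(schema)}")
        return errors
    for field, expected in schema.items():
        if isinstance(expected, dict):  # nested dictionary in SimConfig
            if not isinstance(config.get(field), dict):
                errors.append(f"{path}{field}: expected a nested object")
            else:
                errors.extend(validate_episode(config[field], expected, f"{path}{field}."))
    return errors

# Schema mirroring the nested SimConfig example above.
schema = {"x_position": float, "pole": {"velocity": float, "friction": float}}
episode = {"x_position": 3.26, "pole": {"velocity": 4.55, "friction": 0.20}}
# validate_episode(episode, schema) returns [] when the structure matches
```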

Assessment challenges

Today, the most important limitation of assessment is that the platform does not support numerical analysis of assessment results.

Future updates of the assessment feature will help with this challenge. In the meantime, to analyze AI training results numerically, set up manual logging as you integrate your simulator and log directly from the simulator.
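A common pattern for that manual logging is to append one CSV row per iteration from inside the simulator's step loop and analyze the file offline. A minimal sketch; the state fields, class name, and file name are assumptions, not part of the Bonsai SDK:

```python
import csv

class EpisodeLogger:
    """Append one row per simulator iteration so results can be analyzed offline."""

    def __init__(self, path, fields):
        self.file = open(path, "w", newline="")
        self.writer = csv.DictWriter(self.file, fieldnames=["episode", "iteration"] + fields)
        self.writer.writeheader()

    def log(self, episode, iteration, state):
        # state is the dictionary of simulator values you want to track
        self.writer.writerow({"episode": episode, "iteration": iteration, **state})

    def close(self):
        self.file.close()

# Inside your simulator integration, after each step (hypothetical fields):
# logger = EpisodeLogger("assessment_log.csv", ["ball_x", "ball_y"])
# logger.log(episode=1, iteration=10, state={"ball_x": 0.01, "ball_y": -0.02})
```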

Next steps

Once your brain meets your performance expectations, you are ready to export the brain for deployment in production.