Quickstart: Balance a ball with AI (Moab)


  • Total time to complete: 20 minutes
  • Active time: 5 minutes
  • Machine training time: 15 minutes

Teach an AI to balance a ball in the center of a plate with Bonsai, a predefined simulator, and sample code.

Before you start

To complete this demo, you must have a valid Microsoft or Azure account and a valid Bonsai workspace provisioned on Azure. If you need an account or Azure trial, follow the instructions in Microsoft account setup for Bonsai.

Step 1: Load the Moab brain

Bonsai provides a prepackaged simulator and sample code for the ball balancing problem (Moab). To build your brain:

  1. Sign in to the Bonsai UI.
  2. Select Moab from the list of demo brains in the Getting Started dialog.
  3. Name your new brain (for example, "Moab Demo").
  4. Click Create Brain to load the sample brain and simulator.

'Create Brain' screen

Partial screenshot of the "Create Brain" screen with the Moab sample brain highlighted.

Step 2: Inspect the curriculum

Bonsai opens the teaching UI when your demo brain loads. The teaching UI includes a coding panel and a graphing panel. The coding panel displays our teaching code (the curriculum) written in a proprietary language called Inkling. The graph in the graphing panel represents the iterative learning process defined by the Inkling code.

Bonsai Teaching UI

Annotated screenshot of the Bonsai Teaching UI divided into three horizontal panels (left, center, and right). The left panel displays available brains and simulators. The center panel is annotated with 'Coding Panel' and displays the example Inkling code. The right panel is annotated with 'Graphing Panel' and displays a teaching graph. The teaching graph has three nodes arranged vertically (top, middle, bottom). The top node is labeled 'ObservableState'. The middle node is labeled 'Concept MoveToCenter'. The bottom node is labeled 'SimAction'.

Clicking the different nodes in the teaching graph highlights the relevant section in the sample code:

  • State node: encapsulates the information available to the brain as the simulation runs (the observable sensor states). For Moab, the observable sensor states are the current position and velocity of the ball.
  • Concept node: encapsulates the concept you want the brain to learn as defined by your training goals. For Moab, the concept is moving a ball to a specific target. The corresponding Inkling goals are driving the ball to the center of the plate and keeping it there (drive Center Of Plate), and avoiding the edge of the plate (avoid Fall Off Plate).
  • Action node: encapsulates the set of valid actions the brain can take in response to the observed state. For Moab, the available actions are adjusting the pitch and roll (tilt) of the plate.
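
The three node types above can be pictured as plain data structures plus two goal checks. The following is a minimal Python sketch, not the Inkling sample code itself. The names ObservableState, SimAction, ball_x, and ball_y come from the teaching UI; the velocity field names, the plate radius, and the "close enough" tolerance are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
import math

# Illustrative values only. The real limits are defined in the sample
# Inkling code, not in this quickstart.
PLATE_RADIUS = 0.1        # hypothetical plate radius
CENTER_TOLERANCE = 0.02   # hypothetical "close enough to center" radius


@dataclass
class ObservableState:
    """State node: what the brain can observe at each step."""
    ball_x: float
    ball_y: float
    ball_vel_x: float  # assumed field name
    ball_vel_y: float  # assumed field name


@dataclass
class SimAction:
    """Action node: how the brain can tilt the plate."""
    pitch: float
    roll: float


def distance_from_center(state: ObservableState) -> float:
    """Straight-line distance of the ball from the plate center (0, 0)."""
    return math.hypot(state.ball_x, state.ball_y)


def center_of_plate_reached(state: ObservableState) -> bool:
    """Rough analogue of the 'drive Center Of Plate' goal."""
    return distance_from_center(state) <= CENTER_TOLERANCE


def fell_off_plate(state: ObservableState) -> bool:
    """Rough analogue of the condition that 'avoid Fall Off Plate' guards against."""
    return distance_from_center(state) >= PLATE_RADIUS
```

In this sketch, a ball at (0.0, 0.01) satisfies the center check and has not fallen off, while a ball at (0.1, 0.1) lies outside the assumed plate radius.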

Step 3: Train the brain

Important

Running simulations consumes Azure resources. Following the quickstart as written will charge your Azure subscription approximately 0.50 USD. Repeated training or running the training longer than recommended will result in additional cost.

Open the training UI and start training the brain by clicking the green Train button in the graphing panel.

The training UI replaces the coding panel with an empty data panel and shows an updated teaching graph. When you start training, Bonsai automatically starts up a fleet of simulator instances. The fleet appears in the updated graph as a new Simulator node.

The Simulator node shows you:

  • the simulator name, "MoveToCenterSimulator".
  • the total number of simulator instances in the fleet.
  • the overall speed of the fleet in iterations per second.

Bonsai Training UI

Screenshot of the Bonsai Training UI. The data panel of the UI displays an example performance plot trending upward. The graphing panel of the UI displays an updated teaching graph. The teaching graph now includes a simulator node to the left of the previous nodes. The Simulator node is connected to the ObservableState node and the SimAction node. The new node is labeled 'Simulator MoabSim' and divided into two sections. The left section displays the number of connected simulator instances (15) and the right section displays the current level of goal satisfaction (90.8%).

With each iteration, your brain earns a performance score based on how well it solved the problem. Bonsai reports training progress for your brain in the data panel as a Goal Satisfaction plot. Individual goal satisfaction values indicate how close your brain got to achieving the related goal for a given iteration. The latest overall goal satisfaction value is also reported in the concept node of the teaching graph.

The satisfaction plots should trend upward as your brain gets better at balancing the ball in the center of the plate.

Step 4: Watch the brain in action

The Moab simulator includes a visualizer so you can watch your brain in action as it works through a particular simulation. To see the visualization, scroll down in the data panel.

The visualizer renders a 3D model of the Moab hardware and a ball. The visualization also displays:

  • the estimated trajectory of the ball (a blue arrow projected onto the plate).
  • the estimated shadow of the ball (a blue circle projected onto the plate under the ball).
  • a real-time graph of changing state variables.

Simulation visualization

Screenshot of the Bonsai Training UI. The data panel is scrolled down to show a 3D rendering of the Moab device balancing a small orange ball. The Moab device has a circular body with actuator arms on top. A clear balancing plate sits on top of the arms.

Try clicking the ball_x and ball_y values. The two lines should converge at the middle of the graph (0.00) as the ball moves to the center of the plate.

Step 5: Stop training

You can stop training your brain when you see either of the following:

  • the overall goal satisfaction value reaches 100%
  • the plot lines flatten out and become horizontal

A 100% satisfaction value means your brain has fully learned the current curriculum. A horizontal plot line means you cannot meaningfully improve your brain for the current curriculum with more iterations.
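
The two stopping conditions can be expressed as a simple check over the goal satisfaction values you read from the plot. This is a minimal sketch for reasoning about when to stop, not a Bonsai feature or API; the function name, the window size, and the flatness tolerance are hypothetical.

```python
def should_stop_training(satisfaction_history, window=5, tolerance=0.5):
    """Return True when either stopping condition from this step holds.

    satisfaction_history: overall goal satisfaction values in percent,
        oldest first (for example, values read off the plot).
    window: how many recent values to inspect for a plateau (assumption).
    tolerance: maximum spread, in percentage points, that still counts
        as a horizontal line (assumption).
    """
    if not satisfaction_history:
        return False

    # Condition 1: the overall goal satisfaction value reaches 100%.
    if satisfaction_history[-1] >= 100.0:
        return True

    # Condition 2: the plot line has become horizontal. Too few points
    # means there is not enough evidence of a plateau yet.
    if len(satisfaction_history) < window:
        return False
    recent = satisfaction_history[-window:]
    return max(recent) - min(recent) <= tolerance
```

For example, a history that climbs steadily from 10 to 90 does not trigger a stop, while one that hovers between 90.5 and 90.8 for the last five readings does.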

Tip

The Moab demo brain typically achieves optimal performance within 200k iterations.
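
Because the Simulator node reports fleet speed in iterations per second, you can estimate how long a run of a given length will take. The sketch below is plain arithmetic under the assumption that fleet speed stays roughly constant; the function name and the example speed of 250 iterations per second are hypothetical, not values reported by Bonsai.

```python
def estimated_training_minutes(target_iterations, iterations_per_second):
    """Estimate wall-clock training time in minutes.

    Assumes the fleet speed shown in the Simulator node stays roughly
    constant for the whole run, which may not hold in practice.
    """
    if iterations_per_second <= 0:
        raise ValueError("fleet speed must be positive")
    seconds = target_iterations / iterations_per_second
    return seconds / 60.0


# Hypothetical fleet speed of 250 iterations per second.
minutes = estimated_training_minutes(200_000, 250)
print(f"Estimated training time: {minutes:.1f} minutes")
```

At the assumed speed, 200k iterations take a little over 13 minutes, which is in line with the 15 minutes of machine training time listed at the top of this quickstart.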

To stop training, click the red Stop Training button at the top of the graphing panel.

Next steps

Congratulations! You successfully trained a brain to balance a ball in the center of the plate.

Now that you understand the basics of the Moab brain, try customizing the Inkling code to change your training goals.