Reward and terminal functions in Inkling
Caution
Reward and terminal functions are considered advanced usage for special cases. Most users should use Goals instead.
Reward functions
Reward functions take one or two input parameters. The first parameter provides the new state that was returned from the simulator. The second parameter (which can be omitted) is the action provided by the AI during a training episode. The reward function must return a numeric value indicating the reward associated with that state and action.
The reward function is specified within the curriculum statement using the reward keyword followed by either the name of a globally defined function or an inline function.
concept Balance(input: SimState): Action {
  curriculum {
    source MySimulator
    reward GetReward
  }
}

function IsUpright(Angle: number) {
  return Angle > 85 and Angle < 95
}

function GetReward(State: SimState) {
  if IsUpright(State.Angle) {
    return 1.0
  }
  return -0.01
}
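The example above uses the one-parameter form. As a minimal sketch of the two-parameter form, the variant below also reads the action and subtracts a small penalty for large control inputs. The Action type, its Force field, and the 0.1 penalty weight are hypothetical and chosen only for illustration; IsUpright is the helper defined above.

```inkling
# Hypothetical action type with a single control output (assumption).
type Action {
  Force: number<-1 .. 1>
}

# Two-parameter reward: the first parameter is the new state,
# the second is the action that led to it.
function GetReward(State: SimState, Act: Action) {
  if IsUpright(State.Angle) {
    # Reward staying upright, minus a small penalty for large forces.
    return 1.0 - 0.1 * Act.Force * Act.Force
  }
  return -0.01
}
```

The curriculum statement should not need to change for this variant, since it refers to the function only by name.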
Warning
Reward functions cannot be used in conjunction with goals. Including both will generate an error.
Terminal functions
As with reward functions, terminal functions can be written within Inkling. When you specify a terminal function in Inkling, the training engine ignores any terminal values passed from the simulator. Terminal functions cannot be used in conjunction with goals.
An Inkling terminal function takes one or two input parameters. The first is the new state that was returned from the simulator. The second parameter (which can be omitted) is the action that was provided by the model. The terminal function must return true (1) if the state is a terminal state or false (0) otherwise.
The terminal function is specified within the curriculum statement using the terminal keyword and a globally named or inline function.
concept Balance(input: SimState): Action {
  curriculum {
    source MySimulator
    terminal function (State: SimState) {
      return State.Position > 20
    }
  }
}
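The example above uses an inline function. As a sketch of the globally named alternative, the same condition can be moved into a standalone function and referenced by name after the terminal keyword. The name IsOutOfBounds is hypothetical.

```inkling
concept Balance(input: SimState): Action {
  curriculum {
    source MySimulator
    # Refer to a globally defined terminal function by name.
    terminal IsOutOfBounds
  }
}

# Hypothetical name; same condition as the inline example.
function IsOutOfBounds(State: SimState) {
  return State.Position > 20
}
```

A named function can be convenient when the same terminal condition is shared by several concepts.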
Terminal functions are not required when using reward functions. If no terminal
function is provided, the training episode automatically terminates once
the configured iteration limit (EpisodeIterationLimit) is reached.
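As a rough sketch of how the iteration limit might be configured, the curriculum below sets EpisodeIterationLimit in a training clause and defines no terminal function, so episodes would end only when the limit is reached. The value 200 is an arbitrary illustration, and the placement of the training clause within the curriculum is an assumption.

```inkling
concept Balance(input: SimState): Action {
  curriculum {
    source MySimulator
    reward GetReward
    training {
      # Illustrative value (assumption): cap each episode at 200 iterations.
      EpisodeIterationLimit: 200
    }
  }
}
```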
Warning
Terminal functions cannot be used in conjunction with goals. Including both will generate an error.