How bots work
APPLIES TO: SDK v4
A bot is an app that users interact with in a conversational way, using text, graphics (such as cards or images), or speech. Azure Bot Service is a cloud platform. It hosts bots and makes them available to channels.
The Bot Framework Service, which is a component of the Azure Bot Service, sends information between the user's bot-connected app (such as Facebook or Slack, which we call the channel) and the bot. Each channel may include additional information in the activities it sends. Before creating bots, it is important to understand how a bot uses activity objects to communicate with its users. Let's first take a look at the activities that are exchanged when we run a simple echo bot.
Two activity types illustrated here are: conversation update and message.
The Bot Framework Service may send a conversation update when a party joins the conversation. For example, on starting a conversation with the Bot Framework Emulator, you will see two conversation update activities (one for the user joining the conversation and one for the bot joining). To distinguish these conversation update activities, check who is included in the members added property of the activity.
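As a rough illustration of that check, the sketch below models an activity as a plain Python dictionary that uses the `type`, `recipient`, and `membersAdded` field names from the Bot Framework Activity schema. The helper name `joined_members` and the sample IDs are hypothetical, and this is not SDK code.

```python
def joined_members(activity: dict) -> dict:
    """Classify a conversation update by who was added.

    Returns {"bot_joined": bool, "users_joined": [member ids]}.
    """
    if activity.get("type") != "conversationUpdate":
        return {"bot_joined": False, "users_joined": []}

    # On an inbound activity, the recipient is the bot itself.
    bot_id = activity.get("recipient", {}).get("id")
    added_ids = [member["id"] for member in activity.get("membersAdded", [])]

    return {
        "bot_joined": bot_id in added_ids,
        "users_joined": [i for i in added_ids if i != bot_id],
    }


# Two hypothetical updates, similar to what the Emulator sends at startup.
bot_update = {
    "type": "conversationUpdate",
    "recipient": {"id": "bot-1"},
    "membersAdded": [{"id": "bot-1"}],
}
user_update = {
    "type": "conversationUpdate",
    "recipient": {"id": "bot-1"},
    "membersAdded": [{"id": "user-1"}],
}
```

Comparing each added member's ID with the recipient ID is one simple way to tell the bot's own join event apart from a user's.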
The message activity carries conversation information between the parties. In the echo bot example, the message activities carry simple text, and the channel renders this text. Alternatively, a message activity might carry text to be spoken, suggested actions, or cards to be displayed.
In this example, the bot created and sent a message activity in response to the inbound message activity it had received. However, a bot can respond in other ways to a received message activity; it's not uncommon for a bot to respond to a conversation update activity by sending some welcome text in a message activity. More information can be found in how to welcome a user.
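The following sketch shows, under simplifying assumptions, how a reply could be derived from an inbound activity: the sender and receiver are swapped, and the reply points back at the inbound activity's ID. Activities are plain dictionaries that use Activity schema field names. The functions `make_reply` and `respond` and the welcome text are hypothetical and not part of the SDK.

```python
from typing import Optional


def make_reply(inbound: dict, text: str) -> dict:
    """Build a message activity addressed back to the sender of `inbound`."""
    return {
        "type": "message",
        "text": text,
        # Swap sender and receiver so the reply goes back to the user.
        "from": inbound.get("recipient"),
        "recipient": inbound.get("from"),
        "conversation": inbound.get("conversation"),
        "replyToId": inbound.get("id"),
    }


def respond(inbound: dict) -> Optional[dict]:
    """Echo messages; welcome non-bot members named in a conversation update."""
    if inbound.get("type") == "message":
        return make_reply(inbound, f"Echo: {inbound.get('text', '')}")

    if inbound.get("type") == "conversationUpdate":
        bot_id = (inbound.get("recipient") or {}).get("id")
        added = [member["id"] for member in inbound.get("membersAdded", [])]
        if any(member_id != bot_id for member_id in added):
            return make_reply(inbound, "Hello and welcome!")

    # Other activity types get no response in this sketch.
    return None
```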
The Bot Framework SDK
The Bot Framework SDK allows you to build bots that can be hosted on the Azure Bot Service. The service defines a REST API and an activity protocol for how your bot and channels or users can interact. The SDK builds upon this REST API and provides an abstraction of the service so that you can focus on the conversational logic. While you don't need to understand the REST service to use the SDK, understanding some of its features can be helpful.
Bots are apps that have a conversational interface. They can be used to shift simple, repetitive tasks, such as taking a dinner reservation or gathering profile information, onto automated systems that may no longer require direct human intervention. Users converse with a bot using text, interactive cards, and speech. A bot interaction can be a quick question and answer, or it can be a sophisticated conversation that intelligently provides access to services.
Support for features provided by the SDK and REST API varies by channel. You can test your bot using the Bot Framework Emulator, but you should also test all features of your bot on each channel in which you intend to make your bot available.
Interactions involve the exchange of activities, which are handled in turns.
Every interaction between the user (or a channel) and the bot is represented as an activity. The Bot Framework Activity schema defines the activities that can be exchanged between a user or channel and a bot. Activities can represent human text or speech, app-to-app notifications, reactions to other messages, and so on.
In a conversation, people often speak one at a time, taking turns. A bot generally reacts to user input. Within the Bot Framework SDK, a turn consists of the user's incoming activity to the bot and any activity the bot sends back to the user as an immediate response. You can think of a turn as the processing associated with the bot receiving a given activity.
For example, a user might ask a bot to perform a certain task. The bot might respond with a question to get more information about the task, at which point this turn ends. On the next turn, the bot receives a new message from the user that might contain the answer to the bot's question, or it might represent a change of subject or a request to ignore the initial request to perform the task.
Bot application structure
The SDK defines a bot class that handles the conversational reasoning for the bot app. The bot class:
- Recognizes and interprets the user's input.
- Reasons about the input and performs relevant tasks.
- Generates responses about what the bot is doing or has done.
The SDK also defines an adapter class that handles connectivity with the channels. The adapter:
- Provides a method for handling requests from and methods for generating requests to the user's channel.
- Includes a middleware pipeline, which includes turn processing outside of your bot's turn handler.
- Calls the bot's turn handler and catches errors not otherwise handled in the turn handler.
In addition, bots often need to retrieve and store state each turn. This is handled through storage, bot state, and property accessor classes. The SDK does not provide built-in storage, but does provide abstractions for storage and a few implementations of a storage layer. The managing state topic describes these state and storage features.
When you create a bot using the SDK, you provide the code to receive the HTTP traffic and forward it to the adapter. The Bot Framework provides a few templates and samples that you can use to develop your own bots.
The bot object contains the conversational reasoning or logic for a turn and exposes a turn handler, which is the method that can accept incoming activities from the bot adapter.
The SDK provides a couple of different paradigms for managing your bot logic.
- Activity handlers provide an event-driven model in which the incoming activity types and sub-types are the events. This can be good for bots that have limited, short interactions with the user.
- The dialogs library provides a state-based model to manage a long-running conversation with the user.
  - Use an activity handler and a component dialog for largely sequential conversations. See about component and waterfall dialogs for more information.
  - Use a dialog manager and an adaptive dialog for flexible conversation flow that can handle a wider range of user interaction. See the introduction to adaptive dialogs for more information.
- Implement your own bot class and provide your own logic for handling each turn. See how to create your own prompts to gather user input for an example of what this might look like.
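To make the event-driven model more concrete, here is a minimal sketch of a handler that routes each activity by its type and then by a sub-type (members added). The class and method names (`SketchActivityHandler`, `on_message`, `on_members_added`, and so on) are hypothetical stand-ins written for this example. They are not the SDK's classes, and activities are plain dictionaries.

```python
import asyncio


class SketchActivityHandler:
    """Routes each incoming activity to a handler chosen by its type."""

    async def on_turn(self, activity: dict) -> list:
        handlers = {
            "message": self.on_message,
            "conversationUpdate": self.on_conversation_update,
        }
        handler = handlers.get(activity.get("type"), self.on_unrecognized)
        return await handler(activity)

    async def on_message(self, activity: dict) -> list:
        return []

    async def on_conversation_update(self, activity: dict) -> list:
        # Sub-type event: members were added to the conversation.
        if activity.get("membersAdded"):
            return await self.on_members_added(activity)
        return []

    async def on_members_added(self, activity: dict) -> list:
        return []

    async def on_unrecognized(self, activity: dict) -> list:
        return []


class EchoBot(SketchActivityHandler):
    """Overrides only the events this bot cares about."""

    async def on_message(self, activity: dict) -> list:
        return [f"Echo: {activity['text']}"]

    async def on_members_added(self, activity: dict) -> list:
        return ["Hello and welcome!"]


if __name__ == "__main__":
    print(asyncio.run(EchoBot().on_turn({"type": "message", "text": "hi"})))
```

A bot built this way only overrides the events it needs, which is why the model suits short, limited interactions. Longer conversations usually need the state tracking that dialogs provide.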
The bot adapter
The adapter has a process activity method for starting a turn.
- It takes the request body (the request payload, translated to an activity) and the request header as arguments.
- It checks whether the authentication header is valid.
- It creates a context object for the turn.
- It runs this through its middleware pipeline.
- It sends the activity to the bot object's turn handler.
The adapter also:
- Formats and sends response activities. These responses are typically messages for the user, but can also include information to be consumed by the user's channel directly.
- Surfaces other methods provided by the Bot Connector REST API, such as update message and delete message.
- Catches errors or exceptions not otherwise caught for the turn.
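The steps above can be sketched roughly as follows. This is a simplified model, not the SDK's adapter: `SketchAdapter`, `SketchTurnContext`, the handler functions, and the token value are hypothetical, the authentication check is a placeholder string comparison (real authentication is far more involved), and the middleware pipeline is omitted.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


class SketchTurnContext:
    """Carries the inbound activity and collects outbound ones for the turn."""

    def __init__(self, activity: dict):
        self.activity = activity
        self.sent: List[dict] = []

    async def send_activity(self, text: str) -> None:
        self.sent.append({"type": "message", "text": text})


TurnHandler = Callable[[SketchTurnContext], Awaitable[None]]
ErrorHandler = Callable[[SketchTurnContext, Exception], Awaitable[None]]


class SketchAdapter:
    def __init__(self, expected_header: str):
        self.expected_header = expected_header  # placeholder for real auth
        self.on_turn_error: Optional[ErrorHandler] = None

    async def process_activity(
        self, activity: dict, auth_header: str, turn_handler: TurnHandler
    ) -> List[dict]:
        # 1. Check the authentication header (placeholder comparison only).
        if auth_header != self.expected_header:
            raise PermissionError("invalid authentication header")
        # 2. Create a context object for the turn.
        context = SketchTurnContext(activity)
        try:
            # 3. A middleware pipeline would run here; then the turn handler.
            await turn_handler(context)
        except Exception as error:
            # 4. Catch errors the turn handler did not handle itself.
            if self.on_turn_error is None:
                raise
            await self.on_turn_error(context, error)
        return context.sent


async def echo_handler(context: SketchTurnContext) -> None:
    await context.send_activity(f"Echo: {context.activity['text']}")


async def failing_handler(context: SketchTurnContext) -> None:
    raise ValueError("bug in bot logic")


async def apologize(context: SketchTurnContext, error: Exception) -> None:
    await context.send_activity("Sorry, something went wrong.")
```

Keeping error handling in the adapter means a failure in bot logic can still produce a response to the user instead of ending the turn silently.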
The turn context
The turn context object provides information about the activity such as the sender and receiver, the channel, and other data needed to process the activity. It also allows for the addition of information during the turn across various layers of the bot.
The turn context is one of the most important abstractions in the SDK. Not only does it carry the inbound activity to all the middleware components and the application logic but it also provides the mechanism whereby the middleware components and the bot logic can send outbound activities.
Middleware is much like any other messaging middleware, comprising a linear set of components that are each executed in order, giving each a chance to operate on the activity. The final stage of the middleware pipeline is a callback to the turn handler on the bot class the application has registered with the adapter's process activity method. Middleware implements an on turn method which the adapter calls.
The turn handler takes a turn context as its argument. Typically, the application logic running inside the turn handler function processes the inbound activity's content and generates one or more activities in response, sending these out using the send activity function on the turn context. Calling send activity on the turn context causes the middleware components to be invoked on the outbound activities. Middleware components execute before and after the bot's turn handler function. The execution is inherently nested and, as such, is sometimes described as being like an onion.
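The nesting can be seen in a small sketch. Each middleware component below has an `on_turn` method that does some work, awaits the next layer, and then does more work once the inner layers return. The names `LoggingMiddleware`, `run_pipeline`, and `demo` are hypothetical, the context is just a dictionary, and the handling of outbound activities is not modeled.

```python
import asyncio
from typing import Awaitable, Callable, List

NextFn = Callable[[], Awaitable[None]]


class LoggingMiddleware:
    """Records when it runs relative to the layers inside it."""

    def __init__(self, name: str, log: List[str]):
        self.name = name
        self.log = log

    async def on_turn(self, context: dict, next_fn: NextFn) -> None:
        self.log.append(f"{self.name}: before")
        await next_fn()  # hand off to the next layer (or the bot)
        self.log.append(f"{self.name}: after")


async def run_pipeline(middleware: list, context: dict, turn_handler) -> None:
    """Run each component in order; the last stage is the bot's turn handler."""

    async def run_from(index: int) -> None:
        if index == len(middleware):
            await turn_handler(context)
            return
        await middleware[index].on_turn(context, lambda: run_from(index + 1))

    await run_from(0)


async def demo() -> List[str]:
    log: List[str] = []

    async def turn_handler(context: dict) -> None:
        log.append("bot: turn handler")

    pipeline = [LoggingMiddleware("A", log), LoggingMiddleware("B", log)]
    await run_pipeline(pipeline, {}, turn_handler)
    return log


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

Running `demo()` yields the order A before, B before, bot turn handler, B after, A after, which is the onion shape: the first component registered is the outermost layer.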
The middleware topic describes middleware in greater depth.
Bot state and storage
As with other web apps, a bot is inherently stateless. State within a bot follows the same paradigms as modern web applications, and the Bot Framework SDK provides storage layer and state management abstractions to make state management easier.
The managing state topic describes these state and storage features.
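As a sketch of the idea, the code below separates a storage layer (simple reads and writes by key) from a state object that loads a conversation's data at the start of a turn and saves it at the end. All names here (`SketchMemoryStorage`, `SketchConversationState`, `counting_turn`) and the storage key format are hypothetical and are not the SDK's classes or key scheme.

```python
import asyncio
import copy


class SketchMemoryStorage:
    """Storage layer: reads and writes plain dictionaries by key."""

    def __init__(self):
        self._items = {}

    async def read(self, keys):
        return {k: copy.deepcopy(self._items[k]) for k in keys if k in self._items}

    async def write(self, changes):
        for key, value in changes.items():
            self._items[key] = copy.deepcopy(value)


class SketchConversationState:
    """State management: caches one conversation's data for a turn."""

    def __init__(self, storage):
        self._storage = storage
        self._cache = {}

    @staticmethod
    def _key(activity):
        # Hypothetical key format, scoped to channel and conversation.
        return f"{activity['channelId']}/conversations/{activity['conversation']['id']}"

    async def load(self, activity):
        key = self._key(activity)
        items = await self._storage.read([key])
        self._cache[key] = items.get(key, {})

    def get(self, activity, name, default=None):
        return self._cache[self._key(activity)].get(name, default)

    def set(self, activity, name, value):
        self._cache[self._key(activity)][name] = value

    async def save_changes(self, activity):
        key = self._key(activity)
        await self._storage.write({key: self._cache[key]})


async def counting_turn(storage, activity):
    """One stateless turn: load state, update it, save it, and respond."""
    state = SketchConversationState(storage)
    await state.load(activity)
    count = state.get(activity, "turn_count", 0) + 1
    state.set(activity, "turn_count", count)
    await state.save_changes(activity)
    return f"Turn {count}: {activity['text']}"
```

Because a new state object is created on every turn, nothing survives between turns except what was written to storage, which mirrors how a stateless web app behaves.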
Messaging endpoint and provisioning
Typically, your application will need a REST endpoint at which to receive messages. It will also need to provision resources for your bot in accordance with the platform you decide to use.
Follow the Create a bot quickstart to create and test a simple echo bot.
Activities arrive at the bot from the Bot Framework Service via an HTTP POST request. The bot responds to the inbound POST request with a 200 HTTP status code. Activities sent from the bot to the channel are sent on a separate HTTP POST to the Bot Framework Service. This, in turn, is acknowledged with a 200 HTTP status code.
The protocol doesn't specify the order in which these POST requests and their acknowledgments are made. However, to fit with common HTTP service frameworks, typically these requests are nested, meaning that the outbound HTTP request is made from the bot within the scope of the inbound HTTP request. This pattern is illustrated in the earlier diagram. Since there are two distinct HTTP connections back to back, the security model must provide for both.
The bot has 15 seconds to acknowledge the call with a status 200 on most channels. If the bot does not respond within 15 seconds, an HTTP GatewayTimeout error (504) occurs.
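The nested request pattern described above might look roughly like this. The sketch is transport-agnostic: `handle_inbound_post` stands in for whatever your web framework calls on an inbound POST, and `post_to_service` is a hypothetical callable that would perform the outbound POST to the Bot Framework Service. Authentication, error handling, and the 15-second limit are not modeled.

```python
import json
from typing import Callable, List


def handle_inbound_post(
    body: bytes,
    post_to_service: Callable[[dict], int],
    trace: List[str],
) -> int:
    """Process one inbound POST and return the HTTP status code to send back."""
    trace.append("inbound POST received")

    # The activity arrives as JSON in the request body.
    activity = json.loads(body)

    if activity.get("type") == "message":
        reply = {
            "type": "message",
            "text": f"Echo: {activity.get('text', '')}",
            "replyToId": activity.get("id"),
        }
        # Outbound POST made within the scope of the inbound request.
        status = post_to_service(reply)
        trace.append(f"outbound POST acknowledged with {status}")

    # Only now is the inbound request acknowledged.
    trace.append("inbound POST acknowledged with 200")
    return 200


def fake_post_to_service(activity: dict) -> int:
    """Stand-in for a real HTTP client call; always reports success."""
    return 200
```

Passing the outbound call in as a parameter keeps the sketch testable without a network, and the `trace` list records that the outbound acknowledgment arrives before the inbound one.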
The activity processing stack
Let's drill into the previous sequence diagram with a focus on the arrival of a message activity.
The adapter, an integrated component of the SDK, is the core of the SDK runtime. The activity is carried as JSON in the HTTP POST body. This JSON is deserialized to create the activity object that is then handed to the adapter through its process activity method. On receiving the activity, the adapter creates a turn context and calls the middleware.
As mentioned above, the turn context provides the mechanism for the bot to send outbound activities, most often in response to an inbound activity. To achieve this, the turn context provides send, update, and delete activity response methods. Each response method runs in an asynchronous process.
The thread handling the primary bot turn deals with disposing of the context object when it is done. Be sure to await any activity calls so the primary thread will wait on the generated activity before finishing its processing and disposing of the turn context. Otherwise, if a response (including its handlers) takes any significant amount of time and tries to act on the context object, it may get a context was disposed error.
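The following sketch imitates that failure mode with plain `asyncio`. `SketchTurnContext` is a hypothetical stand-in whose `send_activity` takes a moment to complete and fails if the context has already been disposed. The handler and runner names are also made up for this example.

```python
import asyncio


class SketchTurnContext:
    def __init__(self):
        self.disposed = False
        self.sent = []

    async def send_activity(self, text: str) -> None:
        await asyncio.sleep(0.01)  # simulate the time a send takes
        if self.disposed:
            raise RuntimeError("context was disposed")
        self.sent.append(text)


async def good_handler(context: SketchTurnContext):
    # Awaited: the turn does not finish until the send completes.
    await context.send_activity("hello")
    return None


async def bad_handler(context: SketchTurnContext):
    # Not awaited: the send is scheduled, but the handler returns at once.
    return asyncio.create_task(context.send_activity("hello"))


async def run_turn(handler) -> str:
    context = SketchTurnContext()
    leftover = await handler(context)
    context.disposed = True  # the turn is over; the context is disposed
    if leftover is not None:
        try:
            await leftover  # observe what happened to the stray send
        except RuntimeError as error:
            return str(error)
    return f"sent {context.sent}"


if __name__ == "__main__":
    print(asyncio.run(run_turn(good_handler)))
    print(asyncio.run(run_turn(bad_handler)))
```

With `good_handler` the message is sent before the context is disposed. With `bad_handler` the send is still in flight when the turn ends, so it fails with the disposed-context error.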
A bot is a web application, and templates are provided for each language version of the SDK. All templates provide a default endpoint implementation and adapter. Each template includes:
- Resource provisioning
- A language-specific HTTP endpoint implementation that routes incoming activities to an adapter.
- An adapter object
- A bot object
The main difference between the different template types is in the bot object. The templates are:
- Empty bot
  - Includes an activity handler that welcomes a user to the conversation by sending a "hello world" message on the first turn of the conversation.
- Echo bot
  - Uses an activity handler to welcome users and echo back user input.
- Core bot
  - Brings together many features of the SDK and demonstrates best practices for a bot.
  - Uses an activity handler to welcome users.
  - Uses a component dialog and child dialogs to manage the conversation.
  - The dialogs use Language Understanding (LUIS) and QnA Maker features.
Managing bot resources
The bot resources, such as app ID, passwords, keys or secrets for connected services, will need to be managed appropriately. For more on how to do so, see the Bot Framework security guidelines and about managing bot resources.
The SDK also lets you use channel adapters, in which the adapter itself additionally performs the tasks that the Bot Connector Service would normally do for a channel.
The SDK provides a few channel adapters in some languages. More channel adapters are available through the Botkit and Community repositories. For more details, see the Bot Framework SDK repository's table of channels and adapters.
The Bot Connector REST API
The Bot Framework SDK wraps and builds upon the Bot Connector REST API. If you want to understand the underlying HTTP requests that support the SDK, see the Connector authentication article and its associated articles. The activities a bot sends and receives conform to the Bot Framework Activity schema.