Enterprise Bot Architecture
Many organizations are seeing the productivity gains of utilizing bots across the enterprise. In many cases a department will create a bot as a proof of concept that turns out to be a success and other departments quickly see value in applying bot technology to their domain.
However, this can lead to issues if there is not a cohesive enterprise bot strategy and architecture in place. In this post we will look at some of the scaling issues enterprises face as they deploy more bots and a sample architecture that can help alleviate those issues and maximize the productivity gains for the enterprise.
Acme Co. is a large, global company with about 50,000 employees worldwide. They have a help desk that services all their employees and even provides first-level support for their customers. Their help desk is staffed with a mix of full-time employees and contractors from a help desk service provider. Call volume has been relatively consistent for the past several years, with occasional spikes when new technology or products are introduced.
In the coming year, Acme is migrating most of their user base to the latest version of the Windows operating system and deploying two large overhauls to their customer-facing applications. Based on historical patterns, this will cause a significant spike in calls to the help desk, incurring additional expenses and lowering customer satisfaction. To proactively address these issues, Acme Co. develops a bot that allows employees and customers to find key information and perform several self-help tasks. In addition, Acme Co. takes the time to incorporate password reset functionality into the bot, as their metrics show password resets are the number one reason for help desk calls (this is typical across industries).
Acme Co. quickly creates an MVP and performs a controlled rollout. The rollout shows the bot will meet or exceed their expectations, so they proceed with a complete rollout to their internal and external user base. As they migrate to Windows 10 and deploy the updates to their customer applications, they see not only the expected increase mitigated but an actual decline in overall call volume, as 90% of password reset cases can now be handled by the bot. The project is a success, and several other teams are keen on utilizing bots in their areas and make plans to deploy MVPs.
Fast forward several months, and there are more than half a dozen bots deployed at Acme Co. In addition to the original help desk bot, which was expanded to cover more topics, there are bots for human resources, the legal department, IT infrastructure, the travel department, and a few more. Several more teams at Acme Co. are looking to have their bots go to production by the end of the year.
Acme’s CTO quickly realizes that bot technology has helped, but the ad-hoc deployment of bots across the enterprise has created several problems that are adding architectural and technical debt.
The user experience for employees is starting to suffer. Users have to remember the different bots and where each one lives on the intranet. In addition, some bots allow access through additional channels such as SMS. When users connect to various bots, there are no consistent dialog flows or expected behaviors, so users have to learn the quirks of each bot individually. Bots from different departments also have very inconsistent user interfaces, which users sometimes find confusing.
Users often don’t know which bots exist until some time has passed. While each bot is announced by email and on the intranet home site for a time, users are inundated with various messaging and often miss these announcements. They typically find out about bots through word of mouth, which is especially difficult for employees in remote offices. This means the enterprise is not getting the maximum productivity from its bots.
While various teams did share some code, there is no single bot code repository with templates for the teams to reuse. When one team adopts a new feature in the core bot framework or fixes a bug in their bot, the other bots do not get the benefit of this work. When a department starts a new bot, they end up using another department's bot as a starting point, but the two quickly diverge. Lastly, many teams can get funding to develop a bot through Acme’s global business services or through a partner, but they do not have the funding or skill sets on their team to maintain the bot through its lifecycle.
The bots have varying levels of security and have to go through more comprehensive and expensive audits, since much of each code base is unique. Some bots have enabled channels other than the intranet, allowing users to connect through SMS or messaging systems such as Skype.
Inconsistent API Access
Many of the bots need to interface with internal or third-party systems, and they have not been consistent in how they integrate with them. Some of the bots call the systems’ available APIs directly, others use a micro-service architecture, and some access databases directly. This has caused a lot of additional development and, in some cases, tight system coupling and dependencies that are undesirable in the long term.
A visual representation of this disjointed architecture would look similar to this:
Finding A Solution
A solid enterprise bot architecture and strategy should be able to solve all the problems listed above and provide some additional benefits, such as quick deployment of new bots into the ecosystem. The approach we will present here is to create a special bot that acts as the single point of entry into the enterprise bot ecosystem. We will call this bot the master bot, and all the other bots will now be child bots.
The master bot will be the single point where ACL is handled and will provide a consistent user interface back to the user. API access will be done through micro-services, Azure Logic Apps, or a combination. All the child bots will use a shared bot template from a central repository. The revised architecture would look something like this:
Managing The Conversation
The master bot will be responsible for managing the conversation with the end user and routing the conversation to the appropriate child bot. While the user is talking to a child bot, the conversation data still flows through the master bot so that it can look for commands or cues that signal it should step back into the conversation and perhaps route it to a different bot.
A simple way to do this that scales well is to organize conversations around topics. Each child bot registers itself with the master bot and tells it which topic(s) it can handle. The master bot can then present a list of common topics, or all topics, to help guide the user to the right child bot. It can also watch for a command such as “switch topics” or “show topics” as a cue to step back into the conversation and take over.
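The registration and routing flow described above can be sketched in a few lines. This is a minimal illustration rather than the sample code referenced in this post: the `MasterBot` class, its method names, and the idea of modelling a child bot as a plain function are all assumptions made for the example, and a real implementation would sit on top of a bot framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Command phrases taken from the text above; the exact matching rule is an assumption.
MASTER_COMMANDS = {"switch topics", "show topics"}

# Hypothetical stand-in for a child bot: a function from user text to a reply.
Handler = Callable[[str], str]


@dataclass
class MasterBot:
    # Maps a lower-cased topic name to the child bot registered for it.
    registry: Dict[str, Handler] = field(default_factory=dict)
    # Topic currently in progress; None means the master bot is in control.
    active_topic: Optional[str] = None

    def register(self, topics: List[str], handler: Handler) -> None:
        """Called by a child bot to announce the topics it can handle."""
        for topic in topics:
            self.registry[topic.lower()] = handler

    def topics(self) -> List[str]:
        return sorted(self.registry)

    def on_message(self, text: str) -> str:
        normalized = text.strip().lower()
        # Every message passes through here, so the master bot can always step in.
        if normalized in MASTER_COMMANDS:
            self.active_topic = None
            return "Topics: " + ", ".join(self.topics())
        if self.active_topic is None:
            if normalized in self.registry:
                self.active_topic = normalized
                return f"Switched to '{normalized}'. How can I help?"
            return "Please pick a topic. Topics: " + ", ".join(self.topics())
        # Otherwise forward the original text to the active child bot.
        return self.registry[self.active_topic](text)
```

A child bot would call `register(["travel", "expenses"], handler)` once at startup. After the user picks "travel", every message is forwarded to that handler until the user types "switch topics" or "show topics".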
Resolving Core Issues
The user experience is greatly improved by having the master bot provide a consistent user interface regardless of which bot is accessed. Company branding is easily enforced and updated across all the bots. The master bot can provide a common set of commands that are always available, allowing the user to quickly perform common actions. The master bot can also provide a consistent set of user interface controls across all channels that the child bots can rely upon. The child bots still have quite a bit of flexibility in defining custom cards using Adaptive Cards.
Discovery of new bots and their capabilities is greatly enhanced in this model. Since users will be going to the master bot for many tasks, when a new bot is deployed the master bot can display a new welcome message informing the user of a new capability. In addition, when the user asks to see more topics, new bots will automatically show up in the list.
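One way the welcome message could pick up new capabilities is to compare the registered topics against the topics a user has already been shown. The sketch below assumes the master bot keeps a per-user set of seen topics; the function name and the message wording are made up for illustration.

```python
from typing import Iterable, List, Set


def new_topic_announcement(registered: Iterable[str], seen: Set[str]) -> str:
    """Build a welcome line listing topics the user has not been shown yet.

    Returns an empty string when there is nothing new to announce.
    """
    fresh: List[str] = sorted(set(registered) - seen)
    if not fresh:
        return ""
    return "New since your last visit: " + ", ".join(fresh)
```

After showing the message, the master bot would add the announced topics to the user's seen set so each new bot is announced only once.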
Bot Maintainability & Creation
By sharing a common template, the individual child bots can receive updates by syncing with the core source repository. As a result, they get core bug fixes, new features, and security enhancements. With this shared template, departments can create bots at their own speed and deploy them into the ecosystem easily.
Security for the master bot and the child bots can be managed with standard security tools. For example, each child bot can be registered in Azure Active Directory, and the master bot can simply query AAD with the current user's authorization to see which bots the user should have access to. The master bot would then not display topics for bots that the current user cannot access. This makes access to the bots easy to manage and fits within existing security practices and tools.
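The topic filtering described above boils down to intersecting the user's group memberships with the groups allowed to use each bot. The sketch below deliberately does not call Azure Active Directory: the `BOT_ALLOWED_GROUPS` table and the group names are invented stand-ins for whatever the directory query would return.

```python
from typing import Dict, List, Set

# Made-up data standing in for a directory lookup. In the setup described
# above, this information would come from Azure Active Directory.
BOT_ALLOWED_GROUPS: Dict[str, Set[str]] = {
    "help desk": {"employees", "customers"},
    "human resources": {"employees"},
    "legal": {"legal-staff"},
}


def visible_topics(user_groups: Set[str], bot_groups: Dict[str, Set[str]]) -> List[str]:
    """Return the topics whose bot shares at least one group with the user."""
    return sorted(
        topic for topic, allowed in bot_groups.items() if allowed & user_groups
    )
```

With this data, a user in only the `customers` group would be offered just the help desk topic, while the legal bot stays hidden from everyone outside `legal-staff`.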
Inconsistent API Access
By using a common template and utilizing Azure Logic Apps or custom micro-services, the various bots will have a consistent way to access other systems, which reduces coupling between the bots and those systems. A common class library in the template can handle tasks such as managing access tokens and service accounts by using services such as Azure Key Vault. The bot writer is given a simple API to the necessary system and does not have to worry about the details.
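As a rough idea of what such a shared library might hide from the bot writer, the sketch below caches access tokens per target system and refreshes them when they expire. It does not use the Azure Key Vault SDK: `read_secret` and `issue_token` are hypothetical callables standing in for the secret store and the target system's token endpoint, and the `TokenCache` name is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical callables: the first stands in for a secret store such as
# Azure Key Vault, the second for the target system's token endpoint.
SecretReader = Callable[[str], str]
TokenIssuer = Callable[[str], Tuple[str, float]]  # secret -> (token, lifetime in seconds)


@dataclass
class TokenCache:
    read_secret: SecretReader
    issue_token: TokenIssuer
    # Injectable clock so expiry can be exercised without waiting.
    clock: Callable[[], float] = time.monotonic

    def __post_init__(self) -> None:
        # Maps a system name to (token, expiry time on the clock).
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def token_for(self, system: str) -> str:
        """Return a valid token for the system, fetching a new one if needed."""
        now = self.clock()
        cached = self._tokens.get(system)
        if cached is not None and cached[1] > now:
            return cached[0]
        token, lifetime = self.issue_token(self.read_secret(system))
        self._tokens[system] = (token, now + lifetime)
        return token
```

A child bot would only ever call `token_for("hr")`, or a higher-level helper built on it. Where the secret lives and how often the token is renewed stay inside the shared template.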
Here is an example video of how a master / child bot pattern would work:
[video width="365" height="794" mp4="https://msdnshared.blob.core.windows.net/media/2018/02/masterBotSample.mp4"][/video]
Bots are becoming a valuable way for businesses to enhance productivity and generate new revenue. As bots continue to be deployed across enterprises, we will see the creation and refinement of new enterprise-wide patterns for managing them. By adopting these patterns early on, enterprises can avoid significant technical debt and user confusion.
Sample open source code for the master bot pattern above will be available a few weeks after this posting has been made available. Please subscribe to get notified when the code is made available!