About custom voice-first virtual assistants (preview)

Custom virtual assistants built with Azure Speech Services let developers create natural, human-like conversational interfaces for their applications and experiences. The Bot Framework's Direct Line Speech channel extends these capabilities by providing a single, orchestrated entry point to a compatible bot, enabling voice-in, voice-out interaction with low latency and high reliability. These bots can use Microsoft's Language Understanding (LUIS) service for natural language interaction. Devices access Direct Line Speech through the Speech Software Development Kit (SDK).

Figure: Conceptual diagram of the Direct Line Speech orchestration service flow

Direct Line Speech and its associated functionality for custom voice-first virtual assistants are an ideal supplement to the Virtual Assistant Solution and Enterprise Template. Though Direct Line Speech can work with any compatible bot, these resources provide a reusable baseline for high-quality conversational experiences as well as common supporting skills and models for getting started quickly.

Core features

| Category | Features |
|---|---|
| Custom wake word | You can let users begin conversations with bots by saying a custom keyword such as "Hey Contoso." The Speech SDK includes a custom wake word engine that you configure with a custom wake word that you generate. The Direct Line Speech channel adds service-side wake word verification, which improves the accuracy of wake word activation compared with detection on the device alone. |
| Speech to text | The Direct Line Speech channel transcribes audio into recognized text in real time, using Speech-to-text from Azure Speech Services. This text is available to both your bot and your client application as it is transcribed. |
| Text to speech | Textual responses from your bot are synthesized using Text-to-speech from Azure Speech Services. The synthesized speech is then made available to your client application as an audio stream. Microsoft also offers the ability to build your own custom, high-quality Neural TTS voice that gives a voice to your brand. To learn more, contact us. |
| Direct Line Speech | As a channel within the Bot Framework, Direct Line Speech provides a seamless connection between your client application, a compatible bot, and the capabilities of Azure Speech Services. For more information on configuring your bot to use the Direct Line Speech channel, see its page in the Bot Framework documentation. |
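The features in the table above combine into a single voice turn: the device detects the wake word, the service verifies it, the audio is transcribed, the bot answers, and the answer is synthesized. The following is a minimal conceptual sketch of that flow in plain Python. It does not use the Speech SDK or the Direct Line Speech channel; every name in it (`run_voice_turn`, `device_detects`, `service_verifies`, `TurnResult`, and so on) is hypothetical, and "audio" is modeled as UTF-8 text bytes purely so the example can run anywhere.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TurnResult:
    """Outcome of one voice turn (hypothetical type, for illustration only)."""
    recognized_text: str   # what speech-to-text produced
    reply_text: str        # what the bot answered
    reply_audio: bytes     # synthesized answer, as an audio stream stand-in


def device_detects(audio: bytes, wake_word: str) -> bool:
    # Stand-in for the on-device wake word engine: a deliberately loose
    # check that can produce false accepts.
    return wake_word.split()[-1].lower() in audio.decode("utf-8").lower()


def service_verifies(audio: bytes, wake_word: str) -> bool:
    # Stand-in for service-side verification: a stricter second check
    # that rejects some of the device's false accepts.
    return audio.decode("utf-8").lower().startswith(wake_word.lower())


def speech_to_text(audio: bytes, wake_word: str) -> str:
    # Stand-in for transcription: drop the wake word and leading punctuation.
    text = audio.decode("utf-8")
    return text[len(wake_word):].lstrip(" ,.")


def text_to_speech(text: str) -> bytes:
    # Stand-in for synthesis: real output would be encoded audio.
    return text.encode("utf-8")


def run_voice_turn(
    audio: bytes, wake_word: str, bot: Callable[[str], str]
) -> Optional[TurnResult]:
    """Run one voice-in, voice-out turn; return None if activation fails."""
    if not device_detects(audio, wake_word):
        return None  # the device never woke up
    if not service_verifies(audio, wake_word):
        return None  # the service rejected a false accept
    recognized = speech_to_text(audio, wake_word)
    reply = bot(recognized)
    return TurnResult(recognized, reply, text_to_speech(reply))


def echo_bot(text: str) -> str:
    # Trivial bot used only to exercise the flow.
    return "You said: " + text
```

Calling `run_voice_turn(b"Hey Contoso, what's the weather?", "Hey Contoso", echo_bot)` walks through all stages, while an utterance such as `b"I visited Contoso yesterday"` passes the loose device check but is rejected by the stricter second stage. That two-stage shape is the point of the sketch: it illustrates why a service-side check can improve activation accuracy over the device alone, not how the actual service implements it.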

Get started with virtual assistants

We offer quickstarts designed to have you running code in less than 10 minutes. The following table lists the voice-first virtual assistant quickstarts, organized by language.

| Quickstart | Platform | API reference |
|---|---|---|
| C#, UWP | Windows | Browse |
| Java | Windows, macOS, Linux | Browse |
| Java | Android | Browse |

Sample code

Sample code for creating a voice-first virtual assistant is available on GitHub. These samples cover the client application for connecting to your bot in several popular programming languages.


A tutorial is also available that shows how to voice-enable your bot using the Speech SDK and the Direct Line Speech channel.


Voice-first virtual assistants built using Azure Speech Services can use the full range of customization options available for speech-to-text, text-to-speech, and custom keyword selection.


Customization options vary by language/locale (see Supported languages).
