What is speech translation?


Transport Layer Security (TLS) 1.2 is now enforced for all HTTP requests to this service. For more information, see Azure Cognitive Services security.

This overview covers the benefits and capabilities of the speech translation service, which enables real-time, multi-language speech-to-speech and speech-to-text translation of audio streams. With the Speech SDK, your applications, tools, and devices have access to source transcriptions and translation outputs for the audio you provide. Interim transcription and translation results are returned as speech is detected, and final results can be converted into synthesized speech.

This documentation contains the following article types:

  • Quickstarts are getting-started instructions to guide you through making requests to the service.
  • How-to guides contain instructions for using the service in more specific or customized ways.
  • Concepts provide in-depth explanations of the service functionality and features.
  • Tutorials are longer guides that show you how to use the service as a component in broader business solutions.

Core features

  • Speech-to-text translation with recognition results.
  • Speech-to-speech translation.
  • Support for translation to multiple target languages.
  • Interim recognition and translation results.
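
The features above can be illustrated with a minimal sketch in Python. It assumes the `azure-cognitiveservices-speech` package is installed, that you have a Speech resource key and region, and that a default microphone is available. The helper names `summarize` and `translate_once`, the source language `en-US`, and the target languages `de` and `fr` are illustrative choices, not part of the service. Treat it as a starting point rather than a reference implementation; the quickstart remains the authoritative guide.

```python
def summarize(source_text, translations):
    """Format a source transcription and its per-language translations.

    Hypothetical helper for this sketch; not part of the Speech SDK.
    """
    lines = [f"source: {source_text}"]
    for lang in sorted(translations):
        lines.append(f"{lang}: {translations[lang]}")
    return "\n".join(lines)


def translate_once(key, region, source_language="en-US", targets=("de", "fr")):
    """Recognize one utterance from the microphone and translate it.

    Returns a formatted summary, or None if no speech was translated.
    """
    # Imported lazily so this module still loads where the SDK is not installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, region=region
    )
    config.speech_recognition_language = source_language
    # Multiple target languages: each call adds one more translation output.
    for lang in targets:
        config.add_target_language(lang)

    audio = speechsdk.audio.AudioConfig(use_default_microphone=True)
    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=config, audio_config=audio
    )

    # Interim results: print partial translations as speech is detected.
    recognizer.recognizing.connect(
        lambda evt: print("interim:", dict(evt.result.translations))
    )

    # Single-shot recognition: returns after the first recognized utterance.
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.TranslatedSpeech:
        return summarize(result.text, dict(result.translations))
    return None
```

To try it, call `translate_once` with your own key and region and print the returned string. Speech-to-speech output is not shown here; it requires additional synthesis configuration described in the how-to guides.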

Get started

See the quickstart to get started with speech translation. The speech translation service is available via the Speech SDK and the Speech CLI.

Sample code

Sample code for the Speech SDK is available on GitHub. These samples cover common scenarios like reading audio from a file or stream, continuous and single-shot recognition/translation, and working with custom models.

Migration guides

If your applications, tools, or products are using the Translator Speech API, we've created guides to help you migrate to the Speech service.
