What is speech translation?
Speech translation from the Speech service enables real-time, multi-language speech-to-speech and speech-to-text translation of audio streams. With the Speech SDK, your applications, tools, and devices have access to source transcriptions and translation outputs for the audio you provide. Interim transcription and translation results are returned as speech is detected, and final results can be converted into synthesized speech.
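As a rough illustration of the flow described above, here is a minimal sketch that assumes the Speech SDK's Python package (`azure-cognitiveservices-speech`) is installed. The key, region, and language choices are placeholders, and `format_result` is a hypothetical helper for this example, not part of the SDK.

```python
def format_result(source_text, translations):
    """Hypothetical helper: render a transcription and its translations as lines."""
    lines = [f"source: {source_text}"]
    for lang in sorted(translations):
        lines.append(f"{lang}: {translations[lang]}")
    return lines


def translate_once(key, region, source_lang="en-US", targets=("de", "fr")):
    """Translate one utterance from the default microphone (sketch only)."""
    # Imported here so the helper above can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    # `key` and `region` are placeholders for your own Speech resource values.
    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, region=region)
    config.speech_recognition_language = source_lang
    for lang in targets:
        config.add_target_language(lang)

    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=config)

    # Interim results are delivered through the `recognizing` event.
    recognizer.recognizing.connect(
        lambda evt: print("interim:", dict(evt.result.translations)))

    result = recognizer.recognize_once()  # blocks until one utterance is final
    if result.reason == speechsdk.ResultReason.TranslatedSpeech:
        return format_result(result.text, dict(result.translations))
    return []
```

Calling `translate_once("<your-key>", "<your-region>")` would listen on the default microphone and return the source transcription followed by one line per target language. Synthesizing the final result into speech is not shown here.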
Microsoft's translation engine is powered by two different approaches: statistical machine translation (SMT) and neural machine translation (NMT). SMT uses advanced statistical analysis to estimate the best possible translations given the context of a few words. With NMT, neural networks are used to provide more accurate, natural-sounding translations by using the full context of sentences to translate words.
Today, Microsoft uses NMT for translation to most popular languages. All languages available for speech-to-speech translation are powered by NMT. Speech-to-text translation may use SMT or NMT depending on the language pair. When the target language is supported by NMT, the full translation is NMT-powered. When the target language isn't supported by NMT, the translation is a hybrid of NMT and SMT, using English as a "pivot" between the two languages.
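The routing rule in the paragraph above can be sketched as a small function. This is only an illustration of the described behavior, not the service's actual implementation: the `NMT_LANGUAGES` set and the `translation_plan` function are hypothetical, and the sketch assumes that the first hop into English uses NMT and the second hop out of English uses SMT.

```python
# Hypothetical set of NMT-supported language codes; the real list is not given here.
NMT_LANGUAGES = {"en", "de", "fr", "es", "ja"}


def translation_plan(source, target, nmt_languages=NMT_LANGUAGES):
    """Return the translation hops as (from, to, engine) tuples."""
    if target in nmt_languages:
        # Target is NMT-supported: the full translation is NMT-powered.
        return [(source, target, "NMT")]
    if source == "en":
        # Already in the pivot language, so only the SMT hop remains
        # (an edge case this sketch chooses to handle this way).
        return [("en", target, "SMT")]
    # Otherwise pivot through English: NMT into English, SMT out of it.
    return [(source, "en", "NMT"), ("en", target, "SMT")]
```

For example, with the hypothetical set above, `translation_plan("fr", "de")` yields a single NMT hop, while a target outside the set yields two hops with English in the middle.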
Core features
Here are the features available via the Speech SDK and REST APIs:
Use case | SDK | REST |
---|---|---|
Speech-to-text translation with recognition results. | Yes | No |
Speech-to-speech translation. | Yes | No |
Interim recognition and translation results. | Yes | No |
Get started with speech translation
We offer quickstarts designed to have you running code in less than 10 minutes. The following table lists the speech translation quickstarts, organized by language.
Quickstart | Platform | API reference |
---|---|---|
C#, .NET Core | Windows | Browse |
C#, .NET Framework | Windows | Browse |
C#, UWP | Windows | Browse |
C++ | Windows | Browse |
Java | Windows, Linux, macOS | Browse |
Sample code
Sample code for the Speech SDK is available on GitHub. These samples cover common scenarios like reading audio from a file or stream, continuous and single-shot recognition/translation, and working with custom models.
Migration guides
If your applications, tools, or products are using the Translator Speech API, we've created guides to help you migrate to the Speech service.
Reference docs
- Speech SDK
- Speech Devices SDK
- REST API: Speech-to-text
- REST API: Text-to-speech
- REST API: Batch transcription and customization