
What is Bing Speech?

Note

The new Speech Service and SDK are replacing Bing Speech, which will no longer work starting October 15, 2019. For information on switching to the Speech Service, see Migrating from Bing Speech to the Speech Service.

The cloud-based Microsoft Bing Speech API gives developers an easy way to create powerful speech-enabled features in their applications, such as voice command control, user dialog using natural speech conversation, and speech transcription and dictation. The Microsoft Speech API supports both Speech to Text and Text to Speech conversion.

  • The Speech to Text API converts human speech to text that can be used as input or commands to control your application.
  • The Text to Speech API converts text to audio streams that can be played back to the user of your application.

Speech to text (speech recognition)

The Microsoft speech recognition API transcribes audio streams into text that your application can display to the user or act upon as command input. It provides two ways for developers to add speech to their apps: REST APIs or WebSocket-based client libraries.

  • REST APIs: Developers can use HTTP calls from their apps to the service for speech recognition.
  • Client libraries: For advanced features, developers can download the Microsoft Speech client libraries and link them into their apps. The client libraries are available on various platforms (Windows, Android, iOS) in different languages (C#, Java, JavaScript, Objective-C). Unlike the REST APIs, the client libraries use a WebSocket-based protocol.
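
As a rough illustration of the REST approach, the following Python sketch builds (but does not send) an HTTP POST request carrying a short audio clip. It is a minimal sketch under stated assumptions, not the service's documented call: the endpoint URL is a hypothetical placeholder, and the query parameters, the `Ocp-Apim-Subscription-Key` header, and the audio content type are assumptions that should be checked against the service's REST reference.

```python
import urllib.parse
import urllib.request

# Hypothetical placeholder endpoint; replace it with the URL from the REST reference.
EXAMPLE_ENDPOINT = "https://speech.example.invalid/recognition"


def build_recognition_request(audio_bytes, subscription_key,
                              language="en-US", endpoint=EXAMPLE_ENDPOINT):
    """Build an HTTP POST request that uploads one short audio clip.

    The query parameters and header names below are assumptions made
    for illustration; they are not taken from this page.
    """
    query = urllib.parse.urlencode({"language": language, "format": "simple"})
    return urllib.request.Request(
        url=f"{endpoint}?{query}",
        data=audio_bytes,          # raw audio payload sent as the request body
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "audio/wav; codec=audio/pcm; samplerate=16000",
        },
    )
```

With a real endpoint and key, the request could be sent with `urllib.request.urlopen(request)` and the response body parsed as the recognition result.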

Use cases                                                                 REST APIs   Client libraries
Convert short audio (< 15 s), such as commands, without interim results   Yes         Yes
Convert long audio (> 15 s)                                               No          Yes
Stream audio with interim results                                         No          Yes
Understand the text converted from audio by using LUIS                    No          Yes
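
The comparison of use cases can be read as a small decision rule. The sketch below encodes it in Python; the `Scenario` type and function names are hypothetical helpers introduced only for illustration. The source does not say how audio of exactly 15 seconds is handled, so the sketch assumes it needs the client library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical description of what an application needs."""
    audio_seconds: float
    needs_interim_results: bool = False
    needs_luis: bool = False


def rest_api_suffices(scenario):
    """Return True when the REST API covers the scenario.

    REST handles only short audio (< 15 s) without interim results or LUIS.
    Treating exactly 15 s as "long" is an assumption.
    """
    return (scenario.audio_seconds < 15
            and not scenario.needs_interim_results
            and not scenario.needs_luis)


def choose_approach(scenario):
    """Name the approaches that cover the scenario."""
    if rest_api_suffices(scenario):
        return "REST API or client library"
    return "client library"
```

For example, `choose_approach(Scenario(audio_seconds=5))` allows either approach, while any scenario that needs interim results or LUIS is routed to the client library.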

Whichever approach developers choose (REST APIs or client libraries), the Microsoft speech service supports the following:

  • Advanced speech recognition technologies from Microsoft that are used by Cortana, Office Dictation, Office Translator, and other Microsoft products.
  • Real-time continuous recognition. The speech recognition API enables users to transcribe audio into text in real time and supports receiving interim results for the words recognized so far. The speech service also supports end-of-speech detection. In addition, users can choose additional formatting capabilities, such as capitalization and punctuation, profanity masking, and text normalization.
  • Speech recognition results optimized for interactive, conversation, and dictation scenarios. For user scenarios that require customized language models and acoustic models, Custom Speech Service allows you to create speech models that are tailored to your application and your users.
  • Support for many spoken languages in multiple dialects. For the full list of supported languages in each recognition mode, see recognition languages.
  • Integration with language understanding. Besides converting the input audio into text, Speech to Text gives applications the additional capability to understand what the text means. It uses the Language Understanding Intelligent Service (LUIS) to extract intents and entities from the recognized text.
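
To make the idea of interim results and end-of-speech detection concrete, here is a minimal Python sketch of how an application might consume a stream of recognition events. The `RecognitionEvent` type and the event kinds `"partial"`, `"final"`, and `"end_of_speech"` are hypothetical names chosen for illustration; they are not the client library's actual event model.

```python
from dataclasses import dataclass


@dataclass
class RecognitionEvent:
    """Hypothetical event: kind is "partial", "final", or "end_of_speech"."""
    kind: str
    text: str = ""


def collect_transcript(events, on_partial=None):
    """Join final phrases into a transcript, stopping at end of speech.

    Partial (interim) hypotheses are passed to on_partial, for example to
    update a live caption, but are not added to the transcript because a
    later final result supersedes them.
    """
    phrases = []
    for event in events:
        if event.kind == "partial":
            if on_partial is not None:
                on_partial(event.text)
        elif event.kind == "final":
            phrases.append(event.text)
        elif event.kind == "end_of_speech":
            break  # the service detected that the speaker stopped talking
    return " ".join(phrases)
```

Keeping interim text out of the stored transcript is a design choice in this sketch: interim hypotheses can change as more audio arrives, so only final phrases are kept.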

Next steps

Text to speech (speech synthesis)

The Text to Speech APIs use REST to convert structured text to an audio stream. The APIs provide fast text-to-speech conversion in various voices and languages. In addition, users can change audio characteristics such as pronunciation, volume, and pitch by using SSML tags.
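
As an illustration of the structured text involved, the following Python sketch builds a small SSML document that adjusts volume and pitch with the standard `prosody` element. It is a sketch under assumptions: the voice name is a hypothetical placeholder, and the exact set of elements and attribute values the service accepts is not stated on this page.

```python
import xml.etree.ElementTree as ET

# Qualified name that ElementTree serializes as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_ssml(text, voice_name, lang="en-US", volume="loud", pitch="high"):
    """Return an SSML string that speaks text with adjusted volume and pitch.

    voice_name is a caller-supplied placeholder; valid names depend on the
    service and are not listed here.
    """
    speak = ET.Element("speak", {"version": "1.0", XML_LANG: lang})
    voice = ET.SubElement(speak, "voice", {"name": voice_name})
    prosody = ET.SubElement(voice, "prosody", {"volume": volume, "pitch": pitch})
    prosody.text = text  # ElementTree escapes &, <, and > when serializing
    return ET.tostring(speak, encoding="unicode")
```

Building the document with an XML library rather than string concatenation keeps special characters in the input text from producing malformed SSML.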

Next steps