What is the Speech service?

The Speech service unifies speech-to-text, text-to-speech, and speech translation under a single Azure subscription. You can speech-enable your applications, tools, and devices with the Speech CLI, Speech SDK, Speech Devices SDK, Speech Studio, or REST APIs.

Important

The Speech service has replaced the Bing Speech API and Translator Speech. See the Migration section for migration instructions.

The following features are part of the Speech service. Use the links in this table to learn more about common use cases for each feature, or browse the API reference.

Service | Feature | Description | SDK | REST
Speech-to-text | Real-time speech-to-text | Transcribes or translates audio streams or local files to text in real time, so that your applications, tools, or devices can consume or display it. Use speech-to-text with Language Understanding (LUIS) to derive user intents from transcribed speech and act on voice commands. | Yes | Yes
Speech-to-text | Batch speech-to-text | Enables asynchronous speech-to-text transcription of large volumes of speech audio data stored in Azure Blob Storage. In addition to converting speech audio to text, batch speech-to-text supports diarization and sentiment analysis. | No | Yes
Speech-to-text | Multi-device Conversation | Connects multiple devices or clients in a conversation to send speech-based or text-based messages, with easy support for transcription and translation. | Yes | No
Speech-to-text | Conversation Transcription | Enables real-time speech recognition, speaker identification, and diarization. It is well suited to transcribing in-person meetings because it can distinguish speakers. | Yes | No
Speech-to-text | Create Custom Speech models | If you use speech-to-text for recognition and transcription in a unique environment, you can create and train custom acoustic, language, and pronunciation models to address ambient noise or industry-specific vocabulary. | No | Yes
Text-to-speech | Text-to-speech | Converts input text into human-like synthesized speech using Speech Synthesis Markup Language (SSML). Choose from standard voices and neural voices (see Language support). | Yes | Yes
Text-to-speech | Create Custom Voices | Create custom voice fonts unique to your brand or product. | No | Yes
Speech translation | Speech translation | Enables real-time, multi-language translation of speech in your applications, tools, and devices. Use this service for speech-to-speech and speech-to-text translation. | Yes | No
Voice assistants | Voice assistants | Voice assistants that use the Speech service let developers create natural, human-like conversational interfaces for their applications and experiences. The voice assistant service provides fast, reliable interaction between a device and an assistant implementation that uses the Bot Framework's Direct Line Speech channel or the integrated Custom Commands service for task completion. | Yes | No
Speaker Recognition | Speaker verification and identification | Provides algorithms that verify and identify speakers by their unique voice characteristics. Speaker Recognition answers the question "who is speaking?" | Yes | Yes
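
The table notes that text-to-speech takes SSML as input. As a rough illustration of what such a document can look like, the sketch below wraps plain text in a minimal SSML envelope using only the Python standard library. The helper name build_ssml is hypothetical, and the voice name en-US-JennyNeural is a placeholder assumption; check the Language support page for the voices available to your resource.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, voice: str = "en-US-JennyNeural", lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document (hypothetical helper).

    The voice name is a placeholder assumption, not a value taken from
    this article.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        # escape() protects characters such as & and < in the user text.
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


ssml = build_ssml("Tom & Jerry say hello.")
print(ssml)
```

Escaping the text before embedding it matters because unescaped ampersands or angle brackets would make the SSML document malformed.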

Important

TLS 1.2 is now enforced for all HTTP requests to this service. For more information, see Azure Cognitive Services security.
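
For clients that call the REST APIs directly, one way to make sure older protocol versions are never negotiated is to set a minimum TLS version on the client side. The following is a minimal sketch using Python's standard ssl module; the function name make_tls12_context is hypothetical, and the Speech SDK manages its own connections, so this applies only to hand-written HTTP clients.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Return a client-side context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


context = make_tls12_context()
print(context.minimum_version)
```

The resulting context can be passed to standard-library calls that accept one, for example urllib.request.urlopen(url, context=context).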

Try the Speech service for free

For the following steps, you need both a Microsoft account and an Azure account. If you do not have a Microsoft account, you can sign up for one free of charge at the Microsoft account portal. Select Sign in with Microsoft and then, when asked to sign in, select Create a Microsoft account. Follow the steps to create and verify your new Microsoft account.

Once you have a Microsoft account, go to the Azure sign-up page, select Start free, and create a new Azure account using your Microsoft account. Here is a video of how to sign up for an Azure free account.

Note

A free Azure account comes with $200 in service credit, valid for up to 30 days, that you can apply toward a paid Speech service subscription. Your Azure services are disabled when your credit runs out or expires at the end of the 30 days. To continue using Azure services, you must upgrade your account. For more information, see How to upgrade your Azure free account.

The Speech service has two service tiers, free (F0) and subscription (S0), which have different limitations and benefits. If you use the free, low-volume Speech service tier, you can keep this free subscription even after your free trial or service credit expires. For more information, see Cognitive Services pricing - Speech service.

Create the Azure resource

To add a Speech service resource (free or paid tier) to your Azure account:

  1. Sign in to the Azure portal using your Microsoft account.

  2. Select Create a resource at the top left of the portal. If you do not see Create a resource, you can find it by selecting the collapsed menu in the upper-left corner of the screen.

  3. In the New window, type "speech" in the search box and press ENTER.

  4. In the search results, select Speech.

  5. Select Create, then:

    • Give your new resource a unique name. The name helps you distinguish among multiple subscriptions tied to the same service.
    • Choose the Azure subscription that the new resource is associated with; this determines how the fees are billed. Here is the introduction to creating an Azure subscription in the Azure portal.
    • Choose the region where the resource will be used. Azure is a global cloud platform that is generally available in many regions worldwide. For the best performance, select the region closest to you or to where your application runs. Speech service availability varies by region, so make sure that you create your resource in a supported region. See region support for Speech services.
    • Choose either the free (F0) or paid (S0) pricing tier. For complete information about pricing and usage quotas for each tier, select View full pricing details or see Speech services pricing. For limits on resources, see Azure Cognitive Services Limits.
    • Create a new resource group for this Speech subscription or assign the subscription to an existing resource group. Resource groups help you keep your various Azure subscriptions organized.
    • Select Create. This takes you to the deployment overview and displays deployment progress messages.

It takes a few moments to deploy your new Speech resource.

Find keys and region

To find the keys and region of a completed deployment, follow these steps:

  1. Sign in to the Azure portal using your Microsoft account.

  2. Select All resources, and then select the name of your Cognitive Services resource.

  3. On the left pane, under RESOURCE MANAGEMENT, select Keys and Endpoint.

Each subscription has two keys; you can use either key in your application. To copy a key to your code editor or another location, select the copy button next to the key, then switch windows and paste the clipboard contents into the desired location.

Additionally, copy the LOCATION value, which is your region ID (for example, westus or westeurope), for SDK calls.
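
To show how the key and region ID fit together, the sketch below assembles a request URL and authentication header for a direct REST call. The helper names are hypothetical, and the host pattern, path, and Ocp-Apim-Subscription-Key header are assumptions based on the short-audio speech-to-text REST API; confirm them against the API reference for your resource before relying on them. No network request is made.

```python
from urllib.parse import urlencode


def speech_to_text_url(region: str, language: str = "en-US") -> str:
    """Build a speech-to-text REST URL for a region ID (assumed URL pattern)."""
    query = urlencode({"language": language})
    return (
        f"https://{region}.stt.speech.microsoft.com"
        "/speech/recognition/conversation/cognitiveservices/v1"
        f"?{query}"
    )


def auth_headers(subscription_key: str) -> dict:
    """Return the header that carries the subscription key (assumed header name)."""
    return {"Ocp-Apim-Subscription-Key": subscription_key}


url = speech_to_text_url("westus")
headers = auth_headers("<your-key>")  # placeholder, never hard-code a real key
print(url)
```

The region ID passed in must match the region where the resource was created; a key from one region is not expected to work against another region's endpoint.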

Important

These subscription keys are used to access your Cognitive Services API. Do not share your keys. Store them securely, for example by using Azure Key Vault. We also recommend regenerating these keys regularly. Only one key is necessary to make an API call. While you regenerate the first key, you can use the second key for continued access to the service.
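
One simple way to keep keys out of source code, and to keep working while one key is being regenerated, is to read them from the environment and fall back to the second key when the first is absent. This is a minimal sketch, not a substitute for Azure Key Vault; the function name and the variable names SPEECH_KEY_1 and SPEECH_KEY_2 are hypothetical.

```python
import os
from typing import Mapping


def load_speech_key(environ: Mapping[str, str] = os.environ) -> str:
    """Return the first non-empty key found in the environment.

    SPEECH_KEY_1 and SPEECH_KEY_2 are hypothetical variable names chosen
    for this example. Either key is sufficient for an API call.
    """
    for name in ("SPEECH_KEY_1", "SPEECH_KEY_2"):
        value = environ.get(name, "").strip()
        if value:
            return value
    raise RuntimeError("No Speech key found; set SPEECH_KEY_1 or SPEECH_KEY_2.")
```

During rotation, you would unset SPEECH_KEY_1, regenerate it in the portal, and set the new value; callers of load_speech_key() continue to receive the second key in the meantime.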

Complete a quickstart

We offer quickstarts in the most popular programming languages, each designed to teach you basic design patterns and have you running code in less than 10 minutes. See the following list for the quickstart for each feature.

After you've had a chance to get started with the Speech service, try our tutorials, which show you how to solve various scenarios.

Get sample code

Sample code for the Speech service is available on GitHub. These samples cover common scenarios such as reading audio from a file or stream, continuous and single-shot recognition, and working with custom models. Use these links to view SDK and REST samples:

Customize your speech experience

The Speech service works well with built-in models. However, you may want to further customize and tune the experience for your product or environment. Customization options range from acoustic model tuning to unique voice fonts for your brand.

Other products offer speech models tuned for specific purposes such as healthcare or insurance, but those models are available to everyone equally. Customization in Azure Speech becomes part of your unique competitive advantage and is unavailable to any other user or customer. In other words, your models are private and custom-tuned for your use case only.

Speech service | Platform | Description
Speech-to-text | Custom Speech | Customize speech recognition models to your needs and available data. Overcome speech recognition barriers such as speaking style, vocabulary, and background noise.
Text-to-speech | Custom Voice | Build a recognizable, one-of-a-kind voice for your text-to-speech apps from your own speaking data. You can further fine-tune the voice output by adjusting a set of voice parameters.

Deploy on-premises using Docker containers

Use Speech service containers to deploy API features on-premises. These Docker containers let you bring the service closer to your data for compliance, security, or other operational reasons. The Speech service offers the following containers:

  • Standard speech-to-text
  • Custom speech-to-text
  • Standard text-to-speech
  • Neural text-to-speech
  • Custom text-to-speech (preview)
  • Speech Language Detection (preview)

Reference docs

Next steps