Quickstart: Recognize speech with the Speech SDK for Unity (Beta)

Quickstarts are also available for text-to-speech.

Use this guide to create a speech-to-text application using Unity and the Speech SDK for Unity (Beta). When finished, you can use your computer's microphone to transcribe speech to text in real time. If you are not familiar with Unity, we recommend reading the Unity User Manual before you start developing your application.

Note

The Speech SDK for Unity is currently in beta. It supports Windows Desktop (x86 and x64) or Universal Windows Platform (x86, x64, ARM/ARM64), and Android (x86, ARM32/64).

Prerequisites

To complete this project, you'll need:

Create a Unity project

  • Start Unity and, under the Projects tab, select New.
  • Specify csharp-unity as the Project name and 3D as the Template, and pick a location. Then select Create project.
  • After a short while, the Unity Editor window should appear.

Install the Speech SDK

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

  • The Speech SDK for Unity (Beta) is packaged as a Unity asset package (.unitypackage). Download it from here.

  • Import the Speech SDK by selecting Assets > Import Package > Custom Package. Check out the Unity documentation for details.

  • In the file picker, select the Speech SDK .unitypackage file that you downloaded above.

  • Ensure that all files are selected and click Import:

    Screenshot of the Unity Editor when importing the Speech SDK Unity asset package

Add UI

We add a minimal UI to our scene, consisting of a button to trigger speech recognition and a text field to display the result.

  • The Hierarchy Window (on the left by default) shows a sample scene that Unity created with the new project.
  • Click the Create button at the top of the Hierarchy Window, and select UI > Button.
  • This creates three game objects that you can see in the Hierarchy Window: a Button object nested within a Canvas object, and an EventSystem object.
  • Navigate the Scene View so that you have a good view of the canvas and the button.
  • Click the Button object in the Hierarchy Window to display its settings in the Inspector Window (on the right by default).
  • Set the Pos X and Pos Y properties to 0, so the button is centered on the canvas.
  • Click the Create button at the top of the Hierarchy Window again, and select UI > Text to create a text field.
  • Click the Text object in the Hierarchy Window to display its settings in the Inspector Window (on the right by default).
  • Set the Pos X and Pos Y properties to 0 and 120, and set the Width and Height properties to 240 and 120, to ensure that the text field and the button do not overlap.

When you're done, the UI should look similar to this screenshot:

Screenshot of the quickstart user interface in the Unity Editor

Add the sample code

  1. In the Project Window (at the bottom left by default), click the Create button and then select C# script. Name the script HelloWorld.

  2. Edit the script by double-clicking it.

    Note

    You can configure which code editor is launched under Edit > Preferences; see the Unity User Manual.

  3. Replace all code with the following:

    using UnityEngine;
    using UnityEngine.UI;
    using Microsoft.CognitiveServices.Speech;
    #if PLATFORM_ANDROID
    using UnityEngine.Android;
    #endif
    
    public class HelloWorld : MonoBehaviour
    {
        // Hook up the two properties below with a Text and Button object in your UI.
        public Text outputText;
        public Button startRecoButton;
    
        private object threadLocker = new object();
        private bool waitingForReco;
        private string message;
    
        private bool micPermissionGranted = false;
    
    #if PLATFORM_ANDROID
        // Required to manifest microphone permission, cf.
        // https://docs.unity3d.com/Manual/android-manifest.html
        private Microphone mic;
    #endif
    
        public async void ButtonClick()
        {
            // Creates an instance of a speech config with specified subscription key and service region.
            // Replace with your own subscription key and service region (e.g., "westus").
            var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
            // Make sure to dispose the recognizer after use!
            using (var recognizer = new SpeechRecognizer(config))
            {
                lock (threadLocker)
                {
                    waitingForReco = true;
                }
    
                // Starts speech recognition, and returns after a single utterance is recognized. The end of a
                // single utterance is determined by listening for silence at the end or until a maximum of 15
                // seconds of audio is processed.  The task returns the recognition text as result.
                // Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
                // shot recognition like command or query.
                // For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
                var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);
    
                // Checks result.
                string newMessage = string.Empty;
                if (result.Reason == ResultReason.RecognizedSpeech)
                {
                    newMessage = result.Text;
                }
                else if (result.Reason == ResultReason.NoMatch)
                {
                    newMessage = "NOMATCH: Speech could not be recognized.";
                }
                else if (result.Reason == ResultReason.Canceled)
                {
                    var cancellation = CancellationDetails.FromResult(result);
                    newMessage = $"CANCELED: Reason={cancellation.Reason} ErrorDetails={cancellation.ErrorDetails}";
                }
    
                lock (threadLocker)
                {
                    message = newMessage;
                    waitingForReco = false;
                }
            }
        }
    
        void Start()
        {
            if (outputText == null)
            {
                UnityEngine.Debug.LogError("outputText property is null! Assign a UI Text element to it.");
            }
            else if (startRecoButton == null)
            {
                message = "startRecoButton property is null! Assign a UI Button to it.";
                UnityEngine.Debug.LogError(message);
            }
            else
            {
                // Continue with normal initialization, Text and Button objects are present.
    
    #if PLATFORM_ANDROID
                // Request to use the microphone, cf.
                // https://docs.unity3d.com/Manual/android-RequestingPermissions.html
                message = "Waiting for mic permission";
                if (!Permission.HasUserAuthorizedPermission(Permission.Microphone))
                {
                    Permission.RequestUserPermission(Permission.Microphone);
                }
    #else
                micPermissionGranted = true;
                message = "Click button to recognize speech";
    #endif
                startRecoButton.onClick.AddListener(ButtonClick);
            }
        }
    
        void Update()
        {
    #if PLATFORM_ANDROID
            if (!micPermissionGranted && Permission.HasUserAuthorizedPermission(Permission.Microphone))
            {
                micPermissionGranted = true;
                message = "Click button to recognize speech";
            }
    #endif
    
            lock (threadLocker)
            {
                if (startRecoButton != null)
                {
                    startRecoButton.interactable = !waitingForReco && micPermissionGranted;
                }
                if (outputText != null)
                {
                    outputText.text = message;
                }
            }
        }
    }
    
  4. Locate the string YourSubscriptionKey and replace it with your Speech Services subscription key.

  5. Locate the string YourServiceRegion and replace it with the region associated with your subscription. For example, if you're using the free trial, the region is westus.

  6. Save the changes to the script.

  7. Back in the Unity Editor, the script needs to be added as a component to one of your game objects.

    • Click the Canvas object in the Hierarchy Window. This opens its settings in the Inspector Window (on the right by default).

    • Click the Add Component button in the Inspector Window, then search for the HelloWorld script we created above and add it.

    • Note that the Hello World component has two uninitialized properties, Output Text and Start Reco Button, that match public properties of the HelloWorld class. To wire them up, click the Object Picker (the small circle icon to the right of the property), and choose the text and button objects you created earlier.

      Note

      The button also has a nested text object. Make sure you do not accidentally pick it for text output (or rename one of the text objects using the Name field in the Inspector Window to avoid that confusion).
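The sample script passes the recognition result from ButtonClick() to Update() through fields guarded by threadLocker, rather than writing to the Text object directly. The likely reason is that ConfigureAwait(false) allows the code after the await to run on a thread-pool thread, while Unity UI objects should only be touched from the main thread. The following is a minimal sketch of that handoff with no Unity or Speech SDK dependency; the class RecognitionHandoff and its methods RecognizeAsync and Poll are hypothetical names introduced only for illustration, and the recognizeOnce delegate stands in for recognizer.RecognizeOnceAsync().

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the HelloWorld script (not part of Unity or the Speech SDK).
public class RecognitionHandoff
{
    private readonly object threadLocker = new object();
    private bool waitingForReco;
    private string message = "Click button to recognize speech";

    // Plays the role of ButtonClick(). The recognizeOnce delegate is an assumed
    // placeholder for the real recognition call.
    public async Task RecognizeAsync(Func<Task<string>> recognizeOnce)
    {
        lock (threadLocker)
        {
            waitingForReco = true;
        }

        // ConfigureAwait(false) lets the code below continue on a thread-pool
        // thread instead of being posted back to the caller's context.
        string text = await recognizeOnce().ConfigureAwait(false);

        // Publish the result under the lock; no UI object is touched here.
        lock (threadLocker)
        {
            message = text;
            waitingForReco = false;
        }
    }

    // Plays the role of Update(): the main thread reads a consistent snapshot
    // and would copy it into the Text and Button objects.
    public (string Text, bool ButtonEnabled) Poll()
    {
        lock (threadLocker)
        {
            return (message, !waitingForReco);
        }
    }
}
```

In this sketch, calling Poll() while RecognizeAsync is still pending reports the button as disabled and returns the previous message; once the task completes, Poll() returns the recognized text and re-enables the button. This mirrors how Update() in the sample sets startRecoButton.interactable and outputText.text on every frame.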

Run the application in the Unity Editor

  • Press the Play button in the Unity Editor toolbar (below the menu bar).

  • After the app launches, click the button and speak an English phrase or sentence into your computer's microphone. Your speech is transmitted to the Speech Services and transcribed to text, which appears in the window.

    Screenshot of the running quickstart in the Unity Game Window

  • Check the Console Window for debug messages.

  • When you're done recognizing speech, click the Play button in the Unity Editor toolbar again to stop the app.

Additional options to run this application

This application can also be deployed as an Android app, a Windows stand-alone app, or a UWP application. Refer to the quickstart/csharp-unity folder of our sample repository, which describes the configuration for these additional targets.

Next steps

See also