Quickstart: Recognize speech with the Speech SDK for Unity (Beta)

Quickstarts are also available for text-to-speech.

Use this guide to create a speech-to-text application using Unity and the Speech SDK for Unity (Beta). When finished, you can talk into your device to transcribe speech to text in real time. If you're new to Unity, we suggest you study the Unity User Manual before developing your application.

Note

The Speech SDK for Unity is currently in beta. It supports Windows Desktop (x86 and x64), Universal Windows Platform (x86, x64, ARM/ARM64), and Android (x86, ARM32/64).

Prerequisites

To complete this project, you'll need:

Create a Unity project

  1. Open Unity. If you're using Unity for the first time, the Unity Hub window appears. (You can also open Unity Hub directly to get to this window.)

    Unity Hub window

  2. Select New. The Create a new project with Unity window appears.

    Create a new project in Unity Hub

  3. In Project Name, enter csharp-unity.

  4. In Templates, if 3D isn't already selected, select it.

  5. In Location, select or create a folder to save the project in.

  6. Select Create.

After a short wait, the Unity Editor window appears.

Install the Speech SDK

To install the Speech SDK for Unity, follow these steps:

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

  1. Download and open the Speech SDK for Unity (Beta), which is packaged as a Unity asset package (.unitypackage). When the asset package is opened, the Import Unity Package dialog box appears.

    Import Unity Package dialog box in the Unity Editor

  2. Ensure that all files are selected, and select Import. After a few moments, the Unity asset package is imported into your project.

For more information about importing asset packages into Unity, see the Unity documentation.

Add UI

Now let's add a minimal UI to our scene. This UI consists of a button to trigger speech recognition and a text field to display the result. The Hierarchy window shows a sample scene that Unity created with the new project.

  1. At the top of the Hierarchy window, select Create > UI > Button.

    This action creates three game objects that you can see in the Hierarchy window: a Button object, a Canvas object containing the button, and an EventSystem object.

    Unity Editor environment

  2. Navigate the Scene view so that the canvas and the button are clearly visible.

  3. In the Inspector window (by default on the right), set the Pos X and Pos Y properties to 0, so the button is centered on the canvas.

  4. In the Hierarchy window, select Create > UI > Text to create a Text object.

  5. In the Inspector window, set the Pos X and Pos Y properties to 0 and 120, and set the Width and Height properties to 240 and 120. These values ensure that the text field and the button don't overlap.

When you're done, the Scene view should look similar to this screenshot:

Scene view in the Unity Editor

Add the sample code

To add the sample script code for the Unity project, follow these steps:

  1. In the Project window, select Create > C# script to add a new C# script.

    Project window in the Unity Editor

  2. Name the script HelloWorld.

  3. Double-click HelloWorld to edit the newly created script.

    Note

    To configure the code editor that Unity uses for editing, select Edit > Preferences, and then go to the External Tools preferences. For more information, see the Unity User Manual.

  4. Replace the existing script with the following code:

    using UnityEngine;
    using UnityEngine.UI;
    using Microsoft.CognitiveServices.Speech;
    #if PLATFORM_ANDROID
    using UnityEngine.Android;
    #endif
    
    public class HelloWorld : MonoBehaviour
    {
        // Hook up the two properties below with a Text and Button object in your UI.
        public Text outputText;
        public Button startRecoButton;
    
        private object threadLocker = new object();
        private bool waitingForReco;
        private string message;
    
        private bool micPermissionGranted = false;
    
    #if PLATFORM_ANDROID
        // Required to manifest microphone permission, cf.
        // https://docs.unity3d.com/Manual/android-manifest.html
        private Microphone mic;
    #endif
    
        public async void ButtonClick()
        {
            // Creates an instance of a speech config with specified subscription key and service region.
            // Replace with your own subscription key and service region (e.g., "westus").
            var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
            // Make sure to dispose the recognizer after use!
            using (var recognizer = new SpeechRecognizer(config))
            {
                lock (threadLocker)
                {
                    waitingForReco = true;
                }
    
                // Starts speech recognition, and returns after a single utterance is recognized. The end of a
                // single utterance is determined by listening for silence at the end or until a maximum of 15
                // seconds of audio is processed.  The task returns the recognition text as result.
                // Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
                // shot recognition like command or query.
                // For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
                var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);
    
                // Checks result.
                string newMessage = string.Empty;
                if (result.Reason == ResultReason.RecognizedSpeech)
                {
                    newMessage = result.Text;
                }
                else if (result.Reason == ResultReason.NoMatch)
                {
                    newMessage = "NOMATCH: Speech could not be recognized.";
                }
                else if (result.Reason == ResultReason.Canceled)
                {
                    var cancellation = CancellationDetails.FromResult(result);
                    newMessage = $"CANCELED: Reason={cancellation.Reason} ErrorDetails={cancellation.ErrorDetails}";
                }
    
                lock (threadLocker)
                {
                    message = newMessage;
                    waitingForReco = false;
                }
            }
        }
    
        void Start()
        {
            if (outputText == null)
            {
                UnityEngine.Debug.LogError("outputText property is null! Assign a UI Text element to it.");
            }
            else if (startRecoButton == null)
            {
                message = "startRecoButton property is null! Assign a UI Button to it.";
                UnityEngine.Debug.LogError(message);
            }
            else
            {
                // Continue with normal initialization, Text and Button objects are present.
    
    #if PLATFORM_ANDROID
                // Request to use the microphone, cf.
                // https://docs.unity3d.com/Manual/android-RequestingPermissions.html
                message = "Waiting for mic permission";
                if (!Permission.HasUserAuthorizedPermission(Permission.Microphone))
                {
                    Permission.RequestUserPermission(Permission.Microphone);
                }
    #else
                micPermissionGranted = true;
                message = "Click button to recognize speech";
    #endif
                startRecoButton.onClick.AddListener(ButtonClick);
            }
        }
    
        void Update()
        {
    #if PLATFORM_ANDROID
            if (!micPermissionGranted && Permission.HasUserAuthorizedPermission(Permission.Microphone))
            {
                micPermissionGranted = true;
                message = "Click button to recognize speech";
            }
    #endif
    
            lock (threadLocker)
            {
                if (startRecoButton != null)
                {
                    startRecoButton.interactable = !waitingForReco && micPermissionGranted;
                }
                if (outputText != null)
                {
                    outputText.text = message;
                }
            }
        }
    }
    
  5. Find and replace the string YourSubscriptionKey with your Speech Services subscription key.

  6. Find and replace the string YourServiceRegion with the region associated with your subscription. For example, if you're using the free trial, the region is westus.

  7. Save the changes to the script.
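
The comments in the script point to StartContinuousRecognitionAsync() for long-running, multi-utterance recognition. The following is a minimal sketch of how that variant might look, not part of the official sample. The class name ContinuousHelloWorld is hypothetical, the Android microphone permission handling from HelloWorld is omitted for brevity, and the same placeholder key and region are assumed. As in HelloWorld, recognition events arrive on a background thread, so the text is handed to Update() under a lock.

```csharp
using UnityEngine;
using UnityEngine.UI;
using Microsoft.CognitiveServices.Speech;

// Hypothetical variant of HelloWorld that keeps listening until the
// component is destroyed. Assumes microphone access is already granted.
public class ContinuousHelloWorld : MonoBehaviour
{
    // Hook this up to a Text object in your UI.
    public Text outputText;

    private readonly object threadLocker = new object();
    private string message = "Starting recognition";
    private SpeechRecognizer recognizer;

    async void Start()
    {
        // Replace with your own subscription key and service region.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        recognizer = new SpeechRecognizer(config);

        // Raised once per recognized utterance, on a background thread.
        recognizer.Recognized += (s, e) =>
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                lock (threadLocker) { message = e.Result.Text; }
            }
        };

        // Raised when recognition is canceled, for example on an invalid key.
        recognizer.Canceled += (s, e) =>
        {
            lock (threadLocker)
            {
                message = $"CANCELED: Reason={e.Reason} ErrorDetails={e.ErrorDetails}";
            }
        };

        await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
    }

    void Update()
    {
        // UI objects may only be touched on the main thread.
        lock (threadLocker)
        {
            if (outputText != null)
            {
                outputText.text = message;
            }
        }
    }

    async void OnDestroy()
    {
        if (recognizer != null)
        {
            await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
            recognizer.Dispose();
        }
    }
}
```

Unlike the using block in HelloWorld, the recognizer here lives as long as the component, so it is stopped and disposed in OnDestroy() instead.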

Now return to the Unity Editor and add the script as a component to one of your game objects:

  1. In the Hierarchy window, select the Canvas object.

  2. In the Inspector window, select the Add Component button.

    Inspector window in the Unity Editor

  3. In the drop-down list, search for the HelloWorld script you created earlier and add it. A Hello World (Script) section appears in the Inspector window, listing two uninitialized properties, Output Text and Start Reco Button. These Unity component properties match the public properties of the HelloWorld class.

  4. Select the Start Reco Button property's object picker (the small circle icon to the right of the property), and choose the Button object you created earlier.

  5. Select the Output Text property's object picker, and choose the Text object you created earlier.

    Note

    The button also has a nested text object. Make sure you don't accidentally pick it for text output (or rename one of the text objects using the Name field in the Inspector window to avoid confusion).

Run the application in the Unity Editor

Now you're ready to run the application within the Unity Editor.

  1. In the Unity Editor toolbar (below the menu bar), select the Play button (a right-pointing triangle).

  2. Go to Game view, and wait for the Text object to display Click button to recognize speech. (It displays New Text when the application hasn't started or isn't ready to respond.)

  3. Select the button and speak an English phrase or sentence into your computer's microphone. Your speech is transmitted to the Speech Services and transcribed to text, which appears in the Game view.

    Game view in the Unity Editor

  4. Check the Console window for debug messages. If the Console window isn't showing, go to the menu bar and select Window > General > Console to display it.

  5. When you're done recognizing speech, select the Play button in the Unity Editor toolbar to stop the application.
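
Step 3 asks for an English phrase because the sample never sets a recognition language, and the service is understood to fall back to US English in that case. If you want to try another language, SpeechConfig exposes a SpeechRecognitionLanguage property. The sketch below shows one way to set it; the SpeechConfigFactory helper and its parameter names are hypothetical and not part of the sample.

```csharp
using Microsoft.CognitiveServices.Speech;

// Hypothetical helper, not part of the sample: builds a SpeechConfig
// with an explicit recognition language (a locale such as "de-DE").
public static class SpeechConfigFactory
{
    public static SpeechConfig Create(string subscriptionKey, string region, string language = "en-US")
    {
        var config = SpeechConfig.FromSubscription(subscriptionKey, region);
        config.SpeechRecognitionLanguage = language;
        return config;
    }
}
```

In ButtonClick(), the FromSubscription call could then be swapped for something like SpeechConfigFactory.Create("YourSubscriptionKey", "YourServiceRegion", "de-DE").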

Additional options to run this application

This application can also be deployed as an Android app, a Windows stand-alone app, or a UWP application. For more information, see our sample repository. The quickstart/csharp-unity folder describes the configuration for these additional targets.

Next steps

See also