MR Input 212: Voice

Note

The Mixed Reality Academy tutorials were designed with HoloLens (1st gen) and Mixed Reality immersive headsets in mind. As such, we feel it is important to leave these tutorials in place for developers who are still looking for guidance in developing for those devices. These tutorials will not be updated with the latest toolsets or interactions being used for HoloLens 2. They will be maintained to continue working on the supported devices. A new series of tutorials has been posted for HoloLens 2.

Voice input gives us another way to interact with our holograms. Voice commands work in a very natural and easy way. Design your voice commands so that they are:

  • Natural
  • Easy to remember
  • Context appropriate
  • Sufficiently distinct from other options within the same context

In MR Basics 101, we used the KeywordRecognizer to build two simple voice commands. In MR Input 212, we'll dive deeper and learn how to:

  • Design voice commands that are optimized for the HoloLens speech engine.
  • Make the user aware of what voice commands are available.
  • Acknowledge that we've heard the user's voice command.
  • Understand what the user is saying, using a Dictation Recognizer.
  • Use a Grammar Recognizer to listen for commands based on an SRGS, or Speech Recognition Grammar Specification, file.

In this course, we'll revisit Model Explorer, which we built in MR Input 210 and MR Input 211.

Important

The videos embedded in each of the chapters below were recorded using an older version of Unity and the Mixed Reality Toolkit. While the step-by-step instructions are accurate and current, you may see scripts and visuals in the corresponding videos that are out-of-date. The videos remain included for posterity and because the concepts covered still apply.

Device support

Course                  HoloLens    Immersive headsets
MR Input 212: Voice     ✔️          ✔️

Before you start

Prerequisites

Project files

  • Download the files required by the project. Requires Unity 2017.2 or later.
  • Un-archive the files to your desktop or another easy-to-reach location.

Note

If you want to look through the source code before downloading, it's available on GitHub.

Errata and Notes

  • "Enable Just My Code" needs to be disabled (unchecked) in Visual Studio under Tools -> Options -> Debugging in order to hit breakpoints in your code.

Unity Setup

Instructions

  1. Start Unity.
  2. Select Open.
  3. Navigate to the HolographicAcademy-Holograms-212-Voice folder you previously un-archived.
  4. Find and select the Starting/Model Explorer folder.
  5. Click the Select Folder button.
  6. In the Project panel, expand the Scenes folder.
  7. Double-click the ModelExplorer scene to load it in Unity.

Building

  1. In Unity, select File > Build Settings.
  2. If Scenes/ModelExplorer is not listed in Scenes In Build, click Add Open Scenes to add the scene.
  3. If you're specifically developing for HoloLens, set Target device to HoloLens. Otherwise, leave it on Any device.
  4. Ensure Build Type is set to D3D and SDK is set to Latest installed (which should be SDK 16299 or newer).
  5. Click Build.
  6. Create a New Folder named "App".
  7. Single-click the App folder.
  8. Press Select Folder and Unity will start building the project for Visual Studio.

When Unity is done, a File Explorer window will appear.

  1. Open the App folder.
  2. Open the ModelExplorer Visual Studio Solution.

If deploying to HoloLens:

  1. Using the top toolbar in Visual Studio, change the target from Debug to Release and from ARM to x86.
  2. Click on the drop-down arrow next to the Local Machine button, and select Remote Machine.
  3. Enter your HoloLens device IP address and set Authentication Mode to Universal (Unencrypted Protocol). Click Select. If you do not know your device IP address, look in Settings > Network & Internet > Advanced Options.
  4. In the top menu bar, click Debug -> Start Without debugging or press Ctrl + F5. If this is the first time deploying to your device, you will need to pair it with Visual Studio.
  5. When the app has deployed, dismiss the Fitbox with a select gesture.

If deploying to an immersive headset:

  1. Using the top toolbar in Visual Studio, change the target from Debug to Release and from ARM to x64.
  2. Make sure the deployment target is set to Local Machine.
  3. In the top menu bar, click Debug -> Start Without debugging or press Ctrl + F5.
  4. When the app has deployed, dismiss the Fitbox by pulling the trigger on a motion controller.

Note

You might notice some red errors in the Visual Studio Errors panel. It is safe to ignore them. Switch to the Output panel to view actual build progress. Errors in the Output panel will require you to make a fix (most often they are caused by a mistake in a script).

Chapter 1 - Awareness

Objectives

  • Learn the Dos and Don'ts of voice command design.
  • Use the KeywordRecognizer to add gaze-based voice commands.
  • Make users aware of voice commands using cursor feedback.

Voice Command Design

In this chapter, you'll learn about designing voice commands. When creating voice commands:

DO

  • Create concise commands. You don't want to use "Play the currently selected video", because that command is not concise and would easily be forgotten by the user. Instead, you should use "Play Video", because it is concise and has multiple syllables.
  • Use a simple vocabulary. Always try to use common words and phrases that are easy for the user to discover and remember. For example, if your application had a note object that could be displayed or hidden from view, you would not use the command "Show Placard", because "placard" is a rarely used term. Instead, you would use the command "Show Note" to reveal the note in your application.
  • Be consistent. Voice commands should be kept consistent across your application. Imagine that you have two scenes in your application and both scenes contain a button for closing the application. If the first scene used the command "Exit" to trigger the button, but the second scene used the command "Close App", then the user is going to get very confused. If the same functionality persists across multiple scenes, then the same voice command should be used to trigger it.

DON'T

  • Use single-syllable commands. For example, if you were creating a voice command to play a video, you should avoid using the simple command "Play", as it is only a single syllable and could easily be missed by the system. Instead, you should use "Play Video", because it is concise and has multiple syllables.
  • Use system commands. The "Select" command is reserved by the system to trigger a Tap event for the currently focused object. Do not re-use the "Select" command in a keyword or phrase, as it might not work as you expect. For example, if the voice command for selecting a cube in your application was "Select cube", but the user was looking at a sphere when they uttered the command, then the sphere would be selected instead. Similarly, app bar commands are voice-enabled. Don't use the following speech commands in your CoreWindow view:
    1. Go Back
    2. Scroll Tool
    3. Zoom Tool
    4. Drag Tool
    5. Adjust
    6. Remove
  • Use similar sounds. Try to avoid using voice commands that rhyme. If you had a shopping application which supported "Show Store" and "Show More" as voice commands, you would want to disable one of the commands while the other was in use. For example, you could use the "Show Store" button to open the store, and then disable that command when the store was displayed so that the "Show More" command could be used for browsing.

Instructions

  • In Unity's Hierarchy panel, use the search tool to find the holoComm_screen_mesh object.
  • Double-click the holoComm_screen_mesh object to view it in the Scene. This is the astronaut's watch, which will respond to our voice commands.
  • In the Inspector panel, locate the Speech Input Source (Script) component.
  • Expand the Keywords section to see the supported voice command: Open Communicator.
  • Click the cog to the right side, then select Edit Script.
  • Explore SpeechInputSource.cs to understand how it uses the KeywordRecognizer to add voice commands.
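Stripped of the toolkit's plumbing, the pattern SpeechInputSource.cs builds on looks roughly like the following standalone Unity sketch. This is an illustration, not the course script itself; the class name and log text are invented, and only "Open Communicator" comes from the tutorial.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

public class SimpleKeywordListener : MonoBehaviour
{
    private KeywordRecognizer keywordRecognizer;

    void Start()
    {
        // Register the phrases to listen for. "Open Communicator" matches
        // the keyword configured on the Speech Input Source component.
        keywordRecognizer = new KeywordRecognizer(new[] { "Open Communicator" });
        keywordRecognizer.OnPhraseRecognized += OnPhraseRecognized;
        keywordRecognizer.Start();
    }

    private void OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // args.text holds the recognized phrase; args.confidence reports
        // how sure the speech engine is (High, Medium, Low, Rejected).
        Debug.Log("Heard: " + args.text + " (confidence: " + args.confidence + ")");
    }

    void OnDestroy()
    {
        // Recognizers hold native resources, so stop and dispose of them.
        if (keywordRecognizer != null)
        {
            if (keywordRecognizer.IsRunning)
            {
                keywordRecognizer.Stop();
            }
            keywordRecognizer.Dispose();
        }
    }
}
```

The recognizer is given its full keyword list up front in the constructor; to change the vocabulary at runtime you would create a new KeywordRecognizer rather than mutate an existing one.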

Build and Deploy

  • In Unity, use File > Build Settings to rebuild the application.
  • Open the App folder.
  • Open the ModelExplorer Visual Studio Solution.

(If you already built/deployed this project in Visual Studio during set-up, then you can open that instance of VS and click 'Reload All' when prompted).

  • In Visual Studio, click Debug -> Start Without debugging or press Ctrl + F5.
  • After the application deploys to the HoloLens, dismiss the fit box using the air-tap gesture.
  • Gaze at the astronaut's watch.
  • When the watch has focus, verify that the cursor changes to a microphone. This provides feedback that the application is listening for voice commands.
  • Verify that a tooltip appears on the watch. This helps users discover the "Open Communicator" command.
  • While gazing at the watch, say "Open Communicator" to open the communicator panel.

Chapter 2 - Acknowledgement

Objectives

  • Record a message using the Microphone input.
  • Give feedback to the user that the application is listening to their voice.

Note

The Microphone capability must be declared for an app to record from the microphone. This is done for you already in MR Input 212, but keep this in mind for your own projects.

  1. In the Unity Editor, go to the player settings by navigating to "Edit > Project Settings > Player"
  2. Click on the "Universal Windows Platform" tab
  3. In the "Publishing Settings > Capabilities" section, check the Microphone capability

Instructions

  • In Unity's Hierarchy panel, verify that the holoComm_screen_mesh object is selected.
  • In the Inspector panel, find the Astronaut Watch (Script) component.
  • Click on the small, blue cube which is set as the value of the Communicator Prefab property.
  • In the Project panel, the Communicator prefab should now have focus.
  • Click on the Communicator prefab in the Project panel to view its components in the Inspector.
  • Look at the Microphone Manager (Script) component, which will allow us to record the user's voice.
  • Notice that the Communicator object has a Speech Input Handler (Script) component for responding to the Send Message command.
  • Look at the Communicator (Script) component and double-click the script to open it in Visual Studio.

Communicator.cs is responsible for setting the proper button states on the communicator device. This will allow our users to record a message, play it back, and send the message to the astronaut. It will also start and stop an animated waveform, to acknowledge to the user that their voice was heard.

  • In Communicator.cs, delete the following lines (81 and 82) from the Start method. This will enable the Record button on the communicator.
// TODO: 2.a Delete the following two lines:
RecordButton.SetActive(false);
MessageUIRenderer.gameObject.SetActive(false);

Build and Deploy

  • In Visual Studio, rebuild your application and deploy to the device.
  • Gaze at the astronaut's watch and say "Open Communicator" to show the communicator.
  • Press the Record button (microphone) to start recording a verbal message for the astronaut.
  • Start speaking, and verify that the wave animation plays on the communicator, which provides feedback to the user that their voice is heard.
  • Press the Stop button (left square), and verify that the wave animation stops running.
  • Press the Play button (right triangle) to play back the recorded message and hear it on the device.
  • Press the Stop button (right square) to stop playback of the recorded message.
  • Say "Send Message" to close the communicator and receive a 'Message Received' response from the astronaut.

Chapter 3 - Understanding and the Dictation Recognizer

Objectives

  • Use the Dictation Recognizer to convert the user's speech to text.
  • Show the Dictation Recognizer's hypothesized and final results in the communicator.

In this chapter, we'll use the Dictation Recognizer to create a message for the astronaut. When using the Dictation Recognizer, keep in mind that:

  • You must be connected to WiFi for the Dictation Recognizer to work.
  • Timeouts occur after a set period of time. There are two timeouts to be aware of:
    • If the recognizer starts and doesn't hear any audio for the first five seconds, it will time out.
    • If the recognizer has given a result but then hears silence for twenty seconds, it will time out.
  • Only one type of recognizer (Keyword or Dictation) can run at a time.
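The one-recognizer-at-a-time constraint is why the course code shuts down the keyword system before dictation begins and restarts it afterwards. The idea can be sketched in isolation like this (an illustrative fragment, not the course's MicrophoneManager.cs; the class and method names are invented):

```csharp
using UnityEngine.Windows.Speech;

public static class RecognizerSwitcher
{
    private static DictationRecognizer dictationRecognizer;

    public static void BeginDictation()
    {
        // All KeywordRecognizers are driven by the shared PhraseRecognitionSystem,
        // which must be shut down before a DictationRecognizer can run.
        PhraseRecognitionSystem.Shutdown();

        dictationRecognizer = new DictationRecognizer();
        dictationRecognizer.Start();
    }

    public static void EndDictation()
    {
        // Stop dictation if it is still running (it may have timed out on its own).
        if (dictationRecognizer.Status == SpeechSystemStatus.Running)
        {
            dictationRecognizer.Stop();
        }
        dictationRecognizer.Dispose();

        // Bring the keyword recognizers back once dictation is done.
        PhraseRecognitionSystem.Restart();
    }
}
```

Checking Status before calling Stop matters because a timeout (five seconds of initial silence, or twenty seconds after a result) can stop the recognizer without your code asking it to.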

Note

The Microphone capability must be declared for an app to record from the microphone. This is done for you already in MR Input 212, but keep this in mind for your own projects.

  1. In the Unity Editor, go to the player settings by navigating to "Edit > Project Settings > Player"
  2. Click on the "Universal Windows Platform" tab
  3. In the "Publishing Settings > Capabilities" section, check the Microphone capability

Instructions

We're going to edit MicrophoneManager.cs to use the Dictation Recognizer. This is what we'll add:

  1. When the Record button is pressed, we'll start the DictationRecognizer.
  2. Show the hypothesis of what the DictationRecognizer understood.
  3. Lock in the results of what the DictationRecognizer understood.
  4. Check for timeouts from the DictationRecognizer.
  5. When the Stop button is pressed, or the mic session times out, stop the DictationRecognizer.
  6. Restart the KeywordRecognizer, which will listen for the Send Message command.

Let's get started. Complete all coding exercises for 3.a in MicrophoneManager.cs, or copy and paste the finished code found below:

// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License. See LICENSE in the project root for license information.

using System.Collections;
using System.Text;
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.Windows.Speech;

namespace Academy
{
    public class MicrophoneManager : MonoBehaviour
    {
        [Tooltip("A text area for the recognizer to display the recognized strings.")]
        [SerializeField]
        private Text dictationDisplay;

        private DictationRecognizer dictationRecognizer;

        // Use this string to cache the text currently displayed in the text box.
        private StringBuilder textSoFar;

        // Using an empty string specifies the default microphone.
        private static string deviceName = string.Empty;
        private int samplingRate;
        private const int messageLength = 10;

        // Use this to reset the UI once the Microphone is done recording after it was started.
        private bool hasRecordingStarted;

        void Awake()
        {
            /* TODO: DEVELOPER CODING EXERCISE 3.a */

            // 3.a: Create a new DictationRecognizer and assign it to dictationRecognizer variable.
            dictationRecognizer = new DictationRecognizer();

            // 3.a: Register for dictationRecognizer.DictationHypothesis and implement DictationHypothesis below
            // This event is fired while the user is talking. As the recognizer listens, it provides text of what it's heard so far.
            dictationRecognizer.DictationHypothesis += DictationRecognizer_DictationHypothesis;

            // 3.a: Register for dictationRecognizer.DictationResult and implement DictationResult below
            // This event is fired after the user pauses, typically at the end of a sentence. The full recognized string is returned here.
            dictationRecognizer.DictationResult += DictationRecognizer_DictationResult;

            // 3.a: Register for dictationRecognizer.DictationComplete and implement DictationComplete below
            // This event is fired when the recognizer stops, whether from Stop() being called, a timeout occurring, or some other error.
            dictationRecognizer.DictationComplete += DictationRecognizer_DictationComplete;

            // 3.a: Register for dictationRecognizer.DictationError and implement DictationError below
            // This event is fired when an error occurs.
            dictationRecognizer.DictationError += DictationRecognizer_DictationError;

            // Query the maximum frequency of the default microphone. Use 'unused' to ignore the minimum frequency.
            int unused;
            Microphone.GetDeviceCaps(deviceName, out unused, out samplingRate);

            // Use this string to cache the text currently displayed in the text box.
            textSoFar = new StringBuilder();

            // Use this to reset the UI once the Microphone is done recording after it was started.
            hasRecordingStarted = false;
        }

        void Update()
        {
            // 3.a: Add condition to check if dictationRecognizer.Status is Running
            if (hasRecordingStarted && !Microphone.IsRecording(deviceName) && dictationRecognizer.Status == SpeechSystemStatus.Running)
            {
                // Reset the flag now that we're cleaning up the UI.
                hasRecordingStarted = false;

                // This acts like pressing the Stop button and sends the message to the Communicator.
                // If the microphone stops as a result of timing out, make sure to manually stop the dictation recognizer.
                // Look at the StopRecording function.
                SendMessage("RecordStop");
            }
        }

        /// <summary>
        /// Turns on the dictation recognizer and begins recording audio from the default microphone.
        /// </summary>
        /// <returns>The audio clip recorded from the microphone.</returns>
        public AudioClip StartRecording()
        {
            // 3.a Shutdown the PhraseRecognitionSystem. This controls the KeywordRecognizers
            PhraseRecognitionSystem.Shutdown();

            // 3.a: Start dictationRecognizer
            dictationRecognizer.Start();

            // 3.a Uncomment this line
            dictationDisplay.text = "Dictation is starting. It may take time to display your text the first time, but begin speaking now...";

            // Set the flag that we've started recording.
            hasRecordingStarted = true;

            // Start recording from the microphone for 10 seconds.
            return Microphone.Start(deviceName, false, messageLength, samplingRate);
        }

        /// <summary>
        /// Ends the recording session.
        /// </summary>
        public void StopRecording()
        {
            // 3.a: Check if dictationRecognizer.Status is Running and stop it if so
            if (dictationRecognizer.Status == SpeechSystemStatus.Running)
            {
                dictationRecognizer.Stop();
            }

            Microphone.End(deviceName);
        }

        /// <summary>
        /// This event is fired while the user is talking. As the recognizer listens, it provides text of what it's heard so far.
        /// </summary>
        /// <param name="text">The currently hypothesized recognition.</param>
        private void DictationRecognizer_DictationHypothesis(string text)
        {
            // 3.a: Set DictationDisplay text to be textSoFar and new hypothesized text
            // We don't want to append to textSoFar yet, because the hypothesis may have changed on the next event
            dictationDisplay.text = textSoFar.ToString() + " " + text + "...";
        }

        /// <summary>
        /// This event is fired after the user pauses, typically at the end of a sentence. The full recognized string is returned here.
        /// </summary>
        /// <param name="text">The text that was heard by the recognizer.</param>
        /// <param name="confidence">A representation of how confident (rejected, low, medium, high) the recognizer is of this recognition.</param>
        private void DictationRecognizer_DictationResult(string text, ConfidenceLevel confidence)
        {
            // 3.a: Append textSoFar with latest text
            textSoFar.Append(text + ". ");

            // 3.a: Set DictationDisplay text to be textSoFar
            dictationDisplay.text = textSoFar.ToString();
        }

        /// <summary>
        /// This event is fired when the recognizer stops, whether from Stop() being called, a timeout occurring, or some other error.
        /// Typically, this will simply return "Complete". In this case, we check to see if the recognizer timed out.
        /// </summary>
        /// <param name="cause">An enumerated reason for the session completing.</param>
        private void DictationRecognizer_DictationComplete(DictationCompletionCause cause)
        {
            // If Timeout occurs, the user has been silent for too long.
            // With dictation, the default timeout after a recognition is 20 seconds.
            // The default timeout with initial silence is 5 seconds.
            if (cause == DictationCompletionCause.TimeoutExceeded)
            {
                Microphone.End(deviceName);

                dictationDisplay.text = "Dictation has timed out. Please press the record button again.";
                SendMessage("ResetAfterTimeout");
            }
        }

        /// <summary>
        /// This event is fired when an error occurs.
        /// </summary>
        /// <param name="error">The string representation of the error reason.</param>
        /// <param name="hresult">The int representation of the hresult.</param>
        private void DictationRecognizer_DictationError(string error, int hresult)
        {
            // 3.a: Set DictationDisplay text to be the error string
            dictationDisplay.text = error + "\nHRESULT: " + hresult;
        }

        /// <summary>
        /// The dictation recognizer may not turn off immediately, so this call blocks on
        /// the recognizer reporting that it has actually stopped.
        /// </summary>
        public IEnumerator WaitForDictationToStop()
        {
            while (dictationRecognizer != null && dictationRecognizer.Status == SpeechSystemStatus.Running)
            {
                yield return null;
            }
        }
    }
}

Build and Deploy

  • Rebuild in Visual Studio and deploy to your device.
  • Dismiss the fit box with an air-tap gesture.
  • Gaze at the astronaut's watch and say "Open Communicator".
  • Select the Record button (microphone) to record your message.
  • Start speaking. The Dictation Recognizer will interpret your speech and show the hypothesized text in the communicator.
  • Try saying "Send Message" while you are recording a message. Notice that the Keyword Recognizer does not respond because the Dictation Recognizer is still active.
  • Stop speaking for a few seconds. Watch as the Dictation Recognizer completes its hypothesis and shows the final result.
  • Begin speaking and then pause for 20 seconds. This will cause the Dictation Recognizer to time out.
  • Notice that the Keyword Recognizer is re-enabled after the above timeout. The communicator will now respond to voice commands.
  • Say "Send Message" to send the message to the astronaut.

Chapter 4 - Grammar Recognizer

Objectives

  • Use the Grammar Recognizer to recognize the user's speech according to an SRGS, or Speech Recognition Grammar Specification, file.

Note

The Microphone capability must be declared for an app to record from the microphone. This is done for you already in MR Input 212, but keep this in mind for your own projects.

  1. In the Unity Editor, go to the player settings by navigating to "Edit > Project Settings > Player"
  2. Click on the "Universal Windows Platform" tab
  3. In the "Publishing Settings > Capabilities" section, check the Microphone capability

Instructions

  1. In the Hierarchy panel, search for Jetpack_Center and select it.
  2. Look for the Tagalong Action script in the Inspector panel.
  3. Click the little circle to the right of the Object To Tag Along field.
  4. In the window that pops up, search for SRGSToolbox and select it from the list.
  5. Take a look at the SRGSColor.xml file in the StreamingAssets folder.
    1. The SRGS design spec can be found on the W3C website.

In our SRGS file, we have three types of rules:

  • A rule which lets you say one color from a list of twelve colors.
  • Three rules which listen for a combination of the color rule and one of the three shapes.
  • The root rule, colorChooser, which listens for any combination of the three "color + shape" rules. The shapes can be said in any order, and in any amount from just one to all three. This is the only rule that is listened for, as it's specified as the root rule at the top of the file in the initial <grammar> tag.
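A heavily simplified sketch of how those three rule types fit together in SRGS XML is shown below. This is not the contents of SRGSColor.xml (which defines twelve colors and three shapes); the rule names besides colorChooser, and the short color list, are illustrative only.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Simplified sketch in the spirit of SRGSColor.xml, not the real file. -->
<grammar version="1.0" xml:lang="en-US" mode="voice"
         root="colorChooser"
         xmlns="http://www.w3.org/2001/06/grammar">

  <!-- Root rule: the only rule the recognizer listens for directly.
       It accepts one to three "color + shape" phrases in any order. -->
  <rule id="colorChooser">
    <item repeat="1-3">
      <one-of>
        <item><ruleref uri="#colorCircle"/></item>
        <item><ruleref uri="#colorSquare"/></item>
      </one-of>
    </item>
  </rule>

  <!-- "Color + shape" rules: a color followed by a shape name. -->
  <rule id="colorCircle">
    <ruleref uri="#color"/>
    <item>circle</item>
  </rule>
  <rule id="colorSquare">
    <ruleref uri="#color"/>
    <item>square</item>
  </rule>

  <!-- The color rule: one color from a list. -->
  <rule id="color">
    <one-of>
      <item>blue</item>
      <item>yellow</item>
      <item>red</item>
    </one-of>
  </rule>
</grammar>
```

The root attribute on the <grammar> element is what marks colorChooser as the entry point; the other rules are only reachable through <ruleref> references from it.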

Build and Deploy

  • Rebuild the application in Unity, then build and deploy from Visual Studio to experience the app on HoloLens.
  • Dismiss the fit box with an air-tap gesture.
  • Gaze at the astronaut's jetpack and perform an air-tap gesture.
  • Start speaking. The Grammar Recognizer will interpret your speech and change the colors of the shapes based on the recognition. An example command is "blue circle, yellow square".
  • Perform another air-tap gesture to dismiss the toolbox.

The End

Congratulations! You have now completed MR Input 212: Voice.

  • You know the Dos and Don'ts of voice commands.
  • You saw how tooltips were employed to make users aware of voice commands.
  • You saw several types of feedback used to acknowledge that the user's voice was heard.
  • You know how to switch between the Keyword Recognizer and the Dictation Recognizer, and how these two features understand and interpret your voice.
  • You learned how to use an SRGS file and the Grammar Recognizer for speech recognition in your application.