1. Integrating and using speech recognition and transcription

In this tutorial series, you will create a mixed reality application that explores the use of Azure Speech Services with the HoloLens 2. When you complete this tutorial series, you will be able to use your device's microphone to transcribe speech to text in real time, translate your speech into other languages, and leverage the intent recognition feature to understand voice commands using artificial intelligence.

Objectives

  • Learn how to integrate Azure Speech Services with a HoloLens 2 application
  • Learn how to use speech recognition to transcribe text

Prerequisites

Tip

If you have not completed the Getting started tutorial series yet, it's recommended that you complete those tutorials first.

  • A Windows 10 PC configured with the correct tools installed
  • Windows 10 SDK 10.0.18362.0 or later
  • Some basic C# programming ability
  • A HoloLens 2 device configured for development
  • Unity Hub with Unity 2019 LTS installed and the Universal Windows Platform Build Support module added

Important

The recommended Unity version for this tutorial series is Unity 2019 LTS. This supersedes any Unity version requirements or recommendations stated in the prerequisites linked above.

Creating and preparing the Unity project

In this section, you will create a new Unity project and get it ready for MRTK development.

To do this, first follow the Initializing your project and first application instructions, excluding the Build your application to your device step. This includes the following steps:

  1. Creating the Unity project and giving it a suitable name, for example, MRTK Tutorials
  2. Switching the build platform
  3. Importing the TextMeshPro Essential Resources
  4. Importing the Mixed Reality Toolkit
  5. Configuring the Unity project
  6. Creating and configuring the scene, and giving the scene a suitable name, for example, AzureSpeechServices

Then follow the Changing the Spatial Awareness Display Option instructions to change the MRTK configuration profile for your scene to the DefaultHoloLens2ConfigurationProfile, and change the display options for the spatial awareness mesh to Occlusion.

Configuring the speech commands start behavior

Because you will use the Speech SDK for speech recognition and transcription, you need to configure the MRTK speech commands so they do not interfere with the Speech SDK functionality. To achieve this, change the speech commands start behavior from Auto Start to Manual Start.

With the MixedRealityToolkit object selected in the Hierarchy window, select the Input tab in the Inspector window, clone the DefaultHoloLens2InputSystemProfile and the DefaultMixedRealitySpeechCommandsProfile, and then change the speech commands Start Behavior to Manual Start.


Tip

For a reminder on how to clone and configure MRTK profiles, you can refer to the Configuring the Mixed Reality Toolkit profiles instructions.

Configuring the capabilities

In the Unity menu, select Edit > Project Settings... to open the Player Settings window, then locate the Player > Publishing Settings section.


In the Publishing Settings, scroll down to the Capabilities section and double-check that the InternetClient, Microphone, and SpatialPerception capabilities, which you enabled when you created the project at the beginning of the tutorial, are enabled. Then, enable the InternetClientServer and PrivateNetworkClientServer capabilities.
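
When Unity generates the UWP project, these choices end up in the Package.appxmanifest file. As a rough sketch (element names follow the UWP app manifest schema; the file Unity generates for you may differ in namespace prefixes and ordering), the Capabilities section would contain entries like:

```xml
<Capabilities>
  <!-- Network access for calling the Azure Speech service -->
  <Capability Name="internetClient" />
  <Capability Name="internetClientServer" />
  <Capability Name="privateNetworkClientServer" />
  <!-- Spatial mapping on HoloLens (uap2 namespace) -->
  <uap2:Capability Name="spatialPerception" />
  <!-- Microphone access for speech input -->
  <DeviceCapability Name="microphone" />
</Capabilities>
```

If a capability is missing, the corresponding APIs are blocked at runtime, which is why it is worth double-checking them before building.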


Importing the tutorial assets

Download and import the following Unity custom packages in the order they are listed:

Tip

For a reminder on how to import a Unity custom package, you can refer to the Importing the Mixed Reality Toolkit instructions.

After you have imported the tutorial assets, the new asset folders will appear in your Project window.


Preparing the scene

In this section, you will prepare the scene by adding the tutorial prefab, and configure the Lunarcom Controller (Script) component to control your scene.

In the Project window, navigate to the Assets > MRTK.Tutorials.AzureSpeechServices > Prefabs folder and drag the Lunarcom prefab into the Hierarchy window to add it to your scene.


With the Lunarcom object still selected in the Hierarchy window, use the Add Component button in the Inspector window to add the Lunarcom Controller (Script) component to the Lunarcom object.


Note

The Lunarcom Controller (Script) component is not part of MRTK. It is provided with this tutorial's assets.

With the Lunarcom object still selected, expand it to reveal its child objects, then drag the Terminal object into the Lunarcom Controller (Script) component's Terminal field.


With the Lunarcom object still selected, expand the Terminal object to reveal its child objects, then drag the ConnectionLight object into the Lunarcom Controller (Script) component's Connection Light field and the OutputText object into the Output Text field.


With the Lunarcom object still selected, expand the Buttons object to reveal its child objects, and then in the Inspector window, expand the Buttons list, set its Size to 3, and drag the MicButton, SatelliteButton, and RocketButton objects into the Element 0, 1, and 2 fields, respectively.


Connecting the Unity project to the Azure resource

To use Azure Speech Services, you need to create an Azure resource and obtain an API key for the Speech service. Follow the Try the Speech service for free instructions and make a note of your service region (also known as Location) and API key (also known as Key1 or Key2).

In the Hierarchy window, select the Lunarcom object, then in the Inspector window, locate the Lunarcom Controller (Script) component's Speech SDK Credentials section and configure it as follows:

  • In the Speech Service API Key field, enter your API key (Key1 or Key2)
  • In the Speech Service Region field, enter your service region (Location) in lowercase with any spaces removed
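
Internally, a controller script like this one typically hands these two values to the Speech SDK when it initializes. A minimal sketch, assuming the C# Speech SDK (Microsoft.CognitiveServices.Speech); the key and region below are placeholders, and the Lunarcom script's actual code may differ:

```csharp
using Microsoft.CognitiveServices.Speech;

// Placeholder credentials -- substitute your own Key1/Key2 and region.
string speechServiceApiKey = "<your-api-key>";
string speechServiceRegion = "westus"; // lowercase, no spaces, e.g. "westus2"

// The key only works together with the region it was issued for.
SpeechConfig speechConfig =
    SpeechConfig.FromSubscription(speechServiceApiKey, speechServiceRegion);
speechConfig.SpeechRecognitionLanguage = "en-US";
```

This is also why the region must be entered exactly: the SDK derives the service endpoint from it, so a misspelled region leads to connection failures at runtime.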


Using speech recognition to transcribe speech

In the Hierarchy window, select the Lunarcom object, then in the Inspector window, use the Add Component button to add the Lunarcom Speech Recognizer (Script) component to the Lunarcom object.


Note

The Lunarcom Speech Recognizer (Script) component is not part of MRTK. It is provided with this tutorial's assets.

If you now enter Game mode, you can test the speech recognition by first pressing the microphone button.


Then, assuming your computer has a microphone, when you say something, your speech will be transcribed on the terminal panel.
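
Conceptually, what a speech recognizer component does while the microphone button is active can be sketched with the Speech SDK like this. This is a simplified illustration assuming the default microphone as input; the tutorial's actual Lunarcom Speech Recognizer script is more elaborate:

```csharp
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

public static class TranscriptionSketch
{
    // Captures a single utterance from the default microphone and
    // returns the recognized text, or null if nothing was recognized.
    public static async Task<string> TranscribeOnceAsync(SpeechConfig speechConfig)
    {
        using (var recognizer = new SpeechRecognizer(speechConfig))
        {
            SpeechRecognitionResult result = await recognizer.RecognizeOnceAsync();
            return result.Reason == ResultReason.RecognizedSpeech
                ? result.Text   // in the tutorial, this appears on the terminal panel
                : null;
        }
    }
}
```

For live, ongoing transcription, the SDK also offers continuous recognition (StartContinuousRecognitionAsync together with the Recognizing and Recognized events), which is closer to the real-time updates you see on the terminal.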


Caution

The application needs to connect to Azure, so make sure your computer/device is connected to the internet.

Congratulations

You have implemented speech recognition powered by Azure. Run the application on your device to ensure the feature is working properly.

In the next tutorial, you will learn how to execute commands using Azure speech recognition.