Exercise - Create the speech translation script
In Unity, displaying the result of an API integration within a mixed reality app requires adding and configuring UI elements in the scene. To display speech translation from the Azure Cognitive Services Speech Translation service, you add a quad, text elements, and a button to the scene and assign them to their respective properties in the corresponding C# script. You also need valid Azure credentials to use the Speech service.
Here, you'll add Mixed Reality Toolkit (MRTK) UI elements to a Unity scene. You'll then create a script that uses the Cognitive Services Speech SDK for translation.
Create UI
You create a parent object called Translation UI to store the UI elements as its children. Storing the UI elements as children of Translation UI helps keep the project hierarchy organized. It also enables the UI elements to inherit properties from the parent, such as the transform.
The UI consists of a quad, which serves as a flat surface for the text displayed on top of it. The top portion of the quad contains the recognized text, and the bottom portion displays the translated speech. The text elements share a similar setup: after you create and configure one, you can duplicate it to save time creating the others.
Create a Translation UI GameObject to store the UI
In the Hierarchy window, select the + icon and select Create Empty.
In the Hierarchy window, select the new GameObject. Its properties appear in the Inspector window.
In the Inspector window, name the GameObject Translation UI and change Transform Position to 0, 0, 0.8. This setting moves the object away from the origin, so the UI elements later appear in front of the user.
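If it helps to see this step expressed in code, here's a minimal sketch, assuming you attach it to any object in the scene; the class name is illustrative, and the module's actual workflow is the editor steps above.

```csharp
using UnityEngine;

// Minimal sketch (illustrative): the same step done from code.
public class CreateTranslationUIParent : MonoBehaviour
{
    void Start()
    {
        var translationUI = new GameObject("Translation UI");
        // 0.8 m forward of the origin, so the UI appears in front of the user.
        translationUI.transform.position = new Vector3(0f, 0f, 0.8f);
    }
}
```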
Add a Quad GameObject
To add the object as a child of Translation UI, in the Hierarchy window, select the Translation UI object. Next, right-click the object and select 3D Object > Quad.
Adjust the Scene view camera by using the Scene gizmo to view the front of the quad.
Select the Quad object in the Hierarchy window. The quad's properties appear in the Inspector window.
In the Inspector window, change the quad's Transform Scale to 0.8, 0.5, 0.1. This setting will make the quad rectangular.
The Quad object's material should be a color that keeps the text UI legible. MRTK includes materials that you can apply to GameObjects. In the Project window, enter MRTK_Standard_White in the search bar to find the white material.
Drag the MRTK_Standard_White material from the Project window to the Quad object. The color of the Quad object in the Scene window should now be white.
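Expressed as code, the quad setup might look like the following sketch. It assumes the Translation UI object already exists and that the MRTK material can be loaded at runtime (for example, from a Resources folder); the load path is an assumption about your project layout.

```csharp
using UnityEngine;

// Minimal sketch (illustrative): quad backplate as a child of Translation UI.
public class CreateQuadBackplate : MonoBehaviour
{
    void Start()
    {
        var parent = GameObject.Find("Translation UI").transform;

        var quad = GameObject.CreatePrimitive(PrimitiveType.Quad);
        quad.transform.SetParent(parent, false);
        quad.transform.localScale = new Vector3(0.8f, 0.5f, 0.1f); // rectangular backplate

        // Assumes the material was copied into a Resources folder;
        // adjust the path to match your project.
        quad.GetComponent<Renderer>().material =
            Resources.Load<Material>("MRTK_Standard_White");
    }
}
```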
Add text to display on the Quad GameObject
To add the text object as a child of Translation UI, in the Hierarchy window, select the Translation UI object. Next, right-click the object and select 3D Object > Text - TextMeshPro.
Select the Text - TextMeshPro object in the Hierarchy window. The Text - TextMeshPro properties appear in the Inspector window.
In the Inspector window, name the Text - TextMeshPro object Recognition (Label) and change the following Rect Transform properties:
- Pos X = 0, Pos Y = 0.1, Pos Z = 0
- Width = 0.74
- Height = 0.2
With the Recognition (Label) object still selected, in the Inspector window, change the remaining properties:
- Text = Recognized Speech:
- Font Size = 0.3
- Face Color = 000000
You can duplicate the Recognition (Label) object to create the remaining text UI. The Pos Y and Text properties will be changed for each object to position the text appropriately against the Quad object. With the Recognition (Label) object still selected, use the keyboard shortcut Ctrl+D to duplicate the object three times.
In the Hierarchy window, rename each object as follows:
- Recognition (Label)
- Recognition Output
- Translation (Label)
- Translation Output
For each duplicated object, change the Pos Y and Text properties as follows (a code sketch of all four elements follows this list):
- Recognition Output: Pos Y = 0.03, Text = Recognized speech from Azure
- Translation (Label): Pos Y = -0.1, Text = Translation:
- Translation Output: Pos Y = -0.17, Text = Translated speech from Azure
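Here's that sketch: a minimal, illustrative helper that creates each text element with the values above, assuming the Translation UI object already exists in the scene.

```csharp
using TMPro;
using UnityEngine;

// Minimal sketch (illustrative): the four text elements, created from code.
public class CreateTranslationLabels : MonoBehaviour
{
    void Start()
    {
        var parent = GameObject.Find("Translation UI").transform;
        CreateLabel(parent, "Recognition (Label)", 0.1f, "Recognized Speech:");
        CreateLabel(parent, "Recognition Output", 0.03f, "Recognized speech from Azure");
        CreateLabel(parent, "Translation (Label)", -0.1f, "Translation:");
        CreateLabel(parent, "Translation Output", -0.17f, "Translated speech from Azure");
    }

    static void CreateLabel(Transform parent, string name, float posY, string text)
    {
        var label = new GameObject(name).AddComponent<TextMeshPro>();
        label.transform.SetParent(parent, false);
        label.rectTransform.localPosition = new Vector3(0f, posY, 0f);
        label.rectTransform.sizeDelta = new Vector2(0.74f, 0.2f); // Width, Height
        label.text = text;
        label.fontSize = 0.3f;
        label.color = Color.black; // Face Color 000000
    }
}
```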
Add a button to display on the Quad GameObject
An MRTK button prefab is used for the microphone. In the Project window, search for PressableButtonHoloLens2Circular_32x32.
To add the button prefab as a child of Translation UI, drag the button from the Project window to the Translation UI object.
Select the PressableButtonHoloLens2Circular_32x32 object in the Hierarchy window. The button properties appear in the Inspector window.
In the Inspector window, name the button object Mic and change the following Transform properties:
- Position: 0.3, -0.17, -0.1
- Scale: 2, 2, 1
With the Mic object still selected, in the Inspector window, change Icon to a microphone.
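As code, the button placement might look like the following sketch. The prefab reference is assumed to be assigned in the Inspector, and changing the icon is simplest to do in the Inspector as described above.

```csharp
using UnityEngine;

// Minimal sketch (illustrative): place the MRTK button under Translation UI.
public class CreateMicButton : MonoBehaviour
{
    public GameObject buttonPrefab; // assign PressableButtonHoloLens2Circular_32x32

    void Start()
    {
        var parent = GameObject.Find("Translation UI").transform;
        var mic = Instantiate(buttonPrefab, parent);
        mic.name = "Mic";
        mic.transform.localPosition = new Vector3(0.3f, -0.17f, -0.1f);
        mic.transform.localScale = new Vector3(2f, 2f, 1f);
    }
}
```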
Create the Translation.cs script
Note
This script includes additional helper code to help you debug potential issues that might occur when you run the Unity scene.
In the Project window, select the Assets folder. This folder will be used to store the Translation.cs script.
In the Project window, select the + icon and select C# Script. Name the script Translation.cs.
In the Project window, double-click the script to open it in Visual Studio.
In Visual Studio, replace the default code provided in the template with the following script:
```csharp
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Translation;
using Microsoft.MixedReality.Toolkit.UI;
using TMPro;

public class Translation : MonoBehaviour
{
    public TextMeshPro recognizedText;
    public TextMeshPro translatedText;
    public PressableButton micButton;
    public string SpeechServiceSubscriptionKey = "";
    public string SpeechServiceRegion = "";

    private bool waitingforReco;
    private string recognizedString;
    private string translatedString;
    private bool micPermissionGranted = false;
    private object threadLocker = new object();

    public async void ButtonClick()
    {
        var translationConfig = SpeechTranslationConfig.FromSubscription(SpeechServiceSubscriptionKey, SpeechServiceRegion);
        translationConfig.SpeechRecognitionLanguage = "en-US";
        translationConfig.AddTargetLanguage("fr");

        using (var recognizer = new TranslationRecognizer(translationConfig))
        {
            lock (threadLocker)
            {
                waitingforReco = true;
            }

            // Recognize a single utterance and wait for the result.
            var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);

            if (result.Reason == ResultReason.TranslatedSpeech)
            {
                recognizedString = result.Text;
                foreach (var element in result.Translations)
                {
                    translatedString = element.Value;
                }
            }
            else if (result.Reason == ResultReason.NoMatch)
            {
                recognizedString = "NOMATCH: Speech could not be recognized.";
            }
            else if (result.Reason == ResultReason.Canceled)
            {
                var cancellation = CancellationDetails.FromResult(result);
                recognizedString = $"CANCELED: Reason={cancellation.Reason} ErrorDetails={cancellation.ErrorDetails}";
            }

            lock (threadLocker)
            {
                waitingforReco = false;
            }
        }
    }

    // Start is called before the first frame update
    void Start()
    {
        if (translatedText == null)
        {
            UnityEngine.Debug.LogError("translatedText property is null! Assign a UI TextMeshPro Text element to it.");
        }
        else if (micButton == null)
        {
            UnityEngine.Debug.LogError("micButton property is null! Assign a MRTK Pressable Button to it.");
        }
        else
        {
            micPermissionGranted = true;
            micButton.ButtonPressed.AddListener(ButtonClick);
        }
    }

    // Update is called once per frame
    void Update()
    {
        lock (threadLocker)
        {
            recognizedText.text = recognizedString;
            translatedText.text = translatedString;
        }
    }
}
```
The script is written to recognize English (en-US) and translate to French (fr). You can change the recognition language by modifying the value of the SpeechRecognitionLanguage property, and the target language by modifying the parameter passed to the AddTargetLanguage method:

```csharp
translationConfig.SpeechRecognitionLanguage = "<assign a locale>";
translationConfig.AddTargetLanguage("<assign a language code>");
```
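For example, to recognize Spanish and translate into both French and German (the Speech SDK accepts multiple target languages), you could configure:

```csharp
translationConfig.SpeechRecognitionLanguage = "es-ES"; // recognize Spanish
translationConfig.AddTargetLanguage("fr");             // translate to French
translationConfig.AddTargetLanguage("de");             // also translate to German

// With multiple targets, result.Translations is keyed by language code:
// result.Translations["fr"], result.Translations["de"]
```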
Save the file and return to Unity.
Add the Translation.cs script to a GameObject
In the Hierarchy window, select the Translation UI object. The Translation UI properties appear in the Inspector window.
In the Inspector window, select Add Component. In the Search window that appears, enter translation and select the Translation.cs script.
With the Translation UI object still selected, in the Inspector window, expand the Translation.cs script properties.
In the Inspector window, assign the following objects to their respective properties:
- Recognized Text: Recognition Output
- Translated Text: Translation Output
- Mic Button: Mic
With the Translation UI object still selected, in the Inspector window, enter the Speech Service Subscription Key and Speech Service Region values from your Azure Speech resource.
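For reference, those Inspector assignments correspond to setting the script's public fields. A hypothetical editor-scripting equivalent might look like this; all lookups and placeholder values are illustrative, and it assumes the Translation component has already been added.

```csharp
using Microsoft.MixedReality.Toolkit.UI;
using TMPro;
using UnityEngine;

// Hypothetical wiring sketch: what the Inspector assignments above set up.
public class WireTranslationUI : MonoBehaviour
{
    void Start()
    {
        var translation = GameObject.Find("Translation UI").GetComponent<Translation>();

        translation.recognizedText = GameObject.Find("Recognition Output").GetComponent<TextMeshPro>();
        translation.translatedText = GameObject.Find("Translation Output").GetComponent<TextMeshPro>();
        translation.micButton = GameObject.Find("Mic").GetComponent<PressableButton>();

        // Placeholders; use the key and region from your Azure Speech resource.
        translation.SpeechServiceSubscriptionKey = "<your-speech-key>";
        translation.SpeechServiceRegion = "<your-region>"; // for example, "eastus"
    }
}
```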
Try speech translation in play mode
In the Unity toolbar, select the Play icon to enter play mode.
Using the Unity in-editor input simulation, press the Spacebar on your keyboard to simulate hand input with the right hand.
While holding the Spacebar, scroll the mouse wheel forward to move the simulated hand and press the Mic button. Then say "Hello" into your microphone.
The recognized speech and the translation will appear in the scene. You can press the button again to translate more phrases.