Quickstart: Run the Speech Devices SDK sample app on Windows, Linux or Android

In this quickstart, you'll learn how to use the Speech Devices SDK for Windows to build a speech-enabled product or use it as a Conversation Transcription device. For Conversation Transcription, only the Azure Kinect DK is supported. For other speech uses, linear microphone arrays that provide a microphone array geometry are supported.

The application is built with the Speech SDK package, and the Eclipse Java IDE (v4) on 64-bit Windows. It runs on a 64-bit Java 8 runtime environment (JRE).

This guide requires an Azure Cognitive Services account with a Speech service resource.

The source code for the sample application is included with the Speech Devices SDK. It's also available on GitHub.

Prerequisites

This quickstart requires:

  • Operating System: 64-bit Windows
  • Azure Kinect DK
  • Eclipse Java IDE
  • Java 8 or JDK 8 only
  • An Azure subscription key for the Speech service. Get one for free.
  • Download the latest version of the Speech Devices SDK for Java, and extract the .zip to your working directory.

    Note

    This quickstart assumes that the app is extracted to C:\SDSDK\JRE-Sample-Release

If you plan to use intents, you'll need a Language Understanding Service (LUIS) subscription. To learn more about LUIS and intent recognition, see Recognize speech intents with LUIS, C#. A sample LUIS model is available for this app.

Create and configure the project

  1. Start Eclipse.

  2. In the Eclipse IDE Launcher, in the Workspace field, enter the name of a new workspace directory. Then select Launch.

    Screenshot that shows the Eclipse Launcher where you enter the name of the workspace directory.

  3. In a moment, the main window of the Eclipse IDE appears. Close the Welcome screen if one is present.

  4. From the Eclipse menu bar, create a new project by choosing File > New > Java Project. If Java Project isn't available, choose Project and then select Java Project.

  5. The New Java Project wizard starts. Browse for the location of the sample project. Select Finish.

    Screenshot that shows the New Java Project wizard.

  6. In the Package explorer, right-click your project. Choose Configure > Convert to Maven Project from the context menu. Select Finish.

    Screenshot of Package explorer

  7. Open the pom.xml file and edit it.

    At the end of the file, before the closing tag </project>, create repositories and dependencies elements, as shown here, and make sure the version matches the version of your installed SDK:

    <repositories>
         <repository>
             <id>maven-cognitiveservices-speech</id>
             <name>Microsoft Cognitive Services Speech Maven Repository</name>
             <url>https://csspeechstorage.blob.core.windows.net/maven/</url>
         </repository>
    </repositories>
    
    <dependencies>
        <dependency>
             <groupId>com.microsoft.cognitiveservices.speech</groupId>
             <artifactId>client-sdk</artifactId>
             <version>1.15.0</version>
        </dependency>
    </dependencies>
    
  8. Copy the contents of Windows-x64 to the Java Project location, for example C:\SDSDK\JRE-Sample-Release.

  9. Copy kws.table, participants.properties, and Microsoft.CognitiveServices.Speech.extension.pma.dll into the project folder target\classes.

Configure the sample application

  1. Add your speech subscription key to the source code. If you want to try intent recognition, also add your Language Understanding service subscription key and application ID.

    For speech and LUIS, your information goes into FunctionsList.java:

     // Subscription
     private static String SpeechSubscriptionKey = "<enter your subscription info here>";
     private static String SpeechRegion = "westus"; // You can change this if your speech region is different.
     private static String LuisSubscriptionKey = "<enter your subscription info here>";
     private static String LuisRegion = "westus2"; // Change this if you want to test intents and your LUIS region is different.
     private static String LuisAppId = "<enter your LUIS AppId>";
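
    For context, here's a minimal sketch of how these values typically reach the Speech SDK. SpeechConfig.fromSubscription and LanguageUnderstandingModel.fromSubscription are Speech SDK calls; the surrounding statements are illustrative rather than the sample's exact code:

     import com.microsoft.cognitiveservices.speech.SpeechConfig;
     import com.microsoft.cognitiveservices.speech.intent.LanguageUnderstandingModel;

     // Create a speech configuration from the subscription key and region above.
     SpeechConfig config = SpeechConfig.fromSubscription(SpeechSubscriptionKey, SpeechRegion);

     // For intent recognition, combine the LUIS key, app ID, and region into a model.
     LanguageUnderstandingModel luisModel =
             LanguageUnderstandingModel.fromSubscription(LuisSubscriptionKey, LuisAppId, LuisRegion);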
    

    If you are using conversation transcription, your speech key and region information are also needed in Cts.java:

     private static final String CTSKey = "<Conversation Transcription Service Key>";
     private static final String CTSRegion = "<Conversation Transcription Service Region>"; // Region may be "centralus" or "eastasia".
    
  2. The default keyword is "Computer". You can also try one of the other provided keywords, like "Machine" or "Assistant". The resource files for these alternate keywords are in the Speech Devices SDK, in the keyword folder. For example, C:\SDSDK\JRE-Sample-Release\keyword\Computer contains the files used for the keyword "Computer".

    Tip

    You can also create a custom keyword.

    To use a new keyword, update the following line in FunctionsList.java, and copy the keyword file to your app. For example, to use the keyword 'Machine' from the keyword package machine.zip:

    • Copy the kws.table file from the zip package into the project folder target/classes.

    • Update the FunctionsList.java with the keyword name:

      private static final String Keyword = "Machine";
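
    As a rough sketch of how the keyword file is consumed at run time, assuming an existing SpeechRecognizer named recognizer (KeywordRecognitionModel.fromFile and startKeywordRecognitionAsync are Speech SDK calls; the wiring is illustrative, not the sample's exact code):

      import com.microsoft.cognitiveservices.speech.KeywordRecognitionModel;

      // Load the keyword model that was copied into target/classes.
      KeywordRecognitionModel kwsModel = KeywordRecognitionModel.fromFile("kws.table");

      // Start listening for the keyword on the existing recognizer.
      recognizer.startKeywordRecognitionAsync(kwsModel).get();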
      

Run the sample application from Eclipse

  1. From the Eclipse menu bar, choose Run > Run As > Java Application. Then select FunctionsList and select OK.

    Screenshot of Select Java Application

  2. The Speech Devices SDK example application starts and displays the following options:

    Screenshot of a sample Speech Devices SDK application and options.

  3. Try the new Conversation Transcription demo. Start transcribing with Session > Start. By default, everyone is a guest. However, if you have participants' voice signatures, you can put them in a file named participants.properties in the project folder target/classes. To generate a voice signature, see Transcribe conversations (SDK).

    Screenshot of a demo Conversation Transcription application.

Create and run a standalone application

  1. In the Package explorer, right-click your project. Choose Export.

  2. The Export window appears. Expand Java and select Runnable JAR file and then select Next.

    Screenshot that shows the Export window where you select Runnable JAR file.

  3. The Runnable JAR File Export window appears. Choose an Export destination for the application, and then select Finish.

    Screenshot that shows the Runnable JAR File Export window where you choose the export destination.

  4. Copy kws.table, participants.properties, unimic_runtime.dll, pma.dll, and Microsoft.CognitiveServices.Speech.extension.pma.dll into the destination folder chosen above; the application needs these files at run time.

  5. To run the standalone application:

    java -jar SpeechDemo.jar
    

In this quickstart, you'll learn how to use the Speech Devices SDK for Linux to build a speech-enabled product or use it as a Conversation Transcription device. Currently only the Azure Kinect DK is supported.

The application is built with the Speech SDK package, and the Eclipse Java IDE (v4) on 64-bit Linux (Ubuntu 16.04, Ubuntu 18.04, Debian 9, RHEL 7/8, CentOS 7/8). It runs on a 64-bit Java 8 runtime environment (JRE).

This guide requires an Azure Cognitive Services account with a Speech service resource.

The source code for the sample application is included with the Speech Devices SDK. It's also available on GitHub.

Prerequisites

This quickstart requires:

  • Operating System: 64-bit Linux (Ubuntu 16.04, Ubuntu 18.04, Debian 9, RHEL 7/8, CentOS 7/8)
  • Azure Kinect DK
  • Eclipse Java IDE
  • Java 8 or JDK 8 only.
  • An Azure subscription key for the Speech service. Get one for free.
  • Download the latest version of the Speech Devices SDK for Java, and extract the .zip to your working directory.

    Note

    This quickstart assumes that the app is extracted to /home/wcaltest/JRE-Sample-Release

Make sure these dependencies are installed before starting Eclipse.

  • On Ubuntu:

    sudo apt-get update
    sudo apt-get install libssl1.0.0 libasound2
    
  • On Debian 9:

    sudo apt-get update
    sudo apt-get install libssl1.0.2 libasound2
    
  • On RHEL/CentOS:

    sudo yum update
    sudo yum install alsa-lib openssl
    

Note

Conversation Transcription is currently only available for "en-US" and "zh-CN", in the "centralus" and "eastasia" regions. You must have a speech key in one of those regions to use Conversation Transcription.

If you plan to use intents, you'll need a Language Understanding Service (LUIS) subscription. To learn more about LUIS and intent recognition, see Recognize speech intents with LUIS, C#. A sample LUIS model is available for this app.

Create and configure the project

  1. Start Eclipse.

  2. In the Eclipse IDE Launcher, in the Workspace field, enter the name of a new workspace directory. Then select Launch.

    Screenshot that shows the Eclipse Launcher.

  3. In a moment, the main window of the Eclipse IDE appears. Close the Welcome screen if one is present.

  4. From the Eclipse menu bar, create a new project by choosing File > New > Java Project. If Java Project isn't available, choose Project and then select Java Project.

  5. The New Java Project wizard starts. Browse for the location of the sample project. Select Finish.

    Screenshot of New Java Project wizard

  6. In the Package explorer, right-click your project. Choose Configure > Convert to Maven Project from the context menu. Select Finish.

    Screenshot of Package explorer

  7. Open the pom.xml file and edit it.

    At the end of the file, before the closing tag </project>, create repositories and dependencies elements, as shown here, and make sure the version matches the version of your installed SDK:

    <repositories>
         <repository>
             <id>maven-cognitiveservices-speech</id>
             <name>Microsoft Cognitive Services Speech Maven Repository</name>
             <url>https://csspeechstorage.blob.core.windows.net/maven/</url>
         </repository>
    </repositories>
    
    <dependencies>
        <dependency>
             <groupId>com.microsoft.cognitiveservices.speech</groupId>
             <artifactId>client-sdk</artifactId>
             <version>1.15.0</version>
        </dependency>
    </dependencies>
    
  8. In the Package explorer, right-click your project. Choose Properties, then Run/Debug Settings > New… > Java Application.

  9. The Edit Configuration window appears. In the Name field enter Main, and use Search for the Main Class to find and select com.microsoft.cognitiveservices.speech.samples.FunctionsList.

    Screenshot of Edit Launch Configuration

  10. Copy the audio binaries for your target architecture, from either Linux-arm or Linux-x64, to the Java Project location, for example /home/wcaltest/JRE-Sample-Release.

  11. Also in the Edit Configuration window, select the Environment page and select New. The New Environment Variable window appears. In the Name field, enter LD_LIBRARY_PATH, and in the Value field, enter the folder containing the *.so files, for example /home/wcaltest/JRE-Sample-Release.

  12. Copy kws.table and participants.properties into the project folder target/classes.

Configure the sample application

  1. Add your speech subscription key to the source code. If you want to try intent recognition, also add your Language Understanding service subscription key and application ID.

    For speech and LUIS, your information goes into FunctionsList.java:

     // Subscription
     private static String SpeechSubscriptionKey = "<enter your subscription info here>";
     private static String SpeechRegion = "westus"; // You can change this if your speech region is different.
     private static String LuisSubscriptionKey = "<enter your subscription info here>";
     private static String LuisRegion = "westus2"; // Change this if you want to test intents and your LUIS region is different.
     private static String LuisAppId = "<enter your LUIS AppId>";
    

    If you are using conversation transcription, your speech key and region information are also needed in Cts.java:

     private static final String CTSKey = "<Conversation Transcription Service Key>";
     private static final String CTSRegion = "<Conversation Transcription Service Region>"; // Region may be "centralus" or "eastasia".
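
    For reference, here's a minimal sketch of how these values drive the transcription APIs. Conversation, ConversationTranscriber, and the calls shown are Speech SDK types, but the wiring below is illustrative, with error handling and audio configuration omitted:

     import com.microsoft.cognitiveservices.speech.SpeechConfig;
     import com.microsoft.cognitiveservices.speech.transcription.Conversation;
     import com.microsoft.cognitiveservices.speech.transcription.ConversationTranscriber;

     SpeechConfig ctsConfig = SpeechConfig.fromSubscription(CTSKey, CTSRegion);

     // Create a conversation and attach a transcriber to it.
     Conversation conversation = Conversation.createConversationAsync(ctsConfig).get();
     ConversationTranscriber transcriber = new ConversationTranscriber();
     transcriber.joinConversationAsync(conversation).get();

     // Print each transcribed utterance with the recognized speaker ID.
     transcriber.transcribed.addEventListener((s, e) ->
             System.out.println(e.getResult().getUserId() + ": " + e.getResult().getText()));

     transcriber.startTranscribingAsync().get();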
    
  2. The default keyword is "Computer". You can also try one of the other provided keywords, like "Machine" or "Assistant". The resource files for these alternate keywords are in the Speech Devices SDK, in the keyword folder. For example, /home/wcaltest/JRE-Sample-Release/keyword/Computer contains the files used for the keyword "Computer".

    Tip

    You can also create a custom keyword.

    To use a new keyword, update the following line in FunctionsList.java, and copy the keyword file to your app. For example, to use the keyword 'Machine' from the keyword package machine.zip:

    • Copy the kws.table file from the zip package into the project folder target/classes.

    • Update the FunctionsList.java with the keyword name:

      private static final String Keyword = "Machine";
      

Run the sample application from Eclipse

  1. From the Eclipse menu bar, choose Run > Run.

  2. The Speech Devices SDK example application starts and displays the following options:

    Screenshot that shows a Speech Devices SDK example application and options.

  3. Try the new Conversation Transcription demo. Start transcribing with Session > Start. By default, everyone is a guest. However, if you have participants' voice signatures, you can put them in participants.properties in the project folder target/classes. To generate a voice signature, see Transcribe conversations (SDK).

    Screenshot that shows a demo Conversation Transcription application.

Create and run a standalone application

  1. In the Package explorer, right-click your project. Choose Export.

  2. The Export window appears. Expand Java and select Runnable JAR file and then select Next.

    Screenshot that shows the Export window.

  3. The Runnable JAR File Export window appears. Choose an Export destination for the application, and then select Finish.

    Screenshot that shows the Runnable JAR File Export window.

  4. Copy kws.table and participants.properties into the destination folder chosen above; the application needs these files at run time.

  5. Set LD_LIBRARY_PATH to the folder containing the *.so files:

    export LD_LIBRARY_PATH=/home/wcaltest/JRE-Sample-Release
    
  6. To run the standalone application:

    java -jar SpeechDemo.jar
    

In this quickstart, you'll learn how to use the Speech Devices SDK for Android to build a speech-enabled product or use it as a Conversation Transcription device.

This guide requires an Azure Cognitive Services account with a Speech service resource.

The source code for the sample application is included with the Speech Devices SDK. It's also available on GitHub.

Prerequisites

Before you start using the Speech Devices SDK, you'll need to:

  • Get an Azure subscription key for the Speech service.
  • Download the latest version of the Speech Devices SDK sample app, and extract the .zip to your working directory. This quickstart assumes that the app is extracted to C:\SDSDK\Android-Sample-Release.
  • Install Android Studio and Vysor on your PC.

Set up the device

  1. Start Vysor on your computer.

    Vysor

  2. Your device should be listed under Choose a device. Select the View button next to the device.

  3. Connect to your wireless network by selecting the folder icon, and then select Settings > WLAN.

    Vysor WLAN

    Note

    If your company has policies about connecting devices to its Wi-Fi system, you need to obtain the MAC address and contact your IT department about how to connect it to your company's Wi-Fi.

    To find the MAC address of the dev kit, select the file folder icon on the desktop of the dev kit.

    Vysor file folder

    Select Settings. Search for "mac address", and then select Mac address > Advanced WLAN. Write down the MAC address that appears near the bottom of the dialog box.

    Vysor MAC address

    Some companies might have a time limit on how long a device can be connected to their Wi-Fi system. You might need to extend the dev kit's registration with your Wi-Fi system after a specific number of days.

Run the sample application

To validate your development kit setup, build and install the sample application:

  1. Start Android Studio.

  2. Select Open an existing Android Studio project.

    Android Studio - Open an existing project

  3. Go to C:\SDSDK\Android-Sample-Release\example. Select OK to open the example project.

  4. Configure Gradle to reference the Speech SDK. The following files can be found under Gradle Scripts in Android Studio.

    Update build.gradle (Project: example) by adding the maven repository lines so that the allprojects block matches the following:

    allprojects {
        repositories {
            google()
            jcenter()
            mavenCentral()
            maven {
                url 'https://csspeechstorage.blob.core.windows.net/maven/'
            }
        }
    }
    

    Update build.gradle (Module: app) by adding this line to the dependencies section:

    implementation 'com.microsoft.cognitiveservices.speech:client-sdk:1.17.0'
    
  5. Add your speech subscription key to the source code. If you want to try intent recognition, also add your Language Understanding service subscription key and application ID.

    For speech and LUIS, your information goes into MainActivity.java:

     // Subscription
     private static String SpeechSubscriptionKey = "<enter your subscription info here>";
     private static String SpeechRegion = "westus"; // You can change this if your speech region is different.
     private static String LuisSubscriptionKey = "<enter your subscription info here>";
     private static String LuisRegion = "westus2"; // Change this if you want to test intents and your LUIS region is different.
     private static String LuisAppId = "<enter your LUIS AppId>";
    

    If you are using conversation transcription, your speech key and region information are also needed in conversation.java:

     private static final String CTSKey = "<Conversation Transcription Service Key>";
     private static final String CTSRegion = "<Conversation Transcription Service Region>"; // Region may be "centralus" or "eastasia".
    
  6. The default keyword is "Computer". You can also try one of the other provided keywords, like "Machine" or "Assistant". The resource files for these alternate keywords are in the Speech Devices SDK, in the keyword folder. For example, C:\SDSDK\Android-Sample-Release\keyword\Computer contains the files used for the keyword "Computer".

    Tip

    You can also create a custom keyword.

    To use a new keyword, update the following two lines in MainActivity.java, and copy the keyword package to your app. For example, to use the keyword 'Machine' from the keyword package kws-machine.zip:

    • Copy the keyword package into the folder "C:\SDSDK\Android-Sample-Release\example\app\src\main\assets".

    • Update the MainActivity.java with the keyword and the package name:

      private static final String Keyword = "Machine";
      private static final String KeywordModel = "kws-machine.zip"; // Set your own keyword package name.
      
  7. Update the following lines, which contain the microphone array geometry settings:

    private static final String DeviceGeometry = "Circular6+1";
    private static final String SelectedGeometry = "Circular6+1";
    

    The following table lists the supported values:

    Variable            Meaning                       Available values
    ------------------  ----------------------------  -------------------------------------------------------
    DeviceGeometry      Physical mic configuration    For a circular dev kit: Circular6+1
                                                      For a linear dev kit: Linear4
    SelectedGeometry    Software mic configuration    For a circular dev kit that uses all mics: Circular6+1
                                                      For a circular dev kit that uses four mics: Circular3+1
                                                      For a linear dev kit that uses all mics: Linear4
                                                      For a linear dev kit that uses two mics: Linear2
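
    As a sketch of how these settings reach the SDK (SpeechConfig.setProperty is a Speech SDK method; the helper method itself is illustrative of the sample's wiring, not its exact code):

     import com.microsoft.cognitiveservices.speech.SpeechConfig;

     // Attach the microphone-array geometry settings to the speech configuration.
     private static SpeechConfig getSpeechConfig() {
         SpeechConfig config = SpeechConfig.fromSubscription(SpeechSubscriptionKey, SpeechRegion);
         config.setProperty("DeviceGeometry", DeviceGeometry);
         config.setProperty("SelectedGeometry", SelectedGeometry);
         return config;
     }
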
  8. To build the application, on the Run menu, select Run 'app'. The Select Deployment Target dialog box appears.

  9. Select your device, and then select OK to deploy the application to the device.

    Select Deployment Target dialog box

  10. The Speech Devices SDK example application starts and displays the following options:

    Sample Speech Devices SDK example application and options

  11. Try the new Conversation Transcription demo. Start transcribing with 'Start Session'. By default, everyone is a guest. However, if you have participants' voice signatures, you can put them in a file named /video/participants.properties on the device. To generate a voice signature, see Transcribe conversations (SDK).

    Demo Conversation Transcription application

  12. Experiment!

Troubleshooting

If you can't connect to the Speech Device, type the following command in a Command Prompt window. It returns a list of connected devices:

 adb devices
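
If adb is working and the device is authorized, the output is similar to the following (the serial number shown here is a placeholder):

 List of devices attached
 0123456789ABCDEF    device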

Note

This command uses the Android Debug Bridge, adb.exe, which is part of the Android Studio installation. This tool is located in C:\Users\[user name]\AppData\Local\Android\Sdk\platform-tools. You can add this directory to your path to make it more convenient to invoke adb. Otherwise, you must specify the full path to your installation of adb.exe in every command that invokes adb.

If you see the error no devices/emulators found, check that your USB cable is connected and that you're using a high-quality cable.

Next steps