Quickstart: Recognize speech in Java on Android by using the Speech SDK

In this article, you'll learn how to develop a Java application for Android that uses the Cognitive Services Speech SDK to transcribe speech to text. The application is based on the Speech SDK Maven package, version 1.6.0, and Android Studio 3.3. The Speech SDK is currently compatible with Android devices that have 32/64-bit ARM or Intel x86/x64 compatible processors.

Note

For the Speech Devices SDK and the Roobo device, see Speech Devices SDK.

Prerequisites

You need a Speech Services subscription key to complete this quickstart. You can get one for free. See Try the Speech Services for free for details.

Create and configure a project

  1. Launch Android Studio, and choose Start a new Android Studio project in the Welcome window.

    Screenshot of the Android Studio Welcome window

  2. When the Choose your project wizard appears, select Phone and Tablet and Empty Activity in the activity selection box. Select Next.

    Screenshot of the Choose your project wizard

  3. In the Configure your project screen, enter Quickstart as Name and samples.speech.cognitiveservices.microsoft.com as Package name, and choose a project directory. For Minimum API level, pick API 23: Android 6.0 (Marshmallow), leave all other checkboxes unchecked, and select Finish.

    Screenshot of the Configure your project wizard

Android Studio takes a moment to prepare your new Android project. Next, configure the project to know about the Speech SDK and to use Java 8.

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

The current version of the Cognitive Services Speech SDK is 1.6.0.

The Speech SDK for Android is packaged as an AAR (Android Library), which includes the necessary libraries and required Android permissions. It is hosted in a Maven repository at https://csspeechstorage.blob.core.windows.net/maven/.

Set up your project to use the Speech SDK. Open the Project Structure window by choosing File > Project Structure from the Android Studio menu bar. In the Project Structure window, make the following changes:

  1. In the list on the left side of the window, select Project. Edit the Default Library Repository settings by appending a comma and our Maven repository URL enclosed in single quotes: 'https://csspeechstorage.blob.core.windows.net/maven/'

    Screenshot of the Project Structure window

  2. In the same screen, on the left side, select app. Then select the Dependencies tab at the top of the window. Select the green plus sign (+), and choose Library dependency from the drop-down menu.

    Screenshot of the Project Structure window

  3. In the window that comes up, enter the name and version of our Speech SDK for Android, com.microsoft.cognitiveservices.speech:client-sdk:1.6.0. Then select OK. The Speech SDK should now appear in the list of dependencies, as shown below:

    Screenshot of the Project Structure window

  4. Select the Properties tab. For both Source Compatibility and Target Compatibility, select 1.8.

  5. Select OK to close the Project Structure window and apply your changes to the project.
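
The Project Structure changes above end up in the project's Gradle build files. As a rough sketch, assuming the default project layout that Android Studio generates (a project-level build.gradle and an app/build.gradle), the resulting entries would look something like the following. The exact surrounding content of your files will differ; only the repository URL, the dependency coordinates, and the Java 1.8 setting come from the steps above.

```groovy
// Project-level build.gradle (assumed default layout):
// the Speech SDK Maven repository is added next to the generated repositories.
allprojects {
    repositories {
        // ... repositories generated by Android Studio stay as they are ...
        maven { url 'https://csspeechstorage.blob.core.windows.net/maven/' }
    }
}

// app/build.gradle (assumed default layout):
android {
    // Source and target compatibility set to 1.8 (Java 8).
    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }
}

dependencies {
    // Speech SDK for Android, version 1.6.0.
    implementation 'com.microsoft.cognitiveservices.speech:client-sdk:1.6.0'
}
```

If the Project Structure dialog in your Android Studio version looks different from the screenshots, checking these entries in the build files is a way to confirm that the configuration was applied.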

Create user interface

We will create a basic user interface for the application. Edit the layout for your main activity, activity_main.xml. Initially, the layout includes a title bar with your application's name, and a TextView containing the text "Hello World!".

  • Click the TextView element. Change its ID attribute in the upper-right corner to hello.

  • From the Palette in the upper left of the activity_main.xml window, drag a button into the empty space above the text.

  • In the button's attributes on the right, in the value for the onClick attribute, enter onSpeechButtonClicked. We'll write a method with this name to handle the button event. Change its ID attribute in the upper-right corner to button.

  • Use the magic wand icon at the top of the designer to infer layout constraints.

    Screenshot of the magic wand icon

The text and graphical representation of your UI should now look like this:

<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <TextView
        android:id="@+id/hello"
        android:layout_width="366dp"
        android:layout_height="295dp"
        android:text="Hello World!"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintTop_toTopOf="parent"
        app:layout_constraintVertical_bias="0.925" />

    <Button
        android:id="@+id/button"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_marginStart="16dp"
        android:onClick="onSpeechButtonClicked"
        android:text="Button"
        app:layout_constraintBottom_toTopOf="@+id/hello"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent"
        app:layout_constraintVertical_bias="0.072" />

</android.support.constraint.ConstraintLayout>

Add sample code

  1. Open the source file MainActivity.java. Replace all the code in this file with the following.

    package com.microsoft.cognitiveservices.speech.samples.quickstart;
    
    import android.support.v4.app.ActivityCompat;
    import android.support.v7.app.AppCompatActivity;
    import android.os.Bundle;
    import android.util.Log;
    import android.view.View;
    import android.widget.TextView;
    
    import com.microsoft.cognitiveservices.speech.ResultReason;
    import com.microsoft.cognitiveservices.speech.SpeechConfig;
    import com.microsoft.cognitiveservices.speech.SpeechRecognitionResult;
    import com.microsoft.cognitiveservices.speech.SpeechRecognizer;
    
    import java.util.concurrent.Future;
    
    import static android.Manifest.permission.*;
    
    public class MainActivity extends AppCompatActivity {
    
        // Replace below with your own subscription key
        private static String speechSubscriptionKey = "YourSubscriptionKey";
        // Replace below with your own service region (e.g., "westus").
        private static String serviceRegion = "YourServiceRegion";
    
        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);
    
            // Note: we need to request the permissions
            int requestCode = 5; // unique code for the permission request
            ActivityCompat.requestPermissions(MainActivity.this, new String[]{RECORD_AUDIO, INTERNET}, requestCode);
        }
    
        public void onSpeechButtonClicked(View v) {
            TextView txt = (TextView) this.findViewById(R.id.hello); // 'hello' is the ID of your text view
    
            try {
                SpeechConfig config = SpeechConfig.fromSubscription(speechSubscriptionKey, serviceRegion);
                assert(config != null);
    
                SpeechRecognizer reco = new SpeechRecognizer(config);
                assert(reco != null);
    
                Future<SpeechRecognitionResult> task = reco.recognizeOnceAsync();
                assert(task != null);
    
                // Note: this will block the UI thread, so eventually, you want to
                //        register for the event (see full samples)
                SpeechRecognitionResult result = task.get();
                assert(result != null);
    
                if (result.getReason() == ResultReason.RecognizedSpeech) {
                    txt.setText(result.toString());
                }
                else {
                    txt.setText("Error recognizing. Did you update the subscription info?" + System.lineSeparator() + result.toString());
                }
    
                reco.close();
            } catch (Exception ex) {
                Log.e("SpeechSDKDemo", "unexpected " + ex.getMessage());
                assert(false);
            }
        }
    }
    
    • The onCreate method includes code that requests microphone and internet permissions, and initializes the native platform binding. Configuring the native platform bindings is only required once. It should be done early during application initialization.

    • The method onSpeechButtonClicked is, as noted earlier, the button click handler. A button press triggers speech-to-text transcription.

  2. In the same file, replace the string YourSubscriptionKey with your subscription key.

  3. Also replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).
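
Forgetting to replace the two placeholder strings is an easy mistake, and the sample only hints at it in its error message ("Did you update the subscription info?"). The following is a minimal sketch of a guard you could call before creating the SpeechConfig. The class SubscriptionInfo and the method isConfigured are hypothetical names introduced here for illustration; they are not part of the Speech SDK or the quickstart sample.

```java
public class SubscriptionInfo {

    // Placeholder values exactly as they appear in the quickstart's MainActivity.java.
    public static final String KEY_PLACEHOLDER = "YourSubscriptionKey";
    public static final String REGION_PLACEHOLDER = "YourServiceRegion";

    /**
     * Hypothetical helper: returns true only when both values are non-empty
     * and no longer equal to the quickstart placeholders.
     */
    public static boolean isConfigured(String key, String region) {
        if (key == null || region == null) {
            return false;
        }
        String k = key.trim();
        String r = region.trim();
        return !k.isEmpty()
                && !r.isEmpty()
                && !k.equals(KEY_PLACEHOLDER)
                && !r.equals(REGION_PLACEHOLDER);
    }

    public static void main(String[] args) {
        // Untouched placeholders are reported as not configured.
        System.out.println(isConfigured("YourSubscriptionKey", "YourServiceRegion"));
        // Any other non-empty pair passes this check (it does not validate the key itself).
        System.out.println(isConfigured("0123456789abcdef", "westus"));
    }
}
```

This check only catches the "placeholder left in place" case. Whether a key is actually valid for a region can only be determined by the service when recognition is attempted.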

Build and run the app

  1. Connect your Android device to your development PC. Make sure you have enabled development mode and USB debugging on the device.

  2. To build the application, press Ctrl+F9, or choose Build > Make Project from the menu bar.

  3. To launch the application, press Shift+F10, or choose Run > Run 'app'.

  4. In the deployment target window that appears, choose your Android device.

    Screenshot of the Select Deployment Target window

Press the button in the application to begin a speech recognition session. The next 15 seconds of English speech will be sent to the Speech Services and transcribed. The result appears in the Android application, and in the logcat window in Android Studio.

Screenshot of the Android application
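
The sample code notes that calling task.get() blocks the UI thread until recognition finishes. One common way around this is to run the blocking call on a worker thread and hand the result to a callback. The sketch below shows that pattern in plain Java so that it can run outside Android. BackgroundRecognition, ResultCallback, and runInBackground are hypothetical names for this illustration, and the fakeRecognition callable is a stand-in for reco.recognizeOnceAsync().get(), not a Speech SDK call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class BackgroundRecognition {

    /** Hypothetical callback type that receives the recognized text or an error message. */
    public interface ResultCallback {
        void onResult(String text);
    }

    /**
     * Runs a blocking call on the executor's worker thread and passes its result
     * (or an error message) to the callback. In an Android app, the callback would
     * typically switch back to the UI thread before touching views.
     */
    public static Future<?> runInBackground(ExecutorService executor,
                                            Callable<String> blockingCall,
                                            ResultCallback callback) {
        return executor.submit(() -> {
            String text;
            try {
                text = blockingCall.call();
            } catch (Exception ex) {
                text = "Error recognizing: " + ex.getMessage();
            }
            callback.onResult(text);
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // Stand-in for the blocking recognition call (assumption for this sketch).
        Callable<String> fakeRecognition = () -> {
            Thread.sleep(200);
            return "hello world";
        };

        Future<?> done = runInBackground(executor, fakeRecognition,
                text -> System.out.println("Recognized: " + text));
        System.out.println("Caller thread is free while recognition runs");

        // Wait here only so the demo does not exit before the worker finishes.
        done.get(5, TimeUnit.SECONDS);
        executor.shutdown();
    }
}
```

In the quickstart's onSpeechButtonClicked method, the same idea would mean submitting the recognition call to an executor and updating the TextView from the callback on the UI thread, so the button handler returns immediately.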

Next steps

See also