Quickstart: Recognize speech in JavaScript in a browser using the Speech SDK

In this article, you'll learn how to create a website that uses the JavaScript binding of the Cognitive Services Speech SDK to transcribe speech to text. The application is based on the Speech SDK for JavaScript (download version 1.5.0).

Prerequisites

  • A subscription key for the Speech service. See "Try the Speech Services for free".
  • A PC or Mac with a working microphone.
  • A text editor.
  • A current version of Chrome, Microsoft Edge, or Safari.
  • Optionally, a web server that supports hosting PHP scripts.

Create a new website folder

Create a new, empty folder. If you want to host the sample on a web server, make sure that the web server can access the folder.

Unpack the Speech SDK for JavaScript into that folder

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

Download the Speech SDK as a .zip package and unpack it into the newly created folder. This unpacks two files: microsoft.cognitiveservices.speech.sdk.bundle.js and microsoft.cognitiveservices.speech.sdk.bundle.js.map. The latter file is optional and is useful for debugging into the SDK code.

Create an index.html page

Create a new file named index.html in the folder, and open it with a text editor.

  1. Create the following HTML skeleton:

    <html>
    <head>
       <title>Speech SDK JavaScript Quickstart</title>
    </head>
    <body>
     <!-- UI code goes here -->
    
     <!-- SDK reference goes here -->
    
     <!-- Optional authorization token request goes here -->
    
     <!-- Sample code goes here -->
    </body>
    </html>
    
  2. Add the following UI code to your file, below the first comment:

    <div id="warning">
      <h1 style="font-weight:500;">Speech Recognition Speech SDK not found (microsoft.cognitiveservices.speech.sdk.bundle.js missing).</h1>
    </div>
    
    <div id="content" style="display:none">
      <table width="100%">
        <tr>
          <td></td>
          <td><h1 style="font-weight:500;">Microsoft Cognitive Services Speech SDK JavaScript Quickstart</h1></td>
        </tr>
        <tr>
          <td align="right"><a href="https://docs.microsoft.com/azure/cognitive-services/speech-service/get-started" target="_blank">Subscription</a>:</td>
          <td><input id="subscriptionKey" type="text" size="40" value="subscription"></td>
        </tr>
        <tr>
          <td align="right">Region</td>
          <td><input id="serviceRegion" type="text" size="40" value="YourServiceRegion"></td>
        </tr>
        <tr>
          <td></td>
          <td><button id="startRecognizeOnceAsyncButton">Start recognition</button></td>
        </tr>
        <tr>
          <td align="right" valign="top">Results</td>
          <td><textarea id="phraseDiv" style="display: inline-block;width:500px;height:200px"></textarea></td>
        </tr>
      </table>
    </div>
    
  3. Add a reference to the Speech SDK, below the "SDK reference goes here" comment:

    <!-- Speech SDK reference sdk. -->
    <script src="microsoft.cognitiveservices.speech.sdk.bundle.js"></script>
    
  4. Below the "Sample code goes here" comment, wire up handlers for the recognition button, the recognition result, and the subscription-related fields defined by the UI code:

    <!-- Speech SDK USAGE -->
    <script>
      // status fields and start button in UI
      var phraseDiv;
      var startRecognizeOnceAsyncButton;
    
      // subscription key and region for speech services.
      var subscriptionKey, serviceRegion;
      var authorizationToken;
      var SpeechSDK;
      var recognizer;
    
      document.addEventListener("DOMContentLoaded", function () {
        startRecognizeOnceAsyncButton = document.getElementById("startRecognizeOnceAsyncButton");
        subscriptionKey = document.getElementById("subscriptionKey");
        serviceRegion = document.getElementById("serviceRegion");
        phraseDiv = document.getElementById("phraseDiv");
    
        startRecognizeOnceAsyncButton.addEventListener("click", function () {
          startRecognizeOnceAsyncButton.disabled = true;
          phraseDiv.innerHTML = "";
    
          // if we got an authorization token, use the token. Otherwise use the provided subscription key
          var speechConfig;
          if (authorizationToken) {
            speechConfig = SpeechSDK.SpeechConfig.fromAuthorizationToken(authorizationToken, serviceRegion.value);
          } else {
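            if (subscriptionKey.value === "" || subscriptionKey.value === "subscription") {
              alert("Please enter your Microsoft Cognitive Services Speech subscription key!");
              // re-enable the button so the user can retry after entering a key
              startRecognizeOnceAsyncButton.disabled = false;
              return;
            }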
            speechConfig = SpeechSDK.SpeechConfig.fromSubscription(subscriptionKey.value, serviceRegion.value);
          }
    
          speechConfig.speechRecognitionLanguage = "en-US";
          var audioConfig  = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
          recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
    
          recognizer.recognizeOnceAsync(
            function (result) {
              startRecognizeOnceAsyncButton.disabled = false;
              phraseDiv.innerHTML += result.text;
              window.console.log(result);
    
              recognizer.close();
              recognizer = undefined;
            },
            function (err) {
              startRecognizeOnceAsyncButton.disabled = false;
              phraseDiv.innerHTML += err;
              window.console.log(err);
    
              recognizer.close();
              recognizer = undefined;
            });
        });
    
        if (!!window.SpeechSDK) {
          SpeechSDK = window.SpeechSDK;
          startRecognizeOnceAsyncButton.disabled = false;
    
          document.getElementById('content').style.display = 'block';
          document.getElementById('warning').style.display = 'none';
    
          // in case we have a function for getting an authorization token, call it.
          if (typeof RequestAuthorizationToken === "function") {
              RequestAuthorizationToken();
          }
        }
      });
    </script>
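
The success callback above writes result.text regardless of how recognition ended. As a minimal sketch, the helper below distinguishes the common outcomes. The function name describeResult and its messages are hypothetical; the reasons argument is assumed to be the SDK's SpeechSDK.ResultReason enumeration, passed in so the helper can be tried without loading the SDK.

```javascript
// Hypothetical helper: turn a recognition result into display text.
// `reasons` is expected to be SpeechSDK.ResultReason; it is passed in
// so the function can be exercised without loading the SDK.
function describeResult(result, reasons) {
  switch (result.reason) {
    case reasons.RecognizedSpeech:
      // Speech was recognized; show the transcribed text.
      return result.text;
    case reasons.NoMatch:
      // Audio was received but nothing could be transcribed.
      return "(no speech could be recognized)";
    case reasons.Canceled:
      // Recognition stopped early, for example because of an invalid key.
      return "(recognition was canceled)";
    default:
      return "(unexpected result reason: " + result.reason + ")";
  }
}
```

In the sample page, the success callback could then append describeResult(result, SpeechSDK.ResultReason) to phraseDiv instead of result.text.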
    

Create the token source (optional)

If you want to host the web page on a web server, you can optionally provide a token source for your demo application. That way, your subscription key never leaves your server, and users can use speech capabilities without entering any authorization code themselves.

  1. Create a new file named token.php. This example assumes that your web server supports the PHP scripting language. Enter the following code:

    <?php
    header('Access-Control-Allow-Origin: ' . $_SERVER['SERVER_NAME']);
    
    // Replace with your own subscription key and service region (e.g., "westus").
    $subscriptionKey = 'YourSubscriptionKey';
    $region = 'YourServiceRegion';
    
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'https://' . $region . '.api.cognitive.microsoft.com/sts/v1.0/issueToken');
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, '{}');
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json', 'Ocp-Apim-Subscription-Key: ' . $subscriptionKey)); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    echo curl_exec($ch);
    ?>
    
  2. Edit the index.html file and add the following code below the "Optional authorization token request goes here" comment:

    <!-- Speech SDK Authorization token -->
    <script>
    // Note: Replace the URL with a valid endpoint to retrieve
    //       authorization tokens for your subscription.
    var authorizationEndpoint = "token.php";
    
    function RequestAuthorizationToken() {
      if (authorizationEndpoint) {
        var a = new XMLHttpRequest();
        a.open("GET", authorizationEndpoint);
        a.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
        // register the handler before sending the request
        a.onload = function() {
            // the token is decoded as a JWT; its second segment carries the region
            var token = JSON.parse(atob(this.responseText.split(".")[1]));
            serviceRegion.value = token.region;
            authorizationToken = this.responseText;
            subscriptionKey.disabled = true;
            subscriptionKey.value = "using authorization token (hit F5 to refresh)";
            // pass the object separately so the console shows its fields
            console.log("Got an authorization token:", token);
        }
        a.send("");
      }
    }
    </script>
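
The handler above passes the token's second segment straight to atob(). JWT segments are base64url-encoded, so a segment that happens to contain "-" or "_" would make atob() throw. The following is a minimal sketch of a more tolerant decoder; the name decodeTokenPayload is hypothetical and not part of the Speech SDK, and the sketch assumes the payload contains only ASCII characters, as a region name does.

```javascript
// Hypothetical helper: decode the payload (second segment) of a JWT.
// JWT segments use base64url, which replaces "+" and "/" with "-" and "_"
// and drops the "=" padding, so the segment is normalized before atob().
// Assumption: the payload is ASCII-only JSON.
function decodeTokenPayload(jwt) {
  var segment = jwt.split(".")[1];
  if (!segment) {
    throw new Error("not a JWT: missing payload segment");
  }
  // Map the URL-safe alphabet back to the standard base64 alphabet.
  var base64 = segment.replace(/-/g, "+").replace(/_/g, "/");
  // Restore the padding that base64url omits.
  while (base64.length % 4 !== 0) {
    base64 += "=";
  }
  return JSON.parse(atob(base64));
}
```

With this helper in place, the onload handler could read the region with decodeTokenPayload(this.responseText).region.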
    

Note

Authorization tokens have a limited lifetime. This simplified example does not show how to refresh authorization tokens automatically. As a user, you can manually reload the page (or press F5) to obtain a new token.
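
One way to avoid manual reloads is to request a fresh token on a timer. The sketch below is an illustration only: the name startTokenRefresh and the nine-minute interval are assumptions (the interval presumes tokens stay valid for about ten minutes, which you should confirm for your subscription). Because the sample creates a new recognizer on every button click and reads the global authorizationToken at that point, refreshing that variable is enough here.

```javascript
// Assumption: tokens are valid for about ten minutes, so refresh after nine.
var TOKEN_REFRESH_INTERVAL_MS = 9 * 60 * 1000;

// Hypothetical helper: call `requestToken` immediately and then periodically.
// `timers` is optional and exists so a stub can be supplied for testing;
// by default the global timer functions are used.
// Returns a function that stops the periodic refresh.
function startTokenRefresh(requestToken, intervalMs, timers) {
  timers = timers || {
    setInterval: function (fn, ms) { return setInterval(fn, ms); },
    clearInterval: function (handle) { clearInterval(handle); }
  };
  requestToken();
  var handle = timers.setInterval(requestToken, intervalMs);
  return function stop() {
    timers.clearInterval(handle);
  };
}
```

In the sample page, the direct call to RequestAuthorizationToken() could be replaced with startTokenRefresh(RequestAuthorizationToken, TOKEN_REFRESH_INTERVAL_MS).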

Build and run the sample locally

To launch the app, double-click the index.html file or open index.html in your web browser. It presents a simple GUI where you can enter your subscription key and region, and then trigger recognition using the microphone.

Note

This method doesn't work in the Safari browser. In Safari, the sample web page needs to be hosted on a web server; Safari doesn't allow websites loaded from a local file to use the microphone.
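
The page itself can warn about this situation before the user starts recognition. The following is a minimal sketch; the function name microphoneBlocker and its messages are hypothetical, and it assumes that microphone capture goes through the standard navigator.mediaDevices.getUserMedia browser API.

```javascript
// Hypothetical helper: return a message explaining why microphone capture
// is likely unavailable, or null when nothing obviously blocks it.
// `loc` is expected to be window.location and `nav` to be window.navigator.
function microphoneBlocker(loc, nav) {
  if (loc.protocol === "file:") {
    return "Page loaded from a local file; some browsers (such as Safari) " +
           "block microphone access here. Host the page on a web server.";
  }
  if (!nav.mediaDevices || typeof nav.mediaDevices.getUserMedia !== "function") {
    return "This browser does not expose navigator.mediaDevices.getUserMedia.";
  }
  return null;
}
```

For example, the DOMContentLoaded handler could call microphoneBlocker(window.location, window.navigator) and show any returned message in the results area.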

Build and run the sample via a web server

To launch the app, open your web browser and point it to the public URL where the folder is hosted, enter your region, and trigger recognition using the microphone. If a token source is configured, the page acquires a token from it.

Next steps