Quickstart: Recognize speech with the Speech SDK for Node.js

This article shows you how to create a Node.js project by using the JavaScript binding of the Speech SDK for Azure Cognitive Services to transcribe speech to text. The application is based on the Speech SDK for JavaScript.

Prerequisites

Create a new project

Create a new folder and initialize the project:

npm init -f

This command initializes the package.json file with default values. You'll probably want to edit this file later.
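For reference, `npm init -f` produces a package.json similar to the following; the exact field values depend on your folder name and npm version:

```json
{
  "name": "quickstart",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}
```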

Install the Speech SDK

Add the Speech SDK to your Node.js project:

npm install microsoft-cognitiveservices-speech-sdk

This command downloads and installs the latest version of the Speech SDK and any required prerequisites from npmjs. The SDK installs in the node_modules directory inside your project folder.

Use the Speech SDK

Create a new file named index.js in the folder, and open this file with a text editor.

Note

In Node.js, the Speech SDK doesn't support the microphone or the File data type. Both are only supported in browsers. Instead, use the Stream interface to the Speech SDK, either through AudioInputStream.createPushStream() or AudioInputStream.createPullStream().

In this example, we use the PushAudioInputStream interface.
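If you prefer the pull model instead, you pass a callback object to AudioInputStream.createPullStream(). The following stdlib-only sketch mocks that contract so you can see its shape without the SDK installed: a read(dataBuffer) method that fills the given ArrayBuffer and returns the number of bytes written (0 signals end of stream), plus a close() method. The function name makePullCallback is illustrative, not an SDK API.

```javascript
"use strict";

// Mock of the pull-stream callback contract for illustration only. In real
// code, you pass an object with this shape to
// sdk.AudioInputStream.createPullStream().
function makePullCallback(sourceBuffer) {
  var position = 0;
  return {
    // Fill as much of dataBuffer as possible; return bytes written.
    read: function (dataBuffer) {
      var view = new Uint8Array(dataBuffer);
      var bytes = Math.min(view.length, sourceBuffer.length - position);
      for (var i = 0; i < bytes; i++) {
        view[i] = sourceBuffer[position + i];
      }
      position += bytes;
      return bytes;
    },
    close: function () {
      position = sourceBuffer.length;
    }
  };
}
```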

Add this JavaScript code:

"use strict";

// pull in the required packages.
var sdk = require("microsoft-cognitiveservices-speech-sdk");
var fs = require("fs");

// replace with your own subscription key,
// service region (e.g., "westus"), and
// the name of the file you want to run
// through the speech recognizer.
var subscriptionKey = "YourSubscriptionKey";
var serviceRegion = "YourServiceRegion"; // e.g., "westus"
var filename = "YourAudioFile.wav"; // 16000 Hz, Mono

// create the push stream we need for the speech sdk.
var pushStream = sdk.AudioInputStream.createPushStream();

// open the file and push it to the push stream.
fs.createReadStream(filename).on('data', function(chunk) {
  // copy exactly this chunk's bytes into a fresh ArrayBuffer; the
  // underlying chunk.buffer may be a larger, pooled allocation.
  pushStream.write(chunk.buffer.slice(chunk.byteOffset, chunk.byteOffset + chunk.byteLength));
}).on('end', function() {
  pushStream.close();
});

// we are done with the setup
console.log("Now recognizing from: " + filename);

// now create the audio-config pointing to our stream and
// the speech config specifying the language.
var audioConfig = sdk.AudioConfig.fromStreamInput(pushStream);
var speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, serviceRegion);

// setting the recognition language to English.
speechConfig.speechRecognitionLanguage = "en-US";

// create the speech recognizer.
var recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);

// start the recognizer and wait for a result.
recognizer.recognizeOnceAsync(
  function (result) {
    console.log(result);

    recognizer.close();
    recognizer = undefined;
  },
  function (err) {
    console.trace("err - " + err);

    recognizer.close();
    recognizer = undefined;
  });

Run the sample

Before you run the app, replace YourSubscriptionKey, YourServiceRegion, and YourAudioFile.wav with your own values. Then run it by calling this command:

node index.js

It triggers a recognition by using the provided filename and presents the output on the console.

This is the output when you run index.js after you update the subscription key and use the file whatstheweatherlike.wav:

SpeechRecognitionResult {
  "privResultId": "9E30EEBD41AC4571BB77CF9164441F46",
  "privReason": 3,
  "privText": "What's the weather like?",
  "privDuration": 15900000,
  "privOffset": 300000,
  "privErrorDetails": null,
  "privJson": {
    "RecognitionStatus": "Success",
    "DisplayText": "What's the weather like?",
    "Offset": 300000,
    "Duration": 15900000
  },
  "privProperties": null
}
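The dump above shows the result's private fields; in your own code you would normally read the public accessors such as result.text and result.reason instead. The following sketch takes the ResultReason enum as a parameter so it stays testable without the SDK; in real code you would pass sdk.ResultReason from microsoft-cognitiveservices-speech-sdk, and the numeric values used in any standalone test are illustrative assumptions:

```javascript
"use strict";

// Sketch: map a recognition result to a display string using the public
// accessors. ResultReason is injected so the function has no SDK dependency.
function describeResult(result, ResultReason) {
  if (result.reason === ResultReason.RecognizedSpeech) {
    return "RECOGNIZED: " + result.text;
  }
  if (result.reason === ResultReason.NoMatch) {
    return "NOMATCH: speech could not be recognized";
  }
  return "UNEXPECTED: reason=" + result.reason;
}
```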

Install and use the Speech SDK with Visual Studio Code

You can also run the sample from Visual Studio Code. Follow these steps to install, open, and run the quickstart:

  1. Start Visual Studio Code. Select Open Folder, and then browse to the quickstart folder.

    Open Folder

  2. Open a terminal in Visual Studio Code.

    Terminal window

  3. Run npm to install the dependencies.

    npm install

  4. Now you're ready to open index.js and set a breakpoint.

    index.js with a breakpoint at line 16

  5. To start debugging, either press F5 or select Debug > Start Debugging from the menu.

    Debug menu

  6. When a breakpoint is hit, you can inspect the call stack and variables.

    Debugger

  7. Any output shows in the debug console window.

    Debug console

Next steps