Quickstart: Recognize speech in Objective-C on iOS using the Speech SDK

In this article, you learn how to create an iOS app in Objective-C that uses the Cognitive Services Speech SDK to transcribe speech to text, either from the microphone or from a file with recorded audio.

Prerequisites

Before you get started, you need a subscription key for the Speech service and a macOS machine with Xcode installed.

Get the Speech SDK for iOS

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

The current version of the Cognitive Services Speech SDK is 1.6.0.

The Cognitive Services Speech SDK for iOS is currently distributed as a Cocoa framework. It can be downloaded from here. Download the file to your home directory.

Create an Xcode project

Start Xcode, and create a new project by clicking File > New > Project. In the template selection dialog, choose the "iOS Single View App" template.

In the dialogs that follow, make the following selections:

  1. Project Options dialog:
    1. Enter a name for the quickstart app, for example helloworld.
    2. Enter an appropriate organization name and organization identifier if you already have an Apple developer account. For testing purposes, you can just pick any name like testorg. To sign the app, you need a proper provisioning profile. Refer to the Apple developer site for details.
    3. Make sure Objective-C is chosen as the language for the project.
    4. Disable all checkboxes for tests and core data. (Screenshot: Project Settings)
  2. Select project directory:
    1. Choose your home directory to put the project in. This creates a helloworld directory in your home directory that contains all the files for the Xcode project.
    2. Disable the creation of a Git repo for this example project.
  3. Adjust the paths to the SDK in the project settings:
    1. In the General tab under the Embedded Binaries header, add the SDK library as a framework: Add embedded binaries > Add other... > Navigate to your home directory and choose the file MicrosoftCognitiveServicesSpeech.framework. This automatically adds the SDK library under the Linked Frameworks and Libraries header as well. (Screenshot: Added Framework)
    2. Go to the Build Settings tab and activate All settings.
    3. Add the directory $(SRCROOT)/.. to the Framework Search Paths setting under the Search Paths heading. (Screenshot: Framework Search Paths setting)

Set up the UI

The example app has a very simple UI: two buttons to start speech recognition, either from a file or from microphone input, and a text label to display the result. The UI is set up in the Main.storyboard part of the project. Open the XML view of the storyboard by right-clicking the Main.storyboard entry of the project tree and selecting Open As... > Source Code. Replace the autogenerated XML with this code:

<?xml version="1.0" encoding="UTF-8"?>
<document type="com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB" version="3.0" toolsVersion="14113" targetRuntime="iOS.CocoaTouch" propertyAccessControl="none" useAutolayout="YES" useTraitCollections="YES" useSafeAreas="YES" colorMatched="YES" initialViewController="BYZ-38-t0r">
    <device id="retina4_7" orientation="portrait">
        <adaptation id="fullscreen"/>
    </device>
    <dependencies>
        <deployment identifier="iOS"/>
        <plugIn identifier="com.apple.InterfaceBuilder.IBCocoaTouchPlugin" version="14088"/>
        <capability name="Safe area layout guides" minToolsVersion="9.0"/>
        <capability name="documents saved in the Xcode 8 format" minToolsVersion="8.0"/>
    </dependencies>
    <scenes>
        <!--View Controller-->
        <scene sceneID="tne-QT-ifu">
            <objects>
                <viewController id="BYZ-38-t0r" customClass="ViewController" sceneMemberID="viewController">
                    <view key="view" contentMode="scaleToFill" id="8bC-Xf-vdC">
                        <rect key="frame" x="0.0" y="0.0" width="375" height="667"/>
                        <autoresizingMask key="autoresizingMask" widthSizable="YES" heightSizable="YES"/>
                        <subviews>
                            <button opaque="NO" contentMode="scaleToFill" fixedFrame="YES" contentHorizontalAlignment="center" contentVerticalAlignment="center" buttonType="roundedRect" lineBreakMode="middleTruncation" translatesAutoresizingMaskIntoConstraints="NO" id="qFP-u7-47Q">
                                <rect key="frame" x="84" y="247" width="207" height="82"/>
                                <autoresizingMask key="autoresizingMask" flexibleMaxX="YES" flexibleMaxY="YES"/>
                                <accessibility key="accessibilityConfiguration" hint="Start speech recognition from file" identifier="recognize_file_button">
                                    <accessibilityTraits key="traits" button="YES" staticText="YES"/>
                                    <bool key="isElement" value="YES"/>
                                </accessibility>
                                <fontDescription key="fontDescription" type="system" pointSize="30"/>
                                <state key="normal" title="Recognize (File)"/>
                                <connections>
                                    <action selector="recognizeFromFileButtonTapped:" destination="BYZ-38-t0r" eventType="touchUpInside" id="Vfr-ah-nbC"/>
                                </connections>
                            </button>
                            <label opaque="NO" userInteractionEnabled="NO" contentMode="center" horizontalHuggingPriority="251" verticalHuggingPriority="251" fixedFrame="YES" text="Recognition result" textAlignment="center" lineBreakMode="tailTruncation" numberOfLines="5" baselineAdjustment="alignBaselines" adjustsFontSizeToFit="NO" translatesAutoresizingMaskIntoConstraints="NO" id="tq3-GD-ljB">
                                <rect key="frame" x="20" y="408" width="335" height="148"/>
                                <autoresizingMask key="autoresizingMask" flexibleMaxX="YES" flexibleMaxY="YES"/>
                                <accessibility key="accessibilityConfiguration" hint="The result of speech recognition" identifier="result_label">
                                    <accessibilityTraits key="traits" notEnabled="YES"/>
                                    <bool key="isElement" value="NO"/>
                                </accessibility>
                                <fontDescription key="fontDescription" type="system" pointSize="30"/>
                                <color key="textColor" red="0.5" green="0.5" blue="0.5" alpha="1" colorSpace="custom" customColorSpace="sRGB"/>
                                <nil key="highlightedColor"/>
                            </label>
                            <button opaque="NO" contentMode="scaleToFill" fixedFrame="YES" contentHorizontalAlignment="center" contentVerticalAlignment="center" buttonType="roundedRect" lineBreakMode="middleTruncation" translatesAutoresizingMaskIntoConstraints="NO" id="91d-Ki-IyR">
                                <rect key="frame" x="16" y="209" width="339" height="30"/>
                                <autoresizingMask key="autoresizingMask" flexibleMaxX="YES" flexibleMaxY="YES"/>
                                <accessibility key="accessibilityConfiguration" hint="Start speech recognition from microphone" identifier="recognize_microphone_button"/>
                                <fontDescription key="fontDescription" type="system" pointSize="30"/>
                                <state key="normal" title="Recognize (Microphone)"/>
                                <connections>
                                    <action selector="recognizeFromMicButtonTapped:" destination="BYZ-38-t0r" eventType="touchUpInside" id="2n3-kA-ySa"/>
                                </connections>
                            </button>
                        </subviews>
                        <color key="backgroundColor" red="1" green="1" blue="1" alpha="1" colorSpace="custom" customColorSpace="sRGB"/>
                        <viewLayoutGuide key="safeArea" id="6Tk-OE-BBY"/>
                    </view>
                    <connections>
                        <outlet property="recognitionResultLabel" destination="tq3-GD-ljB" id="kP4-o4-s0Q"/>
                    </connections>
                </viewController>
                <placeholder placeholderIdentifier="IBFirstResponder" id="dkx-z0-nzr" sceneMemberID="firstResponder"/>
            </objects>
            <point key="canvasLocation" x="135.19999999999999" y="132.68365817091455"/>
        </scene>
    </scenes>
</document>

Add the sample code

  1. Download the sample wav file by right-clicking the link and choosing Save target as.... Add the wav file to the project as a resource by dragging it from a Finder window into the root level of the Project view. Click Finish in the following dialog without changing the settings.

  2. Replace the contents of the autogenerated ViewController.m file with the following code:

    #import "ViewController.h"
    #import <MicrosoftCognitiveServicesSpeech/SPXSpeechApi.h>
    
    @interface ViewController () {
        NSString *speechKey;
        NSString *serviceRegion;
    }
    
    @property (weak, nonatomic) IBOutlet UIButton *recognizeFromFileButton;
    @property (weak, nonatomic) IBOutlet UIButton *recognizeFromMicButton;
    @property (weak, nonatomic) IBOutlet UILabel *recognitionResultLabel;
    - (IBAction)recognizeFromFileButtonTapped:(UIButton *)sender;
    - (IBAction)recognizeFromMicButtonTapped:(UIButton *)sender;
    @end
    
    @implementation ViewController
    
    - (void)viewDidLoad {
        [super viewDidLoad];
        speechKey = @"YourSubscriptionKey";
        serviceRegion = @"YourServiceRegion";
    }
    
    - (IBAction)recognizeFromFileButtonTapped:(UIButton *)sender {
        // recognizeOnce blocks until an utterance has been recognized, so run
        // recognition on a background queue to keep the UI responsive.
        dispatch_async(dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0), ^{
            [self recognizeFromFile];
        });
    }
    
    - (IBAction)recognizeFromMicButtonTapped:(UIButton *)sender {
        dispatch_async(dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0), ^{
            [self recognizeFromMicrophone];
        });
    }
    
    - (void)recognizeFromFile {
        NSBundle *mainBundle = [NSBundle mainBundle];
        NSString *weatherFile = [mainBundle pathForResource: @"whatstheweatherlike" ofType:@"wav"];
        NSLog(@"weatherFile path: %@", weatherFile);
        if (!weatherFile) {
            NSLog(@"Cannot find audio file!");
            [self updateRecognitionErrorText:(@"Cannot find audio file")];
            return;
        }
    
        SPXAudioConfiguration* weatherAudioSource = [[SPXAudioConfiguration alloc] initWithWavFileInput:weatherFile];
        if (!weatherAudioSource) {
            NSLog(@"Loading audio file failed!");
            [self updateRecognitionErrorText:(@"Audio Error")];
            return;
        }
    
        SPXSpeechConfiguration *speechConfig = [[SPXSpeechConfiguration alloc] initWithSubscription:speechKey region:serviceRegion];
        if (!speechConfig) {
            NSLog(@"Could not load speech config");
            [self updateRecognitionErrorText:(@"Speech Config Error")];
            return;
        }
    
        [self updateRecognitionStatusText:(@"Recognizing...")];
    
        SPXSpeechRecognizer* speechRecognizer = [[SPXSpeechRecognizer alloc] initWithSpeechConfiguration:speechConfig audioConfiguration:weatherAudioSource];
        if (!speechRecognizer) {
            NSLog(@"Could not create speech recognizer");
            [self updateRecognitionResultText:(@"Speech Recognition Error")];
            return;
        }
    
        // Blocks until the end of the first recognized utterance.
        SPXSpeechRecognitionResult *speechResult = [speechRecognizer recognizeOnce];
        if (SPXResultReason_Canceled == speechResult.reason) {
            SPXCancellationDetails *details = [[SPXCancellationDetails alloc] initFromCanceledRecognitionResult:speechResult];
            NSLog(@"Speech recognition was canceled: %@. Did you pass the correct key/region combination?", details.errorDetails);
            [self updateRecognitionErrorText:([NSString stringWithFormat:@"Canceled: %@", details.errorDetails ])];
        } else if (SPXResultReason_RecognizedSpeech == speechResult.reason) {
            NSLog(@"Speech recognition result received: %@", speechResult.text);
            [self updateRecognitionResultText:(speechResult.text)];
        } else {
            NSLog(@"There was an error.");
            [self updateRecognitionErrorText:(@"Speech Recognition Error")];
        }
    }
    
    - (void)recognizeFromMicrophone {
        SPXSpeechConfiguration *speechConfig = [[SPXSpeechConfiguration alloc] initWithSubscription:speechKey region:serviceRegion];
        if (!speechConfig) {
            NSLog(@"Could not load speech config");
            [self updateRecognitionErrorText:(@"Speech Config Error")];
            return;
        }
        
        [self updateRecognitionStatusText:(@"Recognizing...")];
        
        // With no explicit audio configuration, the recognizer uses the default microphone.
        SPXSpeechRecognizer* speechRecognizer = [[SPXSpeechRecognizer alloc] init:speechConfig];
        if (!speechRecognizer) {
            NSLog(@"Could not create speech recognizer");
            [self updateRecognitionResultText:(@"Speech Recognition Error")];
            return;
        }
        
        SPXSpeechRecognitionResult *speechResult = [speechRecognizer recognizeOnce];
        if (SPXResultReason_Canceled == speechResult.reason) {
            SPXCancellationDetails *details = [[SPXCancellationDetails alloc] initFromCanceledRecognitionResult:speechResult];
            NSLog(@"Speech recognition was canceled: %@. Did you pass the correct key/region combination?", details.errorDetails);
            [self updateRecognitionErrorText:([NSString stringWithFormat:@"Canceled: %@", details.errorDetails ])];
        } else if (SPXResultReason_RecognizedSpeech == speechResult.reason) {
            NSLog(@"Speech recognition result received: %@", speechResult.text);
            [self updateRecognitionResultText:(speechResult.text)];
        } else {
            NSLog(@"There was an error.");
            [self updateRecognitionErrorText:(@"Speech Recognition Error")];
        }
    }
    
    - (void)updateRecognitionResultText:(NSString *) resultText {
        dispatch_async(dispatch_get_main_queue(), ^{
            self.recognitionResultLabel.textColor = UIColor.blackColor;
            self.recognitionResultLabel.text = resultText;
        });
    }
    
    - (void)updateRecognitionErrorText:(NSString *) errorText {
        dispatch_async(dispatch_get_main_queue(), ^{
            self.recognitionResultLabel.textColor = UIColor.redColor;
            self.recognitionResultLabel.text = errorText;
        });
    }
    
    - (void)updateRecognitionStatusText:(NSString *) statusText {
        dispatch_async(dispatch_get_main_queue(), ^{
            self.recognitionResultLabel.textColor = UIColor.grayColor;
            self.recognitionResultLabel.text = statusText;
        });
    }
    
    @end
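
     Note that recognizeOnce is synchronous: it returns only after the first utterance has been recognized, which is why the button handlers dispatch the call onto a background queue. The SDK also exposes an asynchronous variant, recognizeOnceAsync:. Below is a minimal sketch (not part of the quickstart) of recognizeFromMicrophone rewritten with it, reusing the update... helper methods above and omitting the error checks for brevity:

    // Sketch only: asynchronous one-shot recognition.
    - (void)recognizeFromMicrophoneAsync {
        SPXSpeechConfiguration *speechConfig = [[SPXSpeechConfiguration alloc] initWithSubscription:speechKey region:serviceRegion];
        SPXSpeechRecognizer *speechRecognizer = [[SPXSpeechRecognizer alloc] init:speechConfig];
        [self updateRecognitionStatusText:(@"Recognizing...")];
        // The completion handler runs on an SDK thread; the update... helpers
        // already dispatch UI changes back to the main queue. In production,
        // keep a strong reference to the recognizer (for example, in a
        // property) so it outlives this method call.
        [speechRecognizer recognizeOnceAsync:^(SPXSpeechRecognitionResult *result) {
            if (SPXResultReason_RecognizedSpeech == result.reason) {
                [self updateRecognitionResultText:(result.text)];
            } else {
                [self updateRecognitionErrorText:(@"Speech Recognition Error")];
            }
        }];
    }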
    
  3. Replace the string YourSubscriptionKey with your subscription key.

  4. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).
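
     Hardcoding the key is fine for a quickstart, but avoid committing real credentials to source control. One common alternative is to read them from the app's Info.plist; the following is a minimal sketch, where SpeechSubscriptionKey and SpeechServiceRegion are hypothetical entries you would add to Info.plist yourself:

    - (void)viewDidLoad {
        [super viewDidLoad];
        // SpeechSubscriptionKey and SpeechServiceRegion are hypothetical
        // Info.plist entries; add them yourself before using this approach.
        NSDictionary *info = [[NSBundle mainBundle] infoDictionary];
        speechKey = info[@"SpeechSubscriptionKey"];
        serviceRegion = info[@"SpeechServiceRegion"];
        NSAssert(speechKey.length > 0 && serviceRegion.length > 0,
                 @"Add SpeechSubscriptionKey and SpeechServiceRegion to Info.plist");
    }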

  5. Add the request for microphone access. Right-click the Info.plist entry of the project tree and select Open As... > Source Code. Add the following lines into the <dict> section and then save the file.

    <key>NSMicrophoneUsageDescription</key>
    <string>Need microphone access for speech recognition from microphone.</string>
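
     iOS shows the permission prompt automatically the first time the app starts capturing audio from the microphone, using the description string above. If you'd rather trigger the prompt yourself, for example at app launch instead of on first recognition, a minimal sketch using AVAudioSession from AVFoundation:

    #import <AVFoundation/AVFoundation.h>
    
    // For example, at the end of viewDidLoad: ask for microphone access up
    // front; the block receives the user's choice.
    [[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
        if (!granted) {
            NSLog(@"Microphone access denied; recognition from the microphone will fail.");
        }
    }];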
    

Building and Running the Sample

  1. Make the debug output visible (View > Debug Area > Activate Console).

  2. Choose either the iOS simulator or an iOS device connected to your development machine as the destination for the app from the list in the Product > Destination menu.

  3. Build and run the example code in the iOS simulator by selecting Product > Run from the menu or clicking the Play button.

  4. After you click the "Recognize (File)" button in the app, you should see the contents of the audio file "What's the weather like?" on the lower part of the screen.

    (Screenshot: simulated iOS app)

  5. After you click the "Recognize (Microphone)" button in the app and say a few words, you should see the text you have spoken on the lower part of the screen.

Next steps