Quickstart: Recognize speech in Objective-C on macOS using the Speech SDK

In this article, you learn how to create a macOS app in Objective-C using the Cognitive Services Speech SDK to transcribe speech recorded from a microphone to text.

Prerequisites

Before you get started, you need:

  * A macOS machine with Xcode installed and access to a microphone.
  * A subscription key for the Speech service, which you plug into the sample code below.

Get the Speech SDK for macOS

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

The current version of the Cognitive Services Speech SDK is 1.6.0.

The Cognitive Services Speech SDK for macOS is distributed as a framework bundle. It can be used in Xcode projects as a CocoaPod, or downloaded from https://aka.ms/csspeech/macosbinary and linked manually. This guide uses a CocoaPod.

Create an Xcode project

Start Xcode, and create a new project by selecting File > New > Project. In the template selection dialog, choose the "Cocoa App" template.

In the dialogs that follow, make the following selections:

  1. Project options dialog
    1. Enter a name for the quickstart app, for example helloworld.
    2. Enter an appropriate organization name and organization identifier if you already have an Apple developer account. For testing purposes, you can pick any name, such as testorg. Note that to sign the app, you need a proper provisioning profile; refer to the Apple developer site for details.
    3. Make sure Objective-C is chosen as the language for the project.
    4. Disable the checkboxes for storyboards and for creating a document-based application; the simple UI for the sample app is created programmatically.
    5. Disable all checkboxes for tests and Core Data.
  2. Select project directory
    1. Choose a directory to put the project in, for example, your home directory. This creates a helloworld directory in the chosen location that contains all the files for the Xcode project.
    2. Disable the creation of a Git repo for this example project.
  3. Set the entitlements for network and microphone access. Click the app name in the first line of the overview on the left to get to the app configuration, and then choose the "Capabilities" tab.
    1. Enable the "App Sandbox" setting for the app.
    2. Enable the checkboxes for "Outgoing Connections" and "Microphone" access.
  4. The app also needs to declare use of the microphone in the Info.plist file. Click the file in the overview, and add the "Privacy - Microphone Usage Description" key with a value like "Microphone is needed for speech recognition". (A sketch of the resulting entries follows this list.)
  5. Close the Xcode project. You will reopen the project later, as a workspace, after setting up the CocoaPod.
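
For reference, toggling these settings makes Xcode write corresponding keys into the app's .entitlements file and Info.plist. The entries below are a sketch of the result; the exact microphone entitlement key can vary by Xcode version, so verify against what your Xcode actually writes rather than pasting this verbatim:

    <!-- helloworld.entitlements (written by Xcode when the capabilities above are enabled) -->
    <key>com.apple.security.app-sandbox</key>
    <true/>
    <!-- "Outgoing Connections (Client)" -->
    <key>com.apple.security.network.client</key>
    <true/>
    <!-- "Microphone"; some Xcode versions write com.apple.security.device.microphone instead -->
    <key>com.apple.security.device.audio-input</key>
    <true/>

    <!-- Info.plist entry for "Privacy - Microphone Usage Description" -->
    <key>NSMicrophoneUsageDescription</key>
    <string>Microphone is needed for speech recognition</string>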

Install the SDK as a CocoaPod

  1. Install the CocoaPods dependency manager as described in its installation instructions.
  2. Navigate to the directory of your sample app (helloworld). In that directory, place a text file named Podfile with the following content:
    target 'helloworld' do
      platform :osx, '10.13'
      pod 'MicrosoftCognitiveServicesSpeech-macOS', '~> 1.6.0'
    end
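
    The ~> 1.6.0 constraint is CocoaPods' "optimistic" version operator: it allows any 1.6.x release, but not 1.7 or later.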
    
  3. In a terminal, navigate to the helloworld directory and run the command pod install. This generates a helloworld.xcworkspace Xcode workspace that contains both the sample app and the Speech SDK as a dependency. You will use this workspace in the steps that follow.
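
For example, assuming you created the project in your home directory (adjust the path if you chose a different location):

    cd ~/helloworld
    pod install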

Add the sample code

  1. Open the helloworld.xcworkspace workspace in Xcode.
  2. Replace the contents of the autogenerated AppDelegate.m file with the following:
    #import "AppDelegate.h"
    #import <MicrosoftCognitiveServicesSpeech/SPXSpeechApi.h>
    
    
    @interface AppDelegate ()
    @property (weak) IBOutlet NSWindow *window;
    @property (weak) NSButton *button;
    @property (strong) NSTextField *label;
    @end
    
    @implementation AppDelegate
    
    - (void)applicationDidFinishLaunching:(NSNotification *)aNotification {
        self.button = [NSButton buttonWithTitle:@"Start Recognition" target:nil action:nil];
        [self.button setTarget:self];
        [self.button setAction:@selector(buttonPressed:)];
        [self.window.contentView addSubview:self.button];
    
        self.label = [[NSTextField alloc] initWithFrame:NSMakeRect(100, 100, 200, 17)];
        [self.label setStringValue:@"Press Button!"];
        [self.window.contentView addSubview:self.label];
    }
    
    - (void)buttonPressed:(NSButton *)button {
        // Creates an instance of a speech config with the specified subscription key and service region.
        // Replace these with your own subscription key and service region (for example, "westus").
        NSString *speechKey = @"YourSubscriptionKey";
        NSString *serviceRegion = @"YourServiceRegion";
    
        // Creates an audio configuration that captures from the default microphone (passing nil selects the default device).
        SPXAudioConfiguration *audioConfig = [[SPXAudioConfiguration alloc] initWithMicrophone:nil];
        SPXSpeechConfiguration *speechConfig = [[SPXSpeechConfiguration alloc] initWithSubscription:speechKey region:serviceRegion];
        SPXSpeechRecognizer *speechRecognizer = [[SPXSpeechRecognizer alloc] initWithSpeechConfiguration:speechConfig audioConfiguration:audioConfig];
    
        NSLog(@"Say something...");
    
        // Starts speech recognition and returns after a single utterance is recognized. The end of a
        // single utterance is determined by listening for silence at the end, or until a maximum of 15
        // seconds of audio is processed. The method returns the recognition result, including the
        // recognized text.
        // Note: Since recognizeOnce returns only a single utterance, it is suitable only for
        // single-shot recognition, like a command or query.
        // For long-running, multi-utterance recognition, use startContinuousRecognition instead.
        SPXSpeechRecognitionResult *speechResult = [speechRecognizer recognizeOnce];
    
        // Checks result.
        if (SPXResultReason_Canceled == speechResult.reason) {
            SPXCancellationDetails *details = [[SPXCancellationDetails alloc] initFromCanceledRecognitionResult:speechResult];
            NSLog(@"Speech recognition was canceled: %@. Did you pass the correct key/region combination?", details.errorDetails);
            [self.label setStringValue:([NSString stringWithFormat:@"Canceled: %@", details.errorDetails ])];
        } else if (SPXResultReason_RecognizedSpeech == speechResult.reason) {
            NSLog(@"Speech recognition result received: %@", speechResult.text);
            [self.label setStringValue:(speechResult.text)];
        } else {
            NSLog(@"There was an error.");
            [self.label setStringValue:(@"Speech Recognition Error")];
        }
    }
    
    - (void)applicationWillTerminate:(NSNotification *)aNotification {
        // Insert code here to tear down your application
    }
    
    - (BOOL)applicationShouldTerminateAfterLastWindowClosed:(NSApplication *)theApplication {
        return YES;
    }
    
    @end
    
  3. Replace the string YourSubscriptionKey with your subscription key.
  4. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).
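
The comment in buttonPressed: points to continuous recognition for long-running, multi-utterance input. As a rough sketch of what that looks like (assuming the same speechRecognizer setup as above; the event-handler and start/stop method names below come from the SDK's SPXSpeechRecognizer interface, so verify them against the headers in your installed pod):

    // Minimal sketch of continuous, multi-utterance recognition.
    // Assumes speechRecognizer is configured exactly as in buttonPressed: above.
    [speechRecognizer addRecognizedEventHandler:^(SPXSpeechRecognizer *recognizer, SPXSpeechRecognitionEventArgs *eventArgs) {
        // Called once per recognized utterance, with the final text for that utterance.
        NSLog(@"Recognized: %@", eventArgs.result.text);
    }];

    // startContinuousRecognition returns immediately; results arrive via the handler above.
    [speechRecognizer startContinuousRecognition];
    // ... speak one or more utterances ...
    [speechRecognizer stopContinuousRecognition];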

Build and run the sample

  1. Make the debug output visible (View > Debug Area > Activate Console).
  2. Build and run the example code by selecting Product > Run from the menu or clicking the Play button.
  3. When you run the app for the first time, you should be prompted to give the app access to your computer's microphone. After you click the button and say a few words, the text you have spoken should appear in the lower part of the window.

Next steps