com.microsoft.cognitiveservices.speech

Classes

AudioDataStream

Represents an audio data stream, used for operating on audio data as a stream. Added in version 1.7.0.
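
A common way to obtain an AudioDataStream is from a synthesis result. The sketch below assumes the Speech SDK is on the classpath; the subscription key, region, and output file name are placeholders.

```java
import com.microsoft.cognitiveservices.speech.AudioDataStream;
import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.SpeechSynthesisResult;
import com.microsoft.cognitiveservices.speech.SpeechSynthesizer;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;

public class AudioDataStreamExample {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials; substitute your own key and service region.
        SpeechConfig config = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        // Pass a null AudioConfig so nothing is played; we only want the stream.
        SpeechSynthesizer synthesizer = new SpeechSynthesizer(config, (AudioConfig) null);

        SpeechSynthesisResult result = synthesizer.SpeakTextAsync("Hello, world.").get();
        AudioDataStream stream = AudioDataStream.fromResult(result);

        // Either save the whole stream to a file...
        stream.saveToWavFile("output.wav");
        // ...or read it incrementally: long read = stream.readData(new byte[4096]);

        stream.close();
        synthesizer.close();
    }
}
```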

AutoDetectSourceLanguageConfig

Represents the auto-detect source language configuration, used for specifying the possible source language candidates. Updated in version 1.13.0.

AutoDetectSourceLanguageResult

Represents the result of automatically detecting the source language. Added in version 1.8.0.

CancellationDetails

Contains detailed information about why a result was canceled.

ClassLanguageModel

Represents a ClassLanguageModel.

ClassLanguageModels are only usable in specific scenarios and are not generally available.

Added in version 1.7.0

Connection

Connection is a proxy class for managing the connection to the speech service of the specified Recognizer. By default, a Recognizer autonomously manages its connection to the service when needed. The Connection class provides additional methods for users to explicitly open or close a connection and to subscribe to connection status changes. Use of Connection is optional; it is intended for scenarios where fine-tuning of application behavior based on connection status is needed.

Users can optionally call openConnection() to manually initiate a service connection before starting recognition on the Recognizer associated with this Connection. After starting a recognition, calling openConnection() or closeConnection() might fail; this will not impact the Recognizer or the ongoing recognition. The connection might drop for various reasons; the Recognizer will always try to re-establish the connection as required to guarantee ongoing operations. In all these cases, Connected/Disconnected events will indicate the change of connection status.

Note: close() must be called in order to relinquish underlying resources held by the object. Added in version 1.2.0.
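
The pre-connect pattern described above can be sketched as follows. This is a minimal illustration, not a complete application; the key and region are placeholders, and the Speech SDK must be on the classpath.

```java
import com.microsoft.cognitiveservices.speech.Connection;
import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.SpeechRecognizer;

public class ConnectionExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig config = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        SpeechRecognizer recognizer = new SpeechRecognizer(config); // default microphone input
        Connection connection = Connection.fromRecognizer(recognizer);

        // Subscribe to connection status changes.
        connection.connected.addEventListener((s, e) ->
            System.out.println("Connected: " + e.getSessionId()));
        connection.disconnected.addEventListener((s, e) ->
            System.out.println("Disconnected: " + e.getSessionId()));

        // Optionally pre-connect before the first recognition to reduce initial latency.
        connection.openConnection(false); // false = single-shot (not continuous) recognition

        System.out.println(recognizer.recognizeOnceAsync().get().getText());

        connection.closeConnection();
        connection.close(); // relinquish underlying resources
        recognizer.close();
    }
}
```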

ConnectionEventArgs

Defines payload for connection events like Connected/Disconnected. Added in version 1.2.0

Grammar

Represents a generic grammar used to assist in improving speech recognition accuracy.

GrammarList

Allows adding multiple grammars to a SpeechRecognizer to improve the accuracy of speech recognition.

GrammarLists are only usable in specific scenarios and are not generally available.

KeywordRecognitionEventArgs

Defines the content of keyword recognizing/recognized events.

KeywordRecognitionModel

Represents a keyword recognition model for recognizing when the user says a keyword to initiate further speech recognition. Note: keyword spotting (KWS) functionality might work with any microphone type; official KWS support, however, is currently limited to the microphone arrays found in the Azure Kinect DK hardware or the Speech Devices SDK.

KeywordRecognitionResult

Defines result of keyword recognition.

KeywordRecognizer

Performs keyword recognition on the speech input. Note: close() must be called in order to relinquish underlying resources held by the object.
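
A single-shot keyword recognition can be sketched as below. The model file path is a placeholder for a keyword model created in Speech Studio, and the Speech SDK must be on the classpath.

```java
import com.microsoft.cognitiveservices.speech.KeywordRecognitionModel;
import com.microsoft.cognitiveservices.speech.KeywordRecognitionResult;
import com.microsoft.cognitiveservices.speech.KeywordRecognizer;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;

public class KeywordExample {
    public static void main(String[] args) throws Exception {
        // "keyword.table" is a placeholder path to a keyword model file.
        KeywordRecognitionModel model = KeywordRecognitionModel.fromFile("keyword.table");
        AudioConfig audioConfig = AudioConfig.fromDefaultMicrophoneInput();
        KeywordRecognizer recognizer = new KeywordRecognizer(audioConfig);

        // Blocks until the keyword is spotted in the audio input.
        KeywordRecognitionResult result = recognizer.recognizeOnceAsync(model).get();
        System.out.println("Keyword heard: " + result.getText());

        recognizer.close(); // relinquish underlying resources
        audioConfig.close();
    }
}
```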

NoMatchDetails

Contains detailed information for NoMatch recognition results.

PhraseListGrammar

Allows additions of new phrases to improve speech recognition.

Phrases added to the recognizer become effective at the start of the next recognition, or the next time the Speech SDK must reconnect to the speech service. Added in version 1.5.0.
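
Adding phrases can be sketched as below, assuming an existing SpeechRecognizer; the credentials and phrase strings are placeholders.

```java
import com.microsoft.cognitiveservices.speech.PhraseListGrammar;
import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.SpeechRecognizer;

public class PhraseListExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig config = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        SpeechRecognizer recognizer = new SpeechRecognizer(config);

        // Bias recognition toward domain-specific terms the service might otherwise miss.
        PhraseListGrammar phraseList = PhraseListGrammar.fromRecognizer(recognizer);
        phraseList.addPhrase("Contoso");
        phraseList.addPhrase("Jessie");
        // phraseList.clear() removes all phrases when the context changes.

        System.out.println(recognizer.recognizeOnceAsync().get().getText());
        recognizer.close();
    }
}
```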

PropertyCollection

Represents a collection of properties and their values.

RecognitionEventArgs

Defines the payload for recognition events like Speech Start/End Detected.

RecognitionResult

Contains detailed information about the result of a recognition operation.

Recognizer

Defines the base class Recognizer which mainly contains common event handlers.

SessionEventArgs

Defines payload for SessionStarted/Stopped events.

SourceLanguageConfig

Represents the source language configuration, used for specifying the recognition source language. Added in version 1.8.0.

SpeechConfig

Speech configuration. Changed in version 1.7.0.

SpeechRecognitionCanceledEventArgs

Defines payload of speech recognition canceled events.

SpeechRecognitionEventArgs

Defines the contents of speech recognizing/recognized events.

SpeechRecognitionResult

Defines result of speech recognition.

SpeechRecognizer

Performs speech recognition from a microphone, file, or other audio input stream, and gets transcribed text as a result. Note: close() must be called in order to relinquish underlying resources held by the object.
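
A minimal single-shot recognition from the default microphone can be sketched as below; the key and region are placeholders and the Speech SDK must be on the classpath.

```java
import com.microsoft.cognitiveservices.speech.ResultReason;
import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.SpeechRecognitionResult;
import com.microsoft.cognitiveservices.speech.SpeechRecognizer;

public class RecognizeOnceExample {
    public static void main(String[] args) throws Exception {
        // Placeholder credentials; substitute your own key and service region.
        SpeechConfig config = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        SpeechRecognizer recognizer = new SpeechRecognizer(config); // default microphone input

        // Single-shot recognition: returns after the first utterance ends.
        SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
        if (result.getReason() == ResultReason.RecognizedSpeech) {
            System.out.println("Recognized: " + result.getText());
        }

        recognizer.close(); // relinquish underlying native resources
        config.close();
    }
}
```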

SpeechSynthesisCancellationDetails

Contains detailed information about why a speech synthesis was canceled. Added in version 1.7.0

SpeechSynthesisEventArgs

Defines the contents of speech synthesis-related events. Added in version 1.7.0.

SpeechSynthesisResult

Contains detailed information about the result of a speech synthesis operation. Added in version 1.7.0.

SpeechSynthesisWordBoundaryEventArgs

Defines the contents of the speech synthesis word boundary event. Added in version 1.7.0.

SpeechSynthesizer

Performs speech synthesis to a speaker, file, or other audio output stream, and gets synthesized audio as a result. Added in version 1.7.0.
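
Synthesis to the default speaker can be sketched as below; the key and region are placeholders and the Speech SDK must be on the classpath.

```java
import com.microsoft.cognitiveservices.speech.ResultReason;
import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.SpeechSynthesisResult;
import com.microsoft.cognitiveservices.speech.SpeechSynthesizer;

public class SynthesisExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig config = SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        // With no AudioConfig argument, audio plays through the default speaker.
        SpeechSynthesizer synthesizer = new SpeechSynthesizer(config);

        SpeechSynthesisResult result = synthesizer.SpeakTextAsync("Hello, world.").get();
        if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
            System.out.println("Synthesized " + result.getAudioData().length + " bytes of audio.");
        }

        result.close();
        synthesizer.close(); // relinquish underlying native resources
    }
}
```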

Enums

CancellationErrorCode

Defines the error code in case CancellationReason is Error. Added in version 1.1.0.

CancellationReason

Defines the possible reasons a recognition result might be canceled.

NoMatchReason

Defines the possible reasons a recognition result might not be recognized.

OutputFormat

Defines Speech Recognizer output formats.

ProfanityOption

Defines the profanity option for the response result. Added in version 1.5.0.

PropertyId

Defines property IDs. Changed in version 1.8.0.

ResultReason

Defines the possible reasons a recognition result might be generated. Changed in version 1.7.0.

ServicePropertyChannel

Defines channels used to send service properties. Added in version 1.5.0.

SpeechSynthesisOutputFormat

Defines the possible speech synthesis output audio formats. Added in version 1.7.0.

StreamStatus

Defines the possible statuses of an audio data stream. Added in version 1.7.0.