com.microsoft.cognitiveservices.speech

Classes

AudioDataStream

Represents an audio data stream, used for operating on audio data as a stream. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.7.0
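
A minimal sketch of reading synthesized audio through an AudioDataStream. The subscription key and region are placeholders, and running it requires the Speech SDK and a live service:

```java
import com.microsoft.cognitiveservices.speech.*;

public class AudioDataStreamExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig config =
            SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        // A null AudioConfig keeps the synthesized audio in memory instead of playing it.
        SpeechSynthesizer synthesizer = new SpeechSynthesizer(config, null);
        SpeechSynthesisResult result = synthesizer.SpeakText("Hello");
        AudioDataStream stream = AudioDataStream.fromResult(result);
        byte[] buffer = new byte[4096];
        long bytesRead;
        while ((bytesRead = stream.readData(buffer)) > 0) {
            System.out.println("Read " + bytesRead + " bytes of audio");
        }
        // close() releases the underlying native resources.
        stream.close();
        result.close();
        synthesizer.close();
        config.close();
    }
}
```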

AutoDetectSourceLanguageConfig

Represents the auto-detect source language configuration, used for specifying the possible source language candidates. Note: close() must be called in order to release underlying resources held by the object. Updated in version 1.13.0
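
A minimal sketch of configuring automatic source language detection; the candidate languages, key, and region are placeholders:

```java
import java.util.Arrays;
import com.microsoft.cognitiveservices.speech.*;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;

public class AutoDetectExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig speechConfig =
            SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        // Candidate source languages the service may detect.
        AutoDetectSourceLanguageConfig autoDetectConfig =
            AutoDetectSourceLanguageConfig.fromLanguages(Arrays.asList("en-US", "de-DE"));
        SpeechRecognizer recognizer = new SpeechRecognizer(
            speechConfig, autoDetectConfig, AudioConfig.fromDefaultMicrophoneInput());
        SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
        // Retrieve which candidate language was detected for this result.
        AutoDetectSourceLanguageResult languageResult =
            AutoDetectSourceLanguageResult.fromResult(result);
        System.out.println(languageResult.getLanguage() + ": " + result.getText());
        // Release underlying native resources.
        result.close();
        recognizer.close();
        autoDetectConfig.close();
        speechConfig.close();
    }
}
```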

AutoDetectSourceLanguageResult

Represents the result of automatically detecting source languages. Added in version 1.8.0

CancellationDetails

Contains detailed information about why a result was canceled.

ClassLanguageModel

Represents a ClassLanguageModel.

ClassLanguageModels are only usable in specific scenarios and are not generally available. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.7.0

Connection

Connection is a proxy class for managing the connection to the speech service of the specified Recognizer. By default, a Recognizer autonomously manages its connection to the service when needed. The Connection class provides additional methods for users to explicitly open or close a connection and to subscribe to connection status changes. The use of Connection is optional; it is intended for scenarios where fine-tuning of application behavior based on connection status is needed. Users can optionally call openConnection() to manually initiate a service connection before starting recognition on the Recognizer associated with this Connection. After starting a recognition, calling openConnection() or closeConnection() might fail; this will not impact the Recognizer or the ongoing recognition. The connection might drop for various reasons; the Recognizer will always try to re-establish the connection as required to guarantee ongoing operations. In all these cases, Connected/Disconnected events will indicate the change of the connection status. Note: close() must be called in order to release underlying resources held by the object. Updated in version 1.17.0.
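
A sketch of pre-opening a connection and observing status changes; key and region are placeholders, and a live service is assumed:

```java
import com.microsoft.cognitiveservices.speech.*;

public class ConnectionExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig config =
            SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        SpeechRecognizer recognizer = new SpeechRecognizer(config);
        Connection connection = Connection.fromRecognizer(recognizer);
        // Subscribe to connection status changes.
        connection.connected.addEventListener((s, e) ->
            System.out.println("Connected, session: " + e.getSessionId()));
        connection.disconnected.addEventListener((s, e) ->
            System.out.println("Disconnected, session: " + e.getSessionId()));
        // Pre-open the connection; 'false' indicates single-shot (not continuous) recognition.
        connection.openConnection(false);
        SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
        System.out.println("Recognized: " + result.getText());
        // Release underlying native resources.
        result.close();
        connection.close();
        recognizer.close();
        config.close();
    }
}
```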

ConnectionEventArgs

Defines payload for connection events like Connected/Disconnected. Added in version 1.2.0

ConnectionMessage

ConnectionMessage represents implementation specific messages sent to and received from the speech service. These messages are provided for debugging purposes and should not be used for production use cases with the Azure Cognitive Services Speech Service. Messages sent to and received from the Speech Service are subject to change without notice. This includes message contents, headers, payloads, ordering, etc. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.15.0.

ConnectionMessageEventArgs

Defines payload for Connection's MessageReceived events. Added in version 1.15.0.

Diagnostics

Native logging and other diagnostics.

Grammar

Represents a generic grammar used to assist in improving speech recognition accuracy. Note: close() must be called in order to release underlying resources held by the object.

GrammarList

Allows adding multiple grammars to a SpeechRecognizer to improve the accuracy of speech recognition.

GrammarLists are only usable in specific scenarios and are not generally available. Note: close() must be called in order to release underlying resources held by the object.

KeywordRecognitionEventArgs

Defines the content of keyword recognizing/recognized events.

KeywordRecognitionModel

Represents a keyword recognition model for recognizing when the user says a keyword to initiate further speech recognition. Note: Keyword spotting (KWS) functionality might work with any microphone type; official KWS support, however, is currently limited to the microphone arrays found in the Azure Kinect DK hardware or the Speech Devices SDK. Note: close() must be called in order to release underlying resources held by the object.

KeywordRecognitionResult

Defines result of keyword recognition.

KeywordRecognizer

Performs keyword recognition on the speech input. Note: close() must be called in order to release underlying resources held by the object.

NoMatchDetails

Contains detailed information for NoMatch recognition results.

PhraseListGrammar

Allows additions of new phrases to improve speech recognition.

Phrases added to the recognizer are effective at the start of the next recognition, or the next time the SpeechSDK must reconnect to the speech service. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.5.0
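
A minimal sketch of biasing recognition with a phrase list; the phrases, key, and region are illustrative placeholders:

```java
import com.microsoft.cognitiveservices.speech.*;

public class PhraseListExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig config =
            SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        SpeechRecognizer recognizer = new SpeechRecognizer(config);
        // Phrases take effect at the start of the next recognition.
        PhraseListGrammar phraseList = PhraseListGrammar.fromRecognizer(recognizer);
        phraseList.addPhrase("Contoso");
        phraseList.addPhrase("Jessie");
        SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
        System.out.println("Recognized: " + result.getText());
        // clear() removes all phrases; close() releases underlying native resources.
        phraseList.clear();
        phraseList.close();
        result.close();
        recognizer.close();
        config.close();
    }
}
```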

PronunciationAssessmentConfig

Represents pronunciation assessment configuration. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.14.0

PronunciationAssessmentResult

Represents the result of pronunciation assessment. Added in version 1.14.0

PropertyCollection

Represents a collection of properties and their values. Note: close() must be called in order to release underlying resources held by the object.

RecognitionEventArgs

Defines payload for recognition events like Speech Start/End Detected.

RecognitionResult

Contains detailed information about the result of a recognition operation.

Recognizer

Defines the base class Recognizer which mainly contains common event handlers. Note: close() must be called in order to release underlying resources held by the object.

SessionEventArgs

Defines payload for SessionStarted/Stopped events.

SourceLanguageConfig

Represents source language configuration used for specifying recognition source language. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.8.0

SpeechConfig

Speech configuration. Note: close() must be called in order to release underlying resources held by the object. Changed in version 1.7.0

SpeechRecognitionCanceledEventArgs

Defines payload of speech recognition canceled events.

SpeechRecognitionEventArgs

Defines the contents of speech recognizing/recognized events.

SpeechRecognitionResult

Defines result of speech recognition.

SpeechRecognizer

Performs speech recognition from microphone, file, or other audio input streams, and gets transcribed text as result. Note: close() must be called in order to release underlying resources held by the object.
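
A minimal single-shot recognition sketch from the default microphone; key and region are placeholders, and the Speech SDK plus a live service are assumed:

```java
import com.microsoft.cognitiveservices.speech.*;

public class RecognizeOnceExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig config =
            SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        config.setSpeechRecognitionLanguage("en-US");
        SpeechRecognizer recognizer = new SpeechRecognizer(config);
        // recognizeOnceAsync() returns a Future; get() blocks until the first utterance ends.
        SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
        if (result.getReason() == ResultReason.RecognizedSpeech) {
            System.out.println("Recognized: " + result.getText());
        } else if (result.getReason() == ResultReason.Canceled) {
            CancellationDetails details = CancellationDetails.fromResult(result);
            System.out.println("Canceled: " + details.getErrorDetails());
        }
        // Release underlying native resources.
        result.close();
        recognizer.close();
        config.close();
    }
}
```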

SpeechSynthesisBookmarkEventArgs

Defines contents of speech synthesis bookmark event. Added in version 1.16.0

SpeechSynthesisCancellationDetails

Contains detailed information about why a speech synthesis was canceled. Added in version 1.7.0

SpeechSynthesisEventArgs

Defines the contents of speech synthesis-related events. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.7.0

SpeechSynthesisResult

Contains detailed information about result of a speech synthesis operation. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.7.0

SpeechSynthesisVisemeEventArgs

Defines contents of speech synthesis viseme event. Added in version 1.16.0

SpeechSynthesisWordBoundaryEventArgs

Defines contents of speech synthesis word boundary event. Added in version 1.7.0

SpeechSynthesizer

Performs speech synthesis to speaker, file, or other audio output streams, and gets synthesized audio as result. Note: close() must be called in order to release underlying resources held by the object. Updated in version 1.16.0
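
A minimal text-to-speech sketch to the default speaker. In the Java SDK the speak methods are capitalized (SpeakText, SpeakTextAsync); key and region are placeholders:

```java
import com.microsoft.cognitiveservices.speech.*;

public class SynthesisExample {
    public static void main(String[] args) throws Exception {
        SpeechConfig config =
            SpeechConfig.fromSubscription("YourSubscriptionKey", "YourServiceRegion");
        // With no AudioConfig argument, output goes to the default speaker.
        SpeechSynthesizer synthesizer = new SpeechSynthesizer(config);
        SpeechSynthesisResult result = synthesizer.SpeakText("Hello, world!");
        if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
            System.out.println("Synthesized " + result.getAudioData().length + " bytes.");
        }
        // Release underlying native resources.
        result.close();
        synthesizer.close();
        config.close();
    }
}
```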

SynthesisVoicesResult

Contains detailed information about the retrieved synthesis voices list. Note: close() must be called in order to release underlying resources held by the object. Added in version 1.16.0

VoiceInfo

Contains detailed information about a synthesis voice. Note: close() must be called in order to release underlying resources held by the object. Updated in version 1.17.0

Enums

CancellationErrorCode

Defines error code in case that CancellationReason is Error. Added in version 1.1.0.

CancellationReason

Defines the possible reasons a recognition result might be canceled.

NoMatchReason

Defines the possible reasons a recognition result might not be recognized.

OutputFormat

Defines Speech Recognizer output formats.

ProfanityOption

Defines the profanity option for the response result. Added in version 1.5.0.

PronunciationAssessmentGradingSystem

Defines the point system for pronunciation score calibration; default value is FivePoint. Added in version 1.14.0

PronunciationAssessmentGranularity

Defines the pronunciation evaluation granularity; default value is Phoneme. Added in version 1.14.0

PropertyId

Defines property ids. Changed in version 1.8.0.

ResultReason

Defines the possible reasons a recognition result might be generated. Changed in version 1.7.0.

ServicePropertyChannel

Defines channels used to send service properties. Added in version 1.5.0.

SpeechSynthesisOutputFormat

Defines the possible speech synthesis output audio formats. Updated in version 1.17.0

StreamStatus

Defines the possible statuses of an audio data stream. Added in version 1.7.0

SynthesisVoiceGender

Defines synthesis voice gender. Added in version 1.17.0

SynthesisVoiceType

Defines synthesis voice type. Added in version 1.16.0