microsoft-cognitiveservices-speech-sdk package

Classes

Agent
utils
CertCheckAgent
ConsoleLoggingListener
FileAudioSource
MicAudioSource
PcmRecorder
ProxyInfo
ReplayableAudioNode
RestConfigBase
RestMessageAdapter
WebsocketConnection
WebsocketMessageAdapter
AddedLmIntent
AgentConfig

Represents the JSON used in the agent.config message sent to the speech service.

CognitiveSubscriptionKeyAuthentication
CognitiveTokenAuthentication
ConnectionFactoryBase
DialogConnectionFactory
DialogServiceAdapter
DialogServiceTurnState
DialogServiceTurnStateManager
DynamicGrammarBuilder

Responsible for building the object to be sent to the speech service to support dynamic grammars.

EnumTranslation
HeaderNames
AuthInfo
IntentConnectionFactory
IntentServiceRecognizer
QueryParameterNames
ConnectingToServiceEvent
ListeningStartedEvent
RecognitionEndedEvent
RecognitionStartedEvent
RecognitionTriggeredEvent
SpeechRecognitionEvent
Context
Device
OS
RecognizerConfig
SpeechServiceConfig
System
RequestSession
ActivityPayloadResponse
DetailedSpeechPhrase
IntentResponse
SimpleSpeechPhrase
SpeechDetected
SpeechHypothesis
SynthesisAudioMetadata
TranslationHypothesis
TranslationPhrase
TranslationSynthesisEnd
TurnStatusResponsePayload
ServiceRecognizerBase
ServiceTelemetryListener
SpeakerIdMessageAdapter

Implements methods for the speaker recognition classes: sends requests to the endpoint and parses the response into the expected format.

SpeakerRecognitionConfig
SpeechConnectionFactory
SpeechConnectionMessage
SpeechContext

Represents the JSON used in the speech.context message sent to the speech service. The dynamic grammar is always refreshed from the encapsulated dynamic grammar object.

SpeechServiceRecognizer
SpeechSynthesisConnectionFactory
SynthesisAdapterBase
SynthesisContext

Represents the JSON used in the synthesis.context message sent to the speech service. The dynamic grammar is always refreshed from the encapsulated dynamic grammar object.

ConnectingToSynthesisServiceEvent
SpeechSynthesisEvent
SynthesisStartedEvent
SynthesisTriggeredEvent
SynthesisTurn
SynthesizerConfig
TranscriberConnectionFactory
ConversationConnectionConfig
ConversationConnectionFactory

Create a connection to the Conversation Translator websocket for sending instant messages and commands, and for receiving translated messages. The conversation must already have been started or joined.

ConversationConnectionMessage
ConversationManager
ConversationRequestSession

Placeholder class for the Conversation Request Session. Based off RequestSession. TODO: define what telemetry is required.

ConversationServiceAdapter

The service adapter handles sending and receiving messages to the Conversation Translator websocket.

ConversationReceivedTranslationEventArgs
LockRoomEventArgs
MuteAllEventArgs
ParticipantAttributeEventArgs
ParticipantEventArgs
ParticipantsListEventArgs
InternalParticipants

Users participating in the conversation

ConversationRecognizerFactory
ConversationTranslatorRecognizer

Sends messages to the Conversation Translator websocket and listens for incoming events containing websocket messages. Based off the recognizers in the SDK folder.

ConversationWebsocketMessageFormatter

Based off WebsocketMessageFormatter. The messages for Conversation Translator have some variations from the Speech messages.

CommandResponsePayload
ParticipantPayloadResponse
ParticipantsListPayloadResponse
SpeechResponsePayload
TextResponsePayload
TranscriberRecognizer
TranscriptionServiceRecognizer
TranslationConnectionFactory
TranslationServiceRecognizer
WebsocketMessageFormatter
AudioSourceErrorEvent
AudioSourceEvent
AudioSourceInitializingEvent
AudioSourceOffEvent
AudioSourceReadyEvent
AudioStreamNodeAttachedEvent
AudioStreamNodeAttachingEvent
AudioStreamNodeDetachedEvent
AudioStreamNodeErrorEvent
AudioStreamNodeEvent
BackgroundEvent
ChunkedArrayBufferStream
ConnectionClosedEvent
ConnectionErrorEvent
ConnectionEstablishErrorEvent
ConnectionEstablishedEvent
ConnectionEvent
ConnectionMessageReceivedEvent
ConnectionMessageSentEvent
ConnectionStartEvent
ServiceEvent
ConnectionMessage

ConnectionMessage represents implementation specific messages sent to and received from the speech service. These messages are provided for debugging purposes and should not be used for production use cases with the Azure Cognitive Services Speech Service. Messages sent to and received from the Speech Service are subject to change without notice. This includes message contents, headers, payloads, ordering, etc. Added in version 1.11.0.

ConnectionOpenResponse
DialogEvent
SendingAgentContextMessageEvent
ArgumentNullError

The error that is thrown when an argument passed in is null.

InvalidOperationError

The error that is thrown when an invalid operation is performed in the code.

ObjectDisposedError

The error that is thrown when an object is disposed.

EventSource
Events
List
OCSPCacheEntryExpiredEvent
OCSPCacheEntryNeedsRefreshEvent
OCSPCacheFetchErrorEvent
OCSPCacheHitEvent
OCSPCacheMissEvent
OCSPCacheUpdateErrorEvent
OCSPCacheUpdateNeededEvent
OCSPCacheUpdatehCompleteEvent
OCSPDiskCacheHitEvent
OCSPDiskCacheStoreEvent
OCSPEvent
OCSPMemoryCacheHitEvent
OCSPMemoryCacheStoreEvent
OCSPResponseRetrievedEvent
OCSPStapleReceivedEvent
OCSPVerificationFailedEvent
OCSPWSUpgradeStartedEvent
PlatformEvent
Deferred
PromiseResult
PromiseResultEventSource
Sink
Queue
RawWebsocketMessage
RiffPcmEncoder
Stream
Timeout
ActivityReceivedEventArgs

Defines contents of received message/events.

AudioConfig

Represents audio input configuration used for specifying what type of input to use (microphone, file, stream).

AudioOutputConfigImpl
AudioFileWriter
AudioInputStream

Represents audio input stream used for custom audio input configurations.

PullAudioInputStream
PushAudioInputStream

Represents memory backed push audio input stream used for custom audio input configurations.

AudioOutputStream

Represents audio output stream used for custom audio output configurations.

PullAudioOutputStream

Represents memory backed pull audio output stream used for custom audio output configurations.

PushAudioOutputStream
AudioStreamFormat

Represents audio stream format used for custom audio input configurations.

BaseAudioPlayer

Base audio player class. Currently plays only PCM audio.

PullAudioInputStreamCallback

An abstract base class that defines callback methods (read() and close()) for custom audio input streams.

PushAudioOutputStreamCallback

An abstract base class that defines callback methods (write() and close()) for custom audio output streams.

SpeakerAudioDestination

Represents the speaker playback audio destination, which only works in the browser. Note: the SDK will try to use Media Source Extensions to play audio. The MP3 format is better supported on Microsoft Edge, Chrome, and Safari (desktop), so it is better to specify an MP3 format for playback.

AutoDetectSourceLanguageConfig

Language auto detect configuration.

AutoDetectSourceLanguageResult

Output format

BotFrameworkConfig

Class that defines configurations for the dialog service connector object for using a Bot Framework backend.

CancellationDetails

Contains detailed information about why a result was canceled.

CancellationDetailsBase

Contains detailed information about why a result was canceled.

CancellationEventArgsBase

Defines content of a CancellationEvent.

Connection

Connection is a proxy class for managing the connection to the speech service of the specified Recognizer. By default, a Recognizer autonomously manages its connection to the service when needed. The Connection class provides additional methods for users to explicitly open or close a connection and to subscribe to connection status changes. The use of Connection is optional, and is mainly for scenarios where fine-tuning application behavior based on connection status is needed. Users can optionally call Open() to manually set up a connection in advance, before starting recognition on the Recognizer associated with this Connection. If the Recognizer needs to connect to or disconnect from the service, it will set up or shut down the connection independently. In this case the Connection will be notified of the change in connection status via Connected/Disconnected events. Added in version 1.2.1.

ConnectionEventArgs

Defines payload for connection events like Connected/Disconnected. Added in version 1.2.0.

ConnectionMessageImpl
ConnectionMessageEventArgs
ConversationTranscriptionCanceledEventArgs

Defines content of a RecognitionErrorEvent.

CustomCommandsConfig

Class that defines configurations for the dialog service connector object for using a CustomCommands backend.

DialogServiceConfig

Class that defines base configurations for dialog service connector

DialogServiceConfigImpl

Dialog Service configuration.

DialogServiceConnector

Dialog Service Connector

IntentRecognitionCanceledEventArgs

Define payload of intent recognition canceled result events.

IntentRecognitionEventArgs

Intent recognition result event arguments.

IntentRecognitionResult

Intent recognition result.

IntentRecognizer

Intent recognizer.

KeywordRecognitionModel

Represents a keyword recognition model for recognizing when the user says a keyword to initiate further speech recognition.

LanguageUnderstandingModel

Language understanding model

NoMatchDetails

Contains detailed information for NoMatch recognition results.

PhraseListGrammar

Allows additions of new phrases to improve speech recognition. Phrases added to the recognizer are effective at the start of the next recognition, or the next time the SpeechSDK must reconnect to the speech service.

PronunciationAssessmentConfig

Pronunciation assessment configuration.

PronunciationAssessmentResult

Pronunciation assessment results.

PropertyCollection

Represents collection of properties and their values.

RecognitionEventArgs

Defines payload for session events like Speech Start/End Detected

RecognitionResult

Defines result of speech recognition.

Recognizer

Defines the base class Recognizer which mainly contains common event handlers.

ServiceEventArgs

Defines payload for any Service message event. Added in version 1.9.0.

SessionEventArgs

Defines content for session events like SessionStarted/Stopped, SoundStarted/Stopped.

SourceLanguageConfig

Source Language configuration.

SpeakerIdentificationModel

Defines the SpeakerIdentificationModel class for Speaker Recognition. The model contains a set of profiles against which to identify speaker(s).

SpeakerRecognitionCancellationDetails
SpeakerRecognitionResult

Output format

SpeakerRecognizer

Defines the SpeakerRecognizer class for Speaker Recognition. Handles user requests for Voice Profile operations (e.g. createProfile, deleteProfile).

SpeakerVerificationModel

Defines the SpeakerVerificationModel class for Speaker Recognition. The model contains a profile against which to verify a speaker.

SpeechConfig

Speech configuration.

SpeechConfigImpl
SpeechRecognitionCanceledEventArgs
ConversationTranscriptionEventArgs

Defines contents of speech recognizing/recognized event.

SpeechRecognitionEventArgs

Defines contents of speech recognizing/recognized event.

SpeechRecognitionResult

Defines result of speech recognition.

SpeechRecognizer

Performs speech recognition from microphone, file, or other audio input streams, and gets transcribed text as result.

SpeechSynthesisBookmarkEventArgs

Defines contents of speech synthesis bookmark event.

SpeechSynthesisEventArgs

Defines contents of speech synthesis events.

SpeechSynthesisResult

Defines result of speech synthesis.

SpeechSynthesisVisemeEventArgs

Defines contents of speech synthesis viseme event.

SpeechSynthesisWordBoundaryEventArgs

Defines contents of speech synthesis word boundary event.

SpeechSynthesizer

Defines the class SpeechSynthesizer for text to speech. Updated in version 1.16.0

SynthesisRequest
SpeechTranslationConfig

Speech translation configuration.

Conversation
ConversationImpl
ConversationCommon
ConversationExpirationEventArgs
ConversationParticipantsChangedEventArgs
ConversationTranscriber
ConversationTranslationCanceledEventArgs
ConversationTranslationEventArgs
ConversationTranslationResult
ConversationTranslator

Join, leave or connect to a conversation.

Participant
User
TranslationRecognitionCanceledEventArgs

Define payload of speech recognition canceled result events.

TranslationRecognitionEventArgs

Translation text result event arguments.

TranslationRecognitionResult

Translation text result.

TranslationRecognizer

Translation recognizer

TranslationSynthesisEventArgs

Translation Synthesis event arguments

TranslationSynthesisResult

Defines translation synthesis result, i.e. the voice output of the translated text in the target language.

Translations

Represents collection of parameters and their values.

TurnStatusReceivedEventArgs

Defines contents of received message/events.

VoiceProfile

Defines Voice Profile class for Speaker Recognition

VoiceProfileClient

Defines the VoiceProfileClient class for Speaker Recognition. Handles user requests for Voice Profile operations (e.g. createProfile, deleteProfile).

VoiceProfileEnrollmentCancellationDetails
VoiceProfileEnrollmentResult

Output format

VoiceProfileCancellationDetails
VoiceProfileResult

Output format

Interfaces

Request
Response
VerifyOptions
IRecorder
IRequestOptions

HTTP request helper

IRestParams
IRestResponse
IAgentConfig
IDynamicGrammar

Top level grammar node

IDynamicGrammarGeneric

Generic phrase based dynamic grammars

IDynamicGrammarGroup

Group of Dynamic Grammar items of a common type.

IDynamicGrammarPeople
IAuthentication
IConnectionFactory
ISynthesisConnectionFactory
ISpeechConfigAudio
ISpeechConfigAudioDevice
IActivityPayloadResponse
IDetailedSpeechPhrase
IPhrase
IIntentEntity
IIntentResponse
ISingleIntent
IPrimaryLanguage
ISimpleSpeechPhrase
ISpeechDetected
ISpeechHypothesis
ISynthesisAudioMetadata
ISynthesisMetadata
ITranslationHypothesis
ITranslationPhrase
ITranslationSynthesisEnd
ITurnStatusResponsePayload
IMetric
ITelemetry
IResultErrorDetails
ISpeechEndDetectedResult
ITranslation
ITranslations
ITurnStart
ITurnStartContext
ISynthesisResponse
ISynthesisResponseAudio
ISynthesisResponseContext
ConversationRecognizer

Recognizer for handling Conversation Translator websocket messages

IChangeNicknameCommand

Change nickname command

IClientMessage

Base message command

ICommandMessage

Command message

IConversationResponseError

Error returned from the Conversation Translator websocket

IConversationResponseErrorMessage

Error message returned from the Conversation Translator websocket

IEjectParticipantCommand

Remove participant command

IInstantMessageCommand

Text message command

IInternalConversation

Internal conversation data

IInternalParticipant

The user who is participating in the conversation.

ILockConversationCommand

Lock command

IMuteAllCommand

Mute all command

IMuteCommand

Mute participant command

IResponse

HTTP response helper

ICommandResponsePayload

Defines the payload for incoming websocket commands

IParticipantPayloadResponse

Defines the payload for incoming participant

IParticipantsListPayloadResponse

Defines the payload for incoming list of participants

ISpeechResponsePayload
ITextResponsePayload
ITranslationCommandMessage
ITranslationResponsePayload

Defines the payload for incoming translation messages

IAudioDestination
IAudioSource
IAudioStreamNode
IConnection
IDetachable
INumberDictionary
IStringDictionary
IDisposable
IErrorMessages
IEventListener
IEventSource
ITimer
IWebsocketMessageFormatter
IList
IDeferred
IQueue
IStreamChunk
IWorkerTimers
IPlayer

Represents audio player interface to control the audio playback, such as pause, resume, etc.

CancellationEventArgs
ConversationHandler
ConversationTranscriptionHandler

A conversation transcriber that enables a connected experience where conversations can be logged with each participant recognized.

IConversationTranslator

A conversation translator that enables a connected experience where participants can use their own devices to see everyone else's recognitions and IMs in their own languages. Participants can also speak and send IMs to others.

ConversationInfo
IConversation

Manages conversations. Added in version 1.4.0

IParticipant

Represents a participant in a conversation. Added in version 1.4.0

IUser

Represents a user in a conversation. Added in version 1.4.0

TranscriptionParticipant
VoiceSignature
IEnrollmentResultDetails

Type Aliases

Callback

Enums

RestRequestType
RecognitionCompletionStatus
RecognitionMode
SpeechResultFormat
connectivity
type
MessageDataStreamType
RecognitionStatus
MetadataType
SynthesisServiceType
TranslationStatus

Defines translation status.

MessageType
ConnectionState
EventType
PromiseState
AudioFormatTag
CancellationErrorCode

Defines error code in case that CancellationReason is Error. Added in version 1.1.0.

CancellationReason

Defines the possible reasons a recognition result might be canceled.

NoMatchReason

Defines the possible reasons a recognition result might not be recognized.

OutputFormat

Define Speech Recognizer output formats.

ProfanityOption

Profanity option. Added in version 1.7.0.

PronunciationAssessmentGradingSystem

Defines the point system for pronunciation score calibration; default value is FivePoint. Added in version 1.15.0

PronunciationAssessmentGranularity

Defines the pronunciation evaluation granularity; default value is Phoneme. Added in version 1.15.0

PropertyId

Defines speech property ids.

ResultReason

Defines the possible reasons a recognition result might be generated.

ServicePropertyChannel

Defines channels used to pass property settings to service. Added in version 1.7.0.

SpeakerRecognitionResultType
SpeechSynthesisOutputFormat

Define speech synthesis audio output formats.

SpeechState
ParticipantChangedReason
VoiceProfileType

Output format

Functions

request("get" | "post" | "delete", string, any, any, IRequestOptions, any)
check(any, (error: Error, res: any) => void)
verify(VerifyOptions, (error: string, res: any) => void)
PromiseToEmptyCallback<T>(Promise<T>, Callback, Callback)
extractHeaderValue(string, string)
marshalPromiseToCallbacks<T>(Promise<T>, (value: T) => void, (error: string) => void)

Function Details

request("get" | "post" | "delete", string, any, any, IRequestOptions, any)

function request(method: "get" | "post" | "delete", url: string, queryParams: any, body: any, options: IRequestOptions, callback: any)

Parameters

method

"get" | "post" | "delete"

url

string

queryParams

any

body

any

options

IRequestOptions

callback

any

Returns

any

check(any, (error: Error, res: any) => void)

function check(options: any, cb: (error: Error, res: any) => void)

Parameters

options

any

cb

(error: Error, res: any) => void

Returns

any

verify(VerifyOptions, (error: string, res: any) => void)

function verify(options: VerifyOptions, cb: (error: string, res: any) => void)

Parameters

options

VerifyOptions

cb

(error: string, res: any) => void

PromiseToEmptyCallback<T>(Promise<T>, Callback, Callback)

function PromiseToEmptyCallback<T>(promise: Promise<T>, cb?: Callback, err?: Callback)

Parameters

promise

Promise<T>

cb

Callback

err

Callback

extractHeaderValue(string, string)

function extractHeaderValue(headerKey: string, headers: string)

Parameters

headerKey

string

headers

string

Returns

string
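
The signature above suggests a lookup of one header's value inside a raw header string. The following is a minimal sketch of that idea, not the SDK's implementation. The function name extractHeaderValueSketch, the case-insensitive match, and the empty-string result for a missing header are all assumptions made for illustration.

```typescript
// Hypothetical stand-in for illustration only; it does not call or reproduce the SDK.
function extractHeaderValueSketch(headerKey: string, headers: string): string {
    const wanted = headerKey.trim().toLowerCase();
    // Raw header blocks separate lines with CRLF; a bare LF is tolerated as well.
    for (const line of headers.split(/\r?\n/)) {
        const colon = line.indexOf(":");
        if (colon < 0) {
            continue; // not a "name: value" line
        }
        const name = line.slice(0, colon).trim().toLowerCase();
        if (name === wanted) {
            // Everything after the first colon is the value, so values may contain colons.
            return line.slice(colon + 1).trim();
        }
    }
    return ""; // assumption: a missing header yields an empty string
}

const rawHeaders = "Content-Type: application/json\r\nX-RequestId: abc123\r\n";
console.log(extractHeaderValueSketch("x-requestid", rawHeaders)); // prints "abc123"
```

Splitting on the first colon only keeps values such as dates and URLs intact.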

marshalPromiseToCallbacks<T>(Promise<T>, (value: T) => void, (error: string) => void)

function marshalPromiseToCallbacks<T>(promise: Promise<T>, cb?: (value: T) => void, err?: (error: string) => void)

Parameters

promise

Promise<T>

cb

(value: T) => void

err

(error: string) => void
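
The parameter list above pairs a promise with an optional success callback and an optional error callback. A bridge of that shape might look like the sketch below. It is not the SDK's implementation: the name marshalSketch is hypothetical, and returning a Promise<void> (so callers can wait for the callbacks to finish) and converting the rejection reason with String() are assumptions.

```typescript
// Hypothetical stand-in for illustration only; it does not call or reproduce the SDK.
function marshalSketch<T>(
    promise: Promise<T>,
    cb?: (value: T) => void,
    err?: (error: string) => void,
): Promise<void> {
    return promise.then(
        (value: T): void => {
            try {
                if (cb) {
                    cb(value);
                }
            } catch (e) {
                // Assumption: an exception thrown by the success callback goes to err.
                if (err) {
                    err("Unhandled error in callback: " + String(e));
                }
            }
        },
        (reason: unknown): void => {
            if (err) {
                err(String(reason)); // the error callback takes a string
            }
        },
    );
}

// Usage: one promise that resolves and one that rejects.
const log: string[] = [];
const done: Promise<void> = Promise.all([
    marshalSketch(Promise.resolve(42), (v) => { log.push("ok:" + v); }, (e) => { log.push("err:" + e); }),
    marshalSketch(Promise.reject<number>("boom"), (v) => { log.push("ok:" + v); }, (e) => { log.push("err:" + e); }),
]).then((): void => {
    console.log(log.join(", "));
});
```

Both callbacks are optional, so a caller that only cares about failures can pass undefined for cb.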