Microsoft Speech Platform

ISpVoice

The ISpVoice interface enables an application to perform speech synthesis operations. Applications can speak text strings and text files, or play audio files through this interface. All of these can be done synchronously or asynchronously.

A voice is an instance of a speech synthesis (text-to-speech, or TTS) engine that specifies a voice token to use for synthesizing speech from text. Applications can choose a specific TTS voice token using ISpVoice::SetVoice. If no voice token is selected, the TTS engine will use the default voice token, which is specified at the following registry key: HKEY_CURRENT_USER\Software\Microsoft\Speech Server\v11.0\Voices\DefaultTokenId.

Your applications can modify the characteristics of a voice (for example, rate, pitch, and volume) by embedding Speech Synthesis Markup Language (SSML) XML tags into the text to be spoken. See Use SSML to Create Prompts and Control TTS. Some attributes, like rate and volume, can be changed in real time using ISpVoice::SetRate and ISpVoice::SetVolume. Applications can set the priority of a voice using ISpVoice::SetPriority.
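
As an illustration of embedding SSML tags in the text to be spoken, the following sketch builds an SSML string that wraps text in a prosody element. The helper names EscapeXml and BuildProsodySsml are hypothetical and are not part of the Speech Platform API. The sketch assumes the attribute values passed in are valid SSML 1.0 prosody values (for example, "slow", "high", "loud") and it hard-codes en-US as the language.

```cpp
#include <string>

// Hypothetical helper: escape characters that are significant in XML text content.
std::wstring EscapeXml(const std::wstring& text)
{
    std::wstring out;
    out.reserve(text.size());
    for (wchar_t ch : text)
    {
        switch (ch)
        {
        case L'&': out += L"&amp;"; break;
        case L'<': out += L"&lt;";  break;
        case L'>': out += L"&gt;";  break;
        default:   out += ch;       break;
        }
    }
    return out;
}

// Hypothetical helper: wrap text in an SSML <prosody> element.
// The rate, pitch, and volume arguments are inserted unchanged, so the
// caller is responsible for supplying valid SSML attribute values.
std::wstring BuildProsodySsml(const std::wstring& text,
                              const std::wstring& rate,
                              const std::wstring& pitch,
                              const std::wstring& volume)
{
    std::wstring ssml =
        L"<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">";
    ssml += L"<prosody rate=\"" + rate + L"\" pitch=\"" + pitch +
            L"\" volume=\"" + volume + L"\">";
    ssml += EscapeXml(text);
    ssml += L"</prosody></speak>";
    return ssml;
}
```

The resulting string could then be passed as the text argument of ISpVoice::Speak. Escaping the text first keeps characters such as "&" and "<" from being interpreted as markup.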

ISpVoice inherits from the ISpEventSource interface. An ISpVoice object forwards events back to the application when the corresponding audio data has been rendered to the output device.

Associated Class IDs

The following class IDs (CLSID) may be used with this interface.

CLSID_SpVoice

See Application Object Classes for a complete CLSID listing for all interfaces.
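
The following minimal sketch shows one way to create a voice object from CLSID_SpVoice and speak a text string synchronously. The function name SpeakOnce is hypothetical. The sketch assumes the Speech Platform SDK header sapi.h is on the include path and that a voice is installed. The COM code is guarded so that the function compiles to a stub returning false on platforms other than Windows.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <sapi.h>   // Assumed: Speech Platform SDK header; link with ole32.lib and sapi.lib
#endif

// Hypothetical helper: speak one string with the default voice.
// Returns true only if the voice was created and the Speak call succeeded.
bool SpeakOnce(const wchar_t* text)
{
#ifdef _WIN32
    if (FAILED(::CoInitialize(NULL)))
        return false;

    ISpVoice* pVoice = NULL;
    HRESULT hr = ::CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL,
                                    IID_ISpVoice, (void**)&pVoice);
    if (SUCCEEDED(hr))
    {
        // SPF_DEFAULT makes the call synchronous: it returns after the
        // text has been spoken (or the synchronous timeout has elapsed).
        hr = pVoice->Speak(text, SPF_DEFAULT, NULL);
        pVoice->Release();
    }

    ::CoUninitialize();
    return SUCCEEDED(hr);
#else
    (void)text;     // No COM runtime on this platform; nothing to do.
    return false;
#endif
}
```

A caller would use it as SpeakOnce(L"Hello world"). Because no voice token is selected with SetVoice, the default voice token is used.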

Methods in Vtable Order

ISpVoice Methods	Description
ISpEventSource inherited methods	All methods of ISpEventSource are accessible from this interface.
SetOutput	Sets the current output object. A value of NULL may be used to select the default audio device.
GetOutputObjectToken	Retrieves the object token for the current audio output object.
GetOutputStream	Retrieves a pointer to the current output stream.
Pause	Pauses the voice at the nearest alert boundary and closes the output device.
Resume	Sets the output device to the RUN state and resumes rendering.
SetVoice	Sets the identity of the voice used for text synthesis.
GetVoice	Retrieves the object token that identifies the voice used in text synthesis.
Speak	Speaks the contents of a text string or file.
SpeakStream	Speaks the contents of a stream.
GetStatus	Retrieves the current rendering and event status associated with this ISpVoice instance.
Skip	Causes the voice to skip forward or backward the specified number of items within the text of the current speak call.
SetPriority	Sets the priority for the voice: normal, alert, or over.
GetPriority	Retrieves the current voice priority level.
SetAlertBoundary	Specifies which event should be used as the insertion point for alerts.
GetAlertBoundary	Retrieves the event that is currently being used as the insertion point for alerts.
SetRate	Sets the text rendering rate adjustment in real time.
GetRate	Retrieves the current text rendering rate adjustment.
SetVolume	Sets the synthesizer output volume level in real time.
GetVolume	Retrieves the current output volume level of the synthesizer.
WaitUntilDone	Blocks the caller until either the voice has completed speaking or the specified time interval has elapsed.
SetSyncSpeakTimeout	Sets the timeout interval, in milliseconds, after which synchronous Speak and SpeakStream calls to this instance of the voice will time out.
GetSyncSpeakTimeout	Retrieves the timeout interval for synchronous speech operations for this ISpVoice instance.
SpeakCompleteEvent	Returns an event handle that will be signaled when the voice has completed speaking all pending requests.
IsUISupported	Determines if the specified type of UI is supported.
DisplayUI	Displays the requested UI.
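
Several of the methods above are often used together. The following sketch, under the same assumptions as any Windows COM client (sapi.h available, a voice installed), adjusts rate and volume, queues text asynchronously with the SPF_ASYNC flag, and then blocks with WaitUntilDone. The function name SpeakAsyncAndWait and its parameters are hypothetical. On platforms other than Windows the function compiles to a stub that returns false.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <sapi.h>   // Assumed: Speech Platform SDK header; link with ole32.lib and sapi.lib
#endif

// Hypothetical helper: speak text asynchronously, then wait up to timeoutMs.
// Returns true only if speaking completed before the timeout elapsed.
bool SpeakAsyncAndWait(const wchar_t* text, long rate,
                       unsigned short volume, unsigned long timeoutMs)
{
#ifdef _WIN32
    if (FAILED(::CoInitialize(NULL)))
        return false;

    bool finished = false;
    ISpVoice* pVoice = NULL;
    HRESULT hr = ::CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL,
                                    IID_ISpVoice, (void**)&pVoice);
    if (SUCCEEDED(hr))
    {
        // Real-time adjustments; see the SetRate and SetVolume reference
        // pages for the supported ranges.
        pVoice->SetRate(rate);
        pVoice->SetVolume(volume);

        // SPF_ASYNC queues the text and returns immediately.
        hr = pVoice->Speak(text, SPF_ASYNC, NULL);
        if (SUCCEEDED(hr))
        {
            // S_OK means the voice finished; S_FALSE means the wait timed out.
            finished = (pVoice->WaitUntilDone(timeoutMs) == S_OK);
        }
        pVoice->Release();
    }

    ::CoUninitialize();
    return finished;
#else
    (void)text; (void)rate; (void)volume; (void)timeoutMs;
    return false;   // No COM runtime on this platform.
#endif
}
```

Waiting with a finite timeout lets the caller decide what to do if the voice is still speaking, for example by calling WaitUntilDone again or by continuing with other work.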
