Supported Features in VoiceXML Applications

This content is no longer actively maintained. It is provided as is, for anyone who may still be using these technologies, with no warranties or claims of accuracy with regard to the most recent product version or service release.

Speech Server supports VoiceXML application features, VoiceXML elements, and a variety of properties.

Application Features

Description | Support Level | Standard | Note

Referencing scripts dynamically

Full

2.1

Referencing grammars dynamically

Full

2.1

Recording user utterances while attempting recognition

Full

2.1

VoiceXML 2.1 extends the <field>, <initial>, <link>, <menu>, <record>, and <transfer> elements to allow the interpreter to conditionally enable recording while simultaneously gathering input from the user.

Platform-specific commands

Full

The following platform-specific commands are supported: Help, Cancel, and Exit.

Built-in grammars

Full

The following built-in grammar types are supported: Boolean, date, digits, currency, number, phone, and time.

Keyword grammars

Full

These grammars are created using Speech Grammar Editor and comply with the W3C SRGS specification.

Prompt Engine Markup Language (PEML)

Full

PEML is a markup language that specifies the text input used by the prompt engine to produce speech output.

Bridge and consultation transfers

Bridge transfer is not supported in Speech Server.
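As an illustration of the three VoiceXML 2.1 features listed above (dynamic script references, dynamic grammar references, and recording during recognition), the following is a minimal sketch based on the VoiceXML 2.1 specification rather than a tested Speech Server application. The file names scripts/helpers.js and grammars/cities.grxml and the variable names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Hypothetical URIs; a real application would compute these at run time. -->
  <var name="scriptUri" expr="'scripts/helpers.js'"/>
  <var name="grammarUri" expr="'grammars/cities.grxml'"/>

  <!-- VoiceXML 2.1: the script URI comes from an ECMAScript expression. -->
  <script srcexpr="scriptUri"/>

  <form id="getCity">
    <!-- VoiceXML 2.1: keep a recording of the utterance while recognizing it. -->
    <property name="recordutterance" value="true"/>

    <field name="city">
      <prompt>Which city?</prompt>
      <!-- VoiceXML 2.1: the grammar URI comes from an ECMAScript expression. -->
      <grammar srcexpr="grammarUri"/>
      <filled>
        <!-- The recording is exposed through the city$.recording shadow variable. -->
        <var name="utterance" expr="city$.recording"/>
        <prompt>You said <value expr="city"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

The recordutterance property is set at form scope here so that it applies only to this dialog; it could also be set at document or application scope.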

VoiceXML Elements

Element | Support Level | Standard | Note

<assign>

Full

2.0

<audio>

Full

2.0

<block>

Full

2.0

<catch>

Full

2.0

<choice>

Full

2.0

<clear>

Full

2.0

<data>

Limited

2.1

Ignored by Speech Server.

<disconnect>

Full

2.0

The 2.1 namelist attribute is supported.

<else>

Full

2.0

<elseif>

Full

2.0

<enumerate>

Full

2.0

<error>

Full

2.0

<exit>

Full

2.0

<field>

Full

2.0

<filled>

Full

2.0

<foreach>

Full

2.1

<form>

Full

2.0

<goto>

Full

2.0

<grammar>

Full

2.1

Only SRGS (Speech Recognition Grammar Specification) or CFG (context-free grammar) grammars are supported. See the following example.

<grammar src="Grammar.grxml#rule1" type="application/srgs+cfg"/>

ABNF (Augmented Backus-Naur Form) grammars are not supported.

<help>

Full

2.0

<if>

Full

2.0

<initial>

Full

2.0

<link>

Full

2.0

<log>

Full

2.0

<mark>

Partial

2.1

<mark> is recognized by Speech Server and the VoiceXML interpreter. However, it is not supported in the TIM (Telephony Interface Manager)/TIMC (Telephony Interface Manager Connector) scenario.

<menu>

Full

2.0

<meta>

Full

2.0

<metadata>

Full

2.0

<noinput>

Full

2.0

<nomatch>

Full

2.0

<object>

Limited

2.0

Ignored by Speech Server; no action is taken.

<option>

Full

2.0

<param>

Full

2.0

The <object> tag has limited support.

<prompt>

Full

2.0

<property>

Full

2.1

The <data>-related properties have no support and are ignored by Speech Server. The <recording>-related property is supported.

<record>

Full

2.0

<reprompt>

Full

2.0

<return>

Full

2.0

<script>

Full

2.1

The 2.1 srcexpr= attribute is supported.

<subdialog>

Full

2.0

<submit>

Full

2.0

<throw>

Full

2.0

<transfer>

Partial

2.1

Only blind and consultation transfers are supported. Follow the 2.1 specification and use the type= attribute.

<value>

Full

2.0

<var>

Full

2.0

<vxml>

Full

2.0
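To show how some of the 2.1-level entries in the table above fit together, here is a minimal sketch that uses <foreach>, a consultation <transfer> with the type attribute, and <disconnect> with the namelist attribute. It follows the VoiceXML 2.1 specification and has not been verified against Speech Server; the telephone number and variable names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <var name="departments" expr="['sales', 'support', 'billing']"/>
  <var name="outcome" expr="'none'"/>

  <form id="route">
    <block>
      <prompt>You can reach these departments.</prompt>
      <!-- VoiceXML 2.1: iterate over an ECMAScript array. -->
      <foreach item="dept" array="departments">
        <prompt><value expr="dept"/></prompt>
      </foreach>
    </block>

    <!-- Blind and consultation are the transfer types the table lists as supported. -->
    <transfer name="call" dest="tel:+15555550100" type="consultation">
      <prompt>Please hold while I try to connect you.</prompt>
      <filled>
        <!-- Runs when the attempt ends without connecting, for example busy or noanswer. -->
        <assign name="outcome" expr="call"/>
        <prompt>Sorry, the call could not be connected.</prompt>
        <!-- VoiceXML 2.1: pass variables back to the platform on disconnect. -->
        <disconnect namelist="outcome"/>
      </filled>
    </transfer>
  </form>
</vxml>
```

When a consultation transfer succeeds, the interpreter throws connection.disconnect.transfer instead of filling the form item, so the <filled> block above only handles the unsuccessful outcomes.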

Speech Recognizer Properties

For more information about the following properties, see W3C VoiceXML 2.0 Introduction.

Property | Support Level | Note

confidencelevel

Full

Specifies the speech recognition confidence level, which is a float value from 0.0 to 1.0. Results are rejected and a nomatch event is thrown when application.lastresult$.confidence is below this threshold. A value of 0.0 means minimum confidence is needed for a recognition, and a value of 1.0 means maximum confidence is needed. The value is a Real Number Designation. The default value is 0.5.

sensitivity

Full

Sets the sensitivity level. A value of 1.0 means that the speech recognizer is highly sensitive to quiet input. A value of 0.0 means it is least sensitive to noise. The value is a Real Number Designation. The default value is 0.5.

speedvsaccuracy

Full

Provides a hint specifying the desired balance of speed versus accuracy. A value of 0.0 means fastest recognition. A value of 1.0 means best accuracy. The value is a Real Number Designation. The default value is 0.5.

completetimeout

Full

Specifies the length of silence required following user speech before the speech recognizer finalizes a result, either accepting it or throwing a nomatch event. The complete time-out is used when the speech is a complete match of an active grammar. By contrast, the incomplete time-out is used when the speech is an incomplete match to an active grammar. A long complete time-out value delays the result completion and therefore makes the computer's response slower. A short complete time-out can lead to an utterance being broken up inappropriately. Reasonable complete time-out values are typically in the range of 0.3 to 1.0 seconds. The value is a Time Designation. The default is 0.5 seconds.

incompletetimeout

Full

Specifies the required length of silence following user speech after which a speech recognizer finalizes a result. The incomplete time-out applies when the speech prior to the silence is an incomplete match of all active grammars. In this case, when the time-out is triggered, the partial result is rejected with a nomatch event. The incomplete time-out also applies when the speech prior to the silence is a complete match of an active grammar, but where it is possible to speak further and still match the grammar. By contrast, the complete time-out is used when the speech is a complete match to an active grammar and no further words can be spoken. A long incomplete time-out value delays the result completion and therefore makes the computer's response slower. A short incomplete time-out can lead to an utterance being broken up inappropriately. The incomplete time-out is usually longer than the complete time-out to allow users to pause mid-utterance, for example, to breathe. The default is 1.0 second.

maxspeechtimeout

Full

Specifies the maximum duration of user speech. If this time elapses before the user stops speaking, the maxspeechtimeout event is thrown. The value is a Time Designation. The default duration is 20 seconds.
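The recognizer properties above are set with the <property> element. The following is a minimal sketch, assuming the property names as spelled in the VoiceXML 2.0 specification; the values are illustrative rather than recommended, and the grammar file items.grxml is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document-level settings; values are illustrative only. -->
  <property name="confidencelevel" value="0.6"/>
  <property name="sensitivity" value="0.5"/>
  <property name="speedvsaccuracy" value="0.7"/>
  <property name="completetimeout" value="0.5s"/>
  <property name="incompletetimeout" value="1s"/>
  <property name="maxspeechtimeout" value="15s"/>

  <form id="order">
    <field name="item">
      <prompt>What would you like to order?</prompt>
      <grammar src="items.grxml"/>
      <!-- Thrown when the result confidence falls below confidencelevel. -->
      <nomatch>
        <prompt>Sorry, I did not catch that.</prompt>
        <reprompt/>
      </nomatch>
      <!-- Thrown when the caller speaks longer than maxspeechtimeout. -->
      <catch event="maxspeechtimeout">
        <prompt>That was too long. Please answer briefly.</prompt>
        <reprompt/>
      </catch>
      <filled>
        <prompt>You ordered <value expr="item"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Keeping incompletetimeout longer than completetimeout, as in this sketch, matches the guidance above that callers need room to pause mid-utterance.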

Generic DTMF Properties

Property | Support Level | Note

interdigittimeout

Full

The inter-digit time-out value to use when recognizing DTMF (dual tone multi-frequency) input. The value is a Time Designation. The default is 1.0 second.

termtimeout

Full

The terminating time-out to use when recognizing DTMF input. The value is a Time Designation. The default value is 0.

termchar

Full

The terminating DTMF character for DTMF input recognition. The default value is #.
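As a sketch of how the DTMF properties above might be combined, the following form collects a keypad-only PIN using the built-in digits grammar. It is based on the VoiceXML 2.0 specification and is not a verified Speech Server sample; the form and field names are hypothetical, and the values are illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="login">
    <!-- Accept keypad input only for this form. -->
    <property name="inputmodes" value="dtmf"/>
    <!-- Allow up to two seconds between key presses. -->
    <property name="interdigittimeout" value="2s"/>
    <!-- Do not wait for further keys once no more input is possible. -->
    <property name="termtimeout" value="0s"/>
    <!-- The pound key ends the entry. -->
    <property name="termchar" value="#"/>

    <field name="pin" type="digits">
      <prompt>Enter your PIN, then press pound.</prompt>
      <filled>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```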

Prompt and Collect Properties

Property | Support Level | Note

bargein

Full

Specifies the bargein attribute for prompts. Setting this to true allows bargein by default. Setting it to false disallows bargein. The default value is true.

bargeintype

Full

Sets the type of bargein to be speech or hotword. The default value is speech.

timeout

Full

Specifies the time after which a noinput event is thrown by Speech Server. The value is a Time Designation. The default value is 5.0 seconds.
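The prompt and collect properties above can be sketched as follows. This example assumes standard VoiceXML 2.0 behavior: the bargein property sets the default for prompts, an individual prompt overrides it with its own bargein attribute, and timeout controls when noinput is thrown. The form and field names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="consent">
    <!-- Callers may interrupt prompts by speaking, unless a prompt says otherwise. -->
    <property name="bargein" value="true"/>
    <property name="bargeintype" value="speech"/>
    <!-- Wait seven seconds for input before throwing noinput. -->
    <property name="timeout" value="7s"/>

    <field name="answer" type="boolean">
      <!-- This notice must be heard in full, so bargein is turned off for it. -->
      <prompt bargein="false">This call may be recorded.</prompt>
      <prompt>Do you want to continue?</prompt>
      <noinput>
        <prompt>I did not hear anything.</prompt>
        <reprompt/>
      </noinput>
      <filled>
        <if cond="answer">
          <prompt>Thank you.</prompt>
        <else/>
          <prompt>Goodbye.</prompt>
          <exit/>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```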

Miscellaneous Properties

Property | Support Level | Note

inputmodes

Full

universals

Full

maxnbest

Full

maxtime

Full

The maximum duration to record. The value is a Time Designation. This is an attribute of the record element. The default duration is 5 minutes.
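The table above gives no notes for universals and maxnbest, so the following sketch relies on their definitions in the VoiceXML 2.0 specification and should be read as an assumption about Speech Server behavior. It also shows maxtime as an attribute of <record>. The grammar file mailboxes.grxml and all names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Activate the help, cancel, and exit command grammars. -->
  <property name="universals" value="help cancel exit"/>
  <!-- Ask the recognizer for up to three candidate results. -->
  <property name="maxnbest" value="3"/>

  <!-- End the session when the caller says cancel or exit. -->
  <catch event="cancel exit">
    <prompt>Goodbye.</prompt>
    <exit/>
  </catch>

  <form id="voicemail">
    <field name="mailbox">
      <prompt>Which mailbox?</prompt>
      <grammar src="mailboxes.grxml"/>
      <help>
        <prompt>Say the name of the person you are calling.</prompt>
        <reprompt/>
      </help>
      <filled>
        <!-- application.lastresult$ is an array holding at most maxnbest entries. -->
        <log>Candidates returned: <value expr="application.lastresult$.length"/></log>
      </filled>
    </field>

    <!-- maxtime limits the recording to one minute instead of the default. -->
    <record name="msg" maxtime="60s" beep="true" dtmfterm="true">
      <prompt>Leave your message after the beep.</prompt>
      <filled>
        <!-- msg$.maxtime is true when the recording was cut off at the limit. -->
        <if cond="msg$.maxtime">
          <prompt>You reached the one minute limit.</prompt>
        </if>
        <prompt>Your message was saved.</prompt>
      </filled>
    </record>
  </form>
</vxml>
```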

See Also

Tasks

How to: Create a VoiceXML Application Project

Other Resources

Speech Application Development Guide