Supported Features in VoiceXML Applications

Article
08/18/2014

This content is no longer actively maintained. It is provided as is, for anyone who may still be using these technologies, with no warranties or claims of accuracy with regard to the most recent product version or service release.

Speech Server supports VoiceXML application features, VoiceXML elements, and a variety of properties.

Application Features

Description	Support Level	Standard	Note
Referencing scripts dynamically	Full	2.1
Referencing grammars dynamically	Full	2.1
Recording user utterances while attempting recognition	Full	2.1	VoiceXML 2.1 extends the <field>, <initial>, <link>, <menu>, <record>, and <transfer> elements to allow the interpreter to conditionally enable recording while simultaneously gathering input from the user.
Platform-specific commands	Full		The following platform-specific commands are supported: Help, Cancel, and Exit.
Built-in grammars	Full		The following built-in grammar types are supported: Boolean, date, digits, currency, number, phone, and time.
Keyword grammars	Full		These grammars are created using Speech Grammar Editor and comply with the W3C SRGS specification.
Prompt Engine Markup Language (PEML)	Full		PEML is a markup language that specifies the text input used by the prompt engine to produce speech output.
Bridge and consultation transfers			Bridge transfer is not supported in Speech Server.
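
The dynamic-reference and utterance-recording features listed above could be combined roughly as follows. This is a minimal sketch, not a tested Speech Server application: the file names `Cities.grxml` and `helpers.js`, the rule name `city`, and the variable names are hypothetical, while `srcexpr`, the `recordutterance` property, and the `city$.recordingduration` shadow variable come from the VoiceXML 2.1 specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <!-- Hypothetical variables holding URIs that are chosen at run time -->
  <var name="grammarUri" expr="'Cities.grxml#city'"/>
  <var name="scriptUri" expr="'helpers.js'"/>

  <!-- VoiceXML 2.1: the script URI is computed from an expression -->
  <script srcexpr="scriptUri"/>

  <form id="getCity">
    <!-- VoiceXML 2.1: record the utterance while attempting recognition -->
    <property name="recordutterance" value="true"/>
    <field name="city">
      <prompt>Which city are you calling from?</prompt>
      <!-- VoiceXML 2.1: the grammar URI is computed from an expression -->
      <grammar srcexpr="grammarUri"/>
      <filled>
        <log>Utterance length in ms: <value expr="city$.recordingduration"/></log>
        <prompt>You said <value expr="city"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```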

VoiceXML Elements

Element	Support Level	Standard	Note
<assign>	Full	2.0
<audio>	Full	2.0
<block>	Full	2.0
<catch>	Full	2.0
<choice>	Full	2.0
<clear>	Full	2.0
<data>	Limited	2.1	Ignored by Speech Server.
<disconnect>	Full	2.0	The 2.1 namelist attribute is supported.
<else>	Full	2.0
<elseif>	Full	2.0
<enumerate>	Full	2.0
<error>	Full	2.0
<exit>	Full	2.0
<field>	Full	2.0
<filled>	Full	2.0
<foreach>	Full	2.1
<form>	Full	2.0
<goto>	Full	2.0
<grammar>	Full	2.1	Only SRGS (Speech Recognition Grammar Specification) or CFG (context-free grammar) grammars are supported. See the following example. `<grammar src="Grammar.grxml#rule1" type="application/srgs+cfg"/>` ABNF (Augmented Backus-Naur Form) grammars are not supported.
<help>	Full	2.0
<if>	Full	2.0
<initial>	Full	2.0
<link>	Full	2.0
<log>	Full	2.0
<mark>	Partial	2.1	<mark> is recognized by Speech Server and the VoiceXML interpreter. However, it is not supported in the TIM (Telephony Interface Manager)/TIMC (Telephony Interface Manager Connector) scenario.
<menu>	Full	2.0
<meta>	Full	2.0
<metadata>	Full	2.0
<noinput>	Full	2.0
<nomatch>	Full	2.0
<object>	Limited	2.0	Ignored by Speech Server; no action is taken.
<option>	Full	2.0
<param>	Full	2.0	The <object> tag has limited support.
<prompt>	Full	2.0
<property>	Full	2.1	The <data>-related properties have no support and are ignored by Speech Server. The <recording>-related property is supported.
<record>	Full	2.0
<reprompt>	Full	2.0
<return>	Full	2.0
<script>	Full	2.1	The 2.1 srcexpr= attribute is supported.
<subdialog>	Full	2.0
<submit>	Full	2.0
<throw>	Full	2.0
<transfer>	Partial	2.1	Only blind and consultation transfers are supported. Follow the 2.1 specification and use the type= attribute.
<value>	Full	2.0
<var>	Full	2.0
<vxml>	Full	2.0
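
Because only blind and consultation transfers are supported, a transfer is best written with an explicit `type` attribute. The following is a sketch under stated assumptions: the telephone number is a placeholder, the form and variable names are hypothetical, and the outcome values tested in `<filled>` are those defined by the VoiceXML 2.1 specification rather than values confirmed for Speech Server.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="handoff">
    <!-- type="consultation": the caller is reconnected to the
         application if the transfer cannot be completed -->
    <transfer name="agent" type="consultation"
              dest="tel:+15555550100" connecttimeout="20s">
      <prompt>Please hold while I transfer your call.</prompt>
      <filled>
        <!-- The form item variable holds the outcome of a failed attempt -->
        <if cond="agent == 'busy'">
          <prompt>The line is busy. Please try again later.</prompt>
        <elseif cond="agent == 'noanswer'"/>
          <prompt>No one answered.</prompt>
        <else/>
          <prompt>The transfer could not be completed.</prompt>
        </if>
      </filled>
    </transfer>
  </form>
</vxml>
```

Changing `type` to `blind` hands the call off without monitoring the outcome, so the `<filled>` handler would not be needed in that case.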

Speech Recognizer Properties

For more information about the following properties, see W3C VoiceXML 2.0 Introduction.

Property	Support Level	Note
confidencelevel	Full	Specifies the speech recognition confidence level, which is a float value from 0.0 to 1.0. Results are rejected and a nomatch event is thrown when application.lastresult$.confidence is below this threshold. A value of 0.0 means minimum confidence is needed for a recognition, and a value of 1.0 means maximum confidence is needed. The value is a Real Number Designation. The default value is 0.5.
sensitivity	Full	Sets the sensitivity level. A value of 1.0 means that the speech recognizer is highly sensitive to quiet input. A value of 0.0 means it is least sensitive to noise. The value is a Real Number Designation. The default value is 0.5.
speedvsaccuracy	Full	Provides a hint specifying the desired balance of speed versus accuracy. A value of 0.0 means fastest recognition. A value of 1.0 means best accuracy. The value is a Real Number Designation. The default value is 0.5.
completetimeout	Full	Specifies the length of silence required following user speech before the speech recognizer finalizes a result, either accepting it or throwing a nomatch event. The complete time-out is used when the speech is a complete match of an active grammar. By contrast, the incomplete time-out is used when the speech is an incomplete match to an active grammar. A long complete time-out value delays the result completion and therefore makes the computer's response slower. A short complete time-out can lead to an utterance being broken up inappropriately. Reasonable complete time-out values are typically in the range of 0.3 to 1.0 seconds. The value is a Time Designation. The default is 0.5 seconds.
incompletetimeout	Full	Specifies the required length of silence following user speech after which a speech recognizer finalizes a result. The incomplete time-out applies when the speech prior to the silence is an incomplete match of all active grammars. In this case, when the time-out is triggered, the partial result is rejected with a nomatch event. The incomplete time-out also applies when the speech prior to the silence is a complete match of an active grammar, but where it is possible to speak further and still match the grammar. By contrast, the complete time-out is used when the speech is a complete match to an active grammar and no further words can be spoken. A long incomplete time-out value delays the result completion and therefore makes the computer's response slower. A short incomplete time-out can lead to an utterance being broken up inappropriately. The incomplete time-out is usually longer than the complete time-out to allow users to pause mid-utterance, for example, to breathe. The default is 1.0 second.
maxspeechtimeout	Full	Specifies the maximum duration of user speech. If this time elapses before the user stops speaking, the maxspeechtimeout event is thrown. The value is a Time Designation. The default duration is 20 seconds.
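
The recognizer properties above are set with `<property>` elements and apply to the scope in which they appear. A minimal sketch, assuming a hypothetical grammar file `Menu.grxml` with a rule named `item`; the values are illustrative, not recommendations:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="order">
    <!-- Reject results whose confidence is below 0.6 (default 0.5) -->
    <property name="confidencelevel" value="0.6"/>
    <property name="sensitivity" value="0.5"/>
    <!-- Silence required after a complete and an incomplete match -->
    <property name="completetimeout" value="0.5s"/>
    <property name="incompletetimeout" value="1s"/>
    <!-- Stop listening after 15 seconds of continuous speech -->
    <property name="maxspeechtimeout" value="15s"/>

    <field name="item">
      <prompt>What would you like to order?</prompt>
      <grammar src="Menu.grxml#item"/>
      <nomatch>Sorry, I did not understand. <reprompt/></nomatch>
      <catch event="maxspeechtimeout">
        <prompt>Please give a shorter answer.</prompt>
        <reprompt/>
      </catch>
    </field>
  </form>
</vxml>
```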

Generic DTMF Properties

Property	Support Level	Note
interdigittimeout	Full	The inter-digit time-out value to use when recognizing DTMF (dual tone multi-frequency) input. The value is a Time Designation. The default is 1.0 second.
termtimeout	Full	The terminating time-out to use when recognizing DTMF input. The value is a Time Designation. The default value is 0.
termchar	Full	The terminating DTMF character for DTMF input recognition. The default value is #.
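
As an illustration of the DTMF properties, the sketch below collects a four-digit PIN with the built-in digits grammar. The form and field names are hypothetical and the time-out values are examples only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="login">
    <!-- Allow up to 2 seconds between key presses (default 1.0 second) -->
    <property name="interdigittimeout" value="2s"/>
    <!-- Do not wait for a terminator once no further input can match -->
    <property name="termtimeout" value="0s"/>
    <!-- The pound key ends DTMF input -->
    <property name="termchar" value="#"/>

    <field name="pin" type="digits?length=4">
      <prompt>Enter your four digit PIN, then press pound.</prompt>
      <filled>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```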

Prompt and Collect Properties

Property	Support Level	Note
bargein	Full	Specifies the bargein attribute for prompts. Setting this to true allows bargein by default. Setting it to false disallows bargein. The default value is true.
bargeintype	Full	Sets the type of bargein to be speech or hotword. The default value is speech.
timeout	Full	Specifies the time after which a noinput event is thrown by Speech Server. The value is a Time Designation. The default value is 5.0 seconds.
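
The prompt and collect properties set defaults that an individual prompt can override through its own `bargein` attribute. A minimal sketch with hypothetical form and field names:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="consent">
    <!-- Defaults for every prompt in this form -->
    <property name="bargein" value="true"/>
    <property name="bargeintype" value="speech"/>
    <!-- Throw noinput after 7 seconds of silence (default 5.0 seconds) -->
    <property name="timeout" value="7s"/>

    <field name="proceed" type="boolean">
      <!-- This notice must be heard in full, so bargein is disabled -->
      <prompt bargein="false">This call may be recorded.</prompt>
      <prompt>Do you want to continue?</prompt>
      <noinput>I did not hear anything. <reprompt/></noinput>
    </field>
  </form>
</vxml>
```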

Miscellaneous Properties

Property	Support Level	Note
inputmodes	Full
universals	Full
maxnbest	Full
maxtime	Full	The maximum duration to record. The value is a Time Designation. This is an attribute of the record element. The default duration is 5 minutes.
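
Because `maxtime` is an attribute of the record element rather than a property, it is set directly on `<record>`. The sketch below assumes hypothetical form and variable names; `message$.maxtime` is the shadow variable that the VoiceXML 2.0 specification sets to true when the recording is cut off at the limit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="voicemail">
    <!-- Limit the recording to 30 seconds instead of the 5-minute default -->
    <record name="message" beep="true" maxtime="30s"
            finalsilence="3s" dtmfterm="true">
      <prompt>Leave a message after the beep.</prompt>
      <filled>
        <if cond="message$.maxtime">
          <prompt>Your message reached the time limit.</prompt>
        </if>
        <prompt>Here is what you recorded. <audio expr="message"/></prompt>
      </filled>
    </record>
  </form>
</vxml>
```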
