DTMF Grammars

This topic includes the following sections:

  • DTMF grammar headers
  • Structure of DTMF grammars
  • Simple example
  • Another example
  • Similarities between voice and DTMF modes

DTMF grammar headers

A DTMF grammar must specify mode="dtmf" on the <grammar> element. A language attribute (xml:lang) is not required. The other attributes are the same as for any other SRGS grammar. For example:

<grammar mode="dtmf"
         type="application/srgs+xml"
         tag-format="semantics/1.0"
         version="1.0" 
         root="topRule">

Structure of DTMF grammars

The tokens permitted in DTMF grammars are 0 1 2 3 4 5 6 7 8 9 * #.

DTMF grammars are constructed with the same elements that are used for voice grammars:

  1. <grammar> elements are composed of one or more <rule> elements. If there is more than one rule, the <grammar> element must have a root attribute that names the top-level <rule> element.
  2. Lower level or external grammar rules are referenced with <ruleref> elements.
  3. <item> elements enclose expected input tones or groups of tones.
  4. <one-of> elements enclose lists of alternative <item> elements.
  5. <item> elements can have a repeat attribute that allows the content of the item to be optional or repeated. For example, <item repeat="0-1"> means that the item can appear 0 or 1 times (item is optional), <item repeat="1-"> means that the item can appear one or more times, and <item repeat="10"> means that the item must appear exactly 10 times.
  6. The <tag> element encloses either ECMAScript or strings. Its contents are used to return match information to the VoiceXML application.
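
The elements listed above can be combined as in the following sketch. It is illustrative only and assumes a hypothetical grammar that accepts a ten-digit phone number optionally followed by the # key. The rule names phone and digit are made up for this sketch, and the <tag> usage follows the same pattern as the PIN example later in this topic.

<grammar mode="dtmf" version="1.0" root="phone"
         tag-format="semantics/1.0">
   <!-- lower-level rule: any single digit -->
   <rule id="digit">
      <one-of>
         <item> 0 </item>
         <item> 1 </item>
         <item> 2 </item>
         <item> 3 </item>
         <item> 4 </item>
         <item> 5 </item>
         <item> 6 </item>
         <item> 7 </item>
         <item> 8 </item>
         <item> 9 </item>
      </one-of>
   </rule>

   <!-- top-level rule: exactly ten digits, then an optional # -->
   <rule id="phone">
      <tag>out = "";</tag>
      <item repeat="10">
         <ruleref uri="#digit"/>
         <tag>out += rules.latest();</tag>
      </item>
      <item repeat="0-1"> # </item>
   </rule>
</grammar>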

Simple example

Here is a simple example of a DTMF grammar in a VoiceXML application that you can run in the Tellme Studio Scratchpad.

<?xml version="1.0"?>
<VXML version="2.1" revision="4" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
<form id="mainDialog">
   <field name="response">
      <prompt>
         please press a number between one and five
      </prompt>

      <grammar version="1.0" root="ROOT" mode="dtmf">
         <rule id="ROOT">
            <one-of>
               <item> 1 </item>
               <item> 2 </item>
               <item> 3 </item>
               <item> 4 </item>
               <item> 5 </item>
            </one-of>
         </rule>
      </grammar>

      <filled>
         <prompt> thank you </prompt>
         <prompt>you pressed <value expr="response"/></prompt>
      </filled>
      <noinput> </noinput>
      <nomatch> </nomatch>
   </field>
</form>
</VXML>
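
If the application needs a descriptive value rather than the raw digit, each <item> can carry a <tag>. The following variation of the grammar above is a sketch only. The strings sales, support, and billing are hypothetical menu values, and the sketch assumes tag-format="semantics/1.0" as used in the other examples in this topic.

<grammar version="1.0" root="ROOT" mode="dtmf"
         tag-format="semantics/1.0">
   <rule id="ROOT">
      <one-of>
         <!-- each tag sets the value returned for that key -->
         <item> 1 <tag>out = "sales";</tag> </item>
         <item> 2 <tag>out = "support";</tag> </item>
         <item> 3 <tag>out = "billing";</tag> </item>
      </one-of>
   </rule>
</grammar>

With a grammar like this, pressing 2 would be expected to set the response field to the string support instead of the digit 2.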

Another example

<?xml version="1.0"?>
<VXML version="2.1" revision="4"
      xmlns="http://www.w3.org/2001/vxml"
      xml:lang="en-US">
<form id="mainDialog">
   <field name="response">
      <prompt>
         please enter your four digit pin
      </prompt>

      <grammar mode="dtmf" version="1.0" root="pin"
         tag-format="semantics/1.0">
         <rule id="digit">
            <one-of>
               <item> 0 </item>
               <item> 1 </item>
               <item> 2 </item>
               <item> 3 </item>
               <item> 4 </item>
               <item> 5 </item>
               <item> 6 </item>
               <item> 7 </item>
               <item> 8 </item>
               <item> 9 </item>
            </one-of>
         </rule>

         <rule id="pin">
            <tag>out=""</tag>
            <item repeat="4">
               <ruleref uri="#digit"/>
               <tag>out += rules.latest( );</tag>
            </item>
         </rule>
      </grammar>
      <filled>
         <script><![CDATA[
            // put spaces between digits so output can be
            // spoken one digit at a time
            response = response.replace(/(.)/g, "$1 ");
         ]]></script>
         <prompt> thank you </prompt>
         <prompt>you pressed <value expr="response"/></prompt>
      </filled>
      <noinput> </noinput>
      <nomatch> </nomatch>
   </field>
</form>
</VXML>

Note

The previous example uses concatenation to accumulate the input (see Semantic Interpretation). If concatenation is not used, only the last digit entered is returned in the response variable.
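
For contrast, here is a sketch of the pin rule with the <tag> elements removed. Based on the behavior described in the note, this version would be expected to return only the last digit entered; the digit rule is assumed to be unchanged.

<rule id="pin">
   <!-- no tags: nothing accumulates the four digits -->
   <item repeat="4">
      <ruleref uri="#digit"/>
   </item>
</rule>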

Similarities between voice and DTMF modes

The rules for constructing DTMF grammars are generally the same as for voice grammars. For example, in the grammar below, the caller must press the 1, 2, and 3 phone buttons in sequence. If any of the three is missing, no match is returned.

<grammar mode="dtmf" tag-format="semantics/1.0"
         version="1.0" root="location">
   <rule id="location">
      <item>1</item>
      <item>2</item>
      <item>3</item>
   </rule>
</grammar>

This is the same behavior as in the following voice example, where the caller must speak all three words in sequence (charleston south carolina):

<grammar mode="voice" xml:lang="en-US"
         tag-format="semantics/1.0"
         version="1.0" root="location">
   <rule id="location">
      <item>charleston</item>
      <item>south</item>
      <item>carolina</item>
   </rule>
</grammar>

If the voice example is rewritten as follows, the caller must still say "charleston south carolina" in its entirety, as before:

<grammar mode="voice" xml:lang="en-US"
         tag-format="semantics/1.0"
         version="1.0" root="location">
   <rule id="location">
      <item>charleston south carolina</item>
   </rule>
</grammar>

In the DTMF case, the analogous grammar is:

<grammar mode="dtmf" tag-format="semantics/1.0"
         version="1.0" root="location">
   <rule id="location">
      <item>1 2 3</item>
   </rule>
</grammar>

As before, the caller must press the 1, 2, and 3 phone buttons in sequence, all three of them, or no match is returned.