Secrets of the System.Xml.Schema Namespace

 

Dare Obasanjo
Microsoft Corporation

June 10, 2004

Summary: Dare Obasanjo provides examples of lesser-known functionality in the classes of the System.Xml.Schema namespace. (7 printed pages)

Introduction

Every once in a while, I see questions on newsgroups or mailing lists about how to perform some XML Schema task that the classes in the System.Xml.Schema namespace support, but in a way that is not obvious. Over time I've compiled a list of tasks that these classes can perform but that are not readily apparent on first use. This article covers the top three items from that list.

The Sample Input: The Book Inventory

The following sample schema and XML document are used as the inputs for the examples in this article.

Books.xsd
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
    targetNamespace="http://www.example.com/books"
    xmlns:bk="http://www.example.com/books" 
    attributeFormDefault="unqualified"
    elementFormDefault="qualified">

 <xs:element name="books"> 
  <xs:complexType>
   <xs:sequence> 
    <xs:element name="book" maxOccurs="unbounded">
     <xs:complexType>
      <xs:sequence>
       <xs:element name="title" type="xs:string" />
       <xs:element name="author" type="xs:string" />
      </xs:sequence>
      <xs:attribute name="publisher" type="bk:publisherType" use="required" />
      <xs:attribute name="on-loan" type="xs:string" use="optional" />
     </xs:complexType>
    </xs:element>
   </xs:sequence> 
  </xs:complexType>
 </xs:element>

 <xs:annotation>
  <xs:documentation xml:lang="en">
    The publisherType is a list of the publishers I've bought books from. 
    If a publisher is not on the list then it means I don't have any books from them. 
  </xs:documentation> 
 </xs:annotation>

 <xs:simpleType name="publisherType">
  <xs:restriction base="xs:string">
     <xs:enumeration value="WROX" />
     <xs:enumeration value="Prentice Hall" />
     <xs:enumeration value="Addison-Wesley" />
     <xs:enumeration value="APress" />
     <xs:enumeration value="IDG books" />
  </xs:restriction>
 </xs:simpleType>
</xs:schema>

Books.xml
<books xmlns="http://www.example.com/books">
  <book publisher="IDG books" on-loan="Sanjay">
    <title>XML Bible</title>
    <author>Elliotte Rusty Harold</author>
  </book>
  <book publisher="Addison-Wesley">
    <title>The Mythical Man Month</title>
    <author>Frederick Brooks</author>
  </book>
  <book publisher="WROX">
    <title>Professional XSLT 2nd Edition</title>
    <author>Michael Kay</author>
  </book>
  <book publisher="Prentice Hall" on-loan="Sander" >
   <title>Definitive XML Schema</title>
   <author>Priscilla Walmsley</author>
  </book>
  <book publisher="APress">
   <title>A Programmer's Introduction to C#</title>
   <author>Eric Gunnerson</author>
  </book>
</books>

Validating Primitive Values Against XSD Simple Type Definitions

InfoPath allows one to create a form based on an XML Schema and tie validation rules to that form. For example, in a form generated from books.xsd, the publisher field would accept only WROX, Prentice Hall, Addison-Wesley, APress, or IDG books as a valid value. I've seen people ask how to perform similar validation in Windows Forms applications that use XML Schema in the same manner as InfoPath. The answer lies in the ParseValue() method of the System.Xml.Schema.XmlSchemaDatatype class. The ParseValue() method can be used to test whether a particular value conforms to the constraints of a given XML Schema simple type; it throws an exception if the value does not conform. The following code snippet shows this in action:

using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.XPath;

namespace Test
{
   class Program
   {
      static void Main(string[] args)
      {
      
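         // Load and compile the schema so that its type information is available.
         XmlSchema books =
            XmlSchema.Read(new XmlTextReader("books.xsd"), null);

         books.Compile(null);

         // Look up the publisherType simple type by its qualified name.
         XmlSchemaSimpleType pubType =
            (XmlSchemaSimpleType) books.SchemaTypes[
               new XmlQualifiedName("publisherType", "http://www.example.com/books")];

         // Works fine: "WROX" is one of the enumerated values.
         Console.WriteLine(pubType.Datatype.ParseValue("WROX", new NameTable(), null));

         // Throws an exception: "Microsoft Press" is not one of the enumerated values.
         //pubType.Datatype.ParseValue("Microsoft Press", new NameTable(), null);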

      }
   }
}
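For form-style validation, where a bad value should produce a yes/no answer rather than an unhandled exception, the ParseValue() call can be wrapped in a small helper. The following is a minimal sketch, not part of the original sample. It makes several assumptions: the IsValid helper is hypothetical, the schema is a trimmed inline copy of the publisherType definition so that the example runs without books.xsd, the schema is compiled through XmlSchemaSet (available in .NET Framework 2.0 and later) rather than XmlSchema.Compile(), and a failed parse is assumed to surface as an XmlSchemaException, as it does in current versions of the framework.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

class SimpleTypeCheck
{
    // Trimmed inline copy of the publisherType definition (assumption:
    // used here only so the sketch runs without books.xsd on disk).
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
        targetNamespace='http://www.example.com/books'>
      <xs:simpleType name='publisherType'>
        <xs:restriction base='xs:string'>
          <xs:enumeration value='WROX' />
          <xs:enumeration value='APress' />
        </xs:restriction>
      </xs:simpleType>
    </xs:schema>";

    // Hypothetical helper: true if the value conforms to the simple type.
    static bool IsValid(XmlSchemaSimpleType type, string value)
    {
        try
        {
            type.Datatype.ParseValue(value, new NameTable(), null);
            return true;
        }
        catch (XmlSchemaException)
        {
            // Assumption: facet violations are reported as XmlSchemaException.
            return false;
        }
    }

    static void Main()
    {
        XmlSchemaSet set = new XmlSchemaSet();
        set.Add(null, XmlReader.Create(new StringReader(Xsd)));
        set.Compile();

        XmlSchemaSimpleType pubType = (XmlSchemaSimpleType) set.GlobalTypes[
            new XmlQualifiedName("publisherType", "http://www.example.com/books")];

        // "WROX" is enumerated; "Microsoft Press" is not.
        Console.WriteLine(IsValid(pubType, "WROX"));
        Console.WriteLine(IsValid(pubType, "Microsoft Press"));
    }
}
```

In a Windows Forms application, a helper like this could be called from a control's Validating event handler to accept or reject the text the user entered.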

The ParseValue() method also converts the input string to a CLR type based on the mapping described in the documentation topic Data Type Support between XML Schema (XSD) Types and .NET Framework Types. For example, if the ParseValue() method of an XmlSchemaDatatype instance representing a numeric type, such as a type derived from xs:decimal, is invoked with a string such as "5.5" as input, then the method returns an instance of System.Decimal.
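The type conversion can be observed directly. The sketch below is an illustration under stated assumptions rather than code from the original article: it uses the built-in xs:decimal type obtained through XmlSchemaType.GetBuiltInSimpleType() (available in .NET Framework 2.0 and later) in place of a user-defined derived type, and it inspects both the ValueType property of the datatype and the runtime type of the parsed value.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

class ParseValueTypes
{
    static void Main()
    {
        // Built-in xs:decimal stands in for a schema-defined type derived from it.
        XmlSchemaDatatype dec =
            XmlSchemaType.GetBuiltInSimpleType(XmlTypeCode.Decimal).Datatype;

        // ParseValue returns object; the boxed value carries the mapped CLR type.
        object value = dec.ParseValue("5.5", new NameTable(), null);

        // The CLR type the datatype maps to, and the runtime type of the result.
        Console.WriteLine(dec.ValueType);
        Console.WriteLine(value.GetType());
    }
}
```

Both lines should report System.Decimal, which matches the mapping described above.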

Changing the Namespace Prefix When Writing Out Schemas

A few months ago, I got a complaint from one of our users that the Write() method of the System.Xml.Schema.XmlSchema class always uses the prefix "xs" for the "http://www.w3.org/2001/XMLSchema" namespace. Although the prefix bound to a particular namespace is typically not significant, the user explained that he would be sharing schemas with others who were more comfortable using the prefix "xsd" for that namespace. The following code fragment shows how to control the prefix bound to the "http://www.w3.org/2001/XMLSchema" namespace when writing out a schema using the XmlSchema class.

using System;
using System.Xml;
using System.Xml.Serialization;
using System.Xml.Schema;

public class Sample
{
   public static void Main()
   {
      XmlSchema items = XmlSchema.Read(new XmlTextReader("books.xsd"), null);
      items.Namespaces = new XmlSerializerNamespaces();
      items.Namespaces.Add("xsd", "http://www.w3.org/2001/XMLSchema"); 
      items.Write(Console.Out);
      
      Console.ReadLine();

   }
}

The example above writes out an XML Schema document that has the prefix "xsd" bound to the "http://www.w3.org/2001/XMLSchema" namespace instead of the prefix "xs". This change is not programmatically significant because both the XmlSchemaCollection and XmlValidatingReader classes accept schemas that use either prefix.
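One way to confirm that the prefix change is harmless is to write the schema to a string and read it back. The following sketch rests on a few assumptions that are not in the original sample: it uses a small inline schema instead of books.xsd, and it compiles the result with XmlSchemaSet (.NET Framework 2.0 and later) rather than the XmlSchemaCollection class mentioned above.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

class PrefixRoundTrip
{
    // Small inline schema (assumption: stands in for books.xsd).
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
        targetNamespace='http://www.example.com/books'>
      <xs:element name='title' type='xs:string' />
    </xs:schema>";

    static void Main()
    {
        XmlSchema schema = XmlSchema.Read(new StringReader(Xsd), null);

        // Bind "xsd" to the XML Schema namespace before writing.
        schema.Namespaces = new XmlSerializerNamespaces();
        schema.Namespaces.Add("xsd", XmlSchema.Namespace);

        StringWriter output = new StringWriter();
        schema.Write(output);
        Console.WriteLine(output.ToString());

        // Read the rewritten schema back and compile it to check that
        // it is still a usable schema with the new prefix.
        XmlSchema reread = XmlSchema.Read(new StringReader(output.ToString()), null);
        XmlSchemaSet set = new XmlSchemaSet();
        set.Add(reread);
        set.Compile();
        Console.WriteLine(set.IsCompiled);
    }
}
```

XmlSchema.Namespace is the framework constant for "http://www.w3.org/2001/XMLSchema", which avoids retyping the URI.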

Retrieving Annotations from the XmlSchema Class

The list of properties of the System.Xml.Schema.XmlSchema class maps to the content model of the <schema> element as described in the W3C XML Schema Part 1: Structures recommendation. The content model of the <schema> element is described as:

((include | import | redefine | annotation)*, (((simpleType | complexType 
| group | attributeGroup) | element | attribute | notation), 
annotation*)*)

Each group of allowed child elements of the <schema> element has a corresponding property in the XmlSchema class, with one exception: <annotation> elements. This has led some to believe that one cannot obtain annotations from the XmlSchema class, which is not the case. Top-level annotations can be retrieved from the Items property of the XmlSchema class. The following code sample shows how to print the contents of the annotation in the books.xsd schema.

using System;
using System.Xml;
using System.Xml.Schema;

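namespace Test
{
   class Program
   {
      static void Main(string[] args)
      {
         XmlSchema books =
            XmlSchema.Read(new XmlTextReader("books.xsd"), null);

         books.Compile(null);

         // Top-level annotations appear in Items alongside the other
         // top-level schema components.
         foreach (XmlSchemaObject xso in books.Items)
         {
            XmlSchemaAnnotation xsa = xso as XmlSchemaAnnotation;

            if (xsa != null)
            {
               // Assumes the first child of the annotation is an
               // <xs:documentation> element with text content, as in books.xsd.
               string comment =
                  ((XmlSchemaDocumentation) xsa.Items[0]).Markup[0].InnerText;
               Console.WriteLine(comment);
            }
         }
      }
   }
}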
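The sample above relies on the annotation containing a single <xs:documentation> element as its first child. An annotation may also contain <xs:appinfo> elements or several documentation elements, so a more defensive loop can be useful. The following is a sketch under stated assumptions: the inline schema is a hypothetical stand-in for books.xsd, and the code simply skips anything that is not documentation instead of casting unconditionally.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

class AnnotationDump
{
    // Hypothetical inline schema with a mixed annotation (stand-in for books.xsd).
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
        targetNamespace='http://www.example.com/books'>
      <xs:annotation>
        <xs:appinfo>tool-specific data</xs:appinfo>
        <xs:documentation xml:lang='en'>Publishers I have bought books from.</xs:documentation>
      </xs:annotation>
    </xs:schema>";

    static void Main()
    {
        XmlSchema schema = XmlSchema.Read(new StringReader(Xsd), null);

        foreach (XmlSchemaObject item in schema.Items)
        {
            XmlSchemaAnnotation annotation = item as XmlSchemaAnnotation;
            if (annotation == null) continue;

            // An annotation can hold appinfo and documentation children in any order.
            foreach (XmlSchemaObject child in annotation.Items)
            {
                XmlSchemaDocumentation doc = child as XmlSchemaDocumentation;
                if (doc == null || doc.Markup == null) continue;

                // Markup holds the child nodes of the documentation element.
                foreach (XmlNode node in doc.Markup)
                {
                    if (node != null)
                        Console.WriteLine("[{0}] {1}", doc.Language, node.InnerText.Trim());
                }
            }
        }
    }
}
```

Annotations attached to individual components, such as an element or a type, are not part of the schema's Items collection; they are exposed through the Annotation property of the annotated object.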

Conclusion

In this article, I addressed the top three questions about the classes in the System.Xml.Schema namespace that I have encountered while working as the Program Manager responsible for that namespace. These questions cover validating primitive values against XSD simple type definitions, changing the namespace prefix when writing out schemas, and retrieving annotations from the XmlSchema class.

Dare Obasanjo is a member of Microsoft's WebData team, which among other things develops the components within the System.Xml and System.Data namespaces of the .NET Framework, Microsoft XML Core Services (MSXML), and Microsoft Data Access Components (MDAC).

Feel free to post any questions or comments about this article on the Extreme XML message board on GotDotNet.