Best Practices for Representing XML in the .NET Framework

 

Dare Obasanjo
Microsoft Corporation

March 16, 2004

Summary: Dare Obasanjo looks at the available options for representing XML-based data that is shared between components within a single process and AppDomain, and discusses the design tradeoffs of each approach. (10 printed pages)

Introduction

After a recent design review, a fellow PM asked if there are any design guidelines for exposing XML in an API because he'd seen lots of different approaches and he wasn't sure which to choose. I told him I believed there were some guidelines available on MSDN, but when I checked I only came across an episode of MSDN TV entitled Passing XML Data Inside the CLR, which does contain such information but not in an easily consumable form. Thus was born the idea of providing a printer-friendly version of Don Box's MSDN TV episode, with some of my own experiences working on XML APIs at Microsoft thrown in.

A First Look at the Guidelines

There are three primary situations when developers need to consider what APIs to use for representing XML. The situations and guidelines are briefly described below:

  • Classes with fields or properties that hold XML: If a class has a field or property that is an XML document or fragment, it should provide mechanisms for manipulating the property as both a string and as an XmlReader.
  • Methods that accept XML input or return XML as output: Methods that accept or return XML should favor XmlReader or XPathNavigator unless the user is expected to be able to edit the XML data, in which case XmlDocument should be used.
  • Converting an object to XML: If an object wants to provide an XML representation of itself for serialization purposes, then it should use the XmlWriter if it needs more control of the XML serialization process than what is provided by the XmlSerializer. If the object wants to provide an XML representation of itself that enables it to participate fully as a member of the XML world, such as allow XPath queries or XSLT transformations over the object, then it should implement the IXPathNavigable interface.

In the following sections, I describe the situations mentioned above in more detail and explain how I arrived at the guidelines.

The Usual Suspects

When deciding to accept or return XML from a method or property, a number of classes in the .NET Framework come to mind as being suited for this task. Below is a list of the five classes in the .NET Framework best suited for representing XML input or output with a brief description of their pros and cons.

  1. System.Xml.XmlReader: The XmlReader is the pull model XML parser of the .NET Framework. During pull model processing, the consumer of the XML controls the program flow by requesting events from the XML producer as needed. Pull model XML parsers, such as the XmlReader, operate in a forward-only, streaming fashion and show information about only a single node at any given time. Because an XmlReader is read-only and doesn't require the entire XML document to be loaded in memory, it is a good choice for creating an XML façade over non-XML data sources. One example of such an XML façade over a non-XML data source is the XmlCsvReader. Some may view the forward-only characteristics of the XmlReader as a limitation because it prevents users from making multiple passes over parts of an XML document.
  2. System.Xml.XPath.XPathNavigator: The XPathNavigator is a read-only cursor over XML data sources. An XML cursor acts like a lens that focuses on one XML node at a time, but unlike pull-based APIs like the XmlReader, the cursor can be positioned anywhere along the XML document at any given time. In a way, pull model APIs are forward-only versions of a cursor model. The XPathNavigator is also a good candidate for implementing an XML façade over non-XML data because it allows one to construct the XML views of a data source in a just-in-time manner, without having to convert the entire data source to an XML tree. An example of using the XPathNavigator to create an XML view of non-XML data is the ObjectXPathNavigator. The fact that the XPathNavigator is read-only and lacks some user-friendly properties, such as InnerXml and OuterXml, makes it less palatable than the XmlDocument or XmlNode in cases that require those features.
  3. System.Xml.XmlWriter: The XmlWriter provides a generic mechanism for pushing an XML document into an underlying store. This underlying store can be anything from a file if using the XmlTextWriter to an XmlDocument if using the XmlNodeWriter. Using an XmlWriter as a parameter to methods that return XML provides a robust way to support a wide range of possible return types including file streams, strings, and XmlDocument instances. This is why the XmlSerializer.Serialize() method accepts an XmlWriter as a parameter on one of its overloads. As its name implies, the XmlWriter is only useful for writing XML and cannot be used for reading or processing XML.
  4. System.Xml.XmlDocument/XmlNode: The XmlDocument is an implementation of the W3C Document Object Model (DOM). The DOM is an in-memory representation of an XML document made up of a hierarchical tree of XmlNode objects that represent logical components of the XML document, such as elements, attributes, and text nodes. The DOM is the most popular API for manipulating XML in the .NET Framework because it provides a straightforward way to load, process, and save XML documents. The main drawback of the DOM is that its design requires the entire XML document to be loaded in-memory.
  5. System.String: XML is a text-based format, and what better to represent text than the String class? The primary benefit of using a string as the way of representing XML is that it is the lowest common denominator. Strings are easy to write to log files or print to the console, and if one wants to actually process the XML using XML APIs, then one can load the string into an XmlDocument or XPathDocument. There are several problems with using strings as the primary input or output to methods or properties that utilize XML. The first problem with strings is that, like the DOM, they require the entire XML document to be loaded in memory. Secondly, representing XML as strings places a burden on the producer to generate an XML string, which may be cumbersome to do in some cases. An example of a case where generating an XML string could be cumbersome is when the XML is obtained from an XmlReader or XPathNavigator. Thirdly, representing XML as strings can cause confusion with regard to character encodings because regardless of what encoding declaration is placed in the XML, a string in the .NET Framework is always in the UTF-16 character encoding. Finally, representing XML as strings makes it difficult to create an XML processing pipeline because each layer in the pipeline has to reparse the document.
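
To make the difference between the pull model and the cursor model concrete, here is a minimal sketch. The class name, method names, and sample XML are mine and exist only for illustration; the XmlTextReader, XPathDocument, and XPathNavigator calls are standard System.Xml APIs.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class ReaderVersusCursor
{
    // Hypothetical sample data, used only for this illustration.
    public const string Xml =
        "<items><cd><artist>Nelly</artist><title>Nellyville</title></cd></items>";

    // Pull model: a single forward pass; each Read() advances to the next node
    // and there is no way to go back.
    public static int CountElements(XmlReader reader)
    {
        int count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element) count++;
        }
        return count;
    }

    // Cursor model: the navigator can move down, sideways, and back up again.
    public static string TitleThenBackUp(XPathNavigator nav)
    {
        nav.MoveToRoot();
        nav.MoveToFirstChild();   // <items>
        nav.MoveToFirstChild();   // <cd>
        nav.MoveToFirstChild();   // <artist>
        nav.MoveToNext();         // <title>
        string title = nav.Value;
        nav.MoveToParent();       // back to <cd>
        nav.MoveToParent();       // back to <items>
        return title + " (cursor now on <" + nav.Name + ">)";
    }
}
```

A reader can be obtained with new XmlTextReader(new StringReader(ReaderVersusCursor.Xml)) and a navigator with new XPathDocument(new StringReader(ReaderVersusCursor.Xml)).CreateNavigator(). Once CountElements returns, the reader is exhausted, whereas the navigator can be repositioned and reused.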

Classes with Fields or Properties that Hold XML

In certain cases, an object may have a field or property that is an XML document or an XML fragment. The following sample class represents an e-mail message in which the content is XHTML. The XML content of the message is held internally in an XmlDocument and exposed as a string by the Body property of the class:

public class Email{  
  private string from;
  public string From{
    get { return from; }
    set {from = value; }
  }
  private string to;
  public string To{
    get { return to; }
    set {to = value; }
  }
  private string subject;
  public string Subject{
    get { return subject; }
    set {subject = value; }
  }
  private DateTime sent;
  public DateTime Sent{
    get { return sent; }
    set {sent = value; }
  }
  private XmlDocument body = new XmlDocument(); 
  public string Body{
    get { return body.OuterXml; }
    set { body.Load(new System.IO.StringReader(value));}
  }    
}

Using a string as the primary representation of a field or property that represents an XML document is the most user-friendly representation because the System.String class is the most familiar of the XML Usual Suspects to the average developer. However, this places a burden on users of the class who may now have to deal with the cost of parsing the XML document twice. For example, consider a situation where such a property was set from XML resulting from the SqlCommand.ExecuteXmlReader() method or the XslTransform.Transform() method. In such cases, the user would have to parse the document twice as shown in the example below:

     Email email   = new Email(); 
    email.From    = "dareo@example.com"; 
    email.To      = "michealb@example.org"; 
    email.Subject = "Hello World"; 

    XslTransform transform = new XslTransform(); 
    transform.Load("format-body.xsl"); 

    XmlDocument body = new XmlDocument(); 
    //1. XML is parsed by XmlDocument.Load()
    body.Load(transform.Transform(new XPathDocument("body.xml"), null));

    //2. The exact same XML is parsed again by XmlDocument.Load() in Email.Body property
    email.Body = body.OuterXml;
 

In the example above, the same XML is parsed twice because it has to be loaded from the XmlReader into the XmlDocument before it can be converted to a string, which is then used to set the Body property of the Email class, which itself parses the XML into an XmlDocument internally. A good rule of thumb is to provide access to an XML property as an XmlReader as well as a string. Thus the following methods would also be added to the Email class to promote maximum flexibility for users of the class:

  public void SetBody(XmlReader reader){
    body.Load(reader); 
  }

  public XmlReader GetBody(){
    return new XmlNodeReader(body); 
  }

This gives users of the Email class a way to pass, set, and retrieve XML data within the Body property in an efficient manner when needed.
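
As a rough illustration of the single-parse path, the sketch below feeds a reader straight into SetBody() and reads the content back through GetBody(). EmailBodySketch is a trimmed-down stand-in for the Email class above (only the body is kept), and EmailBodyDemo and RoundTrip are hypothetical names of my own.

```csharp
using System;
using System.IO;
using System.Xml;

// Trimmed-down stand-in for the Email class; only the body is kept.
public class EmailBodySketch
{
    private XmlDocument body = new XmlDocument();

    public string Body
    {
        get { return body.OuterXml; }
        set { body.Load(new StringReader(value)); }
    }

    public void SetBody(XmlReader reader) { body.Load(reader); }

    public XmlReader GetBody() { return new XmlNodeReader(body); }
}

public static class EmailBodyDemo
{
    public static string RoundTrip(string xhtml)
    {
        EmailBodySketch email = new EmailBodySketch();

        // The XML is parsed exactly once, by XmlDocument.Load() inside SetBody().
        // Any XmlReader would do here; a reader over a string keeps the sketch
        // self-contained.
        email.SetBody(new XmlTextReader(new StringReader(xhtml)));

        // Read the content back without converting the document to a string.
        XmlReader reader = email.GetBody();
        reader.MoveToContent();              // positions on the document element
        return reader.ReadElementString();   // text content of that element
    }
}
```

Calling EmailBodyDemo.RoundTrip("<p>Hello World</p>") walks the body through both methods; the string-based Body property remains available for callers who prefer it.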

**Guideline   **If a class has a field or property that is an XML document or fragment, it should provide mechanisms for manipulating the property as both a string and as an XmlReader.

Astute readers may notice that if one directly exposed an XmlDocument as a property, then one would satisfy the guideline and also enable users of the property to make fine-grained changes to the XML.

Methods that Accept XML Input or Return XML as Output

When designing methods that produce or consume XML, developers have a responsibility to make such methods flexible in the input they accept. In the case of methods that accept XML as input, one could divide these into methods that expect to modify the data in place and those that only require read-only access to the XML. The only one of the XML Usual Suspects that supports read-write access is the XmlDocument. The code sample below shows such a method:

public void ApplyDiscount(XmlDocument priceList){
    foreach(XmlElement price in priceList.SelectNodes("//price")){
      // XmlConvert is culture-invariant, so "16.95" parses the same way on every locale
      price.InnerText = XmlConvert.ToString(XmlConvert.ToDouble(price.InnerText) * 0.85);
    }
  }

For methods that require read-only access to the XML there are two primary options:

  • XmlReader
  • XPathNavigator

The XmlReader provides forward-only access to the XML, while the XPathNavigator provides random access to an underlying XML source, as well as the ability to perform XPath queries against the data source. The following code samples print the artist and title of the first compact disc in the following XML document:

<items>
    <compact-disc>
      <price>16.95</price>
      <artist>Nelly</artist>
      <title>Nellyville</title>
    </compact-disc>
    <compact-disc>
      <price>17.55</price>
       <artist>Baby D</artist>
       <title>Lil Chopper Toy</title>
    </compact-disc>
  </items>

XmlReader:

public static void PrintArtistAndTitle(XmlReader reader){
    string artist = null, title = null;
    reader.MoveToContent(); //move from root node to document element (items)
    /* keep reading until we get to the first <artist> element */
    while(reader.Read()){
      if((reader.NodeType == XmlNodeType.Element) && reader.Name.Equals("artist")){
        artist = reader.ReadElementString();
        title  = reader.ReadElementString(); 
        break; 
      }
    }
    Console.WriteLine("Artist={0}, Title={1}", artist, title);
  }

XPathNavigator:

public static void PrintArtistAndTitle(XPathNavigator nav){

    // the union returns nodes in document order: artist first, then title
    XPathNodeIterator iterator = nav.Select("/items/compact-disc[1]/artist"
       + " | /items/compact-disc[1]/title");

    iterator.MoveNext();
    Console.WriteLine("Artist={0}", iterator.Current.Value);

    iterator.MoveNext();
    Console.WriteLine("Title={0}", iterator.Current.Value);
  }

In general, similar rules apply for methods that return XML. If the XML is expected to be edited by the receiver, then an XmlDocument should be returned. Otherwise an XmlReader or XPathNavigator should be returned depending on whether or not the method needs to provide streaming, forward-only access to the XML data.

**Guideline   **Methods that accept or return XML should favor XmlReader or XPathNavigator unless the user is expected to be able to edit the XML data, in which case XmlDocument should be used.

The above guideline implies that methods that return XML should favor returning the XmlReader because it fits more use cases than any of the other types. Also, in cases where callers of the method require more functionality, they can load an XmlDocument or XPathDocument from the returned XmlReader.
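
The sketch below shows what that looks like from the caller's side: one method returns an XmlReader, and each caller upgrades it to the representation it needs. GetPriceList and the surrounding class are hypothetical names invented for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class ReturnedReaderDemo
{
    // Hypothetical producer: returns its XML as a forward-only XmlReader.
    public static XmlReader GetPriceList()
    {
        string xml =
            "<items><compact-disc><price>16.95</price></compact-disc>" +
            "<compact-disc><price>17.55</price></compact-disc></items>";
        return new XmlTextReader(new StringReader(xml));
    }

    // A caller that needs an editable tree loads the reader into an XmlDocument.
    public static int CountPricesWithDom()
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(GetPriceList());
        return doc.SelectNodes("//price").Count;
    }

    // A caller that only needs read-only XPath loads it into an XPathDocument.
    public static string FirstPriceWithXPath()
    {
        XPathDocument doc = new XPathDocument(GetPriceList());
        XPathNodeIterator it =
            doc.CreateNavigator().Select("/items/compact-disc[1]/price");
        return it.MoveNext() ? it.Current.Value : string.Empty;
    }
}
```

The producer stays cheap and streaming, and only the callers that need a tree or XPath pay the cost of building one.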

Converting an Object to XML

The ubiquity of XML as the common language for information interchange makes it an obvious representation for objects to expose, either for purposes of serialization or to gain access to other XML technologies such as querying with XPath or transforming with XSLT.

When converting an object to XML for serialization purposes, the obvious choice is to use the XML Serialization technology in the .NET Framework. However, in certain cases one may require more control of the generated XML than is provided by the XmlSerializer. In such cases, the XmlWriter is a handy class to have in your toolkit because it frees you from the requirement for there to be a one-to-one mapping between the structure of the class and the generated XML. The following example shows a Save method that uses an XmlWriter to serialize the Email class mentioned in previous sections.

public void Save(XmlWriter writer){

    writer.WriteStartDocument(); 
    writer.WriteStartElement("email"); 
    writer.WriteStartElement("headers");

    writer.WriteStartElement("header");
    writer.WriteElementString("name", "to"); 
    writer.WriteElementString("value", this.To); 
    writer.WriteEndElement(); //header

    writer.WriteStartElement("header");
    writer.WriteElementString("name", "from"); 
    writer.WriteElementString("value", this.From); 
    writer.WriteEndElement(); //header

    writer.WriteStartElement("header");
    writer.WriteElementString("name", "subject"); 
    writer.WriteElementString("value", this.Subject); 
    writer.WriteEndElement(); //header
    
    writer.WriteStartElement("header");
    writer.WriteElementString("name", "sent"); 
    writer.WriteElementString("value", XmlConvert.ToString(this.Sent)); 
    writer.WriteEndElement(); //header

    writer.WriteEndElement(); //headers;
    
    writer.WriteStartElement("body");
    writer.WriteRaw(this.Body); 

    writer.WriteEndDocument(); //closes all open tags
  }

This generates the following XML document:

<email>
  <headers>
    <header>
      <name>to</name>
      <value>michealb@example.org</value>
    </header>
    <header>
      <name>from</name>
      <value>dareo@example.com</value>
    </header>
    <header>
      <name>subject</name>
      <value>Hello World</value>
    </header>
    <header>
      <name>sent</name>
      <value>2004-03-05T15:54:13.5446771-08:00</value>
    </header>
  </headers>
  <body><p>Hello World is my favorite sample application.</p></body>
</email>

The above XML document would be impossible to generate using only the basic capabilities of the XmlSerializer. The other advantage of the XmlWriter is that it abstracts away the underlying target to which the data is being written, so the target could be anything from a file on disk to an in-memory string or even an XmlDocument courtesy of the XmlNodeWriter.
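
A small sketch of that abstraction follows. WriteHeader is a hypothetical helper that emits one header element in the same shape as the Save method above; the same call writes to an in-memory string or to a stream depending only on how the XmlTextWriter was constructed.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class WriterTargetsDemo
{
    // Hypothetical producer: emits one <header> element to whatever writer it is given.
    public static void WriteHeader(XmlWriter writer, string name, string value)
    {
        writer.WriteStartElement("header");
        writer.WriteElementString("name", name);
        writer.WriteElementString("value", value);
        writer.WriteEndElement(); //header
        writer.Flush();           // push buffered output to the underlying store
    }

    // Target 1: an in-memory string.
    public static string ToStringTarget()
    {
        StringWriter sw = new StringWriter();
        WriteHeader(new XmlTextWriter(sw), "to", "michealb@example.org");
        return sw.ToString();
    }

    // Target 2: a stream (a MemoryStream here; a FileStream works the same way).
    public static long ToStreamTarget()
    {
        MemoryStream ms = new MemoryStream();
        WriteHeader(new XmlTextWriter(ms, Encoding.UTF8), "to", "michealb@example.org");
        return ms.Length; // number of bytes written
    }
}
```

The producing code never learns where its output ends up, which is the property that makes an XmlWriter parameter attractive on methods that return XML.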

If the requirement is to provide a way for the class to participate more fully in the XML world, such as interact with XML technologies like XPath or XSLT, then the best option is for the class to implement the IXPathNavigable interface and provide an XPathNavigator over the class. An example of doing this is the ObjectXPathNavigator, which provides an XML view over arbitrary objects that enables one to execute XPath queries or run XSLT transformations on said objects.
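
The simplest way to satisfy the interface can be sketched as follows. Note the assumption: this version eagerly builds a small XmlDocument and hands out its navigator, which is a shortcut for illustration and not the just-in-time approach taken by the ObjectXPathNavigator. The Contact class and its fields are hypothetical.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

// Hypothetical class used only for illustration.
public class Contact : IXPathNavigable
{
    private string name;
    private string city;

    public Contact(string name, string city)
    {
        this.name = name;
        this.city = city;
    }

    // Simplification: build a DOM snapshot of the object and return a
    // navigator over it. A custom XPathNavigator subclass could avoid
    // the intermediate tree.
    public XPathNavigator CreateNavigator()
    {
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("contact");
        doc.AppendChild(root);

        XmlElement nameElement = doc.CreateElement("name");
        nameElement.InnerText = name;
        root.AppendChild(nameElement);

        XmlElement cityElement = doc.CreateElement("city");
        cityElement.InnerText = city;
        root.AppendChild(cityElement);

        return doc.CreateNavigator();
    }
}

public static class ContactDemo
{
    // Any IXPathNavigable can be queried with XPath in the same way.
    public static string QueryCity(IXPathNavigable source)
    {
        XPathNodeIterator it = source.CreateNavigator().Select("/contact/city");
        return it.MoveNext() ? it.Current.Value : string.Empty;
    }
}
```

Because QueryCity depends only on the interface, it works unchanged against an XmlDocument, an XPathDocument, or any other class that implements IXPathNavigable.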

**Guideline   **If an object wants to provide an XML representation of itself for serialization purposes, then it should use the XmlWriter if it needs more control of the XML serialization process than is provided by the XmlSerializer. If the object wants to provide an XML representation of itself that enables it to participate fully as a member of the XML world, such as allow XPath queries or XSLT transformations over the object, then it should implement the IXPathNavigable interface.

Conclusion

In future versions of the .NET Framework, more emphasis will be placed on cursor-based XML APIs like the XPathNavigator exposed by the IXPathNavigable interface. Such cursors will be the primary mechanisms for interacting with XML in the .NET Framework.

Dare Obasanjo is a member of Microsoft's WebData team, which among other things develops the components within the System.Xml and System.Data namespace of the .NET Framework, Microsoft XML Core Services (MSXML), and Microsoft Data Access Components (MDAC).

Feel free to post any questions or comments about this article on the Extreme XML message board on GotDotNet.