XML, XML, Everywhere

Rob Macdonald

There's no avoiding XML in the .NET world. XML isn't just used in Web applications; it's at the heart of the way data is stored, manipulated, and exchanged in .NET systems. Whether you're an XML virgin, a late bloomer, or an old XML hand, the good news is that Visual Studio .NET has some great tools for working with XML. Rob Macdonald offers a guided tour.

You probably know that XML is basically structured text with embedded tags that allow data to be self-describing and extensible. XML is relatively easy to read whether you're a computer or a human, and, because it's so simple, every modern platform is capable of working with it, making it a very Web-friendly technology. Here's a very simple XML document (see orders.xml in the accompanying Download file) that I'll be using as the basis of this month's column:

<?xml version="1.0" ?>
<Orders>
  <Order>
    <OrderID>1001</OrderID>
    <Customer>Fred</Customer>
    <ProductID>ABC</ProductID>
  </Order>
  <Order>
    <OrderID>1002</OrderID>
    <Customer>Jenny</Customer>
    <ProductID>DEF</ProductID>
  </Order>
</Orders>

Visual Studio .NET has an XML editor that takes the grind out of typing in sample files. Even better, if your data has a sufficiently regular structure, the IDE will automatically generate an Access-like edit window for entering XML sample data (see Figure 1). While my example is trivial, it appears that the IDE is perfectly capable of handling industrial-strength, "real-world" examples.

XML schemas describe XML data the same way database schemas describe the structure of database objects such as tables. XML schemas provide a powerful way not only of understanding the data contained within a document, but also of validating XML documents. Until recently, XML schemas have typically been created in the form of Document Type Definitions (DTDs), but Visual Studio .NET uses a newer standard called XML Schema Definition (XSD), which has the advantage of using XML syntax to define a schema—meaning that the same parsers can process both data and schemas.

Visual Studio .NET can generate XSD schemas automatically based on an XML document. You can then edit the schema graphically to add additional features such as constraints and data types. Here's the schema generated for our orders.xml file:

<xsd:schema id="Orders" targetNamespace="…" 
       xmlns="…" xmlns:xsd="…" xmlns:msdata="…">
  <xsd:element name="Order">
    <xsd:complexType content="elementOnly">
      <xsd:all>
        <xsd:element name="OrderID" 
                minOccurs="0" type="xsd:string"/>
        <xsd:element name="Customer" 
                minOccurs="0" type="xsd:string"/>
        <xsd:element name="ProductID" 
                minOccurs="0" type="xsd:string"/>
      </xsd:all>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Orders" msdata:IsDataSet="True">
    <xsd:complexType>
      <xsd:choice maxOccurs="unbounded">
        <xsd:element ref="Order"/>
      </xsd:choice>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

This schema basically states which elements should be found in an Order and what their data types should be. The graphical XSD Designer offers point-and-click schema editing (see Figure 2).
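The validation role of a schema can also be sketched in code. The example below is an illustration rather than part of the original download: the module name, the IsValidOrders helper, and the inline schema are assumptions, the schema is a hand-written one in W3C Recommendation syntax (not the listing generated above), and the code targets the released .NET Framework classes XmlReaderSettings and XmlSchemaSet rather than the beta described in this column.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Schema

Module OrdersValidationSketch
    ' Assumed schema for orders.xml, written in W3C Recommendation syntax.
    ' It is an illustration, not the schema Visual Studio .NET generates.
    Private Const OrdersXsd As String = _
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" & _
        "<xs:element name='Orders'><xs:complexType><xs:sequence>" & _
        "<xs:element name='Order' minOccurs='0' maxOccurs='unbounded'>" & _
        "<xs:complexType><xs:sequence>" & _
        "<xs:element name='OrderID' type='xs:string'/>" & _
        "<xs:element name='Customer' type='xs:string'/>" & _
        "<xs:element name='ProductID' type='xs:string'/>" & _
        "</xs:sequence></xs:complexType></xs:element>" & _
        "</xs:sequence></xs:complexType></xs:element>" & _
        "</xs:schema>"

    ' Hypothetical helper: True when xmlText conforms to the schema above.
    ' Malformed XML still raises XmlException; only schema errors return False.
    Function IsValidOrders(ByVal xmlText As String) As Boolean
        Dim settings As New XmlReaderSettings()
        settings.ValidationType = ValidationType.Schema
        ' Nothing as the namespace means "use the schema's own target namespace".
        settings.Schemas.Add(Nothing, XmlReader.Create(New StringReader(OrdersXsd)))
        Try
            Using reader As XmlReader = _
                XmlReader.Create(New StringReader(xmlText), settings)
                ' Reading to the end is what drives validation.
                While reader.Read()
                End While
            End Using
            Return True
        Catch ex As XmlSchemaValidationException
            ' With no ValidationEventHandler set, the first error is thrown.
            Return False
        End Try
    End Function

    Sub Main()
        Dim good As String = "<Orders><Order><OrderID>1001</OrderID>" & _
            "<Customer>Fred</Customer><ProductID>ABC</ProductID></Order></Orders>"
        Dim bad As String = "<Orders><Order><OrderID>1001</OrderID></Order></Orders>"
        Console.WriteLine(IsValidOrders(good))
        Console.WriteLine(IsValidOrders(bad))
    End Sub
End Module
```

The second document omits Customer and ProductID, which the assumed schema requires, so it should be rejected while the first is accepted. Note that this sketch uses xs:sequence, which also fixes the order of the child elements; the generated schema above uses xsd:all, which does not.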

Working with XML documents

XML is a standard way of representing data, and an XML Document Object Model (or DOM) provides a standard way of manipulating XML. .NET supplies the XmlDocument class (in the System.Xml namespace), which implements the W3C XML Document Object Model (DOM) Level 1 and Level 2 standards. The XmlDocument class allows you to navigate an XML document as a hierarchy of XmlNode objects.

Here's some code that loops through orders.xml and prints out details of each node it encounters:

Dim doc As New Xml.XmlDocument()
Dim nodeOuter, nodeInner As Xml.XmlNode
Dim i, j As Integer

doc.Load("..\orders.xml")
' Outer loop: each Order element under the root Orders element
For i = 1 To doc.DocumentElement.ChildNodes.Count
  nodeOuter = doc.DocumentElement.ChildNodes(i - 1)
  Debug.WriteLine(nodeOuter.Name)
  ' Inner loop: the OrderID, Customer, and ProductID children
  For j = 1 To nodeOuter.ChildNodes.Count
    nodeInner = nodeOuter.ChildNodes(j - 1)
    Debug.WriteLine("  " & nodeInner.Name & _
       " " & nodeInner.InnerText)
  Next
Next

The printout it produces is as follows:

Order
  OrderID 1001
  Customer Fred
  ProductID ABC
Order
  OrderID 1002
  Customer Jenny
  ProductID DEF

As well as navigating through the entire document, it's also possible to pull out the details for specific tags. For example, the following code picks out just the Customer tags using an XmlNodeList object:

Dim doc As New Xml.XmlDocument()
Dim list As Xml.XmlNodeList
Dim i As Integer
doc.Load("..\orders.xml")
list = _
  doc.DocumentElement.GetElementsByTagName("Customer")
For i = 0 To list.Count - 1
  Debug.WriteLine(list(i).InnerText)
Next

It prints:

Fred
Jenny
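GetElementsByTagName matches on the element name alone. When the selection depends on structure or on values, the SelectNodes method of XmlNode accepts an XPath expression instead. What follows is a minimal sketch, not code from the original download: the module name, the ProductsFor helper, and the inline copy of orders.xml are assumptions made to keep the example self-contained, and it is written against the released .NET Framework rather than the beta.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module OrdersXPathSketch
    ' Inline copy of orders.xml so the sketch does not depend on a file path.
    Private Const OrdersXml As String = _
        "<Orders>" & _
        "<Order><OrderID>1001</OrderID><Customer>Fred</Customer>" & _
        "<ProductID>ABC</ProductID></Order>" & _
        "<Order><OrderID>1002</OrderID><Customer>Jenny</Customer>" & _
        "<ProductID>DEF</ProductID></Order>" & _
        "</Orders>"

    ' Hypothetical helper: the ProductID of every order placed by one customer.
    Function ProductsFor(ByVal customer As String) As List(Of String)
        Dim doc As New XmlDocument()
        doc.LoadXml(OrdersXml)

        ' The [Customer='...'] predicate keeps only matching Order elements.
        ' Naive string building: a name containing an apostrophe would break it.
        Dim xpath As String = _
            "/Orders/Order[Customer='" & customer & "']/ProductID"

        Dim result As New List(Of String)()
        For Each node As XmlNode In doc.SelectNodes(xpath)
            result.Add(node.InnerText)
        Next
        Return result
    End Function

    Sub Main()
        For Each product As String In ProductsFor("Fred")
            Console.WriteLine(product)
        Next
    End Sub
End Module
```

For the sample data, asking for Fred's products yields the single value ABC, and a customer with no orders yields an empty list.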

This is just the beginning of what you can do with XML, and if you're new to the subject, it's definitely worth setting some time aside to gain more experience working with XML DOMs. However, not even the most basic introduction to XML is complete without mentioning XSL (eXtensible Stylesheet Language)—a language that allows transformations to be applied to an XML document.

XSL transformations

XSL has been christened by some as "the new SQL." Like SQL, XSL allows you to search out specific structures in data, to produce a new set of data in a chosen format. XSL can be used to turn one XML document into another XML document with a different structure. For example, if you needed a version of my orders.xml document with all of the Customer tags removed and all of the ProductID tags renamed as ItemID tags, you could use an XSL transformation to effect this.

Another very popular type of XSL transformation converts an XML document into an HTML document that can be displayed in a browser. Take a look at the following XSL:

<xsl:stylesheet xmlns:xsl="…"> 
  <xsl:template match="Orders">
    <html><body><table border="1">
    <xsl:apply-templates select="Order"/>
    </table></body></html>
  </xsl:template>
  <xsl:template match="Order">
    <tr>
    <td><xsl:value-of select="Customer"/></td>
    <td><xsl:value-of select="ProductID"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

Without getting too bogged down in the details, note that the XSL contains two templates (all of the XSL-specific terms are prefixed with the xsl: namespace). The first one matches an Orders tag. Whenever it encounters an Orders tag in an input document, it churns out some standard HTML tags that generate a table in an output document. The second template kicks in to match each Order tag and generates one HTML row inside the table created by the first template for each order it matches.

You can write code to apply an XSL transform to an XML document using an XslTransform object in .NET, as the following code does. Although the code contains some classes that I haven't mentioned yet, the key feature is how the Transform method generates a new document from XML and XSL input files.

Dim doc As New Xml.XmlDocument()
Dim tfm As New Xsl.XslTransform()
Dim newDoc As New IO.StringWriter()
doc.Load("..\orders.xml")
tfm.Load("..\transform.xslt")
tfm.Transform(New Xml.DocumentNavigator(doc), _
   Nothing, newDoc)
Debug.WriteLine(newDoc.ToString)

Here's the output, formatted for easier reading:

<html>
  <body>
    <table border="1">
      <tr><td>Fred</td><td>ABC</td></tr>
      <tr><td>Jenny</td><td>DEF</td></tr>
    </table>
  </body>
</html>

This will appear as a straightforward HTML table if displayed in a browser. XSL makes it easy to keep data and its presentation separate, because XSL can be used to create the output HTML for a given piece of data. This is good from a design point of view, as it results in a cleaner structure. It also allows different styles of output to be generated for the same data.
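The XML-to-XML restructuring mentioned earlier (dropping Customer and renaming ProductID to ItemID) follows the same pattern. The sketch below is an assumption-laden illustration rather than code from the download: the module name, the RestructureOrders helper, and the stylesheet are my own, and it uses the XslCompiledTransform class that arrived in later releases of the framework instead of the beta XslTransform and DocumentNavigator classes shown above. The stylesheet relies on the common "identity transform" idiom, which copies everything through unchanged except where a more specific template says otherwise.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Xsl

Module OrdersRestructureSketch
    ' Assumed stylesheet: template 1 copies every node and attribute as-is,
    ' template 2 (empty) drops Customer, template 3 renames ProductID to ItemID.
    Private Const Stylesheet As String = _
        "<xsl:stylesheet version='1.0' " & _
        "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
        "<xsl:output omit-xml-declaration='yes'/>" & _
        "<xsl:template match='@*|node()'>" & _
        "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>" & _
        "</xsl:template>" & _
        "<xsl:template match='Customer'/>" & _
        "<xsl:template match='ProductID'>" & _
        "<ItemID><xsl:apply-templates select='@*|node()'/></ItemID>" & _
        "</xsl:template>" & _
        "</xsl:stylesheet>"

    ' Hypothetical helper: applies the stylesheet above to the given XML text.
    Function RestructureOrders(ByVal xmlText As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(xmlText)

        Dim tfm As New XslCompiledTransform()
        tfm.Load(XmlReader.Create(New StringReader(Stylesheet)))

        ' XmlDocument can be passed directly as the transform input.
        Dim output As New StringWriter()
        tfm.Transform(doc, Nothing, output)
        Return output.ToString()
    End Function

    Sub Main()
        Dim sample As String = "<Orders><Order><OrderID>1001</OrderID>" & _
            "<Customer>Fred</Customer><ProductID>ABC</ProductID></Order></Orders>"
        Console.WriteLine(RestructureOrders(sample))
    End Sub
End Module
```

The result should keep OrderID, contain an ItemID element holding ABC, and no longer mention Customer. Because the more specific templates take priority over the catch-all copy rule, extending the transform is a matter of adding one small template per change.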

XML and DataSets

In last month's column, I introduced ADO.NET. You saw how ADO.NET takes the disconnected Recordset approach from classic ADO forward into the .NET world. I mentioned that DataSets in ADO.NET are highly XML-oriented, and, in fact, DataSets read and write both their data and their schema as XML. It's reasonable to think of a DataSet as another form of XML DOM, one that's geared to work with tabular or relational data. While a DataSet can't work with all XML documents, it can provide specific functionality that a general-purpose XML DOM can't provide.

Both of the following code segments produce the output shown in Figure 3. The difference is that in the first segment, a schema is specifically provided, while in the second, the DataSet is required to deduce the structure of the XML from the actual XML data itself:

Dim ds As New DataSet()
ds.ReadXmlSchema("..\orders.xsd")
ds.ReadXmlData("..\orders.xml")
DataGrid1.DataSource = ds.Tables("Order")

and

Dim ds As New DataSet()
ds.ReadXml("..\orders.xml")
DataGrid1.DataSource = ds.Tables("Order")

You can now start to see just how powerful the concept of a DataSet is. It doesn't matter whether data comes from a database (via an ADODataSetCommand) or is simply loaded as XML from a file. Once it's in a DataSet, it can be manipulated and applied in a standard fashion.
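To make "manipulated in a standard fashion" concrete, here is a small sketch of working with the loaded data through the ordinary DataTable and DataRow members rather than through a DOM. It is not from the original download: the module and helper names and the inline copy of orders.xml are assumptions, and it uses the ReadXml overload that accepts a TextReader in the released framework. As in the second segment above, no schema is supplied, so the DataSet infers an Order table with OrderID, Customer, and ProductID columns.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.IO

Module OrdersDataSetSketch
    ' Inline copy of orders.xml so the sketch does not depend on a file path.
    Private Const OrdersXml As String = _
        "<Orders>" & _
        "<Order><OrderID>1001</OrderID><Customer>Fred</Customer>" & _
        "<ProductID>ABC</ProductID></Order>" & _
        "<Order><OrderID>1002</OrderID><Customer>Jenny</Customer>" & _
        "<ProductID>DEF</ProductID></Order>" & _
        "</Orders>"

    ' Hypothetical helper: loads the XML and lets the DataSet infer its schema.
    Function LoadOrders() As DataSet
        Dim ds As New DataSet()
        ds.ReadXml(New StringReader(OrdersXml))
        Return ds
    End Function

    ' Hypothetical helper: walks the rows exactly as for database-sourced data.
    Function CustomerNames() As List(Of String)
        Dim names As New List(Of String)()
        For Each row As DataRow In LoadOrders().Tables("Order").Rows
            names.Add(CStr(row("Customer")))
        Next
        Return names
    End Function

    ' Hypothetical helper: filters with a DataTable.Select expression.
    Function ProductFor(ByVal customer As String) As String
        Dim matches() As DataRow = _
            LoadOrders().Tables("Order").Select("Customer = '" & customer & "'")
        If matches.Length = 0 Then Return Nothing
        Return CStr(matches(0)("ProductID"))
    End Function

    Sub Main()
        For Each name As String In CustomerNames()
            Console.WriteLine(name & " ordered " & ProductFor(name))
        Next
    End Sub
End Module
```

Nothing in CustomerNames or ProductFor depends on the data having started life as XML, which is the point: the same code would work against a table filled from a database.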

You can also create Typed DataSets for XML manipulation simply by right-clicking on an XSD Schema in the Visual Studio .NET XSD Designer and then selecting "Generate DataSet," much as we saw last month with an ADODataSetCommand. Using either standard or Typed DataSets, a Visual Studio .NET developer can easily manipulate XML without needing to be fully conversant with an XML DOM.

Conclusion

For some of you, this month's column will have been full of rather new material, whereas for others, it will have simply provided some new twists on existing XML knowledge. XML is so important in the .NET world that it simply isn't possible to get much further without some basic XML knowledge. For example, XML is at the heart of SOAP, the Simple Object Access Protocol introduced by Cuneyt Varol in the November 2000 issue of Visual Basic Developer (see "Is COM+ Cleaner with SOAP?"). SOAP is the preferred method for communication between processes and computers in .NET, and therefore it underpins the distributed, cross-platform capability of the .NET architecture. At a far more mundane level, many of the configuration files that are used to manage Visual Studio .NET applications are stored as XML.

If this weren't enough, XML and XSL provide powerful ways to store and manipulate data in many types of applications. There's already a huge body of literature on XML, and it's bound to explode as .NET becomes a reality. [Readers might also want to subscribe to Pinnacle's monthly XML Developer newsletter, edited by Peter Vogel (www.xmldevelopernewsletter.com), and the free bi-weekly XML eXTRA e-mail newsletter that's written by Visual Basic Developer contributor Jon Kilburn (www.FREEeNewsletters.com). -Ed.]

To find out more about Visual Basic Developer and Pinnacle Publishing, visit their website at http://www.pinpub.com/

Note: This is not a Microsoft Corporation website. Microsoft is not responsible for its content.

This article is reproduced from the February 2001 issue of Visual Basic Developer. Copyright 2001, by Pinnacle Publishing, Inc., unless otherwise noted. All rights are reserved. Visual Basic Developer is an independently produced publication of Pinnacle Publishing, Inc. No part of this article may be used or reproduced in any fashion (except in brief quotations used in critical articles and reviews) without prior consent of Pinnacle Publishing, Inc. To contact Pinnacle Publishing, Inc., please call 1-800-788-1900.