Extra Chunky XML, in Client-Size Servings


Charlie Heinemann
Program Manager, XML
Microsoft Corporation



Using XSL to Filter Data on the Server
Taking the Hit up Front
Getting at Schema Information through the DOM
In Short

Way back in December, I let you in on the wonders of ID and IDREF. In the article "Cross-Reference Your XML Data," I showed you how to use ID and IDREF declarations within a schema to cross-reference your data and quickly search it for the information you need. This month, I'd like to take this one step further. Using the same data, I'll show you how to use IDs and IDREFs within XSL to break your data into more manageable chunks. I'll also show you how to navigate from an XML node to the schema that defines it in order to gather pertinent information about that node.

Using XSL to Filter Data on the Server

The December article contained some XML data concerning classes, teachers, and students. Although that XML document was not especially long, it very well could have been. Large documents are cumbersome and can take a long time to download. You may have to take this hit if the client needs all of the information in the document. Much of the time, however, the data being viewed on the client is a small fraction of the entire data set.

For instance, take the following XML:

<schedule xmlns="x-schema:schedSchema.xml">
    <class code="ENGL6004" title="From Here to Eternity: Studies in the Future and Other Temporal Genres"
      units="4" taughtBy="T31330" attendedBy="S50245 S87901 S19272 S48984"/>
    <class code="HIST6010" title="The You Decade: A History of Finger Pointing in Post-War America"
      units="4" taughtBy="T72100" attendedBy="S60912 S87901 S84281 S44098"/>
    <class code="ENGL6020" title="Reading Between the Lines: The Literature of Waiting"
      units="4" taughtBy="T31330 T72100" attendedBy="S84281 S19272 S48984 S44098"/>
    <teacher id="T31330" name="Margaret Doornan" position="Associate_Professor"/>
    <teacher id="T72100" name="Hal Canter" position="Instructor"/>
    <student id="S44098" name="Kelly Griftman" year="Senior" status="full-time"/>
    <student id="S48984" name="Norbert James" year="Senior" status="full-time"/>
    <student id="S19272" name="Mitch Milton" year="Junior" status="full-time"/>
    <student id="S84281" name="Jasmine Green" year="Senior" status="full-time"/>
    <student id="S87901" name="John Atterly" year="Senior" status="full-time"/>
    <student id="S60912" name="Ellen Carson" year="Junior" status="part-time"/>
    <student id="S50245" name="Maggie Trudeau" year="Junior" status="part-time"/>
</schedule>

This document contains the data for only three classes, two teachers, and seven students. In a real scenario, these numbers would probably be much larger. To serve the client only the information the user needs, we must divide the document into chunks. The Web application determines the shape of these chunks, and XSL creates them.

I quickly put together a Class Viewer Web page, which allows the user to choose from the list of classes and then view the information concerning the chosen class (number of units, teacher(s), and students). Most likely, the user will not want to sift through all the classes. Therefore, it makes sense to break the data down into chunks, each containing the information for one class, and ship only the chunk for the chosen class down to the client. The following is an example of what this chunk might look like:

<class xmlns="x-schema:http://cheinw98/extreme04/classSchema.xml" code="HIST6010"
  title="The You Decade: A History of Finger Pointing in Post-War America"
  units="4" taughtBy="T72100"
  attendedBy="S60912 S87901 S84281 S44098">
  <teachers>
    <teacher id="T72100" name="Hal Canter" position="Instructor"/>
  </teachers>
  <students>
    <student id="S60912" name="Ellen Carson" year="Junior" status="part-time"/>
    <student id="S87901" name="John Atterly" year="Senior" status="full-time"/>
    <student id="S84281" name="Jasmine Green" year="Senior" status="full-time"/>
    <student id="S44098" name="Kelly Griftman" year="Senior" status="full-time"/>
  </students>
</class>

The key to creating the above document from the initial one through XSL is the id() method within a query. It is possible to pass the id() method an IDREF string (such as "S60912 S87901 S84281 S44098") and have it return a list of nodes that possess the corresponding IDs. In the initial document, the "taughtBy" and "attendedBy" attributes on the "class" elements are IDREFs. This means that they will be treated as a series of references to specific nodes within the document. The values of the "taughtBy" attribute refer to specific teacher nodes and the values of the "attendedBy" attribute refer to specific student nodes. Because each class has direct references to the teacher(s) and students for that class, we can use XSL and the id() method to create a document that contains the "class" information and the "teacher" and "student" information for that specific class.
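To make the lookup concrete, here is a minimal sketch in plain JavaScript of the kind of resolution id() performs. It is not the parser's implementation: `nodesById` and `resolveIdrefs` are hypothetical names introduced for illustration, and plain objects stand in for the DOM nodes. The sketch returns nodes in the order the references appear; the order a real XSL processor uses may differ.

```javascript
// Hypothetical ID index: maps each ID value to the node that carries it.
// Plain objects stand in for the teacher and student elements.
const nodesById = new Map([
  ["T72100", { tag: "teacher", name: "Hal Canter" }],
  ["S60912", { tag: "student", name: "Ellen Carson" }],
  ["S87901", { tag: "student", name: "John Atterly" }],
  ["S84281", { tag: "student", name: "Jasmine Green" }],
  ["S44098", { tag: "student", name: "Kelly Griftman" }]
]);

// Split a whitespace-separated IDREF string and look up each ID.
// References with no matching node are skipped.
function resolveIdrefs(idrefs, index) {
  return idrefs
    .trim()
    .split(/\s+/)
    .filter(function (id) { return index.has(id); })
    .map(function (id) { return index.get(id); });
}

const attendees = resolveIdrefs("S60912 S87901 S84281 S44098", nodesById);
console.log(attendees.map(function (s) { return s.name; }).join(", "));
// prints "Ellen Carson, John Atterly, Jasmine Green, Kelly Griftman"
```

The point of the index is that each reference costs one lookup, so the work grows with the number of references on the class rather than with the size of the whole schedule.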

The id() method is used within the XSL document to select the nodes that will be processed in the <xsl:for-each> elements:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="//">
    <xsl:element name="class">
      <xsl:attribute name="code"><xsl:value-of select="@code"/></xsl:attribute>
      <xsl:attribute name="title"><xsl:value-of select="@title"/></xsl:attribute>
      <xsl:attribute name="units"><xsl:value-of select="@units"/></xsl:attribute>
      <xsl:attribute name="taughtBy"><xsl:value-of select="@taughtBy"/></xsl:attribute>
      <xsl:attribute name="attendedBy"><xsl:value-of select="@attendedBy"/></xsl:attribute>
      <xsl:attribute name="xmlns">x-schema:http://cheinw98/extreme04/classSchema.xml</xsl:attribute>
      <xsl:element name="teachers">
        <xsl:for-each select="id(@taughtBy)">
          <xsl:element name="teacher">
            <xsl:attribute name="id"><xsl:value-of select="@id"/></xsl:attribute>
            <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
            <xsl:attribute name="position"><xsl:value-of select="@position"/></xsl:attribute>
          </xsl:element>
        </xsl:for-each>
      </xsl:element>
      <xsl:element name="students">
        <xsl:for-each select="id(@attendedBy)">
          <xsl:element name="student">
            <xsl:attribute name="id"><xsl:value-of select="@id"/></xsl:attribute>
            <xsl:attribute name="name"><xsl:value-of select="@name"/></xsl:attribute>
            <xsl:attribute name="year"><xsl:value-of select="@year"/></xsl:attribute>
            <xsl:attribute name="status"><xsl:value-of select="@status"/></xsl:attribute>
          </xsl:element>
        </xsl:for-each>
      </xsl:element>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>

Although the above style sheet looks rather complicated, it is actually fairly simple. It creates a replica of the "class" node to which the style sheet was applied, tacking on a schema declaration (to be used later). The "teacher" nodes whose IDs match the IDREFs in the "taughtBy" attribute's value are then copied to the output. The same is done with the "student" nodes whose IDs match the IDREFs in the "attendedBy" attribute's value.
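For readers who think more easily in script than in XSL, the same shape of output can be sketched in plain JavaScript. This is only an illustration of what the style sheet produces, not a replacement for it: `escapeAttr`, `element`, `copyReferenced`, `buildClassChunk`, and the `peopleById` index are all hypothetical names, the data is a cut-down copy of the schedule, and the title and schema declaration are left out for brevity.

```javascript
// Escape the characters that are unsafe inside a double-quoted XML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Serialize one element from its attributes and already-serialized children.
function element(name, attrs, children) {
  const attrText = Object.keys(attrs).map(function (key) {
    return " " + key + '="' + escapeAttr(attrs[key]) + '"';
  }).join("");
  if (children.length === 0) {
    return "<" + name + attrText + "/>";
  }
  return "<" + name + attrText + ">" + children.join("") + "</" + name + ">";
}

// Copy every node referenced by an IDREF string, as the for-each loops do.
function copyReferenced(elementName, idrefs, index) {
  return idrefs.trim().split(/\s+/)
    .filter(function (id) { return index.has(id); })
    .map(function (id) { return element(elementName, index.get(id), []); });
}

// Replicate the class node, then append its teachers and students.
function buildClassChunk(classAttrs, index) {
  return element("class", classAttrs, [
    element("teachers", {}, copyReferenced("teacher", classAttrs.taughtBy, index)),
    element("students", {}, copyReferenced("student", classAttrs.attendedBy, index))
  ]);
}

// Hypothetical index of teacher and student attributes, keyed by ID.
const peopleById = new Map([
  ["T72100", { id: "T72100", name: "Hal Canter", position: "Instructor" }],
  ["S60912", { id: "S60912", name: "Ellen Carson", year: "Junior", status: "part-time" }],
  ["S87901", { id: "S87901", name: "John Atterly", year: "Senior", status: "full-time" }]
]);

const chunk = buildClassChunk(
  { code: "HIST6010", units: "4", taughtBy: "T72100", attendedBy: "S60912 S87901" },
  peopleById
);
console.log(chunk);
```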

Taking the Hit up Front

We can absorb the time it takes to load the large XML document and the XSL style sheet by loading both into global variables while the UI is being downloaded. The global variables are set in the application's global.asa file:

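<SCRIPT LANGUAGE="Javascript" RUNAT="Server">
function Application_OnStart()
{
        // Free-threaded documents can be shared across requests
        var xmldocSession = new ActiveXObject("Microsoft.FreeThreadedXMLDOM");
        var xsldocSession = new ActiveXObject("Microsoft.FreeThreadedXMLDOM");
        // (The schedule XML and the XSL style sheet are loaded into these two documents here.)
        // Store both documents at application scope
        Application("classesDoc") = xmldocSession;
        Application("classXSL") = xsldocSession;
}
</SCRIPT>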

With the large XML document and the XSL style sheet loaded in global variables, we can now load only the processed XML chunks on the client. This is done using an Active Server Pages (ASP) file:

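<%@ LANGUAGE="JavaScript" %>
<%
  // The class code chosen on the client
  var idCode = Request.QueryString("code");
  // The documents stored by Application_OnStart
  var schedule = Application("classesDoc");
  var classSS = Application("classXSL");

  // Find the class node whose ID matches the requested code
  var indexClass = schedule.nodeFromID(idCode);
  // Apply the style sheet to that node only; the result is the chunk
  var newDoc = indexClass.transformNode(classSS.documentElement);

  Response.ContentType = "text/xml";
  Response.Write(newDoc);
%>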
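One thing the page above does not handle is a request for a code that matches no class, in which case nodeFromID returns null and the call to transformNode fails. A minimal sketch of a guard follows. The `getClassChunk` function is a hypothetical helper, and `fakeSchedule` and `fakeStyleSheet` are stand-ins that mimic only the two DOM members used here, so the logic can be exercised outside ASP.

```javascript
// Return the transformed chunk for a class code, or null when the code is
// missing or no node in the schedule carries that ID.
function getClassChunk(schedule, styleSheet, code) {
  if (!code) {
    return null;
  }
  const classNode = schedule.nodeFromID(String(code));
  if (classNode === null) {
    return null;
  }
  return classNode.transformNode(styleSheet.documentElement);
}

// Stand-in objects (assumptions for this sketch, not the real DOM).
const fakeSchedule = {
  nodeFromID: function (id) {
    if (id !== "HIST6010") {
      return null;
    }
    return {
      transformNode: function (styleSheetRoot) {
        return '<class code="HIST6010"/>';
      }
    };
  }
};
const fakeStyleSheet = { documentElement: {} };

console.log(getClassChunk(fakeSchedule, fakeStyleSheet, "HIST6010")); // prints <class code="HIST6010"/>
console.log(getClassChunk(fakeSchedule, fakeStyleSheet, "NOPE"));     // prints null
```

In the ASP page, a null result could be turned into an empty response or an error status, whichever the client code is prepared to handle.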

The XML returned by the above ASP code is loaded on the client using the load method:

classDoc.load("getClass.asp?code=" + code);

The "code" variable is set when the user chooses a class. The value of this variable corresponds to the ID of the desired class node from the complete class list.
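The class codes in this sample are plain letters and digits, so simple concatenation is safe. If codes could ever contain spaces or punctuation, the value should be encoded before it goes into the query string. A small sketch, assuming a JavaScript engine that provides encodeURIComponent; `chunkUrl` is a hypothetical helper name:

```javascript
// Build the request URL for one class chunk. encodeURIComponent guards
// against characters in the code that would otherwise alter the query string.
function chunkUrl(code) {
  return "getClass.asp?code=" + encodeURIComponent(code);
}

console.log(chunkUrl("HIST6010")); // prints getClass.asp?code=HIST6010
console.log(chunkUrl("A B&C"));    // prints getClass.asp?code=A%20B%26C
```

The call on the client would then read classDoc.load(chunkUrl(code)).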

Before we move on to the client side of things, I'd like to note that this method of retrieving the XML data also allows you to work around some of the security restrictions on the client. With the above method, the application has access to data across domains and protocols (provided that the hosting server has access to the directory in which the data is held).

Getting at Schema Information through the DOM

In the future, I would like to expand my Class Viewer Web application to allow the user to update the information concerning the class. Take, for instance, the "units" information. I'd like users to be able to enter a new amount and have it update the specific "units" field. This is easy enough to do. However, because the "units" attribute is typed as an enumeration, only certain values are allowed. This means that I must tell my user what those values are. The best and most robust way to do this is to have the schema determine the acceptable values.

If I navigate to the "units" attribute on the class node returned from the above ASP code, I can use the "definition" property to return the schema definition of that node:

var schemaDef = unitsAtt.definition;

The variable "schemaDef" holds the parsed node representing the schema definition for that node. Calling the xml property on "schemaDef" returns the following:

<AttributeType name="units" dt:type="enumeration" dt:values=".5 1 2 3 4"/>

Now that I can navigate to the schema definition, I can easily get the value of the "dt:values" attribute and parse it:

// split() returns a new array, so no separate allocation is needed
var enumArray = schemaDef.getAttribute("dt:values").split(" ");

In my Class Viewer Web application, I can then use the values of "enumArray" to populate the select box from which the user will choose the new value.
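As a rough sketch of that last step, the following turns a "dt:values" string into option markup and checks a candidate value against the enumeration. `parseEnumValues`, `buildOptionsHtml`, and `isAllowed` are hypothetical helper names, and building the markup as a string is just one way to fill a select box.

```javascript
// Split a dt:values string such as ".5 1 2 3 4" into its allowed values.
function parseEnumValues(dtValues) {
  return dtValues.trim().split(/\s+/).filter(function (v) { return v !== ""; });
}

// Build <option> markup for a select box, marking the current value selected.
// Enumeration values are simple tokens, so they are not escaped here.
function buildOptionsHtml(values, current) {
  return values.map(function (v) {
    const selected = v === current ? " selected" : "";
    return '<option value="' + v + '"' + selected + ">" + v + "</option>";
  }).join("");
}

// Check a candidate value against the enumeration before updating the document.
function isAllowed(values, candidate) {
  return values.indexOf(candidate) !== -1;
}

const allowedUnits = parseEnumValues(".5 1 2 3 4");
console.log(buildOptionsHtml(allowedUnits, "4"));
console.log(isAllowed(allowedUnits, "3")); // prints true
console.log(isAllowed(allowedUnits, "5")); // prints false
```

Because the list comes from the schema, a change to the allowed values there shows up in the UI without any change to the page.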

In Short

Processing your XML on the server gives you some advantages. First, you can load large files while you load your UI, and then break those large files into chunks to be delivered to the client. Second, you can load the data based on the security context of the Web application and not on the security context of the client.

With the definition property, you can get schema information through the document object model (DOM), allowing you to customize your UI based on the information in the schema.

Charlie Heinemann is a program manager for Microsoft's Weblications team. Coming from Texas, he knows how to think big.