Netting C++

Configuration with XML

Stanley B. Lippman

Contents

XmlReader: the Firehose
Document Object Model

This is the second column in a series that explores using C++/CLI as a pure Microsoft® .NET Framework-compliant language rather than as a transitional bridge in order to bring native C++ code into the managed environment. My application, EEK!, as described in the last column, is a simulation of how a mouse might behave in an experimental environment over some fixed length of time. If the premise of the application leaves you disquieted, you can think of it instead as the logic programming of Non Player Characters (NPCs) in an online game. The task in this column is to initialize the virtual world by reading in a world configuration file stored as XML. To do this, I will use the .NET XML namespace.

Just one note before I start. In the initial EEK! column, I indicated that I would attempt to simulate the actual fish tank environment in which I kept mice while I observed their behaviors. But reality, I discovered, offers a rather tedious constraint to a virtual world; it just didn’t make for an interesting simulation, and so I’ve removed that constraint from the application. Instead, the environment is simply a terrain limited only by one’s imagination. I need it big and complex enough to make it a challenge both for the mouse to explore and for me to program. I apologize for changing the specification, but if you’re a software professional, you should be somewhat used to that by now.

Here’s what I have to do. First, I need to lay out the virtual world in which to run the simulation. In a game studio, this is done with some GUI-driven world-building tool. When the designer presses Save, the geometry and associated state values of everything that has been placed in the world are persisted into an XML file. Second, I need to be able to read the XML file and recreate that virtual world. I’ll call the file EEKWorld.xml and write it to a default directory. If I can poke some of my more artistic colleagues, maybe we can produce a simple EEK!-builder as well. For this column, we’ll just assume it exists.

So I’ll read in the generated XML file and store it within a hierarchy of class abstractions that represent each entity or relationship between entities within the world. I will look at those classes in the next column, and I’ll also cover in detail the nature of the EEK! world and how I have designed its internal representation. That’s the fun part, but first I have to get the XML itself into the application before I can make sense of it.

So, in the rest of this column, I’ll review the two primary strategies for parsing an XML file—the firehose and the Document Object Model (DOM) models. Firehose is meant to suggest a one-way stream of data; you can’t retrieve the data once it flows by. The DOM is a traversable in-memory representation of the XML file. My actual implementation will use the DOM, but it’s important to be aware that both are available.

XmlReader: the Firehose

One way to process an XML document is to read it one node at a time. In other words, I grab the node, process it, maintain the necessary state context, and proceed to the next node. I do that under .NET with the abstract XmlReader class (which has several non-abstract derived types, such as XmlTextReader). This allows me to skip nodes that are not of interest to me. The XmlReader is defined within the System.Xml namespace. Figure 1 represents the general code engine to pump through each node.

Figure 1 General Code Engine

using namespace System;       // for String, Console
using namespace System::Xml;  // for XML classes and enums
using namespace System::IO;   // for FileStream

int main()
{
    String^ fname = "EEKWorld.xml";

    FileStream^ fs = File::OpenRead( fname );
    XmlReader^ xrd = XmlReader::Create( fs );

    while ( xrd->Read() )
    {
        // process each node ...
    }
 
    xrd->Close();
    fs->Close();
}

The FileStream object reads the actual text; the XmlReader provides the smarts to understand and structure what it all means. Each invocation of Read pulls the next node from the FileStream and returns false once it reaches the end of the stream. I access the content of the current node through the reader object itself. Some of the properties available for the current node include Name, NodeType, HasValue, Value, HasAttributes, AttributeCount, and NamespaceURI. Figure 2 illustrates how I might program that; specifically, it fills in the while loop of Figure 1.

Figure 2 Pumping through the Nodes

while ( xrd->Read() )
{
    Console::WriteLine( "node type is {0}", xrd->NodeType );

    if ( xrd->Name != String::Empty )
        Console::WriteLine( "\tname is {0}", xrd->Name );
    else Console::WriteLine( "\tThis node has no name" );

    if ( xrd->HasValue )
        Console::WriteLine( "\tvalue is {0}", xrd->Value );

    if ( xrd->NodeType == XmlNodeType::Element &&
         xrd->HasAttributes )
    {
        Console::Write( "\thas {0} attributes: ", xrd->AttributeCount );

        while ( xrd->MoveToNextAttribute() )
            Console::Write( "{0} = {1} ", xrd->Name, xrd->Value );

        Console::WriteLine();
    }
    else Console::WriteLine( "\thas no attributes" );
}

To discover the type of the current node, just examine the NodeType property. For example:

if ( xrd->NodeType == XmlNodeType::Whitespace ) continue;

The XmlNodeType enum serves as a kind of isA flag to identify the type of the current node. For example:

enum class XmlNodeType{ None, Element, Attribute, ... };

Note that attributes are not treated as children of an element. Rather, I must explicitly access them. MoveToAttribute moves to a specified attribute, using one of the following overloaded instances:

// move to the attribute with the specified index 
void MoveToAttribute( int index ) 

// move to the attribute with the specified name
bool MoveToAttribute( String^ name ) 

// move to the attribute with these characteristics
bool MoveToAttribute( String^ localName, String^ namespaceURI ) 

To iterate across all the attributes of an element, I can also use the MoveToNextAttribute method. If the current node is an element node, this method moves to the first attribute. Each subsequent invocation moves to the next attribute in turn until there are no more attributes. To get back to the Element node containing the attributes, invoke MoveToElement. So although this represents a forward-only, non-cached reading of the document, it affords some flexibility.
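If you would like to see the pull model in action without setting up a managed project, Python's standard library offers a close analogue in xml.dom.pulldom. The sketch below pumps through a tiny hand-written document, collecting each element's name and attributes in document order; the element and attribute names here are invented for illustration and are not part of the EEK! configuration.

```python
from io import StringIO
from xml.dom import pulldom

# A tiny hand-written document standing in for a world file (illustrative only).
xml_text = '<world><tile x="1" y="2"/><tile x="3" y="4"/></world>'

seen = []  # (tag, attribute-dict) for each start-element, in document order
for event, node in pulldom.parse(StringIO(xml_text)):
    if event == pulldom.START_ELEMENT:
        # NamedNodeMap -> plain dict of attribute name/value pairs
        attrs = {a.name: a.value for a in node.attributes.values()}
        seen.append((node.tagName, attrs))
        print(node.tagName, attrs)
```

As with XmlReader, the stream is forward-only: once an event has been consumed, it is gone, and any state you need later must be saved off as you go.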

Even though it’s efficient, I don’t personally like using the firehose model. It has two primary drawbacks. First, there is no way to move backwards, so anything you think you might need later has to be extracted and stored locally. Second, this switch-on-a-node-type-and-process coding style is essentially typeless in terms of the individual nodes; everything is accessed through the one reader object. My preferred model is the DOM, in which the entire document is stored in memory as a hierarchy of navigable, typed class nodes representing the node types supported by XML.

That said, of course, reading the entire document into memory prior to processing is not always the best solution, especially if the document is humongous or you are constrained on available memory. Thus, the firehose model is sometimes the right choice.
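To make the memory argument concrete, here is a minimal streaming sketch in Python, with ElementTree's iterparse standing in for XmlReader. Each element is discarded with clear() as soon as it has been processed, so the whole document never needs to reside in memory at once; the document and element names are invented for the example.

```python
from io import StringIO
import xml.etree.ElementTree as ET

# An invented stand-in for a very large configuration file.
xml_text = '<world>' + '<tile kind="grass"/>' * 1000 + '</world>'

count = 0
for event, elem in ET.iterparse(StringIO(xml_text), events=('end',)):
    if elem.tag == 'tile':
        count += 1
        elem.clear()  # drop the element's content so memory stays bounded
print(count)  # -> 1000
```

A DOM parse of the same file would materialize all one thousand tile nodes at once; the streaming loop holds only the element currently being processed.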

Document Object Model

Under DOM, I still use a FileStream and XmlReader pair to read and structure the XML. The difference is that I do not directly manipulate the XmlReader object. Rather, I pass the XmlReader object to the Load method of the XmlDocument class. Load builds the in-memory tree representation of the document, which I can navigate. The individual node types are represented by the XmlNode class hierarchy, which contains such derived classes as XmlElement and XmlAttribute. Figure 3 shows an example of setting up the XML file in memory.

Figure 3 Setting Up the XML File in Memory

using namespace System;
using namespace System::Xml; 
using namespace System::IO;

int main()
{
    String^ fname = "EEKWorld.xml";
    FileStream^ fs = File::OpenRead( fname );
    XmlReader^ xrd = XmlReader::Create( fs );

    // build the in-memory representation
    XmlDocument^ xd = gcnew XmlDocument;
    xd->Load( xrd );

    ProcessDom(xd); // work with the DOM

    xrd->Close();
    fs->Close();
}

The XmlDocument class provides a set of properties for querying the in-memory representation of the document. For example, Figure 4 illustrates the ProcessDom function invoked from the main function in Figure 3. The interior nodes of the document are each represented by an individual object, all of which are rooted in the abstract XmlNode base class. In fact, the XmlDocument class itself inherits from XmlNode.

Figure 4 ProcessDom

void ProcessDom( XmlDocument^ xd )
{
    Console::WriteLine( "XmlDocument {0} :: {1}", xd->Name, xd->Value );

    // Attributes is null for non-element nodes such as the document itself
    XmlAttributeCollection^ xac = xd->Attributes;
    if ( xac != nullptr && xac->Count != 0 )
    {
        // process attributes ...
    }

    Console::WriteLine( "Retrieving the {0} XmlDocument Children\n",
        xd->ChildNodes->Count );

    XmlNodeList^ children = xd->ChildNodes;
    for each ( XmlNode^ node in children )
    {
        Console::WriteLine( "Child node: {0} of type {1}",
            node->Name, node->NodeType );
        Console::WriteLine( "Child has children? {0} :: Node's parent: {1}",
            node->HasChildNodes, node->ParentNode->Name );
    }
}

Each of the node types is represented by a class derived from the abstract XmlNode base class: XmlElement represents an element, XmlAttribute an attribute, XmlComment a comment, and so on. The ChildNodes property returns the node’s children maintained within an XmlNodeList object.

The GetElementsByTagName method allows me to retrieve all descendant elements matching a tag name, not just immediate children. In the code in Figure 5, for example, its invocation retrieves all the tile elements within the world configuration file, listed in the order they occur within the XML document.

Figure 5 Processing Tile Elements

// root node of the document
XmlElement^ xelem = xd->DocumentElement;

// returns a list of all elements with a tag name of tile
XmlNodeList^ tiles = xelem->GetElementsByTagName( "tile" );
Console::WriteLine( "there are {0} tiles to process: ", tiles->Count );

// now let's process each node ...
for each ( XmlNode^ tile in tiles )
    ProcessTile( static_cast<XmlElement^>( tile ));
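The same lookup pattern exists in other DOM implementations. As a point of comparison, here is a minimal sketch using Python's minidom over an invented stand-in document; note that getElementsByTagName, like its .NET counterpart, returns all matching descendants in document order, not just immediate children.

```python
from xml.dom import minidom

# An invented miniature world file; 'region' nests one tile a level deeper.
xml_text = '<world><tile x="1"/><region><tile x="2"/></region></world>'

doc = minidom.parseString(xml_text)
root = doc.documentElement                  # analogous to XmlDocument::DocumentElement
tiles = root.getElementsByTagName('tile')   # all descendant 'tile' elements, document order
xs = [t.getAttribute('x') for t in tiles]
print(len(tiles), xs)  # -> 2 ['1', '2']
```

The nested tile inside region shows up in the result list even though it is not a direct child of the root, which is exactly the behavior to expect from the .NET method as well.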

C++/CLI is the only .NET-compliant language that provides a compile-time cast through its static_cast operator; every other downcast from a base class is performed dynamically at run time. Within C++/CLI, that includes the C-style cast, the safe_cast, and the dynamic_cast applied to a base-class handle. The safe_cast is the preferred cast operator for C++/CLI; it does what’s necessary and avoids a runtime check whenever possible. But to suppress the runtime check on a base-to-derived cast, you must explicitly use the static_cast operator.

Why do you want to avoid a runtime cast? Isn’t it dangerous not to? You do it for performance when you are absolutely certain from the semantics of the program that the base class you are manipulating is definitely the derived type you need to manipulate. In this case, I have an XmlNode and I need an XmlElement, and I know that is what the XmlNode really is.

Why do I have an XmlNode instead? I suppose it is because the designers of the XML namespace did not wish to provide an XmlElementList, an XmlDocumentList, and so on, and generics were not yet available. Since all these different XML types derive from XmlNode, they can all be contained by a list of XmlNodes. That would seem to be the only reason to receive an inappropriate node type and have to perform a downcast before I can access the correct interface. I think of it as the use tax of an object-oriented design.

To offset that sort of unnecessary performance penalty, we decided within the group working on C++/CLI (and it was a broad group effort) to provide one compile-time hook: the static_cast operator. It was something we all felt was needed in writing real-world code. It just seems naive to think that sort of unnecessary runtime overhead isn’t going to impact performance adversely. The counter argument is that I’m exhibiting a native mindset in believing that, and this is not a significant overhead within a managed environment. That’s the argument of those who think we erred in providing a compile-time cast operator. Ultimately, we don’t really know if it was the right decision; it seems to go against the .NET philosophy, but we did it and it’s there. And since it’s there, I’m going to take advantage of it.

In the next column, I’ll look at the transformation of the XML file being parsed into a collection of C++/CLI classes that represents the world of the simulation. And thus I’ll arrive at the heart of the application.

Send your questions and comments for Stanley to  purecpp@microsoft.com.

Stanley B. Lippman currently works for Perpetual Entertainment, a company specializing in the development of massively multiplayer online games and the infrastructural technology to develop and support them. Stan is best known for working on the original C++ cfront compiler with its creator, Bjarne Stroustrup. His all-time favorite job was Software Technical Director on the Firebird segment of Fantasia 2000. He also worked with the Visual C++ team at Microsoft in the invention of C++/CLI.