Using the Open XML SDK 2.0 Classes Versus Using .NET XML Services

This content is outdated and is no longer being maintained. It is provided as a courtesy for individuals who are still using these technologies. This page may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

Summary: Learn about some of the benefits of using the typesafe classes in the Open XML Software Development Kit 2.0 for Microsoft Office. In addition, compare code that manipulates document content by using the SDK classes with code that manipulates the same content by using .NET XML services. (13 printed pages)

Office Visual How To

Applies to: 2007 Microsoft Office System, Microsoft Office Word 2007, Open XML SDK 2.0 for Microsoft Office, Microsoft Visual Studio 2008

Joel Krist, iSoftStone

August 2009

Overview

The Open XML Software Development Kit 2.0 for Microsoft Office makes it possible to create and manipulate Microsoft Office Word 2007, Microsoft Office Excel 2007, and Microsoft Office PowerPoint 2007 documents programmatically via the Open XML formats. The typesafe classes included with the SDK provide a layer of abstraction between the developer and the Open XML formats. This simplifies working with Office 2007 documents and enables solutions that do not depend on the Office client applications to create documents.

This visual how-to article discusses some of the benefits of using the Open XML SDK 2.0, and provides sample code that illustrates the differences between using the SDK classes and using the .NET XML services to add a header to a Word 2007 document.


Code It


This visual how-to article includes sample code that creates a Windows console application that adds or replaces a header in an existing Word 2007 document. The code shows two approaches: one that uses the classes in the Open XML SDK 2.0 for Microsoft Office, and one that uses .NET XML services.

This section walks through the following steps:

  1. Creating a Windows console application solution in Visual Studio 2008.

  2. Adding references to the DocumentFormat.OpenXml and WindowsBase assemblies.

  3. Adding the sample code to the solution.

Creating a Windows Console Application in Visual Studio 2008

This visual how-to article uses a Windows console application to provide the framework for the sample code. However, you could use the same approach that is illustrated here with other application types as well.

To create a Windows console application in Visual Studio 2008

  1. Start Microsoft Visual Studio 2008.

  2. On the File menu, point to New, and then click Project.

  3. In the New Project dialog box, select the Visual C# Windows type in the Project types pane.

  4. Select Console Application in the Templates pane, and then name the project ReplaceHeader.

    Figure 1. Create new solution in the New Project dialog box


  5. Click OK to create the solution.

Adding References to the DocumentFormat.OpenXml and WindowsBase Assemblies

The sample code uses the classes and enumerations that are in the DocumentFormat.OpenXml.dll assembly that is installed with the Open XML SDK 2.0 for Microsoft Office. To add the reference to the assembly in the following steps or to build the sample code that accompanies this visual how-to, you must first download and install the Open XML SDK 2.0 for Microsoft Office so that the assembly is available.

To add references to the DocumentFormat.OpenXml and WindowsBase assemblies

  1. Add a reference to the DocumentFormat.OpenXml assembly by doing the following:

    1. On the Project menu in Visual Studio, click Add Reference to open the Add Reference dialog box.

    2. Select the .NET tab, scroll down to DocumentFormat.OpenXml, select it, and then click OK.

      Figure 2. Add Reference to DocumentFormat.OpenXml


  2. The classes in the DocumentFormat.OpenXml assembly use the System.IO.Packaging.Package class that is defined in the WindowsBase assembly. Add a reference to the WindowsBase assembly by doing the following:

    1. On the Project menu in Visual Studio, click Add Reference to open the Add Reference dialog box.

    2. Select the .NET tab, scroll down to WindowsBase, select it, and then click OK.

      Figure 3. Add Reference to WindowsBase


Adding the Sample Code to the Solution

Replace the entire contents of the Program.cs source file with the following code.

using System.IO;
using System.Xml;
using DocumentFormat.OpenXml.Wordprocessing;
using DocumentFormat.OpenXml.Packaging;

class Program
{
  static void Main(string[] args)
  {
    string docName = @"C:\Temp\ReplaceHeader.docx";

    // The AddHeaderViaXMLServices and AddHeaderViaSDKClasses
    // methods both accomplish the same thing: they remove all
    // existing headers and then add a new header to all
    // sections in a document.
    //
    // The AddHeaderViaXMLServices method uses .NET XML services
    // (DOM, XPath, namespace management) to accomplish this
    // while the AddHeaderViaSDKClasses makes use of the Open
    // XML Format SDK 2.0 classes.
    //
    // The two methods are provided so that you can compare the
    // approaches, but you should only call one method at a time.
    // Switch the method call that is commented out to change
    // the approach used.

    AddHeaderViaSDKClasses(docName);
    // AddHeaderViaXMLServices(docName);
  }

  public static void AddHeaderViaSDKClasses(string docName)
  {
    // Declare a string for the header text.
    string newHeaderText =
      "New header via Open XML Format SDK 2.0 classes";

    // Open the document for reading and writing.
    using (WordprocessingDocument wdDoc =
      WordprocessingDocument.Open(docName, true))
    {
      // Get the main document part.
      MainDocumentPart mainDocPart = wdDoc.MainDocumentPart;

      // Delete the existing header parts.
      mainDocPart.DeleteParts(mainDocPart.HeaderParts);

      // Create a new header part and get its relationship id.
      HeaderPart newHeaderPart = mainDocPart.AddNewPart<HeaderPart>();
      string rId = mainDocPart.GetIdOfPart(newHeaderPart);

      // Call the GeneratePageHeaderPart helper method, passing in
      // the header text, to create the header markup and then save
      // that markup to the header part.
      GeneratePageHeaderPart(newHeaderText).Save(newHeaderPart);     

      // Loop through all section properties in the document
      // which is where header references are defined.
      foreach (SectionProperties sectProperties in
        mainDocPart.Document.Descendants<SectionProperties>())
      {
        // Delete any existing references to headers. RemoveAllChildren
        // avoids removing elements while enumerating them.
        sectProperties.RemoveAllChildren<HeaderReference>();

        // Create a new header reference that points to the new header
        // part and insert it first, because the schema expects header
        // references to precede the other section properties.
        HeaderReference newHeaderReference =
          new HeaderReference()
          { Id = rId, Type = HeaderFooterValues.Default };
        sectProperties.PrependChild(newHeaderReference);
      }

      //  Save the changes to the main document part.
      mainDocPart.Document.Save();
    }
  }

  private static Header GeneratePageHeaderPart(string HeaderText)
  {
    var element =
      new Header(
        new Paragraph(
          new ParagraphProperties(
            new ParagraphStyleId() { Val = "Header" }),
          new Run(
            new Text(HeaderText))
        )
      );

    return element;
  }

  public static void AddHeaderViaXMLServices(string docName)
  {
    // The following variable is used simply to help with line
    // wrap when the code is posted on MSDN. Without it the
    // Xml namespace strings are too long to avoid wrapping.
    string schemaUri = @"http://schemas.openxmlformats.org/";

    string newHeaderContent =
    @"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
    <w:hdr xmlns:w=""" + schemaUri + @"wordprocessingml/2006/main""> 
      <w:p>
        <w:pPr>
          <w:pStyle w:val=""Header""/>
        </w:pPr>
        <w:r>
          <w:t>New header via .NET XML services</w:t>
        </w:r>
      </w:p>
    </w:hdr>";

    string wordmlNamespace =
      schemaUri + @"wordprocessingml/2006/main";
    string relationshipNamespace =
      schemaUri + @"officeDocument/2006/relationships";

    // Open the document for reading and writing.
    using (WordprocessingDocument wdDoc =
      WordprocessingDocument.Open(docName, true))
    {
      // Get the main document part.
      MainDocumentPart mainDocPart = wdDoc.MainDocumentPart;

      // Delete the existing header parts.
      mainDocPart.DeleteParts(mainDocPart.HeaderParts);

      // Create a new header part and get its relationship id.
      HeaderPart newHeaderPart = mainDocPart.AddNewPart<HeaderPart>();
      string rId = mainDocPart.GetIdOfPart(newHeaderPart);
      
      // Create an XmlDocument for the header content.
      XmlDocument headerDoc = new XmlDocument();
      headerDoc.LoadXml(newHeaderContent);

      // Write the header out to its document part.
      headerDoc.Save(newHeaderPart.GetStream());

      // Manage namespaces to perform XML XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("w", wordmlNamespace);

      // Get the document part from the package.
      // Load the XML in the part into an XmlDocument instance.
      XmlDocument xdoc = new XmlDocument(nt);
      xdoc.Load(mainDocPart.GetStream());

      //  Find the document's section property nodes.
      XmlNodeList targetNodes =
        xdoc.SelectNodes("//w:sectPr", nsManager);

      // Loop through all section properties in the document
      // which is where header references are defined.
      foreach (XmlNode targetNode in targetNodes)
      {
        //  Delete any existing references to headers.
        XmlNodeList headerNodes =
          targetNode.SelectNodes("./w:headerReference", nsManager);

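        // Reading Count first materializes the node list, so removing
        // nodes cannot cut the iteration short.
        for (int i = headerNodes.Count - 1; i >= 0; i--)
        {
          targetNode.RemoveChild(headerNodes[i]);
        }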

        //  Create a new header reference that points to the new
        // header part and add it to the section properties.
        XmlElement node =
          xdoc.CreateElement("w:headerReference", wordmlNamespace);

        XmlAttribute idAttr =
          xdoc.CreateAttribute("r:id", relationshipNamespace);
        idAttr.Value = rId;
        node.Attributes.Append(idAttr);

        XmlAttribute typeAttr =
          xdoc.CreateAttribute("w:type", wordmlNamespace);
        typeAttr.Value = "default";
        node.Attributes.Append(typeAttr);

        targetNode.InsertBefore(node, targetNode.FirstChild);
      }

      //  Save the document XML back to its document part.
      xdoc.Save(mainDocPart.GetStream(FileMode.Create));
    }
  }
}

 

Build and run the solution in Visual Studio by pressing CTRL+F5. The sample code adds a new default header to all sections of an existing document, which replaces all existing headers in the process. The code specifies that the document to modify is named ReplaceHeader.docx and is located in the C:\Temp folder. To change the name or location of the document, modify the sample code and change the value of the docName variable defined in the Main method. The document that is referenced by the docName variable in the code must exist for the solution to work. The code does not require any particular contents in the document, so even an empty document will suffice.
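Because the sample assumes that the target document already exists, a small guard before opening the package can give a clearer failure than an exception from deep inside the SDK. The following is a minimal sketch of such a check. The DocumentPathCheck class and its IsUsable method are hypothetical names introduced here for illustration and are not part of the downloadable sample.

```csharp
using System;
using System.IO;

static class DocumentPathCheck
{
  // Hypothetical helper: returns true when the path points to an
  // existing file; otherwise describes the problem in 'problem'.
  public static bool IsUsable(string docName, out string problem)
  {
    problem = string.Empty;

    if (string.IsNullOrEmpty(docName))
    {
      problem = "No document path was supplied.";
      return false;
    }

    if (!File.Exists(docName))
    {
      problem = "The document does not exist: " + docName;
      return false;
    }

    return true;
  }
}
```

In Main, such a check could run before either AddHeader method is called, writing the problem text to the console and returning early when it reports false.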

The header reference that is added to the document has its Type attribute specified as Default, which means that the document header options determine the visibility of the header in the document. For example, if a section of the document has its header options set to Different First Page, the header will not be displayed on the first page of the section. If the section's header options are set to Different Odd & Even Pages, the header will be displayed on odd pages but not on even pages. For more information about header and footer options in Word 2007, see Insert headers and footers on Microsoft Office Online.

Read It

The Open XML Format SDK 1.0 simplified the manipulation of Open XML packages by providing strongly typed classes and objects that encapsulated many of the common tasks typically performed on Open XML packages. The Open XML SDK 2.0 for Microsoft Office extends those capabilities with additional typesafe classes and objects that you can use to manipulate the content of Open XML packages and parts.

The following are some of the benefits of the Open XML SDK 2.0 for Microsoft Office:

  • Strongly Typed Part Classes

    The Open XML document part classes provided with the Open XML SDK 2.0 for Microsoft Office leverage the .NET Language-Integrated Query (LINQ) technology to provide strongly typed object access to the XML content inside the parts of Open XML documents. Instead of writing code that uses generic XML functionality to manipulate document content, you can use the SDK part classes to work with objects that represent the content elements, attributes, and values. The classes remove the requirement that developers be aware of Open XML schema details such as XML namespaces and prefixes, as well as the spelling of elements, attributes, and values. Open XML document schema types are represented as strongly typed Common Language Runtime (CLR) classes, and attribute values that are drawn from a fixed set of options are represented as enumerations.

  • LINQ to XML Annotations

    The OpenXmlPartContainer base class provided with the Open XML SDK 2.0 supports annotations in the style of LINQ to XML. Annotations make it possible to associate any arbitrary object of any arbitrary type with any XML component in an XML tree.

  • Support for Validation of Open XML Documents

    With the validation support in the Open XML SDK 2.0, a developer can validate a full document, a package, a part in the package, or a section of content represented by an element. The errors reported contain both the XPath to the node that has a problem and the run-time node object itself.
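To make the annotation style mentioned in the list above more concrete, the following sketch uses LINQ to XML itself (the AddAnnotation and Annotation<T> methods of System.Xml.Linq), not the SDK classes. It only illustrates the pattern that the SDK is described as following. The ReviewNote type and the AnnotationSketch class are hypothetical names invented for this example.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical annotation type used only for this illustration.
class ReviewNote
{
  public string Author = "";
  public string Comment = "";
}

static class AnnotationSketch
{
  public static string Describe()
  {
    XNamespace w =
      "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    XElement paragraph = new XElement(w + "p",
      new XElement(w + "r",
        new XElement(w + "t", "Header text")));

    // Attach an arbitrary object to the element. The annotation lives
    // only in memory and is not written to the XML markup.
    paragraph.AddAnnotation(
      new ReviewNote { Author = "Editor", Comment = "Check style" });

    // Retrieve the annotation later by its type.
    ReviewNote note = paragraph.Annotation<ReviewNote>();
    bool serialized = paragraph.ToString().Contains("Check style");

    return note.Author + ": " + note.Comment +
      " (serialized: " + serialized + ")";
  }
}
```

Calling AnnotationSketch.Describe() returns "Editor: Check style (serialized: False)", which shows that the annotation travels with the in-memory element without becoming part of the markup.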

This section uses code snippets from the Code It section to compare two ways to add a header to a Word 2007 document: one using the classes provided with the Open XML SDK 2.0, and one using .NET XML services.

The two methods that contain the code for the two approaches are AddHeaderViaSDKClasses and AddHeaderViaXMLServices. Both methods accept a path to a Word 2007 document as a parameter. The first major difference between the two methods is the way that each defines the header content and required XML namespaces. The AddHeaderViaSDKClasses method simply declares the string to display in the header. The AddHeaderViaXMLServices method declares a string that defines the markup for the header element, as well as strings that define the XML namespaces to use when adding the header to the document.

public static void AddHeaderViaSDKClasses(string docName)
{
  // Declare a string for the header text.
  string newHeaderText =
    "New header via Open XML Format SDK 2.0 classes";

 

public static void AddHeaderViaXMLServices(string docName)
{
  // The following variable is used simply to help with line
  // wrap when the code is posted on MSDN. Without it the
  // Xml namespace strings are too long to avoid wrapping.
  string schemaUri = @"http://schemas.openxmlformats.org/";

  string newHeaderContent =
  @"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>
  <w:hdr xmlns:w=""" + schemaUri + @"wordprocessingml/2006/main""> 
    <w:p>
      <w:pPr>
        <w:pStyle w:val=""Header""/>
      </w:pPr>
      <w:r>
        <w:t>New header via .NET XML services</w:t>
      </w:r>
    </w:p>
  </w:hdr>";

  string wordmlNamespace =
    schemaUri + @"wordprocessingml/2006/main";
  string relationshipNamespace =
    schemaUri + @"officeDocument/2006/relationships";

Next, both methods use the WordprocessingDocument.Open method to open the package. They then get the package's main document part, delete all existing header parts, and create a new header part.

// Open the document for reading and writing.
using (WordprocessingDocument wdDoc =
  WordprocessingDocument.Open(docName, true))
{
  // Get the main document part.
  MainDocumentPart mainDocPart = wdDoc.MainDocumentPart;

  // Delete the existing header parts.
  mainDocPart.DeleteParts(mainDocPart.HeaderParts);

  // Create a new header part and get its relationship id.
  HeaderPart newHeaderPart = mainDocPart.AddNewPart<HeaderPart>();
  string rId = mainDocPart.GetIdOfPart(newHeaderPart);

 

The AddHeaderViaSDKClasses method then generates the markup for the new header by calling the GeneratePageHeaderPart helper method. The GeneratePageHeaderPart method uses the SDK classes to create the required XML elements. The SDK classes take care of all namespace, element, attribute, and value related issues.

// Call the GeneratePageHeaderPart helper method, passing in
// the header text, to create the header markup and then save
// that markup to the header part.
GeneratePageHeaderPart(newHeaderText).Save(newHeaderPart);
…

private static Header GeneratePageHeaderPart(string HeaderText)
{
  var element =
    new Header(
      new Paragraph(
        new ParagraphProperties(
          new ParagraphStyleId() { Val = "Header" }),
        new Run(
          new Text(HeaderText))
      )
    );

  return element;
}

 

The AddHeaderViaXMLServices method creates an XmlDocument to hold the header content defined previously and then streams the markup to the new header part. The code then creates and initializes a NameTable, XmlNamespaceManager, and another XmlDocument instance and uses them to get the document's section property nodes.

// Create an XmlDocument for the header content.
XmlDocument headerDoc = new XmlDocument();
headerDoc.LoadXml(newHeaderContent);

// Write the header out to its document part.
headerDoc.Save(newHeaderPart.GetStream());

// Manage namespaces to perform XML XPath queries.
NameTable nt = new NameTable();
XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
nsManager.AddNamespace("w", wordmlNamespace);

// Get the document part from the package.
// Load the XML in the part into an XmlDocument instance.
XmlDocument xdoc = new XmlDocument(nt);
xdoc.Load(mainDocPart.GetStream());

//  Find the document's section property nodes.
XmlNodeList targetNodes =
  xdoc.SelectNodes("//w:sectPr", nsManager);
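
The namespace manager in the snippet above matters because XPath treats an unprefixed name as belonging to no namespace. The following self-contained sketch shows the difference on a small in-memory document. The SectionQuerySketch class, its methods, and the two-section sample markup are made up for this illustration.

```csharp
using System;
using System.Xml;

static class SectionQuerySketch
{
  const string WordmlNamespace =
    "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

  // Counts section property elements in a WordprocessingML string,
  // with or without the namespace prefix in the XPath query.
  public static int CountSections(string documentXml, bool usePrefix)
  {
    NameTable nt = new NameTable();
    XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
    nsManager.AddNamespace("w", WordmlNamespace);

    XmlDocument xdoc = new XmlDocument(nt);
    xdoc.LoadXml(documentXml);

    // "//sectPr" looks for elements in no namespace, so it does not
    // match the w:sectPr elements in the document.
    string query = usePrefix ? "//w:sectPr" : "//sectPr";
    return xdoc.SelectNodes(query, nsManager).Count;
  }

  // Made-up markup with two sections: one inside a paragraph and
  // one at the end of the body.
  public static string SampleDocument()
  {
    return
      "<w:document xmlns:w=\"" + WordmlNamespace + "\"><w:body>" +
      "<w:p><w:pPr><w:sectPr/></w:pPr></w:p>" +
      "<w:sectPr/>" +
      "</w:body></w:document>";
  }
}
```

With the sample markup, CountSections(SampleDocument(), true) returns 2, and the same call with false returns 0.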

 

Both methods then loop through the document's section properties, delete any existing header references, and add a reference to the new header. The AddHeaderViaSDKClasses method uses the SDK classes and does not specify Open XML schema details such as namespaces, prefixes, or element and attribute names.

// Loop through all section properties in the document
// which is where header references are defined.
foreach (SectionProperties sectProperties in
  mainDocPart.Document.Descendants<SectionProperties>())
{
  // Delete any existing references to headers. RemoveAllChildren
  // avoids removing elements while enumerating them.
  sectProperties.RemoveAllChildren<HeaderReference>();

  // Create a new header reference that points to the new header
  // part and insert it first, because the schema expects header
  // references to precede the other section properties.
  HeaderReference newHeaderReference =
    new HeaderReference()
    { Id = rId, Type = HeaderFooterValues.Default };
  sectProperties.PrependChild(newHeaderReference);
}

 

The AddHeaderViaXMLServices method works with generic XmlNode, XmlNodeList, XmlElement, and XmlAttribute objects to achieve the same result. This approach requires knowledge of the Open XML schema because you must specify the correct namespaces, prefixes, elements, and attributes.

// Loop through all section properties in the document
// which is where header references are defined.
foreach (XmlNode targetNode in targetNodes)
{
  //  Delete any existing references to headers.
  XmlNodeList headerNodes =
    targetNode.SelectNodes("./w:headerReference", nsManager);

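  // Reading Count first materializes the node list, so removing
  // nodes cannot cut the iteration short.
  for (int i = headerNodes.Count - 1; i >= 0; i--)
  {
    targetNode.RemoveChild(headerNodes[i]);
  }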

  //  Create a new header reference that points to the new
  // header part and add it to the section properties.
  XmlElement node =
    xdoc.CreateElement("w:headerReference", wordmlNamespace);

  XmlAttribute idAttr =
    xdoc.CreateAttribute("r:id", relationshipNamespace);
  idAttr.Value = rId;
  node.Attributes.Append(idAttr);

  XmlAttribute typeAttr =
    xdoc.CreateAttribute("w:type", wordmlNamespace);
  typeAttr.Value = "default";
  node.Attributes.Append(typeAttr);

  targetNode.InsertBefore(node, targetNode.FirstChild);
}
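
To experiment with the XML services approach without opening a Word document, the same steps can be applied to a section properties fragment held in a string. The following is a sketch under that assumption. The HeaderReferenceSketch class and its ReplaceHeaderReferences method are hypothetical names, and the fragment passed in is expected to have w:sectPr as its root element.

```csharp
using System;
using System.Xml;

static class HeaderReferenceSketch
{
  const string WordmlNamespace =
    "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
  const string RelationshipNamespace =
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

  // Removes existing header references from a w:sectPr fragment and
  // inserts a new default reference as the first child.
  public static string ReplaceHeaderReferences(string sectPrXml, string rId)
  {
    XmlDocument xdoc = new XmlDocument();
    xdoc.LoadXml(sectPrXml);
    XmlElement sectPr = xdoc.DocumentElement;

    XmlNamespaceManager nsManager =
      new XmlNamespaceManager(xdoc.NameTable);
    nsManager.AddNamespace("w", WordmlNamespace);

    // Iterate backward over the materialized list while removing.
    XmlNodeList oldReferences =
      sectPr.SelectNodes("w:headerReference", nsManager);
    for (int i = oldReferences.Count - 1; i >= 0; i--)
      sectPr.RemoveChild(oldReferences[i]);

    // Build the new element and its two namespaced attributes.
    XmlElement reference =
      xdoc.CreateElement("w", "headerReference", WordmlNamespace);

    XmlAttribute idAttr =
      xdoc.CreateAttribute("r", "id", RelationshipNamespace);
    idAttr.Value = rId;
    reference.Attributes.Append(idAttr);

    XmlAttribute typeAttr =
      xdoc.CreateAttribute("w", "type", WordmlNamespace);
    typeAttr.Value = "default";
    reference.Attributes.Append(typeAttr);

    sectPr.PrependChild(reference);
    return xdoc.OuterXml;
  }
}
```

Every namespace URI, prefix, element name, and attribute name in this sketch is spelled out by hand, which is the schema knowledge that the SDK classes otherwise supply.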

 

Both methods then save the changes to the main document part. The AddHeaderViaSDKClasses method uses the Save method of the Document property on the main document part.

//  Save the changes to the main document part.
mainDocPart.Document.Save();

 

The AddHeaderViaXMLServices method gets a stream on the main document part and then calls the Save method of the XmlDocument that holds the main document part markup.

//  Save the document XML back to its document part.
xdoc.Save(mainDocPart.GetStream(FileMode.Create));
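
The FileMode.Create argument is worth a closer look. A stream opened without truncation keeps any bytes that lie beyond the newly written content, so saving a shorter document over a longer one can leave stray markup at the end. The following sketch demonstrates the effect with an ordinary temporary file standing in for a document part. That part streams behave the same way is an assumption of this example, and TruncationSketch is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Text;

static class TruncationSketch
{
  // Writes longText to a temporary file, overwrites it with shortText
  // using the given file mode, and returns the final file contents.
  public static string Overwrite(
    string longText, string shortText, FileMode mode)
  {
    string path = Path.GetTempFileName();
    try
    {
      File.WriteAllText(path, longText, Encoding.ASCII);

      using (FileStream stream =
        new FileStream(path, mode, FileAccess.Write))
      {
        byte[] bytes = Encoding.ASCII.GetBytes(shortText);
        stream.Write(bytes, 0, bytes.Length);
      }

      return File.ReadAllText(path, Encoding.ASCII);
    }
    finally
    {
      File.Delete(path);
    }
  }
}
```

Overwriting "<a>long content</a>" with "<b/>" by using FileMode.Open leaves "<b/>ong content</a>", which is no longer well-formed XML. The same call with FileMode.Create leaves just "<b/>".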

 

Note

When you use the Open XML SDK 2.0 for Microsoft Office to create a document-generation solution, it is best practice to create a template document first, and then use DocumentReflector, a tool that comes with the SDK. DocumentReflector can generate C# code that uses the SDK typesafe classes to reproduce your template document and the functionality that it contains. You can then use that code to help you add functionality or to help you understand the Open XML document parts and relationships that are required to implement a specific document feature. For more information about best practices and the Open XML SDK 2.0 for Microsoft Office, see Erika Ehrli's blog entry Getting Started Best Practices.

Explore It