Microsoft SDK for Open XML

I was so busy with activities related to the announcement of the new SDK yesterday that I didn't get a chance to blog about it. And now, the morning after, I don't have much to add to all the great posts that have been written by my colleagues around the globe. Nice work, everyone. Here are some links with lots of information, starting with Brian's post that broke the news online yesterday:

Brian Jones:

Kevin Boske:

Erika Ehrli:
(Erika did a ton of work in the last few weeks to get this API delivered, setting up the documentation site with Frank Rice's great samples and also handling all the details of setting up the support forum.)

Art Leonard:

Wouter Van Vugt:

Julien Chable:

Stephen McGibbon:

Chris Bryant (Channel 9 interview):

What It Is

This new API is something we've been wanting to do for a long time. I first heard talk of something like this in March of 2006, when Stephen Peront of Xinnovation met with me, Kevin Boske, and Art Leonard the week of the Office Devcon in Redmond. Stephen was drawing lots of diagrams on a whiteboard in Building 36 while Kevin and Art were talking about "strongly typed parts," and frankly I was just along for the ride that day. Since then, all three of those guys have lobbied hard from inside and outside Microsoft to make something like this happen. (The emails Stephen sent to our management in support of this API are truly classics.)

The basic idea underlying the new API is pretty simple: we now have .NET types that correspond to the parts of an Open XML document. If you're familiar with the structure of Open XML documents, then the type names themselves will tell you what you can do with the new API. Here are a few examples of the new types:

  • OpenXmlPackage
  • WordprocessingDocument, SpreadsheetDocument, PresentationDocument
  • MainDocumentPart, WorkbookPart, PresentationPart
  • ImagePart, CommentsPart, PivotTablePart, WorksheetPart, SlidePart, ThemePart, CustomXmlPart, and many others

In terms of abstraction, this new API sits a level above the System.IO.Packaging API. The packaging API knows all about the Open Packaging Conventions, but doesn't distinguish one part type from another -- to the packaging API, "parts is parts."

Now you can write code that creates a "ThemePart" or looks for the "CustomXmlPart" within a document. You still have to deal with the XML markup itself, but this API greatly reduces the amount of code you need for various Open XML programming chores. You no longer need to iterate through relationship types that only occur once or things like that -- you can just go straight to the part you need and start working with its content.
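To make the difference concrete, here is a minimal sketch of the idea in Python, using only the standard library. It is not the SDK and not the packaging API. The names WordDocument, find_related_part, rels_path, and build_sample_package are hypothetical, made up for this illustration, and the sample package is a toy that Word would not open (it has no [Content_Types].xml, for one thing). The generic function does the "parts is parts" relationship scan, and the small wrapper class hides that scan behind named properties, which is roughly the convenience that typed parts provide.

```python
import io
import posixpath
import xml.etree.ElementTree as ET
import zipfile

# Namespace and relationship-type URIs used inside Open XML packages.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOCUMENT = ("http://schemas.openxmlformats.org/officeDocument/"
                   "2006/relationships/officeDocument")
THEME = ("http://schemas.openxmlformats.org/officeDocument/"
         "2006/relationships/theme")


def rels_path(part_name):
    """Name of the .rels item holding a part's relationships ("" = package root)."""
    folder, name = posixpath.split(part_name)
    return posixpath.join(folder, "_rels", name + ".rels")


def find_related_part(package, source_part, rel_type):
    """Generic lookup: scan the source's relationships for one of the given type."""
    root = ET.fromstring(package.read(rels_path(source_part)))
    for rel in root.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == rel_type:
            # Targets are resolved relative to the folder of the source part.
            base = posixpath.dirname(source_part)
            target = posixpath.join(base, rel.get("Target"))
            return posixpath.normpath(target).lstrip("/")
    return None


class WordDocument:
    """Hypothetical typed wrapper: named properties instead of relationship scans."""

    def __init__(self, package):
        self._package = package
        self.main_document_part = find_related_part(package, "", OFFICE_DOCUMENT)

    @property
    def theme_part(self):
        return find_related_part(self._package, self.main_document_part, THEME)


def build_sample_package():
    """Build a toy in-memory package with a main document part and a theme part."""
    def rels(rel_type, target):
        return (f'<Relationships xmlns="{REL_NS}">'
                f'<Relationship Id="rId1" Type="{rel_type}" Target="{target}"/>'
                f'</Relationships>')

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as z:
        z.writestr("_rels/.rels", rels(OFFICE_DOCUMENT, "word/document.xml"))
        z.writestr("word/document.xml", "<document/>")
        z.writestr("word/_rels/document.xml.rels", rels(THEME, "theme/theme1.xml"))
        z.writestr("word/theme/theme1.xml", "<theme/>")
    buffer.seek(0)
    return zipfile.ZipFile(buffer)


package = build_sample_package()
doc = WordDocument(package)
print(doc.main_document_part)  # word/document.xml
print(doc.theme_part)          # word/theme/theme1.xml
```

With the generic function alone, the caller has to know the relationship-type URIs and walk from the package root to the main document and then on to the theme. With the wrapper, the caller just asks for doc.theme_part.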

Note that this is just a typical .NET API, with no dependencies on anything other than the .NET Framework 3.0 itself. So you don't need Office or VSTO installed, and it works great in a server environment.

Stephen Peront and Doug Mahugh at TechEd

Announcing the API with a hands-on example

I had the pleasure of working with Stephen Peront on our announcement at TechEd this week. Stephen got a copy of the API on Friday, spent two long days writing code (modifying an existing application that had been using the packaging API), and then he got on a plane from Boston to meet me in Orlando Sunday afternoon. We gave a preview of the API to the Regional Directors meeting on Sunday afternoon, and then Stephen wrote a bunch more code late Sunday for a demo he did yesterday morning at my "Open XML Fundamentals" session.

Stephen's company, Xinnovation, is a leader in large-scale document assembly applications. They did a lot of work with the binary formats, automating the Office clients to get the job done, back when that was the only feasible approach. Then when the packaging API came out, Xinnovation immediately started using it to provide more robust server-side document assembly solutions to their clients.

Now, with Xinnovation's new XiDocs "XD4" approach, they're moving document assembly to a whole new level. Using a new technology called AssemblyML, they allow their customers to create templates that can have the rules for document creation within the document itself. These rules are at a very high level, saying in essence "this chart is dynamically generated from SQL data" or "this document should be generated as a presentation." To demonstrate the power of the new SDK for Open XML, Stephen showed how he had hundreds of lines of code using the packaging API in the old version of their product, and now that code is less than 20 lines with the new API.

AssemblyML is a powerful set of abstractions that allow highly customized document-assembly applications to be built with minimum coding or even no coding in many cases. The document type itself is an AssemblyML attribute, so with just one or two changes to the assembly instructions you can have the same SQL data populate fields in a presentation instead of content controls in a word-processing document. If that sounds like something you'd be interested in, contact Stephen at And thanks, Stephen, for the great demonstration!

Come visit us at TechEd

If you're at TechEd in Orlando this week, drop in at the Open XML booth. It's in the green area on the main exhibit floor, across from that band playing all the 70s tunes next to the lunch tables. My colleagues Erika Ehrli of MSDN and Stephanie Krieger (an Office MVP and author of a great book on Office 2007) are helping staff the booth, and we'd love to see you.

Say you saw this post and we'll give you a free Open XML t-shirt, or -- if you're Erika's size or smaller -- a free Open XML dress. :-)

See you there!