Office Open XML final draft!!!

As I already mentioned, at the last face-to-face meeting in Trondheim, Norway, we voted unanimously to approve the final draft of the Office Open XML spec as ready for submission to the Ecma General Assembly. The GA will now review the draft, and in December there will be a vote to approve it as an Ecma standard!

This is a huge milestone, and the entire technical committee has worked extremely hard over the past year. We really had an amazing collection of contributors to this standard, and if you take a look at the draft, it shows.

For those of you interested, here is the list of all the organizations contributing to the standard:

  • Apple
  • Barclays Capital
  • BP
  • The British Library
  • Essilor
  • Intel
  • Microsoft
  • NextPage
  • Novell
  • Statoil
  • Toshiba
  • The United States Library of Congress

We posted the draft in three separate formats: a PDF version, a tagged PDF version (for accessibility), and a DOCX version.

The final draft is still broken out into 5 separate parts:

  1. Fundamentals – gives an overview of the structure of the formats, and describes all allowed parts, content types, and relationship types.
  2. Open Packaging Conventions – describes the basic conventions used for storing the parts of the file within a ZIP package. *
  3. Primer – gives a great description of all the markup languages and how they work. This serves as a great tutorial.
  4. Markup Language Reference – contains detailed descriptions of each and every element, attribute, and simple type. Serves as a great reference when you want to look up what an element means. **
  5. Markup Compatibility and Extensibility – describes how additional markup can be added to the format while still conforming to the spec.

* Part 2 has a couple of additional electronic resources. There are a few XSD files, as well as the equivalent RELAX NG files (we were lucky enough to have Rick Jelliffe help in the creation of these).

** Part 4 has a collection of XSD files and the equivalent informative RELAX NG files. There is also a collection of predefined cell and table style references for SpreadsheetML, as well as a collection of predefined shape and text warp geometries for DrawingML.
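To make the package structure described in Parts 1 and 2 a bit more concrete, here is a minimal Python sketch. It builds a tiny hand-written package in memory (so it doesn't depend on any file you have lying around), then lists the parts in the ZIP and reads the content type of each one. The sample XML strings and the helper names (build_sample_package, list_parts, read_content_types) are my own illustration, not something taken from the spec text. The namespace and content type strings are the ones I understand the standard to use, so double-check them against the draft before relying on them.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace assumed for the content types part; verify against Part 2.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"

# Hand-written sample parts (illustrative, not copied from the spec).
CONTENT_TYPES_XML = (
    '<Types xmlns="' + CT_NS + '">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS_XML = (
    '<Relationships xmlns='
    '"http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Target="word/document.xml" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
    'relationships/officeDocument"/>'
    '</Relationships>'
)
DOCUMENT_XML = (
    '<w:document xmlns:w='
    '"http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body>'
    '</w:document>'
)


def build_sample_package() -> bytes:
    """Write the three sample parts into an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES_XML)
        zf.writestr("_rels/.rels", RELS_XML)
        zf.writestr("word/document.xml", DOCUMENT_XML)
    return buf.getvalue()


def list_parts(package: bytes) -> list:
    """Return the names of all entries stored in the ZIP package."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.namelist()


def read_content_types(package: bytes) -> dict:
    """Map each Default extension or Override part name to its content type."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
    types = {}
    for el in root:
        if el.tag == "{" + CT_NS + "}Default":
            types["*." + el.get("Extension")] = el.get("ContentType")
        elif el.tag == "{" + CT_NS + "}Override":
            types[el.get("PartName")] = el.get("ContentType")
    return types


package = build_sample_package()
for name in list_parts(package):
    print("part:", name)
for key, ctype in read_content_types(package).items():
    print(key, "->", ctype)
```

The same two reader functions should work on the bytes of a real DOCX file, since it is just a ZIP package, although a real file will contain many more parts than this sample.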

I've been giving pretty frequent updates on the progress of the spec, so most of the content at this point won't come as a surprise. We spent the last few weeks in the committee nailing down any potential interoperability issues, which included adding a new schema that allows applications to clearly define additional characteristics that may help consumers handle their files better. For example, it's possible to define what level of arithmetic precision was used for spreadsheet formula calculations, so that a consuming application can accurately display the same results.

We're already seeing hundreds of developers working with the earlier versions of the draft, and this final version will really help everyone who's been waiting for it to solidify. If you go over to the site, you'll see there are almost 600 registered members and an extremely active discussion forum. There's also talk of starting up a blogging collection so that the members can actively blog about the solutions they are building. It's exciting seeing the diverse set of solutions, from document assembly on a Linux box to mind manager solutions that output WordprocessingML.

I'm already getting excited about what we'll do with version 2 of the spec (but I could use a little break between now and then). Here are a few fun facts about the work that's gone on over the past year:

  • 72 presentations were given to the technical committee explaining the existing behaviors of features, so that discussion on how best to structure and document them could then take place.
  • 66 hours of live meeting discussions (starting at 6am every Thursday for those of us on the west coast of the US)
  • 88 schema files
  • 128 hours of face-to-face meetings held in Brussels (Ecma); Cupertino, CA (Apple); London (British Library); Sapporo, Japan (Toshiba); Redmond, WA (Microsoft); and Trondheim, Norway (Statoil)
  • 6,000 pages of documentation between the 5 parts of the standard
  • 9,422 different items to document (3,114 attributes, 2,500 elements, 3,243 enumerations, 567 simple types)