DII Workshop Recap

DII workshop presentations and roundtable discussion, October 23-24

The latest DII workshop took place in Redmond over the last two days. There were presentations at this event from a variety of people, including members of the Office product groups at Microsoft and developers and consultants from several other companies. Topics covered included planning an IS29500 document test library, server-side document assembly strategies, various approaches to document validation, use of content controls in e-courseware, and goals for future DII events. John Head already blogged a few thoughts from the workshop, and I expect we'll see more in the days ahead from John, his colleague Andrew Schwantes, or other blogger attendees such as Dennis Hamilton and Alex Brown. The DII web site has an event summary, and you can find downloadable copies of most of the presentations here.

Implementers have often mentioned the need for a document test library during previous DII events, and at this workshop members of the Word, Excel and PowerPoint teams presented overviews of the major types of Office documents that we've seen from Office users. (Details of these categories are covered in the presentations.) We considered whether this taxonomy might be a suitable first step toward structuring a document test library, and we also discussed the broader topic of how to best assess interoperability between various implementations of ECMA-376 or IS29500 in general.

Thursday afternoon, Shawn Villaron (group program manager for PowerPoint) covered our plans for documenting our implementations of various document formats. We're planning to make information available in the next few months about details such as range choices and application behavior in Office, and Shawn went through some examples of the type of information we're planning to provide. This prompted some interesting discussion of what developers need to know to maximize interoperability, and how seemingly minor things like the specificity of an enumeration (or lack thereof) can enable or constrain future innovation in that area. Because Shawn's presentation was an effort to get early feedback on our plans for documenting our implementation over the coming months, he didn’t post it on the site.

After Shawn's presentation came my favorite part of these events, the roundtable discussion. As expected, a variety of perspectives were represented, including standards professionals like Alex and Dennis, and developers and implementers like John and Andrew from PSC, Ray, Raymond, and others. Believe it or not, a few of us Microsoft people even had an opinion or two on some of the topics we covered.

The roundtable discussions centered on several themes, including the future of DII events, the structure of a document test library, and approaches to document validation. For future DII events, we discussed the possibility of having conference calls or web meetings to make it easier for people to participate in more events. On the document test library, there was an interesting suggestion to consider a metadata-based approach to classification instead of a taxonomy based on other criteria, and there were also suggestions for additional categories. In the area of document validation, the main takeaway for me was that validation is a complex topic with many factors to consider. John Head talked about the ACID approach as a possible model, and Alex Brown encouraged everyone to take a close look at DSDL technologies in general, and Schematron in particular, as tools for managing document format validation.
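For readers who haven't seen rule-based validation before, here is a rough sketch of the idea in Python. To be clear, this is not Schematron and not anything that was presented at the workshop: it only imitates Schematron's pairing of a context with an assertion, using the standard library's ElementTree. The two rules, the sample markup, and the names RULES, SAMPLE, and validate are all made up for illustration, and the "explicit style" rule is a hypothetical house rule rather than a requirement of the standard.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
NS = {"w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}

# Hypothetical house rules as (context path, assertion, message) triples,
# loosely imitating Schematron's rule/assert pairing. These are NOT
# requirements of the standard.
RULES = [
    (".", lambda el: el.find("w:body", NS) is not None,
     "document has no body"),
    (".//w:p", lambda el: el.find("w:pPr/w:pStyle", NS) is not None,
     "paragraph has no explicit style"),
]

# Made-up sample: one styled paragraph and one unstyled paragraph.
SAMPLE = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/'
    'wordprocessingml/2006/main"><w:body>'
    '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr>'
    '<w:r><w:t>Title</w:t></w:r></w:p>'
    '<w:p><w:r><w:t>No style here</w:t></w:r></w:p>'
    '</w:body></w:document>'
)


def validate(xml_text):
    """Return the message of every assertion that fails."""
    root = ET.fromstring(xml_text)
    failures = []
    for context, assertion, message in RULES:
        # Evaluate the assertion once per element matched by the context.
        for element in root.findall(context, NS):
            if not assertion(element):
                failures.append(message)
    return failures


if __name__ == "__main__":
    for problem in validate(SAMPLE):
        print(problem)
```

The appeal of this style is that each rule reports a human-readable message about a business-level expectation, which a grammar-based schema check generally can't express.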

Next on the agenda, John Head and Andrew Schwantes did a presentation on PSC's recent experience in server-side generation of presentations in PPTX format. John covered the business issues and Andrew explained the system architecture. There's a case study on the PSC web site with some of the details, and John plans to publish their presentation on his blog soon as well. John and Andrew also shared some specific thoughts on what they most need from the Open XML SDK going forward, both in this session and in a separate meeting with a few of us from the Office team. It's great to get this kind of feedback from people who are out there building interoperable solutions with the tools we're providing.

Friday morning, my good friend Vijay Rajagopalan kicked off the day with an overview of the Document Interoperability Initiative's history and goals, and a deep dive into the status of several translator projects including the ODF translator and HTML translator. We talked about the tradeoffs between the cross-platform reusability of an XSLT-based approach to translation and the high performance (but more limited portability) that comes with a more code-based approach.

David Castillo of CTC presented an interesting overview of how XML is used in the development of e-courseware. Although David's background is mostly in Flash/Adobe technologies, he covered how XML has enabled interoperability between diverse environments and how Office 2007's use of the Open XML formats (content controls, specifically) has created integration opportunities that he is exploiting in courseware development workflows. It was interesting to hear how standards and interoperability issues in courseware formats mirror many of the challenges we face in the world of document formats. (David's presentation, by the way, isn't up on the DII site quite yet but I'll be getting a copy from him in the next few days and will make sure it gets posted on the presentations page.)
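To give a feel for why content controls are attractive as an integration point, here is a minimal Python sketch that pulls tagged values out of WordprocessingML markup. It is not David's code or workflow. It assumes only that a content control is a w:sdt element whose w:sdtPr may carry a w:tag and whose text lives under w:sdtContent. The tag names (courseTitle, lessonNumber), the sample markup, and the function name read_content_controls are invented for this example.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Made-up sample with two tagged content controls (hypothetical tag names).
SAMPLE_XML = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:sdt><w:sdtPr><w:tag w:val="courseTitle"/></w:sdtPr>'
    '<w:sdtContent><w:p><w:r><w:t>Intro to XML</w:t></w:r></w:p>'
    '</w:sdtContent></w:sdt>'
    '<w:sdt><w:sdtPr><w:tag w:val="lessonNumber"/></w:sdtPr>'
    '<w:sdtContent><w:p><w:r><w:t>3</w:t></w:r></w:p>'
    '</w:sdtContent></w:sdt>'
    '</w:body></w:document>'
)


def read_content_controls(document_xml):
    """Map each tagged content control's tag value to its text."""
    root = ET.fromstring(document_xml)
    values = {}
    for sdt in root.iter(f"{{{W}}}sdt"):
        tag = sdt.find("w:sdtPr/w:tag", NS)
        content = sdt.find("w:sdtContent", NS)
        if tag is None or content is None:
            continue  # skip controls without a tag or without content
        # Concatenate every text run inside the control's content.
        text = "".join(t.text or "" for t in content.iter(f"{{{W}}}t"))
        values[tag.get(f"{{{W}}}val")] = text
    return values


if __name__ == "__main__":
    print(read_content_controls(SAMPLE_XML))
```

In a real document the markup would come from the main document part inside the .docx package (conventionally word/document.xml, readable with Python's zipfile module), and nested controls would need more careful handling than this sketch gives them.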

The final two presentations on Friday covered the Open XML SDK. Zeyad Rajabi, the PM for the SDK, covered the plans for Version 2 (currently available as an early CTP) and showed a few demos. I won't say much about those demos here because they were previews of content that will be unveiled next week at PDC, but watch Eric White's blog for full postings about them after PDC. These demos showed how far the SDK has come and where it's going, and this prompted much discussion of what developers want from the SDK. And how soon they want it, too — we heard you loud and clear, guys. :-)

Overall, it was a very educational event for me and my Microsoft colleagues, and it was great to see some old friends and make some new ones. There was clear direction from the community on what types of tools are needed for document validation, Open XML development, and the document test library, and people would like to see progress in these areas in the coming weeks and months. If you didn't get a chance to attend this one, check out the presentations and share your thoughts in the comments here or on the blogs of other participants as listed above.