Harmonization: Finding the Differences

There have been some discussions over the past several years about the harmonization of Open XML with ODF. In reality the world of formats is much broader than just ODF and Open XML, but given the political climate the focus lately has been specifically on those two formats. As most folks know, the first step towards harmonization is to gain a true understanding of the differences between the two formats. Last year the German national standards body DIN stood up and took a key driving role in this discussion. A number of different groups, including PC-Ware Information Technologies AG, Microsoft Deutschland GmbH, OPENLiMiT Holding AG, Dialogika GmbH, Ecma TC45, and Novell, along with independent experts, have been working with DIN to understand exactly what the differences between the formats are.

Gerd Schürmann, from the much respected Fraunhofer FOKUS institute in Germany, is leading this effort as the Convener of the Working Group in DIN. You can look at the initial press release from last spring, in which Gerd lays out the German national standards body's approach:

To place the work in an international context, Fraunhofer FOKUS is supporting the establishment of an international group of experts with DIN. The Working Group will take in account the experience gained in a variety of international projects such as the ODF Translator on sourceforge.

The same press release also says (and I understand that the group is conducting its work in English):

The Working Group is open and is inviting international experts who would like to contribute to the Working Group's findings.

There is a lot of interest in the harmonization topic, but at this point it's not clear what the suggestion actually entails, as Rob Weir himself acknowledged. Some people (including Rob) have suggested there is 90% overlap between the two standards and that harmonization should be a pretty simple additive process where features of one spec are added to the other spec. I don't personally believe this; I've posted before about some pretty big differences in approach (in formulas and spreadsheet performance, for example). Rationalizing these is not up to me or Rob Weir (or any specific company). The formats are not subsets and supersets of each other; they are fundamentally different. Any effort at harmonization must begin with some deep thinking about how things like text, tables, styles, graphics and page layout models differ (finding the differences is at the core of the DIN work) and how they can be rationalized when their core designs are so different.

It is actually funny to notice that Rob Weir is criticizing translation while at the same time pointing out himself that a lot of the features found to be missing in one format or the other were discovered precisely by looking at the results of translation work between the formats. Translation is a great way to get into the details and build a solution that can prove where the differences are.

Remember, this isn't about Microsoft or IBM; this is about standardization, and it takes the whole community to drive this. Rob Weir, the head of the ODF working group, stated today that he thinks harmonization is important. Well, the offer to join in the work at DIN has been on the table for a while now. As I said previously, it appears OASIS is already in discussions with the German national standards body DIN about taking a more direct role in the Working Group, as indicated by this discussion between Florian from Novell and Rob Weir. It's clearly what the community would like to see happen.

We have always agreed that interoperability between formats is important. People have different views of what harmonization really means. Instead of debating the meaning of harmonization in the abstract, let's let the standards body (DIN) continue the work of identifying the differences, in detail, between the formats. Once they finish that work, we can all decide the best way to proceed.