Interoperability: what's in a name?

Interoperability is one of those words that means different things to different people. We all agree that it means something like "the ability of systems to work effectively together," but beyond that our agreements get murky at best. Consider, for example, the classification of business products and services into "horizontal" and "vertical" markets, as shown here:

The fundamental concept is that a vertical market encompasses all the business activity within a specific industry (healthcare, say) regardless of the type of system or software, whereas a horizontal market (word processing, for instance) corresponds to a specific type of functionality that is used across multiple industries. This chart shows just a few typical examples; others include retail or government vertical markets and email or presentation horizontal markets.

Now, when you use the word interoperability, which do you mean? Horizontal interoperability -- between competing spreadsheet products, say -- or vertical interoperability, such as interoperability between all of the systems used in an insurance company?

I think the answer usually depends on your perspective, and specifically on what you do for a living. For those of us in the software business, it's tempting to view the world in terms of application types: word processing, spreadsheet, email, etc. So we tend to think of interoperability in that way, as a horizontal-market concept: can these two word processing programs share documents? Can these two databases share data?

But if you're in a business where technology is just a tool rather than your core business (i.e., a vertical market), you probably have a quite different view of interoperability. For example, if you work in a manufacturing company, you probably see interoperability in terms of the ability of your various systems -- ERP, accounting, distribution, logistics -- to communicate effectively with one another.

There may be an occasional need for horizontal interoperability within a vertical market, to be sure. For example, you may have upgraded from one word-processing program to another, and you need to be sure you can open the old documents in the new software without any surprises. That's a very common scenario. But the concept of two word-processing programs sitting side-by-side on your desktop is rare in the corporate world, because most organizations standardize their use of horizontal applications wherever possible, to reduce ongoing support and maintenance costs.

Document Formats and Interoperability

Standardized document formats play an interesting role in this horizontal/vertical view of interoperability, because they can enable interoperability of both types. For example, the Open XML formats enable horizontal interoperability by supporting all of the functionality in existing Office binary documents, and they enable vertical interoperability through custom schema support. The ODF translator project further extends horizontal interoperability by providing a bridge between the worlds of Open XML and ODF documents.

Another concept that is closely related to the questions of vertical/horizontal interoperability in document formats is the difference between technical and semantic interoperability. (Thanks to Vijay Kapur for helping me see this perspective clearly.) Technical interoperability between word-processing applications means that we have a standardized way to mark text as bold, indicate font size, and handle similar formatting details. Semantic interoperability occurs when we standardize how we represent business concepts such as customers, orders, invoices, and so on. Referring back to the chart above, semantic interoperability enables vertical interoperability and technical interoperability enables horizontal interoperability.

Semantic interoperability is much more challenging to achieve, because the needs of each industry are different. Everyone knows the concept of bold text, but only those in the insurance industry think about policy limits and claim numbers, and only those in education think about test scores and classroom scheduling. So each industry or trading group tends to have its own jargon, and that jargon is often codified and standardized in a schema that defines the structure of the XML documents used in that industry.

Open XML's custom schema support

In Open XML, there are two general approaches for supporting these custom schemas. (By the way, a bit of jargon to understand: the Open XML spec refers to its own schemas as reference schemas, and all other schemas -- such as your industry's unique schemas -- are called custom schemas.)

The first type of custom-schema support is custom XML markup, or content tagging. You can use a custom schema to tag the content in a word-processing document, for example, by inserting customXml elements around chunks of content within the body of the document like this:

<customXml uri="http://MyNamespace" element="MyElement"> ... content that is tagged with this element ... </customXml>

Typical uses of this approach would be to tag content within a document to identify the customer number or order number for an order-processing system, or to tag the title or abstract of a document for a content-management system. This type of content tagging can be nested as required by your schema, and there are options in Microsoft Word to tag content manually through a simple user interface. Custom-generated documents can also have the markup added programmatically, using the syntax shown in the example above.

Note that your own custom elements aren't added to the document markup directly. Rather, there's a customXml element that is already defined in the reference schemas, and the name of your custom element (e.g., MyElement) appears as the value of an attribute on that customXml element, alongside the namespace URI of your schema. This simple and flexible approach is also used in microformats, which similarly add semantics to XHTML pages without the need to add new elements or change existing reference schemas.
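To make the "added programmatically" idea concrete, here is a minimal sketch in Python of how a document generator might wrap a run of text in a customXml element. It uses the w: prefix and namespace of WordprocessingML rather than the unprefixed shorthand in the example above. The OrderNumber element name, the order value, and the helper names (w, tag_run, build_paragraph) are all made up for illustration, and the result is only a paragraph fragment, not a complete document.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, conventionally bound to the "w" prefix.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Return a namespace-qualified name for a WordprocessingML tag or attribute."""
    return f"{{{W_NS}}}{tag}"


def tag_run(text, uri, element):
    """Wrap a run of text in a customXml element.

    The custom schema's element name travels as an attribute value;
    no new element is added to the reference schema.
    """
    custom = ET.Element(w("customXml"), {w("uri"): uri, w("element"): element})
    run = ET.SubElement(custom, w("r"))
    t = ET.SubElement(run, w("t"))
    t.text = text
    return custom


def build_paragraph():
    """Build a paragraph containing one tagged order number (hypothetical data)."""
    p = ET.Element(w("p"))
    p.append(tag_run("10428", "http://MyNamespace", "OrderNumber"))
    return p


paragraph = build_paragraph()
markup = ET.tostring(paragraph, encoding="unicode")
print(markup)
```

An order-processing system reading the document could then look for customXml elements whose element attribute is OrderNumber, without needing to understand anything else about the paragraph.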

Another way that Open XML supports custom schemas is through custom XML parts, which can be based on any schema at all. These parts are inserted into a document as a stand-alone island of business data, and the nodes in that custom data can be bound to presentation elements such as structured document tags (content controls). This architecture makes it very easy for custom business applications to read and write the business data in a document without processing (or even being exposed to) the document-related markup itself.
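As a rough illustration of that last point, the sketch below reads business data out of a custom XML part without parsing any document markup. Several things here are assumptions on my part: the part name customXml/item1.xml (a location Word commonly uses, but the package relationships are what actually identify the part), the order schema and its values, and the helper names. The in-memory package is also deliberately incomplete, since it omits the content-types and relationship parts that a real Open XML document requires.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed part name; a robust reader would resolve it via package relationships.
CUSTOM_PART = "customXml/item1.xml"

# Hypothetical business data conforming to a made-up custom schema.
ORDER_XML = (
    '<order xmlns="http://MyNamespace">'
    "<customerNumber>C-1001</customerNumber>"
    "<orderNumber>10428</orderNumber>"
    "</order>"
)


def make_package():
    """Build a toy ZIP package holding a document part and a custom XML part."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("word/document.xml", "<placeholder/>")  # never read below
        z.writestr(CUSTOM_PART, ORDER_XML)
    return buf.getvalue()


def read_order(package_bytes):
    """Extract the business data from the custom XML part only."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        root = ET.fromstring(z.read(CUSTOM_PART))
    ns = {"m": "http://MyNamespace"}
    return {
        "customer": root.findtext("m:customerNumber", namespaces=ns),
        "order": root.findtext("m:orderNumber", namespaces=ns),
    }


order = read_order(make_package())
print(order)
```

The business application touches only the custom part; the presentation side of the document (the content controls bound to those nodes) stays out of its way.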

I'll provide some specific examples of content tagging and custom XML parts in future posts, but this one is getting a bit long so I'll wrap it up for now. The key concept I wanted to cover today is that interoperability means different things to different people. Those of us in the technology business need to worry about technical/horizontal interoperability, but most people in other industries need to worry more about semantic/vertical interoperability, because that's the key to integrating disparate systems and business processes.

In future posts, I'll show some specific examples of documents that use Open XML's custom schema support to enable vertical interoperability in various scenarios.