Differences in vocabulary

The workshop I attended last week at Harvard was a really great learning experience for me. I spend most of my time focused on technology, and not nearly as much on the policies and practices that need to be put in place to best leverage those technologies. Over the two-day workshop I had the chance to listen to a number of technology leaders in the public sector. They have a big focus on interoperability to help make communication more efficient, and there was a big focus on archival as well.

One interesting thing I noticed right away was that we don't always use the same terminology. A couple of terms really stood out to me because of how often I hear folks talk about them in my current job:


Interoperability

For a number of folks in the public sector, interoperability primarily meant radio communication. With police, for example, they want to make sure everyone can quickly and efficiently communicate with each other, which is why communication standards are so important. When I mentioned interoperability in relation to document formats and semantics, there was often a little confusion at first because of this. The principles are the same, but I usually mean it more generally: interoperability is about creating an efficient way to share information.

Custom-defined schemas are the key to interoperability in terms of documents. Some of the folks at the session referred to this as "Web 3.0". I'm not sure it needs that grand of a name, as it's a pretty basic concept. It just means that documents are no longer documents in the traditional sense, but instead collections of data. Another way people talk about this basic concept is "microformats." It doesn't really matter what term you use, though. What's important is to realize that in order to truly get quick, efficient sharing of data, you need to have the ability to structure that data within your documents.

If the office document that your target users are creating goes beyond simply specifying the display information and also directly calls out all the semantic information, this takes you to a completely different level of interoperability. When I think of interoperability, I think of documents interacting with systems and processes in ways no one is really doing right now.
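
To make the "documents as collections of data" idea a bit more concrete, here's a rough sketch in Python of what a system on the receiving end might do. Everything in it is made up for illustration: the namespace, the element names, the sample values, and the `extract_fields` helper are all hypothetical and don't come from any real schema or product. The point is only that once a document carries data tagged with a custom schema, another system can pull that data out without guessing at the formatting.

```python
import xml.etree.ElementTree as ET

# Hypothetical custom schema: this namespace and these element names
# are invented for illustration only.
NAMESPACE = "urn:example:incident-report"

# Sample data as it might be embedded in a document (invented values).
DOCUMENT_DATA = """\
<inc:report xmlns:inc="urn:example:incident-report">
  <inc:caseNumber>A-00123</inc:caseNumber>
  <inc:agency>Example County Sheriff</inc:agency>
  <inc:status>Open</inc:status>
  <inc:summary>Vehicle stopped on Route 9.</inc:summary>
</inc:report>
"""


def extract_fields(xml_text):
    """Return the custom-schema fields of a report as a plain dict."""
    root = ET.fromstring(xml_text)
    prefix = "{" + NAMESPACE + "}"
    fields = {}
    for child in root:
        # ElementTree expands tags to "{namespace}localName".
        # Keep only elements that belong to our (hypothetical) schema.
        if child.tag.startswith(prefix):
            local_name = child.tag[len(prefix):]
            fields[local_name] = (child.text or "").strip()
    return fields


if __name__ == "__main__":
    for name, value in extract_fields(DOCUMENT_DATA).items():
        print(f"{name}: {value}")
```

A server that can do this kind of extraction doesn't care what the document looks like on screen. It just reads the case number or the agency straight out of the tagged data, which is really what I mean by structuring the data within your documents.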

Open Source

This was a really big eye-opener for me. Many people were talking about open source specifically as a content sharing model. In many folks' minds, Wikipedia and open source could be thought of as analogous. I think I've adopted a view of open source that is much too narrow. There were folks from the defense department, for example, who said they wanted to set up an open source model for sharing intelligence information within their organization. I had previously thought about open source more in terms of the licensing model chosen. Well, obviously the folks from the defense department weren't thinking they wanted to put all the content under the GPL. Instead, they wanted a system where people could easily share information within their targeted community. This is something I believe strongly in. Easy collaboration and sharing of content was one of the big scenarios we were going after with the XML formats in Office. If the server technology is able to interpret the document content, you can build some powerful solutions.

Other Observations

There are a bunch of other things I wanted to write down about the conference, and hopefully I'll get to them in the next couple of days. There were some IBM folks there as well, including Bob Sutor. Bob led a discussion around a case study where folks realized they had to bring more emotion back to certain IT systems that had become too rigid and form-like. The study focused on child welfare case workers who weren't being encouraged to really think about each case, and were forced to use a system that basically just consisted of a series of checkboxes to go through. This moved folks away from focusing on the true need, which is what's in the child's best interest. So they made the move to change up the system and to standardize around a separate color scheme for all sites relating to children. At first there was pushback because folks would say: "We can't do that, we have standards, and everything has to be blue." Eventually, though, they were able to break away from the rigid system and build something smarter and more targeted. It was an interesting talk.

I was also really interested in the document archival case studies. I worked with the British Library and the Library of Congress on the OpenXML standardization last year, so I've already had a lot of exposure to this issue. It's a very important issue that governments are now dealing with. You have content coming in all sorts of formats (documents, video, audio, e-mail, pictures, etc.), and it's important to maintain them all for the public good. The case study that was discussed at the workshop last week was with the State of Washington digital archives. I'll write up a separate post on that as I believe it's a really important topic.