HTML5: A Specification or a Platform?

An interesting discussion broke out over the weekend in the comments on the Never Mind the Bullets blog post, prompted by my allegedly loose usage of the term HTML5 to describe the site.

Here’s the problem: while strictly HTML5 refers to a draft specification published by the W3C, HTML5 is increasingly being used (particularly in the media) as a generic term to describe the web platform that is built on an emerging generation of web standards. In this context, the term is used as a catch-all for a whole suite of specifications. As a fairly typical example, this InfoWorld article quotes a vast array of luminaries in an attempt to suggest that HTML5 is an emerging competitor to plug-ins like Flash or Silverlight.

Yet the HTML5 specification itself is taken up with very different concerns than many people assume – a large portion of the specification concerns itself with tightening the parsing and manipulation of HTML documents; a considerable proportion of the new additions to the specification have to do with form elements and improved semantic markup. While video and audio are included in the core specification, the specification doesn’t define many other technologies that are often considered part of the emerging web. These are instead described in external specifications (of varying levels of completion): for example, SVG 1.1, Canvas 2D API, WOFF File Format, WAI-ARIA, and so on. CSS is itself a whole family of specifications.

Complicating matters still further, some draft features that are not part of the W3C specification are included in a parallel HTML5 proposal from the WHATWG, a separate community led by a subset of browser vendors (Tim Anderson has a good primer for those who are interested).

So we have an overload of terminology: while at times it’s reasonable and appropriate to distinguish between each of these technologies, insisting on that distinction everywhere creates problems of its own. A purist will argue that the use of the term HTML5 to represent everything from H.264 video to SVG is wildly inaccurate, and it’s a fair point. On the other hand, we need some term to refer to this family of technologies for ease of reference, and expecting common parlance to explicitly reference a list of distinct specifications is even less realistic than trying to get everyone to refer to GNU/Linux rather than the usually shortened form. As I mentioned earlier, it’s unlikely that Wiley are going to publish “HTML5, CSS3, SVG 1.1, Canvas 2D API, ARIA, WOFF, and Indexed DB for Dummies”!

What is the answer? While I don’t like the conflation of two different meanings, I also don’t realistically see any solution on the horizon. It’s important to be able to describe the modern web client platform succinctly, but it’s also important that we’re precise with our terminology. Maybe the core HTML5 specification should be described as “HTML5 Core”: at least that would allow the term to be used in both a specific and a general sense.

Lastly, although I called it out in the comments, I’ll do it again here. If you’re looking for a good primer on the HTML5 specification, I recommend Bruce Lawson and Remy Sharp’s Introducing HTML5 book as a very readable discussion of the new features. While it contains coverage of a few features from the WHATWG proposal that are unlikely to ever be part of the W3C HTML5 specification, it’s both pragmatic and (in my view) fair-minded.

Thoughts? How do you usually differentiate between these two uses of the term HTML5?