Simple Semantics With Microformats, Part 1

Emily P. Lewis | September 15, 2010

 

The Merriam-Webster Online Dictionary defines semantic as “of or relating to meaning in language.” When I first heard the term, I was instantly intrigued. A fancy word for “meaning” was just up my alley as a precocious 10-year-old fascinated by words and language.

When I began my journey as a web professional, semantics once again entered the picture. I had read Jeffrey Zeldman’s Designing With Web Standards and was an instant convert to structural and semantic markup. It just made sense to me: use HTML elements to describe the structure and content of the web document. And it came with benefits beyond clean, semantic markup:

  • Supported the web standard of separating content from presentation
  • Offered added accessibility
  • Provided portability
  • Helped with future-proofing
  • Provided foundation for internal standards for efficient team development

Enter Microformats

Four years ago, I was introduced to microformats and, once again, became an immediate convert. Their simplicity was what got me first. All I needed to know was what I already knew: HTML. But it was the semantic power of microformats that sealed the deal and turned me into a microformats junkie.

In this series, I hope to make you a convert too. In this first article, you’ll learn the basics of microformats, as well as the rel-tag and XFN microformats. In parts two and three, we’ll cover hCard and hCalendar, respectively. And then, in part four, you’ll learn how to combine these microformats for even greater semantic richness.

So let’s start with those basics: what are microformats?

At the highest level, microformats are a way to add meaning (semantics) to common web content. At their foundation, microformats are simply sets of HTML attributes (most often rel and class) and values applied to markup in order to describe the content.

rel-tag

To best illustrate microformats, let’s look at one of the simplest microformats, rel-tag. It is applied to tag hyperlinks, such as those blog authors commonly assign to posts:

<a href="/tag/microformats" rel="tag">microformats</a>

The added rel="tag" attribute-value pair indicates that the hyperlink destination (href) is a page that describes what the content, or part of the content, is about. And that’s it! With a tiny bit of extra markup, you’ve added metadata that describes the content.

You may be asking yourself why this is necessary, especially because “tagging” is a common convention across the web; most people understand that when they see a list of tags, those terms are keywords for the content.

And here lies the beauty of microformats.

Humans First, Machines Second

One of the principles of microformats is that they are designed for humans first, machines second. This simply means that microformats should, first and foremost, be invisible to humans. That is, web content published with microformats should appear the same to users as web content published without.

Let’s consider the rel-tag example. From a user’s perspective, the “microformats” tag simply appears as a hyperlink on the page. The content is readable. The link functions. The user understands that it is a keyword for the content and that it is a hyperlink that can be selected.

Now, for the “machines second” part.

While humans have the necessary context for the tag link, machines — computers, user agents, applications and the like — don’t. To machines, it is just a link on the page, with no special relationship to anything. By adding the rel-tag microformat, however, machines instantly have the metadata necessary to understand that it is a tag describing the content.
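To make the “machines second” idea concrete, here is a minimal sketch of how a program might pick tag links out of a page. It is not how Operator or any particular parser works; it simply uses Python’s standard html.parser module, and the names RelTagParser, extract_tags and sample are made up for this illustration.

```python
from html.parser import HTMLParser


class RelTagParser(HTMLParser):
    """Collect (href, link text) for every <a> whose rel includes "tag"."""

    def __init__(self):
        super().__init__()
        self.tags = []      # list of (href, text) tuples found so far
        self._href = None   # href of the tag link currently being read
        self._text = []     # text fragments inside that link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel holds a space-separated list, so "tag" may sit beside other values
        rel_values = (attr.get("rel") or "").lower().split()
        if "tag" in rel_values and attr.get("href"):
            self._href = attr["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.tags.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_tags(html):
    """Return every rel-tag link found in an HTML string."""
    parser = RelTagParser()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical page fragment: one tag link and one ordinary link
sample = (
    '<p>Filed under <a href="/tag/microformats" rel="tag">microformats</a> '
    'and <a href="/about">about</a>.</p>'
)
print(extract_tags(sample))
```

Only the first link is reported, because only it carries rel="tag". Without that attribute, the program would have no way to tell the two links apart.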

And once machines have this context, they can extract the content and use it to provide a wide range of functionality. For example, there is a Firefox add-on, Operator, which parses the rel-tag microformat (among others) on web pages and provides a context-specific search for tag keywords:

Figure 1: The Operator add-on for Firefox parses rel-tag and provides tag-specific search

The Semantic Web

This notion of machine readability is not only a principle of microformats, but also of The Semantic Web, which aims to extend our current web with greater semantics to help machines and humans to work together more effectively:

… most information on the web is designed for human consumption, and … the structure of the data is not evident to a robot browsing the web. … the Semantic Web approach instead develops languages for expressing information in a machine processable form.

– Tim Berners-Lee

Now, microformats aren’t an official Semantic Web technology like XML, RDF and OWL, but they do embrace the machine-readability principle and they work just fine alongside these more advanced semantic languages. Further, they provide this machine-readability by using existing and well-known standards, like HTML, so users aren’t required to learn an entirely new language.

So, if you are looking to make your content more meaningful, especially to machines, microformats provide one of the simplest ways to add semantics. And, in return, you get parsable content that can be searched for, extracted, indexed, downloaded, saved, cross-referenced and combined. All with just a few HTML attributes and values.

Benefits Beyond Semantics

It is thanks to these machine-readable semantics that microformats can provide a wide range of benefits:

  • Findability: Search engines can parse microformatted content (addresses, events, products, reviews) from a site and deliver more useful and relevant results.
  • User experience enhancements: Contact and event information published with hCard and hCalendar, respectively, can be downloaded into users’ electronic address books and calendars.
  • Minimal investment: Microformats are easy to learn, easy to implement and don’t require any special software or technologies.
  • Web standards: Microformats encourage the use of valid POSH (Plain Old Semantic HTML).
  • Workflow efficiencies: Because microformats use a standard set of attribute values, teams can establish internal standards for class naming, as well as for markup patterns.

Common Web Content

Another principle of microformats is that they are intended for the most common use cases; the most common web content found today.

Formal Microformats

First up are the formal microformats, which are stable and unlikely to change:

Draft Microformats

Next, we have draft microformats, which are relatively far along in the specification process, but have yet to be formalized and, as such, are subject to change:

Personally, I don’t shy away from the draft microformats just because they are in a state of (minimal) flux. If I have the content and there is a microformat to describe that content, I usually go ahead and publish it. I simply make a point of staying up to date on the specifications, so that if there are changes, I can make them.

What About Uncommon Content?

For content that microformats don’t cover, such as that specific to a national laboratory or a university, another semantic language (perhaps RDFa) is more appropriate. And, as I already mentioned, microformats can be used alongside RDFa: microformats for the “common” content and RDFa for the specialized content.

Is Anyone Really Using Them?

I often hear from fellow web practitioners questioning whether microformats are really used in practice. The answer is a firm yes. From major companies like Google and Yahoo!, to popular social networks like Twitter and Facebook, microformats are all over the web.

In fact, according to Yahoo! SearchMonkey, there are over 6 billion published microformats on the web. Here are just a few of the places you can find them in the wild:

  • hCard: Yahoo! Local, Google Rich Snippets, Google Maps, Google Profiles, BrightKite, Twitter, Last.fm, 37Signals’ Basecamp, Telnic, Gravatar
  • hCalendar: Facebook, Yahoo! Upcoming, Eventful, Google Rich Snippets, MapQuest Local
  • hResume: LinkedIn, SimplyHired, Xing
  • XFN: Twitter, Flickr, Digg, Technorati, Ident Engine, Plaxo, Google Social Graph, Last.fm

Further, it seems microformats are used by web practitioners far more than other semantic languages. The State of Web Development 2010 survey reports 34.52% use microformats, while just 5.63% use RDFa.

These numbers are supported by Google’s Rich Snippets initiative, which uses semantic technologies (primarily microformats and RDFa) to provide more meaningful search results. Google recently reported that when it finds data for Rich Snippets on pages, 94% of the time that data is published with microformats.

Practical Microformats

Now that you know the basics of microformats, as well as some of their broader benefits, let’s shift focus to practical implementation so you can start adding semantics to your site, beginning with XFN.

XFN

The XFN (XHTML Friends Network) microformat is used to describe social relationships among people online. Consider a blog post where you mention a person and provide a link to his site. With XFN, you can indicate what your social relationship with that person is by assigning specific rel values to the link:

<a href="https://tantek.com" rel="met colleague friend muse">Tantek Çelik</a>

In this example, the values I assigned tell machines that I’ve met Tantek, he is a colleague and a friend, and I consider him a muse.

There are many more values XFN supports for indicating these social relationships:

| Relationship | XFN rel Values |
| --- | --- |
| Friendship (one value) | contact, acquaintance, friend |
| Professional (one or both values) | colleague, co-worker |
| Family (one value) | kin, spouse, child, parent, sibling |
| Romantic (any or all values) | muse, crush, date, sweetheart |
| Physical | met |
| Geographic (one value) | neighbor, co-resident |
| Identity | me |

And, just as in the example, it is simply a matter of assigning the appropriate values to the rel attribute of the link.
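As a rough illustration of what a machine can do with these values, the sketch below groups the XFN values on each link by relationship type and flags categories that received more than one value where the table above allows only one. The names XFN_CATEGORIES, classify_rel, conflicts and parse_xfn are invented for this example and do not belong to any XFN tool.

```python
from html.parser import HTMLParser

# XFN values grouped by relationship type, as listed in the table above
XFN_CATEGORIES = {
    "friendship": {"contact", "acquaintance", "friend"},
    "professional": {"colleague", "co-worker"},
    "family": {"kin", "spouse", "child", "parent", "sibling"},
    "romantic": {"muse", "crush", "date", "sweetheart"},
    "physical": {"met"},
    "geographic": {"neighbor", "co-resident"},
    "identity": {"me"},
}
# Categories for which the table allows only one value per link
SINGLE_VALUE = {"friendship", "family", "geographic"}


def classify_rel(rel):
    """Group the XFN values in a rel string by category; skip other values."""
    found = {}
    for value in rel.lower().split():
        for category, allowed in XFN_CATEGORIES.items():
            if value in allowed:
                found.setdefault(category, []).append(value)
    return found


def conflicts(found):
    """Return single-value categories that were given more than one value."""
    return sorted(c for c in SINGLE_VALUE if len(found.get(c, [])) > 1)


class XFNParser(HTMLParser):
    """Collect (href, classified rel values) for links that use XFN."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        found = classify_rel(attr.get("rel") or "")
        if found and attr.get("href"):
            self.links.append((attr["href"], found))


def parse_xfn(html):
    """Return every XFN-annotated link found in an HTML string."""
    parser = XFNParser()
    parser.feed(html)
    parser.close()
    return parser.links


# The link from the example earlier in this section
html = '<a href="https://tantek.com" rel="met colleague friend muse">Tantek Çelik</a>'
for href, found in parse_xfn(html):
    print(href, found, "conflicts:", conflicts(found))
```

For the Tantek link, each of the four values lands in a different category, so no conflicts are reported. A link with rel="friend acquaintance", by contrast, would be flagged under friendship.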

Creating a Social Network

So, just what is the benefit of indicating a social relationship online with XFN? It allows machines to identify who you are connected to online. And not just within social networks like Twitter or Facebook, but across the entire web.

Google’s Social Graph API, for example, can identify the URLs of all people you are connected to online and who have indicated a relationship via XFN:

Figure 2: the Google Social Graph API "My Connections" demo uses XFN to identify your online social relationships

Identity Consolidation With rel-me

Another benefit of XFN is that it aids in online identity consolidation. By assigning rel="me" to links that represent you online — your blog, your portfolio, your social network profiles — machines can identify the URLs that comprise who you are online.

On my blog, for example, I have a link to my Twitter profile and I publish it with rel-me:

<a href="https://twitter.com/emilylewis" rel="me">Twitter</a>

Twitter, meanwhile, also uses XFN and assigns rel-me to the URL on my profile (in my case, my blog):

<a href="https://ablognotlimited.com" rel="me">A Blog Not Limited</a>

And many of my other social network profiles (Flickr, Last.fm, FriendFeed, Digg, etc.) also use XFN’s rel-me, which means via a single link, machines can identify all the URLs that represent me online, giving me a consolidated identity.
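The idea of following rel-me links from one page to the next can be sketched in a few lines. This is only an illustration under simplifying assumptions: the pages are hard-coded strings in a hypothetical PAGES dictionary rather than fetched from the web, and the function names rel_me_links and consolidate are invented here. It does not describe how any real service performs identity consolidation.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect the href of every <a> whose rel includes "me"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "me" in rel_values and attr.get("href"):
            # Drop a trailing slash so the same URL compares equal either way
            self.hrefs.append(attr["href"].rstrip("/"))


def rel_me_links(html):
    """Return the rel-me link targets found in an HTML string."""
    parser = RelMeParser()
    parser.feed(html)
    parser.close()
    return parser.hrefs


def consolidate(start_url, pages):
    """Follow rel-me links from start_url, breadth first, through pages."""
    seen = []
    queue = [start_url.rstrip("/")]
    while queue:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.append(url)
        # Pages we have no copy of contribute no further links
        queue.extend(rel_me_links(pages.get(url, "")))
    return seen


# Hypothetical stand-ins for fetched pages, using the two links shown above
PAGES = {
    "https://ablognotlimited.com":
        '<a href="https://twitter.com/emilylewis" rel="me">Twitter</a>',
    "https://twitter.com/emilylewis":
        '<a href="https://ablognotlimited.com" rel="me">A Blog Not Limited</a>',
}
print(consolidate("https://ablognotlimited.com", PAGES))
```

Starting from the blog URL alone, the sketch discovers the Twitter profile, and the link back to the blog is recognized as already seen. Each additional profile that publishes rel-me links would extend the list in the same way.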

To see this in action, check out Ident Engine. This JavaScript library utilizes the Social Graph API to identify those rel-me links and generates a list of all found profiles from a single URL:

Figure 3: Ident Engine utilizes the Social Graph API and XFN to provide a consolidated list of online identities

Where might this be useful? What about during the profile creation process for a social network? Often, these profiles allow users to indicate other online profiles by filling out a form. A tedious process if you are addicted to social networking like I am.

But, by utilizing XFN and identity consolidation, this process of defining “elsewhere” links can be simplified. Huffduffer, the podcast sharing site, does just this. New users are asked to sign up with a single URL:

Figure 4: Huffduffer account creation form

If this URL utilizes rel-me, Huffduffer uses the Social Graph API to generate a list of other profile links, eliminating the need for users to manually enter these URLs:

Figure 5: "Elsewhere" links are generated from the provided website URL by using the Social Graph API to identify rel-me links

It is worth noting that this form auto-fill uses XFN for the identity consolidation and hCard, which you’ll learn in part two of this series, for the specific contact information.

Coming in Part 2

So far, you’ve learned what microformats are, their benefits and how to implement both rel-tag and XFN. And I hope you are intrigued, because there’s more to learn! In Part 2 of this series, we’ll take a close look at the hCard microformat for contact information.

In the meantime, if you want more microformats goodness, check out these resources:

 

About the Author

Emily Lewis is a freelance web designer of the standardista variety, which means she gets geeky about things like semantic markup and CSS, usability and accessibility. As part of her ongoing quest to spread the good word about standards, she writes about web design on her blog, A Blog Not Limited, and is the author of Microformats Made Simple and a contributing author for the HTML5 Cookbook. She’s also a guest writer for Web Standards Sherpa, .net magazine and MIX Online.

In addition to loving all things web, Emily is passionate about community building and knowledge sharing. She co-founded and co-manages Webuquerque, the New Mexico Adobe User Group for Web Professionals, and is a co-host of The ExpressionEngine Podcast. Emily also speaks at conferences and events all over the country, including SXSW, MIX, In Control, Voices That Matter, New Mexico Technology Council, InterLab and the University of New Mexico.
