Folksonomies, Taxonomies and Metacrap
I've been doing a bit of reading about folksonomies recently. The definition of folksonomy in Wikipedia currently reads
Folksonomy is a neologism for a practice of collaborative categorization using simple tags in a flat namespace. This feature has begun appearing in a variety of social software. At present, the best examples of online folksonomies are social bookmarking sites like del.icio.us, a bookmark sharing site, Flickr, for photo sharing, and 43 Things, for goal sharing.
What I've found interesting about current implementations of folksonomies is that they are blogging with content other than straight text. del.icio.us is basically a linkblog and Flickr isn't much different from a photoblog/moblog. The innovation in these sites is in merging the category metadata from the different entries such that users can browse all the links or photos which match a specified keyword. For example, here are recent links added del.icio.us with the tag 'chicken' and recent photos added to Flickr with the tag 'chicken'. Both sites not only allow browsing all entries that match a particular tag but also go as far as alowing one to subscribe to particular tags as an RSS feed.
I've watched with growing bemusement as certain people have started to debate whether folksonomies will replace traditional categorization mechanisms. Posts such as The Innovator's Lemma by Clay Shirky, Put the social back into social tagging by David Weinberger and it's the social network, stupid! by Liz Lawley go back and forth about this very issue. This discussion reminds me of the article Don't Let Architecture Astronauts Scare You by Joel Spolsky where he wrote
A recent example illustrates this. Your typical architecture astronaut will take a fact like "Napster is a peer-to-peer service for downloading music" and ignore everything but the architecture, thinking it's interesting because it's peer to peer, completely missing the point that it's interesting because you can type the name of a song and listen to it right away.
All they'll talk about is peer-to-peer this, that, and the other thing. Suddenly you have peer-to-peer conferences, peer-to-peer venture capital funds, and even peer-to-peer backlash with the imbecile business journalists dripping with glee as they copy each other's stories: "Peer To Peer: Dead!"
I think Clay is jumping several steps ahead to conclude that explicit classification schemes will give way to categorization by users. The one thing people are ignoring in this debate (as in all technical debates) is that the various implementations of folksonomies are popular because of the value they provide to the user. When all is said and done, del.icio.us is basically a linkblog and Flickr isn't much different from a photoblog/moblog. This provides inherent value to the user and as a side effect [from the users perspective] each new post becomes part of an ecosystem of posts on the same topic which can then be browsed by others. It isn't clear to me that this dynamic exists everywhere else explicit classification schemes are used today.
One thing that is clear to me is that personal publishing via RSS and the various forms of blogging have found a way to trample all the arguments against metadata in Cory Doctorow's Metacrap article from so many years ago. Once there is incentive for the metadata to be accurate and it is cheap to create there is no reason why some of the scenarios that were decried as utopian by Cory Doctorow in his article can't come to pass. So far only personal publishing has provided the value to end users to make both requirements (accurate & cheap to create) come true.
Postscript: Coincidentally I just noticed a post entitled Meet the new tag soup by Phil Ringnalda pointing out that emphasizing end-user value is needed to woo people to create accurate metadata in the case of using semantic markup in HTML. So far most of the arguments I've seen for semantic markup [or even XHTML for that matter] have been academic. It would be interesting to see what actual value to end users is possible with semantic markup or whether it really has been pointless geekery as I've suspected all along.