Desktop Search: Solving the Wrong Problem as Quickly as Possible
Derek has a post entitled Search is not Search where he alludes to conversations we had about my post Apples and Oranges: WinFS and Google Desktop Search. His blog post reminds me why I'm so disappointed that the benefits of adding structured metadata capabilities to file systems are being equated with desktop search tools that are a slightly better incarnation of the Unix grep command. Derek wrote:
I was reminded of that conversation today, when catching up on a recent-ish publication from MIT's Haystack team: The Perfect Search Engine is Not Enough: A Study of Orienteering Behavior in Directed Search. One of the main points of the paper is that people tend not to use 'search' (think Google), even when they have enough information for search to likely be useful. Often they will instead go to a known location from which they believe they can find the information they are looking for.
For me the classic example is searching for music. While I tend to store my mp3s in a consistent directory structure such that the song's filename is the actual name of the song, I almost never use a generic directory search to find a song. I tend to think of songs as "song name: Conga Fury, band: Juno Reactor", or something like that, so when I'm looking for Conga Fury, I am more likely to walk the album subdirectories under my Juno Reactor directory, than I am to search for file "Conga Fury.mp3". The above paper talks a bit about why, and I think another key aspect that they don't mention is that search via navigation leverages our brain's innate fuzzy computation abilities. I may not remember how to spell "Conga Fury" or may think that it was "Conga Furvor", but by navigating to my solution, such inaccuracies are easily dealt with.
As Derek's example shows, comparing the scenarios enabled by a metadata based file system against those enabled by desktop search is like comparing navigating one's music library using iTunes versus using Google Desktop Search or the MSN Desktop Search to locate audio files.
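Derek's point about misremembered titles can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the library listing is made up (only "Conga Fury" comes from his example), the function names are hypothetical, and the standard library's difflib is a crude stand-in for the fuzzy matching our brains do while browsing a folder.

```python
import difflib

# Hypothetical library listing; only "Conga Fury" comes from Derek's example.
library = ["Conga Fury", "Example Track One", "Another Song"]


def exact_search(query, titles):
    """Return titles that match the query exactly, as a naive filename search would."""
    return [title for title in titles if title == query]


def fuzzy_search(query, titles, cutoff=0.6):
    """Return titles that are merely similar to the query (approximate matching)."""
    return difflib.get_close_matches(query, titles, n=3, cutoff=cutoff)


misremembered = "Conga Furvor"  # the misspelling from Derek's post
exact_hits = exact_search(misremembered, library)
fuzzy_hits = fuzzy_search(misremembered, library)

print("exact:", exact_hits)
print("fuzzy:", fuzzy_hits)
```

The exact lookup comes back empty for the misspelled title, while the approximate one still surfaces the song. Browsing an album folder gives you that tolerance for free.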
WinFS could have anchored Microsoft's plans to unify search across the desktop, network and the Internet. Further delay creates opportunity for competitors like Google to deliver workable products. It's now obvious that rather than provide placeholder desktop search capabilities until Longhorn shipped, MSN will be Microsoft's major provider on the Windows desktop. That's assuming people really need the capability. Colleague Eric Peterson and I chatted about desktop search on Friday. Neither of us is convinced any of the current approaches hit the real consumer need. I see that as making more meaningful disparate bits of information and complex content types, like digital photos, music or videos.
WinFS promised to hit that need, particularly in Microsoft public demonstrations of Longhorn's capabilities. Now the onus and opportunity will fall on Apple, which plans to release metadata search capabilities with Mac OS 10.4 (a.k.a. "Tiger") in 2005. Right now, metadata holds the best promise of delivering more meaningful search and making sense of all the digital content piling up on consumers' and Websites' hard drives. But there are no standards around metadata. Now is the time for vendors to rally around a standard. No standard is a big problem. Take for example online music stores like iTunes, MSN Music or Napster, which all tag metadata slightly differently. Digital cameras capture some metadata about pictures, but not necessarily the same way. Then there are consumers using photo software to create their own custom metadata tags when they import photos.
I agree with his statements about where the real consumer need lies but disagree when he states that no standards around metadata exist. Music files have ID3 and digital images have EXIF. The problem isn't a lack of standards but a lack of support for these standards, which is a totally different beast.
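To give a sense of how little it takes to support an existing standard, here is a minimal sketch that reads an ID3v1 tag, the original fixed-size 128-byte block at the end of an MP3 file. The function name read_id3v1 is my own, the album and year in the demo are placeholder values, and the sketch ignores the richer ID3v2 tags that most current software writes.

```python
import io


def read_id3v1(stream):
    """Return a dict of ID3v1 fields from a binary stream, or None if no tag is present."""
    stream.seek(0, io.SEEK_END)
    if stream.tell() < 128:
        return None  # too small to hold a tag
    stream.seek(-128, io.SEEK_END)
    block = stream.read(128)
    if block[:3] != b"TAG":
        return None  # ID3v1 tags always start with the marker "TAG"

    def text(raw):
        # Fields are fixed width and padded with NUL bytes (or spaces).
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(block[3:33]),
        "artist": text(block[33:63]),
        "album": text(block[63:93]),
        "year": text(block[93:97]),
    }


# Build a fake "MP3" in memory: dummy audio bytes followed by a 128-byte tag.
# The album and year below are placeholders, not real discography data.
demo_tag = (
    b"TAG"
    + b"Conga Fury".ljust(30, b"\x00")
    + b"Juno Reactor".ljust(30, b"\x00")
    + b"Example Album".ljust(30, b"\x00")
    + b"2004"
    + b"".ljust(30, b"\x00")  # comment field, left empty
    + b"\xff"                 # genre byte; 255 is commonly used for "none"
)
demo_file = io.BytesIO(b"\x00" * 256 + demo_tag)

tag = read_id3v1(demo_file)
print(tag)
```

The same function works on a real file opened with open(path, "rb"). Anything that can read the last 128 bytes of a file can answer "which band is this?" without guessing from the filename.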
I was gung ho about WinFS because it looked like Microsoft was going to deliver a platform that made it easy for developers to build applications that took advantage of the rich metadata inherent in user documents and digital media. Of course, this would require the applications that create content (e.g. digital cameras) to actually generate such metadata, which they don't today. I find it sad to read posts like Robert Scoble's Desktop Search Reviewers' Guide, where he wrote:
2) Know what it can and can't do. For instance, desktop search today isn't good at finding photos. Why? Because when you take a photo the only thing that the computer knows about that file is the name and some information that the camera puts into the file (like the date it was taken, the shutter speed, etc). And the file name is usually something like DSC0050.jpg so that really isn't going to help you search for it. Hint: put your photos into a folder with a name like "wedding photos" and then your desktop search can find your wedding photos.
What is so depressing about this post is that it costs very little for the digital camera or its associated software to tag JPEG files with comments like 'wedding photos' as part of the EXIF data, which would then make them accessible to various applications, including desktop search tools.
Perhaps the solution isn't expending resources to build a metadata platform that will be ignored by the applications that create content today, but instead giving these applications an incentive to generate this metadata. For example, once I bought an iPod I became very careful to ensure that the ID3 information on the MP3s I loaded on it was accurate, since I had a poor user experience otherwise.
I wonder what the iPod for digital photography is going to be. Maybe Microsoft should be investing in building such applications instead of boiling the ocean with efforts like WinFS, which aim to ship everything including the kitchen sink in version 1.0.