Taxonomy in a Digital World Part 2


Continuing the notes I have made on Everything is Miscellaneous...

Chapter 3 - The Geography of Knowledge

This chapter examines the Dewey Decimal system of classification. It shows how the system is skewed by the 19th-century American Christian views of its creator. The implication is that any attempt to classify knowledge will produce a system slanted by the cultural and political norms of its day, limiting its usefulness for finding information in the future.

Dewey's vision was that the floor plan of a library would be a map of ideas. He wanted to spatialize ideas.

Dewey's system has a limited number of top-level categories. Q. What happens when something new and important comes along? Can we fix the system for the 21st Century?

"The Dewey Decimal System can't be fixed because knowledge itself is unfixed. Knowledge is diverse, changing, imbued with the cultural values of the moment. The world is too diverse for any single classification system to work for everyone in every culture at every time."

[Does the same apply to those of us creating taxonomies for cataloguing information in our corporations around the world...? How fixed is the information people create in documents? Do documents constrain us, and will knowledge become less fixed as more Web 2.0 techniques are applied in the Enterprise?]

Weinberger contrasts this with how Amazon organises information.

  • Collaborative filtering - based on other users' actions.
  • Designed to introduce you quickly to relevant information you didn't know you wanted
  • Customer reviews enable them to "sell more of what people like"
  • Makes use of network effects - the usefulness of the system increases the more people use the system
  • Looks for patterns in the text of books, pulling out statistically interesting phrases so that similar books can be grouped together.
  • Personal organisation for each user, rather than Dewey's single universal system. (It is based on my history, and I can customise which purchases are used in generating it.)
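To make the collaborative filtering idea concrete, here is a minimal sketch in Python. It is not Amazon's actual method, only a simple "people who bought this also bought" count. The purchase data, the function names and the `ignore` parameter (standing in for the option to exclude some of your purchases) are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase histories: user -> set of titles bought.
purchases = {
    "ann": {"Everything Is Miscellaneous", "The Long Tail", "Ambient Findability"},
    "bob": {"Everything Is Miscellaneous", "The Long Tail"},
    "cat": {"Everything Is Miscellaneous", "Ambient Findability"},
    "dan": {"The Long Tail", "Wikinomics"},
}


def co_purchase_counts(histories):
    """Count how often each pair of titles appears in the same history."""
    counts = defaultdict(Counter)
    for basket in histories.values():
        for a, b in permutations(basket, 2):
            counts[a][b] += 1
    return counts


def recommend(user, histories, ignore=frozenset(), top_n=3):
    """Suggest titles the user does not own, scored by co-purchase counts.

    `ignore` lists owned titles the user does not want used as a basis
    for recommendations (an assumed stand-in for per-user customisation).
    """
    counts = co_purchase_counts(histories)
    owned = histories[user]
    scores = Counter()
    for title in owned - set(ignore):
        for other, n in counts[title].items():
            if other not in owned:
                scores[other] += n
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


print(recommend("bob", purchases))
print(recommend("bob", purchases, ignore={"The Long Tail"}))
```

Each new purchase history adds to the counts, which is the network effect in miniature: the more people use the system, the better the suggestions become for everyone.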

"The fundamental problem with Dewey's system is not that he was an eccentric or that his early education was provincial. The real problem is that any map of knowledge assumes that knowledge has a geography, that is a top-down view. That assumption makes sense in the 1st and 2nd orders of order. It unnecessarily inhibits the useful miscellaneousness of the 3rd."

Chapter 4 - Lumps and Splits

Weinberger introduces the basic concepts we use without thinking to categorize things. Lists - the most basic concept - carry the inbuilt assumption that they are about something. Trees and nesting - nesting, he says, is the fundamental technique of human understanding. He explains how Aristotle made the leap in human understanding to conceive tree structures.

But trees come with embedded assumptions:

  • A well-constructed tree gives each thing a place. If too many items get shoved in the miscellaneous pile the tree is not doing its job.
  • Each thing gets only one place
  • No one category should be too big or too small. [7 +/- 2 rule again? for branches]
  • It should be obvious what the defining principle of each category is.

"...our knowledge of the world has assumed the shape of a tree because that knowledge has been shackled to the physical. Now that the digitizing of information is allowing us to go beyond the physical in ways Aristotle could not have dreamed the shape of knowledge is changing."

Lump and split are technical terms among indexers. A lumper takes things that seem disparate and combines them because they are similar. A splitter takes things that have been lumped together and separates them into smaller categories.

Trees without paper - in a digital world we want a tree that arranges itself to your way of thinking and then changes the next day when you need to view the world in a different way. That is, a faceted classification system that dynamically constructs a browsable branching tree to meet your immediate needs. This kind of system was first invented in the 1930s by an Indian librarian, S. R. Ranganathan. In such a system no facet has to be assumed to be the root.
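A rough sketch of how a faceted system can build a different tree on demand is below. The book titles, the facet names (`subject`, `region`) and the `build_tree` function are hypothetical, invented only to illustrate the idea; real faceted classification schemes are far richer.

```python
from collections import defaultdict

# Hypothetical catalogue: every item carries several independent facets.
books = [
    {"title": "Curry Basics", "subject": "cooking", "region": "India"},
    {"title": "Tuscan Table", "subject": "cooking", "region": "Italy"},
    {"title": "Mughal Courts", "subject": "history", "region": "India"},
    {"title": "Roman Roads", "subject": "history", "region": "Italy"},
]


def build_tree(items, facet_order):
    """Nest items by the given facets, in the order the reader chooses.

    No facet is fixed as the root: whichever facet comes first in
    `facet_order` becomes the top level of this particular tree.
    """
    if not facet_order:
        # No facets left to branch on: return the leaves.
        return sorted(item["title"] for item in items)
    first, rest = facet_order[0], facet_order[1:]
    groups = defaultdict(list)
    for item in items:
        groups[item[first]].append(item)
    return {value: build_tree(group, rest) for value, group in sorted(groups.items())}


# Today I browse by subject, then region...
print(build_tree(books, ["subject", "region"]))
# ...tomorrow by region only.
print(build_tree(books, ["region"]))
```

The same four records produce both trees, so the tree is a view generated when it is asked for rather than a fixed arrangement of the items.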

"In the third order of order, a leaf can hang on many branches, it can hang on different branches for different people and it can change branches for the same person if she decides to look at the subject differently...In the third order of order, knowledge doesn't have a [single] shape."

Taxonomy in a Digital World Part 3
