Clay Shirky on categorization, links and tags
Posted by direwolff on May 19, 2005
I thought the following was a great piece to set in contrast to my previous post. I have met Clay Shirky, have great respect for him, and read him regularly. His piece is thought-provoking and offers a worthwhile perspective to consider in all of these discussions.
Here’s an excerpt and the link to the full post below…
Ontology is Overrated: Categories, Links, and Tags
This piece is based on two talks I gave in the spring of 2005 — one at the O’Reilly ETech conference in March, entitled “Ontology Is Overrated”, and one at the IMCExpo in April entitled “Folksonomies & Tags: The rise of user-developed classification.” The written version is a heavily edited concatenation of those two talks.
Today I want to talk about categorization, and I want to convince you that a lot of what we think we know about categorization is wrong. In particular, I want to convince you that many of the ways we’re attempting to apply categorization to the electronic world are actually a bad fit, because we’ve adopted habits of mind that are left over from earlier strategies.
I also want to convince you that what we’re seeing when we see the Web is actually a radical break with previous categorization strategies, rather than an extension of them. The second part of the talk is more speculative, because it is often the case that old systems get broken before people know what’s going to take their place. (Anyone watching the music industry can see this at work today.) That’s what I think is happening with categorization.
What I think is coming instead are much more organic ways of organizing information than our current categorization schemes allow, based on two units — the link, which can point to anything, and the tag, which is a way of attaching labels to links. The strategy of tagging — free-form labeling, without regard to categorical constraints — seems like a recipe for disaster, but as the Web has shown us, you can extract a surprising amount of value from big messy data sets.
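To make the two units in that last paragraph concrete, here is a minimal sketch in Python of what "attaching labels to links" might look like as a data structure. It is not from Shirky's piece: the `TagStore` class, its method names, and the example.com URLs are all hypothetical, chosen only for illustration. The point it tries to show is that there is no category tree anywhere, just links and whatever labels people happen to apply.

```python
from collections import Counter, defaultdict


class TagStore:
    """Hypothetical in-memory store of free-form labels attached to links."""

    def __init__(self):
        # tag -> set of URLs carrying that tag (no fixed hierarchy of categories)
        self._links_by_tag = defaultdict(set)
        # URL -> count of how many times each tag was applied to it
        self._tags_by_link = defaultdict(Counter)

    def tag(self, url, *labels):
        """Attach any number of free-form labels to a link."""
        for label in labels:
            normalized = label.strip().lower()
            if not normalized:
                continue  # ignore empty labels
            self._links_by_tag[normalized].add(url)
            self._tags_by_link[url][normalized] += 1

    def links_for(self, label):
        """Return every link carrying the given label, sorted for stable output."""
        return sorted(self._links_by_tag.get(label.strip().lower(), set()))

    def top_tags(self, url, n=3):
        """Return the n labels most often applied to a link."""
        return [t for t, _ in self._tags_by_link[url].most_common(n)]


# Two imaginary users tag the same link; nobody agreed on a scheme first.
store = TagStore()
store.tag("http://example.com/a", "Ontology", "web")
store.tag("http://example.com/a", "ontology")
store.tag("http://example.com/b", "web")

print(store.links_for("web"))
print(store.top_tags("http://example.com/a", 1))
```

Even in a toy like this, the "value from messy data" comes from aggregation: the most frequently applied label for a link emerges from counting, not from anyone deciding up front where the link belongs.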