Tagging dimensions

Monday, July 21, 2008 - 3:38pm

The mid-'90s was the era of Yahoo!'s directory. The web was organized by hierarchical categories (the names and children of which were decided by Yahoo!), and somebody visited sites to place them in a category. You searched the hierarchy, not the text of the web pages themselves. (Search engines came only a little bit later.)

Web browsers usually have the ability to save web addresses as bookmarks. These can also be organized in a system of hierarchical folders, like a file system, or like the file cabinet from which both draw their metaphor.

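Web 2.0 has popularized the concept of tagging data: marking it rather than filing it. This increases the complexity of categorization exponentially. With n tags to choose from, an item can carry any of 2^n combinations of them, rather than sitting in a single folder.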

On Wikipedia, pages are not organized in a tree of knowledge, but can be labeled in any number of ways. For instance, the Frank Sinatra page on Wikipedia is predictably in the American crooners category, but not so predictably in the categories of American Roman Catholics, Grammy Lifetime Achievement Award winners, and people who died by myocardial infarction. There's no linear file cabinet or tree of knowledge; pages are organized along any number of orthogonal directions.

Social bookmarking sites and browsers like Flock do the same for our bookmarks. Do I want to categorize xkcd--a web comic about math, academia, and other nerdy stuff--within "math" or "comics"? Why choose? Looking at the page for xkcd shows that people have done both. They have also used "geek", "funny", and the self-tag "xkcd".

The power of tagging to organize data can be seen when you click on one of those tag pages and see what else has been categorized that way. Following the link to the math page, you might find yourself at a page about the quincunx. Go down the funny road and you arrive at an impressive collection of pictures taken at just the right time. The point is that the interconnectedness of ideas becomes so much richer the more ways you can organize them.
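The "tag page" idea can be sketched as an inverted index from each tag to the items that carry it. This is only an illustration: the item names and the `build_tag_index` and `neighbors` helpers are made up for this post, and the tags are just the ones mentioned above, not data pulled from any real site.

```python
from collections import defaultdict

# Hypothetical bookmarks, tagged with the examples mentioned in the text.
bookmarks = {
    "xkcd": {"math", "comics", "geek", "funny", "xkcd"},
    "quincunx": {"math"},
    "perfectly timed photos": {"funny"},
}


def build_tag_index(items):
    """Map each tag to the set of item names carrying that tag."""
    index = defaultdict(set)
    for name, tags in items.items():
        for tag in tags:
            index[tag].add(name)
    return index


def neighbors(name, items, index):
    """Return every other item that shares at least one tag with `name`."""
    found = set()
    for tag in items[name]:
        found |= index[tag]
    found.discard(name)
    return found


tag_index = build_tag_index(bookmarks)
math_page = tag_index["math"]            # everything tagged "math"
from_xkcd = neighbors("xkcd", bookmarks, tag_index)
```

Each tag is one more road out of a page. In a folder tree, xkcd would have had exactly one parent and the same siblings every time.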

(There's also the big difference that the Yahoo! directory was organized by Yahoo!, while a tagging site is organized by everybody and nobody. That's why the tags used are called a folksonomy instead of a taxonomy.)

As a mathematician, I like the idea of classifying things along many dimensions. And I also start to think about what the "right" dimensions are. One of the ideas behind tagging is that it's not so important. That's where the "folk" in folksonomy comes in. If people write down the first things that pop into their head about xkcd, "webcomic", "math", and "funny" are up near the top of the list without having to say why. But in classifying my own stuff (my bookmarks list, photo collection, PDF archive, whatever... Tagamac is a cool blog about different kinds of tagging software), I'd like to have my own tags organized to avoid simply repeating the title or the text itself.

When my fingers are poised over the keyboard and I'm wondering how to tag something, I ask myself these questions:

  • What is it? If you're bookmarking something on the web, is it a page of text? You probably don't need to tag it as a web page. But what if it's an image? A blog post or the home page of a blog? An FAQ list? These might be useful ways to classify the "form" of a resource.
  • What is it about? Tagamac is about tagging (the practice of it), but it's also about software (the means of it). What else? Computers? Hacking? There are lots of "subjects" that a resource can have.
  • What do I think about it? This is subjective, so it doesn't necessarily lead to good folksonomy unless you get enough folk to agree with you. However, for personal purposes, I mark things as "funny" or "useful" or just "cool." A resource can have a number of "qualities" worth mentioning.
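As a rough sketch, the three questions could be kept as three separate fields on each tagged item instead of one flat pile of tags. The `TaggedResource` class and its field names are my own invention for illustration, not something any particular tagging tool provides.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedResource:
    """One tagged item, with tags split along the three dimensions."""
    title: str
    form: str                                     # What is it?
    subjects: set = field(default_factory=set)    # What is it about?
    qualities: set = field(default_factory=set)   # What do I think about it?

    def all_tags(self):
        """Flatten the three dimensions into one set of tags."""
        return {self.form} | self.subjects | self.qualities


# Tagamac, using the tags suggested in the list above.
tagamac = TaggedResource(
    title="Tagamac",
    form="blog",
    subjects={"tagging", "software"},
    qualities={"cool"},
)
```

Keeping the dimensions apart makes it possible to ask for "everything I marked funny" without also matching items that are merely about humor.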

I feel like I should have more, but I like these three. There's also the best practice of keeping tags simple, lowercase, and singular. Encouraging "comic" rather than "comics" ensures that one tag gets the full weight in the folksonomy, rather than having that weight split between the two.
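Here is a minimal sketch of that "lowercase and singular" rule, assuming a deliberately naive notion of singular: it strips one trailing "s" and nothing more. The `normalize_tag` helper is hypothetical, and a real tool would need something smarter, since this version would also turn "mathematics" into "mathematic".

```python
from collections import Counter


def normalize_tag(tag):
    """Lowercase a tag and crudely singularize it."""
    tag = tag.strip().lower()
    # Naive assumption: a trailing "s" marks a plural, unless the word
    # is very short or ends in "ss" (as in "class").
    if len(tag) > 3 and tag.endswith("s") and not tag.endswith("ss"):
        tag = tag[:-1]
    return tag


# Made-up raw tags as different people might type them.
raw_tags = ["comic", "Comics", "comics ", "math", "Math"]

raw_counts = Counter(raw_tags)
clean_counts = Counter(normalize_tag(t) for t in raw_tags)
```

Before normalizing, the five raw tags count as five different tags. Afterwards, "comic" collects all three of its variants and "math" both of its own.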

