Taxonomy – Categories vs. Labels

A large part of early scientific endeavor was what can be described as taxonomy. The early scientist spent most of his time observing and classifying things. Thus we have Charles Darwin, who spent much of his time observing animals on his voyages, and Johannes Kepler, who spent years poring over observations of the planets. Yes, we did get the theory of evolution and the laws of planetary motion out of these observations, but it was the observations themselves (the naming, cataloguing and classifying) that kept these men occupied for most of their lives.

The conventional ‘scientific’ way of classification is to put every object into a class. In this method, each object has a unique class; it cannot have more than one. Thus, each animal is either a mammal, or a reptile, or an amphibian. There is no scope for ‘both’. It is now very difficult to say whether the conventional human way of classification was historically similar. In today’s scientific age, this exclusive way of classification appears to be the most ‘natural’ one to us.

So this was the way we began to organize our information. We had files and cupboards in offices, each neatly labeled and marked, and each page of information we wrote had a definitive, unique location that it would fit into. We had categories in libraries, and each book had a unique rack number that it could go to. This way of unique categorization was very convenient. All you had to do was learn what went where, and after that, finding information was a piece of cake. However, you sometimes ran into trouble. If you wanted to locate Neil Gaiman’s American Gods, where would you look? Do you look under science fiction? Do you look under fantasy? Do you look under horror? Indeed, you could even look under crime.

No wonder then, that a modern specialist in any field spends a lot of time and effort in categorizing things and also explaining and debating why that particular object should get categorized in that particular category.

Files and Folders
When computers came along and digital information began to be stored in a human-comprehensible system of files, some clever chap immediately came up with the idea of folders. Folders were exactly like office cabinets or library racks. Each file on the computer had a unique location. Once again it led to trouble. For example, if you wrote a letter to your boss asking for leave, where do you put this document? Do you put it in the folder ‘boss’, or ‘letter’, or ‘leave’? But people were so used to an exclusive system of categorization that they never complained. Silently they devised their own personal systems of categorization and filed their files accordingly.

And Then There Was Searching
But then there came searching, largely through the internet. Desktop searching had always been lurking in some corner, but it was the internet that made searching big. People gradually began to realize that the same information could be reached, very easily and very effectively, in many different ways. If you wanted to search for your document titled ‘letter to boss for leave’, you could search for any or all of letter, boss, or leave. As people became familiar with this paradigm of information management, they started complaining about the folder system. Why should it be necessary to file this document in one folder? After all, it is not a ‘real’ document. It is virtual. Had it been real, we could not possibly have had it in two cabinets or two racks at the same time. But on a computer everything is virtual! Why not make this document just ‘appear’ in more than one folder?

Let There Be Labels
Thus labels were born. Labels (or tags) are everywhere on the internet now. The idea is simple. You can attach as many labels to your document as you want. When you search for a particular label, all documents having that label appear. So my email to my boss asking for leave can have both labels, boss and leave. Whether I search for boss or for leave, this email will appear. Indeed, desktop file systems based on labels are being contemplated. It is rumored that WinFS (Microsoft’s new file system, which they failed to deliver with Vista) will be label based. These days you don’t categorize your information anymore. You just label.
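To make the idea concrete, here is a minimal sketch in Python of how a label-based store might work. The class name LabelIndex and its methods attach and search are hypothetical, invented purely for illustration; they do not describe how any real email service or file system implements labels.

```python
from collections import defaultdict


class LabelIndex:
    """Hypothetical label store: each label maps to the set of documents carrying it."""

    def __init__(self):
        self._docs_by_label = defaultdict(set)

    def attach(self, document, *labels):
        # A document may carry any number of labels, unlike a file in a single folder.
        for label in labels:
            self._docs_by_label[label].add(document)

    def search(self, label):
        # Return every document carrying the label, sorted for a stable order.
        return sorted(self._docs_by_label.get(label, set()))


index = LabelIndex()
index.attach("email to boss for leave", "boss", "leave")
index.attach("quarterly report", "boss")

print(index.search("boss"))   # both documents
print(index.search("leave"))  # only the leave email
```

The same email turns up under either label, which is exactly what a single-folder scheme cannot do without making copies.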

Paradigm Shift?
The advent of labels makes me wonder. Is human organization of information about to undergo a fundamental paradigm shift? Will posterity no longer wonder whether a tomato is a fruit or a vegetable, American Gods SF or fantasy, and an electron a particle or a wave? Will it just answer ‘both’ without even giving it another thought?

Or is it that labeling is simply the more ‘natural’ way of organizing information, and that exclusive categorization has been imposed on us by the dominance of the scientific method? Are we just realizing a simple truth and going back to a more human way of organizing data?

