The global financial crisis has saddled me with a penchant for economics blogs. This interest has led me to look not only at the relationship of information to economics, but also at the economics and physics of information itself. Mike Konczal, a blogger whose perspective I have come to respect, recently mentioned an article that ties a lot of this thinking together. Below is my complement.

I just finished reading an essay by James C. Scott on the effects of extensional, top-down classification schemes on the concentration of political and administrative power, and the inherent fragility of rationally- and prescriptively-engineered systems. It reminds me a great deal of John Ralston Saul's central argument in his book Voltaire's Bastards, which I have considered in the past. While I accept the use of such structures as expedients by vested interests, it seems there is also an element pertaining to the technological constraints of information management, constraints which have for some time been obsolete and appear to be maintained only through the combined forces of tradition, authority and the immense investment that goes along with them.

Hierarchies are Warehouses

If you observe the structure of organizations and methods of classification, they come apart in the same way as the contents of a warehouse. There is a strict hierarchy of orthogonal, non-overlapping components which completely contain their subcomponents, like sets of Matryoshka dolls, laid out from A to Z. This is perfectly rational, as physical objects cannot occupy two spaces at once, nor can two objects occupy one space at once. Likewise the whole must be oriented somehow in physical space such that a person can unambiguously locate any element to his or her desired precision.

Scott also considers the introduction of an official name, a cardinal identifier, by highlighting the invention of certain relevant mechanisms. These include surnames and other forms of personal identification, highway route numbers and the metric system. He follows with the argument that this disambiguation enabled central authorities to measure, quantify and tax much more effectively than before it existed. It is important to recognize, though, that this is another side effect of the shackling of information to physical space.

We Have These Great Things Called Computers

This technical constraint has since been removed by the computer, which decouples the logical values and relationships of symbols from their physical representations and relationships in space. This affords informational structures of much greater, yet still manageable complexity. Structures may simultaneously take the form of overlapping sets and tangled networks, and identifiers may be linked by networks of synonyms. Despite this new ability, the extensional, orthogonal, hierarchical model of organizing information — the One True Taxonomy — persists, perhaps most evidently in the file systems upon which virtually all electronic data is stored.

Nitpickers: I am using the term network in place of graph so as not to confuse it with a chart. I know it isn't completely accurate. I am also not specifying whether or not graphs are directed because it doesn't really matter at this level (and besides, lots of semantic relationships are symmetric).

A Forest is Not a City (is Not a Tree)

A side effect of these antiquated models of information is that if an entity, that is to say a cardinal identifier, isn't in the catalogue, it may as well not exist to that catalogue's administrator. Scott brings this to light in a poignant account of overzealous classification leading to the engineering of perfectly rational forests in 18th-century Prussia. While convenient for loggers, these evenly-spaced rows of identical trees fell victim to every abuse Nature could throw at them. Perhaps most interesting is that others are aware of this category of weakness.

The doctoral thesis of the architect Christopher Alexander, Notes on the Synthesis of Form, discusses a method of decomposing complex design problems. In it, Alexander exposes how concepts can either be referred to by extension, that is by their position in a directory, or by intension, that is by their content. His argument is that the optimal pattern of breaking a complex problem into simple problems belies the extensional approach because there are orders of magnitude more possible decompositions than can be handled by natural language. Simply put, there aren't enough words, let alone concept hierarchies, to describe them all. Moreover, the few patterns that do conform are almost surely far from optimal. Indeed, the very nature of an extensional classification scheme assumes that such a structure is already known.

In a later publication, A City is Not a Tree, Alexander expands on his earlier work by comparing the layouts of planned and organically-grown cities. Scott's analogue of the rationally engineered forest ends with a similar conclusion to Alexander's.

The work in Notes on the Synthesis of Form and A City is Not a Tree manifests most famously in the opus A Pattern Language. It is also worth noting that recent scientific evidence indicates that just about everything to do with the layout of forests is fractal.

Extensional Classification Systems Hate Freedom

Tongue firmly planted in cheek, but it's actually somewhat true. The benefit of extension is the administrative efficiency of an absolute index over a set of entities. In such a system, a single, cardinal identifier per entity is all that is necessary; in fact, adding synonyms would only slow it down and make room for error. We can easily imagine the power of such a structure reinforcing its own continuity as well as that of its maintainers. I submit, however, that the One True Taxonomy was, until very recently, all even the most powerful interests had time for. Consider the following:

The way you make an extension is by going around collecting a number of entities, which itself is potentially a costly process. Then you test each one against a predicate, a simple yes/no question. Then you compose a list of those entities for which the predicate is true. To extend the extension, repeat the process to get a list of lists. An additional, arbitrary cost lies in finding a set of predicates such that every entity satisfies exactly one of them. If you can't, your results will include omissions or duplicate entries.

Suppose you were to try to sort your kitchen junk drawer. How would you do it? Shape, colour, purpose? What about oddly-shaped items? Or those of more than one colour? Or more than one purpose, or no discernible one? Which dimension would you pick first? Why? Is it possible that this exercise might enable us to say something about why kitchen junk drawers are cluttered by definition?
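The procedure described above, applied to a junk drawer, can be sketched roughly as follows. This is only an illustration: the items, the `purpose` field and the `classify` function are all made up for the example, and the predicates are deliberately imperfect so that both failure modes show up.

```python
def classify(items, predicates):
    """Sort items into bins by yes/no predicates.

    Returns the bins, plus the names of items that matched no
    predicate (omissions) and more than one (duplicate entries).
    """
    bins = {name: [] for name in predicates}
    omitted, duplicated = [], []
    for item in items:
        matches = [name for name, test in predicates.items() if test(item)]
        for name in matches:
            bins[name].append(item["name"])
        if not matches:
            omitted.append(item["name"])
        elif len(matches) > 1:
            duplicated.append(item["name"])
    return bins, omitted, duplicated


# Hypothetical drawer contents; "purpose" is an assumed attribute.
drawer = [
    {"name": "corkscrew", "purpose": {"opening"}},
    {"name": "bottle opener keychain", "purpose": {"opening", "keys"}},
    {"name": "rubber band", "purpose": set()},
    {"name": "scissors", "purpose": {"cutting"}},
]

# One predicate per purpose: a simple yes/no question each.
predicates = {
    "opening": lambda item: "opening" in item["purpose"],
    "cutting": lambda item: "cutting" in item["purpose"],
    "keys": lambda item: "keys" in item["purpose"],
}

bins, omitted, duplicated = classify(drawer, predicates)
print(omitted)      # ['rubber band']
print(duplicated)   # ['bottle opener keychain']
```

The item with no discernible purpose falls out of the catalogue entirely, and the item with two purposes is filed twice. Choosing predicates that avoid both outcomes for every item is the extra cost in question.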

Imagine, then, that the elite of the Enlightenment onward, using the tools available to them, effectively took on the great task of sorting the junk drawer that was western Europe, its colonies, and finally the world. The result, as evidenced by our current situation, is a massive leap in efficiency and economy, through which most of the people of the world ultimately profited. No small number of people, mind you, lost out completely. I believe that it is only within the last few decades that this approach could even begin to be made subject to any critical examination.

Extension is Not Enough

The complement to extension is intension, which behaves much like the social relationships of sibling, friend or spouse. You navigate a system organized this way just as you would play a once-popular trivia game. The benefit is that you can eventually reach any point in the network from any other. The down-side is that when the links are bound to geography, doing so has all the trappings of a scavenger hunt.

We can say that we need both extension and intension to make use of information: intension to provide access to the material, extension to provide the shortcuts.
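As a rough sketch of the difference, assume a tiny network of acquaintances (the names, the `links` table and the `catalogue` of locations are all invented for illustration). Reaching someone by intension means following links one hop at a time; reaching them by extension means looking them up directly in an index.

```python
from collections import deque


def path(links, start, goal):
    """Breadth-first search: follow links hop by hop until goal is found."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        route = queue.popleft()
        if route[-1] == goal:
            return route
        for neighbour in links.get(route[-1], ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(route + [neighbour])
    return None  # goal is not reachable from start


# Intension: who knows whom (a symmetric relationship).
links = {
    "ann": ["bob"],
    "bob": ["ann", "cy"],
    "cy": ["bob", "dee"],
    "dee": ["cy"],
}

# Extension: an absolute index from identifier to (hypothetical) location.
catalogue = {"ann": "A1", "bob": "B1", "cy": "C1", "dee": "D1"}

route = path(links, "ann", "dee")
print(route)             # ['ann', 'bob', 'cy', 'dee']
print(catalogue["dee"])  # D1
```

The search reaches anything connected to the starting point but has to walk there; the index answers in one step but only for what has already been catalogued.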

On Identifiers and Private Language

Economy is sought through division of labour, or perhaps in more modern terms, the distribution of a grand computation. Absent a tyrant administering orders backed by threats, this division must come about voluntarily between peers. We call this notion cooperation, or perhaps collaboration, its more involved cousin.

The only remaining requirement for cooperation or collaboration, after mutual wherewithal, is the ability to negotiate mutual intent. This problem is ultimately couched in establishing a common language, such that one party understands what the other is talking about. This language need not be perfect, only expressive enough to establish the value of the transaction and carry it out. It need also not exist in advance, but often benefits from being established ad hoc as part of the negotiation process.

In a private language, words can take on special meanings or be minted anew. The only requirement is that all those involved have a consistent understanding of the underlying concept for a given term. Herein, however, lies a caveat.

When we develop a private language, it is only in our most parochial of interests not to take steps to eventually reconcile at least some of it with the common lexicon. To integrate our language is to increase our opportunities and broader appreciation for our offerings. If we keep the language private, we only frustrate negotiations with outsiders, and eventually fragment into tribes, classes and orders, for whom combat may become preferable to trade. Only when the cost of reconciliation outweighs the benefits does it make sense to avoid it, but a preventative measure is as simple and cheap as a policy of diversifying who we talk to.

Now it Gets Technical

We have had the ability to compete with the One True Taxonomy for some time. That you have found and are reading this piece is evidence, albeit tangential, that this competition has already begun.

The strategy for defeating official taxonomies is to absorb them. Mathematically, a hierarchy dissolves into a network without a trace. In this new model, nested categories are transformed without harm into sets, and their important property, containment, is safely commuted to inclusion with respect to other sets and membership for everything else. This enables us to add new sets which overlap and intermingle with the existing sets, and entities to exist in more than one set at once.
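A minimal sketch of this absorption, assuming the official taxonomy is written as nested dictionaries whose leaves are marked with `None` (the category names and the `flatten` helper are hypothetical):

```python
def flatten(tree, name, sets=None):
    """Turn nested categories into named sets of leaf entities.

    Containment in the hierarchy becomes subset inclusion between
    sets, and membership for the leaf entities themselves.
    """
    if sets is None:
        sets = {}
    members = set()
    for key, child in tree.items():
        if isinstance(child, dict):      # a sub-category
            flatten(child, key, sets)
            members |= sets[key]
        else:                            # a leaf entity
            members.add(key)
    sets[name] = members
    return sets


# A made-up official taxonomy: each entity lives in exactly one place.
taxonomy = {
    "tools": {"hammer": None, "saw": None},
    "kitchen": {"knife": None, "whisk": None},
}

sets = flatten(taxonomy, "everything")

# The old hierarchy is still recoverable as inclusion.
print(sets["tools"] <= sets["everything"])   # True

# A new set that cuts across the official categories.
sets["cutting"] = {"saw", "knife"}

# An entity can now belong to more than one set at once.
where = sorted(name for name, members in sets.items() if "knife" in members)
print(where)   # ['cutting', 'everything', 'kitchen']
```

Nothing in the original categories had to change to admit the new one, which is the sense in which the hierarchy is absorbed rather than replaced.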

The network also permits a much richer array of intensional relationships between entities, which can be used to compose even more sets and so on. Perhaps most remarkable is that the extra overhead of finding the optimal structure for organizing knowledge is not paid up front but amortized over time, without disturbing the structure that came before it. Through this mechanism, we can achieve a much more powerful and nuanced view of the world around us, yet still find our way back to the official taxonomies while they remain relevant.

As for symbols, names, identifiers, handles, monikers, whatever you want to call them: the same system works for joining them together. We connect symbols together in the network by the symmetric relationship typically called synonym. The only requirements are that each can be used as an entry point to find the entity to which it refers, and that they do not collide, such that the same name has more than one meaning. If this happens, we can add more information to clear up the ambiguity, or narrow our context to clear unwanted meanings out of scope.
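One way this might look in practice, as a sketch only: the `Lexicon` class, its method names and the entity labels below are assumptions made for the example, not an established design.

```python
class Lexicon:
    """Names as entry points to entities, with contexts to resolve collisions."""

    def __init__(self):
        self._meanings = {}   # name -> {context: entity}

    def add(self, name, entity, context="general"):
        self._meanings.setdefault(name, {})[context] = entity

    def resolve(self, name, context=None):
        meanings = self._meanings.get(name, {})
        if context is not None:
            # Narrowing the context puts unwanted meanings out of scope.
            return meanings[context]
        entities = set(meanings.values())
        if len(entities) == 1:
            return entities.pop()
        if not entities:
            raise LookupError(f"unknown name: {name!r}")
        raise LookupError(f"{name!r} is ambiguous; supply a context")

    def synonyms(self, name, context=None):
        """All names that lead to the same entity as the given name."""
        entity = self.resolve(name, context)
        return sorted(n for n, m in self._meanings.items()
                      if entity in m.values())


lexicon = Lexicon()
lexicon.add("aubergine", "Solanum melongena")
lexicon.add("eggplant", "Solanum melongena")
lexicon.add("mercury", "planet:Mercury", context="astronomy")
lexicon.add("mercury", "element:Hg", context="chemistry")

print(lexicon.resolve("eggplant"))                      # Solanum melongena
print(lexicon.synonyms("aubergine"))                    # ['aubergine', 'eggplant']
print(lexicon.resolve("mercury", context="chemistry"))  # element:Hg
```

Calling `lexicon.resolve("mercury")` without a context raises `LookupError`, which is the collision case: the name alone is not enough, so more information has to be supplied.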

Freedom of Computation

Despite what I mentioned above about connecting our private language to the public, there are instances in which it is useful, if not essential, to keep it at least temporarily under wraps. If we didn't have some administrative discretion, we would constantly have to sell, justify or fight just to be able to carry on the most basic activities. This is the value of privacy, and as long as it is legitimate to store and operate on information in our own private space, there is little danger of wholesale compromise.

Fetching, archiving and operating on information once took copious resources. It is now for the most part practically free. The franchise is being extended at an exponential rate. For the first time in a long time it has become feasible, if not practical, to concoct our own point of view instead of subscribing to established dogmas. It is also just as feasible to reconcile those dogmas with our own point of view. We have arguably never been more at liberty to architect our own reality, which is a prospect as unnerving as it is inspiring. One fact, however, should have become evident: the power of the One True Taxonomy is only what we permit it to be.