When auditing content for the Web, it's important to remember that although many of us still write Web content as isolated documents, they are very rarely read that way. It's entirely feasible for a reader to encounter inconsistent or confusing writing between one page and the next. In order to fully appreciate the story we're telling our audience, we should look at it in context, like this:
This graph is a rendering of the most frequently trodden paths through my own site. Even before zooming in we can glean significant information about the content and the relationships within it.
How I Did It
Web browsers still courteously supply us with the location of the referring resource, if present, along with each new request. This information shows up in the server's log. It's a straightforward task to turn the log into a list of referrer-referent connections, each weighted by the number of hits that go between them.
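As a rough illustration of that step, here is a minimal Python sketch. It assumes the server writes the common "combined" log format, and that internal referrers should be reduced to a path so they line up with request targets. The function name `referrer_edges` and the `site_host` parameter are hypothetical, introduced only for this example, and this is not necessarily how the original list was produced.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed layout (combined log format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] '
    r'"(?:GET|HEAD|POST) (?P<target>\S+)[^"]*" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)"'
)


def referrer_edges(lines, site_host):
    """Count (referrer, target) pairs found in access-log lines."""
    edges = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue  # the browser supplied no referrer
        parts = urlsplit(referrer)
        if parts.hostname == site_host:
            source = parts.path or "/"  # internal page: keep only the path
        else:
            source = parts.hostname or referrer  # external site: keep the host
        edges[(source, match.group("target"))] += 1
    return edges
```

Each key in the result is an ordered (referrer, referent) pair and each value is the number of hits between them. Query strings on internal referrers are dropped here, which may or may not suit a given site.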
The weights give us an indication of how much traffic flows between an ordered pair of pages, that is, how many people look at one specific page followed by another specific page. We can then import this list into some graph visualization software, such as Gephi, which will show us the trails people take into and through the site.
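One way to hand such a list to Gephi is as a CSV edge table. The sketch below assumes the weighted connections are already held in a dictionary keyed by (referrer, referent) pairs, and that the importer picks up columns named Source, Target and Weight; check that against the version of Gephi in use. The function name `write_edge_csv` is hypothetical.

```python
import csv


def write_edge_csv(edges, out):
    """Write {(source, target): weight} as a CSV edge table, heaviest first."""
    writer = csv.writer(out)
    writer.writerow(["Source", "Target", "Weight"])
    heaviest_first = sorted(edges.items(), key=lambda item: -item[1])
    for (source, target), weight in heaviest_first:
        writer.writerow([source, target, weight])
```

When writing to a real file, open it with `newline=""` so the `csv` module controls the line endings, for example `open("edges.csv", "w", newline="")`.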
In addition to the weighted lines, I used Gephi's built-in PageRank analysis to show me the highest-ranked pages, tying each page's rank to the size of the dot that represents it.
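For readers curious about what a PageRank score measures, here is a small sketch of the textbook power-iteration formulation, extended to use the hit counts as edge weights. It is an illustration only: Gephi's own implementation may differ in its details, and the function name, the 0.85 damping factor and the fixed iteration count are assumptions made for this example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over {(source, target): weight}."""
    nodes = sorted({node for pair in edges for node in pair})
    if not nodes:
        return {}
    # Total outgoing weight per node, used to split its rank among its targets.
    out_weight = {node: 0.0 for node in nodes}
    for (source, _target), weight in edges.items():
        out_weight[source] += weight
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is shared out evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for (source, target), weight in edges.items():
            share = weight / out_weight[source]
            new_rank[target] += damping * rank[source] * share
        rank = new_rank
    return rank
```

The scores sum to one, so each can be read as the share of attention a page receives under this model, and scaled directly into a dot size.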
What I Learned
To sum it up in one sentence: it looks like there is some considerable work ahead of me. Specific remarks include:
People appear disproportionately keen to learn about me
They want to know who I am, what I do and what I'm working on. Examining this particular path was my main motivator for doing this work, as those pages are obsolete, awkward and truncated, respectively. I didn't even wait for these results before replacing my what-I-do, which, while a little long, is considerably more accessible than its predecessor.
This is perhaps the most glaring evidence that I never intended for this site to be public. Or rather, I didn't mind that it was publicly accessible, but I had no interest in maintaining the tacit service guarantee associated with putting anything on the Web. This early work was an attempt to capture what I know about the Web in the true style of hypertext, though I eventually found it too time-consuming to manage and just reverted to writing essays. It appears, however, that I should at least give it and its neighbours a second chance.
There isn't a lot of traffic between the essays themselves
Even though those documents are studded with cross-references, a reader's next step is overwhelmingly home, or to one of who I am, what I do and what I'm working on. I was suspicious of this. If people insist on reading my site, I'd prefer that they got better exposure to related ideas. Solving that entails bringing metadata under management, a big chore I've been avoiding but am running out of excuses to put off.
What hypertext handles really well is parenthesis. In its purest form, hypertext has an inimitable capacity to square away all definitions, remarks and digressions and just focus on a single, brief main message. When we write in this way, the overhead of managing all these digressions explodes. Ironically, HTML is a monumentally awkward way to manage hypertext, in part because we write the links in referring pages before we write the pages they refer to. Never mind having to choose a URI for the document before writing it, how about the content itself? HTML is biased toward hurriedly putting something up at a given location, even if it isn't really very good. This is one of those casualties.
Most Importantly…
If you put something on the Web, somebody will eventually come along and read it. Managing Web content is considerably different from managing it for print, even more alien than we normally appreciate. The sheer complexity generated by being able to link arbitrarily from one idea to another has implications for the way we present ourselves online that are hard to see without the proper instrumentation.
I wrote the majority of the content on my site as notes to myself. Now that it appears that I have a modest following, I should probably put in the necessary elbow grease to treat you, my readers, with some courtesy. As with any situation in which we must generate buy-in, especially if the ones to be convinced are ourselves, it's handy to be able to point to some data.