Architectural Principles

What follows is a non-exhaustive list of the most salient principles that come to mind when considering the design and implementation of this project.

Durable Addressing / Links Are First-Class Citizens

The foremost issue at stake under this heading is that without durable URIs you have no dense hypermedia.

The URI (URL, URN) is Tim Berners-Lee's most important invention. (He seems to think so, at least.) Using URIs, one can point from any information system to just about any other information system. For every appearance of a URI we have the issuing side, who controls what a URI means, and the referring side, who controls where it shows up. These are not expected to be the same entity; indeed, that they aren't is kind of the point.

The content of a URI itself, that is to say the string of characters that make it up, while there are numerous equities to consider in choosing a good one, does not actually matter to the referring side. What matters is that it works, that is, that the URI actually points to something on the issuing side. What's more, some degree of continuity is expected, i.e., that the URI identifies the same resource today as it did when the referring side included it in their document.

What this entails is that issuers of URIs need some kind of mechanism for remembering their commitments. The consequence of exposing a URI is the risk that somebody out there actually links to it. By putting a resource at a given URI and then letting others become aware of its existence—that is, by linking to it yourself—you're making an implicit pledge that others can do the same. What I'm arguing is that with the way the Web is set up, it's exceedingly difficult to keep that promise.

The Web is characteristically easy to deploy, and many of its design tradeoffs are biased in this direction. One such tradeoff is the use of the host computer's file system as the state mechanism for URI resolution, a task for which over the long term it is manifestly unsuitable. Berners-Lee was aware of the problem and wrote about it as early as 1998, but did not offer a solution. Indeed, the tone of the article reads as if he didn't believe himself to be the one who put the world in that situation in the first place.

Because of a ramp-up in interest over the years in search engine optimization, virtually every Web platform has some facility for URI continuity, but coverage is patchy—bolted on after the fact—and local to specific platforms. There is not, to my knowledge, a systemic solution, or even a consensus on how one might approach it. In my opinion, durable addressing has to be designed into the system at the outset.

If one proposes to create a system with ten, or potentially a hundred times more addressable information resources, the problem of managing URI continuity becomes all that much more acute. The way you solve this problem is you give each resource at least one (but ideally only one) durable identifier from a symbol space that is big enough to fit as many identifiers as you would ever need, without ever assigning the same one twice. You then overlay any other identifiers you deem appropriate as synonyms to the durable ones. Most importantly, you keep track of every one of these associations that you ever make, and you guard that information with your life.

This is not to say that you must keep every webpage around in perpetuity, just every address. Even a gigabyte, which is a modest capacity nowadays, could register millions of synonymic associations, even if we stored them in the most naïve possible way. Content is bound to shift and merge and split off over time, and ultimately be retired from publication. In addition to supporting these renames and reroutes, this memory would enable you to eliminate a particular source of technological gaslighting: it can actually tell the truth about its 404 errors. That is, when a resource which was available is affirmatively deleted, the oblivious system chirps 404, a concept encountered so often it's seated firmly in the common lexicon. 404 is serverese for never heard of it, which may be true, but is probably a lie. It is possible the resource was moved or renamed—in which case the proactive thing to do is redirect the request—but it is probably more likely deleted. The honest thing for the server to say is actually 410 Gone, which acknowledges that there was once something there which is no longer. In order to do that though, you need a record of something having been there in the first place.

The synonym relation between one address and another is the same structure as a link, which is why the other aspect of this architectural principle is that links get first-class treatment. The reason why is backlinks, or rather the lack of backlinks being one of Ted Nelson's original gripes about the Web. Well, a database of links is just a database of backlinks turned around. It means that every resource on a given website can feature everything that links to it. Incorporating other websites, mind you, is a bigger challenge, but establishing a database of links—including links to URI synonyms—gives us a point of departure for thinking about backlink-sharing protocols across administrative boundaries. Solving the durable URI problem by creating a master database of links also helps us imagine a future beyond HTTP and DNS, with ARKs, decentralized identifiers, or something else entirely.

Nodes and Links Are Typed

The motivation for solving link rot through a regime of durable addressing is to be able to increase the number of addressable information resources by several orders of magnitude. The motivation behind that is to only have a single authoritative copy of any given piece of content that is subsequently reused. An example of this in the wild are the graph-based personal knowledge management systems, also known as tools for thought. PKM systems achieve the density we're after, but they don't distinguish programmatically between kinds of thing, be it resource or link.

Typed nodes—resources, pages, blocks, modules, whatever you want to call them—are obvious. That is typically how you dispatch an appropriate display template. Typed links, on the other hand, are typically implicit. An example of a particular type of link would be the links that represent a person's friends on a social media network. A sophisticated system may have an API that indeed groups these links as a member of a data structure, but the median player simply delineates link types by where they put them on the page.

In conventional Web development, adding a new kind of link is a non-trivial process. It involves altering the database schema and surrounding code (which on a live site triggers operations overhead), modifying display templates, and deciding how to style it. Two out of three of these interventions are major surgery that require the patient (or rather its in-development doppelgänger) to be anaesthetized, while the third merely renders it temporarily unpresentable. Often this is a multi-person job, which we have to repeat, at least in part, everywhere in the system we want this kind of link to show up. The outcome is that any link type is represented in a number of different places:

a foreign key constraint identifier (if you even use those),
an ORM relationship identifier or accessor name,
the file name of a template fragment representing the region in the page layout,
a CSS class name…

…but which one of these is canonical? Or does the canonical representation live somewhere else, like in some spec document? Are the identifiers all the same? If not, where does the mapping exist from one to the other? Is the specification available to stakeholders? Do they know where it is?

If we instead begin from a situation where links are first-class objects and typed, a lot of what I just described can be derived. A link between two URIs of a certain type is enough to tell us something about its disposition—say, whether it should be displayed as an ordinary link, or an embed, or a form action, or invisible metadata, or something else. Put together with the types of the nodes at either pole, this information is enough to place it on the page. Types of course inherit, so the layout of a more-specific type can be represented as a delta against its more-generic ancestor. Finer adjustments can also be attached to specific URIs if we want them to deviate from their types in some unique way.

Standard Interfaces

Web-based software is typically bound not just to a particular programming language, but a particular application framework on top of a particular programming language. This puts the content of the Web at odds with its own infrastructure:

Programmers are suckers for technical fads,
every system integration is its own unique and special snowflake.

So you've got one piece of Web infrastructure that was written when Blub was en vogue, but now nobody would be caught dead using it because Glub++ is the new hotness. If you want the two systems to cohabitate in the same address space, you probably have to do some real bastard stuff to make it work. All it takes is one VP of No to declare that it isn't worth integrating, and you are left with a gap that has to be spanned by human users in perpetuity.

The solution here is to design the engine with the assumption that it will be joining an ecosystem with other languages and frameworks. Additionally, hold all of its subsystems to the standard that they must pass as standalone microservices. Give them each their own URL. Knit everything together at the HTTP level. This software should be considered to be a proposal for how to organize a language bus—intended to subsume all functionality written in a particular programming language (in this case Ruby), and intended to be daisy-chained with other language buses, running in a single amalgamated address space.

The Web microframework pattern (Rack, WSGI, PSGI, JSGI…) is a sensible point of departure. Construed as implementations of a set of nearly-identical protocols, microframeworks sit in between whatever's handling requests off the network, and the application, for which they drastically simplify the interface. The application doesn't have to care about the context in which it is being run. All it has to do is ingest one data structure and emit a different one. No fuss, no muss. All microframeworks perform this mundane yet terrifically stable role for their respective programming languages. They might do more than that, but this flattening of program input and output to a predictable pair of structures is the important thing. If your language of choice doesn't have one of these microframeworks, you'd be a hero if you wrote one.

Forcing subsystems to converse exclusively in HTTP means that they can't use any bespoke funny business to communicate, which in turn means said components are extremely well-defined development targets which are amenable not only to be ported to other languages, but used as-is in heterogeneous systems. This design decision, however, raises a couple of issues:

Efficiency, or lack of it, having to repeatedly serialize and reparse HTTP messages;
the expressivity of the constraint—whether it's possible to say everything that needs to be said to describe an acceptable range of behaviour.

The solution to these problems is a combination of creative interpretation of standards—or what I have come to call standards hacking, and an equally creative implementation. To be sure, serialize-and-reparse phenomenon is going to happen when an HTTP message crosses a process boundary, but this overhead can be easily mitigated within an individual process. As for the expressivity issue, we create a grammar of coarser-grained operations within which these aforementioned microservices serve as the vocabulary.

There will be an inherent need for communicating internally within the system using structured data. The proposal here is to use RDF as the reference semantics for structured data, and downgrade to other formats (plain JSON, CSV…) when necessary.

I have been using RDFa successfully in (X)HTML and SVG for many years as a means of dispatching display templates and CSS selectors. Moreover, JSON-LD contexts are designed to map RDF identifiers to plain JSON. Other benefits of RDF include the fact that there exist hundreds of vocabularies (RDF Schema, OWL, SHACL) that define common classes of data object and their interralated properties, that you are free to use as with any open-source software. Should you need to write your own extension, you can usually take advantage of RDF's inheritance model to supply a minimal patch to what is already available. When you publish the resulting extension vocabulary on the Web, it is possible to do so in a way that the machine-executable specification is embedded into the human-readable prose, which, among other benefits, helps prevent the two from drifting apart.

My final remark regarding standard interfaces for the time being concerns the problem of presentation markup and page composition. Since we have stipulated that every resource must be addressable by URI and converse in HTTP, we can do page composition at the network level. To accomplish this I arrange for content handlers to emit a minimal XHTML document containing only semantic elements with structured data embedded using RDFa. To compose the resources and embellish the markup, I currently use XSLT (a standard language), executed in the browser.

Web browsers only implement the 24-year-old XSLT 1.0, which is effectively abandonware. I am frankly surprised the Chromium people in particular haven't opportunistically deleted it. I can only assume there is some powerful legacy constituency which is preventing them from doing so. Despite an awkward syntax and an astonishing lack of basic functionality (e.g. regular expressions, the ability to reprocess result trees), XSLT 1.0 runs quickly, and is more than expressive enough to handle the basic logic needed for template processing.

The XSLT standard is now up to version 3.0, which has a boatload of new capabilities including processing JSON. There is an implementation of it in JavaScript which can be bootstrapped in place of the native 1.0. Alternatively, XSLT can be run as a filter on the server side.

In my opinion, XSLT has a number of unique characteristics that set it apart from just about every other template processing language. Being a standard is one of them. Another is the fact that barring considerable effort, it is impossible to use it to generate invalid markup. Few other systems can claim that. I suspect its dearth of popularity has to do with being rooted in XML, which mainstream Web developers almost universally despise. My bet is if there was a compact XSLT syntax à la RelaxNG, it would be more popular.

Layered System: Pipes with Types

If we stipulate that all communication is relegated to HTTP messages, and that each subcomponent is its own microservice, then we can contemplate two species of microservice:

One species that responds directly to an HTTP request,
another that manipulates an HTTP message, i.e., either request or response, in transit.

We will dub the former a content handler, and the latter a transform. A content handler is well-understood—that's what we typically write when we write Web apps. You give it an HTTP request and it returns a response. A transform is something you give a response and it returns a response. Here are some potential operations of transforms:

HTML/XML markup: Add social media metadata, rewrite links, rearrange document tree
Raster images (e.g., photos): Resize, crop, and other rudimentary Photoshop-esque functions; change between file formats
Audio/video: Take clips or individual frames, transcode/resample (though the latter may not be something we do in real time)
Other: Transform between representations, e.g. Markdown to (X)HTML or RDFa to JSON-LD (or Turtle, or RDF/XML…), handle over-the-wire compression (indeed that is how compression is typically implemented already. More likely we'd have to decompress responses from upstream before we could operate on them.)

The idea behind transforms, at least response transforms, is that they don't have to know much—if anything—about the application: they are nominally bytes in, bytes out. This means that a transform written in one programming language could be used to alter the output of a content handler written in a different language, because all response transforms are also stand-alone microservices.

We can imagine two transform stacks: one that manipulates the request on the way in, and another that manipulates the response on the way out. We would configure these statically, passing the HTTP message from one transform to the next. It will be useful for transforms triggered earlier in the process to append a later one, should certain conditions be met. It will likewise be useful for us, in addition to both static configuration and conditional processing, to explicitly address response transforms as parameters to other URIs, which in turn could take their own parameters. This has the side effect of creating URIs which are legible as being derived from non-parametrized resources. Consider a request like GET /photo;scale=100,100. We map scale=100,100 under the hood to POST /scale?width=100&height=100 and pass in the request body. We use path parameters for response transforms because they faithfully communicate the sequence of operations (e.g. /photo;desaturate;crop=200,100,1000,1000;scale=100,100), but also to keep query parameters free for parametrizing content handlers.

Wholeness

One thing that has struck me as remarkable about Web development, going back decades, is the preponderance of fragments. Going back as far as CGI scripts and server-side includes, the name of the game was typically to stitch together pieces of markup by concatenating strings of text. This pattern was carried forward by systems like PHP, ASP, JSP, ColdFusion, and many others. Fragments are the modus operandi of MVC frameworks that dominated the last decade; even the most modern frameworks like React, Angular, and Vue still operate in terms of fragments.

Why I think this is significant is because HTML, which is the output that by far the most work goes into, smushes together a number of disparate equities:

There are the technical concerns,
matters of presentation,
not to mention the content itself.

All of these have to coexist in the same document. When you cut the document up into fragments, these diverging concerns are cut up as well. At the basic technical level, stitching together fragments introduces the potential for syntax errors in the markup, or mismatches in character encoding. Presentation markup (including metadata for both search engines and scripting) and CSS identifiers have to be interspersed in application code. Same goes with buttons, labels, and other microcontent.

The approach I'm proposing instead borrows from a largely forgotten niche framework called Cocoon, which relies heavily on transformation pipelines to do its work. While Cocoon is firmly of an era, one thing it absolutely got right was the idea that a representation of an information resource can be put through a series of successive transformations, which are separate from the handler that produces the initial representation. For this to work, however, the initial representation (to say nothing of its transformed variants) has to be a complete object—not a fragment. User-facing Web pages are frequently, if not usually composite—that is, they represent more than one subject entity. There's no rule saying the parts can't also be valid, addressable wholes unto themselves. Again, moving composition into the HTTP realm means content generated from different frameworks can mingle. Since each elementary component produces a complete document unto itself, it is a crisply-defined development target that can be created and evaluated independently.

The real inspiration for this architectural principle, though, comes from the later work of the architect Christopher Alexander. Rather than approaching a problem as a matter of assembling parts which individually have little to no value, the idea is to produce a whole, and put it through a series of structure-preserving transformations. Here the natural whole is the HTTP resource, which is a many-to-many mapping between a set of URIs and a set of representations. This means at least one address is related to at least one literal byte segment. Polymorphic resources can thus be constructed out of elementary ones (including and especially representations derived via transformation function). Larger structures can then be composed from individual resources. What you notice when you do this is that you start thinking about individual resources in terms of the information they need to provide to be useful to the contexts that employ them, which encourages you to design more durable and modular resources. This is a technique I have been using for about fifteen years: I don't, for example, design a navigation, I design a stand-alone table of contents and then transform it into a navigation.

What we find when we do this is we are biased toward creating resources that do double duty as stand-alone pages as well as components of other pages. In Alexander parlance this is called strong centers: rather than be meaningless when separated from the whole, every part has an identity of its own.

Appendix: Why Ruby?

I was originally going to write (what ultimately became) Intertwingler in Python, but ended up writing it in Ruby because Ruby's RDF library has a reasoner, and Python's, at least at the time, did not. I briefly contemplated writing it in Clojure (on top of something like Jena), and still may, but at the time (2018) I didn't feel like paying for a bigger server in perpetuity just so I could run a JVM. So, Ruby it is.

One of the things you notice working with RDF is how soon you find yourself needing to do entailment, and for that you need a reasoner. If you want to say something like select all resources of type X, you're going to want to include class X, its subclasses, and any equivalent classes of any of those. If you want to say get me all the resources related to subject S by property P, you're going to want to do the same thing but for subproperties and their equivalents. A reasoner is table stakes if you want to do any graph manipulation UI: Give me all the properties with a domain of this class, give me all the instances of classes in the range of this property; it is a folly to attempt to code these operations by hand. Problem is, reasoners are scarce, and not all of them are created equal. Furthermore, most reasoners are written in Java, which sharply limits how you can interact with them. The Ruby one is simplistic; it attaches to the vocabularies themselves and provides ad-hoc entailments for class and property terms, which is more than sufficient for the examples I just gave. Other reasoners will generate all the entailments for an entire instance graph in one shot, which is almost certainly never what you want.

I am actually inclined to diagnose the lack of reasoners outside of the Javasphere as a significant contributing factor to RDF gaining less traction than it otherwise would have. Over a decade and a half of observation suggests that the bulk of Semantic Web tooling has been written in Java due to its inherent bias toward academia. A reasoner is a hard piece of software to write; it requires an intersection of skills few people possess. It is likewise a thankless task because it produces results that are several degrees removed from anything most people can directly make any money off of. It's no surprise that the few extant specimens are concentrated where they are. The reasoner in Ruby's RDF library is rudimentary (this should not be controversial) compared to more sophisticated designs, and has very acute limitations, but is nevertheless serviceable. This is more than what can be said for other programming languages. Hopefully, however, for these other languages, somebody will rise to the challenge.