<?xml version="1.0"?>
<?xml-stylesheet href="/transform" type="text/xsl"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:bs="http://purl.org/ontology/bibo/status/" xmlns:ci="https://vocab.methodandstructure.com/content-inventory#" xmlns:dct="http://purl.org/dc/terms/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:xhv="http://www.w3.org/1999/xhtml/vocab#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" lang="en" prefix="bibo: http://purl.org/ontology/bibo/ bs: http://purl.org/ontology/bibo/status/ ci: https://vocab.methodandstructure.com/content-inventory# dct: http://purl.org/dc/terms/ foaf: http://xmlns.com/foaf/0.1/ rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# xhv: http://www.w3.org/1999/xhtml/vocab# xsd: http://www.w3.org/2001/XMLSchema#" vocab="http://www.w3.org/1999/xhtml/vocab#" xml:lang="en">
  <head>
    <title lang="en" property="dct:title" xml:lang="en">Architectural Principles</title>
    <base href="https://doriantaylor.com/summer-of-protocols/architectural-principles"/>
    <link href="../elsewhere" rel="alternate bookmark" title="Elsewhere"/>
    <link href="../this-site" rel="alternate index" title="This Site"/>
    <link href="http://purl.org/ontology/bibo/status/draft" rel="bibo:status"/>
    <link href="http://purl.org/ontology/bibo/status/published" rel="bibo:status"/>
    <link href="" rel="ci:canonical" title="Architectural Principles"/>
    <link href="../person/dorian-taylor#me" rel="dct:creator" title="Dorian Taylor"/>
    <link href="http://www.w3.org/1999/xhtml/vocab#" rel="http://www.w3.org/ns/rdfa#usesVocabulary"/>
    <link href="../person/dorian-taylor" rel="meta" title="Who I Am"/>
    <link href="./" rel="up" title="Summer of Protocols: Retrofitting the Web"/>
    <link about="../" href="../3f36c30c-6096-454a-8a22-c062100ae41f" rel="alternate" type="application/atom+xml"/>
    <link about="../" href="../this-site" rel="alternate"/>
    <link about="../" href="../elsewhere" rel="alternate"/>
    <link about="../" href="../e341ca62-0387-4cea-b69a-cdabc7656871" rel="alternate" type="application/atom+xml"/>
    <link about="../" href="../f07f5044-01bc-472d-9079-9b07771b731c" rel="alternate" type="application/atom+xml"/>
    <link about="../verso/" href="../3f36c30c-6096-454a-8a22-c062100ae41f" rel="alternate" type="application/atom+xml"/>
    <link about="../verso/" href="../this-site" rel="alternate"/>
    <link about="../verso/" href="../elsewhere" rel="alternate"/>
    <meta content="2023-06-29T13:16:42Z" datatype="xsd:dateTime" property="dct:created"/>
    <meta content="I have consolidated a set of five architectural principles that underpin this project." name="description" property="dct:description"/>
    <meta about="../person/dorian-taylor#me" content="Dorian Taylor" name="author" property="foaf:name"/>
    <meta content="summary" name="twitter:card"/>
    <meta content="@doriantaylor" name="twitter:site"/>
    <meta content="Architectural Principles" name="twitter:title"/>
    <meta content="I have consolidated a set of five architectural principles that underpin this project." name="twitter:description"/>
    <object>
      <nav>
        <ul>
          <li>
            <a href="technical-note/3#EuA0xQHKGp2A6KFuge3QlI" rev="ci:mentions" typeof="bibo:DocumentPart">
              <span property="dct:title">Architectural Principles</span>
            </a>
          </li>
        </ul>
      </nav>
    </object>
  </head>
  <body about="" id="Epv3Sw-m5tRHCj8WyR-K_L" typeof="bibo:Report">
    <p>What follows is a non-exhaustive list of the most salient principles that come to mind when considering the design and implementation of this project.</p>
    <section id="EmhXM9GaIpRzWYpWAgq1EJ" rel="dct:hasPart" resource="#EmhXM9GaIpRzWYpWAgq1EJ" typeof="bibo:DocumentPart">
      <h2 property="dct:title">Durable Addressing / Links Are First-Class Citizens</h2>
      <p>The foremost issue at stake under this heading is that without durable <abbr>URIs</abbr> you have no <dfn>dense hypermedia</dfn>.</p>
      <p>The <abbr>URI</abbr> (<abbr>URL</abbr>, <abbr>URN</abbr>) is <span>Tim Berners-Lee's</span> most important invention. (<a rel="dct:references" href="https://www.w3.org/People/Berners-Lee/Weaving/Overview.html">He seems to think so, at least.</a>) Using <abbr>URIs</abbr>, one can point from any information system to just about any <em>other</em> information system. For every appearance of a <abbr>URI</abbr> we have the <em>issuing</em> side, who controls what a <abbr>URI</abbr> <em>means</em>, and the <em>referring</em> side, who controls where it shows up. These are not expected to be the same entity; indeed, that they <em>aren't</em> is kind of the point.</p>
      <p>There are numerous equities to consider in choosing a good <abbr>URI</abbr>, but its <em>content</em>, that is to say the string of characters that makes it up, does not actually matter to the referring side. What matters is that it <em>works</em>, that is, that the <abbr>URI</abbr> actually points to something on the <em>issuing</em> side. What's more, some degree of continuity is expected, i.e., that the <abbr>URI</abbr> identifies the same resource <em>today</em> as it did when the referring side included it in their document.</p>
      <p>What this entails is that issuers of <abbr>URIs</abbr> need some kind of mechanism for remembering their commitments. The consequence of exposing a <abbr>URI</abbr> is the risk that somebody out there actually links to it. By putting a resource at a given <abbr>URI</abbr> and then letting others become aware of its existence&#x2014;that is, by linking to it yourself&#x2014;you're making an implicit pledge that others can do the same. What I'm arguing is that with the way the Web is set up, it's <em>exceedingly</em> difficult to keep that promise.</p>
      <p>The Web is characteristically easy to deploy, and many of its design tradeoffs are biased in this direction. One such tradeoff is the use of the host computer's file system as the state mechanism for <abbr>URI</abbr> resolution, a task for which over the long term it is manifestly unsuitable. <span>Berners-Lee</span> was aware of the problem and <a rel="dct:references" href="https://www.w3.org/Provider/Style/URI.html">wrote about it as early as <time>1998</time></a>, but did not offer a solution. Indeed, the tone of the article reads as if he didn't believe himself to be the one who put the world in that situation in the first place.</p>
      <p>Because of a ramp-up in interest over the years in <dfn>search engine optimization</dfn>, virtually every Web platform has <em>some</em> facility for <abbr>URI</abbr> continuity, but coverage is patchy&#x2014;bolted on after the fact&#x2014;and local to specific platforms. There is not, to my knowledge, a systemic solution, or even a consensus on how one might approach it. In my opinion, durable addressing has to be designed into the system at the outset.</p>
      <aside id="EpSSXyojxmgsbCeDV-PyiK" role="note" rev="oa:hasTarget" resource="../a52497ca-88f1-49a0-ab1b-09e0d5f8fca2" typeof="oa:Annotation">
        <p property="oa:hasBody">The problem is there are a million internal identifiers that can contribute to the lexical form of a given <abbr>URI</abbr> that can be changed or deleted, usually nonchalantly by the person who controls it. If it isn't the file system, which is notoriously mutable, it's some other internal identifier like a class or method name, or database key.</p>
      </aside>
      <p>If one proposes to create a system with ten, or potentially a <em>hundred</em> times more addressable <dfn>information resources</dfn>, the problem of managing <abbr>URI</abbr> continuity becomes all the more acute. The way you solve this problem is you give each resource <em>at least one</em> (but ideally only one) durable identifier from a symbol space that is big enough to fit as many identifiers as you would ever need, without ever assigning the same one twice. You then <em>overlay</em> any other identifiers you deem appropriate as synonyms to the durable ones. Most importantly, you keep track of <em>every</em> one of these associations that you <em>ever</em> make, and you guard that information with your life.</p>
      <p>This is not to say that you must keep every <em>webpage</em> around in perpetuity, just every address. A single gigabyte, which is a modest capacity nowadays, could register millions of synonymic associations even if we stored them in the most na&#xEF;ve possible way. Content is bound to shift and merge and split off over time, and ultimately be retired from publication. In addition to supporting these renames and reroutes, this memory would enable you to eliminate a particular source of technological gaslighting: the system can actually tell the truth about its <code>404</code> errors. That is, when a resource which <em>was</em> available is affirmatively deleted, the oblivious system chirps <code>404</code>, a concept encountered so often it's seated firmly in the common lexicon. <code>404</code> is serverese for <q>never heard of it</q>, which <em>may</em> be true, but is probably a lie. It is possible the resource was moved or renamed&#x2014;in which case the proactive thing to do is redirect the request&#x2014;but more likely it was deleted. The <em>honest</em> thing for the server to say is actually <code>410 Gone</code>, which acknowledges that there <em>was</em> once something there that is no longer. In order to do that though, you need a record of something having been there in the first place.</p>
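      <p>As a rough sketch of the kind of memory described here, and not the project's actual implementation, the following Python keeps a record of every address ever issued and answers with a redirect, a <code>410</code>, or a <code>404</code> accordingly. The class name <code>AddressRegistry</code>, its methods, and the example paths are all assumptions made for illustration.</p>

```python
import uuid


class AddressRegistry:
    """Hypothetical in-memory record of every address ever issued."""

    def __init__(self):
        self.preferred = {}  # durable ID: current preferred path, or None once retired
        self.synonyms = {}   # every path ever issued: durable ID

    def mint(self, path):
        """Assign a fresh durable identifier and overlay a readable path on it."""
        durable = str(uuid.uuid4())
        self.preferred[durable] = path
        self.synonyms[path] = durable
        return durable

    def rename(self, old_path, new_path):
        """Add a new synonym; the old path stays on record indefinitely."""
        durable = self.synonyms[old_path]
        self.synonyms[new_path] = durable
        self.preferred[durable] = new_path

    def retire(self, path):
        """Withdraw the resource from publication but remember its addresses."""
        self.preferred[self.synonyms[path]] = None

    def resolve(self, path):
        """Return a (status, location) pair for a requested path."""
        durable = self.synonyms.get(path)
        if durable is None:
            return 404, None      # genuinely never heard of it
        current = self.preferred[durable]
        if current is None:
            return 410, None      # was here once, affirmatively gone
        if current != path:
            return 301, current   # moved or renamed
        return 200, path


registry = AddressRegistry()
registry.mint("/essays/draft-title")
registry.rename("/essays/draft-title", "/essays/final-title")
print(registry.resolve("/essays/draft-title"))  # (301, '/essays/final-title')
registry.retire("/essays/final-title")
print(registry.resolve("/essays/draft-title"))  # (410, None)
print(registry.resolve("/never-issued"))        # (404, None)
```

      <p>The point of the sketch is only that <code>404</code> and <code>410</code> become distinguishable once nothing is ever removed from the synonym table; a real system would of course persist this record rather than hold it in memory.</p>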
      <p>The synonym relation between one address and another is the same structure as a link, which is why the other aspect of this architectural principle is that links get first-class treatment. The reason is <em>backlinks</em>, or rather the <em>lack</em> of them, which was one of <span>Ted Nelson's</span> original gripes about the Web. Well, a database of links is just a database of backlinks turned around. It means that every resource on a given website can feature everything that links to it. Incorporating <em>other</em> websites, mind you, is a bigger challenge, but establishing a database of links&#x2014;including links to <abbr>URI</abbr> synonyms&#x2014;gives us a point of departure for thinking about backlink-sharing protocols across administrative boundaries. Solving the durable <abbr>URI</abbr> problem by creating a master database of links also helps us imagine a future beyond <abbr>HTTP</abbr> and <abbr>DNS</abbr>, with <a rel="ci:mentions" href="https://datatracker.ietf.org/doc/draft-kunze-ark/"><abbr title="archival resource key">ARKs</abbr></a>, <a rel="ci:mentions" href="https://www.w3.org/TR/did-core/">decentralized identifiers</a>, or something else entirely.</p>
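      <p>To illustrate how a database of links is a database of backlinks turned around, here is a minimal Python sketch. The link records and the <code>backlinks</code> function are hypothetical; a real store would be a database and not a list.</p>

```python
from collections import defaultdict

# Hypothetical link records: (source URI, link type, target URI).
links = [
    ("/essay-a", "dct:references", "/essay-b"),
    ("/essay-c", "dct:references", "/essay-b"),
    ("/essay-b", "dct:hasPart", "/essay-b/section-1"),
]


def backlinks(link_records):
    """Invert a list of links into a mapping from target to (type, source) pairs."""
    index = defaultdict(list)
    for source, link_type, target in link_records:
        index[target].append((link_type, source))
    return index


inbound = backlinks(links)
print(inbound["/essay-b"])
# [('dct:references', '/essay-a'), ('dct:references', '/essay-c')]
```

      <p>Because each record keeps its link type, a page can group whatever points at it by kind of relation.</p>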
      
    </section>
    <section id="Edn-p5d-E7Dkakc3lDh2jJ" rel="dct:hasPart" resource="#Edn-p5d-E7Dkakc3lDh2jJ" typeof="bibo:DocumentPart">
      <h2 property="dct:title">Nodes <em>and</em> Links Are Typed</h2>
      <p>The motivation for solving <dfn>link rot</dfn> through a regime of durable addressing is to be able to increase the number of addressable information resources by several orders of magnitude. The motivation behind <em>that</em> is to only have a single authoritative copy of any given piece of content that is subsequently reused. Examples of this in the wild are the graph-based <dfn>personal knowledge management</dfn> systems, also known as <q>tools for thought</q>. <abbr>PKM</abbr> systems achieve the <em>density</em> we're after, but they don't distinguish programmatically between <em>kinds of thing</em>, be it resource or link.</p>
      <aside id="Ed5Mob0T73jzbKdohDY3SL" rev="oa:hasTarget" role="note" resource="../7793286f-44fb-4de3-bcdb-29da210d8dd2" typeof="oa:Annotation">
        <p property="oa:hasBody">Some products, like Notion, have a rudimentary type system for resources, although it is very literal about what <q>kind of thing</q> means. Also, an embedding versus a conventional navigational arc can be construed as a <q>link type</q>, although that framing skips over what kind of link it <em>is</em> and just says how to display it.</p>
      </aside>
      <p>Typed <em>nodes</em>&#x2014;resources, pages, blocks, modules, whatever you want to call them&#x2014;are obvious. That is typically how you dispatch an appropriate display template. Typed <em>links</em>, on the other hand, are typically implicit. An example of a particular type of link would be the links that represent a person's friends on a social media network. A sophisticated system may have an <abbr>API</abbr> that indeed groups these links as a member of a data structure, but the median player simply delineates link types by where they put them on the page.</p>
      <p>In conventional Web development, adding a new kind of link is a non-trivial process. It involves altering the database schema and surrounding code (which on a live site triggers operations overhead), modifying display templates, and deciding how to style it. Two out of three of these interventions are major surgery that requires the patient (or rather its in-development doppelg&#xE4;nger) to be anaesthetized, while the third merely renders it temporarily unpresentable. Often this is a multi-person job, which we have to repeat, at least in part, everywhere in the system we want this kind of link to show up. The outcome is that any <q>link type</q> is represented in a number of different places:</p>
      <ul>
        <li>a foreign key constraint identifier (if you even use those),</li>
        <li>an <abbr>ORM</abbr> relationship identifier or accessor name,</li>
        <li>the file name of a template fragment representing the region in the page layout,</li>
        <li>a <abbr>CSS</abbr> class name&#x2026;</li>
      </ul>
      <p>&#x2026;but which one of these is canonical? Or does the canonical representation live somewhere else, like in some spec document? Are the identifiers all the same? If not, where does the mapping exist from one to the other? Is the specification available to stakeholders? Do they know where it is?</p>
      <aside id="E-XIkS0zcgXyGawqghMRzJ" rev="oa:hasTarget" role="note" resource="../f972244b-4cdc-4817-9c86-6b0aa084c473" typeof="oa:Annotation">
        <p property="oa:hasBody">Also: tack on whatever code generates your <abbr>API</abbr> response if that's in the mix as well. I have come to call this general state of affairs <a href="../the-symbol-management-problem">the symbol management problem</a>, because you have a bunch of <em>symbols</em>, which you have to <em>manage</em>, and this is a <em>problem</em>.</p>
      </aside>
      <p>If we instead begin from a situation where links are first-class objects and <em>typed</em>, a lot of what I just described can be <em>derived</em>. A link between two <abbr>URIs</abbr> of a certain type is enough to tell us something about its disposition&#x2014;say, whether it should be displayed as an ordinary link, or an embed, or a form action, or invisible metadata, or something else. Put together with the types of the nodes at either pole, this information is enough to place it on the page. Types of course inherit, so the layout of a more-specific type can be represented as a delta against its more-generic ancestor. Finer adjustments can also be attached to specific <abbr>URIs</abbr> if we want them to deviate from their types in some unique way.</p>
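      <p>A minimal sketch of deriving a link's disposition from its type, with more-specific types inheriting from their ancestors, might look like the following. The <code>ex:</code> type names, the disposition labels, and the function name are all assumptions for illustration, not vocabulary from this project.</p>

```python
# Hypothetical link type hierarchy: each type names its more-generic parent.
LINK_PARENTS = {
    "ex:link": None,
    "ex:embeds": "ex:link",
    "ex:depicts": "ex:embeds",
    "ex:metadata": "ex:link",
}

# Dispositions are declared only where a type differs from its ancestor.
DISPOSITIONS = {
    "ex:link": "anchor",
    "ex:embeds": "embed",
    "ex:metadata": "invisible",
}


def disposition(link_type):
    """Walk up the hierarchy until some ancestor declares a disposition."""
    current = link_type
    while current is not None:
        if current in DISPOSITIONS:
            return DISPOSITIONS[current]
        current = LINK_PARENTS.get(current)
    raise KeyError(f"unknown link type: {link_type}")


print(disposition("ex:depicts"))   # embed (inherited from ex:embeds)
print(disposition("ex:metadata"))  # invisible
```

      <p>Only the differences are stored, so adding a new link type amounts to one entry in the hierarchy and, at most, one override.</p>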
    </section>
    <section id="EsRgEghZ59jlEPnDOuTrOL" rel="dct:hasPart" resource="#EsRgEghZ59jlEPnDOuTrOL" typeof="bibo:DocumentPart">
      <h2 property="dct:title">Standard Interfaces</h2>
      <p>Web-based software is typically bound not just to a particular programming language, but a particular application framework on <em>top</em> of a particular programming language. This puts the <em>content</em> of the Web at odds with its own infrastructure:</p>
      <ul>
        <li>Programmers are suckers for technical fads,</li>
        <li>every system integration is its own unique and special snowflake.</li>
      </ul>
      <p>So you've got one piece of Web infrastructure that was written when <code>Blub</code> was en vogue, but now nobody would be caught dead using it because <code>Glub++</code> is the new hotness. If you want the two systems to cohabitate in the same address space, you probably have to do some real bastard stuff to make it work. All it takes is one <abbr>VP</abbr> of No to declare that it isn't worth integrating, and you are left with a gap that has to be spanned by human users in perpetuity.</p>
      <aside id="EA302j8fApTl1eQJWAdBML" rev="oa:hasTarget" role="note" resource="../037d368f-c7c0-4a53-b975-79025601d04c" typeof="oa:Annotation">
        <p property="oa:hasBody">Even if you use standard <em>syntax</em> (like <abbr>JSON</abbr>), there is still no standard <em>semantics</em> for <abbr>API</abbr> responses and other structured data. Moreover, there is no standard discovery mechanism for <abbr>API</abbr> endpoints, and even if there <em>were</em>, it wouldn't matter because you'd probably have to write a custom adapter for every single third-party <abbr>API</abbr> you encounter.</p>
      </aside>
      <p>The solution here is to design the engine with the assumption that it will be joining an ecosystem with other languages and frameworks. Additionally, hold all of its subsystems to the standard that they must pass as standalone <dfn>microservices</dfn>. Give them each their own <abbr>URL</abbr>. Knit everything together at the <abbr>HTTP</abbr> level. This software should be considered to be a <em>proposal</em> for how to organize a <q>language bus</q>&#x2014;intended to subsume all functionality written in a particular programming language (in this case Ruby), and intended to be daisy-chained with other language buses, running in a single amalgamated address space.</p>
      <aside id="EM9Wew4F8_Jwmfvh4MDbkI" rev="oa:hasTarget" role="note" resource="../33d59ec3-817c-4fc9-8c26-7ef8783036e4" typeof="oa:Annotation">
        <p property="oa:hasBody">The Web <dfn>microframework</dfn> pattern (<a rel="ci:mentions" href="https://en.wikipedia.org/wiki/Rack_(web_server_interface)">Rack</a>, <a rel="ci:mentions" href="https://en.wikipedia.org/wiki/Web_Server_Gateway_Interface"><abbr title="Web Server Gateway Interface">WSGI</abbr></a>, <a rel="ci:mentions" href="https://en.wikipedia.org/wiki/PSGI"><abbr title="Perl Web Server Gateway Interface">PSGI</abbr></a>, <a rel="ci:mentions" href="https://en.wikipedia.org/wiki/JSGI"><abbr title="JavaScript Gateway interface">JSGI</abbr></a>&#x2026;) is a sensible point of departure. Construed as implementations of a set of nearly-identical protocols, microframeworks sit in between whatever's handling requests off the network, and the application, for which they drastically simplify the interface. The application doesn't have to care about the context in which it is being run. All it has to do is ingest one data structure and emit a different one. No fuss, no muss. All microframeworks perform this mundane yet terrifically <em>stable</em> role for their respective programming languages. They might do <em>more</em> than that, but this flattening of program input and output to a predictable pair of structures is the important thing. If your language of choice doesn't have one of these microframeworks, you'd be a hero if you wrote one.</p>
      </aside>
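      <p>For a concrete picture of <q>ingest one data structure and emit a different one</q>, here is a minimal application written against <abbr>WSGI</abbr>, Python's member of this family. The response text and the stand-in <code>start_response</code> are made up for the example; the calling convention itself is the standard one.</p>

```python
def application(environ, start_response):
    """A complete WSGI application: one dict in; status, headers, and body out."""
    body = f"You asked for {environ['PATH_INFO']}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# The application can be exercised without any server at all,
# because its entire input is a dictionary.
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers


chunks = application({"PATH_INFO": "/hello", "REQUEST_METHOD": "GET"}, start_response)
print(captured["status"], b"".join(chunks))
# 200 OK b'You asked for /hello\n'
```

      <p>Nothing in the function knows whether it is running behind a production server, a test harness, or another application, which is what makes it such a stable development target.</p>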
      <p>Forcing subsystems to converse exclusively in <abbr>HTTP</abbr> means that they can't use any bespoke funny business to communicate, which in turn means said components are <em>extremely</em> well-defined development targets which are amenable not only to being ported to other languages, but to being used as-is in heterogeneous systems. This design decision, however, raises a couple of issues:</p>
      <ul>
        <li>Efficiency, or lack of it, having to repeatedly serialize and reparse <abbr>HTTP</abbr> messages;</li>
        <li>the expressivity of the constraint&#x2014;whether it's possible to say everything that needs to be said to describe an acceptable range of behaviour.</li>
      </ul>
      <p>The solution to these problems is a combination of creative interpretation of standards, or what I have come to call <dfn>standards hacking</dfn>, and an equally creative implementation. To be sure, the serialize-and-reparse phenomenon is going to happen when an <abbr>HTTP</abbr> message crosses a process boundary, but this overhead can be easily mitigated <em>within</em> an individual process. As for the expressivity issue, we create a grammar of coarser-grained operations within which these aforementioned microservices serve as the vocabulary.</p>
      <p>There will be an inherent need for communicating internally within the system using structured data. The proposal here is to use <abbr>RDF</abbr> as the reference semantics for structured data, and downgrade to other formats (plain <abbr>JSON</abbr>, <abbr>CSV</abbr>&#x2026;) when necessary.</p>
      <aside id="EOhfDC1DgOl7CY6R2Hzp4I" rev="oa:hasTarget" role="note" resource="../3a17c30b-50e0-43a5-8ec2-63a4761f3a78" typeof="oa:Annotation">
        <p property="oa:hasBody">I have been using <abbr>RDFa</abbr> successfully in <abbr>(X)HTML</abbr> and <abbr>SVG</abbr> for many years as a means of dispatching display templates and <abbr>CSS</abbr> selectors. Moreover, <dfn><abbr>JSON-LD</abbr> contexts</dfn> are designed to map <abbr>RDF</abbr> identifiers to plain <abbr>JSON</abbr>. Other benefits of <abbr>RDF</abbr> include the fact that there exist hundreds of vocabularies (<abbr>RDF</abbr> Schema, <abbr>OWL</abbr>, <abbr>SHACL</abbr>) that define common classes of data object and their interrelated properties, which you are free to use, as with any open-source software. Should you need to write your own extension, you can usually take advantage of <abbr>RDF's</abbr> inheritance model to supply a minimal patch to what is already available. When you publish the resulting extension vocabulary on the Web, it is possible to do so in a way that the machine-executable specification is embedded into the human-readable prose, which, among other benefits, helps prevent the two from drifting apart.</p>
      </aside>
      <p>My final remark regarding standard interfaces for the time being concerns the problem of presentation markup and page composition. Since we have stipulated that every resource must be addressable by <abbr>URI</abbr> and converse in <abbr>HTTP</abbr>, we can do page composition at the network level. To accomplish this I arrange for content handlers to emit a minimal <abbr>XHTML</abbr> document containing only semantic elements with structured data embedded using <abbr>RDFa</abbr>. To compose the resources and embellish the markup, I currently use <abbr>XSLT</abbr> (a standard language), executed in the browser.</p>
      <aside id="E_fcZ-_-TytuV3yuwT80GI" rev="oa:hasTarget" role="note" resource="../fdf719fb-ff93-4cad-8b95-df2bb04fcd06" typeof="oa:Annotation">
        <div property="oa:hasBody">
          <p>Web browsers only implement the 24-year-old <abbr>XSLT</abbr> <var>1.0</var>, which is effectively abandonware. I am frankly surprised the Chromium people in particular haven't opportunistically deleted it. I can only assume there is some powerful legacy constituency which is preventing them from doing so. Despite an awkward syntax and an astonishing lack of basic functionality (e.g. regular expressions, the ability to reprocess result trees), <abbr>XSLT</abbr> <var>1.0</var> runs quickly, and is more than expressive enough to handle the basic logic needed for template processing.</p>
          <p>The <abbr>XSLT</abbr> standard is now up to version <var>3.0</var>, which has a boatload of new capabilities including processing <abbr>JSON</abbr>. <a rel="ci:mentions" href="https://www.npmjs.com/package/saxon-js">There is an implementation of it in JavaScript</a> which can be bootstrapped in place of the native <var>1.0</var>. Alternatively, <abbr>XSLT</abbr> can be run as a filter on the server side.</p>
          <p>In my opinion, <abbr>XSLT</abbr> has a number of unique characteristics that set it apart from just about every other template processing language. Being a standard is one of them. Another is the fact that barring considerable effort, it is impossible to use it to generate invalid markup. Few other systems can claim that. I suspect its lack of popularity has to do with being rooted in <abbr>XML</abbr>, which mainstream Web developers almost universally despise. My bet is if there were a compact <abbr>XSLT</abbr> syntax &#xE0; la <abbr>RelaxNG</abbr>, it would be more popular.</p>
        </div>
      </aside>
      
    </section>
    <section id="Eu252AgitBcQ3_w-DBDvkI" rel="dct:hasPart" resource="#Eu252AgitBcQ3_w-DBDvkI" typeof="bibo:DocumentPart">
      <h2 property="dct:title">Layered System: Pipes with Types</h2>
      <p>If we stipulate that all communication is restricted to <abbr>HTTP</abbr> messages, and that each subcomponent is its own <dfn>microservice</dfn>, then we can contemplate two species of microservice:</p>
      <ul>
        <li>One species that responds directly to an <abbr>HTTP</abbr> request,</li>
        <li>another that manipulates an <abbr>HTTP</abbr> message, i.e., either request or response, in transit.</li>
      </ul>
      <p>We will dub the former a <dfn>content handler</dfn>, and the latter a <dfn>transform</dfn>. A content handler is well-understood&#x2014;that's what we typically write when we write Web apps. You give it an <abbr>HTTP</abbr> request and it returns a response. A transform is something you give a <em>response</em> and it returns a response. Here are some potential operations of transforms:</p>
      <dl>
        <dt><abbr>HTML</abbr>/<abbr>XML</abbr> markup</dt>
        <dd>Add social media metadata, rewrite links, rearrange document tree</dd>
        <dt>Raster images (e.g., photos)</dt>
        <dd>Resize, crop, and other rudimentary Photoshop-esque functions; change between file formats</dd>
        <dt>Audio/video</dt>
        <dd>Take clips or individual frames, transcode/resample (though the latter may not be something we do in real time)</dd>
        <dt>Other</dt>
        <dd>Transform between representations, e.g. Markdown to <abbr>(X)HTML</abbr> or <abbr>RDFa</abbr> to <abbr>JSON-LD</abbr> (or Turtle, or <abbr>RDF</abbr>/<abbr>XML</abbr>&#x2026;); handle over-the-wire compression. (Indeed, that is how compression is typically implemented already. More likely we'd have to <em>de</em>compress responses from upstream before we could operate on them.)</dd>
      </dl>
      <aside id="EOQu9zeajCXAc8TgN8DW7L" rev="oa:hasTarget" role="note" resource="../390bbdcd-e6a3-4097-b01c-f1380df035bb" typeof="oa:Annotation">
        <p property="oa:hasBody">We can also have <dfn>request transforms</dfn> that rewrite the request body (e.g. for <code>POST</code> requests), manipulate headers, and even alter the request <abbr>URI</abbr> or the request method itself.</p>
      </aside>
      <p>The idea behind transforms, at least <dfn>response transforms</dfn>, is that they don't have to know much&#x2014;if anything&#x2014;about the application: they are nominally bytes in, bytes out. This means that a transform written in one programming language could be used to alter the output of a content handler written in a different language, because all response transforms are also stand-alone microservices.</p>
      <aside id="E8WxuBzyeM2H0CX7_VmBFL" rev="oa:hasTarget" role="note" resource="../f16c6e07-3c9e-4336-b1f4-097eff566045" typeof="oa:Annotation">
        <p property="oa:hasBody">At the moment I'm not sure if request transforms should also be made into microservices. In my mind they definitely <em>are</em> functions that take an <abbr>HTTP</abbr> request and return an altered <abbr>HTTP</abbr> request. You <em>could</em> make that into a <code>POST</code> microservice that speaks <code>message/http</code>, I suppose, but what would use it?</p>
      </aside>
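      <p>The distinction between a content handler and a response transform can be sketched in a few lines of Python. This is an in-process simplification: in the design described here each piece would be a stand-alone microservice speaking <abbr>HTTP</abbr>. The <code>Response</code> class and every function name below are hypothetical.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    """Hypothetical stand-in for an HTTP response message."""
    status: int
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def content_handler(path):
    """Content handler: takes a request (here just a path), returns a response."""
    return Response(200, {"Content-Type": "text/plain"}, f"hello from {path}".encode())


def uppercase(response):
    """Response transform: bytes in, bytes out, no knowledge of the application."""
    return Response(response.status, dict(response.headers), response.body.upper())


def add_length(response):
    """Another transform: recompute Content-Length after the body has changed."""
    headers = dict(response.headers)
    headers["Content-Length"] = str(len(response.body))
    return Response(response.status, headers, response.body)


def serve(path, transforms):
    """Run the handler, then pass its response through each transform in order."""
    response = content_handler(path)
    for transform in transforms:
        response = transform(response)
    return response


result = serve("/greeting", [uppercase, add_length])
print(result.body, result.headers["Content-Length"])
# b'HELLO FROM /GREETING' 20
```

      <p>Since every transform has the same shape, response in and response out, the order of the list is the only thing that has to be configured.</p>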
      <p>We can imagine two transform <em>stacks</em>: one that manipulates the request on the way in, and another that manipulates the response on the way out. We would configure these statically, passing the <abbr>HTTP</abbr> message from one transform to the next. It will be useful for transforms triggered earlier in the process to append a later one, should certain conditions be met. It will likewise be useful for us, in addition to both static configuration and conditional processing, to explicitly <em>address</em> response transforms as parameters to other <abbr>URIs</abbr>, which in turn could take their own parameters. This has the side effect of creating <abbr>URIs</abbr> which are legible as being derived from non-parametrized resources. Consider a request like <code>GET /photo;scale=100,100</code>. We map <code>scale=100,100</code> under the hood to <code>POST /scale?width=100&amp;height=100</code> and pass the representation of <code>/photo</code> in as the request body. We use <em>path</em> parameters for response transforms both because they faithfully communicate the sequence of operations (e.g. <code>/photo;desaturate;crop=200,100,1000,1000;scale=100,100</code>) and because doing so keeps <em>query</em> parameters free for parametrizing content handlers.</p>
      <aside id="EUVgezQ0VjSqUvpRKoTdPJ" rev="oa:hasTarget" role="note" resource="../51581ecd-0d15-48d2-9a94-be944aa1374f" typeof="oa:Annotation">
        <p property="oa:hasBody">A few years ago I designed a <a rel="ci:mentions" href="https://vocab.methodandstructure.com/transformation#">Transformation Functions Ontology</a> for representing transforms, their additional scalar parameters (in both positional and named form), the result when the function is applied, and even partial application (also known as <dfn>currying</dfn>).</p>
      </aside>
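      <p>Here is one way the mapping from path parameters to transform requests could be sketched in Python. Only the <code>width</code> and <code>height</code> names for <code>scale</code> come from the example in the text; the lookup table, the function names, and the handling of a single trailing path segment are simplifying assumptions.</p>

```python
from urllib.parse import urlencode

# Names for positional arguments. Only scale's width and height come from the
# text; a real system would look these up from each transform's description.
POSITIONAL_NAMES = {
    "desaturate": [],
    "scale": ["width", "height"],
}


def parse_path(path):
    """Split '/photo;a;b=1,2' into the base path and an ordered transform list."""
    base, *segments = path.split(";")
    transforms = []
    for segment in segments:
        name, _, raw = segment.partition("=")
        args = raw.split(",") if raw else []
        transforms.append((name, args))
    return base, transforms


def to_request(name, args):
    """Translate one path parameter into the POST request line it stands for."""
    names = POSITIONAL_NAMES[name]
    if len(names) != len(args):
        raise ValueError(f"{name} expects {len(names)} arguments")
    query = urlencode(list(zip(names, args)))
    return f"POST /{name}?{query}" if query else f"POST /{name}"


base, transforms = parse_path("/photo;desaturate;scale=100,100")
print(base)
for name, args in transforms:
    print(to_request(name, args))
```

      <p>Run as written, this prints the base path followed by one <code>POST</code> request line per transform, in the order the parameters appear in the <abbr>URI</abbr>.</p>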
      
    </section>
    <section id="Ez-whtsVwAMXHONB7v4LPJ" rel="dct:hasPart" resource="#Ez-whtsVwAMXHONB7v4LPJ" typeof="bibo:DocumentPart">
      <h2 property="dct:title">Wholeness</h2>
      <p>One thing that has struck me as remarkable about Web development, going back decades, is the preponderance of <em>fragments</em>. Going back as far as <abbr>CGI</abbr> scripts and <dfn>server-side includes</dfn>, the name of the game was typically to stitch together pieces of markup by concatenating strings of text. This pattern was carried forward by systems like <abbr>PHP</abbr>, <abbr>ASP</abbr>, <abbr>JSP</abbr>, ColdFusion, and many others. Fragments are the modus operandi of <abbr>MVC</abbr> frameworks that dominated the last decade; even the most modern frameworks like React, Angular, and Vue still operate in terms of fragments.</p>
      <p>I think this is significant because <abbr>HTML</abbr>, which is the output that by far the most work goes into, smushes together a number of disparate equities:</p>
      <ul>
        <li>There are the technical concerns,</li>
        <li>matters of presentation,</li>
        <li>not to mention the content itself.</li>
      </ul>
      <p>All of these have to coexist in the same document. When you cut the document up into fragments, these diverging concerns are cut up as well. At the basic technical level, stitching together fragments introduces the potential for syntax errors in the markup, or mismatches in character encoding. Presentation markup (including metadata for both search engines and scripting) and <abbr>CSS</abbr> identifiers have to be interspersed in application code. Same goes with buttons, labels, and other <dfn>microcontent</dfn>.</p>
      <aside id="ELGMrxAJAUpVLSUfE231xL" rev="oa:hasTarget" role="note" resource="../2c632bc4-0240-4529-b54b-4947c4db7d71" typeof="oa:Annotation">
        <p property="oa:hasBody">What this means on the ground, in addition to having a brittle product, is that up to three different teams have to coordinate to make any changes.</p>
      </aside>
      <p>The approach I'm proposing instead borrows from a largely forgotten niche framework called <a rel="ci:mentions" href="https://cocoon.apache.org/">Cocoon</a>, which relies heavily on transformation pipelines to do its work. While Cocoon is firmly of an era, one thing it absolutely got right was the idea that a <dfn>representation</dfn> of an <dfn>information resource</dfn> can be put through a series of successive transformations, which are separate from the handler that produces the initial representation. For this to work, however, the initial representation (to say nothing of its transformed variants) has to be a complete object, <em>not</em> a fragment. User-facing Web pages are frequently, if not <em>usually</em>, composite: that is, they represent more than one subject entity. There's no rule saying the parts can't <em>also</em> be valid, addressable wholes unto themselves. Again, moving composition into the <abbr>HTTP</abbr> realm means content generated from different frameworks can mingle. Since each elementary component produces a complete document unto itself, it is a crisply defined development target that can be created and evaluated independently.</p>
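      <p>The separation between handler and transforms can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Cocoon's actual <abbr>API</abbr>: a Ruby hash stands in for a parsed document, and the names <code>handler</code>, <code>run_pipeline</code>, and the three transforms are all hypothetical.</p>

```ruby
# Hypothetical sketch: a handler yields a complete document, and a pipeline
# of independent transforms each maps a complete document to another one.
# A Hash stands in for a parsed document; none of these names come from Cocoon.

# The handler knows nothing about the transforms that follow it.
handler = lambda do |uri|
  { uri: uri, title: 'Architectural Principles', lang: nil,
    body: ['Wholeness'], links: [] }
end

# Each transform returns a new, complete document rather than a fragment.
add_language   = ->(doc) { doc.merge(lang: doc[:lang] || 'en') }
add_stylesheet = ->(doc) { doc.merge(links: doc[:links] + ['/transform']) }
upcase_title   = ->(doc) { doc.merge(title: doc[:title].upcase) }

# Run the handler, then fold the document through each transform in order.
def run_pipeline(handler, transforms, uri)
  transforms.reduce(handler.call(uri)) { |doc, transform| transform.call(doc) }
end

original = handler.call('/architectural-principles')
result   = run_pipeline(handler, [add_language, add_stylesheet, upcase_title],
                        '/architectural-principles')
puts result[:title]
```

      <p>Because every stage consumes and produces the same complete shape, any intermediate result can be inspected or tested by itself, and transforms can be reordered or dropped without touching the handler.</p>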
      <p>The real inspiration for this architectural principle, though, comes from the later work of the architect Christopher Alexander. Rather than approaching a problem as a matter of assembling parts which individually have little to no value, the idea is to produce a <em>whole</em>, and put it through a series of structure-preserving transformations. Here the natural whole is the <abbr>HTTP</abbr> <dfn>resource</dfn>, which is a many-to-many mapping between a set of <abbr>URIs</abbr> and a set of <dfn>representations</dfn>. This means at least one address is related to at least one literal byte segment. Polymorphic resources can thus be constructed out of elementary ones (including and <em>especially</em> representations derived via a transformation function). Larger structures can then be composed from individual resources. What you notice when you do this is that you start thinking about individual resources in terms of the information they need to provide to be useful to the contexts that employ them, which encourages you to design more durable and modular resources. This is a technique I have been using for about fifteen years: I don't, for example, design a navigation; I design a stand-alone <em>table of contents</em> and then <em>transform</em> it into a navigation.</p>
      <aside id="Ef-oXuae0Qb-EJv54QMheJ" rev="oa:hasTarget" role="note" resource="../7fea17b9-a7b4-441b-9f84-26fe7840c85e" typeof="oa:Annotation">
        <p>Past criticism of this technique has cited a lack of context, but I find it's not a matter of <em>no</em> context, but rather of <em>at least one</em> context, <em>including</em> the <q>default context</q> of just observing the resource by itself in a browser.</p>
      </aside>
      <p>What we find when we do this is we are biased toward creating resources that do double duty as stand-alone pages as well as components of other pages. In Alexander's parlance these are called <dfn>strong centers</dfn>: rather than being meaningless when separated from the whole, every part has an identity of its own.</p>
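      <p>To make the table-of-contents example concrete, here is a rough sketch of a resource as a mapping between a set of <abbr>URIs</abbr> and a set of representations, one of which is derived by transformation. Everything named here is an assumption for illustration: the <code>Resource</code> struct, its <code>derive</code> method, the paths, and the made-up <code>x-example/navigation</code> type are not taken from any real library.</p>

```ruby
# Hypothetical sketch of a resource as a mapping between a set of URIs and a
# set of representations, with one representation derived by transformation.
# Resource, derive and the media-type keys are illustrative names only.
require 'set'

Resource = Struct.new(:uris, :representations) do
  # Add a representation computed from an existing one; the source is kept.
  def derive(from, to)
    representations[to] = yield(representations.fetch(from))
    self
  end
end

# A stand-alone table of contents: useful by itself at its own address.
toc = Resource.new(
  Set['/toc', '/contents'],
  {
    'application/json' => [
      { title: 'Wholeness',           href: '/principles#wholeness' },
      { title: 'Appendix: Why Ruby?', href: '/principles#why-ruby' }
    ]
  }
)

# Transform the whole table of contents into a navigation for a given page.
current = '/principles#wholeness'
toc.derive('application/json', 'x-example/navigation') do |entries|
  entries.map { |e| e.merge(current: e[:href] == current) }
end

nav = toc.representations['x-example/navigation']
puts nav.map { |e| e[:current] ? "[#{e[:title]}]" : e[:title] }.join(' | ')
```

      <p>The original representation survives untouched next to the derived one, so the table of contents remains a usable page in its own right while also serving as the source of the navigation.</p>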
      
    </section>
    <section id="Ew-DJE-ESlkG78ZV-2_YsL" rel="dct:hasPart" resource="urn:uuid:c3e0c913-e112-4964-b1bb-f1957edbf62c" typeof="ci:Appendix">
      <h2 property="dct:title">Appendix: Why Ruby?</h2>
      <p>I was originally going to write (what ultimately became) <code>Intertwingler</code> in <a rel="ci:mentions" href="https://www.python.org/">Python</a>, but ended up writing it in <a rel="ci:mentions" href="https://ruby-lang.org/">Ruby</a> because <a rel="ci:mentions" href="https://ruby-rdf.github.io/">Ruby's <abbr>RDF</abbr> library</a> has a <a rel="ci:mentions" href="https://en.wikipedia.org/wiki/Semantic_reasoner"><dfn>reasoner</dfn></a>, and <a rel="ci:mentions" href="https://rdflib.readthedocs.io/en/stable/">Python's</a>, at least at the time, did not. I briefly contemplated writing it in <a rel="ci:mentions" href="https://clojure.org/">Clojure</a> (on top of something like <a rel="ci:mentions" href="https://jena.apache.org/">Jena</a>), and still may, but at the time (<time>2018</time>) I didn't feel like paying for a bigger server in perpetuity just so I could run a <abbr>JVM</abbr>. So, Ruby it is.</p>
      <p>One of the things you notice working with <abbr>RDF</abbr> is how soon you find yourself needing to do <dfn>entailment</dfn>, and for that you need a <dfn>reasoner</dfn>. If you want to say something like <q>select all resources of type <var>X</var></q>, you're going to want to include class <var>X</var>, its subclasses, and any equivalent classes of any of those. If you want to say <q>get me all the resources related to subject <var>S</var> by property <var>P</var></q>, you're going to want to do the same thing but for subproperties and their equivalents. A reasoner is table stakes if you want to do any graph manipulation <abbr>UI</abbr>: <q>Give me all the properties with a domain of this class</q>, <q>give me all the instances of classes in the range of this property</q>; it is folly to attempt to code these operations by hand. Problem is, reasoners are scarce, and not all of them are created equal. Furthermore, most reasoners are written in Java, which sharply limits how you can interact with them. The Ruby one is simplistic; it attaches to the vocabularies themselves and provides ad-hoc entailments for class and property terms, which is more than sufficient for the examples I just gave. Other reasoners will generate all the entailments for an entire instance graph in one shot, which is almost never what you want.</p>
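      <p>As a rough illustration of the first query, here is a minimal sketch of class entailment over a toy vocabulary. It deliberately does not use the Ruby <abbr>RDF</abbr> library's reasoner; plain hashes stand in for subclass and equivalent-class statements, and every term and function name (<code>entailed_classes</code>, <code>instances_of</code>, the <code>ex:</code> terms) is made up for the example.</p>

```ruby
# Hypothetical sketch of class entailment over a toy vocabulary. This is not
# the RDF.rb reasoner API; plain hashes stand in for rdfs:subClassOf and
# owl:equivalentClass statements, and all names here are made up.
require 'set'

SUBCLASS_OF = { 'ex:Article' => ['ex:Document'],
                'ex:Essay'   => ['ex:Article'] }.freeze
EQUIVALENT  = { 'ex:Document' => ['ex:Doc'],
                'ex:Doc'      => ['ex:Document'] }.freeze
TYPES = { '/a' => 'ex:Essay', '/b' => 'ex:Doc', '/c' => 'ex:Person' }.freeze

# Collect the class itself, its subclasses, and equivalents of any of those.
def entailed_classes(klass)
  seen  = Set.new
  queue = [klass]
  until queue.empty?
    term = queue.shift
    next unless seen.add?(term) # add? returns nil when already visited
    subclasses = SUBCLASS_OF.select { |_, supers| supers.include?(term) }.keys
    queue.concat(subclasses + EQUIVALENT.fetch(term, []))
  end
  seen
end

# "Select all resources of type X", with entailment applied to X.
def instances_of(klass)
  wanted = entailed_classes(klass)
  TYPES.select { |_, type| wanted.include?(type) }.keys
end

p instances_of('ex:Document')
```

      <p>Note that the expansion is computed on demand from the vocabulary terms for a single query, rather than by materializing every entailment over the instance data up front.</p>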
      <aside role="note">
        <p>I just want to briefly digress here on the topic of querying graphs for the purpose of making <abbr>RDF</abbr>-driven Web apps: in general I find <abbr>SPARQL</abbr> to be overkill because I'm rarely doing anything that needs functionality remotely close to the full query algebra. It's almost never worth the overhead of <abbr>SPARQL</abbr>, unless of course your interface to persistent storage is <abbr>SPARQL</abbr> under the hood. Even still, though, the query optimizers behind <abbr>SPARQL</abbr> do not have the same maturity as those for <abbr>SQL</abbr>, and it's easy to inadvertently generate some <em>very</em> expensive queries; as such I like to be conscious of when I'm actually using <abbr>SPARQL</abbr> and just use basic graph patterns on locally-attached storage when I can.</p>
      </aside>
      <p>I am actually inclined to diagnose the lack of reasoners outside of the Javasphere as a significant contributing factor to <abbr>RDF</abbr> gaining less traction than it otherwise would have. Over a decade and a half of observation suggests that the bulk of <dfn>Semantic Web</dfn> tooling has been written in Java owing to the field's inherent bias toward academia. A reasoner is a hard piece of software to write; it requires an intersection of skills few people possess. It is likewise a thankless task because it produces results that are several degrees removed from anything most people can directly make any money off of. It's no surprise that the few extant specimens are concentrated where they are. The reasoner in Ruby's <abbr>RDF</abbr> library is rudimentary (this should not be controversial) compared to more sophisticated designs, and has very acute limitations, but is nevertheless serviceable. This is more than can be said for other programming languages. Hopefully, however, somebody will rise to the challenge for these other languages.</p>
    </section>
  </body>
</html>
