<?xml version="1.0"?>
<?xml-stylesheet href="/transform" type="text/xsl"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:bs="http://purl.org/ontology/bibo/status/" xmlns:ci="https://vocab.methodandstructure.com/content-inventory#" xmlns:dct="http://purl.org/dc/terms/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:xhv="http://www.w3.org/1999/xhtml/vocab#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" lang="en" prefix="bibo: http://purl.org/ontology/bibo/ bs: http://purl.org/ontology/bibo/status/ ci: https://vocab.methodandstructure.com/content-inventory# dct: http://purl.org/dc/terms/ foaf: http://xmlns.com/foaf/0.1/ rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# xhv: http://www.w3.org/1999/xhtml/vocab# xsd: http://www.w3.org/2001/XMLSchema#" vocab="http://www.w3.org/1999/xhtml/vocab#" xml:lang="en">
  <head>
    <title property="dct:title">The Great Deshittification</title>
    <base href="https://doriantaylor.com/the-great-deshittification"/>
    <link href="document-stats#EQbViMwN58nhwKCEv6VSEK" rev="ci:document"/>
    <link href="elsewhere" rel="alternate bookmark" title="Elsewhere"/>
    <link href="this-site" rel="alternate index" title="This Site"/>
    <link href="http://purl.org/ontology/bibo/status/published" rel="bibo:status"/>
    <link href="" rel="ci:canonical" title="The Great Deshittification"/>
    <link href="lexicon/#software-developer" rel="dct:audience" title="software developer"/>
    <link href="person/dorian-taylor#me" rel="dct:creator" title="Dorian Taylor"/>
    <link href="person/dorian-taylor" rel="meta" title="Who I Am"/>
    <link about="./" href="3f36c30c-6096-454a-8a22-c062100ae41f" rel="alternate" type="application/atom+xml"/>
    <link about="./" href="this-site" rel="alternate"/>
    <link about="./" href="elsewhere" rel="alternate"/>
    <link about="./" href="e341ca62-0387-4cea-b69a-cdabc7656871" rel="alternate" type="application/atom+xml"/>
    <link about="./" href="f07f5044-01bc-472d-9079-9b07771b731c" rel="alternate" type="application/atom+xml"/>
    <link about="verso/" href="3f36c30c-6096-454a-8a22-c062100ae41f" rel="alternate" type="application/atom+xml"/>
    <link about="verso/" href="this-site" rel="alternate"/>
    <link about="verso/" href="elsewhere" rel="alternate"/>
    <meta content="Recounting an impromptu episode last week where I helped Venkat Rao fix his blog, Ribbonfarm dot com." name="description" property="dct:abstract"/>
    <meta content="2023-10-23T22:59:44.313761+00:00" datatype="xsd:dateTime" property="dct:created"/>
    <meta content="2023-10-23T23:00:19+00:00" datatype="xsd:dateTime" property="dct:modified"/>
    <meta about="person/dorian-taylor#me" content="Dorian Taylor" name="author" property="foaf:name"/>
    <meta content="summary" name="twitter:card"/>
    <meta content="@doriantaylor" name="twitter:site"/>
    <meta content="The Great Deshittification" name="twitter:title"/>
    <meta content="Recounting an impromptu episode last week where I helped Venkat Rao fix his blog, Ribbonfarm dot com." name="twitter:description"/>
    <object>
      <nav>
        <ul>
          <li>
            <a href="document-stats#EQbViMwN58nhwKCEv6VSEK" rev="ci:document" typeof="qb:Observation">
              <span>urn:uuid:41b56233-0379-4f27-a870-28212fe95484</span>
            </a>
          </li>
        </ul>
      </nav>
    </object>
  </head>
  <body about="" id="E0xT1w_iIwNvE_TkP80brK" typeof="bibo:Article">
    <p>I have arranged my career to interact with <a href="https://en.wikipedia.org/wiki/PHP" rel="dct:references"><abbr>PHP</abbr></a> and <a href="https://en.wikipedia.org/wiki/MySQL" rel="dct:references">MySQL</a> as little as I can get away with. I have made <em>extra</em> effort to avoid the chimerical fusion of these two products into the gaping hellmouth known as <a href="https://en.wikipedia.org/wiki/WordPress" rel="dct:references">WordPress</a>. <span>MySQL</span> I haven't used by choice since <a href="https://en.wikipedia.org/wiki/PostgreSQL" rel="dct:references">Postgres</a> was a viable alternative, around early <time>2002</time>. My distaste for <abbr>PHP</abbr> goes back even farther, mainly because I already had an established technique by the time it had started gaining traction, and could get by just fine without it. But moreover, I've always found the language <a href="https://eev.ee/blog/2012/04/09/php-a-fractal-of-bad-design/" rel="dct:references">to be an absolute mess</a> that is <em>uniquely</em> well-suited to enabling people to create even <em>bigger</em> and more elaborate messes.</p>
    <aside role="note">
      <p><abbr>PHP</abbr> has two identifiable advantages over other programming languages&#x2014;one deeply structural, and the other incidental to time and place. The first is that barring certain stylistic interventions, there is a direct, one-to-one relationship between the source code you upload to a server and <span class="parenthesis" title="The ability to plunk a piece of code down into a Web server's addressable space and have it Just Work&#x2122; is at once what makes WordPress so successful, as well as fundamentally unfixable in certain respects.">its effect you see on the website.</span> <a href="the-only-argument-you-will-ever-need-against-php" rel="dct:references" title="The Only Argument You Will Ever Need Against PHP">Enough experience will teach you that this is actually a liability</a>, but to a novice it's particularly gratifying. You <em>can</em> write <abbr>PHP</abbr> like the Web frameworks in other languages, squeezed through the needle's eye of some kind of dispatcher construct, but if you do that, you might as well just use <em>any</em> of those other languages, as it will be cleaner, more concise, better organized, <em>definitely</em> more secure, and probably faster too.</p>
      <p>The other factor of <abbr>PHP's</abbr> popularity compounded how easy it was to get into. The cheap shared Web hosting providers of the <time>late 1990s and early 2000s</time> would offer it as a bundled feature, whether you cared or not. Unlike contemporaneous competitors with very similar properties, such as <a href="https://en.wikipedia.org/wiki/Coldfusion" rel="dct:references">Coldfusion</a>, <span class="parenthesis" title="and still is"><abbr>PHP</abbr> was</span> open-source and thus had no licensing fees. The result was a glut of self-taught <abbr>PHP</abbr> programmers, to whom prospective employers took notice, reinforcing the cycle. <span class="parenthesis" title="much to my chagrin">As such</span>, <abbr>PHP</abbr> has been the engine of journeyman Web development for the last two decades, its share only just recently being eaten into by <span>JavaScript</span>, the one language that has an <em>even</em> lower bar.</p>
    </aside>
    <p>This brings us to <span>WordPress</span>, the engine that powers an impressive <em>one third</em> of the Web. There's a reason for that, which I have characterized in the past as <dfn>expertise arbitrage</dfn>. That is, the skill level required to set <span>WordPress</span> up, get it pulling actual weight, and even tinkering with and extending it, is orders of magnitude narrower than the skill level required to unfuck it when it breaks. Now, you <em>could</em> argue that this asymmetry is everywhere, but I submit that in the <span>WordPress</span> ecosystem, this phenomenon is especially, if not <em>uniquely</em> pronounced. I say this as a person who has de-<span>Viagra</span>'d my fair share of <span>WordPress</span> blogs when nobody else for miles around had the first clue what to do, despite <em>them</em> being the ones who work with it for a living and <em>me</em> doing everything in my power to stay as far away from that consummate lifesuck as possible. The gist is, as long as everything is fine, <em>you're</em> fine, but the second something unusual happens, you're fucked. And even being a <span>WordPress</span> or even a <abbr>PHP</abbr> or <span>MySQL</span> guru isn't enough to fix it.</p>
    <aside role="note">
      <p>I should footnote that I don't consider myself special; my skillset more or less reduces to an incrementally above-average reading comprehension, coupled to <span class="parenthesis" title="or more accurately, computer programmers">a profound distrust of computers</span>.</p>
    </aside>
    <p>Such is the case of this episode. <a href="https://venkateshrao.com/" rel="dct:references">Venkat</a> is <span class="parenthesis" title="and definitely not PHP or MySQL">not in the <span>WordPress</span></span> business, except as a user. The hosting company that runs the <span>WordPress</span> instance that powers <a href="https://ribbonfarm.com/" rel="dct:references"><code>ribbonfarm.com</code></a> recently forced an upgrade to <span>MySQL</span> 8.0, and the net effect was that a bunch of weird symbols started showing up in his blog posts. When asked about this mishap, the tech support staff at this premium, top-dollar, <span>WordPress</span>-specific hosting provider, said something on the order of <q>we don't know, perhaps you should hire a database guy.</q></p>
    <aside role="note">
      <p>This remark, in my opinion, speaks to the gravity of the <dfn>expertise arbitrage</dfn> that is endemic in the <span>WordPress</span> ecosystem. Even full-time employees of a high-end dedicated <span>WordPress</span> hosting provider couldn't diagnose this ultimately common problem. How much do you wanna bet that at <em>least</em> one other customer experienced the same issue?</p>
    </aside>
    <p>Now, the single longest stint I ever worked as a full-time employee happened to be four years as an infrastracture developer in a heavily-in&#xAD;ter&#xAD;nat&#xAD;ion&#xAD;al&#xAD;ized environment in the early aughts before <abbr>UTF-8</abbr> was ubiquitous, so I have seen just about every permutation of encoding cock-up. They often have a <q>signature</q> of sorts, and this one was not unusual in that regard. In particular, this had the hallmark of having taken a run of text that had been encoded in Unicode <abbr>UTF-8</abbr>, treated as <em>if</em> it had been encoded in an older single-byte Latin character set, and then encoded <em>again</em> in <abbr>UTF-8</abbr>.</p>
    <p>If you aren't aware, <a href="https://tonsky.me/blog/unicode/" rel="dct:references"><abbr>UTF-8</abbr> is a variable-byte encoding system</a> with built-in error detection, such that any single <em>virtual</em>, non-composite <abbr>UTF-8</abbr> character&#x2014;above code point 127 that is&#x2014;is two to four <em>literal</em> bytes with a distinctive pattern. Namely, the top bits of the bytes are always masked, and the first byte in a sequence is masked in a way that tells you how many more to expect. That way the parser always knows if a <abbr>UTF-8</abbr> character has been mangled, or if it is intact. When you look at the bytes of a <abbr>UTF-8</abbr> character like <code>U+2014</code> (that would be the em-dash, <code>&#x2014;</code>) in hexadecimal notation, they look like <code>E2 80 94</code>. Through the lens of a single-byte encoding scheme like <a href="https://en.wikipedia.org/wiki/Windows-1252" rel="dct:references"><abbr>CP-1252</abbr></a>, those bytes turn into this: <code>&#xE2;&#x20AC;&#x201D;</code>. Except the <em>actual</em> bytes on the wire were <code>C3 A2 E2 82 AC E2 80 9D</code>. That is, they <em>were</em> valid <abbr>UTF-8</abbr>, just not the text that was intended:</p>
    <ul>
      <li><code>E2</code> (<code>&#xE2;</code>) becomes <code>C3 A2</code>,</li>
      <li><code>80</code> (<code>&#x20AC;</code>) maps to <code>U+20AC</code> which becomes <code>E2 82 AC</code>, and</li>
      <li><code>94</code> (<code>&#x201D;</code>) maps to <code>U+201D</code> which turns into <code>E2 80 9D</code>.</li>
    </ul>
    <p>Diagnosis: text that was <em>actually</em> <abbr>UTF-8</abbr> was treated as <abbr>CP-1252</abbr> and encoded as <abbr>UTF-8</abbr> a second time.</p>
    <aside role="note">
      <p>To test this, I busted out <a href="http://pry.github.io/" rel="dct:references"><code>pry</code></a> and wrote the following <a href="https://ruby-lang.org/" rel="dct:references">Ruby</a> one-liner:</p>
      <pre style="white-space: pre-wrap">[37] pry(main)&gt; [0xc3, 0xa2, 0xe2, 0x82, 0xac, 0xe2, 0x80, 0x9d].&#x200B;map(&amp;:chr).join.force_encoding(Encoding::UTF_8).&#x200B;encode(Encoding::CP1252).force_encoding(Encoding::UTF_8)
=&gt; "&#x2014;"</pre>
    </aside>
    <p>Having settled on <em>what</em> had happened, the next step was to figure out <em>where</em> it was happening. Was it <span>WordPress</span> that was doing this? The database? A plug-in? This was initially unclear. I'll spare you some of the drama and confirm that the culprit was indeed the database. Getting there, however, was a bit of a challenge. See, <span>MySQL</span> is pretty loosie-goosie about a lot of things, and character sets are but one tiny example. <a href="https://dev.mysql.com/blog-archive/new-defaults-in-mysql-8-0/" rel="dct:references">Up until version 8.0</a>, which shipped in 2018, <span>MySQL</span> had been setting something it calls <code>latin1</code> as its default character set.</p>
    <aside role="note">
      <p>By <code>latin1</code> they actually appear to mean Windows <abbr>CP-1252</abbr>, which is a <em>superset</em> of the <em>real</em> Latin-1, which itself is a synonym for <a href="https://en.wikipedia.org/wiki/ISO-8859-1" rel="dct:references"><abbr>ISO-8859-1</abbr></a>. Like I said, <em>loosie-goosie</em>.</p>
    </aside>
    <p>Historically for <span>MySQL</span>, while the character set was <em>nominally</em> <code>latin1</code>, it didn't especially care what you threw into it. The only time it would matter would be if you were trying to sort text, search within it, or otherwise match pieces of text together. This is where a thing called <a href="https://en.wikipedia.org/wiki/Collation" rel="dct:references"><dfn>collation</dfn></a> would come into play&#x2014;essentially normalization functions that get automatically applied to the comparands that put them in a definite order or increase the likelihood of a match.</p>
    <aside role="note">
      <p>Incidentally, until 8.0, <span>MySQL</span> shipped Swedish as its default collation, it being a Swedish product. Since collation is generally about managing accented letters, this is something you would never notice if your database was exclusively filled with English.</p>
    </aside>
    <p><span class="parenthesis" title="or any SQL database, really, but it's almost always paired with MySQL">Since <span>WordPress</span> effectively uses <span>MySQL</span></span> as little more than bulk storage, and since MySQL doesn't enforce <code>latin1</code> encoding&#x2014;not that there's really anything to enforce&#x2014;<span class="parenthesis" title="or anything else, for that matter">WordPress</span> can just ignore the declared character set and shove the database full of <abbr>UTF-8</abbr>. Fast-forward to this forced-upgrade business on the part of <span>WPEngine</span>, the hosting provider. Did they <em>Do Something</em>&#x2122; to the data when they upgraded? I mean <em>somebody</em> must have, 'cause the data didn't wreck itself.</p>
    <aside role="note">
      <p>This has the smell of an upgrade script that didn't account for a particular edge case. It isn't really worth the candle to figure out whose fault it was, whether <span>WPEngine</span>, <span>WordPress</span>, or <span>MySQL</span> itself. That said, as <span>WPEngine</span> is the proximate service provider, I am not terribly impressed with their attitude toward the mishap.</p>
    </aside>
    <p>You can define the default character set on <span>MySQL</span> at the system level, <span class="parenthesis" title="the system can host more than one">per database</span>, per table in a given database, and per column in a table. In addition to these, you can set the character set as sent from the <em>client</em>, per <em>session</em> as received by the server, and last but not least, the <em>results</em> the server emits back to the client. This last one is what got me.</p>
    <p>A number of the individual <em>columns</em>, particularly those containing the content of posts and comments, were set to <code>latin1</code>, but to select them on the <code>mysql</code> command line, they were showing up in my terminal as <abbr>UTF-8</abbr>. What the hell was happening? Well, turns out the <code>character_set_results</code> variable, despite a wholesale upgrade across the board, was still set to <code>latin1</code>. What this meant was that the <code>mysql</code> command-line client was <em>decoding</em> the <em>spurious</em> <abbr>UTF-8</abbr> content into the <em>actual</em> <abbr>UTF-8</abbr> content, which it thought was <code>latin1</code>. This had the effect of rendering <em>correctly</em> in my <abbr>UTF-8</abbr> terminal, which made absolutely no sense. It wasn't without further probing the documentation that I eventually found and set the variable to <code>utf8</code> and saw the <em>wrong</em> characters in my terminal, that I determined that the wrong character sequences were what were actually stored in the database.</p>
    <aside role="note">
      <p>This episode was further complicated by the fact that Ribbonfarm runs behind a <abbr>CDN</abbr>, so we were having trouble determining if our interventions had fixed the problem and what we were seeing was merely cache. (Sometimes there's an alternate hostname for a site that's unmediated, but if there was one, I didn't know what it was.)</p>
    </aside>
    <p>Satisfied with both the diagnosis and the proposed treatment&#x2014;the bad characters were in the database and the task was to get them out&#x2014;the question moved to how to go about doing that. Since the database user didn't have access to modify the <code>INFORMATION_SCHEMA</code>&#x2014;a sensible mitigation tactic&#x2014;the easiest thing to do to purge those per-column <code>latin1</code> designations was to just <a href="https://www.gnu.org/software/sed/manual/sed.html" rel="dct:references"><code>sed</code></a> them out of the most recent database dump and reload it. <em>Why</em> they were there in the first place, I suspect, is because Ribbonfarm dates back to <time>2007</time>, long before <abbr>UTF-8</abbr> was <span>MySQL</span> policy. Alternatively, they could have been put there during some other past upgrade of one part of the system or another as a prophylaxis <em>against</em> messing up the character set, eventually turning out to have the opposite effect.</p>
    <p>The uncompressed database dump was about 300 megabytes. To complicate matters, <span>MySQL</span> puts its bulk <code>INSERT</code> statements all on one line, and lines of text several megabytes long tend to choke out terminals and text editors. Here is where <a href="https://www.ribbonfarm.com/author/artem/" rel="dct:references">Artem</a> helped with a second set of eyes and hands. If we could just convert the database <em>dump file</em> to <abbr>UTF-8</abbr> using an off-the-shelf program like <a href="https://www.gnu.org/savannah-checkouts/gnu/libiconv/documentation/libiconv-1.17/iconv.1.html" rel="dct:references"><code>iconv</code></a>, we could reimport it wholesale and this would be a done deal; no screwing around with <abbr>SQL</abbr>.</p>
    <p>Not so fast. Turns out that <code>9D</code> is one of the few unassigned bytes in <abbr>CP-1252</abbr> (what <span>MySQL</span> loosely called <code>latin1</code>) and the dump file was full of them (they are, as I described above, the result of double-encoding an em-dash). Nominally, the solution would have been as easy as piping the whole 300MB dump through <code>iconv</code>, but since there was no viable <em>target</em> encoding (<abbr>CP-1252</abbr> being the closest one but crucially missing that <code>9D</code>), <code>iconv</code> was a bust. Here is where <span>Artem</span> tried just writing something to brute-force the conversion, compliance be damned.  This attempt was foiled as well, as some of the <em>comments</em> were neither <abbr>UTF-8</abbr> <em>nor</em> <code>latin1</code>. Not really surprising for a sixteen-year-old blog, come to think of it. So we were back to looking for an <abbr>SQL</abbr> solution.</p>
    <p>After some trial and error, the database eventually yielded to the following incantation:</p>
    <pre style="white-space: pre-wrap">UPDATE `table` SET `column` = CONVERT(CAST(CONVERT(`column` USING latin1) AS binary) USING utf8mb4);</pre>
    <p>&#x2026;where <code>`table`</code> and <code>`column`</code> are supplanted with whatever ones you're trying to fix. There was yet <em>another</em> hiccup in the process due to <span>MySQL</span> having <em>two separate</em> <abbr>UTF-8</abbr> encodings, the distinction between which didn't become apparent until we tried to convert a comment containing an emoji: &#x1F607;. On <span>MySQL's</span> long road to Unicode compliance, they made a pit stop with a flavour of <abbr>UTF-8</abbr> encoding that maxed out at three bytes per character (which they call <code>utf8mb3</code>), and emojis take <em>four</em> bytes to store. So now the Ribbonfarm database is 100% <code>utf8mb<em>4</em></code>, and this species of problem should never happen again.</p>
    <aside role="note">
      <p>That <abbr>SQL</abbr> command can of course be adjusted with a <code>WHERE</code> clause to do things like steer around errant spam comments that are encoded in who-knows-what, and brand-new posts which are correctly stored in <abbr>UTF-8</abbr>. I should also add that MySQL has an <a href="https://dev.mysql.com/doc/refman/8.0/en/alter-table.html#alter-table-character-set" rel="dct:references"><code>ALTER TABLE `table` CONVERT TO CHARACTER SET `charset`</code></a> command, but since it was already confused, I didn't want to chance it.</p>
    </aside>
    <section>
      <header>
        <h2>Coda: Why do websites even use <abbr>SQL</abbr>, anyway?</h2>
        <h3>&#x2026;and why <span>MySQL</span> in particular?</h3>
      </header>
      <p>The pattern of backing a website onto an <abbr>SQL</abbr> database is not immediately obvious. Rather, it's a relationship of convenience. <abbr>SQL</abbr> is <em>one</em> way to address complex, fine-grained data structures, and store them such that they persist. Every other strategy available at the time&#x2014;the <time datetime="1994">mid-1990s</time>, that is&#x2014;involved trading off either the addressability or complexity of the structure, or the ability for two or more requests to write at the same time (still a constraint with <a href="https://en.wikipedia.org/wiki/SQLite" rel="dct:references">SQLite</a>, which powers many single-user desktop and mobile apps, and is actually quite a bit younger than <span>MySQL</span>). <abbr>SQL</abbr> databases also had built-in data type serializations that developers would otherwise have had to concoct by hand.</p>
      <p>Now, <abbr>SQL</abbr> databases were typically expensive things&#x2014;<a href="https://en.wikipedia.org/wiki/Oracle_database" rel="dct:references">Oracle</a>, <a href="https://en.wikipedia.org/wiki/Sybase" rel="dct:references">Sybase</a>/<a href="https://en.wikipedia.org/wiki/Microsoft_SQL_Server" rel="dct:references">Microsoft</a>, <a href="https://en.wikipedia.org/wiki/DB2" rel="dct:references">IBM DB2</a>. They were made to run on mainframes&#x2014;to power banks, government agencies, that kind of thing. The free alternative was a thing called <a href="https://en.wikipedia.org/wiki/MSQL" rel="dct:references">mSQL</a> (for <q>mini</q>), which still exists. <a href="https://en.wikipedia.org/wiki/MySQL#History" rel="dct:references">MySQL was started shortly after as a personal project</a>, its author citing a dissatisfaction with the performance of <span>mSQL</span>. Attention among Web developers quickly shifted to <span>MySQL</span>, making it effectively the only game in town for a while.</p>
      <p>The <span>MySQL</span> developers have always prioritized <em>speed</em>&#x2014;as in runtime performance&#x2014;often at the expense of correctness. For a long time, it infamously didn't have <em>transactions</em>&#x2014;a dealbreaker for anybody who was serious about their database platform. Transactions weren't even available on an <em>experimental</em> basis until <time>2001</time>, and not the default until <time>2010</time>. Other essential features like foreign key constraints didn't show up in <span>MySQL</span> until well into its career. <a href="https://www.postgresql.org/" rel="dct:references">PostgreSQL</a>, a couple years younger than <span>MySQL</span> and its chief competitor, prioritized <em>correctness</em>, and so got a reputation for being relatively sluggish.</p>
      <aside role="note">
        <p><span>PostgreSQL</span> caught up speedwise with <span>MySQL</span> around <time>2006</time>. <span>MySQL</span>, on the other hand, has accumulated so many things that depend on it behaving <em>in</em>correctly, it has painted itself into a corner.</p>
      </aside>
      <p>In many ways, the prevailing <em>style</em> of Web development has coevolved, not just with <abbr>SQL</abbr>, but with <span>MySQL</span> in particular. The absence of composite keys meant auto-incrementing serial numbers&#x2014;that were meaningless except within the database itself&#x2014;would show up in public-facing places like <abbr>URLs</abbr>, which would vanish when those serial numbers were reinitialized. Stitching <abbr>SQL</abbr> queries together gave way to libraries that availed themselves of placeholders and were thus more robust against <a href="https://xkcd.com/327/" rel="dct:references">injection attacks</a>. These were further expanded into <a href="https://en.wikipedia.org/wiki/Object-relational_mapper" rel="dct:references">object-relational mappers</a> and related infrastructure, which caused the field to switch from a <q>callback</q> paradigm&#x2014;which is central to the design of <abbr>PHP</abbr>, and if I understand correctly, still <span>WordPress</span>&#x2014;to <a href="https://en.wikipedia.org/wiki/Model_view_controller" rel="dct:references">model-view-controller</a>, with frameworks like <a href="https://rubyonrails.org/" rel="dct:references">Ruby on Rails</a>. This newer-generation middleware, in addition to eliminating much of the labour of safely and correctly constructing complex <abbr>SQL</abbr> queries, also spackled over most of <span>MySQL's</span> many shortcomings. It didn't matter, for example, that <span>MySQL's</span> date and time arithmetic is and always has been hot garbage, because the <abbr>ORM</abbr> would handle that. <a href="https://en.wikipedia.org/wiki/Database_trigger" rel="dct:references">Triggers</a> weren't necessary, because the <abbr>ORM</abbr> would handle that. The net effect is an <abbr>SQL</abbr> database that is calibrated for storing Web content, and very little else.</p>
      <aside role="note">
        <p>The alternative is to design the system so all the data integrity logic lives fully inside the database, so it doesn't matter whether a particular Web stack is running on top of it, or if it even interfaces with a website at all. That is, more like a conventional, pre-Web, general-purpose <abbr>SQL</abbr> database. You can pull this off with <span>PostgreSQL</span> (and I have), or maybe <span>Oracle</span> (though it will be excruciating) or <span>Microsoft</span>, but with <span>MySQL's</span> anemic support for procedural triggers, you will almost certainly hit a wall.</p>
      </aside>
      <p>If all you're doing with your database is serving Web pages, <abbr>SQL</abbr> is actually pretty clunky. For starters, <abbr>SQL</abbr> syntax is <em>super</em> verbose and generating it without using an <abbr>ORM</abbr> is crazy talk. Moreover, since a page is essentially a tree-like structure while <abbr>SQL</abbr> speaks in grids, you have a performance trade-off to make, between multiple round trips to the database server, and pulling many redundant results over the wire, which you promptly throw away. This has led to all kinds of increasingly baroque caching infrastructure&#x2014;cobbling together results in the application layer that would normally have been done in the database with a series of joins. This in turn led some to interrogate the wisdom of using <abbr>SQL</abbr> databases when they didn't really actually <em>use</em> the <abbr>SQL</abbr> part&#x2014;say, for business intelligence&#x2014;thus spawning the <a href="https://en.wikipedia.org/wiki/NoSQL" rel="dct:references">NoSQL</a> movement.</p>
      <p>No sooner had that happened, that people began looking to put back what <abbr>SQL</abbr> had been giving them for free. <abbr>NoSQL</abbr> basically put Web development right back where it was in <time>1994</time>, except with faster hardware and marginally better data hygiene. Completely lacking were any standards (with the possible exceptions of <abbr>HTTP</abbr> for the transport and <abbr>JSON</abbr> for what was being transported), so those started getting built back up (e.g., in the form of <a href="https://en.wikipedia.org/wiki/GraphQL" rel="dct:references">GraphQL</a>). To put it in perspective, <a href="https://en.wikipedia.org/wiki/MapReduce" rel="dct:references">Map&#xAD;Reduce</a>&#x2014;which has since receded into just another boring, utilitarian data processing task&#x2014;was still very much a hot topic. Now the state of the art is an Ourobouros of <a href="https://en.wikipedia.org/wiki/AWS_Lambda" rel="dct:references">lambdas</a> running <a href="https://en.wikipedia.org/wiki/Microservices" rel="dct:references">microservices</a> that interface over Web <abbr>APIs</abbr>. That this situation is any better is far from a self-evident proposition.</p>
      <aside role="note">
        <p>My own recent projects&#x2014;where <q>recent</q> is defined as the last ten years or so&#x2014;use an <a href="https://www.w3.org/TR/rdf12-concepts/" rel="dct:references"><abbr>RDF</abbr></a> graph database, also known as a <a href="https://en.wikipedia.org/wiki/Triplestore">triple store</a>. This is a completely different way to write Web-based software. <a href="https://www.w3.org/TR/sparql11-query/" rel="dct:references">It doesn't use <abbr>SQL</abbr></a>, but it's also not self-consciously <em>trying</em> not to use it. What it <em>does</em> have in common with the <abbr>NoSQL</abbr> movement is the data over which you operate is already much closer to the shape you ultimately need it in, so you spend a lot less effort restructuring it. My current project, a thing called <a href="https://intertwingler.net/" rel="dct:references">Intertwingler</a>, is a piece of website-making infrastructure designed to manage huge quantities of very small objects, densely connected to one another by a thicket of typed, bidirectional links. It's the kind of thing that would be prohibitive to write on top of <abbr>SQL</abbr>, or even using a conventional <abbr>NoSQL</abbr> strategy.</p>
      </aside>
      <p>In any case, the <abbr>SQL</abbr>-backed website has been a pretty hard thing to kill. Despite the fact that <span>WordPress</span>&#x2014;whose architecture ossified twenty years ago&#x2014;is starting to look like an antique, <span class="parenthesis" title="notwithstanding the occasional failure mode you won't be able to fix">on its face</span> it's a <a href="https://en.wikipedia.org/wiki/Satisficing">good-enough</a> strategy for small to medium systems that aren't trying to stray too far from convention. There is a deep well of both expertise and resources to draw on&#x2014;probably deeper than anything else in the industry. Even the <abbr>MVC</abbr> frameworks that constitute the generation that came immediately after <span>WordPress</span> are similar enough to each other that if you've seen one of them, you've more or less seen them all. There's even an argument to be made that since <em>scale</em> is a problem any product developer would love to have, an exotic, novel design is liable to bite you in the ass, and so <abbr>MVC</abbr> running on <abbr>SQL</abbr> is a plenty sensible way to to get started. I, however, have had my fill.</p>
    </section>
  </body>
</html>
