<?xml version="1.0"?>
<?xml-stylesheet href="/transform" type="text/xsl"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:bs="http://purl.org/ontology/bibo/status/" xmlns:ci="https://vocab.methodandstructure.com/content-inventory#" xmlns:dct="http://purl.org/dc/terms/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:schema="https://schema.org/" xmlns:xhv="http://www.w3.org/1999/xhtml/vocab#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" lang="en" prefix="bibo: http://purl.org/ontology/bibo/ bs: http://purl.org/ontology/bibo/status/ ci: https://vocab.methodandstructure.com/content-inventory# dct: http://purl.org/dc/terms/ foaf: http://xmlns.com/foaf/0.1/ rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# schema: https://schema.org/ xhv: http://www.w3.org/1999/xhtml/vocab# xsd: http://www.w3.org/2001/XMLSchema#" vocab="http://www.w3.org/1999/xhtml/vocab#" xml:lang="en">
  <head>
    <title property="dct:title">Google's Long March</title>
    <base href="https://doriantaylor.com/googles-long-march"/>
    <link href="document-stats#EZs3Zssh_jFtCv-eebMmrK" rev="ci:document"/>
    <link href="elsewhere" rel="alternate bookmark" title="Elsewhere"/>
    <link href="this-site" rel="alternate index" title="This Site"/>
    <link href="http://purl.org/ontology/bibo/status/published" rel="bibo:status"/>
    <link href="" rel="ci:canonical" title="Google's Long March"/>
    <link href="person/dorian-taylor#me" rel="dct:creator" title="Dorian Taylor"/>
    <link href="person/dorian-taylor" rel="meta" title="Who I Am"/>
    <link about="./" href="3f36c30c-6096-454a-8a22-c062100ae41f" rel="alternate" type="application/atom+xml"/>
    <link about="./" href="f07f5044-01bc-472d-9079-9b07771b731c" rel="alternate" type="application/atom+xml"/>
    <link about="./" href="this-site" rel="alternate"/>
    <link about="./" href="elsewhere" rel="alternate"/>
    <link about="./" href="e341ca62-0387-4cea-b69a-cdabc7656871" rel="alternate" type="application/atom+xml"/>
    <link about="verso/" href="3f36c30c-6096-454a-8a22-c062100ae41f" rel="alternate" type="application/atom+xml"/>
    <link about="verso/" href="this-site" rel="alternate"/>
    <link about="verso/" href="elsewhere" rel="alternate"/>
    <meta content="A thought experiment in walling off the open Web. This article picks on Google (Alphabet, whatever), but I could probably write one of these for each of them." name="description" property="dct:abstract"/>
    <meta content="2019-02-10T21:57:11+00:00" datatype="xsd:dateTime" property="dct:created"/>
    <meta content="2019-03-11T08:14:10+00:00" datatype="xsd:dateTime" property="dct:modified"/>
    <meta content="2022-05-31T15:10:50+00:00" datatype="xsd:dateTime" property="dct:modified"/>
    <meta about="person/dorian-taylor#me" content="Dorian Taylor" name="author" property="foaf:name"/>
    <meta content="summary" name="twitter:card"/>
    <meta content="@doriantaylor" name="twitter:site"/>
    <meta content="Google's Long March" name="twitter:title"/>
    <meta content="A thought experiment in walling off the open Web. This article picks on Google (Alphabet, whatever), but I could probably write one of these for each of them." name="twitter:description"/>
    <object>
      <nav>
        <ul>
          <li>
            <a href="document-stats#EZs3Zssh_jFtCv-eebMmrK" rev="ci:document" typeof="qb:Observation">
              <span>urn:uuid:66cdd9b2-c87f-48c5-ab42-bfe79e6cc9ab</span>
            </a>
          </li>
        </ul>
      </nav>
    </object>
  </head>
  <body about="" id="ECPjl3fAN1kzfvbDZia8nI" typeof="bibo:Article">
  <p>Google's <a href="https://about.google/" rel="dct:references">stated mission</a> has long been to <q>organize the world's information</q>. What they <em>mean</em> is that they want to <em>intermediate</em> the world's information: to insinuate themselves between you and the knowledge you need to conduct your life. This other meaning manifests partly through their own designs, but also through the way they have responded to events outside their immediate control.</p>
  <section id="EVY_bqJgnF7RBXEJfcZb5I">
    <h2>Mise en Sc&#xE8;ne</h2>
    <p>We have to begin somewhere, so it might as well be with the <dfn>JavaScript</dfn> engine <a href="https://v8.dev/" rel="dct:references"><dfn>V8</dfn></a>, which was developed in parallel to the initial release of Chrome. While it is desirable to have a fast Web browser, we must remember that a major cost centre at Google is crawling Web pages. Consider that at the time, the use of <abbr>AJAX</abbr>&#x2014;<dfn>Asynchronous JavaScript and <abbr>XML</abbr></dfn>&#x2014;had proliferated in the so-called <dfn>Web 2.0</dfn> development style.</p>
    <aside role="note" id="EamP7-2oGZGyY6nS6v0aWI">
      <p>Germane to this story: <dfn>V8</dfn> now also runs <a href="https://webassembly.org/" rel="dct:references"><dfn>WebAssembly</dfn></a>.</p>
    </aside>
    <p>Such Web properties must have cost orders of magnitude more per page to crawl than conventional pages, because rather than simply scanning a plain document for text, you would have to set up a virtual machine and <em>execute a program</em>, obtain its result, and scan <em>that</em>. Moreover, because <dfn>JavaScript</dfn> is <a href="https://en.wikipedia.org/wiki/Turing_completeness" rel="dct:references"><dfn>Turing-complete</dfn></a>, you need to take steps to protect the rest of the system from runaway processes and malicious code. No doubt a lot of this was one-time setup cost, and Google is really good at it by now, but a fast <dfn>JavaScript</dfn> engine is at least as important on the crawler side as it is in an end-user product.</p>
  </section>
  <section id="EuLqzUYaPpvxjymvibyQ0I">
    <h3>Commercial Diplomacy</h3>
    <blockquote class="quote" rel="dct:hasPart" resource="ni:/sha-256;Uq-l9YWHQ8zz9PY47Y0MH655rxGn3DQvI4LIV3Opiio" typeof="bibo:Excerpt" id="EjGxsd7guBul9SHUS_oazL">
      <p property="rdf:value">Technology standardization is commercial diplomacy and the purpose of individual players (as with all diplomats) is to expand one's area of economic influence while defending sovereign territory.</p>
      <div><a rev="dct:hasPart" href="https://stephesblog.blogs.com/papers/stdsprimer.pdf" rel="dct:references"><cite property="dct:creator">Stephen R. Walli</cite>, <span property="dct:title">Understanding Technology Standardization Efforts</span></a></div>
    </blockquote>
    <p>The installation of a Google employee, Ian Hickson, as the editor of the <abbr>HTML5</abbr> specification is suspect. Hickson's stated position on structured data has been remarkably congruent to the interests of his employer: that there was no point to it, because so much <em>un</em>structured data already exists that would have to be organized by artificial intelligence anyway. His actions as specification editor reflected as much: open hostility toward <abbr>RDFa</abbr>, and only so much interest in semantic markup as is conducive to current practices in search engine optimization or client-side application development.</p>
    <aside role="note" id="EB2oLJ0KIgMqdlMqILsMWI">
      <p><a href="https://homepages.cwi.nl/~steven/Talks/2018/07-31-balisage/" rel="dct:references">A notable criticism of <abbr>HTML5</abbr></a> is that rather than cataloguing a set of terse, declarative prescriptions and proscriptions about what is and is not valid syntax, the <abbr>HTML5</abbr> spec tells you, in exhaustive detail, the algorithm for parsing it. Not only does this make the spec longer&#x2014;about <var>1100</var> pages when printed&#x2014;it is effectively a computer program in its own right: a program which is executed by programmers.</p>
    </aside>
    <p><time datetime="2009">Ten years ago</time>, Google was keen on <a rel="dct:references" href="https://webmasters.googleblog.com/2009/05/introducing-rich-snippets.html"><q>rich snippets</q></a>, or the use of semantic metadata it found on the Web to embellish its own search result pages&#x2014;with photos, ratings, and the like. <time datetime="2011">Not long after</time>, having digested the experience, Google teamed up with Microsoft and Yahoo to create a one-stop simplified ontology, purpose-built for search engine optimization, known as <a rel="dct:references" href="https://schema.org/"><dfn>schema.org</dfn></a>. This ontology, while theoretically expressible as <abbr>RDF</abbr>, discards a lot of its features. You would have to perform major surgery on any <dfn>schema.org</dfn> content you consume in order to use it with other <abbr>RDF</abbr> data. In other words, the principal beneficiaries of producing <dfn>schema.org</dfn> metacontent are none other than the consortium's proponents.</p>
    <aside role="note" id="ExhXmlKY3r495G9am7CTDJ">
      <p>This is not to say that Google has lost interest in information systems that achieve the same objectives as the <dfn>Semantic Web</dfn>: they just want to keep it for themselves. In <time>2010</time>, Google acquired <a rel="dct:references" href="https://en.wikipedia.org/wiki/Metaweb">Metaweb</a> and its flagship product <a rel="dct:references" href="https://en.wikipedia.org/wiki/Freebase">Freebase</a>, which it has since developed into <a rel="dct:references" href="https://developers.google.com/knowledge-graph/">Knowledge Graph</a>, which powers both Web search results and any product that responds to <q>OK, Google</q>.</p>
    </aside>
    <p><a rel="dct:references" href="https://www.wired.com/story/google-chrome-kill-url-first-steps/">Google has also announced</a> that it wishes to <q>kill the <abbr>URL</abbr></q>, on the grounds that <abbr>URLs</abbr> can be spoofed, that they confuse people, that they are long and ugly, and that mobile screens are too small to fit them. Specifically, they want to de-emphasize the <abbr>URL</abbr> as both a signal of authority over content and of orienteering for the user. Taken together with their <a rel="dct:references" href="https://www.ampproject.org/"><dfn>Accelerated Mobile Pages</dfn> initiative</a>, this proposition is somewhat disturbing. This is because the <q>acceleration</q> of <abbr>AMP</abbr> is provided by Google, in this case by literally intermediating your requested content through their cache infrastructure. So if you have removed <abbr>URLs</abbr> from the user interface, and you're serving the content yourself, what's stopping you from pulling out of the Web entirely?</p>
    <p>As the proprietor of YouTube, Google has an interest in, and is a chief proponent of, the <a rel="dct:references" href="https://www.w3.org/TR/encrypted-media/"><dfn>Encrypted Media Extensions</dfn></a> specification from the <abbr>W3C</abbr>. This controversial standard is considered to be a compromise on the part of the <abbr>W3C</abbr>, with the argument that it is better to <em>standardize</em> <abbr>DRM</abbr> than it is to have a zillion competing proprietary systems and potentially another browser war. Whether you buy that position or not, the standard <em>has</em> been implemented, though it currently only covers audiovisual content.</p>
  </section>
  <section id="EmHXXGZchm0FK_qanu1nPJ">
    <h2>The Coming Executable Web</h2>
    <p>This situation gets more interesting when you consider that <dfn>WebAssembly</dfn> is quickly becoming a viable mechanism for delivering executable code: <abbr>Wasm</abbr> can be understood as a compilation target for basically <em>any</em> programming language, not just JavaScript. Your Web browser becomes a virtual machine that runs apps which are just as good as native code.</p>
    <aside role="note" id="EHPTFXs0UhLeVDszM1MshL">
      <p>In other words, this is what Sun tried to do with Java applets circa 1995, and Microsoft very shortly after with ActiveX.</p>
      <p>It is also worth mentioning that this state of affairs addresses Alan Kay's early and vociferous complaint about the Web: that browsers ought to just be virtual machines that download apps off the network and execute them. While it appears he is getting his way, there is still demonstrable value in having the bulk of the content not only be amenable to <q>view source</q>&#x2014;a state which Kay would agree with, considering it was an essential feature of <dfn>Smalltalk</dfn>&#x2014;but also be deterministic and declaratively defined.</p>
    </aside>
    <p>One of the fundamental characteristics of the World-Wide Web, one that has been around since the very beginning, is that you can simply <q>view source</q> on any aspect of it, and from there, if you're persevering, you can learn how any particular effect is achieved. We are <em>already</em> at the point, with <span class="parenthesis" title="or whatever it's called these days"><abbr>AJAX</abbr></span>, and <q>minified</q> JavaScript frameworks, and all manner of obfuscation and voodoo, where <q>view source</q> doesn't cash out the way it used to, but it at least <em>can</em> be understood with enough elbow grease. Even compared to this, <abbr>Wasm</abbr> is completely unintelligible.</p>
    <aside role="note" id="Es10EEtlPgTnkIZHtdZZAJ">
      <p><a rel="dct:references" href="https://rachelandrew.co.uk/archives/2019/01/30/html-css-and-our-vanishing-industry-entry-points/">A recent article by Rachel Andrew</a> repudiates the ongoing derision by the programming community of declarative data formats like <abbr>HTML</abbr> and <abbr>CSS</abbr> as <span class="parenthesis" title="and possibly misogynist">not only elitist</span>, but an assault on a legitimate entry point into the tech industry. By being declarative and deterministic, and <span class="parenthesis" title="as opposed to minified/obfuscated JavaScript or Wasm bytecode">rendered in ordinary plain text</span>, <abbr>HTML</abbr> and <abbr>CSS</abbr> are amenable to static analysis, which is likely why they are not considered <q>real programming</q> by <q>real</q> programmers. This one-to-one correspondence of symbols in the source to elements on the screen, however, makes these formats especially easy to <em>learn</em>. Moreover, speaking as somebody who has been working with the medium since <time>1995</time>, they are all you <em>need</em> to learn in order to achieve a great many useful results in an open Web.</p>
      <p>Indeed, an overlooked aspect of what is called <dfn>progressive enhancement</dfn>&#x2014;deploying capabilities in layers so that <em>some</em> functionality is always guaranteed to work&#x2014;is that you're signalling to your users that you aren't up to any funny business. The alternative, and unfortunately the norm, is to say <q>I won't let you see my content unless you execute my code; you'll just have to trust me.</q></p>
    </aside>
    <p>Now, take this state of affairs and apply <abbr>EME</abbr> to it. Imagine a few years down the road it seems expedient to take the infrastructure which is already present and ubiquitous, and create Version 1.1 of <dfn>Encrypted Media Extensions</dfn>, which also applies to code. Not only, at this point, has <q>view source</q> been completely obliterated, but it would actually be a <em>crime</em>&#x2014;at least in the United States&#x2014;to try to do what has been taken for granted for the last <strong>30 years</strong>.</p>
  </section>
  <p>The Web seems to be inexorably marching toward becoming a programmer's medium. If you try to load a web page with <dfn>JavaScript</dfn> disabled, it's an even bet you will see a blank screen. Turn <dfn>JavaScript</dfn> back on, and you'll see something that looks not much different from something you might have seen <time datetime="2004">15 years ago</time>, except under the hood it's gobbledygook.</p>
  <p>A Web page that executes binary code in a <abbr>VM</abbr>, in a browser that does not expose the <abbr>URL</abbr>, that sources content from a centralized server farm, is not a Web page, it's an <em>app</em>. Whether by grand scheme or by instinct, Google is quietly moving its pieces into position to control that ecosystem.</p>
  <p>This is not a conspiracy theory as much as it is an <em>opportunism</em> theory: if Google acts in its own interest, it will find itself in a position to continue to act in its own interest. The side effect will be the shrinkwrapping, and concomitant suffocation, of the open Web.</p>
  
</body>
</html>
