Hephaestos' Curse

Pardon the inside baseball, but I'm actually pretty pissed about this.

It turns out that the people behind Google Chrome intend to nuke an important but contentious piece of functionality for the Web. XSLT, which stands for Extensible Stylesheet Language Transformations, is a standard language for schlepping markup that has been part of Web browsers since the otherwise unremarkable Microsoft Internet Explorer 5.5. I know, because I can remember the room I was sitting in when I first started tinkering with it. In two thousand and one.

XSLT was a major part of my first major tech job. I used it to power a publication pipeline that I designed and implemented at my employer from 2002 through 2005. Using it, it was possible to manage over 120 websites in 15 different languages, with a team of four people (five including me). Its job was to take the raw semantic content and slap on the presentation layer, something XSLT excels at. Indeed, I've been using it this way on my own Web properties since 2007, to transform (X)HTML into itself. Having XSLT in the browser is especially convenient for knocking out extremely lazy static websites, because it does everything you'd want in a templating language, but built-in with no additional moving parts. In fact, I am inclined to say that it got just about everything important right.

Now, I say XSLT is contentious because it was undoubtedly roped into Web browsers during the XML fever of the Y2K era. If you're not familiar, XML is what I would describe as a framework for representing data structures as files or network messages, with a strong affinity toward document-shaped structures. The idea was, at the time, that HTML—the language for representing Web pages—would become XHTML: one of many possible schemas, that could be trunked through a unified parsing infrastructure.

The problem with XML, aside from the fact that it is a pain in the ass to actually physically type, is that it is extremely, needlessly strict. The parser has been specified in the standard to throw an unrecoverable error unless everything is perfect. It is on you, the author, to comply. Coming to XML from HTML, which will doggedly produce some value of works for all but the sloppiest handiwork, is nothing short of jarring. Developers come out of the womb hating it; the trauma is generational. XSLT is XML, and is meant to operate over XML (including XHTML), so you can imagine the kind of reception it gets among mainstream Web developers.
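If you want to see the difference in temperament for yourself, here is a minimal sketch using nothing but Python's standard library. Treat it as an illustration rather than a faithful model of a browser: `html.parser` is only a lenient tokenizer standing in for a real HTML parser, and the names `SLOPPY`, `TagCollector`, `parse_as_xml`, and `parse_as_html` are ones I made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Sloppy markup: the <b> element is never closed before </p> arrives.
SLOPPY = "<p>Hello <b>world</p>"


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_as_xml(markup):
    """Return True if the markup is well-formed XML, False if the parser bails."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # The XML parser treats a mismatched tag as a fatal error.
        return False


def parse_as_html(markup):
    """Return the start tags seen; the HTML tokenizer does not raise on sloppy input."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

Feed `SLOPPY` to `parse_as_xml` and the XML parser refuses outright; feed it to `parse_as_html` and you still get the tags back, unclosed `<b>` and all.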

So that's the backdrop. The precipitating event, ostensibly, is that earlier this year, a security researcher at Google Project Zero (i.e., same company) turned his attention to XSLT in the browser because of his personal proclivity for finding bugs in obscure, forgotten subsystems. Not a bad place to look! Unsurprisingly, what he found was a pile of zero-day✱.

Now, it turns out that a bunch of these bugs happen to exist in a 25-year-old pair of software libraries called libxml2 and libxslt. The raw age of this software isn't so much the issue as the fact that, like many things open-source, it's a hobby project that woke up one morning and discovered that it was load-bearing. The other complicating matter is that due to being hobby software with no real resources behind it, libxslt only supports XSLT 1.0, which was standardized in 1999. These facts put together provide the basis for Google's argument to cut XSLT loose: it's old, it's full of serious security bugs (at least, the implementation of it we happen to use), and almost nothing uses it.

The problem with this argument is that it's disingenuous. While both Chrome and Safari rely on this software, in which the security bugs are real and very serious, and which its own maintainer considers unfit for purpose, the claim that XSLT is old and nothing uses it is misleading at best. For one, XSLT has vibrant, ongoing support in the publishing industry. Far from being abandoned, it was updated to version 2.0 in 2007 and 3.0 in 2017, with the editor's draft of XSLT 4.0 shipping just last week. Not one, but two new implementations have recently been authored in the language Rust✱. It's the browsers who haven't kept up.

Why haven't the browsers kept up with this specification in 25 years? I suspect the reason is simple enough: the people behind the essential standard that defines the Web—that is, HTML—hate anything to do with XML and want it to go away. Indeed, the entire reason why the WHATWG even exists in the first place is, among related things, a disagreement over the extent to which XML belongs on the Web.

Unlike the W3C, which has a much greater diversity of business entities represented in its membership (as anybody who can afford to pay the dues can join), the WHATWG consists only of developers of browser engines. There are currently four such entities in existence, and three of them are trillion-dollar corporations.