Once upon a time, if you wanted to do anything on the Web past serving files, you had to write your own Web server, or hack the source of somebody else's. Since this would have been tremendously inconvenient, the major Web server vendors devised APIs to make their products more extensible. Developing dynamic Web sites with these, however, would have been almost as slow: programming in C or C++, recompiling, stopping and restarting the server, and of course, extremely tedious memory debugging of a potentially multithreaded program. It hurts to even imagine.
At some point, some clever people must have considered this situation to demand way too much effort for trivial goals, and created a way for what would eventually become Web applications to run as simple, purpose-built, text-based programs. These programs would exist as files on the Web server, often sitting among plain files. When a browser requested one, the server would run it in a pipeline, feeding it the client's request through environment variables and standard input, and relaying its standard output back as the response. This was known as the Common Gateway Interface, and a side benefit was that as long as you conformed to the pattern, you could use any language you wanted, even sloppy two-minute shell scripts. The king of the CGI script, however, was Perl.
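The whole contract fits in a few lines. Here is a minimal sketch of a CGI-style program in Python (the handler name and the greeting are my own invention, not part of the CGI specification): the server sets the environment, runs the file, and pipes its standard output back to the client.

```python
import os
import sys

def handle_request(environ, out):
    """A CGI program reads the request from environment variables
    (and standard input, for POST bodies) and writes an HTTP
    response to standard output."""
    name = environ.get("QUERY_STRING", "") or "world"
    body = "<html><body>Hello, %s!</body></html>" % name
    out.write("Content-Type: text/html\r\n")
    out.write("\r\n")  # blank line separates headers from body
    out.write(body)

if __name__ == "__main__":
    # The Web server sets variables like QUERY_STRING and
    # REQUEST_METHOD before running this file.
    handle_request(os.environ, sys.stdout)
```

Anything that honours this contract, in any language, is a valid CGI program, which is exactly why the two-minute shell script was possible.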
As Web sites grew more sophisticated, people noticed that they were copying and pasting a lot of the same content. Maintaining this content would have been a pain — constantly hunting down broken links, typos and minor incongruities. The solution came with the idea of server-side includes, which enabled authors to call out to other files on the Web server. You could keep chunks of your content in a single authoritative place and just refer to them. Moreover, you could reference CGI programs and weave their output directly into your HTML. This was a particular bonus, as anybody who has hand-rolled CGI applications can tell you.
The natural progression of this paradigm was to plant chunks of executable code directly into HTML files, and run them through a special interpreter. PHP, ASP, JSP and eRuby are among the popular implementations still in use today.
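The trick these implementations share can be caricatured in a few lines: scan the file for delimited code chunks, execute each one, and splice its output into the surrounding markup. A toy sketch in Python, using an invented `<?py ... ?>` delimiter rather than any real product's syntax:

```python
import io
import re
from contextlib import redirect_stdout

def render(template):
    """Run each <?py ... ?> chunk through the interpreter and
    splice whatever it prints into the surrounding HTML."""
    def run(match):
        buf = io.StringIO()
        with redirect_stdout(buf):
            exec(match.group(1))  # toy only: never exec untrusted input
        return buf.getvalue()
    return re.sub(r'<\?py(.*?)\?>', run, template, flags=re.DOTALL)

page = "<ul><?py\nfor i in range(3): print('<li>%d</li>' % i)\n?></ul>"
print(render(page))  # the code chunks vanish, leaving plain HTML
```

The author writes mostly HTML and drops into code only where needed, inverting the hand-rolled CGI arrangement of writing mostly code that prints HTML.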
While the advent of CGI enabled tremendous innovation at a trifling cost, it wasn't the only part of the Web's conceptual infrastructure to evolve. For those who insisted on spending huge sums of money on IT projects, there were enterprise solutions to match. At around the same time the paradigm of embedding code in HTML was evolving, so was that of the server API, in the form of the application server.
An application server was initially conceived as its own machine or cluster thereof, which sat in between a regular Web server and a database server — again, or clusters thereof. Expensive. The idea was that the regular content servers would respond to the highest volume of static requests while forwarding dynamic requests to the application server, which would be fewer in number but more resource-intensive. This would theoretically enable the application to scale quickly and transparently to its users, while avoiding slow and costly engineering considerations.
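In caricature, the front tier's job was a single routing decision per request. A sketch, with hypothetical path rules standing in for a real server configuration:

```python
def dispatch(path):
    """Route a request: content servers absorb the high volume of
    static traffic; anything dynamic is forwarded to the
    application tier, which is fewer machines doing more work."""
    static_suffixes = (".html", ".css", ".png", ".jpg")
    if path.endswith(static_suffixes):
        return "content-server"
    return "application-server"

print(dispatch("/logo.png"))        # stays on the front tier
print(dispatch("/cart/checkout"))   # forwarded to the expensive tier
```

Scaling then means adding machines to whichever tier is saturated, without the clients ever noticing.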
In this configuration, it was inconceivable to go mincing around with individual files and trying to keep them in sync across multiple application servers. What it demanded was that an entire Web application be crammed into a monolithic bundle so it could be pushed to the cluster in one piece. At the same time, the files in the bundle didn't necessarily correspond directly to pages on the Web site; instead, pages would be generated on the fly, just as with the server APIs. The most prominent example of these bundles is probably the Java servlet, for which just about every major software corporation, bridge club and corner store has its own brand of application server. Except Microsoft, of course, as they have ASP.NET, which is their proprietary version of basically the same thing.
As the patterns of embedded code and application servers matured, Web technology began to get down like an Ozark family reunion. Domain-specific languages intended to generate tables and populate forms in HTML sprouted legs. Web frameworks now ship with their own embedded servers for testing, or are themselves Web servers designed from the ground up. The distinct roles of content and application servers began to merge back together, and both can now be leased at affordable rates based on storage, traffic and computation time. Possibly most remarkable is the sheer bulk of application state, presentation logic and even business logic that has migrated wholesale to the browser, or is offered as a third-party widget, possibly under some kind of freemium business model.
We find ourselves now floating in a sea of products, at all points along what is known as a stack. The stack starts at the operating system and/or virtual machine, then proceeds to Web server, data store, application programming language, application framework, and finally content management system, under which I would include blog, wiki and social media platforms. Pick a product anywhere in this stack and you'll find that it overlaps heavily with just about every other product in its category, and the few unique behaviours it does offer, it offers in a surprisingly incompatible way. Mixing and matching small chunks of functionality is not economical, so you write your own; there is still no such thing as Web Lego. There has been some respite from this incompatibility with the advent of Web services and the JavaScript objects which venture past blogger-bling sophistication, though they create their own menagerie of problems.
Utility computing is great because it amortizes big capital costs into small operating expenses, but can you run an Amazon EC2 application on Joyent or Google App Engine? Can search engines index your third-party blog comments, and most importantly, associate them with your site? Can one widget interact intelligently with another if both are on your page? Would you want them to? Are you okay with not having physical possession of your data? If one of these providers disappeared overnight, would you be left in the lurch? If they did, who would find out first? You, or your customers?
Oh, and how about the dozens of accounts everybody ends up registering to interact with all of these products, as users, site owners and developers? Federated identity management? Getting there, but still a way off.
The interesting thing about all this recent development on the Web is that it really tangles up business relationships with technical ones. It then takes that mess and exposes it for all the world to see. This means that your relationship with your customers can be affected all too easily by someone else's relationship with one of your suppliers. I suppose this has always been the case, but not to the extent of depending on two dozen different companies to make your site work as intended, every time somebody loads a page. Don't get me wrong, I think interdependence is a good thing, but what we currently have is just dependence. I want my systems to be a bit more robust than to be subject to the caprice of any particular vendor, or a vendor's vendor, or some random basement-dweller. Moreover, if there's a problem with any one of them, I want to know about it first.