I don't really do projects. I know it says projects at the bottom of the screen but if you go there it's just a total head-fake. That's because what I typically do is more like agglomerates of little project-lings without the big contiguous membrane that makes a project what it is. But, I'm not averse to using the model when it fits. So what follows is a bona fide, almost scary-ambitious project I would like to take on in the not-too-distant future:

Total-Functional Client-Side Web Scripting

This endeavour is to solve a problem that makes me genuinely angry: the increasing dependency on JavaScript for even the barest functionality on the Web. What makes it such an affront is that the site's owner is effectively coercing me into handing control of my device over to them and implicitly trusting them to behave. This is because JavaScript is Turing-complete, which any antivirus researcher can tell you is why they still have a job. It is mathematically impossible, in the general case, for a machine to decide what Turing-complete code will do before running it. What that means is that if your site depends on JavaScript to work, I either have to take your word not only that your code behaves in a way I approve of, but also that you've secured it against anybody else slipping in malicious code of their own, or else viewing your website means I have to pore over every line myself. Oh, and signing only certifies that the code you signed is the same code I got. It doesn't say anything about behaviour.

Now, people will undoubtedly remind me that JavaScript is contained in a sandbox and can't cause the same kind of damage a conventional virus can. But the facet of JavaScript's behaviour I'm interested in preventing is in the family of cross-site scripting and request forgeries, wherein the code hooks your browser into doing things on other sites you're logged into without your knowledge or permission. There's a new one of those every other day, and they aren't going away anytime soon. And that's just malicious third parties exploiting sites with user-generated content. The site owners themselves can still do things like subject you to invasive behavioural analysis, or coerce you into sharing information with other parties. I hear tell there's even a company that brokers distributed computing tasks, taking money from customers on one end and paying website owners on the other to pass the work on to their visitors, who run it for free. That is super-super lame.

My concern here is as much political as it is about security. I can block cross-site scripting on my own just fine, but I can't say thanks-but-no-thanks to you turning my computer into a finger puppet. Moreover, if you link to jQuery, YUI or whatever Google's thing is, then you're making my relationship with you conditional on a relationship with them. What if, for whatever reason, I don't want to deal with them? My only options are whole hog or total boycott. And that just isn't satisfactory.

The way to get what I want would be to look into the code before running it and selectively disable certain behaviour. Too bad the Turing-completeness of JavaScript makes any such analysis asymptotically approach worthlessness! But if we got rid of Turing-completeness, any code would be amenable to reliable, automated static analysis. Now, what would a non-Turing-complete language look like?

Turns out it would probably look a lot like Haskell. Or rather a lot like Miranda, Haskell's intellectual predecessor. How do I know this? Because the guy who wrote Miranda also conceived of total functional programming, which has an expressive range one meager rung below Turing-complete. His name is David Turner, and when I asked him about this idea, his response was essentially: probably, but you'd still have the problem of proving the code correct, which is a perfectly legitimate concern.
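
To make the flavour concrete, here's a minimal sketch, borrowing Haskell syntax purely because it's familiar (Haskell itself is not a total language): the defining restriction is that every recursive call has to work on something strictly smaller, so the machine can verify up front that the program terminates.

    -- Fine in a total language: the recursion descends the list, which must
    -- eventually bottom out at [], so termination can be checked mechanically.
    listLength :: [a] -> Int
    listLength []       = 0
    listLength (_ : xs) = 1 + listLength xs

    -- Not fine: nothing gets smaller, so a totality checker rejects this
    -- outright, whereas a Turing-complete language happily spins forever.
    spin :: Int -> Int
    spin n = spin n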

However, I'm not so much looking for what the code does, but rather what it doesn't do. What makes the behaviour of Turing-complete code impossible to predict is that the code is expressive enough to rewrite its own semantics on the fly. It could be as straightforward as having a function like eval, which takes a string and executes it in situ. If one of those is present, all bets are off.

But scanning over a stretch of a total-functional language, if I understand correctly, we would be able to tell with certitude whether or not it performs certain behaviour. That's why a total-functional client-side Web scripting language makes so much sense. If we were particularly concerned about the site absconding with data, or creepy behavioural tracking, or running a MapReduce job, or whatever, we should be able to spot it up front and disable it.
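
Here's a hedged sketch of what that analysis could look like, again leaning on Haskell syntax; the Effect vocabulary and the function names are invented for illustration, not any existing proposal. The point is that if every effect a script can request has to be spelled out in a small, closed set of constructors, then "does this code ever phone home?" becomes a finite walk over the syntax rather than an undecidable question, and the offending bits can be stripped out just as easily.

    -- Hypothetical, closed vocabulary of effects a script may request. Because
    -- the constructors are fixed, an analyzer can enumerate every capability a
    -- program could ever exercise.
    data Effect
      = ReadDom String          -- read an element by id
      | WriteDom String String  -- set an element's text
      | SendRequest String      -- network call to a URL: the one we might veto
      deriving (Show, Eq)

    -- A script is just a finite list of effects here; a real language would
    -- have richer structure, but the principle is the same.
    type Script = [Effect]

    isSend :: Effect -> Bool
    isSend (SendRequest _) = True
    isSend _               = False

    -- "Does this code ever talk to the network?" is a total, decidable
    -- question about the syntax, not a guess about runtime behaviour.
    phonesHome :: Script -> Bool
    phonesHome = any isSend

    -- "Selectively disable": drop the offending effects and keep the rest.
    defang :: Script -> Script
    defang = filter (not . isSend)

    -- A harmless DOM tweak versus one that also ships data off somewhere.
    harmless, sneaky :: Script
    harmless = [ReadDom "menu", WriteDom "menu" "expanded"]
    sneaky   = [ReadDom "email", SendRequest "https://tracker.example/collect"]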

Do people even see a need for something like this? I sure do. Apparently so does Tim Berners-Lee, who, you know, kinda invented the whole shebang. Web developers would be happy to use it, because functional programming is more succinct than the imperative kind, and because functional languages like Haskell are so hot right now. There's also a PR halo effect in being able to say your code doesn't pull any dirty tricks, and that nobody has to take your word for it.

The million-dollar question is, of course: how many everyday client-side scripting tasks genuinely depend on the full range of JavaScript's linguistic expressivity? That would have to be tested, but my gut says virtually none of them. In the 15 years I've been using JavaScript, I don't think I've once written anything crazier than completely deterministic DOM-schlepping. That could easily be replaced with a total-functional language; in fact, it could be replaced with something even more restrictive.
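
For a sense of how little expressive power that kind of work actually needs, here's a deliberately boring sketch (the Tab type is made up): a typical "show the right panel" chore is a pure case analysis with no recursion at all, which sits comfortably below even the total-functional ceiling.

    -- Invented types: application state in, a fragment of markup out.
    data Tab = Profile | Settings | Help
      deriving (Show, Eq)

    -- Everyday DOM-schlepping mostly looks like this: deterministic, finite,
    -- and trivially terminating.
    panelFor :: Tab -> String
    panelFor Profile  = "<section id=\"panel\">Your profile</section>"
    panelFor Settings = "<section id=\"panel\">Settings</section>"
    panelFor Help     = "<section id=\"panel\">Help</section>"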

The next big question is how computationally expensive the code analysis would be. If it's cheap, it could be done à la carte; if it's costly, we'd have to figure out some sort of resource pooling, some interchange format for the proofs (which would probably be useful anyway), and some sort of crypto signature thingy (of which there are plenty; it's just a matter of ironing out the details). The predicates themselves, the questions of whether this code does only this or that, should also be shareable, just like ad-blocking profiles are today.
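
Purely as a sketch of what a shareable analysis result might carry (every field name here is invented), the thing being passed around would tie a specific piece of source to the question asked, the answer, and a signature from whoever did the work:

    -- Hypothetical record for the result of one analysis, so expensive proofs
    -- only need to be computed once and can then be shared and re-verified,
    -- much like ad-blocking filter lists are shared today.
    data Verdict = Allowed | Denied
      deriving (Show, Eq)

    data Certificate = Certificate
      { scriptHash :: String   -- hash of the exact source the proof covers
      , predicate  :: String   -- the question asked, e.g. "no network access"
      , verdict    :: Verdict  -- what the analysis concluded
      , signature  :: String   -- signature of whoever ran the analysis
      } deriving (Show)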

But the point is, this is totally doable. The theory is sound, there's high-profile demand for it, and one could just write the reference implementation as a patch against both WebKit and Gecko, obviating any excuses not to immediately put it into no fewer than three major browsers right off the bat. So, who's with me?

Note: I'm not suggesting replacing JavaScript outright, but rather complementing it. In fact, it would be useful to start out with some sort of CoffeeScript-style quasi-compiler that produces JavaScript. This would enable the development of a prototype and test harness. Likewise, the proving engine would take the TFP source as input. Once all of that is sorted out, we could move to a native engine living in the browser alongside the JavaScript one, triggered by a MIME type, just like Microsoft did with VBScript.
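
For what it's worth, the shape of that toolchain might be as simple as two functions over the same source text. Everything below is a stub with invented names, just to show where the seams would be: one half translates to JavaScript so the prototype runs in today's browsers, the other answers yes/no questions before any of it runs.

    -- Hypothetical top-level interfaces for the prototype toolchain sketched
    -- above: both halves consume the same total-functional source.
    newtype TfpSource  = TfpSource String
    newtype JavaScript = JavaScript String

    -- A question the user (or a shared profile) wants answered, e.g.
    -- "touches the network" or "reads form fields".
    newtype Predicate = Predicate String

    -- CoffeeScript-style step: emit plain JavaScript so the prototype can run
    -- in existing browsers, no native engine required yet.
    compile :: TfpSource -> Either String JavaScript
    compile (TfpSource src) = Right (JavaScript ("/* compiled */ " ++ src))  -- stub

    -- The proving side inspects the source before any of it reaches the
    -- JavaScript engine.
    prove :: Predicate -> TfpSource -> Bool
    prove _ _ = True  -- stub; the real checker would walk the syntax tree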