The Tar Pit development log [i]
I was recently reminded that my blog lacks a comments box. This is not a new issue: I admitted a while ago that I might have to fix it, and while the long-awaited magical solution hasn't even been on the table (nor anywhere near the horizon, for that matter), The Tar Pit has been going through various issues which led to changes. Long story short, the blog is now generated through a small tool [1] mostly written and entirely maintained by yours truly.
This event opens up an opportunity, namely the chance to write blogging software that is the result of pragmatic engineering, as opposed to undisciplined enthusiasm: the blog can be specified as an unambiguously useful tool with a clear purpose and, more importantly, a clear set of limitations, stemming from the fact that the concept of blog pertains to an immutable set of functionalities. Everything outside that immutable set is the functionality of something other than a blog.
The problem of what constitutes an X, where X is a conceptual artifact of computer science and software engineering, is, not coincidentally, a very hard problem -- it arises from what we know as abstraction hell, which means that the concept of blog is not unambiguous in and of itself, and so on. And this is where we come in; and by we, I mean this series of posts.
This development log proposes to walk the reader through the process of building a blog, and at the end hopefully to elucidate the definition of blog, at least from the author's point of view.
The principal cause of this development log is the following. Intuitively, making a blog from scratch is not a difficult intellectual endeavour. Given this intuition, any intellectual with the patience to get their hands dirty in matters of techne [2] should be able to make their own, or otherwise steal one from another and understand it on their own. Given this possibility, any braindead blog solution stamped and approved by a Committee of Blogs near you, e.g. Wordpress, becomes useless, except maybe for the plebeians who will forever live in abuse of Gutenberg's goodwill. Thus if we can, then, in the Pythons' own words, let's get on with it.
With this lengthy introduction behind us, let's start with a few requirements. In order for a blog post to display comments without much fuss (e.g. manually updating the page), the HTTP server needs a dynamic page generation component. Thus the server must not only serve some static content, but it must also grab some content from a database [3], somehow combine it with the static content, dynamically generate the HTML page from the two items and only then serve it. This is easy to achieve through templating, and it's why PHP was created. This is also how The Tar Pit works, only the template-to-HTML-page conversion is done by me before the content reaches the site, instead of being performed dynamically by an HTTP server.
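To make that concrete, here is a minimal sketch of such a templating step using CL-WHO (which The Tar Pit already uses for markup). The POST structure and the GET-COMMENTS function are hypothetical stand-ins for whatever the content and database layers end up looking like; only the CL-WHO usage is meant as-is.

```lisp
;; A minimal sketch of the templating step described above.  POST and
;; GET-COMMENTS are hypothetical placeholders for the real content and
;; database layers.
(ql:quickload :cl-who)

(defstruct post title body)   ; BODY holds the already-converted HTML

(defun get-comments (post)
  "Placeholder: would fetch the comments for POST from the database."
  (declare (ignore post))
  (list "First!" "Nice post."))

(defun render-post (post)
  "Combine the static part of POST with its comments into a full HTML page."
  (cl-who:with-html-output-to-string (s nil :prologue t)
    (:html
     (:head (:title (cl-who:esc (post-title post))))
     (:body
      (:h1 (cl-who:esc (post-title post)))
      (cl-who:str (post-body post))        ; static content, already HTML
      (:h2 "Comments")
      (:ul
       (dolist (c (get-comments post))     ; dynamic content from the database
         (cl-who:htm (:li (cl-who:esc c)))))))))
```

Calling (render-post (make-post :title "Hello" :body "<p>world</p>")) returns the finished page as a string; whether that call happens at request time or, as in the current setup, before the content reaches the site, is exactly the difference just described.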
Since The Tar Pit lives in the world of Common Lisp, what we need is a way to serve HTML pages generated on the fly, using the HTTP protocol. As a bonus, it would be nice to have a component that can resolve blog paths in a more intelligent way. For example, we could access not only /posts/y42/12c-post.html, but also /posts/y42/12c-post, and also, why not, /posts/12c. In any case, on an HTTP GET request, the server should be able to grab a post in template form (i.e. not yet fully [4] converted to HTML), manipulate the dynamic parts as needed, convert the template to HTML and then serve the resulting page. On an HTTP POST request, the same server should be able to add comments to the database, but this is something we will get into later.
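Just to pin down what such "more intelligent" resolution might mean, here is a rough sketch. The on-disk layout it assumes (one Markdown template per post, under per-year yNN/ directories below a *posts-root* location) and all the names involved are purely illustrative; uiop is used only for its string helpers.

```lisp
;; A rough sketch of the path resolution described above.  The layout
;; (yNN/ directories of .md templates under *POSTS-ROOT*) is an assumption
;; made for illustration only.
(defparameter *posts-root* #p"/var/www/the-tar-pit/posts/")

(defun resolve-post (path)
  "Map a request PATH such as /posts/y42/12c-post.html, /posts/y42/12c-post
or /posts/12c to the post's template file, or NIL if nothing matches."
  (let ((parts (remove "" (uiop:split-string path :separator "/")
                       :test #'string=)))
    (when (string= (first parts) "posts")
      (case (length parts)
        ;; /posts/y42/12c-post(.html): strip the optional .html suffix
        (3 (let* ((name (third parts))
                  (name (if (uiop:string-suffix-p name ".html")
                            (subseq name 0 (- (length name) 5))
                            name)))
             (probe-file (merge-pathnames
                          (format nil "~a/~a.md" (second parts) name)
                          *posts-root*))))
        ;; /posts/12c: no year given, so search every yNN/ directory
        ;; for a post whose name starts with the given prefix
        (2 (first (directory (merge-pathnames
                              (format nil "y*/~a*.md" (second parts))
                              *posts-root*))))))))
```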
The Common Lisp world already has a bunch of HTTP servers available for our benefit. The hard part is figuring out which one is the most suitable for our purpose. Ideally, I would like to integrate into The Tar Pit a piece of software that brings along a minimal set of dependencies and comes with as few features as possible; features which I can then grab from other software if I decide I need them -- the essential definition of HTTP server being software that receives a request and responds according to the Hypertext Transfer Protocol. At first glance, all the servers on CLiki's list seem equally suitable; but let's do a more thorough examination:
- AllegroServe: Portable HTTP server implementation written by Franz Inc. Portable in that it works on multiple Common Lisp implementations, an aspect I am not sure I am interested in, as SBCL seems more than enough. Pro: it's self-contained. Con: the unneeded components would require trimming: The Tar Pit will not use SSL in the foreseeable future; no external CGI is required; CL-WHO is already used as a templating language; virtual hosts are unneeded, as that functionality can already be provided by e.g. Apache. Other than that, it doesn't seem bad.
- Antiweb: "Antiweb is a webserver written in Common Lisp, C, and Perl by Doug Hoyte and Hoytech". Had I wanted C and Perl, I would have used Apache.
- Araneida: The web server behind CLiki, the files are still available on the web. Pro: minimal set of external dependencies. Con: same as AllegroServe; for example The Tar Pit will never have a web-based administration interface, and thus it will never use web-based authentication mechanisms; I also doubt it will ever use cookies.
- dwim.hu: Common Lisp utilities written by the dwim.hu people, among them an HTTP server. Looking through the ASDF file for dependencies, hu.dwim.quasi-quote.xml+hu.dwim.quasi-quote.js doesn't strike me as very sane. Clearly not a first choice.
- house: Written as part of the DEAL project (whatever that is), but otherwise its own thing. Pro: it seems very small: about eight files, not more than about 300 lines each. Con: the list of dependencies is not insignificant. It probably does deserve a second look, though.
- Hunchentoot: The alpha and omega of Common Lisp HTTP servers nowadays, it seems. It's like a shawarma with everything, and thus I'd like to avoid using it unless I'm out of other options.
- jarw-inet: Utility library for various protocols, including HTTP, written by John A.R. Williams. Unfortunately the links on the project page are dead, and Google has been unable to help me find any code.
- s-http-server: Simple HTTP server written by Sven Van Caekenberghe. The code is available on GitHub. Pros: it seems small (most of the code is bundled up in one file); it seems done. Cons: it depends on a few other libraries written by Caekenberghe; it contains some SSL support code which would need to be eliminated. So far it seems like a good candidate for The Tar Pit's web serving component, but I need to give it a more detailed look before passing judgment.
- sw-http: Allegedly fast HTTP web server written (and abandoned, it would seem) by Lars Rune Nøstdal. A mirror is available on GitHub. Pro: it seems small and not too bloated. Cons: I don't want to know what bootstrap-types.lisp does; one of the dependencies is the library of Alexandria; another is the library of CL utilities; yet another (also a utility library, also written by Nøstdal) is abandoned and nowhere to be found. Not sure I want to try this out.
- teepeedee2: "Fast webserver for dynamic pages". Available on GitHub. Much in the vein of Hunchentoot: huge and with a heavy dependency burden.
- toot: The result of an attempt to cut stuff from Hunchentoot. It's smaller, but not small enough, the list of dependencies being hefty. I tried it out and it works, but for the amount of things it imports, it doesn't even support setting a custom 404 page. It requires a lot of trimming and modifications; I'm not convinced it's a good choice, but for lack of a better alternative, I'd dive into it.
- Wookie: Yet another server in the Hunchentoot vein. Async and all, nothing interesting to see here.
In addition to the list on CLiki, I've found a few more by googling:
- cl-http: Also known as the Common Lisp Hypermedia Server. Probably one of the first web servers ever written, at least in the Lisp world. Unfortunately all the official sites and FTP mirrors pointed to by Google are dead.
- cl-http-server: Distinct from the cl-http above. Written by Tomo Matsumoto and available on GitHub. It looks similar to s-http-server in most respects.
- Some frameworks written on top of Hunchentoot, and some frameworks written on top of the frameworks written on top of Hunchentoot, turtles all the way down. I'm not even considering them.
I think this is enough for us to get an idea of what Common Lisp has to offer in terms of HTTP servers. In the worst case, we know that there is a lot of code to cut from and repurpose, which would make this the modern equivalent of building our own web server. Assuming that's not the case, then: AServe is a powerhouse that I'm willing to consider for lack of more minimal working solutions; s-http-server and cl-http-server sound promising; house might be worth trying; the rest would only be worth taking into consideration in the worst case scenario described previously.
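To give a sense of how little the "essential definition" mentioned earlier actually demands, here is a toy sketch of such a from-scratch server: accept a connection, read the request line, answer with a fixed HTTP response. It leans on usocket for the socket plumbing and deliberately skips header parsing, error handling and concurrency; it is an illustration of the contract, not a proposal.

```lisp
;; A toy illustration of the "essential definition" of an HTTP server:
;; receive a request, respond according to the protocol.  No header
;; parsing, no error handling, one request at a time.
(ql:quickload :usocket)

(defun toy-http-server (&optional (port 8080))
  (let ((crlf (format nil "~c~c" #\Return #\Linefeed)))
    (usocket:with-socket-listener (listener "127.0.0.1" port :reuse-address t)
      (loop
        (usocket:with-connected-socket (conn (usocket:socket-accept listener))
          (let* ((stream (usocket:socket-stream conn))
                 (request-line (read-line stream nil "")))
            ;; a real server would parse the method, path and headers here
            (format t "~&request: ~a~%" request-line)
            (format stream "HTTP/1.1 200 OK~aConnection: close~aContent-Type: text/plain~a~aHello from the tar pit.~a"
                    crlf crlf crlf crlf crlf)
            (force-output stream)))))))
```

Everything a real server adds on top of this loop -- routing, headers, POST bodies, threads -- is exactly the feature set being weighed in the list above.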
But that's enough for today.
[1] Embedded in the source code that is available on GitHub. Given that GitHub is a venue owned by actors having a dubious relationship to freedom, I don't expect this to last forever. Thus at some point I will host my own public code repository with blackjack and hookers, only without the blackjack.
[2] Which in any civilized world, everywhere and everywhen, means any intellectual, period. This is Leonardo's most important lesson: you're not human unless you're willing to understand the important things that your lazy reptilian brain otherwise refuses to, i.e. the miracles that matter.
[3] The pedantic will point out that static files served by the HTTP server are usually stored in an operating system's filesystem, which itself is a database.
[4] For example, we wouldn't run Pandoc on every page request, as that would be unnecessary and also a big performance killer.