Standardizing on Capistrano
Prior to launching this blog as a centralized place for all of my English writing, I spent some time on long-overdue housekeeping on the server that hosts my websites. One of the goals was to unify the process of deploying updates for each of them. I’d grown tired of using a variety of ad hoc approaches ranging from rsync to Capistrano to manually running Git commands on the server. I wanted one tool to rule them all, so I embarked on a quest to find it.
Let’s begin by establishing what makes deploying updates for web applications challenging. A deployment can be thought of as a form of cache invalidation, something that is commonly labeled one of the two hardest problems in Computer Science. Perhaps an even more revealing parallel to deployments is database transactions. If you consider a typical deployment as a complex update statement running against your infrastructure, many deployment best practices follow naturally from the ACID properties:
- Atomicity. In case of an intermediate failure, the state of the application should remain unchanged (“all or nothing”).
- Consistency. A deployment should not be able to bring the system to a state that violates the specified constraints (i.e., “acceptance tests”).
- Isolation. The users should not notice inconsistent application state while a deployment is running.
- Durability. The effect of deployment should be persistent. A server outage should not cause an unexpected rollback to an earlier version.
While this comparison to ACID is not to be taken literally, it gives us a good idea of the guarantees that a competent deployment tool should provide. Even in the simplest case of updating just one server, we can immediately rule out some of the common approaches, namely SVN checkout, Git pull, and FTP/SFTP upload: each of them modifies the live directory file by file, so a failure midway leaves the application half-updated. A more potent instrument is needed to satisfy these requirements.
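To make the atomicity and isolation points concrete, here is a minimal sketch of the idea behind a releases/current directory layout, the same general layout Capistrano uses. The helper name deploy_release and the current.tmp link name are my own assumptions for illustration; this is not Capistrano's code. Each release is built in its own directory, and the live symlink is switched with a single rename only after the release is complete.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper (not part of Capistrano): prepares a release in its own
# directory and only then repoints the "current" symlink at it.
def deploy_release(root, release_name)
  release_dir = File.join(root, "releases", release_name)
  FileUtils.mkdir_p(release_dir)

  # The caller populates the release. If this raises, "current" is untouched.
  yield release_dir

  # Create the new symlink under a temporary name, then rename it over
  # "current". rename(2) replaces the old link in one step on POSIX systems,
  # so readers see either the old release or the new one, never a mix.
  tmp_link = File.join(root, "current.tmp")
  FileUtils.rm_f(tmp_link)
  File.symlink(release_dir, tmp_link)
  File.rename(tmp_link, File.join(root, "current"))
  release_dir
rescue StandardError
  # "All or nothing": discard the half-built release and re-raise.
  FileUtils.rm_rf(release_dir) if release_dir
  raise
end

# Usage: a failed second deployment leaves the first release live.
Dir.mktmpdir do |root|
  deploy_release(root, "1") { |dir| File.write(File.join(dir, "index.html"), "v1") }
  begin
    deploy_release(root, "2") { |_dir| raise "build failed" }
  rescue RuntimeError
    # The failure is expected here; "current" still points at release 1.
  end
  puts File.read(File.join(root, "current", "index.html"))
end
```

The sketch covers a single machine only. Coordinating the switch across several servers, or rolling back after the switch, needs more machinery than this.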
Capistrano is an example of a deployment tool that provides all the necessary building blocks for implementing “ACID deployments.” However, having been conceived within the Ruby community, it evolved to suit that community’s needs and absorbed some of the idiosyncrasies of the host platform. For example, because many popular Ruby gems require compiled binary extensions, the prevalent practice has always been to install the gems on each node individually. In the age of build tools powered by Node.js, this often means installing the entire build toolset on every web server. To make things worse, Capistrano 3 demands that every target server have access to the code repository, effectively converting each node into a mini build server. Aside from complicating the server environment, this introduces the obvious problem of having to ensure that the resulting builds are consistent across the array of target servers. All of this would be seen as heresy by a JVM developer whose build server produces self-contained JARs that can be uploaded and run practically anywhere.
Despite these flaws, Capistrano remains a battle-tested deployment utility with relatively simple configuration. It also doesn’t require any additional software on the target server (except for the omnipresent SSH), which some other tools do. Last but not least, I have six years of experience with Capistrano and even wrote about some of its internals. Most of my websites are static, but they rely on various build toolchains (Gulp, Jekyll, lein-cljs, etc.). Looking for ways to avoid installing every single one of them on the server, I discovered that Capistrano has a plugin API that allows one to add custom SCM adapters. For my use case, such an adapter would replace the logic that creates a release directory on the server by copying from an SCM system with a simple upload of a locally built tarball.
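The core of that idea can be sketched in a few lines of plain Ruby. This is an illustration under stated assumptions, not the adapter's actual code: pack_build and unpack_release are hypothetical names, the sketch shells out to a tar binary assumed to be on the PATH, and the upload over SSH is left out entirely, so both steps run on the local machine.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: packs the contents of a local build directory into a
# gzipped tarball. The archive must live outside build_dir.
def pack_build(build_dir, archive_path)
  system("tar", "-czf", archive_path, "-C", build_dir, ".") or
    raise "tar failed while packing #{build_dir}"
  archive_path
end

# Hypothetical helper: unpacks the tarball into a fresh release directory.
# In a real deployment this step would run on the server after the upload.
def unpack_release(archive_path, release_dir)
  FileUtils.mkdir_p(release_dir)
  system("tar", "-xzf", archive_path, "-C", release_dir) or
    raise "tar failed while unpacking #{archive_path}"
  release_dir
end

# Usage: build locally, pack, then unpack into a timestamped release directory.
Dir.mktmpdir do |tmp|
  build_dir = File.join(tmp, "build")
  FileUtils.mkdir_p(build_dir)
  File.write(File.join(build_dir, "index.html"), "<h1>Hello</h1>")

  archive = pack_build(build_dir, File.join(tmp, "release.tar.gz"))
  release = unpack_release(archive, File.join(tmp, "releases", Time.now.strftime("%Y%m%d%H%M%S")))
  puts Dir.children(release).inspect
end
```

The appeal of this approach is that the server only ever needs tar and SSH: every build tool stays on the machine that produced the tarball.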
I ended up implementing and packaging this adapter as the capistrano-tarball_scm gem, which is now available via RubyGems. You can find an example of a deployment configuration file for a ClojureScript project in my tetris-cljs repo on GitHub. I’ve been using the plugin for two months with several different projects, and I’m content with its performance. As I state in the gem’s rationale, it should not be viewed as “better Capistrano,” but merely as a simpler way to use Capistrano for deploying packaged applications and static websites.
Pushing releases to servers is not the only hard part of deployments. On the contrary, in a comprehensive write-up titled “How to deploy software,” Zach Holman (ex-GitHub) argues that it’s the easiest part. He proceeds to cover all of the stages of software deployments at scale, from initial planning to post-release monitoring. The message of my article is much more humble: the complexity of deployments is not to be underestimated, and appropriate tools should be preferred. Quoting Alan Kay, “simple things should be simple, complex things should be possible.” And even if there’s no way Capistrano would’ve ever met Kay’s simplicity standards, it’s a good enough solution for most deployments.