27 Sep 2016
As we've discussed on this blog before, FP Complete has been running a Hackage mirror for quite a few years now. In addition to a straight S3-based mirror of raw Hackage content, we've also been running some Git repos providing the same content in an arguably more accessible format (all-cabal-files, all-cabal-hashes, and all-cabal-metadata).
In the past, we did all of this mirroring using Travis, but had to stop doing so a few months back. Also, a recent revelation showed that the downloads we were making were not as secure as I'd previously believed (due to lack of SSL between the Hackage server and its CDN). Finally, there's been off-and-on discussion for a while about unifying on one Hackage mirroring tool. After some discussion among Duncan, Herbert, and myself, all of these goals ended up culminating in this mailing list post.
This blog post details the end result of these efforts: what code is running, where it's running, how secret credentials are handled, and how we monitor the whole thing.
One of the goals here was to use the new hackage-security mechanism in Hackage to validate the package tarballs and cabal file index downloaded from Hackage. This made it natural to rely on Herbert's hackage-mirror-tool code, which supports downloads, verification, and uploading to S3. There were a few minor hiccups getting things set up, but overall it was surprisingly easy to integrate, especially given that Herbert's code had previously never been used against Amazon S3 (it had been used against the Dreamhost mirror).
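To give a feel for one small piece of that validation, here's a minimal Python sketch of checking a downloaded file against a length and SHA-256 hash taken from trusted metadata. This is only an illustration: the function name and parameters are hypothetical, and it is not the hackage-security or hackage-mirror-tool implementation, which also has to verify the signatures on the metadata itself.

```python
import hashlib
import hmac


def verify_download(data: bytes, expected_length: int, expected_sha256: str) -> bool:
    """Return True only if the bytes match the length and SHA-256 from trusted metadata.

    Hypothetical helper for illustration; the expected values are assumed to
    come from metadata that has already been verified by other means.
    """
    # Cheap check first: a wrong length means there is no point hashing.
    if len(data) != expected_length:
        return False
    actual = hashlib.sha256(data).hexdigest()
    # lower() tolerates uppercase hex digests in the metadata.
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A mirroring tool would refuse to upload any file for which a check like this returns False.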
I made a few downstream modifications to the codebase to make it compatible with officially released versions of Cabal, Stackify it, and in the process generate Docker images. I also included a simple shell script for running the tool in a loop (based on Herbert's README instructions). The result is the snoyberg/hackage-mirror-tool Docker image.
After running this image (we'll get to how it's run later), we have a fully populated S3 mirror of Hackage guaranteeing a consistent view of Hackage (i.e., all package tarballs are available, without CDN caching issues in place). The next step is to use this mirror to populate the Git repositories. We already have tools for updating the appropriate repos, and all-cabal-files is just a matter of running tar xf on the tarball containing .cabal files. Putting all of this together, I set up a repo with the following pieces:
- run-inner.sh will:
  - Grab the 01-index.tar.gz file from the S3 mirror
  - Update the all-cabal-files repo
  - Use git archive in that repo to generate and update the 00-index.tar.gz file*
  - Update the all-cabal-hashes and all-cabal-metadata repos using the appropriate tools
- run-inner.sh is run each time a new version of 01-index.tar.gz is available. A simple ETag check detects new versions, saving on bandwidth, disk IO, and CPU usage.
- Dockerfile pulls in all of the relevant tools and provides a commercialhaskell/all-cabal-tool Docker image
- You may notice some other code in that repo. I intended to rewrite the Bash scripts and other Haskell code into a single Haskell executable for simplicity, but haven't gotten around to it yet. If anyone's interested in taking up the mantle on that, let me know.
* About this 00/01 business: 00-index.tar.gz is the original package format, without hackage-security, and is used by previous cabal-install releases, as well as Stack and possibly some other tools too. hackage-mirror-tool does not mirror this file since it has no security information, so generating it from the known-secure 01-index.tar.gz file (via the all-cabal-files repo) seemed the best option.
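The ETag check mentioned in the list above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code actually running in the Docker image: the function names, the use of a HEAD request, and the 60-second polling interval are all hypothetical choices.

```python
import time
import urllib.request
from typing import Callable, Optional


def fetch_etag(url: str) -> Optional[str]:
    """Ask the server for the current ETag with a HEAD request (no body is downloaded)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.headers.get("ETag")


def check_once(
    url: str,
    last_etag: Optional[str],
    on_change: Callable[[], None],
    fetch: Callable[[str], Optional[str]] = fetch_etag,
) -> Optional[str]:
    """Run on_change only if the ETag differs from the last one seen.

    Returns the ETag to remember for the next check.
    """
    current = fetch(url)
    if current is not None and current != last_etag:
        on_change()  # e.g. invoke the update script
        return current
    return last_etag


def watch(url: str, on_change: Callable[[], None], interval_seconds: int = 60) -> None:
    """Poll forever, triggering on_change whenever the index changes (assumed interval)."""
    last_etag = None
    while True:
        last_etag = check_once(url, last_etag, on_change)
        time.sleep(interval_seconds)
```

Because the remembered ETag is only updated after on_change returns, a failed update is retried on the next poll instead of being silently skipped.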
In setting up these images, I decided to split them into two pieces instead of combining them so that the straight Hackage mirroring bits would remain unaffected by the rest of the code, since the Hackage mirror (as we'll see later) will be available for users outside of the all-cabal* set of repos.
At the end of this, you can see that we're no longer using the original hackage-mirror code that powered the FP Complete S3 mirror for years. Unification achieved!
As I mentioned, we previously ran all of this mirroring code on Travis, but had to move off of it. Anyone who's worked with me knows that I hate being a system administrator, so it was a painful few months where I had to run this code myself on an EC2 machine I set up personally. Fortunately, FP Complete runs a Kubernetes cluster these days, and that means I don't need to be a system administrator :). As mentioned, I packaged up all of the code above in two Docker images, so running them on Kubernetes is very straightforward.
For the curious, I've put the Kubernetes deployment configurations in a Gist.
We have a few different credentials that need to be shared with these Docker containers:
- AWS credentials for uploading
- GPG key for signing tags
- SSH key for pushing to Github
One of the other nice things about Kubernetes (besides allowing me to not be a sysadmin) is that it has built-in secrets support. I obviously won't be sharing those files with you, but if you look at the deployment configs I shared before, you can see how they are being referenced.
One annoyance I've had in the past is that, if there's a bug in the scripts or some system problem, mirroring will stop for many hours before I become aware of it. I was determined to not let that be a problem again. So I put together the Hackage Mirror status page. It compares the last upload date from Hackage itself against the last modified time on various S3 artifacts, as well as the last commit for the Git repos. If any of the mirrors falls more than an hour behind Hackage itself, it returns a 500 status code. That's not technically the right code to use, but it does mean that normal HTTP monitoring/alerting tools can be used to watch that page and tell me if anything has gone wrong.
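The core of that check is small enough to sketch. The Python below is one literal reading of the rule described above, not the status page's actual code: the function name, the dictionary of mirror timestamps, and the use of 200 for the healthy case are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple

# The one-hour threshold comes from the description above.
MAX_LAG = timedelta(hours=1)


def mirror_status(
    hackage_last_upload: datetime,
    mirrors: Dict[str, datetime],
    max_lag: timedelta = MAX_LAG,
) -> Tuple[int, Dict[str, bool]]:
    """Return an HTTP status code plus a per-mirror in-sync flag.

    A mirror counts as in sync if its last update is no more than max_lag
    older than the latest Hackage upload.
    """
    in_sync = {
        name: hackage_last_upload - updated <= max_lag
        for name, updated in mirrors.items()
    }
    status = 200 if all(in_sync.values()) else 500
    return status, in_sync
```

For example, with a latest upload at 12:00 UTC, a mirror last updated at 10:59 would be flagged as out of sync and the page would answer with a 500.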
If you're curious to see the code powering this, it's available on Github.
Official Hackage mirror
With the addition of the new hackage-security metadata files to our S3 mirror, one nice benefit is that the FP Complete mirror is now an official Hackage mirror, and can be used natively by cabal-install without having to modify any configuration files. Hopefully this will be useful to end users.
And strangely enough, just as I finished this blog post, I got my first "mirrors out of sync" 500 error message ever, proving that the monitoring itself works (even if the mirroring had a bug).
As for what's next: hopefully nothing! I've spent quite a bit more time on this in the past few weeks than I'd hoped, but I'm happy with the end result. I feel confident that the mirroring processes will run reliably, I understand and trust the security model from end to end, and there's less code and fewer machines to maintain overall.
Many thanks to Duncan and Herbert for granting me access to the private Hackage server to work around CDN caching issues, and to Herbert for the help and quick fixes with hackage-mirror-tool.