Distlockrun: Lockrun for Your Cloud

Lockrun is a handy little utility for ensuring you don’t run two copies of the same cron job (or other task) at the same time on one machine. It’s especially useful when the cron job in question has a widely varying duration. Lockrun was written by Steve Friedl and initially released in 2006.
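To make the single-machine idea concrete, here is a minimal Python sketch of file-based mutual exclusion, assuming an flock-style non-blocking lock on Linux or another Unix. It is not lockrun's source; the function name try_lock and the lock-file path are made up for illustration.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive, non-blocking lock on `path`.

    Returns the open file descriptor on success (keep it open for as long
    as the job runs), or None if another holder already has the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

A wrapper built on this would call try_lock, exit quietly if it gets None, and otherwise run the job and close the descriptor afterwards. The lock only exists on the machine that holds the file, which is exactly the limitation described next.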

However, when you have jobs that use resources from multiple machines, lockrun isn’t adequate. Consider a cron job that builds a local index of a bunch of NFS-mounted files. You don’t want more than one consumer of that NFS volume to index it at a time; otherwise, every indexer competes for the same volume and performance degrades.

Enter distlockrun: it works similarly to lockrun except that it talks to a central server for mutual exclusion instead of locking a file.
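What the central server has to track can be sketched in a few lines of Python. This is only an illustration of the idea, not distlockrun's implementation or wire protocol: the LockTable class, its method names, and the owner argument are all assumptions.

```python
import threading


class LockTable:
    """In-memory table of held locks, keyed by job name (hypothetical)."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the table itself
        self._held = {}                 # job name -> owner that holds it

    def acquire(self, name, owner):
        """Grant the lock if nobody holds it; never wait."""
        with self._guard:
            if name in self._held:
                return False
            self._held[name] = owner
            return True

    def release(self, name, owner):
        """Free the lock, but only for the owner that took it."""
        with self._guard:
            if self._held.get(name) == owner:
                del self._held[name]
                return True
            return False
```

Because every client asks the same table, the answer is the same no matter which machine is asking, which is what a lock file on a single machine can't give you.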

First of all, you need the lock server running. It’s really lightweight, so you can run it on any machine you’ve already got. For example, on Engine Yard Cloud, it should be run on the database master in a screen session.

Start the server like so: distlockrun-server

By default, it listens on port 7890, but that can be changed with the --port option.

Then, on the client, run your job as follows: distlockrun --server {server's hostname} big-expensive-indexer.rb

If there’s already a big-expensive-indexer.rb running, then distlockrun will just exit. Otherwise, it runs the job and then tells the lock server that it’s finished.
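The client side of that flow looks roughly like the following Python sketch. It assumes the conversation with the lock server is hidden behind two hypothetical callables, acquire and release; run_exclusively is a made-up name too, and none of this is taken from distlockrun's code.

```python
import subprocess


def run_exclusively(acquire, release, name, command):
    """Run `command` only if the lock called `name` is free.

    `acquire(name)` returns True if the lock was granted; `release(name)`
    gives it back. Returns the job's exit code, or None if the job was
    skipped because another copy already holds the lock.
    """
    if not acquire(name):
        return None  # somebody else is running it; just exit
    try:
        return subprocess.run(command).returncode
    finally:
        # Tell the lock holder we're finished, even if the job failed.
        release(name)
```

The try/finally is a design choice in this sketch: it gives the lock back whether or not the job succeeds.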

That’s all it takes to get cluster-wide mutual exclusion; short but sweet!