November 6th 2012

By Jay Moorthi

3 Challenges to Continuous Integration and How to Overcome Them

An effective continuous deployment system makes your engineering team more efficient by giving them automatic, quick feedback on their changes, and by providing a platform on which your whole organization can automate frequent operations, like merges and deploys. Let the computer do the repetitive work, and let your developers ship product.

Unfamiliar with Solano Labs and their testing frameworks? Check them out for free now.

Note: these recommendations assume that you’re writing tests. Read these fine articles to learn more about why automated testing is so useful, and how it can help your business.

To improve and perfect your CI, you need to overcome 3 major challenges:

No Standalone Fresh Checkout
Unreliable Tests
Obscure Build Results

Fresh-Checkout Isn’t Standalone

The single biggest hurdle to a smooth CI build is ensuring that your application’s tests can be run from a fresh SCM checkout (e.g. a git clone). This means that all of your app’s dependencies are either included in the checkout, or they’re specified and can be pulled in by a script in the checkout. There are a number of ways this can help productivity. The first is that it gives your new developers a simple way to bring up the app into a known-good state.

Now, think of your CI server as a new developer who knows nothing about your app and needs to be taught everything by a script. Every build in your CI server should start from a known-good state, and each build should be independent from the last. This tabula rasa condition is critical to trustworthy and reliable build results – a cornerstone of CI as we’ll discuss in the next section.

Most simple applications start out with standalone checkouts, but as they gain features and integrations, it’s easy to lose this property if you’re not in the habit of maintaining it.

Here’s a test for how standalone your app is on first-checkout (the examples assume a Rails app that’s in a git repo, but they apply for other frameworks and source control tools):

1. Go into a new directory and check out your app:
   $ mkdir -p ~/tmp
   $ cd ~/tmp
   $ git clone [email protected]:repo.git repo_from_scratch
2. ????
3. Run your test suite (you have a test suite, right?): `$ rake test`

What did you have to do for step 2? Was it a single shell command? Did it work the first time? Or did you need to have your co-worker shout instructions at you from across the office?

Once you express this setup step as a single command (bash script, rake task, ant target, etc.), you save your new coworkers time, and you ensure that every development environment and every test run starts from a predictable base.
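As a rough illustration, the single setup command could be a small Ruby script that runs each step in order and stops at the first failure. This is only a sketch: the step list (including the `config/database.yml.example` file) and the `run_setup` helper are assumptions for a typical Rails app, not something prescribed by any framework.

```ruby
# Hypothetical setup steps for a Rails app; replace them with whatever your
# app actually needs to go from a fresh clone to a runnable test suite.
SETUP_STEPS = [
  "bundle install",
  "cp config/database.yml.example config/database.yml",
  "bundle exec rake db:create db:schema:load"
].freeze

# Runs each step in order and stops at the first failure.
# `runner` is any callable that takes a command string and returns a truthy
# value on success. It defaults to Kernel#system, which returns true only
# when the command exits with status 0.
def run_setup(steps = SETUP_STEPS, runner: ->(cmd) { system(cmd) })
  steps.each do |cmd|
    puts "==> #{cmd}"
    next if runner.call(cmd)

    warn "Setup failed at: #{cmd}"
    return false
  end
  true
end
```

Saved as, say, `script/setup` and finished with a line like `exit(run_setup ? 0 : 1)`, this gives both new developers and the CI server one command to run, and a non-zero exit status when any step breaks.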

Unreliable Tests

Now that your app sets up with a single command, you’ve built a foundation for effective CI. The next challenge is to ensure that your test results are repeatable and reliable. Intermittent or “expected” failures that persist for too long are pernicious. Once the habit of treating failures as intermittent takes hold, legitimate errors often get ignored.

Try the following exercises:

Run your test suite. Does it pass?
Run your test suite again. Does it produce the same result as the first time?
Reset your test databases and caches -- drop them, and recreate from scratch using the scripted setup procedure you now have. Run your test suite again. Does it produce the same results?
If your test framework supports randomizing test order, try running in a different order, or with a different seed.
Run a few of your tests by themselves. Do they produce the same results as when you ran the whole test suite?
ADVANCED: Create a virtual machine (using your favorite tool or cloud provider) that is significantly slower than your development box (for example, a VirtualBox VM with a single CPU and a small amount of RAM, or an m1.small in Amazon EC2). Run your test suite on this crippled hardware. Does it produce the same results as on your dev box?
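The "run it again with a different seed" exercises are easy to script. Here is a minimal sketch, assuming your test command accepts a seed on the command line; the helper names `run_with_seeds` and `consistent?` are made up for this example. For a Minitest-based suite the command template might be something like `rake test TESTOPTS="--seed=%{seed}"`, but check what your own framework expects.

```ruby
require "open3"

# Runs the command once per seed and records whether each run passed.
# `command` is a format template: "%{seed}" is replaced with the seed value,
# so any other literal "%" in the command must be written as "%%".
def run_with_seeds(command, seeds)
  seeds.each_with_object({}) do |seed, results|
    _output, status = Open3.capture2e(format(command, seed: seed))
    results[seed] = status.success?
  end
end

# A suite is repeatable only if every run produced the same pass/fail result.
def consistent?(results)
  results.values.uniq.size <= 1
end
```

If `consistent?` comes back false, the seeds that disagree give you a starting point for reproducing the order dependency locally.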

Intermittent test failures are almost always due to test order or timing dependencies. Here are a few specific antipatterns:

Leaky database cleanup. Do you reset the database between tests? Is it possible for the reset to not be triggered, or for it to overzealously destroy seed data?
Editing globals. Directly editing a global variable (like, say, the one that holds your app's configuration) in one test will affect every later test that runs in the same process. Most dependency-injection test frameworks allow stubbing at various scopes, and most allow stubbing globals. Beware, though: stubbing frameworks that are bulletproof for instance-level stubbing sometimes have subtle bugs when it comes to clearing stubs on globals.
Hard-coded timeouts or other limits (file or screen sizes, etc.). The time it takes to complete an operation can change on different hardware, or over time as the implementation changes.
Time- or date-specific test cases. Do your tests fail on the first of every month for some spooky reason?
Other leaky global state. The most common case here is when one test imports a library that a later test uses, or defines a symbol, or writes to the filesystem (e.g., look for hardcoded references to `/tmp` or `~/tmp`).
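Two of these antipatterns have simple structural fixes. The sketch below shows one way to approach them: pass the current date in as a parameter so tests can pin it, and wrap global edits in a helper that always restores the original value. `APP_CONFIG`, `with_config`, and `days_left_in_month` are hypothetical names standing in for your own code.

```ruby
require "date"

# Time-dependent logic takes the current date as a parameter instead of
# calling Date.today internally, so a test can pin it to any day it likes.
def days_left_in_month(today = Date.today)
  last_day = Date.new(today.year, today.month, -1) # day -1 is the last day
  (last_day - today).to_i
end

# Hypothetical global configuration, standing in for your app's settings.
APP_CONFIG = { mode: "production" }

# Temporarily overrides the global config and always restores it, even if
# the block raises, so the change cannot leak into later tests.
def with_config(overrides)
  original = APP_CONFIG.dup
  APP_CONFIG.merge!(overrides)
  yield
ensure
  APP_CONFIG.replace(original)
end
```

A test can then call `days_left_in_month(Date.new(2012, 12, 31))` to exercise the end-of-month edge case on any day of the year, and use `with_config(mode: "test") { ... }` without worrying about the next test inheriting the override.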

Obscure Build Results

Once you’ve produced a reliable test suite, the next challenge is to get results quickly, take appropriate action on them, and distribute information to the people who matter.

Timely results – ideally delivered within 15-20 minutes of a commit – are essential to maintaining focus. Digging through a failed build from a 4-hour-old commit requires mentally paging back in old information, and it may be a complete waste of time, as the commit may already have been superseded by a later one. If you’re waiting longer than an hour for builds, consider a commercial or open source parallel test runner, or the deeper undertaking of refactoring your tests to be less database/browser intensive.

Most popular test frameworks have rich reporting functionality to allow you to get as much or as little output as you want. Unfortunately, most test frameworks are optimized for output to an interactive user running them from the CLI, and that’s often not suited to what you want from a CI build. The situation is complicated further when parallel test runners are introduced to test frameworks that don’t support them natively: each parallel run will generate its own separate log file. Even just concatenating these logs together at the end of the build makes it easier to grep for failures.
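Concatenating the per-worker logs takes only a few lines. The following is a minimal sketch, assuming each parallel worker writes its own plain-text log file; the function names and the failure pattern are assumptions you would adapt to your own runner's output.

```ruby
# Concatenates per-worker logs into one file, prefixing each line with the
# name of the log it came from so a failure can be traced to its worker.
def merge_logs(log_paths, merged_path)
  File.open(merged_path, "w") do |out|
    log_paths.sort.each do |path|
      File.foreach(path) do |line|
        out.puts "[#{File.basename(path)}] #{line.chomp}"
      end
    end
  end
  merged_path
end

# Returns the lines of the merged log that look like failures.
# The default pattern is a guess; adjust it to your framework's output.
def failure_lines(merged_path, pattern = /\b(FAIL|ERROR)/)
  File.foreach(merged_path).grep(pattern).map(&:chomp)
end
```

Run at the end of a build, `merge_logs` gives you one file to archive, and `failure_lines` gives you a short summary to paste into a notification.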

Once your build completes, what happens next is probably the most important piece of the puzzle. At the very minimum, an email or IM notification should be sent to the team whose commits ended up in the build. More complex logic, like notifying users only on state changes, or only on failures, can help immensely by cutting down on notification volume, and making build events that much more noticeable. Beyond notifications, for a full continuous deployment setup, the code from the build and any artifacts should be pushed to a downstream server and made ready for review. Make sure that the results of deploy actions get included in notifications!
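The two notification policies mentioned above, "only on state changes" and "only on failures", boil down to a comparison between the previous and current build status. Here is a minimal sketch; the `notify?` helper, the status symbols, and the policy names are all made up for illustration.

```ruby
# Decides whether a finished build deserves a notification.
# Statuses are the symbols :passed or :failed. `previous` may be nil for the
# first build on a branch, which counts as a state change.
def notify?(previous, current, policy: :state_change)
  case policy
  when :state_change  then previous != current   # broke or fixed the build
  when :failures_only then current == :failed    # every red build, nothing else
  else raise ArgumentError, "unknown policy: #{policy}"
  end
end
```

With `:state_change`, a team hears about the commit that broke the build and the commit that fixed it, but not about the ten red builds in between. With `:failures_only`, every red build is reported and green builds stay quiet.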

Wrapping Up

Automated testing makes development agile, and Continuous Integration takes the human error out of “manually run” automated testing. But getting a continuous integration environment working smoothly for your organization can be daunting, whether you hacked it all together, bought some off-the-shelf software, or are using a SaaS like Tddium. Approaching the problem methodically by automating your app setup and hardening tests will make your continuous integration server more trustworthy and reliable. Focused email notifications help keep the team coordinated and responsive to results. Your CI solution also has to scale with your team’s needs.

It’s easy for a small organization with a few developers to set up a CI server that works simply and well. But developers then have to maintain servers and to do system administration. This is overhead and distraction from building core product.

This overhead grows quickly as the business grows and the CI server itself needs more advanced technology – like real-time UIs and intelligent notifications, process isolation, build and test parallelism, dynamic work scheduling, audit and governance controls, third-party API integrations, and automatic hardware scaling in public and private clouds.

Building your own CI server means you have complete control, but it also means that you and your busy team are on the hook for administration, improvements, and scaling. If you’d like to learn more about fast, easy-to-use CI servers, come talk to us…we’re pros!

If you are an Engine Yard customer, follow these steps to access Solano Labs’ Continuous Integration Service:

Go to https://cloud.engineyard.com/addons (login required) or navigate to “Add-ons” in Engine Yard Cloud
Click on the Solano Labs listing
Sign up and follow the instructions for “Activate”, “Update Code” and “Deploy”
