Why Your App Won’t Work In The Cloud

There are two kinds of apps for the cloud: ones that work and ones that don’t. The ones that work are called Twelve-Factor apps, and they work because they were written specifically for the cloud. We call the ones that don’t work legacy apps, and these were designed to run on traditional VPS hosts.

Unfortunately, most popular apps are legacy apps. They weren’t written with the cloud in mind, so they generally won’t work without modification. Legacy apps include offerings such as WordPress, Magento, and Drupal. They might also include any in-house apps you are thinking about moving to the cloud.

So what’s to be done about this? Is there any way to run legacy apps in the cloud?

Approaches

Some approach this problem by deploying the legacy app to a single server, and then scaling that up to meet demand. But this is no different from traditional VPS hosting! And you’re going to get a nasty shock when Amazon retires (read: shuts off) your host…

A better solution is to make changes to the app itself, so that it’s compatible with the cloud. Being compatible means that it can be distributed across a cluster. This way, you can scale out by adding more servers. When servers get sick, you can replace them with new ones.

If you start making significant changes to an off-the-shelf legacy app, however, then you are essentially forking that code base. Not only will this require dedicated developer time, but you’ll have to continually merge in upstream changes. You’d think there’d be a better approach.

Fortunately, there is!

But before we go into that, we have a bit of ground to cover. There are a few basic concepts you need to familiarise yourself with before the way forward becomes clear.

Taking A Closer Look At The Problem

Here are some essential cloud computing concepts, carefully picked from a longer series of posts to bring you up to speed.

Pets vs. Cattle

With the traditional VPS model, servers are like pets. They are expensive to commission, complex to set up, and time-consuming to replace. As a result, we treat them with dedicated love and affection, and nurse them back to health when they get sick.

In the cloud, servers are inexpensive to commission, trivial to configure, and easy to replace. As a result, we treat them like cattle. They are functionally identical to each other, and when one gets sick we replace it with another.

Legacy apps expect to be run on pet servers. Our task is to make them run on cattle servers.

Configure Before You Boot

Why is it so easy to spin up new cattle servers? Because we have a new way of approaching configuration.

With the traditional model, a pet server is configured in an ongoing fashion. Changes are applied by hand, incrementally over time.

In the cloud, we want to spin up replacement servers as quickly as possible. We do not have time to wait for a human to configure the new machines. So we must write the configuration out ahead of time. In this way, each server in our cluster is configured identically.

To achieve this, we use a configuration management tool. And our configuration goes from being a live, adaptable thing to something that is frozen in time. Each time we make a substantial change, we must redeploy our entire cluster from the new snapshot.

Our task then is to trick a legacy app into using a frozen configuration…

Functions, Threads, and Processes. What’s Next? Cows

What’s so bad about writing out information to the file system anyway?

There are lots of ways to handle shared state in computer programs. If your app is using threads, you can use thread-safe memory. If you’re using the multiprocess model, you can use any sort of Inter-Process Communication (IPC) such as shared pipes, the file system, or a database.

But what about a cluster of app instances? Clustering your app across many machines is a bit like forking a process. Except each process is running on physically isolated hardware. So you cannot use memory or IPC methods. You have to communicate state across the network.

What does this mean for your legacy app? It means that any attempt to write state to the file system is going to need to be replaced with a network call instead. If you don’t make this change, then state isn’t shared, and your app is going to break.
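To make that concrete, here is a minimal Python sketch of the difference. The class names (LocalFileState, SharedState) are made up for illustration, and a plain dict stands in for the networked data store, which in a real deployment would be a database or cache reached over the network.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional, Tuple


class LocalFileState:
    """Legacy style: state lives in files on this server's own disk."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def set(self, key: str, value: str) -> None:
        (self.root / key).write_text(value)

    def get(self, key: str) -> Optional[str]:
        path = self.root / key
        return path.read_text() if path.exists() else None


class SharedState:
    """Cloud style: every server talks to the same store.

    ASSUMPTION: the dict is a stand-in for a real networked data store.
    """

    def __init__(self, backend: Dict[str, str]) -> None:
        self.backend = backend

    def set(self, key: str, value: str) -> None:
        self.backend[key] = value

    def get(self, key: str) -> Optional[str]:
        return self.backend.get(key)


def demo() -> Tuple[Optional[str], Optional[str]]:
    # Two "servers", each with its own isolated disk.
    with tempfile.TemporaryDirectory() as disk_a, tempfile.TemporaryDirectory() as disk_b:
        server_a = LocalFileState(Path(disk_a))
        server_b = LocalFileState(Path(disk_b))
        server_a.set("theme", "dark")
        local_view = server_b.get("theme")  # server B never sees A's file

    # The same two servers, now pointed at one shared store.
    backend: Dict[str, str] = {}
    shared_a = SharedState(backend)
    shared_b = SharedState(backend)
    shared_a.set("theme", "dark")
    shared_view = shared_b.get("theme")  # server B sees A's write

    return local_view, shared_view
```

Calling demo() shows the break: the value written to one server's disk is invisible to the other, while the value written to the shared store is visible to both.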

Share Nothing, Scale Everything

Is it really that bad? Can’t we use the file system for something?

You can, but beware the perils of putting your file system into the cloud. For example: many legacy apps expect to be able to modify their own files. But if you allow this, and then trigger a deploy from your Git repository, you will lose those modifications!

So what approaches can we take? We could take the stateful approach and just run the app on a single server. We can still trigger deploys, but we have to babysit the file system. And as we’ve already explained, that means we miss out on most of what the cloud has to offer.

We could also use a distributed file system, but then we hit another problem: we have to sacrifice something, whether that’s consistency, predictability, or performance. And if our app isn’t written with these considerations in mind, we’re going to run into trouble.

The best approach is to adopt a full shared-nothing architecture. That means using the file system, but only for temporary local state. Everything else must be written to a shared data store.
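As a rough sketch of that rule in Python: scratch work happens in a temporary directory that vanishes when the work is done, and only the finished result goes to the shared store. The function name handle_upload, the "uploads/" key prefix, and the dict standing in for the shared data store are all assumptions made for this example.

```python
import tempfile
from pathlib import Path
from typing import MutableMapping


def handle_upload(name: str, data: bytes, shared_store: MutableMapping[str, bytes]) -> str:
    """Process an upload using local scratch space, then persist it to the shared store."""
    # Temporary local state: fine to keep on this server's disk,
    # because nothing depends on it once the request is over.
    with tempfile.TemporaryDirectory() as scratch:
        work_file = Path(scratch) / name
        work_file.write_bytes(data)
        # Placeholder processing step: normalise line endings.
        processed = work_file.read_bytes().replace(b"\r\n", b"\n")

    # Permanent state: written to the shared store, never to the local disk.
    # ASSUMPTION: shared_store is a stand-in for S3, a database, or similar.
    key = f"uploads/{name}"
    shared_store[key] = processed
    return key
```

For example, handle_upload("notes.txt", b"a\r\nb\r\n", store) leaves nothing behind on the local disk, and the processed bytes end up under "uploads/notes.txt" in the store.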

Putting Your Apps On Ice

[Image: an app in a block of ice]

So where does all this leave us?

Legacy apps are not compatible with the cloud. They were created in cloud pre-history, and progress has left them behind. Thankfully you don’t have to rewrite them from scratch: you can fool these stateful apps into running in a stateless environment.

We do this by “freezing” the local state.

Start by running a copy of the app on your local machine. Then, use the standard administration console to configure the app. Do this as if you were configuring it in production. You can install plugins, make configuration adjustments, and so on. In the background, the app is writing things out to your file system, because that’s how it expects to operate.

But here’s the clever part: once you’ve finished making your changes, you can simply commit the files to your version control system. Doing so effectively freezes the state.

Once you’ve frozen the state, you can deploy it anywhere you like—to one server, or to a thousand servers. The choice is yours.

Danger, Will Robinson!

There are three caveats to this approach:

  1. All configuration must be done in dev. But if you change a config setting that is kept in a database, you must make the same change in production, after you deploy.

  2. Once you’ve deployed your frozen app, admins must not make configuration changes that result in file system changes (for example, installing plugins). You may even be able to enforce this by locking down parts of the admin dashboard in production.

  3. Any part of your app that expects to be able to write out permanent files in production (for example, an image upload script) will need to be replaced. There is no way around that. Fortunately, in most cases, an S3 plugin or similar makes an easy drop-in replacement.

This is likely an unusual workflow for the admins of your legacy app, and there may be no way to enforce it. So before you make somebody an admin, it is a good idea to explain the necessary upgrade and deployment steps to them.
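Where enforcement isn’t possible, detection might still be. One option, sketched here in Python and not something the approach itself requires, is to record a checksum for every file at freeze time and compare it against the live file system later. The function names build_manifest and find_drift are hypothetical, and a real check would probably want to skip directories such as .git and any temporary folders.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def find_drift(frozen: Dict[str, str], live: Dict[str, str]) -> List[str]:
    """Return the sorted paths that were added, removed, or modified since the freeze."""
    all_paths = frozen.keys() | live.keys()
    return sorted(p for p in all_paths if frozen.get(p) != live.get(p))
```

The idea is to call build_manifest on the frozen tree when you commit, ship the result alongside the app, and have each server run find_drift against its own files. An empty list means the server still matches the frozen state; anything else names the files an admin has changed.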

Conclusion

Working with the cloud can be unintuitive at times. But these unfamiliar concepts provide us with enormous power: fault tolerance, high availability, and flexible scaling, to name just a few. Moreover, what is unfamiliar soon becomes familiar with practice. And once the principle of the stateless environment is learned, it can be applied over and over again.

To get better acquainted with this strange new world, we’ll dive right in and look at some specific, real-world apps. Starting with WordPress, we’ll walk you through a full cloud setup, incorporating all of the techniques taught in the present article and more.

Have you been tasked with deploying a legacy app to the cloud? What sorts of changes did you have to make? Let us know in the comments below!

About Noah Slater

Noah Slater is a Briton in Berlin who’s been involved with open source since 1999. They’ve contributed to Debian, GNU, and the Free Software Foundation. They currently serve as a member of the Apache Software Foundation. Their principal project is Apache CouchDB, the document database that kicked off the NoSQL movement. They also help out in the Apache Incubator, where they mentor new projects in the ways of community and open source.