High-Availability: Yes You Can!

Over the past few months we’ve noticed that some cloud terminology has been confusing for customers, so this is an Engine Yard PSA to help people understand how to achieve a more highly available (HA) environment.

While the public perception is that servers in the cloud are insulated from hardware problems, we often see ‘hardware decommission’ and ‘hardware failure’ listed as the reason a customer instance was terminated. Other issues also make constant uptime impossible to guarantee for any environment (this summer’s datacenter power failure is one example). If your business needs to be available at all times and better protected from IaaS outages, consider using Engine Yard’s platform to minimize downtime by spreading your application and database instances across multiple datacenters (“availability zones” in AWS parlance).

You can prepare for an outage by making sure that you have application and database slaves, and that those slaves are in different availability zones than the database master or application master instance.

AWS, our IaaS provider for Engine Yard Cloud, has both regions* (e.g. US-East-1, US-West-1, EU-West-1) and Availability Zones** (AZs) within each region (e.g. US-East-1 A, B, C, D, and E).

At this time, AWS outages have only occurred in isolated physical regions, and often in a single Availability Zone within a region. For this reason, we strongly recommend having your master and slave instances in different Availability Zones. For example, if your database master is in US-East-1 A, you should not have the corresponding slave in that AZ, but instead have it in B, C, D, or E.
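As a rough illustration of the rule above, here is a minimal Python sketch that flags slaves sharing an AZ with their master. The instance records, field names (`name`, `role`, `kind`, `zone`), and the `colocated_replicas` helper are all hypothetical and made up for this example; they are not part of any Engine Yard or AWS API, and you would need to fill in the list from your own environment.

```python
def colocated_replicas(instances):
    """Return (role, name, zone) for every slave that shares an AZ with its master.

    `instances` is a list of dicts with hypothetical keys:
    name, role ("db" or "app"), kind ("master" or "slave"), and zone.
    """
    # Record which zone each role's master lives in.
    master_zone = {}
    for inst in instances:
        if inst["kind"] == "master":
            master_zone[inst["role"]] = inst["zone"]

    # A slave is a problem if it sits in the same zone as its role's master.
    problems = []
    for inst in instances:
        if inst["kind"] == "slave" and master_zone.get(inst["role"]) == inst["zone"]:
            problems.append((inst["role"], inst["name"], inst["zone"]))
    return problems


# Example data only: names and zones are invented for illustration.
environment = [
    {"name": "db-master", "role": "db", "kind": "master", "zone": "us-east-1a"},
    {"name": "db-slave-1", "role": "db", "kind": "slave", "zone": "us-east-1a"},
    {"name": "db-slave-2", "role": "db", "kind": "slave", "zone": "us-east-1c"},
    {"name": "app-master", "role": "app", "kind": "master", "zone": "us-east-1b"},
    {"name": "app-1", "role": "app", "kind": "slave", "zone": "us-east-1d"},
]

print(colocated_replicas(environment))
```

With the sample data, only `db-slave-1` is reported, because it sits in the same AZ as the database master.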

Engine Yard currently provides logic that attempts to spread masters and slaves across availability zones programmatically, but we also allow customers to choose AZs themselves, and we have found that some customers are concentrating too much of a single cluster in the same AZ.
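To make the idea of spreading concrete, here is one simple way such placement could work, sketched in Python. This is not Engine Yard's actual logic, which is not described here; the `assign_zones` function and the round-robin strategy are assumptions chosen purely for illustration.

```python
from itertools import cycle


def assign_zones(master_zone, replica_names, zones):
    """Assign each replica a zone, round-robin, skipping the master's zone.

    Hypothetical helper for illustration; not an Engine Yard API.
    """
    # Never place a replica in the same AZ as its master.
    candidates = [z for z in zones if z != master_zone]
    if not candidates:
        raise ValueError("need at least one zone other than the master's")
    # zip stops when the replica names run out, so cycling is safe.
    return dict(zip(replica_names, cycle(candidates)))


placement = assign_zones(
    "us-east-1a",
    ["db-slave-1", "db-slave-2", "db-slave-3"],
    ["us-east-1a", "us-east-1b", "us-east-1c"],
)
print(placement)
```

In this example the three slaves land in `us-east-1b`, `us-east-1c`, and `us-east-1b` again, so no slave shares the master's AZ and the slaves themselves are split across the remaining zones.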

Is your database a single point of failure because you do not currently have a database replica? Learn more about database replication.

Does your application master instance lack redundancy? If you add another application instance you’ll have redundancy and can take advantage of our takeover mechanism.

Are any of your slave instances in the same availability zone as their database master or application master? We strongly recommend against this, because an outage in that one AZ would take out the master and its slave together. Read more about how to configure a high availability environment.

Some of these fixes are free, and some aren’t, but all of them are valuable in safeguarding your production environment against outages.

If you need help setting up your environment, we can engage our Professional Services team, who can work with you to develop a high availability solution in Engine Yard Cloud.

If you need technical help or want to open a Professional Services request, just open a ticket! And if you want to learn more, check out our newest screencast on HA!

Thanks for being an Engine Yard customer!

* Read more about multi-region use here.

** Read up about availability zones here.