May 8th 2012

By Sal Cardello

All About High Availability

What is a High Availability system? There are multiple opinions and definitions of high availability. Some people treat it as a synonym for Disaster Recovery; I define it as an implementation that ensures business systems spend minimal time down after a disaster.

For the purpose of this post, I think we should establish a description of Disaster Recovery and how it relates to High Availability. Disaster recovery includes the processes, policies and procedures related to preparing for recovery or continuation of technology infrastructure critical to an organization after a natural or human-induced disaster. There are 7 tiers of disaster recovery, numbered 0 through 7: tier 0 is no data loss prevention at all, or basically several single points of failure, while tier 7 is a fully automated failover system with zero to minimal data loss. Tiers 6 and 7 fall more into the classification of High Availability. Recognizing that there are plenty of discussions, opinions, and confusion regarding High Availability and Disaster Recovery, I have decided to discuss what I believe is the most logical solution for e-commerce and mission-critical cloud applications.

The use of cloud technology for applications has grown immensely over the past 3 years. This growth is due to the fact that the IT organization (or lack thereof) is left with minimal responsibility, overhead and maintenance. All of this equates to less money spent and larger margins on revenues.

Although this has its upsides, there are many pitfalls as well. Some of these are a lack of control over the infrastructure, a lack of knowledge as to what infrastructure is being used, and a lack of knowledge about how that infrastructure is maintained. It is important to remember that cloud infrastructure is subject to outages just as traditional infrastructure is. A virtualized environment is a great way to minimize costs and maximize margins, but it does not protect against outages. To offset the lack of control clients have over these systems, it is recommended that one invest in an insurance policy that minimizes downtime during a man-made or natural disaster.

The most logical and practical insurance plan to ensure one's business stays up with minimal downtime is to implement a geo-redundant High Availability system. A geo-redundant system is basically a master/slave system located in two separate geographical locations. This is very similar to what most everyone implements with their local databases: in the event of the master database failing, the slave takes over, so that the application has minimal to no data loss and minimal downtime.

Implementing this correctly requires that database replication is continuous. In our agile world we also need to ensure that all code is pushed to both locations, that the directory structures are replicated at specific intervals, and that all of the policies and procedures are in place to fail over from one geo location to the other with minimal downtime and maximum efficiency. Given the lack of control in cloud computing and the certainty that an outage will occur, the logical solution to ensure application uptime and minimal data loss is to implement a geo-redundant High Availability system.
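To make the failover policy above a little more concrete, here is a minimal sketch in Python of the kind of check such a procedure might automate. It is an illustration only, not how Engine Yard's offering works: the `SiteStatus` fields, the `should_fail_over` function, the three-failure threshold and the five-second lag limit are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SiteStatus:
    """Snapshot of the standby site (all fields are hypothetical)."""
    name: str
    healthy: bool              # did the standby pass its own health check?
    replication_lag_s: float   # seconds the slave database is behind the master
    code_version: str          # release currently deployed at the standby


def should_fail_over(primary_checks, standby, primary_code_version,
                     failure_threshold=3, max_lag_s=5.0):
    """Return (decision, reason) for switching traffic to the standby site.

    primary_checks is a list of recent health-check results for the
    primary site, oldest first (True means the check succeeded).
    """
    # Only treat the primary as down after several consecutive failures,
    # so a single dropped health check does not trigger a failover.
    recent = primary_checks[-failure_threshold:]
    primary_down = len(recent) == failure_threshold and not any(recent)
    if not primary_down:
        return False, "primary is still answering health checks"
    if not standby.healthy:
        return False, "standby site is unhealthy"
    # The same code must have been pushed to both locations.
    if standby.code_version != primary_code_version:
        return False, "standby is running different code"
    # Assumed policy: an automatic switch is refused when the replica is
    # too far behind, leaving the data-loss trade-off to a human.
    if standby.replication_lag_s > max_lag_s:
        return False, "standby replica is too far behind"
    return True, "primary is down and standby is ready"


if __name__ == "__main__":
    standby = SiteStatus("us-west", healthy=True,
                         replication_lag_s=1.2, code_version="v42")
    print(should_fail_over([True, False, False, False], standby, "v42"))
```

The point of the sketch is that failover readiness depends on more than the primary being down: replication lag and deployed code at the second location have to be checked as well, which is why the replication and deployment steps described above matter.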

We at Engine Yard recognize that clients need to run their applications in the cloud and that downtime equates to loss of revenue and, more importantly, possible loss of clients. We have now implemented this technology and are offering it to clients who need that solid insurance.

Check out the Engine Yard Professional Services page to learn more. If you have any questions about services offered for the Engine Yard PaaS, please feel free to contact [email protected].
