The April 21 Amazon Disruption: Wrapping Up

On Friday, Amazon posted a detailed summary of the events and causes behind the April 21 service disruption. In short, a number of Elastic Block Store (“EBS”) volumes in one of the US East Availability Zones became unable to service read and write operations. Amazon EC2 compute instances (which are the basis of Engine Yard AppCloud instances) that tried to access the affected EBS volumes during the disruption became sluggish or unresponsive.

A few circumstances helped reduce the impact of this disruption for Engine Yard customers. We were able to move applications to regions outside of US East, taking advantage of our new multi-region capability. Single-tenant architecture helped as well: each application runs on its own instance(s), with no shared Engine Yard component sitting in the request-response path to act as a potential single point of failure. In many ways we were fortunate that this particular problem played to Engine Yard's strengths. We believe the underlying PaaS architecture should be robust and resilient, and we are committed to that philosophy. Although we are pleased that our service held up well, we are thoroughly reviewing the events leading up to and during the outage and evaluating additional safeguards to mitigate similar occurrences in the future.

As a final note, thank you to our customers for your perseverance and willingness to work with us toward a solution, and for your kind words of appreciation! We apologize for any inconvenience and for any instance where we could have helped more quickly or effectively. In addition, we will pass along the service credit Amazon discusses in the summary linked above. Affected customers will be contacted shortly regarding this credit.

We look forward to continuing to work with you to learn, improve, and provide an enterprise-grade Platform as a Service with the best quality and reliability in the industry.