High I/O on Engine Yard Cloud
The infrastructure options available to Engine Yard Cloud customers are constantly expanding. Not long ago, Amazon Web Services released a new High I/O instance type, in limited availability, with staggering performance gains over the existing instance types.
Today we’re happy to announce availability of the hi1.4xlarge instance type as an Early Access feature for all interested customers who want to take one (or more) of these for a spin.
What is a High I/O Instance?
High I/O instances are the most performant instances in terms of CPU and disk on the AWS Elastic Compute Cloud (EC2). These instances sit on a 10 Gigabit Ethernet network, providing unmatched performance in terms of latency and throughput. The main difference between this instance type and all previous ones is the ephemeral storage.
Storage Differences
Amazon EC2 has a number of different storage options, each with varying use cases. Here at Engine Yard, we take advantage of Elastic Block Store (EBS) volumes for application and database storage and we use S3 for backups. Additionally, we take snapshots of each of our customers’ EBS volumes at configurable time intervals. We currently do not utilize ephemeral storage for any platform features.
The hi1.4xlarge instance type is making a splash in the cloud computing community as it is the first to offer solid-state drives (SSDs) instead of the more traditional spinning disk drives for ephemeral storage. While the data on these drives remains volatile, many of our customers have expressed interest in taking advantage of the performance gains that come with this new disk architecture.
The Bottom Line
The High I/O instance type is not for everyone. Its best use case on Engine Yard Cloud is deploying high-performance, clustered databases.
The High I/O instance type comes with two terabytes of ephemeral local SSD storage. Databases such as Riak, Cassandra, and MongoDB can take advantage of super fast reads and writes while avoiding data loss in clustered environments.
The key word is ephemeral: if your instance dies, so does your data. This is why we recommend local SSDs only in clustered database setups (with three or more nodes) or in applications where data can safely be lost.
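To make the "three or more nodes" advice concrete, here is a minimal sketch of why replication across instances protects data that lives on ephemeral disks. It is a toy model, not how Riak, Cassandra, or MongoDB actually place replicas: the node names, the keys, the byte-sum hash, and the helper functions `replica_nodes` and `survives` are all assumptions made up for illustration.

```python
def replica_nodes(key, nodes, replication_factor):
    """Return the set of nodes holding a copy of `key`.

    Toy placement (an assumption, not any real database's algorithm):
    hash the key to a starting index, then take the next
    `replication_factor` nodes, wrapping around the list.
    """
    start = sum(key.encode()) % len(nodes)  # deterministic toy hash
    return {nodes[(start + i) % len(nodes)] for i in range(replication_factor)}


def survives(keys, nodes, replication_factor, failed):
    """True if every key still has at least one copy on a node that did not fail."""
    lost = set(failed)
    return all(replica_nodes(k, nodes, replication_factor) - lost for k in keys)


if __name__ == "__main__":
    # Hypothetical three-node cluster, each node on its own ephemeral SSDs.
    nodes = ["node-a", "node-b", "node-c"]
    keys = ["user:1", "user:2", "user:3"]

    for rf in (1, 3):
        ok = survives(keys, nodes, rf, failed=["node-a"])
        print(f"replication_factor={rf}, node-a lost, all data survives: {ok}")
```

With a single copy per key, losing one instance loses whatever lived on its local disk. With a copy on each of three nodes, the cluster still holds every key after an instance dies, and the replacement node can be repopulated from its peers.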
Enabling The Feature
We have created an early access feature for any customer wanting to try out the High I/O Quadruple Extra Large instance type. If you’re interested, please open up a support ticket and request access.
Note: At this time, it may be difficult to provision an hi1.4xlarge instance due to AWS’s limited supply. Further, AWS currently limits each account to two of these instances per region. Please open a ticket if you need more. This instance type is only available to environments that are created in the Eastern U.S. and Europe. We will make it available in other regions as Amazon adds capacity there.
What about MySQL and PostgreSQL?
Using ephemeral storage for data that you want to keep is risky business. Databases such as MySQL and PostgreSQL are simply not built to withstand sudden resource loss. We prevent data loss by using EBS devices to separate instance volatility from data volatility.
If, for example, an instance experiences a power failure, the data on the EBS volume persists and can be snapshotted and cloned, or reattached to a new instance. While it’s true that we could take database dumps and store them on S3 (as we do now), losing an instance could still mean losing hours of data unless one or more replication slaves are in place.
The use of ephemeral storage in place of EBS volumes for Relational Database Management System (RDBMS) data stores is still under internal discussion, and we need your feedback. If you have an opinion on the matter, head over to our docs and comment on the feature request.
What’s Next?
That said, we still feel that RDBMSs are a perfect solution for 99% of the web applications that are deployed on Engine Yard Cloud. This is why we are working to offer performance gains using Provisioned IOPS EBS Volumes and EBS-Optimized instances. The use of these two new offerings from AWS will afford better performance without the risk of volatility.
Share your thoughts with @engineyard on Twitter