CloudWatch Metrics on Engine Yard
Since AWS introduced T2 instance types, our customers have wanted a way to see their CPU credits. Monitoring this continuously changing metric outside of our own platform presented a lot of challenges for us. Luckily, AWS handles the brunt of this load through its CloudWatch service.
What is CloudWatch exactly? It’s Amazon’s monitoring solution for cloud resources. For EC2, you can track metrics on your instances and create alarms that act on changes to them. Late last year, we added support for a few of these metrics to all of our existing instances. For all EC2 instances, we added support for what AWS refers to as the “Instance Status Checks”: the system status check and the instance status check.
These status checks monitor whether or not the instance is actually up. The system status check monitors the underlying host, and the instance status check monitors the instance itself. This way, we know almost instantly if an instance is no longer responding, and our support team can act accordingly.
For T2 instances, we added a third check that monitors the CPU credit balance and alerts us when it gets low enough to start causing performance issues for our customers. Our support team receives these notifications and can provide guidance on how you should proceed. This could involve adding more instances, or moving to larger instances that provide the required compute resources.
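To make the three checks concrete, here is a minimal sketch of how such alarms could be described for CloudWatch's PutMetricAlarm call. It is an illustration, not the configuration we actually run: the helper names, the alarm naming scheme, the evaluation periods, and the low-credit threshold of 20 are all assumptions. The metric names (StatusCheckFailed_System, StatusCheckFailed_Instance, CPUCreditBalance) are the ones AWS publishes in the AWS/EC2 namespace.

```python
def build_alarm(instance_id, metric_name, statistic, comparison, threshold,
                period=60, evaluation_periods=2):
    """Return keyword arguments suitable for CloudWatch's PutMetricAlarm."""
    return {
        "AlarmName": f"{instance_id}-{metric_name}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": statistic,
        "Period": period,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,
        "ComparisonOperator": comparison,
    }


def alarms_for(instance_id, is_t2, low_credit_threshold=20.0):
    """Two status-check alarms for every instance, plus a credit alarm for T2."""
    alarms = [
        # Fires when the underlying host fails its check.
        build_alarm(instance_id, "StatusCheckFailed_System", "Maximum",
                    "GreaterThanOrEqualToThreshold", 1.0),
        # Fires when the instance itself fails its check.
        build_alarm(instance_id, "StatusCheckFailed_Instance", "Maximum",
                    "GreaterThanOrEqualToThreshold", 1.0),
    ]
    if is_t2:
        # Credit metrics are reported every five minutes, hence period=300.
        # The threshold is an assumed value chosen only for illustration.
        alarms.append(build_alarm(instance_id, "CPUCreditBalance", "Minimum",
                                  "LessThanThreshold", low_credit_threshold,
                                  period=300))
    return alarms


# Possible usage with boto3 (requires AWS credentials, so left commented out):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   for alarm in alarms_for("i-0123456789abcdef0", is_t2=True):
#       cloudwatch.put_metric_alarm(**alarm)
```

The sketch leaves out alarm actions, so these alarms would only change state. Routing notifications to a support team would need something like an SNS topic attached to each alarm.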
So, we added the CloudWatch metrics and associated alarms with them, and our support team can see them and act on them. But none of this addressed the original question: how do we show this information to the customer?
Amazon has a concept of credential management called federation, which allows an account owner to set up a policy that can be administered without providing credentials to the end user. We created a federation policy to provide access to these CloudWatch metrics. This policy provides secure access to the CloudWatch dashboard and all of the metrics we have set up.
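As a rough illustration of what a federation policy of this kind might look like, the sketch below builds a read-only IAM policy document for CloudWatch. The exact policy we use is not reproduced here. The function name and the choice of actions are assumptions, although the three actions listed are real CloudWatch permissions.

```python
import json


def read_only_cloudwatch_policy():
    """Return a JSON policy document granting read-only CloudWatch access.

    The action list is an assumption chosen for illustration only.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "cloudwatch:GetMetricStatistics",
                    "cloudwatch:ListMetrics",
                    "cloudwatch:DescribeAlarms",
                ],
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy)


# Possible usage with boto3 (requires AWS credentials, so left commented out).
# The federated user name "customer-metrics" is hypothetical:
#   import boto3
#   sts = boto3.client("sts")
#   token = sts.get_federation_token(
#       Name="customer-metrics",
#       Policy=read_only_cloudwatch_policy(),
#       DurationSeconds=3600,
#   )
```

The temporary credentials returned by such a call are limited to what the attached policy allows, which is how the end user gets access without ever holding the account owner's credentials.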
If you have a T2 instance, you will see the CPU Credit Balance metric on the CloudWatch dashboard.
This will give you a graph where you can see your CPU Credit Balance and how it has changed over a number of hours. With this information, you should be able to make informed decisions about your T2 instances.
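The same data behind the graph can also be read programmatically. The sketch below is one possible way to do that, assuming you have credentials that are allowed to call CloudWatch's GetMetricStatistics. The helper names and the six-hour default window are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def credit_balance_query(instance_id, hours=6, now=None):
    """Return keyword arguments for CloudWatch's GetMetricStatistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUCreditBalance",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # credit metrics are reported every five minutes
        "Statistics": ["Average"],
    }


def balance_trend(datapoints):
    """Return (first, last, lowest) balance, or None if there is no data.

    Datapoints are not guaranteed to arrive in time order, so sort them first.
    """
    if not datapoints:
        return None
    ordered = sorted(datapoints, key=lambda point: point["Timestamp"])
    values = [point["Average"] for point in ordered]
    return values[0], values[-1], min(values)


# Possible usage with boto3 (requires AWS credentials, so left commented out):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   response = cloudwatch.get_metric_statistics(
#       **credit_balance_query("i-0123456789abcdef0"))
#   print(balance_trend(response["Datapoints"]))
```

A last value well below the first suggests the instance is spending credits faster than it earns them, which is the situation where adding or resizing instances is worth considering.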
To access this feature, use the link on your environment dashboard.
We are excited to deliver this feature to our customers and we would love to hear from you. Feel free to reach out to us if you have any input!