September 17th 2013

TAGS:
PaaS

Using a Platform-as-a-Service vs. Doing it Yourself

Here at Engine Yard, we’re often asked “What’s a PaaS?” or “Can’t I just install Ruby (or Node.js, PHP, MySQL, PostgreSQL, etc.), deploy my application and monitor my systems myself?” You certainly can. Thousands of companies do their own “devops” today, and that pattern works for them. But because we are asked these questions so frequently, we decided to sit down and go through the pros and cons of both approaches.

Where a PaaS really shines is when a company doesn’t have the developer resources, in-house expertise, or contractor budget to properly manage its production infrastructure. A PaaS allows a development team of any size to focus on its application instead of its infrastructure, making the team more productive and providing more “bang for the buck” for the development dollars spent. Which would you rather do as a developer: write code, or get tied down in several days of yak shaving while building a new production cluster? And what if that cluster has a hardware failure at 4 AM: how do you feel about being “on call”?

In our eBook, our own J. Austin Hughey explains what it takes to build, and later monitor, support and manage, a production web application cluster at medium scale, and then contrasts that with the equivalent steps when using a PaaS.

Check out some highlights below, and read our full eBook here:

Building a Cluster

DO IT YOURSELF

1. Create a security group and manage SSH keys for you and your staff.
2. Configure the security group to allow traffic on ports 443, 80, and 22 (SSH).
3. Choose a virtual machine size and boot from an AMI.
4. Update the package manager.
5. Upgrade all available base packages to their latest versions to avoid known security vulnerabilities.
6. Install Ruby and RubyGems, probably from source: some distributions lag behind the latest available patch level, and you should keep Ruby at the latest patch level to avoid commonly known security vulnerabilities.
7. Create another server in your security group for your database.
8. Configure security groups to allow access to the database server on its running port only from other members of the same security group, and on port 22 (SSH). Deny all other access on all other ports.
9. Install and configure your database of choice.
10. Install and configure your application server of choice (Thin, Passenger, Unicorn, Puma, etc.) on your application instance.
11. Install and configure a front-end web server, e.g. nginx or Apache, and integrate it with your application server (for example, tell nginx about the Unix socket that Unicorn listens on; Passenger is much easier but may not be appropriate for all apps).
12. If using Passenger, configure its workers for global queue, and configure “always on” workers for single apps that need constant 24/7/365 performance.
13. Set up your deployment tools of choice (e.g. Capistrano). Remember to have them clean up after themselves; leaving old releases around will slowly but surely eat up all your disk space!
14. Configure your application to look at your single database master.
15. Deploy code.
16. Start your front-end and back-end application servers.
17. Test your application's functionality.
18. Deploy your code again to test that your deploys work as expected.
19. Test the application again to be sure that deploys didn't break anything.
20. Obtain an Elastic IP address (EIP) and attach it to the primary (or only) application instance you have running.
21. Tweak DNS entries to point your domain to that IP address.

Time Estimate: 2-4 days

ENGINE YARD

1. Upload your SSH public key through a web-based GUI.
2. Create an “application” in the dashboard. Specify the git URI for your application's source code.
3. Use the dashboard to create an environment and designate your public key as having access.
4. Select the instance size you want for your application, database and utility servers, and how many of each you want, all on one screen.
5. Click “Boot”.
6. Click “Deploy”.
7. Verify the app is functioning as expected.
8. Adjust the DNS to point to the EIP already issued and wait for propagation.
9. High five everyone in the office.

Time Estimate: 45 minutes - 1 hour
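
To make the do-it-yourself security group steps a little more concrete, here is a minimal sketch in plain Python. It only models the rules from the list above (443, 80 and 22 open, the database port reachable only from members of the same group, everything else denied); it does not call any cloud provider API. The group name `app-cluster`, the `Rule` type, the `is_allowed` helper and the choice of PostgreSQL's default port 5432 are all our assumptions for illustration.

```python
from dataclasses import dataclass

ANYWHERE = "0.0.0.0/0"   # any source address
GROUP = "app-cluster"    # hypothetical security group name
DB_PORT = 5432           # assumption: PostgreSQL on its default port


@dataclass(frozen=True)
class Rule:
    port: int
    source: str  # ANYWHERE, or the name of a security group


# Rules taken from the do-it-yourself list: web and SSH ports are open,
# the database port is reachable only from members of the same group.
RULES = [
    Rule(port=443, source=ANYWHERE),
    Rule(port=80, source=ANYWHERE),
    Rule(port=22, source=ANYWHERE),
    Rule(port=DB_PORT, source=GROUP),
]


def is_allowed(rules, port, source_group=None):
    """Return True if any rule admits traffic to `port`.

    `source_group` is the security group the caller belongs to,
    or None for traffic from outside the cluster.
    """
    for rule in rules:
        if rule.port != port:
            continue
        if rule.source == ANYWHERE or rule.source == source_group:
            return True
    return False  # default deny: no matching rule means no access


if __name__ == "__main__":
    print(is_allowed(RULES, 443))                         # public HTTPS
    print(is_allowed(RULES, DB_PORT))                     # outsider to database
    print(is_allowed(RULES, DB_PORT, source_group=GROUP)) # app server to database
```

Writing the rules down as data like this is mostly useful as a checklist: every port you open is one line, and anything not listed is denied.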

When using any form of automation, you generally have less flexibility and control. Engine Yard addresses this with custom Chef recipes, which let you exercise nearly complete control over your cluster configuration. Once written, properly tested and in use, custom Chef recipes allow you to automatically configure any aspect of any of your clustered instances at boot time.
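
As a small example of the housekeeping the do-it-yourself list asks for, here is a rough sketch of the "clean up old releases" reminder from the deployment step. It assumes each release lives in its own directory under one parent folder and that directory names sort chronologically (for example, timestamps). The function name `prune_releases` and the default of keeping five releases are our own choices for illustration; deployment tools such as Capistrano have their own built-in mechanism for this.

```python
import shutil
from pathlib import Path


def prune_releases(releases_dir: Path, keep: int = 5) -> list[str]:
    """Delete all but the `keep` newest release directories.

    Assumes directory names sort chronologically (e.g. timestamps).
    Returns the names of the directories that were removed.
    """
    releases = sorted(p for p in releases_dir.iterdir() if p.is_dir())
    # Everything except the last `keep` entries is stale.
    stale = releases[:-keep] if keep > 0 else releases
    for path in stale:
        shutil.rmtree(path)
    return [p.name for p in stale]
```

Running something like this at the end of every deploy keeps disk usage roughly constant instead of growing with each release.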

Scaling the Application Tier

DO IT YOURSELF

1. Add another application server and repeat everything you did when standing up the first server, minus the database-related parts. That may take you an entire day, or more if you run into surprises.
2. Add an Elastic Load Balancer (ELB) or install a software-based load balancer on your primary application instance (where the EIP is attached).
   a. If you add an ELB, you have to move the EIP to the ELB, or modify DNS with the IP address of the ELB. (Sorry about the TLAs, it's just part of DIY.)
   b. If you use a software-based load balancer, you'll need to configure it with the hostname and port of the other servers in the cluster. This means you'll likely have to run the load balancer on ports 80/443, change your front-end web server to run on, for example, 81/444, and forward traffic to those ports on the other machines. This is how we do it with haproxy right now, for example.

Time Estimate: 1-2 days

ENGINE YARD

1. Go to the environment screen, click “add,” select an application instance and click “add to cluster.”
2. No, seriously, that's it. Off to the beach!

Time Estimate: 20 minutes
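
For the software load balancer option in the do-it-yourself list, the sketch below shows roughly what "configure it with the hostname and port of the other servers" can look like. It is a small Python helper that renders an haproxy-style fragment listening on port 80 and forwarding to front-end web servers on port 81. The hostnames, the section names `http-in` and `app_servers`, and the helper `render_haproxy_config` are hypothetical. The output is a fragment rather than a complete configuration: it leaves out global settings, defaults, timeouts and the 443/444 HTTPS pair.

```python
def render_haproxy_config(backends: list[str], backend_port: int = 81) -> str:
    """Render a minimal haproxy-style frontend/backend fragment.

    `backends` are the hostnames of the application servers (hypothetical
    values in the example below); each is expected to run its front-end
    web server on `backend_port`.
    """
    lines = [
        "frontend http-in",
        "    bind *:80",
        "    default_backend app_servers",
        "",
        "backend app_servers",
        "    balance roundrobin",
    ]
    for index, host in enumerate(backends, start=1):
        # "check" asks the balancer to health-check each server.
        lines.append(f"    server app{index} {host}:{backend_port} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_haproxy_config(["app1.internal", "app2.internal"]))
```

Every time you add or remove an application server, this list has to be regenerated and the load balancer reloaded, which is part of why the do-it-yourself estimate is measured in days.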

Database Backups

DO IT YOURSELF

1. Establish a database replica. At any form of scale, you probably shouldn't be executing SQL dumps or snapshots against the master, because various locks and I/O performance issues can come into play, so your snapshot or SQL dump should be taken from the replica.
2. Create an automated script in bash, Ruby, Python, etc. to run a SQL dump for your database of choice (e.g. mysqldump) and then ship the dumped file off to a remote storage location, Amazon S3 for example.
3. Test the script to ensure that it works.
4. Create a cron job entry to execute that script at your lowest-traffic time(s), maybe once or twice per day.
5. Configure cron to email you if a problem occurs.
6. Create some form of automation to purge old SQL dumps from S3 (or your storage provider of choice) and ensure that it runs at regular intervals.
7. Test all of the above thoroughly to be sure that it works as expected.

Time Estimate: 1-2 days

ENGINE YARD

1. By default, this is already done. All you have to do is tell Engine Yard how many backups to keep, and how long to keep them. That's it.
2. Time to kick back and relax.

Time Estimate: None. Built into the product.
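
To give a feel for the do-it-yourself backup script, here is a minimal Python sketch of three of its pieces: naming a dump file, building the dump command to run against the replica, and deciding which old dumps to purge. The helper names, the hostname `db-replica.internal` and the file naming scheme are our assumptions. Actually running the command, uploading to S3, scheduling with cron and email alerts are all left out.

```python
from datetime import datetime, timezone


def dump_filename(database: str, now: datetime) -> str:
    """Name dumps with a sortable UTC timestamp, e.g. app-20130917T040000Z.sql."""
    return f"{database}-{now:%Y%m%dT%H%M%SZ}.sql"


def dump_command(replica_host: str, database: str) -> list[str]:
    """Build a mysqldump invocation that targets the replica, not the master.

    The caller is expected to run this (e.g. with subprocess) and redirect
    stdout to the file named by dump_filename().
    """
    return ["mysqldump", "--single-transaction", f"--host={replica_host}", database]


def dumps_to_purge(filenames: list[str], keep: int) -> list[str]:
    """Return the dump files to delete so only the `keep` newest remain.

    Relies on the timestamped names sorting chronologically.
    """
    newest_first = sorted(filenames, reverse=True)
    return sorted(newest_first[keep:])


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(dump_filename("app", now))
    print(" ".join(dump_command("db-replica.internal", "app")))
```

Even this small sketch has three places to get wrong, which is why the list above asks you to test the script on its own and then test the whole pipeline again.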

At the end of the day, for any small to medium-sized business, and for departments within the enterprise, it comes down to budget. You’re going to spend money on hardware (physical or virtual) one way or another. The question therefore becomes: is the cost of a PaaS less than, or greater than, that of a systems administrator, and do you necessarily need a full-time systems administrator for your application? In some cases you absolutely will need a dedicated systems administrator, but in most cases, a PaaS can provide agility, capabilities, and access to expertise otherwise not attainable at such a low cost.
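
For a rough sense of scale, the time estimates quoted in the three comparisons can simply be added up. The short sketch below does that, assuming the tasks are done one after another and the estimates are additive (our assumption, not a claim from the eBook). It deliberately stops at time and does not attach any salary or pricing figures.

```python
# (low, high) time estimates as quoted in each comparison above.
DIY_DAYS = {
    "Building a Cluster": (2, 4),
    "Scaling the Application Tier": (1, 2),
    "Database Backups": (1, 2),
}
ENGINE_YARD_MINUTES = {
    "Building a Cluster": (45, 60),
    "Scaling the Application Tier": (20, 20),
    "Database Backups": (0, 0),  # built into the product
}


def total(estimates):
    """Sum the low ends and the high ends of a set of (low, high) estimates."""
    low = sum(lo for lo, _ in estimates.values())
    high = sum(hi for _, hi in estimates.values())
    return low, high


if __name__ == "__main__":
    print("DIY, days:", total(DIY_DAYS))
    print("Engine Yard, minutes:", total(ENGINE_YARD_MINUTES))
```

Under those assumptions the do-it-yourself path comes to 4-8 days against 65-80 minutes on the PaaS.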

Share your thoughts with @engineyard on Twitter