Losing the Forest for the Trees

Web developers have to be knowledgeable about a handful of different technologies. At the very least, these include Linux, HTML, JavaScript, CSS, SQL, a language like Ruby, and a framework like Rails.

Yet true mastery happens when each piece is understood both individually and relative to the whole. In science and technology, this is called systems thinking; in a web application, its payoff is measured in performance, especially under load.

In managing the Professional Services team at Engine Yard, I have seen a variety of problems where the forest was lost for the trees. In almost every case, the solution meant taking a step back and looking at the interrelationships of an application’s elements.

Here are some ways to explore those relationships and, hopefully, understand the overall health of your application.

Memory allocation and garbage collection
Web applications allocate memory on an as-needed basis and release it through garbage collection (GC). In most dynamic languages, the garbage collector freezes all other activity while it cleans out the heap. To collect metrics on the health of your GC, consider tools such as ltrace and rbtrace, and check out Joe Damato’s presentation on Garbage Collection and the Ruby Heap.

In Ruby Enterprise Edition (REE) and JRuby, you can easily tune the garbage collector in accordance with the Goldilocks principle: the GC should not run too often, but it should run often enough that memory consumption does not become a problem.
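
Before tuning anything, it helps to know how often the collector actually runs. The following is a minimal sketch, not a full profiling setup: it uses Ruby's built-in `GC.count` to count collections during a block of work. The helper name `gc_runs_during` and the string-allocating workload are illustrative assumptions, not part of any tool mentioned above.

```ruby
# Hypothetical helper: count how many GC runs occur while a block executes.
def gc_runs_during
  before = GC.count          # total GC runs since the process started
  yield
  GC.count - before          # runs triggered during the block
end

# Illustrative workload: allocate many short-lived strings.
runs = gc_runs_during do
  200_000.times { "x" * 100 }
end

puts "GC ran #{runs} time(s) during the workload"
```

If that number climbs sharply for a single request, the collector is probably running too often; if it stays near zero while memory grows, it may not be running often enough.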

Kirk Haines wrote a great blog post on memory allocation here. You can read more about performance tuning in REE here, or JRuby over here.

Background processes
Start with metric collection: use tools like RPM or memprof to find your most expensive actions. Often, these actions can be moved to a background worker that runs periodically and uses a separate GC. Engine Yard officially supports background processes through both Delayed Job and Resque, but there are a handful of alternatives, both in Ruby and in other languages.
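
The shape of that move can be sketched in plain Ruby. The job class below follows the Resque convention of a `@queue` name and a class-level `perform` method, but `ReportJob` and `TinyQueue` are hypothetical: `TinyQueue` is an in-memory stand-in for a real queue backend so the example runs on its own. In production, the queue would live outside the web process and a separate worker process would drain it.

```ruby
# A Resque-style job: a queue name plus a class-level perform method.
class ReportJob
  @queue = :reports

  def self.perform(user_id)
    # Stand-in for an expensive action (hypothetical).
    "report for user #{user_id}"
  end
end

# Hypothetical in-memory stand-in for a real queue backend.
class TinyQueue
  def initialize
    @jobs = []
  end

  # Record the job class and its arguments, then return immediately.
  def enqueue(job_class, *args)
    @jobs << [job_class, args]
    @jobs.size
  end

  # What a worker would do: take jobs off the queue and run them in order.
  def work_off
    results = []
    until @jobs.empty?
      job_class, args = @jobs.shift
      results << job_class.perform(*args)
    end
    results
  end
end

queue = TinyQueue.new
queue.enqueue(ReportJob, 42)   # the web request would return here
results = queue.work_off       # later, the worker does the expensive part
puts results.inspect
```

The point of the design is that the request only pays for `enqueue`; the cost of `perform`, including the memory it allocates, lands on the worker.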

Copy-On-Write Web Servers
If you cannot move an expensive action to a background process, you can also save memory by forking a process using Copy-On-Write (COW). Both Passenger and Unicorn web servers support this.

COW allows you to save memory by forking a master process without allocating an equivalent amount of memory. Basically, the forked child gets enough memory to run the application, but shares code and libraries.
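
The behavior can be illustrated with a bare `fork`, assuming a Unix-like system where `fork` is available. This is only a sketch of the semantics: the page sharing itself is done by the operating system and is not visible from Ruby, and how much memory is actually saved depends on how COW-friendly the interpreter's garbage collector is. The variable names are illustrative.

```ruby
# Data loaded once in the "master" process, before forking.
shared = Array.new(1_000) { |i| "item #{i}" }

reader, writer = IO.pipe

pid = fork do
  reader.close
  # The child can read the master's data without loading it again.
  seen = shared.size
  # A write gives the child its own private copy; the master is unaffected.
  shared << "child only"
  writer.puts "#{seen},#{shared.size}"
  writer.close
  exit!(0)
end

writer.close
child_report = reader.gets.chomp   # "<size seen>,<size after write>"
reader.close
Process.wait(pid)

puts "child reported #{child_report}; master still has #{shared.size} items"
```

Servers that support COW apply the same idea at a larger scale: the master loads the application once, and each forked worker copies only the memory it changes.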

You can read more about COW on the Passenger website.

There are dozens of other ways in which CPU, memory utilization, and disk I/O can suffer from a lack of systems thinking. Notably missing from this inadequate list is the topic of SQL database performance, including proper indexing and query optimization.

Like public health or the environment, systems thinking tends to be forced upon us after a problem surfaces, but it should be preventative, not reactive. As irrational creatures, we know this, but we do little to help ourselves minimize the inevitable climb in response times that comes with success (traffic).

If you work with Sinatra, Rails, or one of the other technologies that orbit around Ruby’s gravitational pull, Engine Yard Professional Services can get you on the right track by performing an architectural review or performance analysis. It is a good place to start thinking about the whole system, which is always greater than the sum of its parts.