Improving Rails App Performance with Database Refactoring & Caching

Performance is a major concern for any application developer. The problem is that, often enough, people don’t worry about it until it starts to dip and there is cause for concern. Performance needs to be something we focus on up front, not just when the walls come tumbling down. It needs to be part of the process, not a once-in-a-while review. In this post, we take a look at a few things you can do to improve performance, starting with the database.

Database Performance

In Rails, ORMs such as ActiveRecord or DataMapper make it easy to get data out of the database and into the application, but they also make it easy to neglect optimization and the refactoring needed to keep database interaction efficient. Relying on the ORM to do all the work can lead to issues further down the road of the application lifecycle. As we add features, we need to keep refactoring our ORM calls.

In development, it’s difficult to spot what are called N+1 problems, which can occur in Rails when association loading is left unrefactored. An N+1 problem arises when one query loads a collection of N records and then a further query runs for each of those records to load an associated object. With 100 records, you end up running 101 queries instead of one or two queries that return everything. This is tough to see in development because a tiny dataset is being used, so the problem may only become apparent when moving to a production-sized database.

One way to avoid the N+1 issue is eager loading with .includes. Consider a view such as the following, which asks for each customer’s addresses one customer at a time:

# app/views/customers/index.html.erb
<% @customers.each do |customer| %>
    <%= content_tag :h1, customer.name %>
    <%= content_tag :h2, customer.addresses.first.city %>
<% end %>

# will produce 101 queries if you have a database with 100 customers

To add eager loading, we add .includes to the query in the controller:

# app/controllers/customers_controller.rb
class CustomersController < ApplicationController

    def index
        @customers = Customer.includes(:addresses).all
    end
…
…
end
# this should produce 2 queries on the same 100 customers
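To make the query arithmetic concrete, here is a minimal plain-Ruby sketch that simulates both approaches. FakeDb and its methods are hypothetical stand-ins that only count "queries"; this is not ActiveRecord, just an illustration of why lazy loading costs 1 + N queries and eager loading costs 2.

```ruby
# Plain-Ruby simulation of N+1 vs. eager loading. FakeDb is hypothetical
# and only counts "queries"; it is not ActiveRecord.
class FakeDb
  attr_reader :query_count

  def initialize(customer_total)
    @customers = (1..customer_total).map { |id| { id: id, name: "Customer #{id}" } }
    @addresses = @customers.map { |c| { customer_id: c[:id], city: "City #{c[:id]}" } }
    @query_count = 0
  end

  # One query: SELECT * FROM customers
  def all_customers
    @query_count += 1
    @customers
  end

  # One query per call: SELECT * FROM addresses WHERE customer_id = ?
  def addresses_for(customer_id)
    @query_count += 1
    @addresses.select { |a| a[:customer_id] == customer_id }
  end

  # One query: SELECT * FROM addresses WHERE customer_id IN (...)
  def addresses_for_all(customer_ids)
    @query_count += 1
    @addresses.select { |a| customer_ids.include?(a[:customer_id]) }
              .group_by { |a| a[:customer_id] }
  end
end

# Lazy loading: 1 query for customers, then 1 per customer for addresses.
def lazy_cities(db)
  db.all_customers.map { |c| db.addresses_for(c[:id]).first[:city] }
end

# Eager loading: 1 query for customers, 1 for all of their addresses.
def eager_cities(db)
  customers = db.all_customers
  by_customer = db.addresses_for_all(customers.map { |c| c[:id] })
  customers.map { |c| by_customer[c[:id]].first[:city] }
end

lazy_db = FakeDb.new(100)
lazy_cities(lazy_db)
puts "lazy:  #{lazy_db.query_count} queries"   # 101

eager_db = FakeDb.new(100)
eager_cities(eager_db)
puts "eager: #{eager_db.query_count} queries"  # 2
```

Both methods return the same list of cities; only the number of round trips to the database differs, and that difference grows with every customer added.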

If this technique is used up front, it may help prevent performance creep. If it’s left for later refactoring, it may be difficult to dig through the SQL logs to discover where the issue is occurring. This is where a tool such as New Relic might come into play to help find the issue and resolve it more quickly.

Another issue that may affect database performance, and therefore overall application performance, is slow queries. A slow query is any query that drags on the performance of the database and takes longer to process than it should. If you are using MySQL, the slow query log should help you locate some of the issues. The log can be read by running the following on the database instance (the path may differ on your setup):

  cat /db/mysql/log/slow_query.log  

Once you know where the issue is taking place, you can look into adding indexes to the troubled tables. As a rough guide, looking up a row through an index on a table with 1000 rows can be on the order of 100 times faster than scanning the same table without one. Keep in mind that, depending on your MySQL version and storage engine, the table may be locked while the index is built, so inserts can be blocked until it finishes. To add an index, use a migration such as the following:

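  # On Rails 5 and later, inherit from a versioned class, e.g. ActiveRecord::Migration[5.0]
  class AddIndexForStuff < ActiveRecord::Migration
    def change
      # index the stuff_id column of the stuff table
      add_index :stuff, :stuff_id
    end
  end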
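To get a feel for why an index helps, here is a small plain-Ruby illustration that counts comparisons for a full scan versus a lookup through a sorted list on a 1,000-row in-memory table. This is only a sketch: the constants and method names are made up for the example, and a real MySQL index is typically a B-tree rather than a sorted array, though the lookup cost grows in a similar logarithmic way.

```ruby
# Illustration only: compares a full scan with a lookup through a sorted
# "index" on a 1,000-row in-memory table. Names here are hypothetical.
ROWS = (1..1000).map { |id| { stuff_id: id, payload: "row #{id}" } }
                .shuffle(random: Random.new(42))

# Without an index: inspect rows one by one until a match is found.
def scan_lookup(rows, stuff_id)
  comparisons = 0
  row = rows.find do |r|
    comparisons += 1
    r[:stuff_id] == stuff_id
  end
  [row, comparisons]
end

# With an index: a sorted list of [key, row] pairs searched by bisection.
INDEX = ROWS.map { |r| [r[:stuff_id], r] }.sort_by(&:first)

def index_lookup(index, stuff_id)
  comparisons = 0
  entry = index.bsearch do |key, _row|
    comparisons += 1
    stuff_id <=> key
  end
  [entry && entry.last, comparisons]
end

target = ROWS.last[:stuff_id] # the last row is the worst case for the scan
_, scan_steps  = scan_lookup(ROWS, target)
_, index_steps = index_lookup(INDEX, target)
puts "scan: #{scan_steps} comparisons, index: #{index_steps} comparisons"
```

In the worst case the scan touches all 1,000 rows, while the bisection needs roughly ten comparisons, which is where a figure of about 100 times comes from.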

Caching

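Caching is storing things for repeated or future use so they do not have to be generated again. Rails makes caching easy, though the best caching is the kind that never involves the application at all. It’s possible to leverage things like Nginx to serve cached static files. (In Rails 4 and later, page and action caching live in the actionpack-page_caching and actionpack-action_caching gems.) Page caching of static files when using Rails and Nginx is as easy as the following: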

# in the controller: write a static page the first time #index is rendered
caches_page :index

# call from #create, #update, #destroy, etc. to remove the cached page
expire_page :action => :index

In order to properly serve these cached objects, it will be necessary to set up Nginx to do so. On the front-end server you can do the following (note: this example assumes you are using Unicorn as your web server. For more info on Rails web servers, see this article):

upstream upstream_enki {
    server unix:/var/run/engineyard/unicorn_enki.sock fail_timeout=0;
}

location ~ ^/(images|assets|javascripts|stylesheets)/ {
    try_files $uri $uri/index.html /last_assets/$uri /last_assets/$uri.html @app_enki;
    expires 10y;
}

location / {
    if (-f $document_root/system/maintenance.html) { return 503; }
    try_files $uri $uri/index.html @app_enki;
}

This is a standard setup for an Engine Yard account. Nginx serves static assets such as images, along with any page-cached HTML files, directly from disk, which is what makes page caching fast. If nothing matches the paths we have established, the request is passed upstream to the application, which serves what is there.

If page caching is not an option, a good choice for performance improvement is memcache. This is a common caching setup in Rails and it is fairly easy to use. Simply set your cache_store to mem_cache_store and add memcache servers as follows:

# config/environments/production.rb (inside the application's configure block)
config.cache_store = :mem_cache_store,
    "server-1:11211",
    "server-2:11211",
    "server-3:11211",
    "server-4:11211"

Rails, through its memcache client, will handle hashing cache keys across the servers so the cache is sharded between them. As in the example above, we recommend using multiple memcache servers in order to spread the load and get the expected boost.
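As a rough picture of what spreading keys across servers means, the following sketch hashes each cache key and uses the result to choose a server. It is a simplified modulo scheme written for illustration, with a made-up server_for helper and made-up key names; real memcache clients generally use consistent hashing so that adding or removing a server remaps fewer keys.

```ruby
require "zlib"

# Placeholder server names, matching the ones used in the config above.
SERVERS = %w[server-1:11211 server-2:11211 server-3:11211 server-4:11211].freeze

# Pick a server by hashing the key. Simplified modulo scheme for illustration;
# real clients generally use consistent hashing instead.
def server_for(key, servers = SERVERS)
  servers[Zlib.crc32(key) % servers.size]
end

# Hypothetical cache keys, just to show how they spread out.
keys = (1..1000).map { |i| "views/customers/#{i}" }
distribution = keys.group_by { |k| server_for(k) }.transform_values(&:size)
distribution.sort.each { |server, count| puts "#{server}: #{count} keys" }
```

The same key always lands on the same server, so a value written by one app instance can be read back by another without any coordination between them.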

Action caching is another way to improve app performance. It is similar to page caching, except that the request still passes through the Rails stack and the output of the action is kept in the cache store. The benefit is that any before_filters will still be called. This is usually used to ensure that authentication or other checks run on every request while the expensive response itself is served from the cache:


before_filter :make_sure_things_are_ok
caches_action :all_the_things

def all_the_things
    @all_things = Thing.all_in_some_way
end

def expire
    expire_action :action => :all_the_things
end
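The flow is easier to see in a stripped-down simulation. The sketch below is plain Ruby, not Rails: FakeController, its request method, and the counters are all hypothetical, and it only mimics the idea that the filter runs on every request while the action body runs on a cache miss.

```ruby
# Plain-Ruby sketch of the action-caching flow. FakeController and its
# method names are hypothetical; this is not how Rails implements it.
class FakeController
  attr_reader :filter_runs, :action_runs

  def initialize
    @cache = {}        # stands in for the cache store
    @filter_runs = 0
    @action_runs = 0
  end

  # Handles one request: the filter always runs, the action body only on a miss.
  def request(action)
    make_sure_things_are_ok
    @cache.fetch(action) { @cache[action] = public_send(action) }
  end

  def all_the_things
    @action_runs += 1
    "rendered list of things"
  end

  # Drops the cached response so the next request re-runs the action.
  def expire(action)
    @cache.delete(action)
  end

  private

  def make_sure_things_are_ok
    @filter_runs += 1
  end
end

controller = FakeController.new
3.times { controller.request(:all_the_things) }
puts "filter ran #{controller.filter_runs} times, action ran #{controller.action_runs} time(s)"

controller.expire(:all_the_things)
controller.request(:all_the_things)
puts "after expiry: action ran #{controller.action_runs} times"
```

Three requests trigger the filter three times but the action only once; after the cached entry is expired, the next request runs the action again.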

Caching and database performance are only two areas to look at when considering how to make your application run at peak condition. The goal of every developer is to deliver the best possible user experience, and better performance is one way to ensure this.

About PJ Hagerty

Developer, writer, speaker, musician, and Team Lead of an elite band known as the Engine Yard Community Team, PJ is known to travel the world speaking about programming and the way people think and interact. He is also known for wearing hats.