Ruby 2.0 Under the Hood
Note: This great post was written by our great friends over at Crowd Interactive. Check them out!
It has been almost 4 months since __Ruby 2.0__ was released. Upgrading to Ruby 2.0 from 1.9.3 is a much easier task than upgrading from Ruby 1.8 to 1.9 was.
Ruby 2.0 introduces new functionality and internal changes focused on improving Ruby’s performance.
So, let’s see what Ruby 2.0 has to offer. This will not be a recount of every feature and change, but rather a list of the changes that have a performance impact.
Improved method dispatch
The method dispatch mechanism is quite complex and requires a lot of work in order to execute instructions. This involves (in simple terms):
- Method lookup in receiver and ancestors
- Verify the arity of the method’s parameters
- Check method visibility
- Setup of a control frame
- Setup of local environment
- Bytecode execution
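Some of these steps can be observed from plain Ruby. The snippet below is only a sketch: the `Greeter` class is a hypothetical example, and the reflection calls mimic what the VM does internally rather than showing how it does it.

```ruby
# Hypothetical class used only for illustration.
class Greeter
  def hello(name)
    "Hello, #{name}"
  end

  private

  def secret
    "hidden"
  end
end

receiver = Greeter.new

# Method lookup: walk the receiver's ancestor chain until a module
# that defines the method is found.
owner = receiver.class.ancestors.find do |mod|
  mod.instance_methods(false).include?(:hello)
end

# Arity check: hello expects exactly one argument.
hello_arity = receiver.method(:hello).arity

# Visibility check: secret is private.
secret_is_private = Greeter.private_method_defined?(:secret)

# Failing the arity check raises ArgumentError.
arity_error = begin
  receiver.hello
rescue ArgumentError => e
  e.class
end

# Failing the visibility check raises NoMethodError.
visibility_error = begin
  receiver.secret
rescue NoMethodError => e
  e.class
end
```

The control frame, the local environment and the bytecode execution happen inside the VM and are not visible at this level.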
Since version 1.9, which introduced Ruby’s VM (YARV), optimization techniques have been implemented to improve method dispatch. The first two techniques listed below were introduced with 1.9; the last two came with 2.0.
###Specialized instructions
Specialized instructions are micro-optimizations in which the original call is replaced with a faster, optimized instruction. This eliminates the dispatch cost by taking a shorter path.
However, this technique is limited to the few instructions that can take advantage of it; unfortunately, it can’t be applied to a wide range of procedures.
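One way to see a specialized instruction is to ask the VM for the bytecode it generates. This is a small sketch that assumes MRI, since `RubyVM::InstructionSequence` is specific to that implementation and the exact output differs between Ruby versions.

```ruby
# Compile a small snippet and disassemble the resulting bytecode.
# RubyVM is MRI-specific, so this will not work on other implementations.
source = "a = 1; b = 2; a + b"
disasm = RubyVM::InstructionSequence.compile(source).disasm

# The addition should show up as the specialized opt_plus instruction
# instead of a generic method call.
uses_opt_plus = disasm.include?("opt_plus")

puts disasm
```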
###Method Caching
Ruby has two different method caches: the inline cache, which caches the method lookup for a specific call site, and the global cache.
Caching reduces the method lookup overhead. Instead of searching through the receiver and its ancestors on every call, Ruby first checks the inline cache at the call site and then the global cache. The final option is to perform the actual method lookup, which only happens if the method has never been called or the cache was invalidated.
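To make the idea concrete, here is a toy model of a method cache written in Ruby. It is a deliberately simplified sketch with a single cache level: `ToyMethodCache` and its methods are hypothetical names, and the real caches live inside the VM and are written in C.

```ruby
# Toy model only: ToyMethodCache is a hypothetical name,
# not part of Ruby or YARV.
class ToyMethodCache
  attr_reader :full_lookups

  def initialize
    @cache = {}
    @full_lookups = 0
  end

  # Returns the module that defines +name+ for +klass+,
  # using the cache when possible.
  def find_owner(klass, name)
    key = [klass, name]
    @cache.fetch(key) do
      @full_lookups += 1
      @cache[key] = slow_lookup(klass, name)
    end
  end

  # Simulates a cache invalidation, e.g. after a method is redefined.
  def invalidate!
    @cache.clear
  end

  private

  # The expensive path: search the receiver's class and its ancestors.
  def slow_lookup(klass, name)
    klass.ancestors.find { |mod| mod.instance_methods(false).include?(name) }
  end
end

cache = ToyMethodCache.new
first_owner  = cache.find_owner(String, :length) # miss, full lookup
second_owner = cache.find_owner(String, :length) # hit, no lookup
lookups_before_invalidation = cache.full_lookups

cache.invalidate!
cache.find_owner(String, :length)                # miss again
lookups_after_invalidation = cache.full_lookups
```

The second call is served from the cache, and the full lookup only runs again once the cache has been invalidated.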
###Cache check results
Starting with Ruby 2.0, the results of the parameter arity and method visibility checks are stored in the _inline cache_ after the first call. Performance improves because Ruby doesn’t need to repeat those checks when it already has the results cached.
###Frameless CFUNC methods
It turns out that several CFuncs (methods implemented in C) do not need a control frame. For example, a call to String#length only returns the size of the given string and does not need to know anything else about its environment. The performance gain comes from not having to build a frame for a CFunc that will never use it.
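The frame itself can’t be seen from Ruby code, but we can at least tell which methods are CFuncs. This sketch assumes MRI, where `source_location` returns `nil` for methods implemented in C; the `Label` class is a hypothetical example written in Ruby for comparison.

```ruby
# Hypothetical class with a length method written in Ruby.
class Label
  def initialize(text)
    @text = text
  end

  def length
    @text.length
  end
end

# String#length is implemented in C, so MRI has no Ruby source for it.
c_location = "hello".method(:length).source_location

# Label#length is defined in Ruby, so it has a file and a line number.
ruby_location = Label.new("hello").method(:length).source_location
```

Being a CFunc is only a precondition here: which CFuncs actually skip the frame setup is decided inside the VM.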
GC Optimizations
###Bitmap marking
Ruby 2.0’s garbage collector introduces a new technique called bitmap marking. From now on, each Ruby heap has a corresponding bitmap, an aligned memory structure. Before 2.0, the GC kept a flag inside the _RValue_ structures; this flag was used to mark a structure that was still referenced in our program.
The big change from the Ruby 1.9 GC is that the RValue structure no longer contains this flag. When the GC sets or clears a mark, our RValue structures are not modified, because the flag is now kept in the bitmap associated with the heap.
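The difference between the two approaches can be sketched with a toy model in Ruby. All the names below are hypothetical and the real GC is written in C; the point is only to show where the mark ends up being written.

```ruby
# Toy model only: these names are hypothetical, not MRI internals.

# Before 2.0: the mark flag lives inside each object slot,
# so marking writes to the object itself.
InlineMarked = Struct.new(:payload, :marked)
old_heap = [InlineMarked.new("a", false), InlineMarked.new("b", false)]
old_heap[0].marked = true

# 2.0 style: the objects are never touched (frozen here to prove it),
# and the marks live in a separate bitmap with one entry per slot.
new_heap = ["a", "b", "c"].map(&:freeze).freeze
bitmap = Array.new(new_heap.size, false)

# Mark phase: pretend slots 0 and 2 are still referenced.
[0, 2].each { |index| bitmap[index] = true }

# Sweep phase: any slot without a mark can be reclaimed.
swept = new_heap.each_index.reject { |index| bitmap[index] }
```

In the second half, the mark phase completes without a single write to the heap objects, which is what matters for the next section.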
This allows Ruby 2.0 to benefit from an operating system feature called __Copy On Write__.
###Copy On Write
Copy On Write, or simply COW, reduces the memory usage of Ruby processes when a process is forked. Instead of duplicating the memory space on each fork, COW lets multiple processes share the same memory (possible because the GC no longer modifies RValue structures) until one of the processes modifies shared data. In a way, we can see this as a lazy memory copy.
Unicorn’s web server processes in a Rails application, for example, could benefit from COW: the worker processes can share memory and thereby reduce resource consumption.
In some cases this reduces the memory consumption of Ruby 2.0 programs and allows more processes to run with less memory.
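The usual pattern to take advantage of this is to load data in the parent process and fork afterwards, which is roughly what preforking servers do. Below is a minimal sketch, assuming a platform that supports `fork` (so not Windows); `shared_table` is a hypothetical stand-in for an application’s preloaded data, and how much memory actually stays shared depends on the operating system and the Ruby version.

```ruby
# Built once in the parent, before forking.
shared_table = Array.new(100_000) { |i| i }

reader, writer = IO.pipe

# Fork two workers that only read the preloaded data.
pids = 2.times.map do
  fork do
    reader.close
    # The child sees the parent's data without loading it again.
    writer.puts(shared_table.inject(:+))
    writer.close
    exit!(0)
  end
end

# The parent collects one line per child.
writer.close
child_sums = reader.read.split.map(&:to_i)
reader.close
pids.each { |pid| Process.wait(pid) }
```

Both children report the same sum even though the table was only built once, in the parent.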
Faster Kernel#require
For a while there have been discussions on how to improve the performance of Kernel#require, specifically for Rails applications.
Over the past years, several solutions have been offered to improve Rails’ boot time. Ruby 2.0 ships with an improvement to Kernel#require that helps Rails applications boot faster.
Here is a quick, unscientific test that shows the difference between booting a Rails application with Ruby 1.9.3-p392 and Ruby 2.0p185.
###Ruby 1.9
$ time bundle exec rake environment
bundle exec rake environment 7.21s user 1.26s system 96% cpu 8.748 total
###Ruby 2.0
$ time bundle exec rake environment
bundle exec rake environment 4.60s user 1.18s system 91% cpu 6.289 total
My sample application is quite simple, so if you try on your own - maybe on a big Rails application - your mileage may vary.
Conclusions
Ruby 2.0 is a great improvement over earlier versions. It sure has a lot of eye-candy features, but the most important ones are those that improve performance for our web applications.
If you are on Ruby 1.9, the jump to 2.0 should be easy. If you are still on Ruby 1.8, its end of life is close, so it would be advisable to act now and upgrade your Ruby version.
As additional encouragement to switch to Ruby 2.0, bear in mind that Rails 4.0 has dropped support for Ruby 1.8 and Sinatra 2.0 might do that soon as well.
Ruby 2.1’s roadmap has been announced, and chances are that it will be available around the end of 2013. For now we only know that it may include a generational GC, which may bring even more performance improvements to Ruby. Let’s see how that goes.