Improving the Rubinius Bytecode Compiler
The Rubinius bytecode compiler is the gateway to all the magic that makes your Ruby code run. As you probably know, the Rubinius virtual machine is a bytecode interpreter. The Rubinius JIT compiler also processes bytecode, converting it into native machine code. Without bytecode, we’d be dead in the water. Recently, I’ve been working on improving the Rubinius bytecode compiler.
In this post, I’ll explain what I’ve been doing and how it relates to getting Rubinius ready for 1.0. We recently released version 0.12 and we’re going to be doing releases about every two weeks. If you haven’t built Rubinius yet to check it out, head over to our download page and get started!
Principles
Before we get into code examples, let’s lay the groundwork and review some general principles for writing software. We learn early in computer science that programming involves a series of trade-offs. There’s always the classic trade-off between execution time and storage space. If we write a function that saves each value it computes, it can return the saved value instead of re-computing it, but saving the value requires using memory or disk space. A function that computes millions of values would potentially need millions of storage locations.
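For instance, here is a minimal sketch of that trade-off, using a memoized Fibonacci function (purely illustrative, nothing to do with the Rubinius code):
# A memoized Fibonacci: we trade memory (the cache Hash) for speed,
# since each value is computed only once and then simply looked up.
def fib(n, cache = {})
  return n if n < 2
  cache[n] ||= fib(n - 1, cache) + fib(n - 2, cache)
end

fib(100)  # => 354224848179261915075, fast, but the cache now holds ~100 values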
While speed versus space is probably the most well-known classic trade-off, there are others. With a nod to Philip Kapleau, here are the three pillars of developing Rubinius: Code Quality, Compatibility, and Performance, or QCP. These interact in complex ways and there’s no simple way to prioritize them. By compatibility, I mean conforming to the same behavior as MRI (or Matz’s Ruby).
These three characteristics are interdependent. Clean, high-quality code makes it easier to work quickly on compatibility, and when code is behaving correctly, it’s easier to profile and improve performance. Quality code is also easier to understand and work with when improving performance, and good performance, in turn, can make it easier to identify and fix compatibility issues. These feedback loops push each of these characteristics forward. The converse is also true: at times it may be tempting to sacrifice quality to improve performance, but quick and dirty never pays off in the end.
To sum up, these are the goals we’re pursuing: simplify and improve the bytecode compiler code, improve performance, make it simpler and easier to bootstrap Rubinius, and ultimately make it easier to fix compatibility issues running fantastic Ruby software, like your Rails applications.
Parsers and Compilers
There’s a certain mystique that surrounds compilers. Rather than just accept the mystique, let’s take a peek behind the unicorns and dragons and check out the simple gears and pistons that make it all work. In general terms, a compiler is a process for converting data from one form into another. In the specific case of Ruby, the compiler converts text in the form of Ruby syntax, into operations performed by the computer’s CPU.
In discussing compilers, two rather distinct operations are often lumped together, namely, parsing and code generation. Parsing is the process of converting the source code into a data structure that the compiler can process to produce code that the computer, or a virtual machine, can execute to perform computation. Each operation has its own issues, so let’s look at them separately.
A. The Parser
Humans are the most adept and complex parsers in existence. One natural language can choke up a powerful computer, but humans typically handle one and sometimes several languages with relative ease. Not just the words or sounds of a language either, but also the intonations, facial expressions and body language that accompany even the simplest communications.
Compared to natural languages, programming languages are very simple, but even the simplest languages can be challenging to parse. Syntactically, Ruby is a rich and complex language. We love it for the expressive programs we can write, but parsing Ruby is hard. Every Ruby parser in the Ruby implementations that I know of is based on Matz’s parser. From the beginning, Rubinius has used a directly imported version of Matz’s parser with a few minor modifications.
###Detour—Bootstrapping
Before continuing, we need to take a short detour. I’ll be writing a post on the Rubinius bootstrapping process in the future, but I’ll start by briefly describing part of the problem here.
The Rubinius bytecode compiler, and most of the Ruby core library (classes like Array and Hash), are written in pure Ruby, and herein lies the challenge. The Rubinius VM interprets bytecode. To run the Rubinius compiler, Rubinius needs to translate the compiler’s Ruby source code into bytecode, but to do that, it needs to run the compiler. You can probably see where this is heading: around and around without getting much done.
That is the essence of the problem of bootstrapping. To break this loop, we need to insert a process that does not depend on Rubinius. In other words, we need something that’s not Rubinius to compile the core library and bytecode compiler so that Rubinius can load the bytecode and run the compiler. Then Rubinius can compile its own compiler and core library.
One way to do this would be to load the Ruby source code in MRI and use the ParseTree gem to extract the MRI parse tree as a recursive array of symbols and values, something also known as an S-expression or sexp. But since you can’t change the MRI parser without impacting its runtime behavior, Evan Phoenix wrote the sydparse gem. This gem is essentially a C extension combining the MRI parser with the sexp generation code from ParseTree. The gem enabled Evan to modify the parser if desired and get the parse results as a sexp. The sydparse code is currently built into Rubinius under the vm/parser directory.
For example, try the following code:
$ bin/rbx
irb(main):001:0> "a = 1".to_sexp
=> s(:lasgn, :a, s(:lit, 1))
So, with the String#to_sexp method, we have something that will take Ruby code and transform it into a data structure that should be fairly easy for a computer to process. And that is precisely how the early Rubinius bytecode compiler worked: it processed sexps into bytecode. It didn’t matter whether the sexps were output by sydparse running in MRI or by the Rubinius built-in parser. With that, a big part of bootstrapping was solved.
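To make that concrete, here is a toy illustration (not the real compiler, and the instruction names are made up) of what “processing sexps into bytecode” means: dispatch on the first symbol of each array and recurse.
def compile_sexp(sexp, instructions = [])
  case sexp.first
  when :lit
    instructions << [:push_literal, sexp[1]]
  when :lasgn
    compile_sexp(sexp[2], instructions)    # compile the value expression first
    instructions << [:set_local, sexp[1]]  # then store it in the local variable
  end
  instructions
end

compile_sexp [:lasgn, :a, [:lit, 1]]
# => [[:push_literal, 1], [:set_local, :a]]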
###Back to the Main Road
So we’ve got a compiler that processes sexps to bytecode and it runs in both MRI and Rubinius. Everything sounds copacetic—what’s the trouble, you ask? Well, compiling Ruby is hard, too.
To make the process more tractable, Evan rewrote the compiler so that the sexps are processed into an abstract syntax tree (AST) of Ruby objects. Each object has a #bytecode method. To generate bytecode, start at the root of the tree and visit each node, calling the bytecode method for the node.
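Here is the same toy a = 1 example again, this time as objects. These are simplified, hypothetical node classes (the real ones also carry line numbers and emit real VM instructions), but they show the pattern: each node emits its own instructions and asks its children to do the same.
class FixnumLiteral
  def initialize(value)
    @value = value
  end

  def bytecode(g)
    g << [:push_literal, @value]
  end
end

class LocalVariableAssignment
  def initialize(name, value)
    @name, @value = name, value
  end

  def bytecode(g)
    @value.bytecode(g)        # visit the child first
    g << [:set_local, @name]  # then store the result in the local variable
  end
end

g = []
LocalVariableAssignment.new(:a, FixnumLiteral.new(1)).bytecode(g)
g  # => [[:push_literal, 1], [:set_local, :a]]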
For perspective, let’s enumerate all the stages in the compiler. I’m glossing over a few details here, but you’ll see the basic picture:
- parse tree: A tree of C data structs created by the MRI parser.
- sexp: A recursive array of symbols and values created by processing the MRI parse tree.
- rewritten sexp: The form of the sexp after certain structures are normalized to make converting the sexp simpler.
- AST: The abstract syntax tree of Ruby objects created by processing the rewritten sexp.
- bytecode: A stream of instructions that the Rubinius VM can execute.
- compiled method: The bytecode packaged with some additional information like the names of local variables and the amount of stack space needed when the compiled method runs. A typical Ruby source code file compiles to a tree of compiled methods.
- compiled file: The Rubinius VM can execute the compiled methods directly. But to avoid having to recompile them, the tree of compiled methods is serialized to a compiled file on disk. Rubinius can read the compiled file to recreate the tree of compiled methods in memory.
That’s a significant number of stages and it’s not hard to see where we can simplify to improve the process. Those sexp stages appear to just be passing data along. In fact, they require creating a lot of additional objects, time to process, and some seriously complex code to process them. The latter has a significant, negative impact on code quality.
S-expressions are just data, dumb and brittle. They only have form (e.g. [:x, :y] versus [:x, [:y]]) and position (e.g. [:alias, :x, :y] versus [:alias, :y, :x]) to encode information. If you wanted to add, for example, a line number to every sexp, you’d need to put the information in a form and at a particular position in every sexp, or you’d need to wrap every sexp in one that encodes the line number. The simplicity of sexps is beguiling, but you know what they say when all you have are s-expressions… Everything looks like function application.
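Here is a small illustration of that brittleness (the shapes below are made up). To carry a line number, a sexp has to change its form or position everywhere, while an object just gains an attribute:
plain  = [:lasgn, :a, [:lit, 1]]
tagged = [:line, 1, [:lasgn, :a, [:line, 1, [:lit, 1]]]]  # wrap every sexp...
# ...and now every consumer has to unwrap it before looking at form or position.

ToyNode = Struct.new(:line, :name, :value)  # an object simply carries the extra data
ToyNode.new(1, :a, 1).line  # => 1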
On the other hand, we have this rich, object-oriented language that makes it trivially easy to create objects that conform to a consistent interface. We want to get from the Ruby text to Ruby objects as quickly, and simply, as possible.
One option would be to write the compiler so that it processes the MRI parse tree directly into bytecode. Unfortunately, that would require either rewriting the compiler in C (not gonna happen) or making the C data structs available in Ruby. The latter option sounds promising. We could translate the parse tree directly into an AST.
###Melbourne
Enter Melbourne, a C extension that can run in MRI or Rubinius and process the MRI parse tree directly into an AST. In case you were wondering, it was named in honor of Evan’s sydparse gem, and some rowdy Aussie developers we know.
In Melbourne, each node of the MRI parse tree is processed using a mechanism we all know and love: a method call. At each parse tree node, all children are processed by recursively calling the process_parse_tree function (see lib/ext/melbourne/visitor.cpp). At a leaf node, there are no children to create, so an appropriately named Ruby method is called on the Rubinius::Melbourne instance that is passed to process_parse_tree.
Once all of a node’s children are created, a method is called for the parent node, passing along the objects that were already created for the node’s children. You can see the result of this process by running the following command:
$ bin/rbx -r compiler-ng -e '"a = 1".to_ast.ascii_graph'
LocalVariableAssignment
  @line: 1
  @name: a
  FixnumLiteral
    @line: 1
    @value: 1
In lib/melbourne/processor.rb, you can see code like the following. Whenever a local variable assignment parse tree node (lasgn) is processed, this method is called and the LocalVariableAssignment AST node is created. Pretty straightforward.
def process_lasgn(line, name, value)
  AST::LocalVariableAssignment.new line, name, value
end
One of my goals for improving the code quality in the compiler was to replace the use of conditionals by creating more explicit nodes in the AST. Sometimes code suffers from what I’d call conditionalitis, or inflammation of your conditionals (sounds painful, huh?). That’s where you maintain a lot of state and use conditionals to figure out what to do.
An alternative is to create different forms of things that just do what they are supposed to do. Simply avoiding conditionals altogether is not really an option, but where they’re used can have a big impact on code quality. It can be hard to understand, just by looking at the code, why one branch or the other would be taken when the program is running; but when the conditional results in different forms being created, it’s much easier to comprehend how the program will behave.
For example, in MRI a.b = 1 and a[b] = 1 are parsed into the same parse tree node. But there are enough things that need to be done differently in the two cases that just creating different forms can make these tasks simpler and more explicit. The code in lib/melbourne/processor.rb for processing an attrasgn node looks like this:
def process_attrasgn(line, receiver, name, arguments)
  if name == :[]=
    AST::ElementAssignment.new line, receiver, arguments
  else
    AST::AttributeAssignment.new line, receiver, name, arguments
  end
end
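To see what the split buys us, here is a hypothetical sketch (these are not the real Rubinius classes, whose bytecode methods live in lib/compiler-ng/ast, and the instructions are just illustrative symbols): each node compiles only its own form, so neither bytecode method ever has to ask “which kind of assignment am I?” while compiling.
class AttributeAssignment
  def initialize(line, receiver, name, arguments)
    @line, @receiver, @name, @arguments = line, receiver, name, arguments
  end

  def bytecode(g)
    @receiver.bytecode(g)
    @arguments.bytecode(g)
    g << [:send, @name, 1]   # a.b = 1 becomes a plain attribute-writer send (:b=)
  end
end

class ElementAssignment
  def initialize(line, receiver, arguments)
    @line, @receiver, @arguments = line, receiver, arguments
  end

  def bytecode(g)
    @receiver.bytecode(g)
    @arguments.bytecode(g)
    g << [:send, :[]=, 2]    # a[b] = 1 sends :[]= with the index and the value
  end
end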
B. The Compiler
By this point we have a fully formed AST that faithfully represents the Ruby code we started with. Emitting bytecode now is almost anticlimactic. Not really, of course, because there is plenty of work left to do. Just keeping track of local variables in methods, blocks and evals could take up a whole post, but the basic idea is really simple. Start at the root node and call the #bytecode method, walking down the tree until every node has been visited. All the details (and there are a lot of them) can be found in the #bytecode methods on the AST nodes (see lib/compiler-ng/ast).
Besides generating bytecode, there are other interesting things you can do simply by defining methods on the AST nodes and visiting them. I’ll give you two examples. First, let’s look at defined?.
# defined.rb

class A
  class B
  end
end

def x
  puts "hey there"
  A
end

p defined? A::B
p defined? x::B
If you run this code in Ruby, you should see the following output:
$ ruby defined.rb
"constant"
hey there
"constant"
This code illustrates something about defined?. It doesn’t just check internal data structures. In some cases, like x::B, it must do some evaluation.
In Rubinius, we simply add a defined(g) method to any relevant AST node; the method takes an instance of the bytecode generator object and emits bytecode appropriate for evaluating the expression passed to defined?. Here is the full code for the Defined AST node.
class Defined < Node
  attr_accessor :expression

  def initialize(line, expr)
    @line = line
    @expression = expr
  end

  def bytecode(g)
    pos(g)
    @expression.defined(g)
  end
end
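For a concrete sense of what those defined(g) methods look like, here is a hypothetical sketch (not the actual Rubinius source) for a node whose answer is known at compile time:
class LocalVariableAccess < Node
  # A local variable is known to exist once the method is compiled, so
  # defined?(a) can simply push the descriptive string. Nodes like the
  # x::B case above instead emit bytecode that evaluates at run time.
  def defined(g)
    g.push_literal "local-variable"
  end
end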
Rather than emitting bytecode, it’s possible to simply evaluate the AST, performing actions when visiting each node. Rubinius has an evaluator that makes it possible to write bits of code for which there is no simple Ruby syntax, directly in Rubinius assembly language (i.e. in the operations that the VM directly executes). Consider this example:
# asm.rb

def hello(name)
  Rubinius.asm(name) do |name|
    push :self
    run name
    string_dup
    push_literal "hello, "
    string_dup
    string_append
    send :p, 1, true
  end
end

hello "world"
If you run this in Rubinius, you should see the following output:
$ bin/rbx asm.rb
"hello, world"
Of course, this example is much easier to write in plain old Ruby. However, we use this facility in the Ruby core library. For example, in the Class#new method.
The basic idea is that a tree of Ruby objects representing Ruby source code is a powerful tool that can be easily extended to accomplish various tasks.
Compiler, Part Deux
Let’s revisit the list we presented earlier of compiler stages. How do we coordinate those stages? What if we need to insert a stage? Well, by creating more things we can name, and on which we can define behavior.
You can see the various stages laid out in lib/compiler-ng/stages.rb. By default, each stage knows which stage follows it. You create an instance of the compiler by specifying the starting stage and the ending stage. Each stage has an interface with the preceding and following stage, with the compiler object itself, and with the object that performs the work for that stage. For example, if the compiler will compile a String to a CompiledMethod, it will consist of the StringParser, Generator, Encoder, and Packager stages.
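To illustrate the chaining idea with something self-contained, here is a simplified sketch using made-up stage names and behavior (the real stages live in lib/compiler-ng/stages.rb): each stage transforms its input and hands the result to whatever stage follows it.
class ToyStage
  def initialize(next_stage = nil)
    @next_stage = next_stage
  end

  def run(input)
    output = transform(input)
    @next_stage ? @next_stage.run(output) : output
  end
end

class Shout < ToyStage
  def transform(s)
    s.upcase
  end
end

class Exclaim < ToyStage
  def transform(s)
    s + "!"
  end
end

# Build the pipeline by telling each stage which one follows it.
Shout.new(Exclaim.new).run("hello")  # => "HELLO!"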
The goal was to create an API for the compiler that made it simple to use programmatically in a variety of circumstances: for example, using the compiler internally to compile a Ruby file when #require is called, or in a command-line script like lib/bin/compile-ng.rb. The command-line script gives you the option to show the data structures as the compiler is processing them. Here’s an example:
$ cat var.rb
a = 1
p a
$ bin/rbx compile-ng -AB var.rb
Script
  @name: __script__
  @file: "var.rb"
  Block
    @line: 1
    @array:
      LocalVariableAssignment
        @line: 1
        @name: a
        FixnumLiteral
          @line: 1
          @value: 1
      SendWithArguments
        @line: 2
        @name: p
        @privately: true
        Self
          @line: 2
        ActualArguments
          @line: 2
          @array:
            LocalVariableAccess
              @line: 2
              @name: a
============= :__script__ ==============
Arguments: 0 required, 0 total
Locals: 1: a
Stack size: 3
Lines to IP: 1: 0-4, 2: 4-14
0000: meta_push_1
0001: set_local 0 # a
0003: pop
0004: push_self
0005: push_local 0 # a
0007: allow_private
0008: send_stack :p, 1
0011: pop
0012: push_true
0013: ret
----------------------------------------
$ bin/rbx var.rbc
1
Take a look at bin/rbx compile-ng -h for more options and try this out on your Ruby code. It opens up a pretty exciting world of understanding.
A Final Note
Melbourne and the new compiler will both be enabled in the upcoming 0.13 release, so anywhere I’ve used compile-ng or compiler-ng in this post will become just compile or compiler. Until then, all the code exists in the Rubinius GitHub repository, so you don’t have to wait to check it out.
Thoughts? Questions? Leave them here!