June 10th 2011

By Ken Collins

SQL Server 10xs Faster with Rails 3.1

Our latest guest post is from community contributor Ken Collins, aka MetaSkills. Ken is a software engineer at Decisiv Inc, author of the SQL Server Adapter for ActiveRecord, and organizer for the Hampton Roads Ruby Users Group 757.rb. He blogs at metaskills.net about all things Ruby, JavaScript and iOS related.

As you may have heard or seen, ActiveRecord 3.1 now comes with prepared statement support. This happens for you automatically since ActiveRecord will associate all SQL statements with bind substitute variables. The implementation of these is left to each database adapter. For the most part, the pattern is to work with the native connection to get a handle or token back that represents the cached query plan on the database server; then send these tokens with the necessary parameters on subsequent calls. The benefit to prepared statements is that the database does not have to compile a query plan for every piece of SQL sent to it, potentially saving a lot of time.

The speed improvements for SQLite3 and PostgreSQL were reported to be anywhere from 1.5x to 10x faster depending on which database and the complexity of the query. It was also reported that MySQL does not perform so well, but I won’t go into why here. The real question is what can you expect from the SQL Server Adapter? The answer is a lot. My benchmarks for SQL Server show that prepared statements are up to 2x faster for simple queries and 10x faster for complex ones! Impressive and right on par with PostgreSQL. What impressed me most, was how easy it was to implement and how clever SQL Server is with query plans.

Plan Caching and Reuse

It may surprise some of you to learn that SQL Server builds and caches a query plan for every statement sent to the database. Some may also be surprised to discover just how bad dynamic SQL is to that plan cache. We can glimpse into the plan cache with the following SQL snippet. It uses two system views and a fancy CROSS APPLY so that we can correlate the SQL text and the use counts for each query plan. In between my examples I will use DBCC FREEPROCCACHE WITH NO_INFOMSGS to purge everything from the plan cache.

SELECT [usecounts], [cacheobjtype], [objtype], [text]
FROM sys.dm_exec_cached_plans
CROSS APPLY sys.dm_exec_sql_text(plan_handle)
WHERE [text] LIKE '%SELECT TOP(1)%'
ORDER BY [usecounts] DESC

Let’s start off with a basic statement example, finding a post by its [id] column. We will execute these three statements to the database.

SELECT TOP(1) * FROM [posts] WHERE [id] = 1
SELECT TOP(1) * FROM [posts] WHERE [id] = 2
SELECT TOP(1) * FROM [posts] WHERE [id] = 3

When we lookup the cached plans on the server, we will see a saved plan for each query issued above, not good. SQL Server relies on a hashed value to determine if the SQL to be executed matches an existing query plan in the cache. Without forced parameterization on the database, it has no way of telling that these three statements only differ in a constant for the post’s identity column.

+-----------+---------------+----------+---------------------------------------------+
| usecounts | cacheobjtype  | objtype  | text                                        |
+-----------+---------------+----------+---------------------------------------------+
| 1         | Compiled Plan | Adhoc    | SELECT TOP(1) * FROM [posts] WHERE [id] = 3 |
| 1         | Compiled Plan | Adhoc    | SELECT TOP(1) * FROM [posts] WHERE [id] = 2 |
| 1         | Compiled Plan | Adhoc    | SELECT TOP(1) * FROM [posts] WHERE [id] = 1 |
+-----------+---------------+----------+---------------------------------------------+

Our goal is to allow SQL Server to reuse a query plan from the cache with little to no effort. Thankfully it gives us a ton of ways to deal with this problem, some of them even require elevated privileges that might not be appropriate to web applications. Others require two or more round trips to the database for each new query. If documentation is your thing, I highly suggest reading Execution Plan Caching and Reuse on the MSDN site. My research showed that the solution was simple and was actually recommended for applications that generate SQL statements dynamically. Enter the sp_executesql statement.

Basically sp_executesql takes two to n number of arguments. The first is a unicode string of your parameterized SQL statement. The second is a string that delimits each parameter name and its datatype. Each additional argument afterward is the name of the parameter and the corresponding value. Using parameterized SQL statements this way allows SQL Server to find and reuse an existing query plan in the cache because the statement body with parameter names are the same. Now let’s flush our cache and then issue the same 3 statements above using sp_executesql with parameters.

EXEC sp_executesql N'SELECT TOP(1) * FROM [posts] WHERE [id] = @0', N'@0 int', @0 = 1
EXEC sp_executesql N'SELECT TOP(1) * FROM [posts] WHERE [id] = @0', N'@0 int', @0 = 2
EXEC sp_executesql N'SELECT TOP(1) * FROM [posts] WHERE [id] = @0', N'@0 int', @0 = 3

Now when we look at the query plans in the cache, we find better news. See how this plan was used 3 times!

+-----------+---------------+--------- +------------------------------------------------------+
| usecounts | cacheobjtype  | objtype  | text                                                 |
+-----------+---------------+--------- +------------------------------------------------------+
| 3         | Compiled Plan | Prepared | (@0 int)SELECT TOP(1) * FROM [posts] WHERE [id] = @0 |
+-----------+---------------+----------+------------------------------------------------------+

Included With RailsInstaller 2.0

In my article “Modern SQL Server & Rails” I covered the new default connection mode for SQL Server called TinyTDS. Thanks to the wonderful work of Luis Lavena and Wayne E Seguin both TinyTDS and the SQL Server Adapter are included in the excellent RailsInstaller project, the easiest way to get up and running with Ruby on Rails for Windows. You can download the RailsInstaller for Windows 2.0.0 preview release now.

It is easy to see that as your application runs, a consistent pattern of SQL statements will emerge and their reuse from the plan cache will have a positive impact on the overall performance of your app. Since the latest 3.1 versions of the SQL Server Adapter leverage sp_executesql in every INSERT, SELECT, UPDATE, and DELETE statement, we can finally tap into these cached query plans on your database. We have already seen the adapter’s test suite perform 15% faster. The bigger and more complex your application/database, the better your performance should be when using the latest adapter.

So share your experiences with us and let us know what this means for you. Again, I invite anyone that uses the SQL Server Adapter to join the Google Group or stop by the #rails-sqlserver room on irc.freenode.net and help us make it better. Thanks, and I hope to see you there!

Share your thoughts with @engineyard on Twitter