Key-Value Stores in Ruby: The Wrap Up

This last article in our key-value series briefly covers a few interesting topics that could each have had a full article of their own. If any of them interest you, follow the links that I provide to get more information. I’ll wrap up by introducing Moneta, written by Yehuda Katz, which provides a unified API for a wide variety of Key-Value Stores. If you want to write code that lets the user choose which store to use, you’ll want to pay attention to Moneta.

The difficult part of discussing Key-Value Stores today is that it’s a product area seeing rapid development and constant evolution. There are more interesting stores and libraries available than can easily be covered, even in a series like this. I could probably write a post every two weeks into next year without running out of subjects. So, alas, many things must be left undiscussed or underdiscussed. But let’s move on to the topics we can cover…

CouchDB

The first great Key-Value Store that isn’t going to get its own article is CouchDB. Apache’s CouchDB is a document-oriented database, like MongoDB. However, it exposes a RESTful, JSON-based API through a built-in HTTP interface. Like MongoDB, it offers a schema-free data store. CouchDB offers solid, built-in replication, and uses JavaScript as its query language. It is a powerful tool.
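To make the HTTP interface a little more concrete, here is a minimal sketch that uses only Ruby's standard library to build the kind of requests a client sends to CouchDB. It constructs the requests but does not send them, so no server is needed to run it. The document id `example-doc`, the database name, and the address in the final comment are assumptions for illustration.

```ruby
require 'json'
require 'net/http'

# Hypothetical database name and document id, chosen for illustration.
db     = 'exercise-log'
doc_id = 'example-doc'
doc    = { 'activity' => 'pedaling', 'duration' => '97:34' }

# PUT /<db>/<id> with a JSON body creates a document under that id.
put = Net::HTTP::Put.new("/#{db}/#{doc_id}")
put['Content-Type'] = 'application/json'
put.body = JSON.generate(doc)

# GET on the same path reads the document back as JSON.
get = Net::HTTP::Get.new("/#{db}/#{doc_id}")

[put, get].each { |req| puts "#{req.method} #{req.path}" }
puts put.body

# To send a request to a server, assuming one listens on the default port 5984:
# Net::HTTP.start('127.0.0.1', 5984) { |http| puts http.request(put).body }
```

Libraries such as CouchRest wrap this request building and JSON handling so that you work with Ruby hashes instead.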

There are several Ruby libraries which can be used to facilitate using CouchDB. In the examples below, I have used CouchRest, which is based on CouchDB’s own couch.js library:

require 'rubygems'
require 'couchrest'
require 'yaml'

DBH = CouchRest.database!('exercise-log')

response = DBH.save_doc({
  :date => Time.now,
  :activity => ARGV[0],
  :duration => ARGV[1]
})

stored_record = DBH.get(response['id'])
puts "Stored:\n#{stored_record.to_yaml}"

wyhaines$ ruby /tmp/couch1.rb
Stored:
--- !map:CouchRest::Document
duration: "97:34"
_rev: 1-eb6f6e3a3e2eae0cd99f3fcbc63d29d6
_id: 0d9e71f44b3e0d3a2013c282bbccb5a0
activity: pedaling
date: 2009/11/12 21:07:45 +0000

As with MongoDB, one can store any set of keys/values together as a document in CouchDB, and then retrieve it later. CouchRest returns a response from the server that contains an id field, which can be used to retrieve the record that was just stored.

For more complex queries of the document store, one can use views. Views have a lot of power, because they are ultimately defined using JavaScript, but they don’t lend themselves to easy ad-hoc manipulation of the database.

DBH.save_doc({
  "_id" => "_design/query",
  :views => {
    :allkeys => {
      :map => "function(doc) { " \
              "for (var word in doc) { " \
              "if (!word.match(/^_/)) emit(word,doc[word])}}"
    }
  }
})

That inserts a view into the database that will be identified by query/allkeys. What a view does is defined by the JavaScript code it contains. Once a view is inserted into CouchDB, using it is simple:

puts DBH.view('query/allkeys').to_yaml

That particular function was lifted shamelessly from the CouchRest README, with a couple of terms renamed to make it a little clearer. The output:

---
total_rows: 3
rows:
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: pedaling
  key: activity
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: 2009/11/12 21:07:45 +0000
  key: date
- id: 0d9e71f44b3e0d3a2013c282bbccb5a0
  value: "97:34"
  key: duration
offset: 0

This is really just the tip of the iceberg with CouchDB/CouchRest; there’s a wealth of functionality. CouchDB views are implemented with map/reduce capability, which means you can use them to crunch some pretty complex problems on your data. Additionally, CouchRest provides a `CouchRest::ExtendedDocument`, which your own classes can inherit from. This lets you easily create a Ruby model for your data, which is then transparently stored inside CouchDB.

class Exercise < CouchRest::ExtendedDocument
  ...
... :activity => "running", :date => Time.now, :duration => "23:44")
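As a rough illustration of what the query/allkeys view computes, the following plain-Ruby sketch mimics the map step on an in-memory list of documents and then applies a simple reduce-style count per key. Nothing here talks to CouchDB: `allkeys_map`, the sample documents, and their ids are all made up for the example, and sorting by key only approximates how CouchDB orders view rows.

```ruby
# Stand-in for the view's JavaScript map function: emit one row per
# field whose name does not start with an underscore.
def allkeys_map(doc)
  doc.reject { |key, _value| key.start_with?('_') }
     .map { |key, value| { 'id' => doc['_id'], 'key' => key, 'value' => value } }
end

# Hypothetical documents, shaped like the exercise records stored earlier.
docs = [
  { '_id' => 'doc-1', '_rev' => '1-abc', 'activity' => 'pedaling', 'duration' => '97:34' },
  { '_id' => 'doc-2', '_rev' => '1-def', 'activity' => 'running',  'duration' => '23:44' }
]

# Map step: run the function over every document and order the rows by key.
rows = docs.flat_map { |doc| allkeys_map(doc) }
           .sort_by { |row| [row['key'], row['id']] }

# Reduce-style step: fold the mapped rows into a count per key.
counts = rows.group_by { |row| row['key'] }.transform_values(&:size)

rows.each { |row| puts "#{row['key']}: #{row['value']} (#{row['id']})" }
puts counts.inspect
```

In CouchDB itself both steps run on the server, and the results are stored with the view rather than recomputed on every query.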

Dig into the CouchDB and CouchRest documentation if this looks interesting to you.

S3

I just wanted to briefly mention Amazon’s Simple Storage Service (S3). It is, fundamentally, a simple HTTP-accessible Key-Value Store that Amazon has turned into a service. Requests to S3 will have higher latency than requests to a locally hosted data store, and that latency can be substantial, but if you want a simple, robust store that will scale to as much data as you can push at it, you might seriously consider S3.

Moneta

Moneta is a unified interface to a variety of different key-value type data stores. That is, the same code can be run against a variety of different backing stores, and it will just work. Moneta supports the following stores as of this posting:

  • Basic File Store
  • BerkeleyDB
  • CouchDB
  • DataMapper
  • File store for xattr
  • In-memory store
  • Memcache store
  • Redis
  • S3
  • SDBM
  • Tokyo
  • Xattrs in a file system

Consider this example, which, again, uses CouchDB:

    require 'rubygems'
    require 'yaml'
    require 'moneta'
    require 'moneta/couch'

    cache = Moneta::Couch.new(:db => 'football')

    cache['1a_final'] = {
      :where => 'Laramie; War Memorial Stadium',
      :when => "11:30 MST",
      :who => "Southeast Cyclones & Lingle-Ft. Laramie Doggers",
      :prediction => "SE Cyclones by 14"
    }

    # Read the record back and print it. This step is assumed; the
    # original listing is cut off at this point.
    puts cache['1a_final'].to_yaml

    prediction: SE Cyclones by 14
    when: 11:30 MST
    who: Southeast Cyclones & Lingle-Ft. Laramie Doggers
    where: Laramie; War Memorial Stadium

It works, very simply. If I want to change the code to use something else, like a file-based store, it’s as simple as changing two lines:

--- couch.rb    2009-11-19 15:00:07.000000000 -0700
+++ file.rb     2009-11-19 15:01:12.000000000 -0700
@@ -1,9 +1,9 @@
 require 'rubygems'
 require 'yaml'
 require 'moneta'
-require 'moneta/couch'
+require 'moneta/file'

-cache = Moneta::Couch.new(:db => 'football')
+cache = Moneta::File.new(:path => '/tmp/football')

 cache['1a_final'] = {
   :where => 'Laramie; War Memorial Stadium',

The rest of the code works without alteration. The Moneta API is designed to be very similar to that of Hash. It has a limited feature set (for example, it doesn’t currently support iteration or partial matches), but the features it provides work identically across all of the supported stores. If your Key-Value Store needs are simple and you want something that can work with whatever store your users want to use, definitely check out Moneta; it’s a well-written tool.
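The idea behind a unified, Hash-like interface can be sketched in a few lines of plain Ruby. This is not Moneta's implementation: `MemoryStore`, `FileStore`, and `record_game` are hypothetical names invented for this example, and the file-backed store simply marshals one value per file. The point is that the calling code only relies on `[]` and `[]=`, so either store can be swapped in.

```ruby
require 'tmpdir'

# Hypothetical in-memory store exposing a small Hash-like API.
class MemoryStore
  def initialize
    @data = {}
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value
  end

  def key?(key)
    @data.key?(key)
  end

  def delete(key)
    @data.delete(key)
  end
end

# Hypothetical file-backed store with the same API: one file per key.
class FileStore
  def initialize(path)
    @path = path
    Dir.mkdir(@path) unless Dir.exist?(@path)
  end

  def [](key)
    key?(key) ? Marshal.load(File.binread(file_for(key))) : nil
  end

  def []=(key, value)
    File.binwrite(file_for(key), Marshal.dump(value))
  end

  def key?(key)
    File.exist?(file_for(key))
  end

  def delete(key)
    value = self[key]
    File.delete(file_for(key)) if key?(key)
    value
  end

  private

  # Hex-encode the key so that any string is a safe file name.
  def file_for(key)
    File.join(@path, key.to_s.unpack1('H*'))
  end
end

# Client code that works with any store offering [] and []=.
def record_game(cache)
  cache['1a_final'] = { :where => 'Laramie; War Memorial Stadium', :when => '11:30 MST' }
  cache['1a_final']
end

stores  = [MemoryStore.new, FileStore.new(Dir.mktmpdir('football'))]
results = stores.map { |store| record_game(store) }
puts results.inspect
```

Keeping the shared interface this small is what makes it practical to support many backends, and it is also why features such as iteration are harder to offer uniformly.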

With that, we’ve reached the end of this series. It’s been fun to explore the unique features of each of these non-SQL key-value data stores, as well as the threads that unify their different approaches to the problem. I hope that I’ve exposed you to new and useful tools.

The landscape of Key-Value Stores is changing rapidly, so it is difficult to stay fully informed all the time. For instance, just a couple days ago there was a blog post implementing a SQL front end for CouchDB. It’s done in Perl, but all it would take is an interested person and a little time, and you could have it in Ruby, too.

If you use a Key-Value Store system, or plan to, keep your eyes open for new developments, because you can bet that someone else will have something interesting next week or next month that may change the landscape again. As always, leave feedback in the comments, and thanks for reading!