Using DataMapper and Rails with Legacy Schemas
Today’s guest post hails from community contributor Piotr Solnica of Code Benders, an agile software collective based in Bend, Oregon. In his free time Piotr supports OSS, is a DataMapper](http://datamapper.org/) core contributor, an author of DataMapper plugins and adapters and one of the developers of rubytime.org and coderack.org.
Ruby on Rails makes it easy to bootstrap a greenfield project. Unfortunately not all projects are greenfield, so sometimes you need to work with a legacy database schema. ActiveRecord is a great choice when you have a full control over your database from the very beginning, but what should you do if you need to connect to a database with a schema that isn’t in-line with Rails conventions? In this post I will explain how to setup a Rails application that uses DataMapper as the ORM. I’ll also guide you through configuring your models to work with a legacy schema. If you are not familiar with DataMapper I recommend reading the “Getting Started” section of the official documentation. ###A legacy schema For the sake of this post let’s use an example schema of a blog application using MySQL:
CREATE TABLE `post` (
`postId` varchar(45) NOT NULL,
`postAuthorId` int(11) DEFAULT NULL,
`postTitle` varchar(45) DEFAULT NULL,
`postContent` text,
`postPublishTime` varchar(45) DEFAULT NULL,
PRIMARY KEY (`postId`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `user` (
`userId` int(11) NOT NULL AUTO_INCREMENT,
`userName` varchar(45) DEFAULT NULL,
`userEmail` varchar(45) DEFAULT NULL,
PRIMARY KEY (`userId`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
Here’s what’s “wrong” with it:
- table names are singularized
- column names use camel-case naming convention prefixed with a table name
- post primary key is an md5 sum of concatenated values of “author” and “publishTime”
- post => user foreign key has a custom name “author”
- “postPublishTime” is stored as a string and we would like to use DateTime object
###Creating a rails application with DataMapper as ORM
This is actually a straight-forward task. All you need to do is to use a template provided by DataMapper when generating an application skeleton as follows:
rails new blog-legacy -m http://datamapper.org/templates/rails.rb
By default the skeleton uses sqlite as the database. To change that, simply edit Gemfile and change ‘dm-sqlite-adapter’ to ‘dm-mysql-adapter’ and run bundle install. To generate empty models run:
script/rails g model user
script/rails g model post
###Naming conventions DataMapper gives you two ways of defining table and column names. You can either set it manually in every model or you can configure global naming conventions for all models.Here’s how to set table and field names explicitly in the model:
class User
include DataMapper::Resource
storage_names[:default] = "user"
property :id, Serial, :field => "userId"
property :name, String, :field => "userName"
property :email, String, :field => "userEmail"
end
With many properties it would add clutter to our models, that is why it is better to set global naming conventions for resources and fields. Those conventions will be applied to every model and will eliminate the need to define storage names and set :field options for every property.
To set global naming conventions, we can create datamapper.rb initializer like this:
# config/initializers/datamapper.rb
Rails::DataMapper.configuration.resource_naming_convention[:default] = lambda do |name|
name.downcase.singularize
end
# note: this will be available in dm-rails 1.1.1
Rails::DataMapper.configuration.field_naming_convention[:default] = lambda do |property|
"#{property.model.name.downcase}#{property.name.to_s.camelize}"
end
Now we can clean up our User model like this:
class User
include DataMapper::Resource
property :id, Serial
property :name, String
property :email, String
end
Much better! ###Dealing with natural primary keys In our example schema, a post’s primary key is an MD5 sum of concatenated fields “postAuthorId” and “postPublishTime”. If we want to have the value generated in Ruby we can use :default options in property declaration:
class Post
include DataMapper::Resource
property :id, String, :key => true, :default => lambda { |resource, property|
Digest::MD5.hexdigest("#{author}#{publish_time}") }
# ...
end
Now whenever you create a post and don’t provide :id it will be generated automatically based on the values of :author and :publish_time properties.
As you can see this doesn’t look very nice in the model. So, we could create a custom property with a generalized functionality to clean this up! This way we can re-use this code in different scenarios whenever we want an MD5 sum of concatenated fields. You might be surprised how easy this is to do:
module ::DataMapper
class Property
class MD5 < String
key true
length 32
accept_options :fields
default lambda { |resource, property|
Digest::MD5.hexdigest(property.options[:fields].join) }
end
end
end
Now we can define post :id like this:
class Post
include DataMapper::Resource
property :id, MD5, :fields => [ :author, :publish_time ]
end
How easy was that?!? ###Associations with custom foreign keys Our database has a basic 1:M relationship between User and Post models. The only caveat is that foreign key on post table pointing to user id has a name that doesn’t follow our global naming conventions. To configure the association we must override our defaults:
class User
include DataMapper::Resource
property :id, Serial
property :name, String
property :email, String
# We need to set child_key otherwise DataMapper would use :user_id
has n, :posts, :child_key => :author_id
end
class Post
include DataMapper::Resource
property :id, MD5, :fields => [ :author, :publish_time ]
property :title, String
property :content, Text
# This creates a property called :author_id for you. We also set “User” as the parent model
belongs_to :author, “User”
end
Easy enough. ###Fine tuning - typecasting properties Post :publish_time is a String property as values are stored in a VARCHAR column. Our goal is to work with DateTime objects, not strings. To achieve that we could simply override property accessor and do typecasting like this:
class Post
# …
def publish_time
DateTime.parse(super)
end
end
While this approach works, it is not a recommended approach. DataMapper is based on the concept that a model’s logic should be decoupled from the database schema. A much better and cleaner way is to abstract away that logic in to a custom property object. An extra benefit comes with that too - you can re-use that code in many applications as it’s not tied to a specific model. Here’s another approach:
module DataMapper
class Property
class DateTimeString < String
def load(value)
::DateTime.parse(value)
end
def dump(value)
value.to_s
end
end
end
end
Better? ###Wrap-up All right! We’ve got all we need to work with our legacy schema. Let’s take a look at the final version of the models:
class User
include DataMapper::Resource
property :id, Serial
property :name, String
property :email, String
has n, :posts, :child_key => :author_id
end
class Post
include DataMapper::Resource
property :id, MD5, :fields => [ :author, :publish_time ]
property :publish_time, DateTimeString
property :title, String
property :content, Text
belongs_to :author, "User"
end
At first sight it is hard to tell that the underlying database has a funky schema, isn’t it? These models look almost as if you were starting a fresh project.
Let’s take a look at how we could work with them:
user = User.create(:name => "John Doe", :email => "[email protected]")
=> #<User @id=2 @name="John Doe" @email="[email protected]">
This creates a following insert statement:
INSERT INTO `user` (`userName`, `userEmail`) VALUES ('John Doe', '[email protected]')
Let’s create some posts too:
user.posts.create(:title => "Hello World!", :content => "DataMapper is awesome :)", :publish_time => DateTime.now)
=> #<Post @id="4a585935e1d2dd21da146f6d3bcfc41b" @publish_time="2011-05-12T13:39:04+02:00" @title="Hello World!" @content="DataMapper is awesome :)" @author_id=2>
And SQL:
INSERT INTO `post` (`postId`, `postPublishTime`, `postTitle`, `postContent`, `postAuthorId`) VALUES ('4a585935e1d2dd21da146f6d3bcfc41b', '2011-05-12T13:39:04+02:00', 'Hello World!', 'DataMapper is awesome :)', 2)
###Further reading and resources This post explains basics as well as some more advanced things you can do with DataMapper. If you would like to learn more, please take a look at the documentation, subscribe to the Mailing List or jump into the #datamapper IRC channel.
Other useful links:
Author Bio
Piotr Solnica is a team member atCode Benders, an agile software collective based in Bend, Oregon. When Piotr’s not writing great code, he’s writing awesome code. He has been working as a web developer for 6+ years and has solid experience with Ruby-based technologies and frameworks. His proficiencies include a range of JavaScript libraries such as Prototype, jQuery, YUI and MooTools. Piotr is an unobtrusive JavaScript evangelist, an HTML/CSS purist and a linux geek. In his free time he supports OSS, is a DataMapper core contributor, an author of DataMapper plugins and adapters and one of the developers of rubytime.org and coderack.org.
Share your thoughts with @engineyard on Twitter