tagging.rb |
|
|---|---|
Tagging |
|
Intro |
|
|
When building a Web 2.0 application, tagging will probably come up as one of the most requested features. Popularized by Delicious, it has quickly become a useful way to organize crowd-sourced data. |
|
How it was done |
|
|
Typically, when you do tagging using an RDBMS, you’ll probably end up having a taggings and a tags table, hence a many-to-many design. Here is a quick sketch just to illustrate:
As you can see, this design leads to a lot of problems:
|
|
The Ohm approach |
|
|
Here is a basic outline of what we’ll need: a way to tag a post with a comma-separated list of words, and a way to find a post by its tags.
|
|
Beginning with our Post model |
|
|
Let’s first require ohm. |
require 'ohm' |
|
We then declare our class, inheriting from Ohm::Model. |
class Post < Ohm::Model |
|
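The structure, fields, and other associations are defined in a declarative
manner. Ohm allows us to declare attributes, sets, lists and
counters. For our use case here, only two attributes will get the job
done. The body holds plain text, tags holds a comma-separated string of words, and the index points at the tag method defined below. |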
  attribute :body
  attribute :tags
  index :tag |
|
One very interesting thing about Ohm indexes is that an index can be either a
String or an Enumerable data structure. When we declare it as an
Enumerable, Ohm indexes every element, so a Post can be found by any one of its tags.
Pretty neat, ain’t it? |
  def tag
    tags.to_s.split(/\s*,\s*/).uniq
  end
end |
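To see what the tag method hands to the index, the splitting expression can be tried on its own, without Ohm or Redis. The sketch below is only an illustration: split_tags is a hypothetical helper name that repeats the expression used in Post#tag.

```ruby
# Hypothetical standalone helper mirroring the body of Post#tag.
def split_tags(tags)
  tags.to_s.split(/\s*,\s*/).uniq
end

split_tags("tagging, ohm, redis")  # three separate tags
split_tags("ruby ,redis,  ruby")   # whitespace around commas is dropped, duplicates removed
split_tags(nil)                    # nil becomes "" and yields an empty list
split_tags(" ruby, redis")         # whitespace at the very start of the string is kept
```

Only the whitespace around commas is consumed, so input such as " ruby, redis" keeps its leading space. Calling strip on the string first would be one way to handle that, if it matters for your data.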
Testing it out |
|
|
It’s a very good habit to test all the time. In the Ruby community, a lot of test frameworks have been created. |
|
|
For our purposes in this example, we’ll use cutest. |
require "cutest" |
|
Cutest allows us to define callbacks which are guaranteed to be executed
every time a new test begins. Here we flush the database so that every test starts from an empty Redis. |
prepare { Ohm.flush } |
|
Next, let’s create a simple Post in a setup block. Cutest passes the block’s return value to each test as its argument. |
setup do
  Post.create(:body => "Ohm Tagging", :tags => "tagging, ohm, redis")
end |
|
For our first run, let’s verify that we can find a Post using any single one of its tags. |
test "find using a single tag" do |p|
  assert Post.find(tag: "tagging").include?(p)
  assert Post.find(tag: "ohm").include?(p)
  assert Post.find(tag: "redis").include?(p)
end |
|
Now we verify our earlier claim that a Post can be found using any combination of its tags. We also verify that if we pass in a non-existent tag name,
we’ll fail to find the Post. |
test "find using an intersection of multiple tag names" do |p|
  assert Post.find(tag: ["tagging", "ohm"]).include?(p)
  assert Post.find(tag: ["tagging", "redis"]).include?(p)
  assert Post.find(tag: ["ohm", "redis"]).include?(p)
  assert Post.find(tag: ["tagging", "ohm", "redis"]).include?(p)
  assert !Post.find(tag: ["tagging", "foo"]).include?(p)
end |
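The behavior these tests exercise, one index entry per tag and an intersection when several tags are given, can be pictured with plain Ruby sets. This is a minimal in-memory sketch and not Ohm's actual implementation: TagIndex, add and find are hypothetical names, and no Redis is involved.

```ruby
require "set"

# Hypothetical in-memory stand-in for a per-tag index.
class TagIndex
  def initialize
    # tag name => Set of post ids carrying that tag
    @ids_by_tag = Hash.new { |hash, tag| hash[tag] = Set.new }
  end

  # Registers the post id under every tag in the comma-separated string.
  def add(post_id, tags)
    tags.to_s.split(/\s*,\s*/).uniq.each { |tag| @ids_by_tag[tag] << post_id }
  end

  # Returns the ids of posts that carry every one of the given tags.
  def find(*tags)
    sets = tags.flatten.map { |tag| @ids_by_tag.fetch(tag) { Set.new } }
    sets.reduce(:&) || Set.new
  end
end

index = TagIndex.new
index.add(1, "tagging, ohm, redis")
index.add(2, "ruby, redis")

index.find("redis")            # both posts carry this tag
index.find("tagging", "redis") # only the first post carries both
index.find("tagging", "foo")   # unknown tag, so the intersection is empty
```

An unknown tag contributes an empty set, which is why a single non-existent name is enough to make the whole lookup come back empty.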
Adding a Tag model |
|
|
Let’s pretend that the client suddenly requested that we keep track of the number of times a tag has been used. It’s a pretty fair requirement after all. Updating our requirements, we now have a third one: keep a count of how many times each tag has been used.
|
|
|
Continuing from our example above, let’s require ohm/contrib. |
require "ohm/contrib" |
|
Let’s quickly re-open our Post class. |
class Post |
|
When we want our class to have extended functionality like callbacks,
we simply include the necessary modules, in this case Ohm::Callbacks. |
include Ohm::Callbacks |
|
To make our code more concise, we just quickly change our implementation
of tag so that it accepts the tags to split as an optional argument, defaulting to self.tags. |
  def tag(tags = self.tags)
    tags.to_s.split(/\s*,\s*/).uniq
  end |
|
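For all but the most simple cases, we would probably need to define
callbacks. When we included Ohm::Callbacks, we gained hooks such as before_update and after_save, both of which are used below.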
|
|
|
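For our scenario, we only need a before_update and an after_save. Before updating, we decrement the total of every tag in the value that read_remote(:tags) returns, that is, the tags as currently stored for this record. |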
protected
  def before_update
    tag(read_remote(:tags)).map(&Tag).each { |t| t.decr :total }
  end |
|
And of course, we increment all new tags for a particular record after successfully saving it. |
  def after_save
    tag.map(&Tag).each { |t| t.incr :total }
  end
end |
Our Tag model |
|
|
The Tag model only needs a single counter, total. |
class Tag < Ohm::Model
  counter :total |
|
The syntax for finding a record by its ID is Tag[id]. To simplify our code, we override Tag.[] so that it creates the record when none exists yet for the given id. |
  def self.[](id)
    super(encode(id)) || create(:id => encode(id))
  end
end |
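The super(...) || create(...) idiom is a find-or-create lookup. The sketch below shows the same shape in plain Ruby, under the assumption of a simple in-memory store: MemoryModel and CountedTag are hypothetical classes made up for illustration, and the encoding step from the real code is left out.

```ruby
# Hypothetical in-memory base class standing in for a Redis-backed model.
class MemoryModel
  attr_reader :id

  # Each subclass keeps its own id => record table.
  def self.records
    @records ||= {}
  end

  # Plain lookup: returns nil when the id is unknown.
  def self.[](id)
    records[id]
  end

  def self.create(id)
    records[id] = new(id)
  end

  def initialize(id)
    @id = id
  end
end

class CountedTag < MemoryModel
  attr_reader :total

  def initialize(id)
    super
    @total = 0
  end

  # Find-or-create: fall back to create when the lookup returns nil.
  def self.[](id)
    super(id) || create(id)
  end

  def incr
    @total += 1
  end
end

CountedTag["ruby"].incr  # first access creates the record
CountedTag["ruby"].incr  # second access reuses it
```

With this override, callers never have to check whether a tag record exists before incrementing or decrementing it.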
Verifying our third requirement |
|
|
Continuing from our test cases above, let’s add test coverage for the behavior of counting tags. |
|
|
Every tag we initially create should have a total of exactly 1. |
test "verify total to be exactly 1" do
  assert 1 == Tag["ohm"].total
  assert 1 == Tag["redis"].total
  assert 1 == Tag["tagging"].total
end |
|
If we create another post tagged “ruby, redis”, the total for redis should rise to 2, while ruby starts at 1 and the other tags stay at 1. |
test "verify totals increase" do
  Post.create(:body => "Ruby & Redis", :tags => "ruby, redis")
  assert 1 == Tag["ohm"].total
  assert 1 == Tag["tagging"].total
  assert 1 == Tag["ruby"].total
  assert 2 == Tag["redis"].total
end |
|
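Finally, let’s verify the scenario where we create a Post tagged “ruby, redis” and then update it to keep only “redis”. The total for ruby should drop to 0 while redis stays at 2. |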
test "updating an existing post decrements the tags removed" do
  p = Post.create(:body => "Ruby & Redis", :tags => "ruby, redis")
  p.update(:tags => "redis")
  assert 0 == Tag["ruby"].total
  assert 2 == Tag["redis"].total
end |
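The numbers in this last test follow from a decrement of the old tags and then an increment of the new ones. The arithmetic can be traced with a plain Hash. This is a sketch under stated assumptions: TagTotals is a hypothetical in-memory tally, not the Tag model, and its create and update methods only imitate the order in which the callbacks run.

```ruby
# Hypothetical in-memory tally standing in for the Tag counters.
class TagTotals
  def initialize
    @totals = Hash.new(0)
  end

  def [](tag)
    @totals[tag]
  end

  # Imitates after_save on a new record: every tag is incremented.
  def create(tags)
    split(tags).each { |tag| @totals[tag] += 1 }
  end

  # Imitates before_update (decrement old tags), then after_save (increment new tags).
  def update(old_tags, new_tags)
    split(old_tags).each { |tag| @totals[tag] -= 1 }
    split(new_tags).each { |tag| @totals[tag] += 1 }
  end

  private

  def split(tags)
    tags.to_s.split(/\s*,\s*/).uniq
  end
end

totals = TagTotals.new
totals.create("tagging, ohm, redis")   # the post from the setup block
totals.create("ruby, redis")           # redis: 2, ruby: 1
totals.update("ruby, redis", "redis")  # ruby: 1 - 1, redis: 2 - 1 + 1
```

A tag kept across the update is decremented and then incremented again, so its total is unchanged, while a removed tag only sees the decrement.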
Conclusion |
|
|
Most of the time we tend to think in RDBMS terms, and this is in no way a negative thing. However, it is important to try and switch your frame of mind when working with Ohm (and Redis), because doing so can save you a great deal of time and possibly lead to a better design. |
|