Hierarchical Tagging with Rails 3 and Closure Trees

In rebuilding PhotoStructure on Rails, I was surprised that one of the most popular gem for trees used a nested set model. Nested sets are performant for reads, but for adding and deleting nodes, it’s extremely expensive — on average, half of the rows for your model acting as a nested set have to be updated for every add or delete. That’s crazy-talk! It requires a table-level lock for something that, at least for PhotoStructure, is a very common task. To top it off, the gem doesn’t even do the lock!

I then found the ancestory gem, which works by materializing the ancestral path as a string and storing that as a column. Bill Karwin’s excellent Models for hierarchical data presentation describes this as the Path Enumeration algorithm, which doesn’t have referential integrity, and relies on performant LIKE selects.

As the tag hierarchies in PhotoStructure are never moved, closure trees should prove to be ideal. It’s an excellent excuse to learn how to build and publish a rails plugin gem, so I’ve built a new gem to support closure trees, and I named it “closure_tree”. Check it out at:

http://mceachen.github.io/closure_tree/

 

Update, May 24

1.0.0.beta1 released. All public methods have test coverage.

Update, May 25

1.0.0.beta2 and 1.0.0.beta3 were released, which added find_or_create class and instance methods, ancestry_path, and root instance methods. Documentation is on the README and in the rdocs.

Update, May 29

1.0.0.beta5 is released, which cleaned up the ancestor and descendant relationships to use has_and_belongs_to_many, and adds leaves class and instance methods.

Update, Oct 26

2.0.0.beta1 is released, which added the :dependant option, switched from an embedded dummy rails app for testing to rspec, and much better test coverage (including tests and fixes for a couple reported issues) under sqlite, MySQL, and PostgreSQL.

Update, Nov 27

3.0.0 is released, which supports polymorphic trees.

Update, June 2014

BOOM. Lots of stuff since the last update!

  • support for concurrency, 5 rubies, and all current rails versions
  • more than 20 contributing engineers
  • 65 released versions in total
  • more than 80,000 downloads
Tagged with: , , ,
Posted in Technical HOWTOs
  • Upstill

    Matthew,
    This is exactly what I’ve been looking for–I can’t believe mine is the first comment. Two questions though: 1) what do you mean by “polymorphic”? 2) Is it a hard requirement that a tree be strictly hierarchical? I have a requirement to be able to put any key under any other key (while keeping the graph acyclic, of course), and 3) I want to edit a large number of terms by dragging and dropping onto an expandable hierarchical list. Do you know of any, say, jQuery tools for doing that?
    Cheers,
    Steve Upstill

  • Danchik Butcher

    Nicely done Matthew! Especially i like :name_column option with ability to get ancestry_path
     and then found it by those column, it’s very cool feature i think! Very helpful, thanks a lot!

  • Anonymous

    All existing nodes? ActiveRecord’s .all() should work: http://api.rubyonrails.org/classes/ActiveRecord/FinderMethods.html#method-i-all

  • Anonymous

    > 1) what do you mean by “polymorphic”? It means you can use STI, or single-table inheritance, in the model that you mark as acts_as_tree — so you can have subclass Tag with DateTags, ColorTags, and BaconTags, and when you fetch the tags back with .ancestors() or .descendants(), the entities will be instances of the correct class.> 2) Is it a hard requirement that a tree be strictly hierarchical?Yes. There is an underlying assumption that there is exactly zero or one parent for any given node in the tree.