Making ActiveResource 34x faster: QActiveResource

One of the things that really hurts us when we’re pulling data over ActiveResource (as we do with Shopify and a couple of internal projects) is the slowness of Rails’ ActiveResource. Using the Nokogiri backend helps a lot, but it’s still painfully slow. It’s so slow that the bottleneck is actually parsing the data rather than transferring it.

So I set off yesterday to rewrite our importer in C++ using Qt (and libcurl to grab the data). The result is a nice Qt-ified ActiveResource consumer that throws things into a collection of QVariant / hashes / lists.

Once I had that done I wondered, “What would the performance look like if I wrote a Ruby wrapper for the C++ implementation?” The answer was, fortunately, “Great!”, which means the application logic can stay in Ruby with QActiveResource doing the heavy lifting.

It’s still relatively light on features (it only supports the find method so far), but the numbers speak for themselves:

In the benchmark chart, the first column is the default pure-Ruby ActiveResource implementation; the second is the same, but using the implemented-in-C Nokogiri backend. The third uses my C++ backend directly, and the fourth is that backend bound to Ruby objects.

Methodology:

The data set is the set of products that we have in our test shop on Shopify. There are 17 of them, which together make up a 36k XML file. Each test reads that set 100 times. To remove other bottlenecks, I copied the file to a web server on localhost and queried that directly.
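
As a rough illustration of that loop, here is a minimal timing sketch using Ruby's standard Benchmark module. It is not the actual test script from the repository: `time_reads` is a hypothetical helper, and the block passed to it is a stand-in for "fetch and parse the product list once".

```ruby
require 'benchmark'

# Hypothetical helper: run the given block `iterations` times and
# return the total elapsed wall-clock time in seconds.
def time_reads(iterations = 100)
  Benchmark.realtime do
    iterations.times { yield }
  end
end

# Stand-in workload (assumption): each "read" just counts the 17 records
# instead of hitting a server, so the sketch runs anywhere.
records_per_read = 17
total = 0
elapsed = time_reads(100) { total += records_per_read }

puts "read #{total} records in #{format('%.4f', elapsed)}s"
```

To time a real backend, the block would be replaced with the actual find call for that backend.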

So that’s 1700 records read per backend over 100 requests. The average times were:

  • Ruby / ActiveResource / REXML: 34.60 seconds
  • Ruby / ActiveResource / Nokogiri: 12.87 seconds
  • C++ / QActiveResource: 0.79 seconds
  • Ruby / QActiveResource: 1.01 seconds
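
To put those timings in relative terms, this small sketch divides the REXML baseline by each of the averages listed above. The numbers are copied from the list; the variable names are only for illustration.

```ruby
# Average times in seconds, as reported above.
timings = {
  'Ruby / ActiveResource / REXML'    => 34.60,
  'Ruby / ActiveResource / Nokogiri' => 12.87,
  'C++ / QActiveResource'            => 0.79,
  'Ruby / QActiveResource'           => 1.01
}

baseline = timings['Ruby / ActiveResource / REXML']

# Speedup factor of each backend relative to the pure-Ruby default.
speedups = timings.transform_values { |seconds| (baseline / seconds).round(1) }

speedups.each do |name, factor|
  puts format('%-34s %5.1fx', name, factor)
end
```

The Ruby-bound QActiveResource comes out at roughly 34 times the speed of the default backend, which is where the figure in the title comes from.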

All of the code is up on GitHub here, including the test data and the raw results.

API in Ruby and C++:

The Ruby API is very similar to a limited version of ActiveResource and supports things like this:

resource = QAR::Resource.new(ENV['AR_BASE'], ENV['AR_RESOURCE'])
resource.find(:all, :params => { :page => 1 }).each { |record| puts record[:title] }

The C++ implementation also follows the same basic conventions, e.g.

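#include <QDebug>            // for qDebug(); the original include targets were lost in extraction
#include <QActiveResource.h> // assumed header name for the library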
 
int main()
{
    QActiveResource::Resource resource(QUrl(getenv("AR_BASE")), getenv("AR_RESOURCE"));
 
    foreach(QActiveResource::Record record, resource.find())
    {
        qDebug() << record["title"].toString();
    }
 
    return 0;
}

At present this naturally requires Qt and libcurl to build. I may consider building a version which doesn’t require Qt if there’s general interest in such (we use Qt in the C++ bits of our code anyway, so there was no extra cost to schlurping it in).

There are more examples in the Ruby directory on GitHub. Once it matures a wee bit we’ll get it packaged up as a Gem.

Edit:

The API has already been going through some changes. It can now easily be used as a mixin, a la:

require 'rubygems'
require 'active_resource'
require 'QAR'
 
class Product < ActiveResource::Base
  extend QAR
  self.site = 'http://localhost/'
end
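
For anyone wondering why a single extend is enough, here is a self-contained sketch of the mechanism. It does not show how QAR is actually implemented: `SlowBase` and `FastFind` are hypothetical stand-ins for ActiveResource::Base and the QAR module. The point is only that a module added with extend is consulted before the class methods inherited from the parent class.

```ruby
# Hypothetical stand-in for ActiveResource::Base, greatly simplified.
class SlowBase
  def self.find(scope)
    "slow find(#{scope.inspect})"
  end
end

# Hypothetical stand-in for the QAR mixin: supplies a replacement `find`.
module FastFind
  def find(scope)
    "fast find(#{scope.inspect})"
  end
end

class Product < SlowBase
  # extend adds FastFind's methods to Product as class methods; they are
  # looked up before the class methods inherited from SlowBase.
  extend FastFind
end

puts Product.find(:all)   # uses FastFind#find
puts SlowBase.find(:all)  # the parent class is unaffected
```

Because only the extended class changes, other resources in the same application can keep using the stock implementation.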