Making ActiveResource 34x faster: QActiveResource
One of the things that really hurts us when we’re pulling data from ActiveResource (as we do with Shopify and a couple of projects internally) is the slowness of Rails’ ActiveResource. Using the Nokogiri backend helps a lot, but it’s still painfully slow. It’s so slow that the bottleneck is actually parsing the data rather than transferring it.
So I set off yesterday to rewrite our importer in C++ using Qt (and libcurl to grab the data). The result is a nice Qt-ified ActiveResource consumer that throws things into a collection of QVariant / hashes / lists.
Once I had that done I wondered, “What would the performance look like if I wrote a Ruby wrapper for the C++ implementation?” The answer was, fortunately, “Great!” That means the application logic can stay in Ruby with QActiveResource doing the heavy lifting.
It’s still relatively light on features — it just supports the find method, but the numbers speak for themselves:
The first column is the default pure-Ruby ActiveResource implementation; the second is the same, but using the implemented-in-C Nokogiri backend. The third uses my C++ backend directly, and the fourth is that backend bound to Ruby objects.
Methodology:
The data set is the set of products that we have in our test shop on Shopify. There are 17 of them, in a single 36 KB XML file. Each test reads that set 100 times. To remove other bottlenecks, I copied the file to a web server on localhost and queried that directly.
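For anyone who wants to repeat this kind of measurement, a timing harness can be sketched roughly as below. This is a minimal sketch, not the script used for the numbers here: the `average_time` helper and its `runs` parameter are hypothetical, and the block passed to it is a stand-in for whatever fetch call is being measured.

```ruby
require 'benchmark'

# Hypothetical helper: runs `runs` timed passes, each of which calls the
# given block `iterations` times, and returns the mean wall-clock time
# of one pass in seconds.
def average_time(runs: 3, iterations: 100)
  totals = Array.new(runs) do
    Benchmark.realtime do
      iterations.times { yield }
    end
  end
  totals.sum / runs
end

# Stand-in workload: in a real test the block would fetch and parse the
# product list from the local web server.
calls = 0
seconds = average_time(runs: 2, iterations: 100) { calls += 1 }
puts format('%.6f seconds per pass (%d fetches in total)', seconds, calls)
```

Swapping the stand-in block for a real `find` call against each backend would give one comparable figure per backend.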
So that’s reading 1700 records for each backend over 100 requests. The average times were:
- Ruby / ActiveResource / REXML: 34.60 seconds
- Ruby / ActiveResource / Nokogiri: 12.87 seconds
- C++ / QActiveResource: 0.79 seconds
- Ruby / QActiveResource: 1.01 seconds
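To make the ratios explicit, the speedups can be worked out from the four averages above. The snippet below only does that arithmetic; the `TIMES` hash and `speedup` helper are names made up for this illustration.

```ruby
# Average times in seconds, copied from the list above.
TIMES = {
  'Ruby / ActiveResource / REXML'    => 34.60,
  'Ruby / ActiveResource / Nokogiri' => 12.87,
  'C++ / QActiveResource'            => 0.79,
  'Ruby / QActiveResource'           => 1.01
}.freeze

# How many times faster `candidate` is than `baseline`, to one decimal.
def speedup(baseline, candidate)
  (TIMES.fetch(baseline) / TIMES.fetch(candidate)).round(1)
end

TIMES.each_key do |backend|
  next if backend == 'Ruby / QActiveResource'
  ratio = speedup(backend, 'Ruby / QActiveResource')
  puts "#{backend} takes #{ratio}x as long as Ruby / QActiveResource"
end
```

The REXML figure works out to roughly 34x, which is where the number in the title comes from. The Ruby binding costs about 0.2 seconds over the bare C++ version for the same 1700 records.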
All of the code is up on GitHub here, including the test data and the raw results.
API in Ruby and C++:
The Ruby API is very similar to a limited version of ActiveResource and supports things like this:
    resource = QAR::Resource.new(ENV['AR_BASE'], ENV['AR_RESOURCE'])
    resource.find(:all, :params => { :page => 1 }).each { |record| puts record[:title] }
The C++ implementation also follows the same basic conventions, e.g.
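    // The include targets were lost from this listing; the two header names below are assumed.
    #include <QActiveResource.h>
    #include <QDebug>

    int main()
    {
        QActiveResource::Resource resource(QUrl(getenv("AR_BASE")), getenv("AR_RESOURCE"));

        foreach(QActiveResource::Record record, resource.find())
        {
            qDebug() << record["title"].toString();
        }

        return 0;
    }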
At present this naturally requires Qt and libcurl to build. I may consider building a version which doesn’t require Qt if there’s general interest in such (we use Qt in the C++ bits of our code anyway, so there was no extra cost to schlurping it in).
There are more examples in the Ruby directory on GitHub. Once it matures a wee bit we’ll get it packaged up as a Gem.
Edit:
The API’s already been going through some changes. It can now easily be used as a mixin, a la:
    require 'rubygems'
    require 'active_resource'
    require 'QAR'

    class Product < ActiveResource::Base
      extend QAR
      self.site = 'http://localhost/'
    end
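For readers less familiar with `extend`, here is a small plain-Ruby illustration of why a mixin like this can take over `find` for one class without touching the base class. `SlowBase` and `FastFinder` are hypothetical stand-ins for `ActiveResource::Base` and `QAR`; this shows only the Ruby mechanism, not how QAR itself is implemented.

```ruby
# Hypothetical stand-in for ActiveResource::Base.
class SlowBase
  def self.find(*args)
    "SlowBase.find(#{args.map(&:inspect).join(', ')})"
  end
end

# Hypothetical stand-in for the QAR module. Methods defined here become
# class-level methods on whatever class extends the module.
module FastFinder
  def find(*args)
    "FastFinder#find(#{args.map(&:inspect).join(', ')})"
  end
end

class Product < SlowBase
  extend FastFinder # sits ahead of SlowBase's class methods in lookup order
end

class Variant < SlowBase
  # No extend here, so the inherited class method is used.
end

puts Product.find(:all)
puts Variant.find(:all)
```

Because the module is looked up before the superclass's class methods, only the classes that opt in get the replacement `find`; everything else keeps the stock behaviour.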