Archive for May 2010

What programming languages do our customers use?

No huge shockers, but a few places where it’s interesting to compare perceptions to reality:

  • PHP, for better or worse, still dominates web development.
  • Python’s much closer to Ruby in the usage we see, despite making less noise.
  • Scala, despite being the new hotness, has very little actual uptake.
  • Perl is all but out of the game (though still used more than Scala).

Making ActiveResource 34x faster: QActiveResource

One of the things that really hurts us when we’re pulling data via ActiveResource (as we do with Shopify and a couple of projects internally) is the slowness of Rails’ ActiveResource. Using the Nokogiri backend helps a lot, but it’s still painfully slow. It’s so slow that the bottleneck is actually parsing the data rather than transferring it.

So I set off yesterday to rewrite our importer in C++ using Qt (and libcurl to grab the data). The result is a nice Qt-ified ActiveResource consumer that throws things into a collection of QVariant / hashes / lists.

Once I had that done I wondered, “What would the performance look like if I wrote a Ruby wrapper for the C++ implementation?” The answer was, fortunately, “Great!”, which means that the application logic can stay in Ruby with QActiveResource doing the heavy lifting.

It’s still relatively light on features — it just supports the find method, but the numbers speak for themselves:

The first column is the default pure-Ruby ActiveResource implementation; the second is the same, but using the implemented-in-C Nokogiri backend. The third is my C++ backend used directly and the fourth is that backend bound to Ruby objects.


The data set is the set of products that we have in our test shop on Shopify. There are 17 of them, in a single 36k XML file. Each test reads that set 100 times. To remove other bottlenecks, I copied the file to a web server on localhost and queried that directly.
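For anyone wanting to try something similar, here is a minimal sketch of that kind of timing loop using Ruby's standard Benchmark module. It is not the actual test script (that is in the repository); the `time_fetches` helper and the in-memory `fake_products` array are hypothetical stand-ins so the snippet runs without a server.

```ruby
require 'benchmark'

# Hypothetical helper: runs the given block `iterations` times and returns
# the total wall-clock time in seconds.
def time_fetches(iterations)
  Benchmark.realtime do
    iterations.times { yield }
  end
end

# Stand-in for one request: 17 fake records instead of a real backend call.
# In a real run the block would call find on whichever backend is under test.
fake_products = Array.new(17) { |i| { :title => "Product #{i}" } }

records_read = 0
elapsed = time_fetches(100) do
  fake_products.each { |record| record[:title] }
  records_read += fake_products.size
end

puts "read #{records_read} records in #{format('%.4f', elapsed)} seconds"
```

Swapping the body of the block for a real fetch is all it should take to compare backends under the same loop.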

So that’s 1700 records read for each backend over 100 requests. The average times were:

  • Ruby / ActiveResource / REXML: 34.60 seconds
  • Ruby / ActiveResource / Nokogiri: 12.87 seconds
  • C++ / QActiveResource: 0.79 seconds
  • Ruby / QActiveResource: 1.01 seconds
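To make the ratios explicit, the times above can be turned into speedups relative to the REXML baseline. This is just arithmetic on the numbers listed; the `TIMES` constant and the `speedups` helper are names made up for this sketch.

```ruby
# Average times in seconds, copied from the list above.
TIMES = {
  'Ruby / ActiveResource / REXML'    => 34.60,
  'Ruby / ActiveResource / Nokogiri' => 12.87,
  'C++ / QActiveResource'            => 0.79,
  'Ruby / QActiveResource'           => 1.01
}

# Hypothetical helper: returns each backend's speedup relative to the
# named baseline (baseline time divided by that backend's time).
def speedups(times, baseline_name = 'Ruby / ActiveResource / REXML')
  baseline = times.fetch(baseline_name)
  times.each_with_object({}) do |(name, seconds), result|
    result[name] = baseline / seconds
  end
end

speedups(TIMES).each do |name, factor|
  printf("%-34s %5.1fx\n", name, factor)
end
```

The Ruby-bound QActiveResource figure works out to roughly 34 times faster than the REXML default, which is where the number in the title comes from.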

All of the code is up on GitHub here, including the test data and the raw results.

API in Ruby and C++:

The Ruby API is very similar to a limited version of ActiveResource and supports things like this, for example:

resource = QAR::Resource.new(ENV['AR_BASE'], ENV['AR_RESOURCE']) # class name assumed, mirroring the C++ QActiveResource::Resource
resource.find(:all, :params => { :page => 1 }).each { |record| puts record[:title] }

The C++ implementation also follows the same basic conventions, e.g.

int main()
{
    QActiveResource::Resource resource(QUrl(getenv("AR_BASE")), getenv("AR_RESOURCE"));
    foreach(QActiveResource::Record record, resource.find())
    {
        qDebug() << record["title"].toString();
    }
    return 0;
}

At present this naturally requires Qt and libcurl to build. I may consider building a version which doesn’t require Qt if there’s general interest in such (we use Qt in the C++ bits of our code anyway, so there was no extra cost to schlurping it in).

There are more examples in the Ruby directory on GitHub. Once it matures a wee bit we’ll get it packaged up as a Gem.


The API’s already been going through some changes. It can now easily be used as a mixin, a la:

require 'rubygems'
require 'active_resource'
require 'QAR'
class Product < ActiveResource::Base
  extend QAR
  self.site = 'http://localhost/' # assumed: ActiveResource's standard site setting