Announcing TweetRex: Twitter friend recommender built on the Directed Edge recommendations engine

So a couple of weeks back we decided to have a little fun with the Twitter API and built a friend recommender using the Directed Edge engine. What was originally going to be a two-day project that our friend Björn Günzel and I wanted to bang out turned, as two-day projects are wont to do, into a 10-day project.

How hard could it be?

It turns out that spidering the Twitter graph in real time presented some interesting challenges for our database. This was the first extremely write-intensive application we’d tested: a single recommendations run can add thousands, or even hundreds of thousands, of items to our database as we progressively spider outward.
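To give a feel for why one run writes so much, here is a minimal sketch of a progressive, breadth-first spider. It is in Python purely for illustration and is not our actual code; `spider`, `fetch_friends`, `max_depth` and the toy graph are all made-up names standing in for the Twitter API and our database.

```python
from collections import deque

def spider(start, fetch_friends, max_depth=2):
    """Breadth-first crawl of a follow graph, returning the edges discovered."""
    edges = {}                      # user -> list of users they follow
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        user, depth = queue.popleft()
        if depth >= max_depth:
            continue                # frontier reached; do not expand further
        friends = fetch_friends(user)
        edges[user] = friends       # one database write per spidered user
        for friend in friends:
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    return edges

# Toy graph standing in for the Twitter API (hypothetical data).
TOY_GRAPH = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": ["frank"],
}
edges = spider("alice", lambda user: TOY_GRAPH.get(user, []))
```

Even in this toy version the number of writes grows with every level of the graph that gets expanded, which is where the thousands of items per run come from.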

To make things more fun, with all of those requests out to the Twitter API, the application needed to be heavily multithreaded, often using up to 500 concurrent threads split between spidering Twitter, computing recommendations and handling incoming requests.
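The reason threads help here is that most of the time is spent waiting on the network. A rough sketch of the idea, again in Python rather than our real code, with `fetch_all` and `fake_fetch` as hypothetical helpers and a short sleep standing in for a Twitter API call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_all(users, fetch_friends, max_workers=50):
    """Fetch the friend lists for many users concurrently."""
    users = list(users)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input users.
        results = pool.map(fetch_friends, users)
        return dict(zip(users, results))

def fake_fetch(user):
    """Stand-in for a slow network request (assumption, not the Twitter API)."""
    time.sleep(0.01)
    return [user + "_friend"]

result = fetch_all(["a", "b", "c"], fake_fetch, max_workers=3)
```

With I/O-bound work like this, the requests overlap instead of running back to back, so the total wait is closer to the slowest single request than to the sum of all of them.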

Sinatra and new database features:

We wanted to build this application using only the Directed Edge database, with no messy ORM layer and all that jazz.  Sinatra seemed like a nice, low-fat alternative to Rails.

The new stuff in our database lets it act as a full-blown key-value store, with support for storing arbitrary data with arbitrary MIME types in the DB.  Right now we’re just using this for Twitter profile caching.  In the next update of our database / web-services it’ll be possible to use the database much like WebDAV, assigning images or text to graph nodes using a simple HTTP PUT and immediately retrieving them.  This opens the door to queries like, “Show me pictures of all related items.”  More on that once those features make their way into the released version of the web-services.
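As a conceptual illustration only, the idea of values tagged with MIME types can be sketched with a tiny in-memory store. This is not the Directed Edge API; `BlobStore`, `put`, `get` and `of_type` are hypothetical names, and the stored data is made up.

```python
class BlobStore:
    """In-memory stand-in for a key-value store that tags each value with a MIME type."""

    def __init__(self):
        self._items = {}  # key -> (mime_type, data)

    def put(self, key, data, mime_type="application/octet-stream"):
        self._items[key] = (mime_type, bytes(data))

    def get(self, key):
        return self._items[key]

    def of_type(self, keys, mime_prefix):
        """Return the subset of keys whose stored value matches a MIME prefix."""
        return [
            key for key in keys
            if key in self._items and self._items[key][0].startswith(mime_prefix)
        ]

store = BlobStore()
store.put("alice", b'{"name": "Alice"}', "application/json")  # cached profile
store.put("bob", b"\x89PNG...", "image/png")                  # placeholder image bytes
pictures = store.of_type(["alice", "bob", "carol"], "image/")
```

Given a list of related items from the recommender, a filter like `of_type` is roughly what a "pictures of all related items" query would boil down to.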

Go easy on her, gents.

We’ll see if the app survives its relatively gentle launch.  Right now we hit the Twitter API quite hard, so hard in fact that we have to limit the number of concurrent TweetRex users; otherwise our Twitter queries start timing out, and recommendations take ages to generate and come back incomplete.  You’ll get a little message if we’re overloaded, and the page will automatically move on to the log-in sequence once we have capacity again.
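One simple way to cap concurrent users is a non-blocking semaphore: if no slot is free, show the "overloaded" message instead of queuing the request. The sketch below assumes that approach and is not a description of how TweetRex actually does it; `AdmissionGate`, `try_enter` and `leave` are hypothetical names.

```python
import threading

class AdmissionGate:
    """Admit at most max_users at a time; everyone else is turned away for now."""

    def __init__(self, max_users):
        self._slots = threading.BoundedSemaphore(max_users)

    def try_enter(self):
        # Non-blocking: returns False immediately when all slots are taken.
        return self._slots.acquire(blocking=False)

    def leave(self):
        self._slots.release()

gate = AdmissionGate(max_users=2)
first = gate.try_enter()    # admitted
second = gate.try_enter()   # admitted
third = gate.try_enter()    # turned away: show the "overloaded" message
gate.leave()                # one user finishes
retry = gate.try_enter()    # the waiting user can now proceed
```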

So, what does it do?

TweetRex looks through the Twitter graph for people in your “neighborhood” whom you’re not yet following.  The results get better as we spider more of the Twitter graph, so check back in from time to time.  There’s a little message box that pops up asking if we can post a tweet in your stream.  We’d love it if you’d oblige to help us get the word out about the app.
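The simplest version of the "neighborhood" idea is a friend-of-friend count: people followed by many of the people you follow are good candidates. The sketch below shows only that naive heuristic, on made-up data; it is not the Directed Edge engine's algorithm, and `recommend` and `follows` are hypothetical names.

```python
from collections import Counter

def recommend(user, follows, top_n=3):
    """Rank accounts followed by the user's friends that the user does not follow yet."""
    already = set(follows.get(user, []))
    scores = Counter()
    for friend in already:
        for candidate in follows.get(friend, []):
            if candidate != user and candidate not in already:
                scores[candidate] += 1   # one vote per friend who follows them
    return [name for name, _ in scores.most_common(top_n)]

# Hypothetical follow graph: user -> accounts they follow.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave", "alice"],
    "carol": ["dave", "erin"],
}
suggestions = recommend("alice", follows)
```

Here "dave" outranks "erin" because two of alice's friends follow him. It also shows why results improve with more spidering: the more of the graph is known, the more votes there are to count.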

This is just the first beta.  We’ll be adding some more features over time, and are looking forward to your feedback!

And now, without further ado, I give you TweetRex:

http://tweetrex.directededge.com/

4 Comments

  1. Jof:

    Congrats guys! Am going to check it out right now 🙂

  2. Dharm:

    Thanks for sharing the experience.
    What hardware platform are you using for the service?
    How would performance be affected if one used LAMP in place of Sinatra?

  3. Directed Edge launches TweetRex, a friend recommender for Twitter | Gründerszene:

    […] The Directed Edge recommendation system then uses graph analysis to determine the most relevant Twitter users for each individual, and TweetRex presents them as follow recommendations. More on the technical background of TweetRex can be found in the launch blog post. […]

  4. Scott Wheeler:

    @Dharm Most of the critical performance stuff is happening in C++ code which is run in background threads. The front-end could really have been written in anything, but since we were planning on using Ruby for an upcoming project, we decided to give it a whirl here.
