Interview with Intruders.tv
Thanks to Jof and Vincent from Intruders.tv for doing the interview!
Archive for the ‘Uncategorized’ Category.

Gleaned from the RSS feed for a few weeks plus Wordle. Here’s one with a hand-created list of stop-words to try to get out some of the irrelevant stuff. Gives an interesting look at what people are asking for recommendations for on Twitter:

We’ve been slack on blogging of late. The Twitter bug finally bit the two of us that founded Directed Edge. I finally got around to setting up an “official” (meaning, “spared of our random personal musings”) Directed Edge Twitter account today:
http://twitter.com/directededge
You can find the two founders’ streams linked there. But have no fear, there will be more good old-fashioned blogging action on the way. I’ve got posts in the pipe on some of our experiences in picking a CRM system and some of the optimization work on our graph storage engine that should be coming along in the next week or so.
Yes, that’s right, we need a logo. We’ve all run out of business cards, so before putting in an order for a new batch, I’ve decided to revisit the logo situation. We’re looking for something simple and business-y that also can be reduced to line art for print. Particular colors aren’t terribly important since the website needs a revamp too. If you or somebody you dig feels moved to send us a bid (with a link to a portfolio), please do.
I’ll be doing a session tomorrow or Sunday at BarCamp Berlin (currently the most upvoted presentation topic). I’ll cover some of the basics of modern recommendation systems, including basic categories of algorithms and why recommendations are important for modern web applications. Dave Sharrock and Garik Petrosyan from Be2 (dating site) will be co-presenting, talking about some of the things they’ve had to overcome in building a scalable match-making system.
Valentin will be doing a session on multi-lingual blogging.
We’ll also be at the following events in the Berlin web week:
Drop us a line if you’d like to meet and talk about recommendations and what they can do for your site!
This one was requested a few times. We’ve thrown together a simple AJAX widget for showing related articles on other sites. We’ll add a little more bling later on, especially once we get some more data sets out there.
The idea is this: You have a site that has information about music, or movies or airplanes or whatever, and you’d like to add links to related encyclopedia articles to your page with just a couple lines of customization.
To make that work as one would expect we also added fuzzy search. That’s important because, for instance, the article for “Madonna” in Wikipedia is titled, “Madonna (entertainer)”. So, now with our fuzzy searching you can search for the closest hit to a term with a specific tag, in this case “Madonna” tagged as “musician”.
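As a rough illustration of the idea (not how our engine implements it), here is a minimal Python sketch of a tag-filtered closest-match lookup. The sample titles, the “painting” tag, the helper names and the trick of dropping a trailing parenthetical before comparing are all assumptions made up for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical sample data: article title -> set of tags.
ARTICLES = {
    "Madonna (entertainer)": {"musician"},
    "Madonna (art)": {"painting"},
    "Maroon 5": {"musician"},
}


def normalize(title):
    """Lower-case a title and drop a trailing parenthetical like '(entertainer)'."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", title).lower()


def closest_title(term, tag, articles=ARTICLES):
    """Return the title carrying `tag` that is most similar to `term`, or None."""
    candidates = [title for title, tags in articles.items() if tag in tags]
    if not candidates:
        return None
    wanted = term.lower()
    # Score every candidate by string similarity and keep the best one.
    return max(
        candidates,
        key=lambda title: SequenceMatcher(None, wanted, normalize(title)).ratio(),
    )
```

Filtering by tag first is what lets “Madonna” resolve to the entertainer rather than to any other article that happens to share the name.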
The list of available tags is here.
The new AJAX widget is trivial to use. Here’s what it looks like. If you click it’ll take you to a page that you can use to copy-paste from.
You can get that on your page by adding two lines to your page / template:
<script src="http://pedia.directededge.com/RelatedArticlesWidget.js" type="text/javascript"></script>
<div class="RelatedArticles" title="Music Related to James Brown" fuzzy="James Brown" fuzzytag="musician" tags="musician"></div>
All items that are returned use the “RelatedArticles” class, so that you can style them in your stylesheet as you see fit. These are the supported attributes (only title or fuzzy are required):
If you decide to link to Wikipedia or somewhere else using the last tag, we’d request that you at least give us a mention in your company / project’s blog somewhere.
Enjoy and feel free to drop us questions or comments!
Thanks to Ernst from The Next Web for publishing an interview with us.
So in the space since the last blog post we’ve been working on getting everything squared away for our commercial web services API. It’s now running live at webservices.directededge.com. There’s some documentation up on how the REST API works there. I also went through the hoops of moving over from our self-signed certificate to a proper certificate this week; I’d forgotten how much of a pain those can be to deal with.
If you’ve got any questions about the API or things that you’d like to do with it that don’t seem to be supported at the moment, we’d like to hear from you!
We’re very grateful that one of our users knocked out one of the items on our to-do list and created a Greasemonkey script for showing related articles on Wikipedia. If you have Greasemonkey installed in Firefox you can just click on “install script” on this page to get related articles without being logged in to Wikipedia.
Per the comments on that page, we will start rolling out our Wikipedia demo in other languages probably about a week from now.
So, we think this is pretty cool beans. When we did our demo with a mashup of Wikipedia’s content we knew that we wanted something that potential customers could quickly look at and get a feel for what our recommendation engine is capable of, and we got a lot of good feedback about that in our recent technology preview. On the other hand, we knew that we weren’t going to get the masses to switch over to use our Wikipedia interface.
One of the open questions for us as we pushed out the first bits of our web-services API last week was, “Can we get this content to show up in Wikipedia proper?”
Last night after an extended hacking session where I tried a number of strategies for doing DOM scripting to pull in external content (and some misadventures in trying to do cross-site XMLHttpRequests) I managed to come up with a simple way of pulling in content from our web service via JSONP, and added support for JSON output to our web service along the way. For Wikipedians that are logged in, it only requires adding one line to your monobook.js file and I’ve created a short how-to here. The source code, for interested hackers, is here.
Here’s what it looks like:

When we launched our demo a few people didn’t seem to get quite what our engine is doing. We’re not just analyzing the current page and pulling in a few important links; we’re jumping out a few levels in the link structure and analyzing and ranking usually several thousand links in the neighborhood of the target page. Often those pages are linked from the target page, but that’s hardly a surprise. I come from a background of doing research in web-like search, so it’s no coincidence that our approach to finding related pages takes some hints from PageRank and other link-based approaches to sorting out the web.
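To make “jumping out a few levels and ranking the neighborhood” a bit more concrete, here is a toy Python sketch. It is not our engine’s algorithm; it swaps in a plain random walk with restart (a PageRank-style technique) over the pages within a couple of hops of the target. The graph, the function names and the parameter values are all made up for illustration.

```python
from collections import deque


def neighborhood(graph, start, max_hops=2):
    """Collect every page reachable from `start` in at most `max_hops` links."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue
        for target in graph.get(node, ()):
            if target not in hops:
                hops[target] = hops[node] + 1
                queue.append(target)
    return set(hops)


def rank_related(graph, start, max_hops=2, damping=0.85, iterations=30):
    """Rank the neighborhood of `start` by a random walk that restarts at `start`."""
    nodes = neighborhood(graph, start, max_hops)
    score = {node: 0.0 for node in nodes}
    score[start] = 1.0
    for _ in range(iterations):
        updated = {node: 0.0 for node in nodes}
        updated[start] = 1.0 - damping  # probability mass that restarts
        for node, value in score.items():
            targets = [t for t in graph.get(node, ()) if t in nodes]
            if not targets:
                # Dead end inside the neighborhood: send the walker back.
                updated[start] += damping * value
                continue
            share = damping * value / len(targets)
            for target in targets:
                updated[target] += share
        score = updated
    related = [node for node in nodes if node != start]
    return sorted(related, key=lambda node: score[node], reverse=True)


# Hypothetical link structure, loosely inspired by the KDE article.
LINKS = {
    "KDE": ["Qt", "GNOME", "Konqueror"],
    "Konqueror": ["KDE", "Qt"],
    "GNOME": ["KDE", "GTK"],
    "Qt": ["KDE"],
    "GTK": ["Cairo"],
    "Cairo": ["Pixman"],
}
ranked = rank_related(LINKS, "KDE")
```

In this toy graph “Qt” comes out on top because it is reachable both directly and via “Konqueror”, which is the flavor of signal that a link-based approach picks up and a look at the current page alone would miss.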
We’d invite people to try this out and of course to keep playing with our mashup; we’ve gotten so used to having related pages that it’s hard to go back to the vanilla Wikipedia — having the related pages there makes it really easy to sort out things like, “What are the important related topics?” or “Well, I know about X, what are the main alternatives?” And so on. We’ve got some other exciting stuff up our collective sleeves that we’ll be rolling out in the next couple of weeks, so stay tuned!
Work on the web services API for the encyclopedia continues, now with tags. Here’s a quick rundown:
You can get a list of supported tags here:
http://pedia.directededge.com/api/v1/tags/
That currently returns:
<?xml version="1.0" encoding="UTF-8"?>
<directededge version="0.1">
<tag>actor</tag>
<tag>author</tag>
<tag>book</tag>
<tag>company</tag>
<tag>film</tag>
<tag>musician</tag>
</directededge>
You can then get results from article queries based on a tag, using something like this:
http://pedia.directededge.com/api/v1/article/KDE/tags/company/
Which returns:
<?xml version="1.0" encoding="UTF-8"?>
<directededge version="0.1">
<item id="KDE">
<link>Trolltech</link>
<link>Novell</link>
<link>Hewlett-Packard Company</link>
<link>Nokia</link>
<link>World Wide Web Consortium</link>
<link>Mandriva</link>
<link>Canonical Ltd.</link>
<link>Sirius Satellite Radio</link>
</item>
</directededge>
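If you are scripting against this, a small helper for building those URLs might look like the following Python sketch. The helper name is made up, and percent-encoding the title is an assumption on my part about what the server accepts for titles containing spaces or parentheses.

```python
from urllib.parse import quote

BASE_URL = "http://pedia.directededge.com/api/v1"


def article_url(title, tag=None):
    """Build the query URL for an article, optionally filtered by a tag."""
    # Assumption: titles are percent-encoded so they are safe as a path segment.
    url = f"{BASE_URL}/article/{quote(title, safe='')}"
    if tag is not None:
        url += f"/tags/{quote(tag, safe='')}/"
    return url
```

Calling `article_url("KDE", "company")` reproduces the example URL above, and leaving out the tag gives the plain article query.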
You can query any article for any tag (unlike in the web interface). Right now the results for “off topic” tags tend to be hit-or-miss. One of the other big items on our to-do list is improving tagged results in our engine.
I’m posting incremental updates like this in the hopes that if you’re planning on using our API in a mashup that you’ll let us know what you like and don’t like before we freeze v1.
We’ve also decided on a couple of limitations for the open API that aren’t true for our commercial API (running either on customer data sets or open data sets):
We think those are pretty reasonable and still give users a fair bit of room to play for free. If you’re interested in using our commercial API, drop us a line! We’ve also just created an announcement list where we’ll notify folks that are signed up of important details. You can sign up for that here.
This will still definitely be in flux, but I started getting parts of the REST API up if folks want to play with it. Warning: the format may change.
You can now hit something like:
http://pedia.directededge.com/api/v1/article/KDE
And get back:
<?xml version="1.0" encoding="UTF-8"?>
<directededge version="0.1">
<item id="KDE">
<link>GNOME</link>
<link>Unix-like</link>
<link>Desktop environment</link>
<link>Konqueror</link>
<link>Qt (toolkit)</link>
<link>KDE 4</link>
<link>GNU Lesser General Public License</link>
<link>X Window System</link>
<link>KPart</link>
<link>Widget toolkit</link>
</item>
</directededge>
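For anyone who wants to consume that from a script, here is a minimal Python sketch that parses a response in the format shown above using only the standard library. The function name and the shortened sample are made up for illustration, and since the format may still change, treat it as a starting point.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response format shown above.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<directededge version="0.1">
<item id="KDE">
<link>GNOME</link>
<link>Konqueror</link>
</item>
</directededge>"""


def parse_related(xml_text):
    """Map each item id in a response to its list of related article titles."""
    if isinstance(xml_text, str):
        # Hand the parser bytes so the encoding declaration is honored.
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)
    return {
        item.get("id"): [link.text for link in item.findall("link")]
        for item in root.findall("item")
    }


related = parse_related(SAMPLE)
```

The result is a plain dictionary keyed by article, so the same function works whether a response carries one item or several.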
I’ll be adding support for JSON output and filtering based on tags in the next few days. Once I’ve got a set of features there that I consider feature complete I’ll freeze the “v1” so that people can create mashups based on that and be sure that the API will remain stable.
This does do capitalization correction, but does not do redirect detection. I’m debating if I want to do that by default or use another REST path since it requires another couple DB queries and is as such a little slower.
Like any new startup co-founder, I’m obsessive about seeing how what we’re doing trickles out over the web. Being an RSS warrior, today I went looking for a Google search to RSS converter and found FeedMySearch, which now, a few hours into using it, seems to do quite well in pulling in new information about Directed Edge as it hits Google’s indexes.
Nifty tool. Anything that stops me from compulsive reloading is a win. Now back to implementing new features. :-)
It’s an exciting day for us at Directed Edge. Today we’re finally putting our Wikipedia-based technology preview out there for people to play with. Before you click over to it, here’s a little about what you’re looking at.
As our name implies, we’re graph theory nerds. We look at the roughly 60 million links between the 2.5 million English Wikipedia pages, and with a few extra cues from the content, figure out pages related to the current one and put that in a little box in the upper left (as evident from the image on our home page). In some cases, if we’re able to pick out what sort of page it is, we also drop in a second box with just other pages of the same type.
Finding related pages in Wikipedia isn’t fundamentally what Directed Edge is about. We’ve got a super-fast in-house graph storage system that makes it possible to do interesting stuff with graphs quickly, notably figure out which pages are related. We’ve already got a couple of pilot customers lined up and will be working with a few more in the next weeks to analyze their sites and figure out how things are related there. We’ve got a prototype of our web-services API that they’ll be using to send us break-downs of how stuff’s connected on their site and we’ll send back what we hope are some pretty groovy recommendations.
There are dozens of things in the pipe for us: ways to make recommendations better, ways to make the Wikipedia demo cooler, things customers want to see in our web services that we’d previously not thought of, and we could ramble on that for a while, but there are a few things that are on the very near horizon that didn’t quite make it into this round:
If you subscribe to our news feed you’ll see immediately when those services go live. Even though we’re still in the beta-phase and are only accepting a limited number of customers, if you think you’d be interested in using our engine for your site down the line, we’d encourage you to register now since we’ll be offering a discount for our commercial services to everyone who fills out their info in the contact form during the beta phase.
More soon. Enjoy!
We’ve now committed to going into public beta / technology preview next Wednesday, August 13th. We’ll be launching our new site with more information about our products and services at that time.
Press / bloggers may request invites by sending us a mail. It’ll be an exciting next few days as we iron out the last kinks and get ready for the onslaught.
The website will be a bit in flux, but our bio info is still available here.
Edit: If you’ve been testing the demo previously the location has changed. Drop us a line for the new URL.
I started writing web applications around 1997. On Solaris. Using Netscape’s web server. In Perl.
My LinkedIn profile starts thusly:
I began working with LAMP back in the days where the men were men and we ate Perl for breakfast. Installing Linux with 40 floppy disks puts hair on your teeth. One thing led to another, and before I knew it I was living with an E-Commerce system. What can I say? I was young and needed the money.
Around 2001 I took a departure from the web world to work on enterprise and desktop software. Sure, I slung a little web code here and there, but I didn’t track the technology landscape like I did back in the good old days.
When looking at founding Directed Edge, it was time to re-approach the web and get back on friendly terms. Like an old-timer desperately trying to identify all that is hip, I set out to figure out what the cool kids were doing.
Python with Django and Ruby on Rails.
I learned bits of Python and Ruby, things I’d been meaning to learn for ages. Ruby’s syntax and I didn’t hit it off at first, so I spent a couple days reading Learning Python, which had been sitting on my shelves for a few months.
But friends, home is where the heart is, and even after getting a reasonable grasp on Python I kept going back to Perl. There are two reasons for this:
The CPAN has grown so large and comprehensive over the years that many people learning Perl seem to elevate it to a sort of mythical status, and express surprise when they begin to encounter topics for which a CPAN module doesn’t exist already.
CPAN is huge, easy to search, well documented, and trivial to deploy. Need some code to do TLS authenticated SMTP transfers? Trivial. Need a WSDL compiler to work with an old SOAP API? Up and going in a few minutes. Need to test a REST API with a really mature HTTP implementation? It’s there. Need code for quickly generating mail routing code for feedback processing? Bingo. But wait, that’s all pretty common stuff, right? There are even CPAN modules for stuff like tracking quantum superpositions in quantum computing algorithms or quickly building genetic algorithm implementations (my two research areas in college). And all under the same roof.
And let’s back up one bit; for all of the perceived culture of sloppiness, I earnestly believe that Perl has the strongest documentation culture of any major programming language. By and large CPAN’s 12,000-something modules are rather well documented with examples and gotchas in addition to the basic API docs. As a special bonus, as soon as you’ve installed them with the command line cpan tool (which automatically resolves dependencies, downloads, tests and installs) they’re available in your system’s man pages. The standard man pages for core language features are great, and well written to boot. The Camel Book will forever have a place on the gilded streets of O’Reilly’s hall of fame as possibly the most enjoyable-to-read 1000+ page technical book ever written.
Combined with the power of CPAN, Perl just has something about it that makes gruesome, and gruesomely fast hacks possible. Much of this is owing to CPAN solving 75% of the universe’s problems for you from the get-go. But Perl is something of the Sicilian mobster of the programming world — it gets stuff done.
Add to that that it’s one of the speediest scripting languages performance-wise, and it’s great for quick-and-dirty hacks that programmers invariably have to come up with on a regular basis. Perl seems to be optimized for writing as little code as possible to get the job done.
Most people talking about Perl are quick to groan about its ugliness. I’ll first note, most of them don’t know Perl, so it’s my earnest belief that much of that is faddishness. Perl can be well written, but its syntactic moral flexibility means that there’s a lot of ugly Perl out there. I’m not going to try to pass that off as a good thing. But a real Perl mensch can write Perl that’s as easy to read as code in most other popular programming languages.
I’m also not advocating doing large projects in Perl. In a decade of Perl slinging, it’s only happened a time or three that I’ve written tools that were more than a couple thousand lines of code. (But again, the beauty lies in that I’ve rarely needed to.) Nothing particularly central to Directed Edge is written in Perl, but it’s been my Swiss Army Chainsaw on the fringes: converting data formats, processing simple forms, interfacing with databases. Glue code, basically.
And there I have to say, despite wanting oh-so-desperately to be one of the Python cool kids, I think Perl is here to stay.
Check one thing off the list of stuff that’s been taking time for us: as of August 1st, Directed Edge is incorporated; we’re pretty excited. We’ll transfer the IP into the new company next week.
Even though updates here have been slow, it’s been for the best of problems — despite not having our open beta launched we’ve already got a couple of pilot customers and I’ve been trying to get the integration process nailed down for our web services API.
The prototype is already in the state that we’ll use for our launch and we’ve got a few dozen people testing it in our closed beta.
So what’s next? Well, we’re going to try to get our public presence ready for launch, mostly meaning finishing up our in-progress web site, getting press releases ready in German and English, getting a few more bloggers on board and doing the deed. If you’re a blogger or have press connections and want to get the scoop before we launch, drop us a line!
Getting the rest of the legal stuff taken care of will take some time in the next week as well, but after thinking too much on the incorporation issue, it’s nice to be moving along.
I stumbled across this photo of Valentin and me getting our questions answered about incorporation style at Seedcamp Berlin. I must say that was one of the more useful events that we’ve been to so far. It was a great chance to get in touch with an impressive collection of mentors and to network with other (mostly) German startups.
There will be another update on Directed Edge stuff soon-ish, but I just wanted to get out a list of the upcoming Berlin events where we and other Berlin startups will be talking shop and doing demos:
Just finished up attending the last of the events in the recent post and thought I’d mention the next batch on the Directed Edge calendar:
I’ve got two weeks left until I’m full-time on Directed Edge and it’s looking to be an exciting time. Like most startups stumbling towards launch, there are a thousand threads being followed up simultaneously, the last couple of weeks more business than technical. We’re going to make our demo progressively more open during the next weeks, but the feedback we’ve gotten from the first testers is encouraging.
But I did make it into the video feed from the Prague TechCrunch event. There’s about a half-minute pseudo pitch in here. Lesson learned from watching it back? Loosen up. Get to the point faster. Like in 10 seconds. (Isn’t showing up in syndicated version, click through to the full post.)