Why aren’t we using humans as robots?

Photo by Regolare

Yesterday I had lunch with Stan James of Lijit fame, and it was a blast. One of the topics that’s fascinated both of us is breaking down the walls that companies put up around your data. In the 90’s it was undocumented file formats and this decade it’s EULAs on web services like Facebook. The intent is to keep your data locked in to a service, so that you’ll remain a customer, but what’s interesting is that they don’t have any legal way of enforcing exactly that. Instead they forbid processing the data with automated scripts and giving out your account information to third-party services. It’s pretty simple to detect when somebody’s using a robot to walk your site, and so this is easy to enforce.

The approach I took with Google Hot Keys was to rely on users themselves to visit sites and view pages. I was then able to analyze and extract semantic information on the client side, as a post-processing step using a browser extension. It would be pretty straightforward to do the same thing on Facebook, sucking down your friends’ information every time you visited their profiles. I Am Not A Lawyer, but this sort of approach is both impossible to detect from the server side and seems hard to EULA out of existence. You’re inherently running an automated script on the pages you receive just to display them, unless you only read the raw HTTP/HTML responses.

So why isn’t this approach more popular? One thing Stan and I agreed on is that getting browser plugins distributed is really, really hard. Some days the majority of Google’s site ads seem to be for their very useful toolbar, but based on my experience only a tiny fraction of users have it installed. If Google’s marketing machine can’t persuade people to install client software, it’s obvious you need a very compelling proposition before you can get a lot of uptake.

How to build your own Facebook server

Photo by Coccinelle69

In the last post I talked about the mechanics of how an app communicates with Facebook. With the alpha release of Ringside, there’s now an example of how to implement the server side of Facebook. It’s open source, and the two most interesting parts are the underlying MySQL database and the PHP interface code that implements the API on top of that. Using MySQL makes it hard to scale to massive numbers of users, so it’s not ready to power Facebook yet. On the other hand, having enough users to strain a single database server is a good problem to have. At that point you should have the resources to reimplement something more advanced under the hood.

Having a reference host for any plugin architecture is immensely helpful, especially one that’s open source. For example, if I was having trouble with the details of fetching events, I could open up ringside/api/includes/ringside/api/facebook/EventsGet.php and inspect exactly what their implementation is. There’s no guarantee that it’s the same as Facebook’s code, but it’s at least an unambiguous and exact specification of what somebody else thinks it should be doing. To get your own copy of the source using SVN, run
svn co https://ringside.svn.sourceforge.net/svnroot/ringside ringside

The other exciting part of Ringside’s release is their MySQL schema. It could become a de facto standard for expressing the data that underlies all social networks. Anybody who’s able to take their own data source and translate it into the same tables can plug that into Ringside’s system. Turn the key, and you’ve got your own private Facebook. The schema is at ringside/api/config/ringside-schema.sql

If you want to customize it, the API source is full of great examples of how to work with the database to extend its capabilities, though the LGPL licence might require your changes to also be published.

What’s going on under the hood of Facebook’s API?


Photo by fallsroad

Facebook’s API comes wrapped in libraries for all the popular server languages, but there will come a day when you need to debug the raw HTTP transactions that they all boil down to. As a scripting language, the PHP implementation is easy to understand, and I ended up tweaking mine to output the exact text that’s flowing between me and Facebook. This was partly to help debugging, but also for my own curiosity. I’d like to model some of my interfaces on Facebook’s since it’s simple, robust and flexible.

You call a method by sending an HTTP request to "http://api.facebook.com/restserver.php". Arguments to the method are passed in the POST string sent as part of the request. Here’s an example for an event API call, split up on ampersands so that it won’t go off the edge of the blog, and with any secret values replaced with X:


This is generated by taking the normal PHP arguments to each method, along with stored login and API keys, and serializing them into this string. If CURL is present on the server, this is then used to send the request, otherwise PHP’s native HTTP access functions are used.
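To make the serialization step a bit more concrete, here’s a minimal Python sketch of the idea. It is not the PHP client’s actual code: the parameter names (`api_key`, `session_key`, `eids`), the `events.get` method name and the placeholder values are assumptions for illustration, and any request signing the real client does is left out entirely.

```python
from urllib.parse import urlencode

# Endpoint quoted in the post above.
REST_URL = "http://api.facebook.com/restserver.php"


def build_post_body(method, call_args, api_key, session_key):
    """Serialize a method name, its arguments and stored keys into a POST string.

    The parameter names used here are illustrative assumptions, not a
    verified list of what the official client sends.
    """
    params = dict(call_args)
    params["method"] = method
    params["api_key"] = api_key
    params["session_key"] = session_key
    # Sorting keeps the output stable, which helps when comparing debug logs.
    return urlencode(sorted(params.items()))


if __name__ == "__main__":
    body = build_post_body("events.get", {"eids": "5172087276"}, "XXXX", "XXXX")
    # Print one argument per line, split on ampersands for readability.
    print("\n".join(body.split("&")))
```

Logging the body like this just before it is sent is a cheap way to see exactly what goes over the wire.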

Assuming that the call name (specified in "method") and the other arguments check out, then the Facebook server will return a string as its response. This string is in XML, and looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<events_get_response xmlns="http://api.facebook.com/1.0/"
xsi:schemaLocation="http://api.facebook.com/1.0/ http://api.facebook.com/1.0/facebook.xsd"
    <name>Blog World Expo example</name>

… <snip> …

The library then takes this simple XML string, and parses it into a PHP hierarchical array of values that looks like this:

    [0] => Array
            [eid] => 5172087276
            [name] => Blog World Expo example
            [tagline] => http://www.blogworldexpo.com/
            [nid] => 0
            [pic] => http://profile.ak.facebook.com/object2/5/55/s5172087276_7478.jpg
            [pic_big] => http://profile.ak.facebook.com/object2/5/55/n5172087276_7478.jpg
            [pic_small] => http://profile.ak.facebook.com/object2/5/55/t5172087276_7478.jpg
            [host] => BlogWorld

… <snip> …

This always matches the structure of the XML. Facebook use a restricted subset of XML that avoids tag attributes and anything else that might make it hard to map to this JSON-style format.
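The XML-to-array mapping is easy to sketch. Here’s a rough Python equivalent, assuming attribute-free XML as described above. The sample document and its `event` tags are made up for the example, and treating repeated child tags as a list is a heuristic of this sketch rather than the rule the PHP library uses (a single-item list would come back as a dictionary here).

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def xml_to_value(element):
    """Convert an attribute-free XML element into nested dicts, lists and strings."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    names = [local_name(child.tag) for child in children]
    if len(set(names)) < len(names):
        # Repeated tag names are treated as a list (a heuristic of this sketch).
        return [xml_to_value(child) for child in children]
    return {name: xml_to_value(child) for name, child in zip(names, children)}


# Invented sample data, shaped loosely like the response shown earlier.
SAMPLE = """<events_get_response xmlns="http://api.facebook.com/1.0/">
  <event><eid>1</eid><name>First example</name></event>
  <event><eid>2</eid><name>Second example</name></event>
</events_get_response>"""

if __name__ == "__main__":
    print(xml_to_value(ET.fromstring(SAMPLE)))
```

Because nothing lives in attributes, the whole conversion fits in one small recursive function, which is a big part of why the format is pleasant to work with.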

Another possibility is that an error will be returned. In that case, the XML will normally just be a couple of tags, the error message string and the numeric error code. This gets converted to a PHP exception.
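As a hedged illustration of that error path, here’s a small Python sketch that turns an error document into an exception. The tag names `error_response`, `error_code` and `error_msg` are assumptions about the response format, `RestApiError` is a hypothetical class name, and the sample error is invented.

```python
import xml.etree.ElementTree as ET


class RestApiError(Exception):
    """Raised when the server answers with an error document (hypothetical name)."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_or_raise(xml_text):
    """Return the parsed root element, or raise if it is an error document."""
    root = ET.fromstring(xml_text)
    if local_name(root.tag) == "error_response":  # assumed tag name
        fields = {local_name(child.tag): (child.text or "") for child in root}
        raise RestApiError(int(fields.get("error_code", 0)), fields.get("error_msg", ""))
    return root


# Invented sample, not a real server response.
ERROR_SAMPLE = """<error_response xmlns="http://api.facebook.com/1.0/">
  <error_code>1</error_code>
  <error_msg>Example failure</error_msg>
</error_response>"""

if __name__ == "__main__":
    try:
        parse_or_raise(ERROR_SAMPLE)
    except RestApiError as err:
        print("caught:", err.code, err.message)
```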

To dig into this code yourself, I recommend looking through facebookapi_php5_restlib.php in the client folder of the Facebook SDK. That’s a good place to add your own debugging code too, though there’s already some that can be enabled by setting the $GLOBALS['facebook_config']['debug'] variable to true.

A Facebook Ajax Example

Photo of the original Ajax by Oboulko

One of the toughest parts of the Facebook API is their Ajax support. There’s a good page on their wiki with a small piece of sample code, but since Event Connector uses Ajax heavily, I thought it would be a good real-world example. Here’s the PHP source code.

I’ve removed the application settings from config.php, so you’ll need to create your own application in Facebook and follow the same steps you do for the Footprints sample before you can use it. There are some inline comments explaining the control flow and covering some of the Ajax quirks. One thing to be aware of is the 10 second time-out on all Facebook page requests. If you’re doing any heavy work on the server, or it could get overloaded, you’ll need a strategy to prevent your users seeing an error screen, which is exactly why I went with Ajax for this situation.
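The general shape of that strategy is: answer the page request straight away, do the slow work in the background, and let an Ajax call ask for the result later. Here’s a minimal Python sketch of the pattern. It is not how Event Connector’s PHP code works; the function names, the in-memory job table and the thread pool are all assumptions made to keep the example self-contained.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=4)
_jobs = {}  # job id -> Future; a real app would need shared, persistent storage


def start_job(work, *args):
    """Kick off slow work and return a job id immediately, so the page can render."""
    job_id = uuid.uuid4().hex
    _jobs[job_id] = _executor.submit(work, *args)
    return job_id


def poll_job(job_id):
    """What an Ajax polling endpoint could return for a job id."""
    future = _jobs[job_id]
    if not future.done():
        return {"status": "pending"}
    return {"status": "done", "result": future.result()}


def slow_lookup(name):
    time.sleep(0.2)  # stand-in for heavy server-side processing
    return f"results for {name}"


if __name__ == "__main__":
    job = start_job(slow_lookup, "example event")
    print(poll_job(job))  # usually still pending at this point
    time.sleep(0.5)
    print(poll_job(job))
```

The initial page only has to include the job id and a placeholder, so it comes back well inside the time-out however long the real work takes.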

Slinky companies and public transport

Yesterday, Brad posted an article talking about bubble times in Boulder, and quoted a great line from Bill Perry about how they spawned ‘slinky companies’ that "aren’t very useful but they are fun to watch as they tumble down the stairs".

Rick Segal had a post about why he took the train to work, and how people-watching there was a great reality check to a lot of the grand technology ideas he was presented with.

And via Execupundit, I came across a column discussing whether people were really dissatisfied with their jobs, or just liked to gripe and fantasize. One employee who’d been involved in two start-ups that didn’t take off said "Most dreams aren’t market researched."

These all seemed to speak to the tough balance between keeping your feet on the ground and your eyes on the stars. As Tom Evslin’s tagline goes, "Nothing great has ever been accomplished without irrational exuberance." I’ve been wrestling with how to avoid creating a slinky with technology that sounds neat enough to be funded, but will never amount to anything. To do that, I’ve focused on solving a painful problem, and on validating both that the problem is widespread and that people like my solution.

I’ve turned my ideas into concrete services, and got them into the wild as quickly as possible. Google Hot Keys has proved that it’s possible to robustly extract data from screen-scraping within both Firefox and IE, but its slow take-up suggests there isn’t a massive demand for a swankier search interface. Defrag Connector shows that being able to connect with friends before a conference is really popular, but the lack of interest so far in Event Connector from conference promoters I’ve contacted shows me it won’t just sell itself. Funhouse Photo’s lack of viral growth tells me that I need to provide a compelling reason for people to contact their friends about the app, and not just rely on offering them tools to do so.

I really believe in all of these projects, but I want to know how to take them forward by testing them against the real world. All my career, I’ve avoided grand projects that take years before they show results. I’ve been lucky enough that all of the dozen or so major applications I’ve worked on have shipped; none were cancelled. Part of that is down to my choice of working on services that have tangible benefits to users, and can be prototyped and iteratively tested against that user need from an early stage. Whether it’s formal market research, watching people on trains, or just releasing an early version and seeing what happens, you have to test against reality.

I’m happy to take the risk of failing; there are a lot of factors I can’t control. What I can control is the risk of creating something useless!

Funhouse Photo User Count: 1,746 total, 70 active. Much the same as before, I haven’t made any changes yet.

Event Connector User Count: 73 total, 9 active. Still no conference take-up. I did experiment with a post to PodCamp Boston’s forum to see if I could reach guests directly, but I think the only way to get good distribution is through the organizers.

Facebook and event promotion

As I’ve been approaching conference organizers to try Event Connector, I’ve been surprised at how few have Facebook events. It seems like a no-brainer to me if your audience includes anyone under thirty, since it only takes a couple of minutes to create an event. In return, you get a great platform for potential guests to discover your conference, and for attendees to hear from you and each other before and after the event. You’re being given permission to market to them, and even better, the participants themselves will spread the word as their attendance shows up on their friends’ feeds, and they get involved in the discussions on the event page itself.

Most of the events I have run across have been unofficial, started by participants rather than organizers. Without publicity from the promoters, these tend to attract only a few guests. To be effective you need to include a link to the event in some material that goes out to a decent number of your guests.

I don’t think it’s that conference organizers don’t want the benefits that Facebook events offer, since I see a lot of organizations trying to hand-roll similar services. PodCamp Boston has a page listing all of the attendees who wanted their names to be public, but as a plain text alphabetical list, it’s a lot harder to discover friends than the equivalent on Facebook. Facebook events are popular with guests: the New Media Expo 2008 one picked up over a hundred guests in the first few hours after it was created, and this is for an event almost a year away!

Trying to put myself in their shoes, I’d guess that the main obstacles are the fact that no one else is doing it, it’s an unknown quantity, it feels a bit out of their control, and they’ve never needed it before. It does require a willingness to try something new, but the reward for doing so before it’s mainstream is that you’ll get a lot of buzz, publicity and guest goodwill for taking that leap!

If you’re an event promoter, I’d highly recommend you set up a Facebook event, and give it a little promotion. It’s quick, free, and offers both you and your guests significant benefits.

Even better, once you’ve got one set up, you get an Event Connector for free. Go to the main page of the app, and your event will show up at the top. There’s a link you can mail out, and free Blogger, TypePad and Facebook profile badges you can distribute. It adds value to the plain Facebook events by allowing users to see which of their friends, and friends-of-friends, are going, which supplies the social proof that will persuade them to sign up.

Funhouse Photo User Count: 1,729 total, 63 active. The same steady growth, and looking at the breakdown, I see the same pattern of non-viral acquisition of users, mostly through the directory and searches.

Event Connector User Count: 72 total, 8 active. Still very quiet, with no conference signed up, and a trickle of users from the directory.

Facebook’s new application statistics


I’m a statistics junkie. Picking some significant metrics, and sticking with them to measure performance is the only way to figure out what’s working and what isn’t. I usually try to design in some measurement tools, but that’s hard with Facebook apps, since their setup hides the referring address/previous page and other useful information.

Luckily, they recently introduced a new statistics page for every app, which you can access from the More Stats link below the application name, from the main developer page.

The first big innovation is the ability to see how many people added and removed you during the previous 24 hours. Before, you could only guess at this by comparing the user totals from day to day, but this wouldn’t tell you how much turnover you had from people removing your app. The most useful part of this, and one that’s a bit hidden, is that the total number of adds is actually a link. If you click on it, you’ll see a bar graph like the one above, showing you exactly where your new users came from.
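A quick bit of arithmetic shows why the separate add and remove counts matter. In this small Python sketch the figures are invented, and `daily_summary` is just an illustrative helper: two very different days produce exactly the same change in the user total.

```python
def daily_summary(adds, removes):
    """Return the net change in users and the total turnover for one day."""
    return {"net": adds - removes, "turnover": adds + removes}


if __name__ == "__main__":
    # Invented figures: both days look identical if you only compare user totals.
    quiet_day = daily_summary(adds=10, removes=0)
    churny_day = daily_summary(adds=50, removes=40)
    print(quiet_day)
    print(churny_day)
```

Comparing totals only ever gives you the net figure, so an app that is losing most of its new users is indistinguishable from one that is growing slowly but keeping everybody.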

The top picture is for Funhouse Photo, and it tells me a lot. I was suspicious that my app was very non-viral because the growth in users was very linear, but this confirms that I’m getting the majority from the directory and direct searches, rather than feed stories or other friend-to-friend communications. To improve growth that needs to change, and I’ll be able to tell very quickly if alterations to the app help by looking at those stats.

Less exciting, but still very useful, are the response metrics. I’ve had a recurrent problem with time-outs on my Facebook apps because they’re doing heavy processing on the server. It seems like any page request that takes more than 8 seconds to complete results in a Facebook error screen for the user, so to work around that I had to implement asynchronous Ajax loading of page elements that might take a while. Looking at the response statistics shows that neither of my apps that use this is returning error pages for any users, something I couldn’t verify before.

The final interesting feature is a selection of the URLs that were requested from the app recently. This sampling is a great way to figure out how people are using your app, which features they’re accessing and how often. My apps generally encode a lot of information in the URL using GET rather than POST, so I’m able to get quite a fine-grained look at my users’ interactions with them.
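If you copy the sampled URLs out of the statistics page, a few lines of script will tally them up. Here’s a minimal Python sketch, assuming the interesting state lives in GET parameters. The URLs and the `action` parameter name are made up for illustration and aren’t taken from my apps.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def count_parameter(urls, parameter):
    """Count how often each value of one GET parameter appears in a list of URLs."""
    counts = Counter()
    for url in urls:
        query = parse_qs(urlparse(url).query)
        for value in query.get(parameter, []):
            counts[value] += 1
    return counts


# Invented sample URLs.
SAMPLE_URLS = [
    "http://example.com/app/?action=view&event=1",
    "http://example.com/app/?action=view&event=2",
    "http://example.com/app/?action=invite&event=1",
    "http://example.com/app/",
]

if __name__ == "__main__":
    for value, count in count_parameter(SAMPLE_URLS, "action").most_common():
        print(value, count)
```

This only works because the parameters are in the URL. Anything sent via POST wouldn’t show up in the sample at all.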

Funhouse Photo User Count: 1,723 total, 95 active. As I mention above, I’ve got new insight into the growth pattern from the add statistics. It explains why growth is so linear: there’s little friend-to-friend spreading of the app.

Event Connector User Count: 71 total, 11 active. The add source statistics show that most of the trickle of new users came from the product directory, which is what I’d expect since I don’t have a conference signed up yet.

Facebook app submission


There isn’t an official guide on how to submit your app to Facebook’s directory, so here’s what I’ve found after going through the process several times.

You don’t have to submit your app to the directory. I never did for Defrag Connector, since it was distributed directly by the conference promoter, and wouldn’t have been helpful to the general public anyway. The strength of the directory is that it gives you access to the early adopters, people who are actively looking for a new app to try. What you want long-term is an app that’s viral, so friends spread it to their friends to reach a much wider audience, but the directory is a good way to start the ball rolling.

In preparation for submission, you need to go to your Developer application and click on My Applications. Below each of your apps is an ‘Edit about page’ link. This is where you can provide a short description, a screenshot, and importantly, set the two directory categories you want your app to appear in. I’ve got no statistics to back this up, but it seems like "Just for Fun" is one of the most popular categories, so I’d recommend that if possible.

Next, make sure your app has an icon, by going to ‘Edit Settings’ for your application. Even though the icon is part of the optional fields section, Event Connector was initially rejected for not having one.

Once you’ve filled out all the information on those pages, you can click on "Submit Application" to the right of your app. There, you’ll need to fill out another description, and supply a larger icon for the directory listing. Once you’ve done that, your app is almost ready to be submitted.

The final hurdle is making sure you have enough users. You need a minimum of five to demonstrate you’ve been testing your app, and that you can personally persuade five people to use it. Hit up your friends and get them to give it a test run, and let you know what they think.

Once you’ve met all those conditions, you can submit the app for review. In my experience the review takes two to three days, and as far as I can tell the inspection is fairly perfunctory. I can’t imagine it’s a sought-after job reviewing the apps, and I’ve had a spotty experience with the quality of the examination.

I submitted Event Connector four times in total before it was accepted. The first rejection was for the lack of an icon, which I fixed, but then the next two were random and confusing, claiming that my app had no content, and then that it violated the ToS because it gave access to secret events. Since there was little information in the rejections, and after research I couldn’t reproduce the problems, I resubmitted the app unchanged both times, and it was eventually accepted. Unlike the Firefox directory, there’s no two-way communication between the reviewers and the app developers, so I wasn’t able to get any additional guidance.

Once you’re in, you should get an immediate spike in the number of users, as the app shows up in the ‘Just added’ section of the directory. Then, I’d expect a steady flow of users, the size depending on the categories and appeal of your app. Hopefully you can gather a good base to build on with some viral distribution.

Funhouse Photo User Count: 1,697 total, 76 active. Still ticking upward, with fairly slow growth.

Event Connector User Count: 67 total, 6 active. I’m getting the Facebook equivalent of cosmic background radiation, in terms of new users through the directory. Still an uphill battle winning a new conference.

Limitations of the Facebook API


Facebook is walking a tightrope with its API; they need to expose enough functionality so we can develop compelling services, but guard against malicious applications that could degrade the Facebook user experience, for example by flooding people with spam.

The API is a compromise between these two conflicting goals, and I’m going to cover what you can and can’t do. Overall, I’ve been able to see some patterns in the decisions they’ve made about what to expose, and how to expose it:

  • Apps can only see what the logged-in user can see.
  • Getting access to any information held by Facebook requires the user to go through a screen where they temporarily authorize, or permanently add, the application.
  • The Facebook team are very conservative about letting applications change data held by Facebook. Most of the API is focused on reading data; there are only a few specific places where you can alter data:
  1. Adding an application box to the user’s profile. This gives the app a small sandbox to draw something interesting, but the content has to be statically set by the application, and is then stored by Facebook. The only time you can update it is when the user takes an action that involves your application; there’s no way to fetch it dynamically. If there are any scripts within the markup you place in the box, they aren’t run unless the user clicks on it in the profile.
  2. Publish an item on the feed. You can only publish to the current user’s feed, and the app has limits on how often it may call it: once every 12 hours for stories, and 10 times in a 48-hour period for actions.
  3. Send notifications or emails to friends. Again, there are limits on how many you can send in a day, up to 10 emails and 40 emails and notifications. The user also has to go through an additional screen to authorize emails.
  4. Photo upload. An app must get additional permission from the user before it’s allowed to upload photos. Each application only has to do this once per user; the permission is granted permanently.
  5. Setting the user’s status text. This is another operation that requires an additional step of seeking permission from the user.
  • There’s no way to use the API to affect anything not covered here, such as adding information to group or event pages.
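Since going over the feed limits just produces failed calls, it can be worth tracking them on your own side before calling the API. Here’s a minimal Python sketch using the feed numbers from the list above. The `WindowLimit` class is hypothetical, and the sliding window is my assumption, since I don’t know exactly how Facebook counts its periods. A real app would also need one limiter per user.

```python
from collections import deque

HOUR = 3600


class WindowLimit:
    """Allow at most max_calls within any sliding window of window_seconds."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def try_acquire(self, now):
        """Record a call at time `now` (in seconds) if the limit allows it."""
        # Forget calls that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_calls:
            return False
        self.timestamps.append(now)
        return True


# Feed limits as quoted in the list above; per-user tracking is left out for brevity.
story_limit = WindowLimit(max_calls=1, window_seconds=12 * HOUR)
action_limit = WindowLimit(max_calls=10, window_seconds=48 * HOUR)

if __name__ == "__main__":
    print(story_limit.try_acquire(now=0))          # first story is allowed
    print(story_limit.try_acquire(now=HOUR))       # too soon for a second one
    print(story_limit.try_acquire(now=12 * HOUR))  # allowed again
```

Passing the time in explicitly, rather than reading the clock inside the class, keeps the limiter easy to test.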

There is an alternative way to perform some actions that aren’t covered by the API, by hand-crafting Facebook internal URLs, and redirecting to them. There’s actually a bit of official documentation on this here.

Funhouse Photo User Count: 1493 total, 123 active. Back to the more typical growth rate, which is strong evidence that the spike of the last few days was caused by Columbus Day boredom.

Event Connector User Count: 40 total, 10 active. Not much happening here so far, I’m reaching out to some more event promoters to get them to give it a go.

Funhouse photo gets a lot more users

After growing so linearly it was almost creepy, Funhouse Photo has gained over 200 users since yesterday. And I have no clue why! I’ve been searching to see if it got mentioned or reviewed somewhere, but I’m drawing a blank. I don’t get much information from Facebook about how people found the app. I’d need to manually build in some tracking myself, and I probably wouldn’t be able to get much information about the referring page anyway, since Facebook does so much redirection.

Oh, I did just have a thought. It was Columbus Day yesterday, and I bet there were a lot of bored people looking for something fun to do on Facebook. Interesting, but it probably won’t translate into a long-term trend if that’s the case.

I wish I could have made it to Graphing Social Patterns, the Facebook developer conference. From the notes, it looks like Danny Sullivan is thinking along the same lines as me with social search; wanting to use your friends’ browsing habits to give you more focused results.

Funhouse Photo User Count: 1466 total, 344 active. Probably a Columbus Day blip, but nice to see a growth spike.

Event Connector User Count: 39 total, 5 active. Still working with Emile and Tim from New Media Expo to arrange some distribution of their connector, but not much activity otherwise.