Social network data and research

Image by Yankee in Canada

One of my favorite parts of my own Facebook research has been discovering existing work in this area that I didn't know about. Here are some of the most interesting papers:

Inference of Profile Elements of Individuals Using Publicly Available Social Web Data

Using Rapleaf's massive data store of publicly available social network data, Piotr Kozikowski wrote his master's thesis on inferring attributes like gender, location and age from other known information about a person.

Contains details on the EuroSys '09 academic data set containing both connections and interactions for

Real-world separation effects in an online social network

A paper on how geography influences social networks,
using the public friendship data of 30,000 users from a German social network.

Arvind's got a few notes about the LiveJournal, Twitter and Flickr
data they're using. It sounds like Mislove has been willing to share
LiveJournal network data with other academics in the past.

Cameron Marlow is the head of Facebook's data mining team, and covers
their internal research on his blog.

Finally, it's in a different area, but one of the scariest datasets I've run across is the Enron collection of 500,000 emails released as part of the investigation. I was a heavy user of this for developing my email services, but I'm still amazed it's out there!
