WebFinger in PHP

Photo by Arenamontanus

I've spent a lot of time writing code to connect users' contacts across different services, so I've had high hopes for WebFinger, the universal API to connect public profiles with email addresses. To be honest, though, I worried it might just be a geeky pipe dream that never gained widespread support from ISPs. I'm extremely happy to be proved wrong by Google's release of an implementation for all their public profiles.

To celebrate, I've added some PHP code to implement the protocol within FindByEmail, my attempt to combine all the email APIs into one package. The WebFinger code is fairly self-contained, so it should be easy to rip out and use in isolation. It's very early days for the protocol, and as far as I know only Google has implemented it, so I'll keep the code updated as I learn more. To see if it works on your email address, give it a try for yourself here.
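For anyone curious what a lookup boils down to, here's a minimal sketch in Python rather than the PHP used in FindByEmail. It is not the FindByEmail code. It assumes the `/.well-known/webfinger` endpoint and `resource` parameter from the later RFC 7033 version of the protocol, which may differ from the early draft this post was written against, and the function name `webfinger_url` is made up for illustration.

```python
from urllib.parse import urlencode


def webfinger_url(email: str) -> str:
    """Build an RFC 7033-style WebFinger query URL for an email-style address.

    Assumption: the host serves the standard /.well-known/webfinger endpoint
    over HTTPS. Nothing is fetched here; this only constructs the URL.
    """
    local, sep, host = email.strip().rpartition("@")
    if not sep or not local or not host:
        raise ValueError(f"not an email-style address: {email!r}")
    host = host.lower()  # host names are case-insensitive
    # The account is passed as an "acct:" URI in the "resource" parameter.
    query = urlencode({"resource": f"acct:{local}@{host}"})
    return f"https://{host}/.well-known/webfinger?{query}"


if __name__ == "__main__":
    print(webfinger_url("bob@example.com"))
```

Fetching that URL from a server that supports the protocol should return a JSON description of the account, including links to any public profiles.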

Interestingly enough, Google already had their Social Graph API, which exposed a lot of the same information about public profiles but never really achieved widespread adoption. I'm really hoping WebFinger will become a much more widely used standard.

The lizardmen tunnels beneath LA, and other strange maps

Maybe it was growing up glued to The Reader's Digest Book of Strange Stories and Amazing Facts, but I've always had a weakness for the odd and neglected corners of history. If you suffer a similar affliction, you'll be as hooked as I am on the Strange Maps blog. Exploring the dark side of geography from the lost Lizardman tunnels of Los Angeles to an atlas of time itself, it manages to take in an asphalt map of Maine and a drive-thru scale replica of the solar system along the way.

The Beer Belly of America and other geographic mischief

I just discovered the wonderful floatingsheep.org, home to a whole bunch of new perspectives on the US and the world. My personal favorite is the Beer Belly of America, a map highlighting parts of the country with more bars than grocery stores! I love visiting Wisconsin and can well believe that they have a bar for every 1,700 people. A neon sign in Hayward advertising "Liquor & Live Bait" particularly sticks in my mind.

Nearly as much fun is their comparison of pizza, guns and strip clubs:

[Map: pizza, guns and strip clubs]

Also highly recommended are Baptists, Bibliophiles and Bibles, User-generated Swine Flu and Allah vs Buddah vs Jesus. I'm so pleased to discover someone else as obsessed with these sorts of random but fascinating views of the world. Great work by Matthew, Mark and Taylor!

Your chance to hire an amazing QA engineer in Austin

I was sad to hear that Apple are letting some remote workers go in their video software division, especially because that means Doyle Rockwell is leaving. He's been a driving force behind Apple's professional video products like Motion and Final Cut Pro for the last 8 years, but since he recently moved from LA to Austin to be closer to family, he's fallen prey to some job cuts focused on off-site employees.

I know the whole of my old team is going to miss him. He's truly one of the rare great testers that Joel so recently described. He's deeply interested in software, and very good at both automating tests and spending days tracking awkward bugs down manually when he needs to. He really cared about our customers, which both made him an awesome resource in design discussions and led him to spend many long hours of his own time building helper tools and tutorials to work around issues in the software. You can check out some of them at motionsmarts.com, where you can see he did all of this anonymously. Anyone who's asked a motion graphics question on Apple's official forum is likely to have got a reply from 'specialcase' too. This exchange from December is typical Doyle, with the user responding "Nothing short of sweet. Thanks specialcase, worked perfectly!" None of this was an official part of his job; he went above and beyond to help Apple's users.

Anyway, he's a terrible self-promoter and has a young family to support, so I wanted to thank him for all the help he's given my products over the years with a heads-up to the tech world that they have a wonderful chance to hire a great new employee. His email address is specialcase at mac.com. Please drop him a line if you're interested in hearing more about how he could help you; you won't regret it!

The Facebook Whisperer

Photo by Gerald Davison

That's Andrew Hyde's new nickname for me after reading the ReadWriteWeb article! More seriously, it's been amazing to see the reaction and support I've received from everyone. I'm so excited by the possibilities and insights we can gain from this sort of analysis, and I'm hopeful that we'll see a lot more of it coming. I'm still trying to catch up with my email and the blog comments, so my apologies for the slowness in responding, but I did want to mention a few things here about the maps.

Mercator

I want to make sure Manfred gets credit for the awesome Mercator Flash component he open-sourced, which I used as the basis for the interactive maps. I've blogged about it before, but he deserves a lot of kudos for making a great visualization building block. If you're interested in the interactive heat maps I use on fanpageanalytics.com, I've also made those available as open-source.

Twilight and Utah

I'm feeling very old and out of touch! I was unaware that Stephenie Meyer is from Utah, which explains the Mormon connection. Thanks to everyone who helped educate me, even Edward Cullen himself.

Alexandria, Georgia and ambiguous place names

Some of you spotted misclassified locations, with people from Alexandria, Egypt showing up in Alexandria, LA, and the ex-Soviet Georgians showing up in Dixie. That looks like a bug in the location sorting I'm using. I'll investigate and get a fix in for the next version.

Data release

I was hoping to get the first release of the academic data set out today, but Facebook have asked for a little more time to check the privacy implications. I'm very keen to avoid inadvertently helping spammers and scammers, so I'm working with them to make sure the data set is useful for network research but not for malicious purposes. I'll keep you up to date on how that goes.

How to harvest Facebook profiles from emails without logging in

Photo by Squacco

Max Klein recently posted a how-to on connecting a mailing list of users to their Facebook profiles, giving business owners a deep look into their customers' lives. There's one flaw with his technique: you need to be signed in to a Facebook account before you can get the information. The theoretical drawback is that you've clicked through their terms of service, which prohibit these sorts of shenanigans, and that taints the data if you want to sell it on. The practical problem is that Facebook claims to spot account holders doing this sort of bulk upload, and blocks their accounts.

Recently I was surprised to discover that you don't need to be signed in to an account to search by email addresses and match them to profiles. To my mind this is a nasty hole, both because it gives companies legal cover to resell the linked data and because in practice it makes it tough for Facebook to crack down on firms siphoning off user data. It's a little bit more complex than Max's original approach, so I'll go through the steps below. I've hit a brick wall trying to contact Facebook about previous security issues, so I'm hoping this might persuade them to close this one.

1 – Create a free email account, and upload 2,000 of the addresses you want info on as contacts

2 – Make sure you're logged out of Facebook, then go to http://www.facebook.com/find-friends/

3 – Enter your email account details, and answer the captcha

4 – Wait a couple of minutes, and you'll see a list of Facebook profiles for your addresses:

[Screenshot: blurred list of matched Facebook profiles]
This is the sneaky bit – [Removed temporarily at Facebook's request, until they can get a fix in]

Write a script to handle the contact upload, and to [Removed temporarily] to pull out the IDs, and all you need is some Turks to handle the Captcha to have a fully functioning pipeline. You could easily be processing tens or hundreds of thousands of addresses an hour, and Facebook would have to resort to IP blocking to shut you down. I'll be watching to see how long this hole remains open…

[Update – Facebook got in touch, they've implemented a reporting system for vulnerabilities since the last time I tried to track someone down. It's at www.facebook.com/security, and it sounds like they're paying attention]

How to split up the US

As I’ve been digging deeper into the data I’ve gathered on 210 million public Facebook profiles, I’ve been fascinated by some of the patterns that have emerged. My latest visualization shows the information by location, with connections drawn between places that share friends. For example, a lot of people in LA have friends in San Francisco, so there’s a line between them.
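To make the idea of lines between places concrete, here's a minimal Python sketch of how friendships could be rolled up into city-to-city connection counts. It's an illustration under assumptions, not the pipeline behind the map: the `city_links` function, the `(person, person)` pair format and the `home_city` lookup are all hypothetical.

```python
from collections import Counter


def city_links(friendships, home_city):
    """Count friendships between each unordered pair of distinct cities.

    friendships: iterable of (person_a, person_b) pairs (assumed format).
    home_city:   dict mapping each person to a city name (assumed format).
    """
    links = Counter()
    for person_a, person_b in friendships:
        city_a = home_city.get(person_a)
        city_b = home_city.get(person_b)
        # Skip people with no known location, and friends in the same city.
        if city_a is None or city_b is None or city_a == city_b:
            continue
        # Sort so that (LA, SF) and (SF, LA) land in the same bucket.
        links[tuple(sorted((city_a, city_b)))] += 1
    return links


if __name__ == "__main__":
    friends = [("ann", "bob"), ("bob", "cat"), ("ann", "cat")]
    homes = {"ann": "Los Angeles", "bob": "San Francisco", "cat": "San Francisco"}
    print(city_links(friends, homes).most_common())
```

The strongest pairs from `most_common()` are the ones that would earn a line on a map like this.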

Looking at the network of US cities, it’s been remarkable to see how groups of them form clusters, with strong connections locally but few contacts outside the cluster. For example Columbus, OH and Charleston, WV are nearby as the crow flies, but share few connections, with Columbus clearly part of the North, and Charleston tied to the South:

[Maps: Columbus and Charleston connections]

Some of these clusters are intuitive, like the Old South, but there are some surprises too, like Missouri, Louisiana and Arkansas having closer ties to Texas than to Georgia. To make sense of the patterns I’m seeing, I’ve marked and labeled the clusters, and added some notes about the properties they have in common.

Stayathomia

Stretching from New York to Minnesota, this belt’s defining feature is how near most people are to their friends, implying they don’t move far. In most cases outside the largest cities, the most common connections are with immediately neighboring cities, and even New York only has one really long-range link in its top 10. Apart from Los Angeles, all of its strong ties are comparatively local.

In contrast to further south, God tends to be low down the top 10 fan pages if she shows up at all, with a lot more sports and beer-related pages instead.

Dixie

Probably the least surprising of the groupings, the Old South is known for its strong and shared culture, and the pattern of ties I see backs that up. Like Stayathomia, Dixie towns tend to have links mostly to other nearby cities rather than spanning the country. Atlanta is definitely the hub of the network, showing up in the top 5 list of almost every town in the region. Southern Florida is an exception to the cluster, with a lot of connections to the East Coast, presumably thanks to sun-seeking refugees.

God is almost always in the top spot on the fan pages, and for some reason Ashley shows up as a popular name here, but almost nowhere else in the country.

Greater Texas

Orbiting around Dallas, the Gulf Coast towns, Oklahoma and Arkansas have ties that make them look more Texan than Southern. Unlike Stayathomia, there’s a definite central city to this cluster; otherwise most towns just connect to their immediate neighbors.

God shows up, but always comes in below the Dallas Cowboys for Texas proper, and other local sports teams outside the state. I’ve noticed a few interesting name hotspots, like Alexandria, LA boasting Ahmed and Mohamed as #2 and #3 on their top 10 names, and Laredo with Juan, Jose, Carlos and Luis as its four most popular.

Mormonia

The only region that’s completely surrounded by another cluster, Mormonia mostly consists of Utah towns that are highly connected to each other, with an offshoot in Eastern Idaho. It’s worth separating from the rest of the West because of how interwoven the communities are, and how relatively unlikely they are to have friends outside the region.

It won’t be any surprise to see that LDS-related pages like Thomas S. Monson, Gordon B. Hinckley and The Book of Mormon are at the top of the charts. I didn’t expect to see Twilight showing up quite so much though, I have no idea what to make of that! Glenn Beck makes it into the top spot for Eastern Idaho.

Nomadic West

The defining feature of this area is how likely even small towns are to be strongly connected to distant cities; it looks like the inhabitants have done a lot of moving around the country. For example, Boise, ID, Bend, OR and Phoenix, AZ all have much wider connections than you’d expect for towns their size:

[Maps: Boise, Bend and Phoenix connections]

Starbucks is almost always the top fan page, maybe to help people stay awake on all those long car trips they must be making?

Socalistan

Sorry Bay Area folks, but LA is definitely the center of gravity for this cluster. Almost everywhere in California and Nevada has links to both LA and SF, but LA is usually first. Part of that may be due to the way the cities are split up, but in tribute to the 8 years I spent there, I christened it Socalistan. Californians outside the super-cities tend to be most connected to other Californians, making almost as tight a cluster as Greater Texas.

Keeping up with the stereotypes, God hardly makes an appearance on the fan pages, but sports aren’t that popular either. Michael Jackson is a particular favorite, and San Francisco puts Barack Obama in the top spot.

Pacifica

The most boring of the clusters, the area around Seattle is disappointingly average. Its towns are tightly connected to each other, and it doesn’t look like Washingtonians are big travelers compared to the rest of the West, even though a lot of them claim to need a vacation!

So that’s my tour through the patterns that leapt out at me from the Facebook data. This is all qualitative, not quantitative, so I’m looking forward to gathering some numbers to back them up. I’d love to work out the average distance of friends for each city, for instance, and then use that as a measure of insularity. If you’re a researcher interested in this data set too, do get in touch; I’ll be happy to share.
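As a rough idea of what that insularity number could look like, here's a minimal Python sketch that averages the great-circle distance to friends, weighted by how many friends live in each connected city. Everything here is an assumption for illustration: the function names, the `links` format of `{(city_a, city_b): friend_count}` and the `coords` lookup of `{city: (latitude, longitude)}` are hypothetical, and I haven't run anything like this over the real data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def average_friend_distance(city, links, coords):
    """Friend-weighted mean distance (km) from `city` to its connected cities.

    Returns None when the city has no recorded connections.
    """
    total_km = 0.0
    total_friends = 0
    for (city_a, city_b), count in links.items():
        if city not in (city_a, city_b):
            continue
        other = city_b if city_a == city else city_a
        total_km += count * haversine_km(coords[city], coords[other])
        total_friends += count
    return total_km / total_friends if total_friends else None
```

A low average would suggest a stay-at-home town, and a high one a place whose residents have scattered or arrived from far away. Note that friends within the same city only count if `links` includes a same-city entry, where they contribute a distance of zero.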

Update – I wasn’t able to make the data-set available after all, but if you liked this map, you can now build your own with my new OpenHeatMap project!