Five short links

Photo by Chris in Plymouth

Visualizing Large Facebook Friendship Networks – There’s lots of academic work emerging using social network information. What I find really interesting are the techniques people are developing to make sense of the ‘hairball’ that results from a naive approach to plotting the raw networks, since people have so many friend connections.

What the strange persistence of rockets can teach us about innovation – A really fresh way of looking at technological progress, and a reminder that what seems inevitable now is often actually very path-dependent on the past.

Why did economists not spot the crisis? – The compelling answer is “We don’t reward or encourage people to be generalists”. Academic kudos is only available to hedgehogs who know one thing really well, not foxes who bounce around. I think the skills required to be a generalist are undervalued in the technology world too, and that causes very similar problems.

Africa Rules the World – Some commentary on a slick visualization of growth rates around the world. As he says, it’s a bit misleading because a 20% growth rate in a desperately poor country is not that much in absolute terms, but it does show the dynamism of Africa.

Hilary Mason on NPR – “Everything is interesting”. bit.ly’s chief scientist does a great job explaining the joys and perils of data.
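To make the caveat on the Africa growth-rate link concrete, here's a minimal sketch of the arithmetic. The income figures and growth rates are hypothetical, picked only to illustrate the point, and aren't taken from the visualization or any real dataset.

```python
def absolute_gain(income_per_person, growth_rate):
    """Return the one-year absolute increase implied by a growth rate."""
    return income_per_person * growth_rate


# Hypothetical figures, chosen only to illustrate the arithmetic.
poor_country_gain = absolute_gain(500.0, 0.20)    # 20% growth on a low base
rich_country_gain = absolute_gain(40000.0, 0.02)  # 2% growth on a high base

print(f"Poor country gain per person: {poor_country_gain:.2f}")
print(f"Rich country gain per person: {rich_country_gain:.2f}")
```

With these made-up numbers, the fast-growing country adds far less per person in absolute terms than the slow-growing one, which is why a map of percentage growth alone can mislead.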

The American Way of Dating

Photo by Brandon Warren

With the (mostly) shared language, it's easy for people from the UK to think that America is basically like Britain, apart from the funny accents. I had a little of that attitude when I moved here, but rapidly learned how wrong I was. With Valentine's coming up, I was reminded of one of the best examples of the alienness lurking under the surface: dating. As Kira Cochrane amusingly chronicled in The Guardian, the British standard is "go to a party, down some drinks, make eye contact with a person you fancy, proceed to kissing and often much more, wake up the next morning to find that you have magically become one half of a couple". It seems like the goal is to avoid any unambiguous declarations of interest, so that at any point either person can end the process without the other losing face.

This isn't how it usually works in the US, at least in the mainstream. The formality and rituals surrounding courtship feel like something out of a Noh play. The very idea of actually asking a near-stranger for a date, explicitly and with no particular preamble, in the full knowledge that you may be turned down, seems nothing short of revolutionary compared to the system I grew up with.

Kira ended up avoiding the rules when she was over here, but even she acknowledges there's a need they're filling. Maybe it's because American culture is so varied that the system has to be so explicit about intentions, since people growing up with radically different backgrounds will never be able to communicate using the subtle signs that the British rely on. There's also something refreshingly honest about the whole procedure. A friend was telling me about her travels in Ireland, and being romanced by a hopeful local man. She discovered he was married, with kids, so she asked if it was an open relationship. "Don't be disgusting, woman!" was the reply.

Eighteen Short Links

Photo by Laura Thorne

With my book launch, BigDataCamp and Strata, I’ve accumulated a backlog, so here’s five short links, plus 13!

Gluecon – Eric Norlin knows how to put on a great conference on emerging topics, and the world of integrating different web services, APIs and data sources is one that’s close to my heart. I’m looking forward to seeing the tribe that he gathers in Colorado, and if you’re part of it, you should think about taking up this opportunity to demo your application.

Big Data with Ken Krugler – Ken’s off-the-cuff talk on the pre-electronic US Census was one of the highlights of BigDataCamp for me. This covers a lot of the same ground, but in much more depth. O’Reilly folks, you need to pull this guy on board somehow!

Mapfluence Data Catalog – A well-chosen set of geo and demographic data sets from UrbanMapping. It’s all commercial, which I have no objection to at all, but the lack of obvious pricing means you’ll have to invest time in negotiation with them to decide whether it’s for you. An unlabeled graph doesn’t count.

pipe2py – An intriguing open-source project that takes data flows built in Yahoo Pipes, and converts them into pure Python code. There’s also a quick tutorial available describing how to run the results on Google’s AppEngine.

PeopleSearch – A simple but effective hack, using Google’s custom search APIs to find people’s profiles on major services.

$3m Heritage Health Prize – A fantastic idea, using a Netflix-style data competition at Kaggle to research better ways to predict healthcare needs. There are some questions around how best to preserve anonymity, but this is such an important goal that it’s worth accepting some small risks on the privacy front.

The O’Reilly Stylesheet – I love reading through stylesheets from different publishers. There have been a few rules in here I’ve struggled to follow, like referring to a company as ‘it’ rather than ‘they’.

GroundCrew – A simple but effective service for organizing volunteers using cell phones.

Walkshed – There’s a lot of promise in visualizing attributes like walkability and accessibility across cities. A lot of these attributes are really hard to understand unless you devote serious time to exploring the neighborhoods, which made it tough to choose a location when I had to move to San Francisco as an outsider.

Map of Scientific Collaboration – A beautiful view of the citation networks in research papers, presented geographically. The next step is to make these interactive and explorable.

Chequered Airwaves – How the high-brow Czech-language radio stations ceded the battle for minds to the less scrupulous German broadcasters in the run-up to the Second World War. This struck me as relevant when we consider the right approach to ignorant populist diatribes, in the debate I keep having with myself about how sensational to go.

Ruby Geocoder – The most recent version of the original Perl Tiger/Line US geocoder, rewritten in Ruby and able to ingest the latest shapefiles.

Hacking Lottery Scratchcards – There’s a whole world of statistical data hacking out there, revealing information that publishers never believed they could possibly be exposing.

Small Business Innovation Research Grants – There’s a massive world of US government money available to startups. The main drawbacks are the almost overwhelming barriers to getting through the initial paperwork, the pernicious influence of managing to please federal managers instead of real customers, and in this case becoming part of the military-industrial complex.

Where the Ladies At? App – I may not like it, but this is probably the future of location-based services. After all, Facebook basically started as a way to stalk fellow students at Harvard.

How the O’Reilly Animals are Chosen – I still have no idea how I got a bull for my cover, but given my childhood in a farming village I can’t complain.

Strata Interview – I talk about the Data Source Handbook on camera. I wasn’t happy with this one; I should have talked about all the cool maps people are building with OpenHeatMap instead of going off into an abstract ramble.

Europe vs the US on Privacy – There’s a strong tradition in Europe of assigning a higher value than the US to privacy relative to freedom of expression and innovation. There’s going to be an increasing clash over this as more and more data sources merge and reveal increasing amounts of personal-but-public information.