With my book launch, BigDataCamp and Strata, I’ve accumulated a backlog, so here are five short links, plus 13!
Gluecon – Eric Norlin knows how to put on a great conference on emerging topics, and the world of integrating different web services, APIs and data sources is one that’s close to my heart. I’m looking forward to seeing the tribe that he gathers in Colorado, and if you’re part of it, you should think about taking up this opportunity to demo your application.
Big Data with Ken Krugler – Ken’s off-the-cuff talk on the pre-electronic US Census was one of the highlights of BigDataCamp for me. This covers a lot of the same ground, but in much more depth. O’Reilly folks, you need to pull this guy on board somehow!
Mapfluence Data Catalog – A well-chosen set of geo and demographic data sets from UrbanMapping. It’s all commercial, which I have no objection to at all, but the lack of obvious pricing means you’ll have to invest time in negotiation with them to decide whether it’s for you. An unlabeled graph doesn’t count.
pipe2py – An intriguing open-source project that takes data flows built in Yahoo Pipes, and converts them into pure Python code. There’s also a quick tutorial available describing how to run the results on Google’s AppEngine.
PeopleSearch – A simple but effective hack, using Google’s custom search APIs to find people’s profiles on major services.
$3m Heritage Health Prize – A fantastic idea, using a Netflix-style data competition at Kaggle to research better ways to predict healthcare needs. There are some questions around how best to preserve anonymity, but this is such an important goal that it’s worth accepting some small risks on the privacy front.
The O’Reilly Stylesheet – I love reading through stylesheets from different publishers. There are a few rules in here I’ve struggled to follow, like referring to a company as ‘it’ rather than ‘they’.
GroundCrew – A simple but effective service for organizing volunteers using cell phones.
Walkshed – There’s a lot of promise in visualizing attributes like walkability and accessibility across cities. A lot of these attributes are really hard to understand unless you devote serious time to exploring the neighborhoods, which made it tough to choose a location when I had to move to San Francisco as an outsider.
Map of Scientific Collaboration – A beautiful view of the citation networks in research papers, presented geographically. The next step is to make these interactive and explorable.
Chequered Airwaves – How the high-brow Czech-language radio stations ceded the battle for minds to the less scrupulous German broadcasters in the run-up to the Second World War. This struck me as relevant when we consider the right approach to ignorant populist diatribes, in the debate I keep having with myself about how sensational to go.
Ruby Geocoder – The most recent version of the original Perl Tiger/Line US geocoder, rewritten in Ruby and able to ingest the latest shapefiles.
Hacking Lottery Scratchcards – There’s a whole world of statistical data hacking out there, revealing information that publishers never believed they could possibly be exposing.
Small Business Innovation Research Grants – There’s a massive world of US government money available to startups. The main drawbacks are the almost overwhelming barriers to getting through the initial paperwork, the pernicious influence of managing to please federal managers instead of real customers, and in this case becoming part of the military-industrial complex.
Where the Ladies At? App – I may not like it, but this is probably the future of location-based services. After all, Facebook basically started as a way to stalk fellow students at Harvard.
How the O’Reilly Animals are Chosen – I still have no idea how I got a bull for my cover, but given my childhood in a farming village I can’t complain.
Strata Interview – I talk about the Data Source Handbook on camera. I wasn’t happy with this one; I should have talked about all the cool maps people are building with OpenHeatMap instead of going off into an abstract ramble.
Europe vs the US on Privacy – There’s a strong tradition in Europe of assigning a higher value than the US to privacy relative to freedom of expression and innovation. There’s going to be an increasing clash over this as more and more data sources merge and reveal increasing amounts of personal-but-public information.