ASTER GDEM – It turns out there's more than one global set of elevation data! Thanks to Matthieu Molinier for pointing out this alternative to SRTM3 that has better coverage on steep terrain and high latitudes.
Frontend view generation with Hadoop – Anyone who's built big data pipelines has had to confront the problem of how to output the results efficiently. If the output data is small, doing a normal load from CSV into a database, or even running dynamic insertion calls, can be fast enough, but as soon as it's something larger (like a search index), writing out the results becomes a bottleneck for the whole process. I first ran across a pattern to tackle this at Backtype: writing out binary BerkeleyDB database files directly to disk from the final reducer stage, and then just hot-swapping them in so they're available to the front end. This post from Datasalt looks at some other ways of doing the same thing with different technologies, including Voldemort and SOLR. I'd never seen SOLR used as just a distributed key/value store; it feels a bit like using Concorde for crop-spraying, but they seem to have had luck with it for their application.
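To make the hot-swap idea concrete, here is a minimal single-machine sketch. It is not what Backtype or Datasalt actually ran: I'm assuming SQLite as a stand-in for BerkeleyDB, and the function names (`build_and_swap`, `lookup`) and the `kv` table are made up for illustration. The point is only the shape of the pattern: build the complete file off to the side, then rename it over the live one in a single step.

```python
import os
import sqlite3
import tempfile


def build_and_swap(records, live_path):
    """Write (key, value) records to a fresh database file, then swap it in.

    Assumption: SQLite stands in for BerkeleyDB here, and `records` plays
    the role of the final reducer's output.
    """
    directory = os.path.dirname(os.path.abspath(live_path))
    # Build in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    os.close(fd)
    try:
        conn = sqlite3.connect(tmp_path)
        conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
        conn.executemany("INSERT INTO kv VALUES (?, ?)", records)
        conn.commit()
        conn.close()
        # The swap: readers see either the old file or the new one,
        # never a half-written file.
        os.replace(tmp_path, live_path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def lookup(live_path, key):
    """Front-end read: open the current live file and fetch one value."""
    conn = sqlite3.connect(live_path)
    try:
        row = conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```

Calling `build_and_swap` a second time with a new batch replaces the whole file, which is the appeal: the front end never pays for row-by-row inserts. A real deployment would also have to deal with readers that hold the old file open and with getting the files from the cluster to the serving machines, neither of which this sketch attempts.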
Vizify – A clean dashboard of statistics and visualizations of your Twitter activity (here's my profile). They have some fun with an Angry Birds clone, with a cunning hook asking you to tweet your high score.
Topsy Analytics – I knew these guys had been doing some fascinating backend real-time search work, but I didn't realize they exposed analytics too. We'll actually be moving into their building in a couple of weeks, to cope with the growing team, so I look forward to geeking out with them.
A simple explanation of Benford's Law – I don't find it quite as simple as they hope, but it is an approachable yet rigorous look at one of the most fascinating statistical hacks around. I'm just worried that by popularizing it, fraudsters will wise up and we'll lose one of the easiest ways to spot dodgy numbers!
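For anyone who wants to poke at it themselves, here is a rough sketch of the check. Benford's Law says the leading digit d should show up with probability log10(1 + 1/d), so 1 leads about 30% of the time and 9 under 5%. The helper names below are my own, and the "maximum deviation" score is just a crude yardstick I picked for illustration, not a proper statistical test.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit up front,
    # e.g. 314 -> '3.140000000000000e+02'.
    return int(f"{abs(x):.15e}"[0])


def observed_shares(values):
    """Share of each leading digit 1-9 among the non-zero values."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def max_deviation(values):
    """Largest gap between observed and expected digit shares."""
    expected = benford_expected()
    observed = observed_shares(values)
    return max(abs(observed[d] - expected[d]) for d in range(1, 10))
```

Powers of two are a classic Benford-friendly sequence, so `max_deviation([2**k for k in range(1, 201)])` should come out small, while a list where every leading digit is equally common lands far from the expected curve.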