Five short links

Photo by Chris in Plymouth

A catalog of data catalogs from governments around the world. The really hard problem with all 'open data' is making the connection between a developer's immediate problem and an available data set or API, but at least sites like this are building a foundation for solving that.

A proposal for making Ajax crawlable – I didn't realize the hashbang syntax was actually backed up by an informal standard for making the same content available to crawlers through a traditional URL. This is much better than completely opaque JavaScript-driven pages, but I am left wondering how tough it is to maintain two separate content delivery paths in the code.
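To make the mapping concrete, here is a rough sketch of how a hashbang URL might be rewritten into the traditional URL a crawler would request. It assumes the `_escaped_fragment_` query parameter described in the proposal. The function name `to_crawler_url` is my own, and the escaping is simplified compared with the proposal's exact character list, so treat it as an illustration rather than a reference implementation.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_crawler_url(url: str) -> str:
    """Rewrite a '#!' URL into its '_escaped_fragment_' equivalent.

    Hypothetical helper: URLs without a hashbang are returned unchanged.
    """
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url

    # Everything after '#!' is the application state. Percent-encode it,
    # leaving '=' readable (a simplification of the proposal's escaping rules).
    state = quote(parts.fragment[1:], safe="=")
    param = "_escaped_fragment_=" + state

    # Append to any query string that is already present.
    query = (parts.query + "&" + param) if parts.query else param
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


print(to_crawler_url("http://example.com/page#!key=value"))
```

The server then has to answer that second URL with a static snapshot of whatever the JavaScript would have rendered, which is exactly where the cost of keeping two delivery paths in sync shows up.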

Disruptor – Much as I dislike queues as a general-purpose primitive for data processing (I see them as a necessary evil for the subset of problems that require streaming solutions), I am impressed by this high-performance framework. A recurring theme in many of my optimization investigations over the last few years has been the painful cost of locking, so I bet their focus on lockless parallelization will be very powerful.

Adventures with venture capital – Chasing investment is both time-consuming and uncertain. A cautionary tale from Tim on how the process can go wrong, which unfortunately happens more often than you'd think.

How much compute power do you need for next-gen sequencing? – Bioinformatics tasks are much larger than most web problems, but this analysis of their computing needs has some useful parallels. In my data jobs, CPU has never been the bottleneck; it's always been memory or IO. I don't think I'll be moving to a 1TB RAM machine any time soon though!
