Five short links

Picture from Pulp Covers

Ayasdi – A very seductive new visualization and analysis tool. It feels like they've learned a lot from Palantir's success.

Benford's Law: A revised analysis – I'd been using the original study, which applied Benford's Law to public company accounts to look for fraud over time, as a poster child for the application of numeric methods to journalism. I'm sorry to see that it turned out to be a bogus correlation (thanks to an increase of zeroes in revenue figures), but it's a good reminder of how important peer review and humility are as we charge ahead with our new techniques. It's the sort of mistake that keeps me awake at night, knowing how easy it would be to make.
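For anyone who hasn't played with Benford's Law, here's a minimal Python sketch of the basic check: compare the leading digits in a set of numbers against the expected frequency log10(1 + 1/d). This is not the method from the study or the revised analysis. The function names are my own, made up for illustration, and the mean absolute deviation is just one simple way to score the fit.

```python
import math
from collections import Counter


def benford_expected():
    """Expected frequency of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first non-zero digit of a non-zero number."""
    # Scientific notation always puts the leading digit first, e.g. '3.12e+02'.
    return int(f"{abs(value):.15e}"[0])


def observed_distribution(values):
    """Observed frequency of each leading digit, skipping zeroes."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def mean_absolute_deviation(values):
    """Average gap between observed and expected digit frequencies."""
    expected = benford_expected()
    observed = observed_distribution(values)
    return sum(abs(observed[d] - expected[d]) for d in range(1, 10)) / 9


if __name__ == "__main__":
    # Powers of two are a classic sequence that follows Benford's Law closely.
    powers = [2 ** n for n in range(1, 200)]
    print(mean_absolute_deviation(powers))
```

A low score only says the digits look Benford-like. As the revised analysis shows, a high score can come from something as mundane as a change in how figures are rounded or reported, so it's a prompt for a closer look, not evidence of fraud.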

Tika – A lovely collection of open-source code to handle all sorts of file conversions to text. I built some similar functionality into the Data Science Toolkit, but I'm excited to see an Apache-supported alternative.

Stanford Part-of-speech Tagger – A walk-through of a slick project for categorizing words within unstructured English-language text.

The Next Big Thing – How Amazon should be using their information on customers' book habits to drive a social network. I'm convinced that implicit signals will win out over the follow/friend model when it comes to building communities of people, but nobody's built an example that actually works yet.
