Spotlight your startup at Strata

Photo by Bryan Stevenson

Are you a data startup that would love to be at Strata but can't afford the admission? You now have a chance to attend the conference and show off what you've been building, thanks to the Strata Startup Showcase. There's space for fifteen startups, and successful companies will be given two free passes and five minutes to show off their work in front of investors. It's a great opportunity, but the deadline for admissions is Friday, so you'll need to be quick.

Don't forget the free Big Data Camp Unconference on the Monday before the main event too. The price is specially tailored for starving entrepreneurs' wallets.

What makes a good data API?

Picture by Victoria (Mouse World)

I’ve been working on a guide to data APIs, and making decisions about what to include has forced me to think about exactly what I look for. If you’re going to build an API that’s useful to a wide range of people, and will add value to the whole data ecosystem, here’s what you need.

  • Free, or self-service signup. Traditional commercial data agreements are designed for enterprise companies, so they’re very costly and time-consuming to experiment with. APIs that are either free or have a simple sign-up process make it a lot easier to get started.

  • Broad coverage. There have been quite a few startups that build infrastructure and hope that users will then populate it with data. Most of the time, this doesn’t happen, so you end up with APIs that look promising on the surface but actually contain very little useful data.

  • Online API or downloadable bulk data. Most of us now develop in the web world, so anything else requires a complex installation process that makes it much harder to try out.

  • Linked to outside entities. There has to be some way to look up information that ties the service’s data to the outside world. For example, the Twitter and Facebook APIs don’t qualify because you can only find users by internal identifiers, whereas LinkedIn does because you can look up accounts by their real-world names and locations.

The first three principles are just about ease of use, but having linkable data is essential if you’re going to allow developers to innovate by combining data sources. Once there’s an external reference point, developers can join information from different services to come up with insights nobody would expect.
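To make the linking idea concrete, here is a minimal sketch of joining two unrelated data sources on a real-world reference point (a name plus a city) rather than on either service's internal identifiers. Everything in it is an assumption made up for illustration: the record fields, the sample values, and the helper names `external_key` and `join_on_external_key` don't come from any real API.

```python
# Hypothetical records from two unrelated services. All names, fields and
# values are invented for illustration only.
profiles = [
    {"internal_id": 1017, "name": "Ada Example", "city": "Boulder", "employer": "Acme Data"},
    {"internal_id": 2093, "name": "Sam Sample", "city": "Austin", "employer": "Initech"},
]
talks = [
    {"speaker": "ada example", "city": "boulder", "title": "Mapping bike accidents"},
    {"speaker": "Pat Placeholder", "city": "London", "title": "Streams at scale"},
]


def external_key(name, city):
    """Build a join key from real-world attributes instead of internal ids."""
    return (name.strip().lower(), city.strip().lower())


def join_on_external_key(left, right):
    """Return merged records for every pair that shares a real-world key."""
    # Index the right-hand source by its external key.
    index = {}
    for row in right:
        index.setdefault(external_key(row["speaker"], row["city"]), []).append(row)

    # Look up each left-hand record in that index and merge the matches.
    joined = []
    for row in left:
        for match in index.get(external_key(row["name"], row["city"]), []):
            joined.append({**row, **match})
    return joined


combined = join_on_external_key(profiles, talks)
for record in combined:
    print(record["name"], "|", record["employer"], "|", record["title"])
```

The internal ids never take part in the join, which is the point: if a service only exposes its own identifiers, there's nothing for a second data source to match against. Real-world names are messy, so a real join would likely need fuzzier matching than the lowercase comparison used here.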

Five short links

Photo by Chris in Plymouth

The Linked Open Data cloud diagram – I disagree with the Linked Data philosophy. I think top-down, formal semantic approaches are a dead end, and I believe RDF is the Devil’s Own Format. I can’t deny that the array of sources they’ve linked together is impressive though, and it’s beautifully presented here.

Taco Bell Programming – The hacker mentality can be an incredibly powerful tool for compressing days-long tasks into minutes, if you can just look at them from the right angle. Mmmm, Taco Bell….

The Perils of Kinder Surprise – I’m so glad we’re being protected from the dangers of small chocolate eggs with plastic toys inside. I never really liked them growing up in the UK, but it depresses me that here we’re paying border guards to seize an average of 25,000 of them every year. When Kinder Surprises are outlawed, only outlaws will have Kinder Surprises.

The Myth and Truth of the NYC Engineer Shortage – Hiring ‘A players’ doesn’t mean hiring people with the exact skills you need, or even experienced engineers. Hire for smarts and enthusiasm, give your experienced folks time to help them, and within a few months you’ll have productive employees. Even better, they’ll be cheaper, and more loyal than that hot-shot you keep dreaming of. Hire for the right mentality, and everything else will follow.

Elusive Forger, Giving but Never Stealing – When I read the Norse myths as a kid, my favorite character was always Loki the Trickster, so I find this story of a non-profit forger slipping his works into museums’ collections delightful.

Thoughts on London

Photo by Ian Brumpton

It's always strange going back to the country I grew up in. I spent my early academic and professional life there in a near-constant state of frustration, so it's hard for me to analyze it rationally. Bearing that in mind, here are some of the impressions I was left with after spending a week back in London.

High Finance. I was there to help out some startup friends, and their biggest problem was that big financial firms could easily outbid any early-stage startup for technical talent. If an experienced developer can get $500,000 a year, it takes a lot to lure them. This might sound great for developers, but only if it's a long-term, sustainable situation. My fear is that the current high levels of financial firm profits won't last, those jobs will vanish, and without a widespread startup culture there will be no good replacements. Felix Salmon did a great article on the problem of finance sucking up all the oxygen, and I think he's spot on. I don't have figures to back this up, but even New York with its massive finance industry feels like it has a lot more diversity to fall back on than London.

Deference. There's a real reluctance to give young punks responsibilities, and a separation between management and engineering. As a 25-year-old with big ideas, I found the difference between what I was allowed to do in British and American companies amazing. I went from getting into trouble to being given pats on the back for coding outside of my assigned areas. I was included in management discussions, not kept in the dark. The conservative social system in the UK makes it tough to be flexible like that, and discourages a lot of troublemaking innovators.

Don't look! Keep your eyes focused on the ground five feet in front of you at all times when walking. I hadn't realized how much my habits had changed until I was wandering around London and wondering why everyone was bumping into each other. Do I walk like an American now?

Drip-feeding. European investors are complete wimps. They have the terrible habit of handing out investment in tiny chunks. This forces entrepreneurs to constantly be fund-raising, unable to plan more than a couple of months ahead. I talked to several startups with traction that would earn them millions in VC investment on the west coast, and they're all struggling with this issue. I'm not the only one to have spotted the problem, though Paul has different ideas on the cause.

Raw Potential. Despite all these criticisms, I met so many clever, motivated people and great startups. I'm just a tourist there these days, so my hat goes off to everyone working to make London the tech innovation hub it deserves to be.

Five short links

Photo by Francisco Nogueira

History of the English language – I knew the general outlines already, but there are some fascinating details in here, especially the example of how the Lord’s Prayer would have been written at different times. The 1000 AD sample is unintelligible, but by 1384 it’s hard going but readable. It also led me to discover that Illinois had a law on the books until the 1960s that the official language was American, not English. Makes sense to me.

CC San Francisco Salon – This looks like a stellar line-up of data folks for an informal discussion around openness in a data-driven world. I’m disappointed I can’t make it since I’m out of the country, but I’ll be checking out the video record of the event.

DataDay Austin – Texas has a cluster of cutting-edge data companies, and they’ve lined up an impressive day of training and talks. Folks from Infochimps, Google, 80legs and more will be there.

DataSets, Redistributable Data Sets – Delicious is still an essential tool for easily sharing resources, and I’m thankful that Julian and Peter are publishing their finds.

AsciiDoc – Why didn’t somebody tell me about this before? It’s an elegant little tool for taking the plain-text conventions we all use when creating READMEs, and formalizing them into a markup language that can be used to create everything from HTML to PDF and epub documents. I’ve been using Pages or Word to build books, and the boiler-plate formatting work was so time-consuming. This has made my latest project a breeze.

Five short links

Photo by Balloon Shop Enfield

DataSift – UK startup focused on making it easy to build your own tools on top of massive social media streams like the Twitter firehose. Seems a bit like Yahoo Pipes for social data, without the visual interface, and could open up the area to a much wider audience of developers.

The Doctor vs the Computer – A thousand-character limit on descriptions in medical records is so obviously arbitrary and unneeded, it hurts. Websites that have code to complain about spaces in credit-card numbers but somehow can't strip them out are bad enough, but here the bondage-and-discipline over-specification could kill people.

Trouble in the House of Google – Google's had massive success because they realized that inelegant statistical methods of detecting things like spam, plagiarism and relevance work a lot better than more elegant traditional semantic/AI techniques. Unfortunately, the black hats have figured out that there's no statistical technique in the world that can truly rate the quality of a page. Google's relying on statistical measures that used to correlate with that quality, but as the bad guys mimic those more closely, they are tricking the search engine into believing spam is the real thing. We need more inputs, whether that's a return of some kind of manual rating system, data from social networks or click-through rates.

Bike Accidents in Tucson – Exactly the sort of thing I built OpenHeatMap for. Collin Forbes is using it to help influence the debate about policing in his city.

This isn't a post about Facebook – Mourning the rise of a service that's a closed system, instead of the openness of Google. I'm not as pessimistic as Paul. I think that Facebook is demonstrating how much people want tools that reflect their off-line social world and behaviors, and once the open world absorbs that lesson, we'll see a new wave of competition for the social network. That competition will have to be more open in a technical sense, just because that's such a tempting way to get early traction.

Leave a trail of breadcrumbs

Photo by Virelai

Maybe your purpose in life is to serve as an example to others of what not to do? That's a thought that actually cheers me up when I'm feeling down, because at least it adds some meaning to horrible experiences. I was thinking about that when I read about Jud's brush with personal disaster. Anyone searching on BPPV now has a detailed account of what went wrong and how he recovered. That may seem like a small thing, but for a handful of sufferers it will be information that helps them immensely. There's no theoretical limit on how long it could remain useful either – our great-great-grandkids could still be learning from his experiences.

We take economic growth for granted, but did you ever stop and think about what it actually means? Why should the same number of people be able to produce a few percent more for the same amount of effort, year after year, for centuries now? The secret is culture. As one person or organization discovers how to do more with less, that secret gets passed around and remembered collectively by humanity. Productivity is actually a massive series of niche lessons about what works and what doesn't. Our whole world is built on millennia of anecdotes like Jud's.

That's why the internet leaves me with so much hope for the future. Over my lifetime we've created an incredibly powerful way of transmitting our experiences to others who care. Even if there's only a handful of people in the world who might benefit from a particular insight, for very little effort you have a good chance of reaching them and improving their lives.

People ask me if they should blog or use Twitter, and I tell them it won't make them money, it won't bring them fame, and in terms of concrete returns, it's a waste of time. I still encourage them to do it though, because every true story is worth telling. For years, despite low traffic, I kept going because the search logs told me there were one or two people a day who found a solution to their problem thanks to a post I'd written. If you think about it, that's hundreds of people a year you can help, just by writing down a few of your experiences.

So, when you look at your life in 2011, ask yourself if you're leaving a trail of breadcrumbs. It might be the most effective way you can make the world a better place.