Five short links

Photo by Mitchell Gerskup

Kaggle – A site dedicated to improving our data analysis algorithms by running frequent Netflix-style contests. Both the data providers and the scientists win. I think this is an excellent idea, and I’m pleased to see there’s been a lot of uptake – via Anthony Goldbloom

Storytelling, statistics and other grave insults – “Statistics is often too dry and too abstract for us to understand intuitively, to generate that comfortable internal feeling of understanding”. There’s a lot of truth in this analysis of how we crave, and believe, narratives. Stories persuade in a way that numbers don’t.

The Illogicality of Stock-Brokers – In a similar vein, this study shows how even experts let their intuition override basic logic. The authors experimented by posing trading questions to experienced stockbrokers, and found that the plausibility of the answer strongly determined whether they chose it, even if applying simple logic would clearly show it was wrong. More evidence that we rely more on our pattern-matching skills than our rationality when making decisions – via FreeExchange

myWorld Demo – The web opens up so many ways of putting advanced geo tools like this into the hands of everyday people – via Peter Batty

Datavis Tumblr – Beautiful examples of data visualization – via Julian Green

How to use the new Salesforce REST API from PHP

Photo by Napanee Gal

Salesforce just released the REST version of their API, and while there's a Java example, there's no sample code for other languages. Since I'll be calling it from PHP, I used their documentation to build my own sample code. The source is available at and you can see a live version running at The code demonstrates how to authenticate, get an access token and then call the API to grab information about the sales accounts for the current user.

To use the API at all, you'll need a server set up with an SSL certificate and https, since the OAuth 2.0 authentication requires a secure connection. I found this guide from Ubuntu useful in getting that set up, and bought my certificate from GoDaddy.

With that sorted out, go to to create a Developer Edition Salesforce account. You'll also want to sign up for the REST API preview beta program (though they're currently experiencing a few technical hiccups with the process).

Next, navigate to the Setup link in the top-right corner of the page, then click on Develop, then Remote Access. Pick an application name, and add the location where you'll be uploading the example index.php file as the callback URL. I also checked the No user approval required box even though I'm not sure exactly what it does! After you've saved you should see a screen giving you your access credentials. Copy the Consumer Key, Consumer Secret and callback URL values into the start of your copy of the index.php sample code and then upload it to the server.
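For anyone curious what those credentials are used for, the first step of the OAuth 2.0 flow is just a redirect to Salesforce's authorization endpoint with the Consumer Key and callback URL attached as query parameters. Here's a minimal sketch of that step, written in Python rather than PHP purely to show the shape of the request. The function name and the placeholder values are my own inventions for illustration, and I'm assuming the standard login.salesforce.com endpoint, so check it against the documentation for your account.

```python
from urllib.parse import urlencode

# Assumed: the standard Salesforce OAuth 2.0 authorization endpoint.
AUTHORIZE_URL = "https://login.salesforce.com/services/oauth2/authorize"


def build_authorize_url(consumer_key, callback_url):
    """Return the URL to send the browser to for the login step."""
    params = {
        "response_type": "code",       # ask for an authorization code
        "client_id": consumer_key,     # the Consumer Key from Remote Access
        "redirect_uri": callback_url,  # must match the registered callback URL
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


# Placeholder values, not real credentials.
print(build_authorize_url("YOUR_CONSUMER_KEY", "https://example.com/index.php"))
```

The callback URL has to match what you registered exactly, which is why it gets copied into the sample code alongside the key and secret.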

Now point your web browser at the address you uploaded the sample code to. The first time through it should redirect you to a login page on the Salesforce site, ask you whether you want to let the application access your data, and then send you back to the sample code location. If all goes well, you should see a list of your sales accounts:

Congratulations, you've just written your first Salesforce application!
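Behind the scenes, the rest of the flow comes down to two HTTP requests: a POST that swaps the authorization code for an access token, and a GET that runs a query against the instance URL returned with the token. The sketch below, again in Python rather than PHP, only builds those requests without sending them. The function names, the `v20.0` version string, the `Bearer` header scheme and the SOQL query are assumptions of mine for illustration, not a copy of the sample code.

```python
from urllib.parse import urlencode

# Assumed: the standard Salesforce OAuth 2.0 token endpoint.
TOKEN_URL = "https://login.salesforce.com/services/oauth2/token"


def build_token_request(code, consumer_key, consumer_secret, callback_url):
    """Return (url, form_body) for the POST that swaps a code for a token."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,                      # the ?code= value sent to the callback
        "client_id": consumer_key,
        "client_secret": consumer_secret,
        "redirect_uri": callback_url,
    })
    return TOKEN_URL, body


def build_accounts_query(instance_url, access_token, api_version="v20.0"):
    """Return (url, headers) for a GET that lists account names.

    instance_url and access_token come from the JSON token response.
    The API version and header scheme here are assumptions.
    """
    soql = "SELECT Name, Id FROM Account LIMIT 100"
    url = f"{instance_url}/services/data/{api_version}/query?" + urlencode({"q": soql})
    headers = {"Authorization": "Bearer " + access_token}
    return url, headers
```

Keeping the request building separate from the sending makes it easy to print out exactly what will go over the wire when something fails during authentication.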

The dark side of entrepreneurship, continued

My last post led to a flood of comments and emails. There's so much sobering insight packed into the reactions, and so many personal stories stand out, that I'll highlight a few of them below.

First though, I want to talk a little about the connection between my work and the failure of the relationship. As I said, there wasn't a direct, clear link. I fought hard to carve out time to spend together, since I knew that was the classic mistake. It was more subtle – when at some level I started to sense deeper problems with our relationship, I'd try to spend more and more time together as a fix instead of talking about the issues and confronting them. I had my work to soak up any frustration and bring me some measure of satisfaction, so I let things linger. Looking back, I should have known what was going wrong, but it was easier to hide in the world I could control than face painful conversations. I shut myself off, and left her facing our problems alone. When she finally had the courage to confront me with them, she felt it was too late for a fix. It wasn't that work kept me physically away, it's that it gave me a place to hide from our problems.

123 wrote:

"I know how it is to be on the other side of the story. It's hard for us too. I'm really sorry."

That's what I'm going to regret for the rest of my life, the pain I caused someone I love.

Alex Dong paints a picture of where I could be in 50 years' time if I'm not careful:

"When I met Frank, he was 97. Just came back from his first accident falling off from stairs. A visit to his house was like a tour of the car history museum. He was there when Ford was just starting up. Frank invented a shitload of gadgets and mechanical devices that made the model-T possible. Most bleeding edge technologies they created have long gone. Today we don't even have a chance to see them anymore. Like a reflective mirror on top of the front light that can tell the driver whether the light is on or not. He was so lonely by then that he took us for a ride in his still brand new Ford Model A and refused to stop and let us go home.

Frank's personal life was a complete failure. He was so passionate about changing the world that he had a workshop at home, with lathe and tons of great manual tools. His wife became an alcoholic because Frank completely ignored her. His two sons hated him so much that none of them came to visit him in the last 5 years.

A month ago, I heard that Frank finally moved into a nursing home. He sold his house before he left. My friend Walter was there when one of his sons hired a dumpster to take away all Frank's tools. The whole workshop was thrown away. Nobody, even a museum, wanted his stuff. All the great books, models, designs, gadgets. All into the dumpster."

Jud Valeski talks about the impossible balancing act:

"Thanks for the honesty. I'm sorry this happened. The apparent out-of-the-blue nature scares me. Things seem "fine" in my world. I fear coming home one day to an empty house and a note on the table.

There are so many conflicting desired behaviors between work and non-work relationships. In order to make my company succeed, I have to pour everything I have into it. In order to make my marriage and parenthood succeed, I have to pour everything I have into it. What the!?!

Over the past six months, since taking on the CEO position at work, my world has shifted. Even more than before, as a founder, my time is absorbed by work. I've never been good at balance, but I like to think I'm doing ok with it. I get support from my spouse (tremendous support), but I don't know how real it is. Not because she's potentially dishonest, but because I don't know if these are the kinds of things that even she can know in the moment (can any of us?). I can't ignore the frustration in her voice when talking to her on the phone on a work trip asking "how many more nights are you gone?"

For better or worse will be determined in time, but we've split the balance up at a unit level. We've curiously fallen straight back into the 1950's; I bring home the bacon, and she runs the house and kids. I try to be a good father by showing my children what hard work means. What it means to dedicate yourself. What it means to be passionate. What it means to run fast. What it means to pick yourself up after you face-plant. What it means to work and love. What it means to love work. Those are the examples I can give. That is my life. That is how I can contribute to our child rearing as a parent. It's not necessarily my preference, but it's what I can do, and do well, at the moment."

Adrian Ashton on the hidden costs behind work we admire:

"I occasionally get to guest lecture to entrepreneurs' clubs in colleges and universities and always end with an image of Edvard Munch's 'Scream' to illustrate the point that the piece of art has changed the world, touched and changed countless lives… and all it took was for the artist to have a breakdown."

Mitch Fillet on a time limit for startups:

"It is very hard to navigate the desire for achievement, the demands of your employer as they heap on both responsibility and compensation and the needs of your family.
Wrap those demands with some aging parents and a few children and it becomes an almost impossible cycle to shoulder for more than a year or two. That is why we speak about a 36 month exit of some sort. This, of course, does not mean an IPO. It just means a recognition that a combination of delegation, infrastructure build-out and possibly the inclusion of other stakeholders preserves the sanity and family of the founder group."

Nicholas Napp on the sacrifices you have to make:

"We talked and agreed that there needed to be more ground rules other than simply keeping to the truth. If you're not willing to sacrifice your relationships (something I am no longer willing to do) you have to make other compromises. That's why I now have a startup and a consulting business. Yes, VC's hate the idea, but they're not the people I want in my personal life.

For me, there has to be more balance. You can't stay happy and walk away from your passion and a desire to build things, but that passion can easily blow up the rest of your life. Being hyper focused on work is the natural tendency for an entrepreneur, but for most of us, I don't believe it's effective. You lose perspective, miss opportunities and make mistakes. Not that you don't make mistakes otherwise, but at least if my work life sinks, I have a rich personal life to anchor me. That gives me the chance to reset and try again."

Kin Lane on sharing the obsession:

"Definitely an area that young entrepreneurs do not consider. My marriage eventually ended after 10 years due to my chronic entrepreneurialism.

I still suffer from it, but manage it better these days. Also found an equally geeky, obsessive GF….and our obsessive world is shared."

The dark side of entrepreneurship

My work has never been better, but my personal life is in tatters – an eight-year relationship just ended. I probably shouldn't even be discussing this here, but writing it down helps as I try to make sense of what went wrong – what I did wrong.

There's no direct line I can draw between what happened and my startup life, but I have to wonder. Something people seldom talk about with entrepreneurship is how corrosive it can be to relationships. Founders are driven, and that doesn't make for a comfortable world for them or those around them. I always tried to make my home life the top priority, but my obsession is a part of me.

There are so many characteristics of entrepreneurs that make us hard to live with, from a constantly uncertain future to long working hours and a monomaniacal focus on our projects, but these tend to get lost in the classic Romantic mythology that is perpetuated by our willingness to lie about the reality of startup life. There are plenty of counter-examples that show it's possible to be a founder and a good spouse or parent, but the 'craziness' that drives us to build castles in the sky is not always a benign force.

Please, take a minute to think about the people you love, and be certain you're giving them everything they deserve, that you're truly there for them. I have lots of time to reflect on that, now that it's too late.

OpenHeatMap now supports states and provinces worldwide

One of the most frequent requests for OpenHeatMap has been better support for provinces and states outside of the ones I already offer. The bottleneck has been finding the data in a form I can use. With a lot of help from locals I found a handful of usable maps for India, Mexico and Canada, but it was a slow process. All that changed when I discovered the public domain Natural Earth data set. Taking one map containing top-level administrative districts for every country worldwide, I was able to extract the states and provinces for hundreds of nations. This means you can now upload a spreadsheet containing province names for almost any country and get a detailed map.
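The extraction step is conceptually simple: walk through every district in the worldwide map and bucket it by the country it belongs to. Here's a rough sketch of that idea in Python, assuming the map has already been loaded as a list of GeoJSON-style feature dictionaries. The `iso_a2` and `name` property names and the sample records are assumptions for illustration, not the actual OpenHeatMap pipeline, so check them against the attributes in the file you're working with.

```python
from collections import defaultdict


def split_by_country(features, country_key="iso_a2", name_key="name"):
    """Group top-level administrative districts by country code.

    `features` is a list of GeoJSON-style feature dicts. The property
    names are assumptions; check them against the file you actually load.
    """
    by_country = defaultdict(list)
    for feature in features:
        props = feature.get("properties", {})
        country = props.get(country_key)
        name = props.get(name_key)
        if country and name:  # skip districts with missing labels
            by_country[country].append(feature)
    return dict(by_country)


# Tiny hand-made sample, not real Natural Earth records.
sample = [
    {"properties": {"iso_a2": "AF", "name": "Kabul"}, "geometry": None},
    {"properties": {"iso_a2": "AF", "name": "Herat"}, "geometry": None},
    {"properties": {"iso_a2": "CA", "name": "Ontario"}, "geometry": None},
]
for code, districts in sorted(split_by_country(sample).items()):
    print(code, [d["properties"]["name"] for d in districts])
```

Each bucket can then be written out as its own per-country map, ready to be matched against the province names in an uploaded spreadsheet.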

This is the map you get when you upload Afghan provinces, and below is a complete list of the examples for the countries I support. I'm very excited to see how people are able to use this, so let me know how you get on.


Åland Islands
United Arab Emirates
Burkina Faso
Central African Republic
Cocos (Keeling) Islands
Cote D'Ivoire
The Democratic Republic of the Congo
Costa Rica
Christmas Island
Czech Republic
Dominican Republic
Faroe Islands
United Kingdom
The Gambia
Equatorial Guinea
French Guiana
Heard Island And McDonald Islands
Kyrgyz Republic
South Korea
Lao People's Democratic Republic
Libyan Arab Jamahiriya
Sri Lanka
Moldova, Republic Of
The Former Yugoslav Republic of Macedonia
New Caledonia
Norfolk Island
New Zealand
Papua New Guinea
Democratic People's Republic of Korea
Saudi Arabia
Svalbard And Jan Mayen
Solomon Islands
Sierra Leone
El Salvador
Serbia And Montenegro
Trinidad & Tobago
South Africa

Five short links

Photo by Joseph Robertson

OmniMark – Old-school but impressive tool for turning arbitrary semi-structured data into XML. I’ll be trying to learn from this as I look to improve my ETL process – via Kevin Marshall

The rise and fall of Swivel – So many lessons for any data startup in here. Swivel took several million dollars in funding before they had a plan of where they were going, and built a generic platform instead of a focused application targeted at users who would see some benefit. That they had fewer than ten paying customers, despite tens of thousands of registered users, is a good reminder of the work you have to put in to create revenue; you don’t just get a fixed percentage of active users upgrading – via Joe Parry

The Obese Surfer Problem – Russell explores a compelling visualization that serious surfers are willing to pay money for. I like the idea of ‘predictive models’ as a more general category for what I often talk about as recommendations. Showing you what could happen is a lot more valuable than just a rear-view mirror showing the history – via Russell Jurney

HexFiend – “A fast and clever open source hex editor for Mac OS X.” Does exactly what it says on the tin. I’ve been searching for a good hex editor since Codewright died, and so far it’s been great.

Benoit Mandelbrot is gone, but he shouldn’t be forgotten – A strong reminder to everyone: don’t assume a Gaussian for your probabilities if your events don’t follow that distribution. We have so much faith in numbers as summaries of reality, but like spherical cows, unrealistic assumptions can lurk behind the most solidly calculated figure – via Behavior Gap

Visualization myths around Snow’s cholera map


Thanks largely to Tufte's evangelization, John Snow's map of the 1854 cholera outbreak in Soho has become the classic example of the power of visualizations. I've just finished Steven Johnson's The Ghost Map, which tells the story behind the graphic, and it's surprisingly different from the simplified explanation that usually accompanies the picture.

The map wasn't that innovative

Snow wasn't the first person to draw these kinds of maps, he wasn't the first to draw them to track disease, and in fact he wasn't even the first person to map this particular outbreak! The Sewer Commission produced a very detailed map showing the death locations. The power of Snow's version came from his decision to leave out a lot of details (sewer locations, old grave sites, etc) that cluttered up the Commission's version. Their map was so muddled that it didn't tell a story, but Snow's was stripped down to show exactly what he needed to bolster his theory that the epidemic spread from the water pump.

The only technical innovation that Johnson identifies was his use of boundary lines to mark the areas that were closest to particular pumps by walking distance, to demonstrate that many of the cases nearer to other water sources as the crow flies were actually in the catchment area of the Broad Street pump. Unfortunately that version of the map is rarely shown, and Tufte himself dismisses it as "Voronoi baloney"!
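To make the walking-distance idea concrete, here's a small sketch of how you might compute those catchment areas today. It treats the streets as a graph and runs Dijkstra's shortest-path search outwards from every pump at once, so each location ends up labelled with whichever pump is closest on foot. This is my own illustration in Python, not a reconstruction of how Snow drew his boundaries, and the street network is a made-up toy rather than real Soho data.

```python
import heapq


def nearest_pump_by_walking(graph, pumps):
    """Label every street node with its closest pump by walking distance.

    `graph` maps node -> list of (neighbor, distance) pairs.
    Runs Dijkstra's algorithm from all pumps simultaneously.
    """
    best = {}  # node -> (distance, pump)
    heap = [(0.0, pump, pump) for pump in pumps]
    heapq.heapify(heap)
    while heap:
        dist, node, pump = heapq.heappop(heap)
        if node in best:
            continue  # already reached by a pump that is at least as close
        best[node] = (dist, pump)
        for neighbor, step in graph.get(node, []):
            if neighbor not in best:
                heapq.heappush(heap, (dist + step, neighbor, pump))
    return {node: pump for node, (dist, pump) in best.items()}


# Hypothetical toy street network, not Snow's data. The house is only
# reachable from the other pump by a long detour through far_corner.
streets = {
    "broad_st_pump": [("corner_a", 1.0)],
    "corner_a": [("broad_st_pump", 1.0), ("house", 1.0)],
    "house": [("corner_a", 1.0), ("far_corner", 4.0)],
    "other_pump": [("far_corner", 4.0)],
    "far_corner": [("other_pump", 4.0), ("house", 4.0)],
}
catchments = nearest_pump_by_walking(streets, ["broad_st_pump", "other_pump"])
print(catchments["house"])
```

A straight-line Voronoi diagram ignores the street layout entirely, which is why a house can sit nearer to one pump on paper while its residents actually walk to another.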

Theory came first

From the popular account it's easy to imagine that Snow plotted the deaths on his map, then the pump locations, and that triggered a revelation. In fact he'd been fighting for a decade to prove that cholera was a waterborne disease, not spread atmospherically as the miasma theory claimed. He'd already gathered a lot of evidence from the differing rates of the disease amongst neighbors using piped water from different suppliers. The map was a tool for "hypothesis testing", not "hypothesis generating".

Data gathering was the key

Together with Henry Whitehead and local doctors, Snow spent weeks going door-to-door gathering detailed information from area residents. He was then able to present that data as evidence for his theory in a variety of forms, including anecdotal case histories, numerical analyses and his maps. The key was that this hands-on experience with the raw data gave him the story he wanted to tell, and then he was able to make his argument using a variety of different presentation tools.

These two ideas are essential points for my work; a lot of the recent approaches to visualization assume that you can give ordinary people simple map or graph creation tools, and they'll be inspired to create powerful graphics. With OpenHeatMap I've concentrated on people who already have a story to tell: journalists, activists and other people who are highly motivated to make an argument. It's about empowering people who are looking for a solution, not hoping that we'll turn passive observers into active participants just by handing them the tools.

The map became marketing

The actual story and evidence behind Snow's work is complex and hard to explain. As his theory became widely accepted as a massive historical advance, the map came to stand as shorthand for the story behind it. After that, it was easy to imagine that the graphic was the central evidence of his report on the outbreak. In fact it was just one piece of evidence, but it was so accessible and easy to use as an illustration that it spread slowly but virally through different publications. As Johnson puts it in his book "the map was a triumph of marketing as much as empirical science".

This is something I've seen in my own work too. Visualizations are fantastic at engaging people; everyone loves maps. When it comes down to detailed analysis though, a spreadsheet or other list-based interface is almost always better. Maps and other visualizations tell stories so well because of how much they leave out, but textual representations still rule when it comes to actually working with the full data. Think of your visualizations as powerful marketing tools, as bait to get people in the door, but expect to offer them something deeper when they want to work with that data.

There's a lot more to the story than I can cover here, so if you've got any involvement in data analysis or visualization you should pick up The Ghost Map. It's full of lessons and is a gripping read on top. I also recommend this short academic paper, "Essential, Illustrative, or . . . Just Propaganda?", which argues for a different perspective on Snow's work than both the traditional popular account and Johnson's revised approach.