A simple PHP LinkedIn OAuth example

Photo by Ranger Gord

While I was researching my LinkedIn data scientists article I had coffee with Adam Trachtenberg. I haven't used the LinkedIn API, because when I last looked into it there was no way to find users just by their names and locations. I was happy to discover that the People Search API now makes this straightforward, so in the last few days I've been researching how I can integrate this into my work.

The biggest obstacle was getting past the OAuth login stage, since implementing a secure protocol over plain http means a convoluted dance, and no two vendors do it quite the same. There are a few other examples out there, but I adapted my Gmail/IMAP OAuth PHP code to work with their setup. For your delectation and delight, I present LinkedInOAuthExample, now live on github. It's written to be as concise and dependency-free as possible, but I still find the steps involved somewhat mind-bending. OAuth 2.0 is a lot cleaner, at the cost of requiring an https server, and I hope that will become the default for future APIs.
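To give a flavor of why the dance over plain http is convoluted, here is a minimal Python sketch of the HMAC-SHA1 request signing that OAuth 1.0 asks for (as described in RFC 5849). It is not the code from LinkedInOAuthExample, which is PHP; the function names are my own, and it assumes the URL is already normalized, carries no query string, and that no parameter name is repeated.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value: str) -> str:
    # RFC 5849 section 3.6: only letters, digits, '-', '.', '_' and '~' stay unescaped.
    return quote(value, safe="-._~")


def signature_base_string(method: str, url: str, params: dict) -> str:
    # Encode every name and value, sort the pairs, and join them as name=value&...
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in encoded)
    # The base string is METHOD & encoded URL & encoded parameter string.
    return "&".join([method.upper(), percent_encode(url), percent_encode(normalized)])


def sign_request(method, url, params, consumer_secret, token_secret=""):
    # The signing key is the two secrets joined by '&'; the token secret is
    # empty until the provider has issued a token.
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("utf-8"), base.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")
```

The result goes into the request as the oauth_signature parameter. Every step has to match the server's byte for byte, which is where most of the mind-bending comes from.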

Five short links

Photo by Erwin Morales

Insurers Test Data Profiles to Identify Risky Clients – Intriguing research from the WSJ that sheds light on possible downsides of all the information we’re sharing online. The concrete examples seem highly unlikely to pan out though; we’ve had detailed household purchase data available for decades, and insurers haven’t found it useful.

IP over Avian Carriers – My favorite part is the test implementation.

A Bully Finds a Pulpit on the Web – The writing and human-interest angles on this article are excellent, but its central thesis that Get Satisfaction and other review services are boosting complained-about sites’ PageRank is completely wrong. They use rel="nofollow" specifically to avoid that sort of manipulation by spammers, as do all the major services that support user-generated content. Even Thor from GetSatisfaction was baffled – “The article approaches SEO in near-mystical terms”.

HMS Invincible – Interested in a cheap aircraft carrier? I remember walking around this ship on Navy Days (my grandad was in the service), so it’s a bit of a shock to discover it’s being sold off.

The Xenotext Experiment – Encoding poetry on the genome of a bacterium capable of surviving heavy radiation and hard vacuum. If this works, the poem will endure until the sun burns out.

Hiking Golden Gate Park from Ocean Beach


Every few minutes a street car pulls up below my window, promising a trip to Ocean Beach. With its echoes of Dark City, I knew I had to find an excuse to see where it took me. My bike's still in transit from Colorado and thanks to the general chaos of moving Thor and I have been getting less exercise than we're used to, so I decided to plan an urban hike using the Muni N line as a shuttle. I got on at the corner of Duboce and Church, but you can pick it up at a lot of points between there and downtown too.

Getting there was easy, if a bit confusing. I read up on the system beforehand, so I had Thor in a container and was prepared to get on at the front and pay $4 total, $2 for each of us. The driver refused to take my money though, just thrust a ticket into my hand, mumbling something I couldn't make out! Very odd, but the rest of the journey was a lot smoother. Within fifteen minutes, we'd arrived at the end of the line. When we walked across the street onto the beach, the surf was roaring. There were a few brave souls with boards, but it was a lot fiercer than I was used to around LA.


Walking a few blocks north along the beach, we turned inland by the Beach Chalet and entered Golden Gate Park proper. If you want to start with some food, or do this hike in the opposite direction and end with lunch, I highly recommend the restaurant. Some friends treated me to a delicious meal there recently, and if you get a window table the views are jaw-dropping.

Once in the park, there are a lot of trails to choose from. I recommend wandering where your fancy takes you, since there are a lot of hidden treats to stumble across. I ran into a great dog park near the start, so Thor got even more exercise this morning than I bargained for. I did find myself referring to my compass though; with so many winding trails to pick from it's easy to end up wandering in circles.

It was Sunday so the park was busy, but it never felt crowded. We wandered past lakes, barbecue pits, meadows and, near the end, the de Young art gallery (which a group of us had tried and failed to visit yesterday, thanks to a five-hour wait for entry to the Impressionists exhibition). All the trail users we met were mellow and friendly, even bikers caught behind the occasional knot of walkers.

Leaving the main park, we continued along the panhandle section that parallels Haight. In previous explorations I've taken Haight itself, but I found the western end a bit sketchy; I had to pick my way through a gauntlet of cheery drug dealers even in mid-afternoon.

When the panhandle ended, we had about another mile to go along Oak to get home. It was roughly six miles, with some respectable grades but no steep hills, and it took a little over two hours to complete. If you're looking for a straightforward hike through some beautiful scenery, right from the center of San Francisco, this one seems hard to beat, especially when you can customize the trip by hopping back on the N Line at any point.

Five short links

Photo by Ocean of Stars

Changing behavior – it ain't easy – The fundamental challenge for any startup is how to change people's behavior so that they begin using your service. This article is about energy conservation, but everything in it applies to us too. The bottom line is that "changing people's habits in a systematic way turns out to be a painstaking, labor-intensive undertaking". The solution sure sounds like a lean-startup approach:

"You research the barriers and benefits that are most salient or motivating. You don't guess at them — as people all too frequently do — because they're often surprising or counterintuitive. You do the research. Then you select promising methods to reduce barriers and increase benefits and run a small pilot program; you keep at that until you find what works. Then, and only then, you scale up"

Steal this heatmap – Thanks to Lauren Kirchner at the Columbia Journalism Review for this great writeup of how OpenHeatMap can help journalists.

Probabilistic Data Structures and Breaking Down Big Sequence Data – A great talk on using tools like Bloom filters to handle problems at a massive scale, with slides available here.

Eye on Earth – An intriguing experiment in showing pollution data on an interactive online map, funded by the EU. I really like the concept, but I was somewhat baffled by the interface.

A world of tweets – A simple but compelling view of Twitter messages from around the world. I particularly like the animation showing incoming tweets, which helps the viewer make sense of what's happening.

Why we need startup lies

Photo by Sally Crossthwaite

A lot of people have got very frustrated with the tendency for successful startup founders to embellish their company's creation myth until it becomes an outright lie – Pez dispensers anyone? I'm driven crazy by the misinformation floating around too, but I'm wary of destroying the beneficial aspects of the startup culture in an attempt to stamp out the lies. As Tom Evslin puts it, "nothing great has been accomplished without irrational exuberance". If founders were truly sensible, all of us would be squirming our way up the corporate ladder instead of trying to build castles in the sky, because the probabilities of success there are way better.

These thoughts came to the front of my mind when I read The Grandeur of Glory, an exploration of the role of stories based around a research paper on sheep foraging patterns. The conclusion of the study was that you need a certain number of 'risk-taking' sheep who will wander widely to find new patches of grass, along with a large population of more sedentary animals to safely graze the known resources. What's intriguing about stories in human cultures is that they're almost exclusively dedicated to lionizing risk-takers. This might seem a given, but there's no fundamental reason why we should find Achilles a more interesting person than someone who's spent an uneventful forty years in accounting. My favorite definition of drama is simply "There's a problem". Our stories all center around characters tackling those problems, people with something at stake, taking risks, and Brendan advances a theory about why:

"Does this suggest that humans are too predisposed toward meekness, so that we require cultural encouragement to develop a sufficient number of risk takers to sustain the species?"

This feels intuitively right for the startup world. Everybody in positions of power has an interest in encouraging young entrepreneurs to jump in and take risks, so we all tell stories that at the very least involve some creative editing to paint a rosy picture of the process. The obvious downside is that we may be driving people with slim hopes of success into sinking their life savings and endangering their relationships in pursuit of a mirage.

To my mind, that's a price we have to pay. I grew up in Britain with a much more 'sensible' culture, and the priority given to safety drove me crazy. I need some risk in my life, and startups are a great way of channeling that tendency in a way that hopefully benefits society at large. The romanticization and mythology that surrounds the work that I do is definitely part of why I stick at it. I love having interesting stories to tell, since it means I get something from even the most abject failure.

My hope is that we can unearth more complex, true-to-life tales from the startup world. I find it a lot more inspiring to hear from someone who's been repeatedly knocked down, and picked themselves up every time, than a sanitized tale from a founder who seemed to stumble across a winning formula with no false starts. Encouraging more founders to blog would be a good start, and Tim Bull's site is a great example of how this can work. What we shouldn't aim for is the eradication of the whole mythology that has emerged around startups. It's actually the main reason we have an entrepreneurial community here in the US. The stories we tell each other (and ourselves) may be misleading, but they're necessary.

The problems with cloud computing

Photo by Tipiro

Carlos Ble's recent post about his experiences with Google AppEngine has made quite an impact. I found myself nodding along to almost all his points. I've evaluated GAE and found it unsuitable for my data processing work, largely because of the time limits it imposes. This isn't really a criticism of the system; it's just not the right tool for the job. Most of the limitations Carlos hit can be found by studying the documentation and mocking up a few test apps.

The problem is, the cloud has been oversold. I'm an enthusiast, and I couldn't have done most of what I've done in the past two years without EC2, but it's no magic bullet. The cloud is like any other software engineering framework: the price you pay for the added convenience is an extra dependency in your stack. Here are the drawbacks:

Reliability. The hardest problems to track down are failures that are caused by the underlying virtualization infrastructure. Carlos discusses the outages he hit, but it's not just Google; this is a general issue with all cloud-based services. Early in OpenHeatMap's development one of Amazon's load balancers failed mysteriously, killing my site.

Apathy. Neither Google nor Amazon would really care if their cloud services divisions disappeared. The revenue from those divisions is dwarfed by the rest of the business; they're primarily a strategic bet and a way of re-using existing expertise. In practice this means that they sit low on the corporate totem pole and don't have the resources or will to respond actively to issues. My load balancer problem never got resolved, despite my continuously harassing the forums, sending emails to support, even tracking down AWS engineers at conferences. They just blew me off, even though I've spent $1000+ with them every month for the last two years. It's not just me either; check out Jud's experiences trying to delete millions of files from S3.

Why not just buy premium support? There are cases where I'd consider it, but in my experience companies that are useless at free support don't magically become awesome once you hand them more money. It's baked into the culture – instead of ignoring your emails, they'll fire back detailed replies giving you 16-point checklists that will take you two days to complete, and still won't resolve the problem. Fundamentally, you don't have a 'throat you can choke' to get real help.

Conformity. If you use AppEngine for simple page serving and data entry forms you'll be fine. Amazon is great for running your own LAMP server. That's what these services were designed for and lots of people are using them that way, so you're unlikely to hit many problems. It's once you stray off that well-lit path that you stumble upon issues that the system designers hadn't anticipated, and bugs that escaped their testing.

I'm still using EC2, but trying to keep my usage as simple as possible. I've spoken to several startups recently who are trying to figure out how to host their services and laid out my experiences. At least one of them has gone for Rackspace, primarily for the support. I've considered that too; unlike the big players, hosting is the central focus of their business and so they're unlikely to be apathetic. The downside is that there are a lot fewer people using them, making it harder to find solutions and tutorials.

I love the cloud, and it's made things possible I could barely imagine a few years ago, but there's no free lunch. Before you build your business around a cloud infrastructure, make sure you've factored the tradeoffs into your plan.

The unsung heroes of email


I spent most of my career working in the desktop world, and despite the obvious shift towards web-based applications, there are still hundreds of millions of users who rely on Outlook. I was one of those users for a decade, and I still haven't found anything online that functions nearly so well as an integrated communications hub. It's a shame that desktop-only startups don't get their fair share of attention, so I was pleased to see ClearContext's latest Outlook plugin get some love from GigaOm.

I've known their founder Deva Hazarika for several years, and was initially intrigued when I learned he'd raised his seed funding by playing poker! It's an unusual company in a lot of other ways too. They have an old-fashioned business model in that they are focused on making money through selling downloadable software to customers. What a crazy idea, eh? It's led them to be fanatical about improving their product, since they have to wow their end users in order to make any money.

The Personal edition they've just launched is interesting because it's their first free offering, and works as a showcase for some of the cool features they add on to Outlook. In particular they bring a priority inbox-like experience to the system, along with some very handy pre-populated rules for recognizing and filing common bacn emails like Facebook notifications.

Anyway, if you're an Exchange user looking jealously at the latest Gmail Labs creations, give ClearContext a try. They add a lot to Outlook.

Biking over the Golden Gate Bridge – Downtown San Francisco to Sausalito

My apartment hunting was over more quickly than I expected, and I’ve been enjoying the San Francisco food far too much, so I decided to go for a bike ride this morning. With the Golden Gate bridge looming, how could I resist? The guys at Blazing Saddles gave me some general directions when I rented my bike, but there were a few sections that were tricky to follow, so I’ve mapped the route I ended up taking from downtown San Francisco, and I give my own directions below. I rode from the city to Sausalito and back again, with a nice lunch spot at the turn-around point, and it was about 22 miles total with some non-trivial hills.

For convenience I’ve started the map at Powell Street BART station, but I actually did a more complex route through downtown, since I had errands to run. There’s a great bike lane down Market Street though, so I recommend taking that south until you get to Page Street. This is then a straight shot to Golden Gate park, and it’s mostly quiet, residential and bike friendly, though you will need to navigate frequent stop signs. In the park itself you take bike paths through the north-east corner to Arguello Boulevard. That heads north towards the massive Presidio park, and it can be a little tough dealing with the traffic despite the bike lane. Once in the park, continue along Arguello until Washington splits off to the left. Washington will take you up and over the ridge, and then merges into Lincoln. A little way down Lincoln, turn left on Merchant, and follow that until you’re approaching the PCH/101.

This was the hardest part of the ride for me to follow. I knew that on weekends, the western sidewalk of the bridge was reserved for cyclists, but I had a hell of a time figuring out how to get there. Eventually I worked out you had to take the tunnel under the roadway from Merchant, then bike up on the east side past the gift shop, then follow another tunnelled path under the roadway back to the western side and the bridge’s sidewalk.

The bridge itself was fantastic to bike on, even on a busy Sunday morning. The weather was clear and sunny with breath-taking views across the water, the grade was steady and even the most aggressive bikers had plenty of room to swerve around tourists. At the other end you come off into a parking lot, and can then head onto Alexander Avenue to Sausalito. This can be a bit hairy, since the road is narrow and steep in sections, but the drivers seemed quite bike-aware thankfully. In Sausalito itself, you end up riding along the waterfront. There are a lot of choices for refreshments, but I was intrigued by a tiny hole-in-the-wall place that simply promised ‘Hamburgers’:


Yelp gave it a thumbs-up, so I ventured in for a cheeseburger and fries. Man, that tasted good! They’re apparently a bit Soup-Nazi in their customer service, but I didn’t see anything other than slight surliness. It was lucky I got there early because they soon had a queue out the door, so I guess I’m not the only one to be hooked by their food. There’s almost no space to eat inside the restaurant, but the park across the road had a gorgeous view over the bay and plenty of benches.

Blazing Saddles had suggested taking the ferry back to the city, but after that burger I had a whole new set of calories to burn, so I headed back the way I’d come. It was pretty easy to follow the same route, though I do recommend a detour to the Dolores Park Cafe, which I checked out since it’s only a block from my new place. Their hot chocolate hit the spot, though you do need to be a fan of canines since it’s across the street from a great dog park.

If you find yourself in San Francisco with a bike and a few hours to spare, I highly recommend this ride. You’ll pass through some funky neighborhoods like Haight and two gorgeous parks, and experience the Golden Gate bridge from a unique angle. It will also help justify all those lovely Bay Area meals that are tempting you…

Five short links

Photo by Metrix X

How to not log personally-identifiable information – IP addresses are PII, so removing them from your server logs should be standard practice unless you have a specific need.

Inside Google's MapReduce infrastructure – Bloody hell, they're processing one exabyte of data a month! I didn't even know the term for a million terabytes before, that's an astonishing number.

Netflix cloud storage – A white paper on Netflix's use of SimpleDB. I have to admit I've given up on it as a solution, since the obstacles to large data loads overwhelmed me, but it's great to see they've had success.

Feedera – An intriguing take on 'personalized pagerank' for surfacing interesting Twitter articles. Had a great geek out last night with its creator Sachin Rekhi too.

Tealeaf – Remember that data entry field that cost Expedia $12m in lost sales? Julian Green had lots of similar tales to tell from his eBay experiences, and apparently Tealeaf is a great tool for analyzing and diagnosing that sort of customer behavior.
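On the first link above, about keeping IP addresses out of your logs, here is a rough Python sketch of one way to scrub them from lines that have already been written. The function name and placeholder are my own invention rather than anything from the linked article. It only handles dotted IPv4 addresses, and it will also blank out anything else that looks like four dot-separated numbers, such as a four-part version string.

```python
import re

# Four dot-separated groups of one to three digits. This is deliberately loose:
# it does not check that each group is 255 or less, and it ignores IPv6.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def scrub_ipv4(line: str, placeholder: str = "0.0.0.0") -> str:
    # Replace every IPv4-looking token in the log line with the placeholder.
    return IPV4_PATTERN.sub(placeholder, line)


if __name__ == "__main__":
    sample = '203.0.113.7 - - [10/Oct/2010:13:55:36 -0700] "GET / HTTP/1.1" 200 2326'
    print(scrub_ipv4(sample))
```

Not writing the address in the first place is cleaner if your server lets you change its log format, but a filter like this can help with log files you already have.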

Mapping apartment prices in San Francisco


I've decided to move to San Francisco. Much as I love Boulder, over the past year I've found so many of the people I'm working with or would like to work with are in the Bay Area that I'm making the leap.

Being the data-driven geek that I am, I wanted to understand the rental situation for different neighborhoods before I chose an apartment, so I watched Craigslist for a couple of weeks and collected 160 properties from around the city that met my basic criteria of having one bedroom and accepting a small dog. Once I had those in a CSV file, I could then upload them to OpenHeatMap to get this visualization:


This actually helped me narrow down my search, since it made it clear that SoMa and the Embarcadero area were too pricey, and pointed me towards Mission and Lower Nob as my main targets. Over the next few days I'm going to be trying to pick out an apartment, and then moving right after I speak at Defrag next week.
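If you'd rather check the numbers before mapping them, the CSV step described above is easy to sketch in Python. This is only an illustration: the column names (neighborhood, rent) and the sample rows are made-up placeholders, not my actual Craigslist data and not necessarily the layout OpenHeatMap expects.

```python
import csv
import io
from collections import defaultdict


def average_rent_by_neighborhood(csv_text: str) -> dict:
    # Maps each neighborhood to [sum of rents, number of listings].
    totals = defaultdict(lambda: [0.0, 0])
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["neighborhood"].strip()]
        entry[0] += float(row["rent"])
        entry[1] += 1
    # Turn the running totals into a mean rent per neighborhood.
    return {name: total / count for name, (total, count) in totals.items()}


# Invented placeholder rows, purely to show the shape of the input.
SAMPLE = """neighborhood,rent
SoMa,2400
SoMa,2600
Mission,1700
Mission,1900
"""

if __name__ == "__main__":
    for name, rent in average_rent_by_neighborhood(SAMPLE).items():
        print(f"{name}: ${rent:.0f}")
```

Averaging per neighborhood throws away the spread within each area, which is part of why I found the map itself more useful for narrowing things down.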

If you like this sort of thing, you should also check out PadMapper, a Y Combinator startup that has a great apartment search interface. I need to get them using OpenHeatMap though, as I'd love to see their data in a non-map-pin form!