Does reality improve when your numbers do?

Photo by Judy and Ed

I had a tough meeting with an advisor this week. I was proudly showing off how we've managed to triple the amount of time that first-time users spend on Jetpac, when he interrupted. He wanted to know why he should care. It forced me to quickly run him backwards through our decision-making process, looking at why we'd chosen that as one of the numbers we wanted to improve. We'd started there because we noticed that our most successful users, those who enjoy the app enough to keep coming back, tend to interact with the app a lot on their initial visit. Users who take more actions spend more time on the app; the correlation has always been strong in our case, so time was a good approximation of how much they were interacting. That had become the goal, and I had been so focused on it that it took me a moment to reconstruct how we'd got there.
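
If you want to sanity-check a proxy metric like this yourself, a quick correlation calculation over your logs is enough. Here's a minimal JavaScript sketch; the per-user numbers are made up for illustration:

function pearsonCorrelation(xs, ys) {
  // Standard Pearson coefficient: 1.0 means perfectly correlated,
  // 0 means no linear relationship at all.
  var n = xs.length;
  var sumX = 0, sumY = 0, sumXY = 0, sumXX = 0, sumYY = 0;
  for (var i = 0; i < n; i++) {
    sumX += xs[i];
    sumY += ys[i];
    sumXY += xs[i] * ys[i];
    sumXX += xs[i] * xs[i];
    sumYY += ys[i] * ys[i];
  }
  var numerator = (n * sumXY) - (sumX * sumY);
  var denominator = Math.sqrt(((n * sumXX) - (sumX * sumX)) * ((n * sumYY) - (sumY * sumY)));
  return denominator === 0 ? 0 : numerator / denominator;
}

// Hypothetical data: one entry per first-time user.
var actionsTaken = [12, 3, 45, 7, 22];
var minutesOnApp = [8, 2, 30, 5, 14];
console.log(pearsonCorrelation(actionsTaken, minutesOnApp));

A coefficient near 1.0 suggests the proxy is still tracking the behavior you care about; if it drifts toward zero, the metric has come unmoored from reality.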

The dangerous part was that there were lots of ways we could keep users on the app longer without improving the experience at all, or even while making it worse! Luckily we have a lot of different ways of understanding how the experience is holding up, from surveys and crowd-sourced user tests to contacts with power users, but it's still a risk.

When I was in college, a lecturer who was a grizzled engineering veteran warned us "You'll start off wanting to measure what you value, but you'll end up valuing what you can measure". You need to have fresh eyes looking at how you're evaluating your own progress, not only so you avoid the more obvious problem of vanity metrics, but also so you don't follow your numbers down a rabbit hole. Any measure is only a projected shadow of reality. When somebody asks "So what?", you always need to be able to point to something in the outside world that gets better when the metric does!

What should a lead engineer code on?

Photo by Cindy Cornett Seigel

If you're a programmer who's been thrust into management, you'll probably want to keep coding. It's the only way to truly understand what's happening inside the engineering team, and nobody wants to become a pointy-haired boss. Your non-programming responsibilities will take a lot of your time though, so how can you pick the right tasks to take on? I've worked with several outstanding lead engineers at Apple and elsewhere, and here's what I've noticed about what their coding responsibilities look like.

Boring

The only way to motivate good hackers is to give them something interesting and challenging to work on. As a greybeard engineer, you've probably spent your career fighting for the chance to work on tough, rewarding problems, so your reflex will be to jump in on the most daunting and fun tasks. If you're a good manager, you'll stop yourself! Look instead for tasks that nobody else wants to take on. You shouldn't need the motivation yourself; leading the team should be enough, and you'll be able to offer your engineers a more rewarding set of work. It also builds respect for you in the team if they can see you're willing to sacrifice something meaningful for their benefit.

Ubiquitous

In the short-lived Police Squad, Johnny Shoeshine always supplied the 'word on the street' for all sorts of implausible topics. Being a lead is a lot like that! You need to know the nitty-gritty details of what's happening in the code base, and understand intimately how it's evolving so you can offer meaningful advice and head off potential problems early. The only way to do that is to touch as much of the code base as possible as often as possible. That means picking tasks that are cross-module, whether it's integrating multiple parts of the code, or a service that's used everywhere.

Non-blocking

The sad reality of a manager's life is that you're unpredictably called away from your day-to-day duties, especially when deadlines are looming. That can be disastrous if other team members are relying on you to deliver code so they can make progress, or if bugs are going unfixed because you're unavailable. You need to find something that can be worked on incrementally in small chunks, and that doesn't prevent others from making progress if you do get waylaid for a week.

Following this philosophy, one of the things I've ended up building is our activity log analysis system. Nobody else wants to work on it, recording events almost everywhere in the code means it touches every module, and it doesn't stop us shipping if improvements get delayed.
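
To make that concrete, here's a rough sketch of the kind of event-recording helper involved; the names and endpoint here are hypothetical stand-ins, not our actual code:

// Minimal activity logger: modules record named events, and batches
// are periodically posted to a (hypothetical) collection endpoint.
var ActivityLog = (function () {
  var queue = [];

  function record(eventName, properties) {
    queue.push({
      name: eventName,
      properties: properties || {},
      timestamp: Date.now()
    });
  }

  function flush() {
    if (queue.length === 0) { return; }
    var batch = queue;
    queue = [];
    var request = new XMLHttpRequest();
    request.open('POST', '/log/events', true);
    request.setRequestHeader('Content-Type', 'application/json');
    request.send(JSON.stringify(batch));
  }

  // Batching keeps the network chatter down, and a dropped batch
  // never blocks the rest of the app.
  setInterval(flush, 10000);

  return { record: record };
}());

// Call sites like this end up scattered through every module:
ActivityLog.record('photo_viewed', { photoId: 42 });

Because those call sites touch every module, working on the logger keeps you in contact with the whole code base.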

If you're a lead, give boring a chance; you'll be amazed at how effective an approach it can be!

How I learned to stop meddling


I ran across Fred Wilson's latest post this morning, and I have something to confess: I'm a meddler. If I see someone struggling with a task I know well, I have a strong urge to jump in and 'help'. This isn't always a bad thing; in the past it's helped me train up more junior folks, and experienced folks could always tell me to go take a hike.

That's all changed since I became a CTO. Even though it's a small team, I'm a 'boss', which means that people are prone to humoring me more. It took me a while to realize, but no matter how diplomatic I think I am, my guys don't feel as comfortable telling me to bugger off.

Over the last couple of months, I've had to learn a new style of interacting with them. Instead of giving 'helpful' suggestions on the best approach to solving a problem, I'll lay out the goals and some thoughts at the start, and then step back and let them find their own path to an implementation. I'm always available to answer questions and give advice when they ask for it, and we'll often do an informal post-mortem on what did and didn't work at the end of the sprint, but otherwise I try to give them the freedom to code their own way.

I'm lucky enough to be working with a bunch of very smart folks, so the results have been impressive; the solutions have been much more imaginative and effective than they were before. It's been humbling to see how strong a negative effect my frequent interventions had, but thinking back on my own career it makes sense. "Voice and choice" were the keys to the jobs I loved. When I was involved in planning my own work, and then made the decisions about how to tackle it, it turned from a servile task I was grudgingly performing for someone else into my project, one I worked extra hard on because I truly felt ownership. I would even go out of my way to work in areas that were difficult and unpopular, because those were the ones where I had the most freedom. Nobody wanted to interfere with my work on the video format conversion code in Motion, for fear they'd be pulled into the quagmire too!

The liberating thing has been how much it has freed me up to work on other vital parts of my job, but that's a subject for another post. If any of this is sounding familiar to you, try really giving your team voice and choice, you'll be amazed at the results!

Five short links

Photo by Ryan Somma

DocHive – Transforming scanned documents into data. A lot of the building blocks for this already exist as open source; the hard part has always been building something that non-technical people can use, so I'm looking forward to seeing what a journalist-driven approach will produce.

How I helped create a flawed mental health system – There are a lot more homeless people sleeping on my block in San Francisco than there were even just two years ago. I'm driven to distraction by the problems they bring, but this personal story reminded me that they're all some parent's son.

Can you parse HTML using regular expressions? – An unlikely title for some of the funniest writing I've read in months.

Forest Monitoring for Action – A great project analyzing satellite photos to produce data about ecological damage around the world. I ran across this at the SF Open Data meetup; it's well worth attending if this sort of thing floats your boat.

Data visualization tools – A nicely presented and well-curated collection.

Why you should try UserTesting.com

Photo by Zen Sutherland

If you're building a website or app, you need to be using UserTesting.com, a service that crowd-sources QA. I don't say that about many services, and I have no connection with the company (a co-worker actually discovered them), but they've transformed how we do testing. We used to have to stalk coffee shops and pester friends-of-friends to find people who'd never seen Jetpac before and were willing to spend half an hour of their lives being recorded while they checked it out. The whole process took so much valuable time that we'd only do it a few times a month, which made life tough for the engineering team as the app grew more complex. We have unit tests, automated Selenium tests, and internal QA, but because we're so dependent on data caching and crunching, a lot of things only go wrong when a completely new user first logs into the system.

These are the steps to getting a test running:

- Specify what kind of users you need. In our case we look for people between 15 and 40 years old, with over 100 friends on Facebook, who've never used Jetpac before, and who have an iPad with iOS 5 or greater.

- Write a list of tasks you want them to perform. For us, this is simply opening up the app, signing in with Facebook, and using various features.

- Prepare a list of questions you'd like them to answer at the end. We ask for their overall rating of the app, as well as questions about how easy particular features are to find and use.

Once you've prepared those, you have a template that you can re-use repeatedly, so new tests can be started with just a few seconds of effort. The final step is paying! It does cost $39 per user, so it's not something you want to overuse, but it saves so much development time that it's well worth it for us.

It normally takes an hour or two for our three-user test batches to be completed, and at the end we're emailed links to screencasts of each tester using the app. Since we're on the iPad, the videos are taken using a webcam pointing at the device on a desk, which sounds hacky but works surprisingly well. All of the users so far have been great about giving a running commentary on what they're seeing and thinking as they go through the app, which has been invaluable as product feedback. It's actually often better than the feedback we get from being in the room with users, since they're a lot more self-conscious then!

The whole process is a pleasure, with a lot of thoughtful touches throughout the interface, like the option to play back the videos at double speed. The support staff have been very helpful too, especially Matt and Holly, who offered to refund two tests when I accidentally cc-ed them on an unhappy email about the bugs we were hitting in our product.

The best thing about discovering UserTesting.com has been how it has changed our development process. We can suddenly get far more information than we could before about how real users are experiencing the app in the wild. It has dramatically lowered the barrier to running full-blown user tests, which means we do a lot more of them, catch bugs faster, and can fix them more easily. I don't want to sound like too much of an infomercial, but it's really been a godsend to us, and I highly recommend you check them out too.

Strange UIWebView script caching problems

Photo by Clio20

I've just spent several days tracking down a serious but hard-to-reproduce bug, so I wanted to leave a trail of Googleable breadcrumbs for anyone else who's hitting similar symptoms.

As some background, Jetpac's iPad app uses a UIWebView to host a complex single-page web application. There are a lot of independent scripts that we normally minify down into a handful of compressed files in production. Over the last few weeks, a significant percentage of our new users have had the app hang on them the first time they loaded it. We couldn't reproduce their problems in-house, which made debugging what was going wrong tough.

From the logging, it seemed like our app-setup JavaScript code was failing, so the interface never appeared. The strange thing was that it was rarely the same error, and often the error locations and line numbers wouldn't match the known file contents, even after we switched to non-minified files. Eventually we narrowed it down to the text content of some of our <script> tags being pulled from a different <script> tag elsewhere in the file, seemingly at random!

That's going to be hard to swallow, so here's the evidence to back up what we were seeing:

We had client-side logging statements within each script's content, describing what code was being executed at what time, combined with <script> onload handlers that logged what src had just been processed. Normal operation would look like this:

Executing module storage.js

Loaded script with src 'https://www.jetpac.com/js/modules/storage.js'

Executing module profile.js

Loaded script with src 'https://www.jetpac.com/js/modules/profile.js'

Executing module nudges.js

Loaded script with src 'https://www.jetpac.com/js/modules/nudges.js'

In the error case, we'd see something like this:

Executing module storage.js

Loaded script with src 'https://www.jetpac.com/js/modules/storage.js'

Executing module profile.js

Loaded script with src 'https://www.jetpac.com/js/modules/profile.js'

Executing module storage.js

Loaded script with src 'https://www.jetpac.com/js/modules/nudges.js'

Notice that the third script thinks it's loading nudges.js, but the content comes from storage.js!

Ok, so maybe the Jetpac server is sending the wrong content? We were able to confirm through the access log that the file with the bogus content (nudges.js in the example above) was never requested from the server. We saw the same pattern every time we managed to reproduce this, and could never reproduce it with the same code in a browser.

As a clincher, we were able to confirm that the content of the bogus files was incorrect using the iOS 6 web inspector.

The downside is that we can't trigger the problem often enough to create reliable reproduction steps or a test app, so we can't chase down the underlying cause much further. It has prompted us to change our cache-control headers, since it looks like something is going wrong with the iOS caching, and the logging has also given us a fairly reliable way of spotting after the fact that the error has happened. Since it's so intermittent, we trigger a page reload if we know we've lost our marbles. This generally fixes the problem, since it does seem so timing-dependent, though the hackiness of the workaround doesn't leave me with a happy feeling!
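
For reference, the header change is along these lines; this sketch uses Express purely as an example server, which is an assumption rather than a description of our actual stack:

// Force revalidation of script files so a stale or corrupted cache
// entry can't be silently reused. Despite the name, 'no-cache' still
// allows caching, but requires checking with the server before each reuse.
var express = require('express');
var app = express();

app.use('/js', function (req, res, next) {
  res.setHeader('Cache-Control', 'no-cache, must-revalidate');
  next();
});

app.use('/js', express.static(__dirname + '/public/js'));

app.listen(3000);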

If you think you're hitting the same issue, my bet is you aren't! It's pretty rare even for us, but if you want to confirm, try adding logging like this to your script tags, and log inside each .js file to keep track of which one you think is loading:

<script src="foo.js" onload="console.log('loaded foo.js');"></script>

In foo.js:

console.log('executing foo.js');

Comparing the streams of log statements will tell you if things are going wrong. You'd expect every 'executing foo.js' to be followed by a 'loaded foo.js' in the logs, unless you're using defer or async attributes.
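
If you want to automate that comparison, you can keep the two streams in parallel and reload when they disagree. This is a sketch of the general idea rather than our exact code, and it assumes your scripts run synchronously and in order:

// Define these in an inline script before any external scripts load.
// Each script pushes its own name as it executes, and each tag's
// onload handler reports the src it claims to have loaded; with
// synchronous, in-order scripts the two should always pair up.
var executedScripts = [];

function onScriptExecuted(name) {
  executedScripts.push(name);
}

function onScriptLoaded(src) {
  var ran = executedScripts.shift();
  console.log('Loaded ' + src + ', executed ' + ran);
  if (!ran || src.indexOf(ran) === -1) {
    // The wrong content ran for this src, so the cache is suspect;
    // since the bug is timing-dependent, a reload usually fixes it.
    window.location.reload();
  }
}

Then each tag and file gets instrumented like this:

<script src="foo.js" onload="onScriptLoaded(this.src);"></script>

// at the top of foo.js:
onScriptExecuted('foo.js');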

Things users don’t care about

Photo by DJ Badly

How long you spent on it.

How hard it was to implement.

How clean your architecture is.

How extensible it is.

How well it runs on your machine.

How great it will be once all their friends are on it.

How amazing the next version will be.

Whose fault the problems are.

What you think they should be interested in.

What you expected.

What you were promised.

How important this is to you.


I have to keep relearning these lessons. Finding an experience that people love is far more precious and rare than most of us realize.