How to add a brain to your smart phone

I am totally convinced that deep learning approaches to hard AI are going to change our world, especially when they’re running on cheap networked devices scattered everywhere. I’m a believer because I’ve seen how good the results can be on image recognition, but I understand why so many experienced engineers are skeptical. It sounds too good to be true, and we’ve all been let down by AI promises in the past.

That’s why I’ve decided to release DeepBeliefSDK, an iOS version of the deep learning approach that has taken the computer vision world by storm. In technical terms it’s a framework that implements the full Krizhevsky stack of 60 million neural network connections, with a customizable top layer inspired by the Decaf approach. It does all this in under 300ms on an iPhone 5S, and in less than 20MB of memory. Here’s a video of me using the sample app to detect our cat!

This means you can now easily build cutting edge object recognition into your iOS apps. Even the training on custom objects can be done on the phone. Download and build the sample code and judge the effectiveness for yourself. Then all you need to do is teach it the things you care about, link against the framework, and you’ve given your app the ability to see!

What an ARM chip designer taught me about my career

Almost twenty years ago, I was an undergraduate at Manchester University in the UK. I’d been lured there by the promise of being in the same room as Steve Furber, a key designer of the original ARM chip, and so spent a lot of my time on hardware courses in the hope of hanging out with my hero. After a couple of years I eventually plucked up the courage to have a few words with him over warm white wine at a departmental mixer, blurting out the first thing that came into my head, something about how chip designers didn’t seem to get good salaries. He told me “If you’re really good you’ll get paid well, it doesn’t matter what the field is”. I liked that advice. It gave me permission to focus on learning the craft of programming, in the hope that it would pay off down the line whatever my initial job decisions were.

That was lucky, because my job choices were terrible. My first professional job paid less than I made at the shelf-stacking one I took to make it through college, I wasn’t working on interesting technology or problems, the company itself was beyond chaotic, and I was an awful programmer. The only thing that kept me going was the hope that I was learning how to suck less. One of the bright spots was our lead engineer Gary Liddon, a ‘veteran’ of a very young industry. He scared the living daylights out of me, he didn’t have any patience at all for time wasters, but he made sure I had support as I tried to figure out how to do my job. He showed me the secret to debugging impossible crashes for example – just comment out half the code, see if it still happens, and binary search your way to the answer.
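The halving trick can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the crash is deterministic, exactly one chunk of code causes it, and the chunks can be switched off independently. The names `find_culprit`, `fake_run`, and the step labels are hypothetical stand-ins, not anything from the original codebase.

```python
from typing import Callable, List, Sequence


def find_culprit(steps: Sequence[str],
                 still_crashes: Callable[[List[str]], bool]) -> str:
    """Binary search for the single step that triggers the crash.

    Assumes exactly one step is at fault and the crash is reproducible.
    """
    if not steps:
        raise ValueError("need at least one step to search")
    candidates = list(steps)
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        # "Comment out" the second half and see if the crash still happens.
        if still_crashes(first_half):
            candidates = first_half
        else:
            candidates = candidates[middle:]
    return candidates[0]


# Hypothetical stand-in for "rebuild and run with only these parts enabled".
def fake_run(enabled: List[str]) -> bool:
    return "load_textures" in enabled


steps = ["init_audio", "load_textures", "spawn_player",
         "start_physics", "draw_hud"]
culprit = find_culprit(steps, fake_run)
print(culprit)
```

Each round throws away half of the remaining suspects, so even a few thousand lines of code need only a dozen or so rebuilds.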

As I bounced between jobs, I tried to spot the people like him who were better than me and learn whatever I could from them. I made some idiotic career decisions (who moves to Scotland to find work?) but over time I found myself becoming more capable as a programmer, which gave me the chance to recover from those bad choices. I found that hanging out with smart people who could show me amazing new ways to build software makes me want to go to work too! Learning started as something I focused on for my career, but I found it made me happy as well.

I’m not much of a believer in life advice, I feel like I’m still groping in the dark myself, but Steve’s throwaway remark has served me well as a guide. At least focusing on getting better at what I do is something I have control over.

Five short links

Photo by Tanakawho

A convention for human-readable 128-bit keys – Another nice example of natural and computer languages colliding, found via lobste.rs, which I’m hoping might turn out to be a version of Hacker News without the Hacker News commentators. I’ll still be lurking on the HN submission page though, since there are a lot of good links being thrown into the woodchipper.

Speedtree – A store devoted to beautiful computer models of trees. I remember wasting many evenings on simple fractal models of trees as a kid, it’s wonderful to see how far the technology has come.

Questionable crypto in retail analytics – Hashes are a flimsy curtain around private information that organizations want to share publicly, and they almost always reveal a lot more than the publishers like to think.

Starship Troopers and the Killer Cuddle Hormone – CPUs are tough to understand, but our brains are a whole different level of kludge. A great piece on how oxytocin interacts with our propensity to lie.

Binary Boolean Operator – The Lost Levels – If history had been a little different, we’d have the ‘implies’ operator, which would definitely make some of my assert()’s more readable.
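To make the ‘implies’ point from the last link concrete, here is a small Python sketch. Python has no such operator, so the `implies` helper and the `set_cache_size` function below are hypothetical names invented for illustration.

```python
def implies(premise: bool, conclusion: bool) -> bool:
    """Material implication: false only when premise is true and conclusion is false."""
    return (not premise) or conclusion


def set_cache_size(use_cache: bool, cache_size: int) -> int:
    # The usual spelling is: assert not use_cache or cache_size > 0
    # The helper lets the assert read the way the rule is stated.
    assert implies(use_cache, cache_size > 0), "an enabled cache needs a positive size"
    return cache_size if use_cache else 0
```

One caveat with a helper like this: both arguments are evaluated before the call, so it does not short-circuit the way `not a or b` does.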

Five short links

Photo by Jovino

Debunking the 100x GPU vs CPU myth – I love having GPUs available, I couldn’t have done my recent deep belief web demo without them, but this paper matches my experiences. You can do amazing things with highly-tuned CPU code, and in a lot of real applications any speed gains on the computation side are swamped by the time it takes to transfer data to and from the graphics card.

OpenStreetMap isn’t all that open – I never understood the intent of the bondage-and-discipline open database license OSM adopted, but in practice it means that it’s very hard to use for general geocoding. If you use the data to look up the coordinates of a street address, then any data derived from that position is subject to the attribution and sharing requirements in any application it’s used in, no matter how many generations removed from the original. I can’t ask users of the Data Science Toolkit to publicly share their spreadsheets just because I’ve added a lat, lon column for them, so I’m using alternative open sources that don’t infect data sets they interact with.

Privacy in sensor-driven human data collection – I’m not sold on all the recommendations, but this working paper is a must-read if you’re working with sensor data and want to understand where the land mines are. Even plain-old accelerometer data can be very revealing.

Biometric word list – Like “Alpha/Tango/Foxtrot”, but for clearly speaking byte streams aloud, with words picked to minimize errors. Full of fragments of stories, including a “skydive racketeer”, “facial fortitude”, “slingshot rebellion”, and “highchair holiness”.

Potential, possible, or probable predatory scholarly open-access publishers – The dark side of the opening of the academic world, I keep getting contacted by dodgy-looking publications, so this list looks like a great resource. [Update - Cameron Neylon has a good comment on the background of this list. I'll be looking at his suggestions instead!]

Five short links

Photo by Axel Taferner

Downloading software safely is nearly impossible – I’m resigned to the fact that a determined-enough attacker can access my data, since at the end of the day there’s always duct tape and rusty pliers, but the size of the holes in the stack we have to trust to get our hands on software is still painful to behold. See the followup too.

Bulk whois data – If you ask them nicely, ARIN will send you a complete dump of all their whois contact information, or you can buy it with no questions asked from a third-party supplier. More data that we theoretically know is public, but that becomes more problematic when it’s available en masse.

Dog poop, Facebook, and optimism – In the computer world we’re uncovering all sorts of interesting insights into hidden aspects of humanity, but we haven’t been able to get them into the hands of all the sociologists, historians, planners, aid workers, medical researchers et al who can really use them. I’m hoping Nicholas Christakis’s Human Nature Lab at Yale will bridge some of that gap, I’m very interested to see what emerges.

Potato programming – Even though I’m not a fan overall, I still learned a lot from my forays into functional programming. I almost never mutate values after I’ve assigned them, and I find code a lot cleaner when I can avoid lower-level for loop constructs in favor of something like a map or each. I just ran across this term for the clunky code that explicit looping produces, and it’s a memorable way of describing the one-potato, two-potato anti-pattern.

Verification handbook – This is a free handbook aimed at helping reporters separate rumor from fact when news is breaking, but it’s just as useful for readers of journalism. There are so many more sources of information these days that responsible citizens in a modern society have to be able to intelligently question what they’re being told.
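Going back to the potato programming link, the contrast is easy to show in Python. The data and function names here are made up for the example, and a list comprehension stands in for the map idiom.

```python
prices = [4.0, 10.0, 2.5]


def with_tax_loop(prices, rate):
    # One potato, two potato: walk the indices and grow the result by mutation.
    result = []
    for i in range(len(prices)):
        result.append(prices[i] * (1 + rate))
    return result


def with_tax_map(prices, rate):
    # The same transformation as a single expression, with nothing reassigned.
    return [price * (1 + rate) for price in prices]


print(with_tax_loop(prices, 0.5))
print(with_tax_map(prices, 0.5))
```

Both functions return the same list. The second just leaves less room for off-by-one mistakes and stale intermediate state.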

Five short links

Photo by Man’s Pic

The man who tried to build the Second Coming – I just visited Joachim Koester’s art exhibit on its last weekend at the YBCA, and while I was left cold by the actual experience, I loved the idea of designing a machine through a seance. As design methodologies go, it’s far from the strangest I’ve run across and I bet it would have a better success rate than your average design-by-committee. Joanne was fascinated by the concept too, and found me this off-beat blog post describing how 19th century spiritualists ended up trying to trance a whole new world of technology into being.

“It’s just a population map” – Andy Woodruff thinks we’ve taken the XKCD warning about population maps too much to heart.

Attempto Controlled English – With hindsight it’s obvious there would be a strictly-parseable and semantically-defined subset of our natural language out there, but it still feels like a peek into the future. Native English-speakers like me may feel lucky that our language is becoming the default, but I bet the need to communicate with machines is going to radically warp how we talk and write. We’ve already been trained to talk to Google in Searchese after all.

The mid-career crisis of the Perl programmer – An honest and insightful look back on a couple of decades of writing code.

The open-office trap – A huge pile of research indicating open offices should be considered harmful, undermining my own (conflicted) affection for them.

Writing code is not most programmers’ primary task

Photo by Brecht Soenen

I just read Nathan Marz’s argument against open plan offices, and this sentence leapt out at me as almost always wrong: “The primary task of a programmer is writing code, which involves sitting at a desk and thinking and typing”. I’ve been a big fan of Nathan’s since his Backtype days, but while his prescriptions make a lot of sense if you really are spending most of your time typing out new code based on clear requirements, I’ve found that a very rare position to be in. In my experience writing the code takes way less time than integrating and debugging it, let alone the open-ended process of figuring out what the requirements are. Most of my time goes on all those squishy bits. I look forward to the uninterrupted intervals when I get to just write code, but they’re not all that frequent.

When I started in games, people who could write 3D software renderers were rare and highly paid, but libraries and books came along that made their skills obsolete. Then it was folks who could program the fiendish DMA controllers on the PlayStation 2, but soon enough they were sidelined too, followed by the game physics gurus, the shader hackers, and everyone else who brought only coding skills to the table. It turns out that we’re doing a pretty good job of making the mechanical process of writing code easier, with higher-level languages, better frameworks (something Nathan knows a lot about), and training that’s creating far more people who can produce programs. What I saw in games was that coding was the ticket that got me in the door, but improving all the other skills I needed as an engineer was what really helped me do a better job.

I learned to write a software renderer, but chatting with the artists who were building the models made me realize that I could make the game look far better and run much faster by building a tool in 3DS Max they could use to preview their work in-game. It reduced their iteration time from days to minutes, which meant they could try a lot more ways to reduce the polygon count without compromising the look. I would never have made this leap if I hadn’t been sitting in an open plan office where I could hear them cursing!

Since then I’ve seen the cycle repeat itself in every new industry I’ve joined. When the technology is new and hard to use, just knowing how to operate it gives you a high status position, but the tide of Moore’s Law and the spread of knowledge make that a very temporary throne. The technical impediments always disappear, and graduates come out of college knowing what used to be elite skills. What keeps you in a job is the ability to be the interface between the precise requirements of software and the rest of the world, filled with messy, contradictory and incompletely understood problems.

In passing Nathan mentions measuring productivity, but that’s one of the stickiest problems in software, with an inglorious history from counting lines of code to stack ranking. My most useful contributions at the companies I’ve worked at have come when I’ve avoided producing what someone has asked me for, and instead given them what they need, which meant a lot of conversations to really understand the background. I also spend a lot of time passing on what I know to other people who are hitting problems, which hits my individual productivity but helps the team’s. The only meaningful level at which you can measure productivity is for the whole group tackling the problem, including the designers, QA, marketers, translators, and everyone else outside of coding who’s needed for almost any product you’re building. In almost every place I’ve worked, they would be able to make far more progress if they could interrupt the programmers a lot more, even though I’d hate it as an engineer!

I am actually ambivalent about open plan offices myself. Having my own room often seems like a delicious dream, and headphones are a poor alternative when I’m really diving deep into the code. What stops me from pushing for that is the knowledge that my job is all about collaboration, and an open arrangement truly does enable that. The interruptions aren’t getting in the way of my primary work, the interruptions are my primary work.
