Smartphone Energy Consumption


I never used to worry about energy usage, but over the last few years most of my code has ended up running on smartphones, so it has become one of my top priorities. The trouble is I’ve never had any training in the area, so I’ve had to pick it up on the job, from other engineers, and from the internet, most of which feels more like folklore than engineering. A few days ago I was trying to learn more about the shiny new Monsoon power monitor on my desk when I came across “Smartphone Energy Consumption” from Cambridge University Press.

I’ve now had a chance to read it, and it has helped fill in a lot of gaps in my education. I highly recommend picking up a copy if you’re facing any of the same problems, and to give you a taster, here are some of the things I learned. Rest assured that any mistakes in here are my own, not the authors’!

There are lots of different ways to measure battery capacity – joules, watt-hours, and milli-amp hours. If I’m getting my physics straight, 1 watt-hour is 3,600 joules, and if you assume a rough voltage of 4V, 1 milli-amp hour is 0.001 amps × 4 volts × 3,600 seconds, or 14.4 joules. A typical phone battery might hold 2,000 mAh, or about 29,000 joules.
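To keep the unit conversions straight in my own head, here’s a quick sketch of the arithmetic. The 4V figure is just the rough nominal voltage assumed above; real lithium cells swing between about 3.0V and 4.2V:

```python
NOMINAL_VOLTAGE = 4.0  # volts; a rough assumption, real cells vary ~3.0-4.2 V

def mah_to_joules(mah, volts=NOMINAL_VOLTAGE):
    # 1 mAh = 0.001 amps flowing for 3,600 seconds; energy = current * voltage * time
    return mah * 0.001 * volts * 3600

def watt_hours_to_joules(wh):
    # 1 watt sustained for 3,600 seconds
    return wh * 3600

print(mah_to_joules(1))     # one mAh at 4 V, about 14.4 joules
print(mah_to_joules(2000))  # a typical 2,000 mAh battery, about 29,000 joules
```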

Most phones can’t dissipate more than 3 watts of heat, so that’s a practical limit for sustained power usage. Wearables can have much lower limits.

There are two main ways power is used in mobile chips. Switching loss is the power used to change gate states in the silicon, and static loss is more like a steady leakage. Switching power decreases with the square of the voltage, so halving the voltage reduces its effect by 75%, whereas static power shrinks linearly. That leads to some interesting situations, where having two cores running at half the voltage and frequency (since lower clock frequencies allow lower voltages to be used) may take the same time to complete a task, but at half the power of a single higher-voltage core. Turning off parts of the chip entirely is the key to reducing static leakage.
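Here’s a toy model of that tradeoff. The constants are purely illustrative, picked by me rather than measured, but with switching power scaling as voltage squared times frequency and static leakage scaling linearly with voltage, the two-core configuration does come out at half the power of the single faster core:

```python
# Toy model of CMOS power: switching power ~ C * V^2 * f, static (leakage)
# power roughly ~ V. All constants are illustrative, not measured values.
def switching_power(voltage, frequency, capacitance=1.0):
    return capacitance * voltage ** 2 * frequency

def static_power(voltage, leakage=0.5):
    return leakage * voltage

# One core at full voltage and frequency:
one_core = switching_power(1.0, 1.0) + static_power(1.0)

# Two cores at half voltage and half frequency (same total throughput,
# since lower clock frequencies allow lower voltages):
two_cores = 2 * (switching_power(0.5, 0.5) + static_power(0.5))

print(one_core, two_cores)  # the two-core setup uses less total power
```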

There’s a great practical guide to wiring up Monsoons, exactly what I need right now! There’s also a good section on human-battery interaction, which I just skimmed, but which covers a lot of research. The key takeaway for me was that users get very annoyed if their battery starts draining more quickly than it used to, and will uninstall recent apps to fix the problem, so developers should be highly motivated to reduce their power consumption.

I found a lot of useful estimates of component power usage scattered through the book. These are just rough guides, but they helped my mental modeling, so here are some I found notable:

  • An ARM A9 CPU can use between 500 and 2,000 mW.
  • A display might use 400 mW.
  • Active cell radio might use 800 mW.
  • Bluetooth might use 100 mW.
  • Accelerometer is 21 mW.
  • Gyroscope is 130 mW.
  • Microphone is 101 mW.
  • GPS is 176 mW.
  • Using the camera in ‘viewfinder’ mode, focusing and looking at a picture preview, might use 1,000 mW.
  • Actually recording video might take another 200 to 1,000 mW on top of that.
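As a sanity check on what numbers like these mean in practice, here’s a rough sketch that adds up a plausible mix of active components against the battery capacity from earlier. Which components are active, and the battery size, are my assumptions rather than figures from the book:

```python
# Rough battery-life estimate from component power figures (all in mW).
# The component mix and battery capacity are illustrative assumptions.
COMPONENT_MW = {
    "cpu_light_load": 500,
    "display": 400,
    "cell_radio_active": 800,
    "gps": 176,
}

BATTERY_JOULES = 29000  # roughly a 2,000 mAh battery at 4 V

total_watts = sum(COMPONENT_MW.values()) / 1000.0
hours = BATTERY_JOULES / total_watts / 3600
print(f"{total_watts:.2f} W draw -> {hours:.1f} hours of battery")
```

With everything on at once the phone lasts only a few hours, which matches the everyday experience of navigation apps draining a full charge well before the end of the day.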

A key problem for wireless network communication is the ‘tail energy’ used to keep the radio active after the last communication, even when nothing’s being sent. This is vital for responsiveness, but it can be ten seconds for LTE, so seemingly short communications can use a lot more energy than you’d expect. Sending a single byte can use a massive amount of energy if it keeps the radio active for ten seconds afterwards!
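A back-of-the-envelope model makes the point. The radio power draw and tail duration are the rough figures mentioned above, and the model deliberately ignores ramp-up costs:

```python
# Why short transfers are so expensive: LTE can hold the radio active for
# around ten seconds after the last packet ("tail energy").
RADIO_ACTIVE_MW = 800   # rough active-radio figure from the book
TAIL_SECONDS = 10.0     # rough LTE tail duration

def transfer_energy_joules(transmit_seconds):
    # Energy = power * (time actually transmitting + tail time)
    active_seconds = transmit_seconds + TAIL_SECONDS
    return (RADIO_ACTIVE_MW / 1000.0) * active_seconds

# A single tiny send (~1 ms of airtime) still costs about 8 joules:
single = transfer_energy_joules(0.001)
# Ten separate tiny sends pay the tail ten times; one batched send pays it once:
print(single, 10 * single, transfer_energy_joules(0.01))
```

The batched transfer pays the ten-second tail once instead of ten times, which is why coalescing network requests is such a common power optimization.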

A Microsoft paper showed that over 50% of the power consumed by several popular games goes to the ads they show!

There’s some interesting work on modeling the tradeoffs between computation offloading (moving work from the phone to the cloud) and communication offloading (doing more work on the device to reduce network costs). I’m a big believer that we should do more work on-device, so it was great to have a better foundation for modeling those tradeoffs. One example they give is using the Android SDK to detect faces in a 1080p image on-device, which took 3.2 seconds and 9 joules, whereas sending the image to a nearby server was quicker, even accounting for the extra energy of the network traffic.
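Here’s a minimal sketch of that kind of tradeoff model. All the constants are my own illustrative assumptions (the on-device figures are shaped like the face-detection example, the network figures are invented), but it shows how the radio’s tail time can dominate the comparison:

```python
# Toy model: energy of running a task on-device vs. offloading it.
# Every constant here is an illustrative assumption, not a measurement.
def on_device_joules(seconds, watts):
    return seconds * watts

def offload_joules(payload_mb, network_watts, mbps, tail_seconds=10.0):
    # Transfer time plus the radio's tail time, all at active-radio power.
    transfer_seconds = payload_mb * 8 / mbps
    return network_watts * (transfer_seconds + tail_seconds)

local = on_device_joules(3.2, 2.8)        # ~9 J, shaped like the book's example
remote = offload_joules(1.5, 0.8, 20.0)   # assumed 1.5 MB image over 20 Mbps
print(local, remote)
```

With these numbers the two options are surprisingly close, and the tail energy is most of the offload cost, which is why the break-even point moves around so much with network conditions.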

Anyway, it’s a great piece of work so if this sort of information is useful, go pick up a copy yourself, there’s a lot more than I can cover here!

Semantic Sensors


Video from Springboard

The other day I was catching up with neighborhood news, and saw this article about “people counters” in San Francisco’s tourist district. These are cameras watching the sidewalks and totaling up how many pedestrians are walking past. The results weren’t earth-shattering, but I was fascinated because I’d never heard of the technology before. Digging in deeper, I discovered there’s a whole industry of competing vendors offering similar devices.

Why am I so interested in these? Traditionally we’ve always thought about cameras as devices to capture pictures for humans to watch. People counters only use images as an intermediate stage in their data pipeline; their real output is just the coordinates of nearby pedestrians. Right now this is a very niche application, because the systems cost $2,100 each. What happens when something similar costs $2, or even 20 cents? And how about combining that price point with rapidly-improving computer vision, allowing far more information to be derived from images?

Those trends are why I think we’re going to see a lot of “Semantic Sensors” emerging. These will be tiny, cheap, all-in-one modules that capture raw noisy data from the real world, have built-in AI for analysis, and only output a few high-level signals. Imagine a small optical sensor that is wired like a switch, but turns on when it sees someone wave up, and off when they wave down. Here are some other concrete examples of what I think they might enable:

  •  Meeting room lights that stay on when there’s a person sitting there, even if the conference call has paralyzed them into immobility.
  •  Gestural interfaces on anything with a switch.
  •  Parking meters that can tell if there’s a car in their spot, and automatically charge based on the license plate.
  •  Cat-flaps that only let in cats, not raccoons!
  •  Farm gates that spot sick or injured animals.
  •  Streetlights that dim themselves when nobody’s around, and even report car crashes or house fires.
  •  Stop lights that vary their timing cycle depending on whether there are any vehicles or pedestrians approaching from each direction, and will prioritize emergency vehicles.
  •  Drug cabinet doors that keep track of the medicines you have inside, help you find them, and re-order when you’re out.
  •  Shop window display items that spring to life when passers-by are looking at them, using eye tracking.
  •  Canary sensors scattered through crops that spot and report any pests or weeds they see, to minimize the use of chemicals.
  •  IFTTT-style hardware mashups, with quirky niche applications like tea-kettles that turn themselves on if you stare longingly at them, art installations that let you paint on them with hand gestures, or lawn sprinklers that know if it’s been raining, and only water the parts that are starting to go brown.

For all of these applications, the images involved are just an implementation detail; they can be immediately discarded. From a systems view, the devices are just black boxes that output data about the local environment. Engineers with no background in vision will be able to integrate them, and get useful signals to drive their applications. There are already a few specialist devices like Omron’s Human Vision Components, but imagine when these become common components, standardized so they can be easily plugged into existing designs and cheap enough to be used on everyday items.

I don’t have a crystal ball, and all of these are purely my own personal musings, but it seems obvious to me that machine vision is becoming a commodity. Once the technology’s truly democratized, I believe it will give computers a window into the real world they’ve never had before, and enable interfaces and responses to the environment we’ve never even dreamed of. I think a big part of that will be the emergence of these “semantic sensors” that output human-meaningful data about what’s happening around them.

OpenHeatMap and DataScienceToolkit under new management

I’ve been running OpenHeatMap and the Data Science Toolkit for quite a few years now, but a few months ago I realized I wasn’t able to keep maintaining them. I know a lot of people out there are still using them, so I looked around for a partner I could transfer the ownership to. After some discussions, I arranged a deal with the team to transfer the sites to them, for no charge, in return for their agreement to keep supporting the existing community. For the last few weeks they’ve been handling the servers, support, and maintenance, and I’m very glad they were able to step in. The goal is to keep the existing free services supported, but give them the ability to expand in a more commercial direction too, so that the site becomes more self-sustaining. All OpenHeatMap support requests should now go to the new support address, which they administer.

The code behind the Data Science Toolkit is all open-source on GitHub, so that will continue to be available, but the DSTK site itself has an uncertain future. I’ve always tried to keep it open to anyone who wants to experiment with the APIs, but over the last year it’s come under denial-of-service levels of usage from a wide range of IP addresses. I spent some time learning firewall rules and attempting to block the problematic calls, but I wasn’t able to keep the load low enough to keep the site consistently up. Since OpenHeatMap relies on the site as its geocoder, uploading there was also often unreliable. I came to the sad conclusion that I didn’t have enough time to do the overhauling I’d need to deal with the problems, which is why I handed everything over to a team who can put in more time. The most common use of the DSTK was geocoding US addresses, and with the Census Bureau now providing their own free API, that side of it became less essential too. The hosting of the large VMs unfortunately got lost when I shut down the site, so I’m afraid I don’t have those available any more.

Both of the sites were failed startup ideas that took on a life of their own, even though I was never able to make them commercial ventures. I’m hopeful that a fresh team with new ideas will be able to provide a better service to everyone who uses them. I’m grateful to everyone who’s been in touch over the years, I kept supporting the site for so long because I saw the amazing projects you were all using them for. My deep thanks go to the community that formed around the sites.

How to Talk to Journalists


Photo by Jon S

Now I’m at Google I don’t get to talk to reporters, which is a shame because they’re a lot of fun. When I was doing startups I learned a lot from hanging out with them, because they’re generally very smart, curious people who have a much wider perspective on what’s happening than almost anyone else. I even dabbled in writing articles myself at the old ReadWriteWeb site many years ago.

I was talking to a startup founder recently, and realized she didn’t actually understand the basics of what a journalist’s job is like. Knowing the day-to-day routines and constraints on reporters is essential if you’re going to do a good job helping them cover what you’re doing.  Here’s my advice, based on my personal experiences over the last few years. I’d love to hear more from other people too, since I think this is far from the final word on the subject!

Connect Early and Selectively

Most founders want to wait until they hit a milestone they consider significant, and then mail-blast every high profile journalist they can find an email for. Every reporter’s inbox is piled up with so many of these every day that they’re almost never even read. Every writer has their own areas of interest, and their own long-term storylines about the tech world, and you first need to identify a handful of people who might truly care about what you’re doing, long before you’re looking for a story. Connect with them on Twitter or story comments, chat to them at conferences, and communicate your own enthusiasm about the things they care about. Don’t be a stalker, just be human. They’re in their jobs because they are interested in this new world we’re building so connect with them on that level.

Be Responsive

If you do start to build a relationship, one of the most helpful things you can do is provide quotes or off-the-record background for their stories. The key here is that they’re usually up against a very tight deadline – they might need to submit in a matter of minutes – so drop everything and get back to them immediately. Make sure you know whether what you’re saying will be quoted, or whether it’s “on background”, before you say it! You can’t take back an ill-considered quote just because you regret it later.

Having quotes is essential for almost any story, since they’re the evidence to back up the writer’s version of events. Make sure you listen carefully to the reporter’s questions too, they’ll often give you an idea of what angle they’re interested in, so you can focus your response to fit. Be honest – having a quote that disagrees with someone else or the conventional wisdom can sometimes make an even better story than a confirmation.

Let the Story Emerge

A friend of mine used to curse about all the ’round number’ pitches that endlessly filled her inbox – “[Startup X] has reached 100,000 users!”. They’re boring for everyone outside that company. Only slightly better are fund-raising announcements, or new product features. What journalists care about are stories that readers will actually be interested in, and those need to be entertaining. Drama, surprise, tragedy, hope, and humor are all vital parts of the stories that engage people, so you have to let the journalists into the life of your startup and let them decide what the real story is. It might be something you’re embarrassed by, like being turned down by every VC but finding alternative ways of staying afloat, or that you expanded too early, went through painful layoffs, but are now turning the corner. It might be something you don’t think about because you take it for granted, like that you have a great working environment for disabled people, or take interns from the local community.

Good reporters love getting to know people and teasing out the stories their readers will want to hear, so let them do their job and don’t bombard them with your own ideas, because you lack their perspective on what’s interesting. If you really are the next Facebook the stories about how you’re crushing it will come, don’t worry, but in the early days you need all the coverage you can get.


Everyone knows blogging’s been dead for years, but there’s no substitute for writing short-form articles in your own voice and publishing them yourself, and it actually helps the journalists you’ll deal with in a lot of ways too. When you speak to them about a topic you’ve already blogged about, you’ll be far more articulate and quotable because you’ll have already worked through the ideas on paper. You’re also giving them something to link to in their articles as evidence to back up their own arguments.

If you develop even a small audience on your blog, it’s actually useful for journalists to know you’re likely to link back to their own coverage, and so drive more readers in their direction. Any graphics or data you’ve produced can be very useful to quickly illustrate their stories too, especially about trends. Journalists tend to be voracious readers, so writing regular interesting posts is a great way to build a relationship as well, and it’s even better if you reference their work and engage their arguments. Your posts may end up sparking ideas for stories too, and if one’s got a lot of shares on social media that’s strong evidence people find the underlying topic interesting.

Speak Directly

I have never found PR firms helpful*. When I was on the other end of their pitches, I saw how much of a negative reaction their formulaic emails got from other reporters. I see them as middle managers inserting themselves between journalists and the founders they actually want to speak to, and I’ve run across too many who make their money by pandering to founders’ egos without helping the business. It’s possible it all makes sense in the corporate world, but as a founder you need to build your own direct relationships, and if you do have to cold-email somebody, at least make it a personal note in your own voice.

Think of it like dealing with investors: it’s not something you can delegate when you’re starting out. Reporters want to get heartfelt quotes from un-coached entrepreneurs, not rehearsed soundbites from someone sitting with a handler, and just the hassle of arranging interviews through a third party can put them off. It does feel risky, but as an early-stage startup the reward of good coverage is so valuable that you need to take the plunge.

Focus on Your Work

Success in other areas makes good PR possible, and good coverage is a force multiplier, but PR shouldn’t take up more than a small percentage of your time as a founder. You can charm reporters all you want, but if you’re not doing anything fundamentally interesting, you won’t get a good story. Even if you do get coverage, if your product or business model don’t work it won’t help your traction. Journalists know you have a job to do, and would much rather you come back to them less frequently with amazing things to show, than spend all your time on little stories at the expense of everything else.

(*) The only situation where I’d recommend getting help is if you find yourself at the center of a scandal, like Alasdair Allan and I did with the iPhone locationgate problem. There, having the wonderful Maureen Jennings from O’Reilly performing traffic control for all the people who suddenly wanted to interview us was a life-saver. She was able to communicate effectively with everyone involved, and we’ll be forever grateful to her for all her help!

Five Deep Links


Picture by Kevin Dooley

I’m coming up to a year at Google now, and I’ve been continuing to have an amazing time with the deep learning team here. Deep networks are not a silver bullet for all AI problems, but they do mean we are moving from a cottage industry of bespoke machine learning specialists hand-carving algorithms for each new problem, to mass production where general software engineers can get good results by applying the same off-the-shelf approaches to a lot of different areas. If you have a good object recognition network architecture, you can get damn fine results on scene recognition, location estimation, and a whole host of other tasks using the same model, just by varying the training data, or even just retraining the top layer.
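As a concrete illustration of the “just retrain the top layer” idea, here’s a small self-contained sketch. A frozen random projection stands in for the pretrained network’s feature extractor, and only a new linear classifier is trained on top; everything here is a stand-in rather than a real model:

```python
import numpy as np

# Transfer-learning sketch: treat a pretrained network as a frozen feature
# extractor and fit only a new linear classifier on top of its activations.
rng = np.random.default_rng(0)

def frozen_features(x, proj):
    # Stand-in for the pretrained network's penultimate-layer activations.
    return np.maximum(x @ proj, 0.0)  # ReLU

# Toy two-class dataset and a fixed "pretrained" projection.
x = rng.normal(size=(200, 16))
y = (x[:, 0] + x[:, 1] > 0).astype(float)
proj = rng.normal(size=(16, 32))
feats = frozen_features(x, proj)

# Train only the top layer (logistic regression) by gradient descent;
# the feature extractor's weights never change.
w = np.zeros(32)
for _ in range(500):
    logits = np.clip(feats @ w, -30.0, 30.0)
    p = 1.0 / (1.0 + np.exp(-logits))
    w -= 0.1 * feats.T @ (p - y) / len(y)

accuracy = np.mean((feats @ w > 0) == (y == 1))
print(accuracy)
```

Only the 32 weights of the top layer are learned here, which is why retraining just the final layer of a big network on new labels is so much cheaper than training from scratch.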

The tools aren’t particularly easy to use right now which makes deep learning seem very intimidating, but work like Andrej Karpathy’s ConvNetJS shows that the code can be expressed in much more understandable ways. As the libraries and documentation mature, we’ll see tools that let any software engineer create their own deep learning solution by just creating a training set that expresses their problem and feed it into an automated system. I imagine there will be separate approaches for the big areas of images, speech, and natural language, but we’re at the point where we can produce semantically meaningful intermediate representations from all those kinds of real-world data, and then straightforwardly train against those. Anyway, enough of my excited ramblings, I mostly wanted to share some interesting deep learning articles I’ve seen recently.

How Google Translate Squeezes Deep Learning onto a Phone – I’ve been lucky enough to work with the former WordLens team to get their amazing augmented reality visual translator using deep neural networks for the character recognition, directly on the device. It was nice to see the technology from the WordLens and Jetpac acquisitions come together with all of the experience and smarts of the wider Google teams to make something this fun.

Composing Music with Recurrent Neural Networks – Mozart’s job is still safe for a while based on the final results, but it’s a great demonstration of how it’s getting easier for non-specialists to start working with neural networks. It also has the best explanation of LSTMs that I’ve seen!

gemmlowp – Benoit Jacob, of Eigen fame, has been doing a fantastic job of optimizing the kind of eight-bit matrix multiply routines that I find essential for running networks on device. Even better, because Google’s very supportive of open source, we’ve been able to release it publicly on Github. It’s been a great project to collaborate on, and I’m happy that we’ve been able to share the results.

Visualizing GoogLeNet Classes – I love that we’re still at the stage where we don’t really know how these networks work under the hood, and investigations like these are great ways of exploring what these strange mechanisms we’ve created are actually doing.

How a Driverless Car Sees the World – Yes, I have drunk the Google Kool-aid, but I’ve long thought this is one of the coolest projects happening right now. This is a great rundown of some of the engineering challenges it’s facing, including wheelchairs doing donuts in the road.

Why I support At the Crossroads


When I first moved to San Francisco, I was shocked by how many people were living on the streets. We’re in one of the richest cities in the world, and I was appalled that we couldn’t do a better job helping them. I wanted to do something, but it was frustratingly hard to figure out ways to be effective beyond just salving my conscience. I even attended a few “Homeless Innovation” meetups, but from talking to the people who worked in the trenches it was clear that new technology wasn’t the solution. I did discover some non-profits doing important work through the group though, like the Lava Mae shower bus project, and At the Crossroads.

Rob Gitin, the co-founder of the group, gave a short presentation about what ATC did, and it made a lot of sense to me. For seventeen years, groups of staff members have walked the Tenderloin and Mission districts at night, talking to young homeless people and handing out basic necessities like toothbrushes, snacks, and clothes. There’s no agenda; the goal is just to make contact with people and start a conversation. As trust grows, the young people can get informal counseling on the spot, practical advice about connecting with other available services, and much more, but what most impressed me was ATC’s focus on just listening.

I know from my own life that just feeling like you’re being heard can make a massive difference, and unlike a lot of non-profit programs, the recipients are in control. They’re not being pushed down prescribed programs administered from above; it’s a grass-roots approach that lets them choose what help they need, and when, without judgment, delays, or paperwork.

I’ve stayed involved with ATC since then, trying to help them as a donor (most recently with a boost from Google’s generous gift-matching program). One of the perks has been getting the newsletter every few months, which is simple, but beautifully written, and often very moving as they focus on the stories of the clients. You can view the latest issue here, and what really struck me this time was how Rob distilled the group’s philosophy on helping people in his editorial:

It took me 10 years of doing this work before I realized there was no better topic to discuss than relationships. Knowing how to build and sustain healthy relationships, and how to navigate difficult ones, is the single most important tool our youth can develop that will empower them to build the lives they want. It can have a greater impact on their long-term stability than getting into housing, going back to school, or finding work.

Our clients can get a job, but if they don’t know how to deal with a harsh boss, they will quit or get fired. They can find a room in an apartment or in subsidized housing, but if they can’t navigate roommate conflicts or deal with a case manager they don’t like, they will lose their housing. Furthermore, if they don’t have a strong, supportive community, losing a job or housing will send them right back to the streets.

Stable, long-term relationships are the building blocks upon which our youth create healthy and fulfilling lives. They also feed the heart and the soul. Care without condition nurtures hope, which is often in short supply for our youth. For many, we are the first people to reflect back what is special about them, and who actually see them for all that they are. Hope is a prerequisite for change, and it is wonderful to get to instill it.

It’s this kind of practical, humane wisdom that makes me happy that ATC exists, and is out there night after night helping people. If you’re concerned about homelessness in San Francisco, but you’ve felt lost trying to find a practical way to help, I encourage you to check out what ATC does, and all the different ways you can get involved. It’s not a magic solution to all the pain out there, but I really see them making a difference, one person at a time.

Why are Eight Bits Enough for Deep Neural Networks?


Picture by Retronator

Deep learning is a very weird technology. It evolved over decades on a very different track than the mainstream of AI, kept alive by the efforts of a handful of believers. When I started using it a few years ago, it reminded me of the first time I played with an iPhone – it felt like I’d been handed something that had been sent back to us from the future, or alien technology.

One of the consequences of that is that my engineering intuitions about it are often wrong. When I came across im2col, the memory redundancy seemed crazy, based on my experience with image processing, but it turns out it’s an efficient way to tackle the problem. While there are more complex approaches that can yield better results, they’re not the ones my graphics background would have predicted.

Another key area that seems to throw a lot of people off is how much precision you need for the calculations inside neural networks. For most of my career, precision loss has been a fairly easy thing to estimate. I almost never needed more than 32-bit floats, and if I did it was because I’d screwed up my numerical design and I had a fragile algorithm that would go wrong pretty soon even with 64 bits. 16-bit floats were good for a lot of graphics operations, as long as they weren’t chained together too deeply. I could use 8-bit values for a final output for display, or at the end of an algorithm, but they weren’t useful for much else.

It turns out that neural networks are different. You can run them with eight-bit parameters and intermediate buffers, and suffer no noticeable loss in the final results. This was astonishing to me, but it’s something that’s been re-discovered over and over again. My colleague Vincent Vanhoucke has the only paper I’ve found covering this result for deep networks, but I’ve seen with my own eyes how it holds true across every application I’ve tried it on. I’ve also had to convince almost every other engineer I tell that I’m not crazy, watching them prove it to themselves by running their own tests, so this post is an attempt to short-circuit some of that!

How does it work?

You can see an example of a low-precision approach in the Jetpac mobile framework, though to keep things simple I keep the intermediate calculations in float and just use eight bits to compress the weights. Nervana’s NEON library also supports fp16, though not eight-bit yet. As long as you accumulate to 32 bits when you’re doing the long dot products that are the heart of the fully-connected and convolution operations (and that take up the vast majority of the time), you don’t need float; you can keep all your inputs and outputs as eight bit. I’ve even seen evidence that you can drop a bit or two below eight without too much loss! The pooling layers are fine at eight bits too. I’ve generally seen the bias addition and activation functions (other than the trivial relu) done at higher precision, but 16 bits seems fine even for those.
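To make the accumulation point concrete, here’s a minimal sketch of an eight-bit dot product with 32-bit accumulation. It’s deliberately naive NumPy rather than an optimized kernel like gemmlowp’s, but it shows why the widening matters:

```python
import numpy as np

# Eight-bit inputs, 32-bit accumulation: the pattern at the heart of
# low-precision fully-connected and convolution layers.
def dot_u8_accum32(a_u8, b_u8):
    # Widen to int32 *before* multiplying; a uint8 * uint8 product
    # already overflows eight bits, and the long sum overflows sixteen.
    return np.sum(a_u8.astype(np.int32) * b_u8.astype(np.int32))

rng = np.random.default_rng(1)
a = rng.integers(0, 256, size=4096, dtype=np.uint8)
b = rng.integers(0, 256, size=4096, dtype=np.uint8)

result = dot_u8_accum32(a, b)
# Matches a wide-precision reference computation exactly:
assert result == np.dot(a.astype(np.int64), b.astype(np.int64))
print(result)
```

A 4,096-element dot product of values up to 255 peaks around 2.7 × 10⁸, comfortably inside int32, which is why 32-bit accumulators are enough even for the long sums in big layers.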

I’ve generally taken networks that have been trained in full float and down-converted them afterwards, since I’m focused on inference, but training can also be done at low precision. Knowing that you’re aiming at a lower-precision deployment can make life easier too, even if you train in float, since you can do things like place limits on the ranges of the activation layers.
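For the down-conversion itself, here’s a sketch of the simplest scheme: linearly quantize each weight array over its min/max range into uint8, then dequantize back to float at load time. This is a generic illustration, not the exact code from any particular framework:

```python
import numpy as np

# Linear eight-bit quantization of a float weight array over its min/max range.
def quantize(weights):
    lo, hi = weights.min(), weights.max()
    scale = (hi - lo) / 255.0          # size of one quantization step
    q = np.round((weights - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q, lo, scale):
    # Recover approximate float weights from the eight-bit codes.
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(42)
w = rng.normal(scale=0.1, size=1000).astype(np.float32)
q, lo, scale = quantize(w)
err = np.max(np.abs(dequantize(q, lo, scale) - w))
print(err, scale / 2)  # worst-case error is about half a quantization step
```

The storage drops to a quarter of the float version (plus two floats of metadata per array), and the per-weight error stays below half a quantization step, which is comfortably inside the noise tolerance trained networks seem to have.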

Why does it work?

I can’t see any fundamental mathematical reason why the results should hold up so well with low precision, so I’ve come to believe that it emerges as a side-effect of a successful training process. When we are trying to teach a network, the aim is to have it understand the patterns that are useful evidence and discard the meaningless variations and irrelevant details. That means we expect the network to be able to produce good results despite a lot of noise. Dropout is a good example of synthetic grit being thrown into the machinery, so that the final network can function even with very adverse data.

The networks that emerge from this process have to be very robust numerically, with a lot of redundancy in their calculations so that small differences in input samples don’t affect the results. Compared to differences in pose, position, and orientation, the noise in images is actually a comparatively small problem to deal with. All of the layers are affected by those small input changes to some extent, so they all develop a tolerance to minor variations. That means that the differences introduced by low-precision calculations are well within the tolerances a network has learned to deal with. Intuitively, they feel like weebles that won’t fall down no matter how much you push them, thanks to an inherently stable structure.

At heart I’m an engineer, so I’ve been happy to see it works in practice without worrying too much about why, I don’t want to look a gift horse in the mouth! What I’ve laid out here is my best guess at the cause of this property, but I would love to see a more principled explanation if any researchers want to investigate more thoroughly? [Update – here’s a related paper from Matthieu Courbariaux, thanks Scott!]

What does this mean?

This is very good news for anyone trying to optimize deep neural networks. On the general CPU side, modern SIMD instruction sets are often geared towards float, and so eight bit calculations don’t offer a massive computational advantage on recent x86 or ARM chips. DRAM access takes a lot of electrical power though, and is slow too, so just reducing the bandwidth by 75% can be a very big help. Being able to squeeze more values into fast, low-power SRAM cache and registers is a win too.

GPUs were originally designed to take eight bit texture values, perform calculations on them at higher precisions, and then write them back out at eight bits again, so they’re a perfect fit for our needs. They generally have very wide pipes to DRAM, so the gains aren’t quite as straightforward to achieve, but can be exploited with a bit of work. I’ve learned to appreciate DSPs as great low-power solutions too, and their instruction sets are geared towards the sort of fixed-point operations we need. Custom vision chips like Movidius’ Myriad are good fits too.

Deep networks’ robustness means that they can be implemented efficiently across a very wide range of hardware. Combine this flexibility with their almost-magical effectiveness at a lot of AI tasks that have eluded us for decades, and you can see why I’m so excited about how they will alter our world over the next few years!

