The other day I was catching up with neighborhood news, and saw this article about “people counters” in San Francisco’s tourist district. These are cameras watching the sidewalks and totaling up how many pedestrians are walking past. The results weren’t earth-shattering, but I was fascinated because I’d never heard of the technology before. Digging in deeper, I discovered there’s a whole industry of competing vendors offering similar devices.
Why am I so interested in these? We’ve traditionally thought of cameras as devices that capture pictures for humans to watch. People counters only use images as an intermediate stage in their data pipeline; their real output is just the coordinates of nearby pedestrians. Right now this is a very niche application, because the systems cost $2,100 each. What happens when something similar costs $2, or even 20 cents? And how about combining that price point with rapidly improving computer vision, allowing far more information to be derived from images?
Those trends are why I think we’re going to see a lot of “Semantic Sensors” emerging. These will be tiny, cheap, all-in-one modules that capture raw, noisy data from the real world, have built-in AI for analysis, and only output a few high-level signals. Imagine a small optical sensor that is wired like a switch, but turns on when it sees someone wave upward, and off when they wave downward. Here are some other concrete examples of what I think they might enable:
- Meeting room lights that stay on when there’s a person sitting there, even if the conference call has paralyzed them into immobility.
- Gestural interfaces on anything with a switch.
- Parking meters that can tell if there’s a car in their spot, and automatically charge based on the license plate.
- Cat-flaps that only let in cats, not raccoons!
- Farm gates that spot sick or injured animals.
- Streetlights that dim themselves when nobody’s around, and even report car crashes or house fires.
- Stop lights that vary their timing cycle depending on whether there are any vehicles or pedestrians approaching from each direction, and that prioritize emergency vehicles.
- Drug cabinet doors that keep track of the medicines you have inside, help you find them, and re-order when you’re out.
- Shop window display items that spring to life when passers-by are looking at them, using eye tracking.
- Canary sensors scattered through crops that spot and report any pests or weeds they see, to minimize the use of chemicals.
- IFTTT-style hardware mashups, with quirky niche applications like tea-kettles that turn themselves on if you stare longingly at them, art installations that let you paint on them with hand gestures, or lawn sprinklers that know if it’s been raining, and only water the parts that are starting to go brown.
For all of these applications, the images involved are just an implementation detail; they can be discarded immediately. From a systems view, the sensors are just black boxes that output data about the local environment. Engineers with no background in vision will be able to integrate them and get useful signals to drive their applications. There are already a few specialist devices like Omron’s Human Vision Components, but imagine when these become common components, standardized so they can be easily plugged into existing designs and cheap enough to be used on everyday items.
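To make the black-box idea a bit more concrete, here is a minimal Python sketch of the wave-controlled switch described earlier. Everything in it is hypothetical: the `WaveSwitch` class, the `detect_gesture` callback, the gesture labels, and the fake detector are invented for illustration and don’t correspond to any real product or library. The point is only the shape of the interface: frames go in, a single on/off signal comes out, and no image is kept.

```python
from typing import Callable, Optional

# Hypothetical gesture labels; a real module would define its own.
WAVE_UP = "wave_up"
WAVE_DOWN = "wave_down"

# Stand-in for raw image data. A real sensor would use camera frames.
Frame = bytes


class WaveSwitch:
    """A sensor that behaves like a latching switch.

    The vision model is injected as `detect_gesture` (an assumption made
    for this sketch). Callers only ever see the boolean switch state.
    """

    def __init__(self, detect_gesture: Callable[[Frame], Optional[str]]):
        self._detect_gesture = detect_gesture
        self.is_on = False

    def feed(self, frame: Frame) -> bool:
        """Analyze one frame, update the switch state, and return it."""
        gesture = self._detect_gesture(frame)
        if gesture == WAVE_UP:
            self.is_on = True
        elif gesture == WAVE_DOWN:
            self.is_on = False
        # The frame is not stored anywhere; only the state survives.
        return self.is_on


def fake_detector(frame: Frame) -> Optional[str]:
    """Toy detector standing in for the built-in AI: reads a label from bytes."""
    if frame == b"up":
        return WAVE_UP
    if frame == b"down":
        return WAVE_DOWN
    return None


switch = WaveSwitch(fake_detector)
states = [switch.feed(f) for f in [b"", b"up", b"", b"down", b""]]
print(states)  # [False, True, True, False, False]
```

An engineer wiring this into a lamp or a kettle would only need to read `is_on`, which is the same amount of vision expertise required to use a physical switch.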
I don’t have a crystal ball, and these are purely my own musings, but it seems obvious to me that machine vision is becoming a commodity. Once the technology’s truly democratized, I believe it will give computers a window into the real world they’ve never had before, and enable interfaces and responses to the environment we’ve never even dreamed of. I think a big part of that will be the emergence of these “semantic sensors” that output human-meaningful data about what’s happening around them.