How to break into machine learning

Photo by Erich Ferdinand

An engineer recently asked me how she could turn an interest in machine learning into a full-time job. This can be a daunting prospect, because the whole field has until recently been very separate from traditional engineering, with only a few specialists at large companies using it in production, often far from traditional product teams. I took a very random path to focusing on deep learning full time, but so did most of the people I work with. It’s not clear that there is one good route, but I wanted to share the advice I had to offer in case it’s helpful to others.

Become a Designated Machine Learner

Every manager should point at one member of their team and say “You are now our machine learning expert”. If your manager doesn’t do that for you, announce it yourself to anyone who will listen. This may sound like madness, but machine learning is rapidly invading almost every product area, so whether you’re in games or enterprise software, your group needs to at least stay up to date with what’s happening with the technology. If you aren’t, then your competitors are!

You may have to fight your own imposter syndrome, but becoming the go-to person for everyone’s questions about machine learning is a fantastic way to teach yourself the essentials. You’ll have to say “Good question, let me go figure that out” a lot at first, but every expert I know does the same! Even if you don’t end up building anything in production, at least you’ll be able to point at relevant research and experiments if you decide to change to a new position.

Enter Competitions

I have been a massive fan of Kaggle since it got off the ground. If your job’s not offering you the opportunities in machine learning you want, then joining that community is a great way to teach yourself a lot of practical skills. If you look through the forums, a lot of the contestants will describe exactly how they solved old competitions, so I would recommend following a few of their recipes to get started. Once you’re able to do that, pick a new contest that’s similar to one of those, and start playing around with all of the different options to see how you can improve the results. Most of machine learning is the software equivalent of banging on the side of the TV set until it works, so don’t be discouraged if you have trouble seeing an underlying theory behind all your tweaking!

Find a Community

As I mentioned above, the most frustrating thing about machine learning is how arbitrary it all is. I’m lucky enough to be at a large company surrounded by people I can talk to about things like why my model isn’t learning, but most engineers don’t have that luxury. That’s another advantage of Kaggle: from what I’ve seen, their forums offer a lot of support and encouragement. I would also look out for real-world meetups where you can swap stories and commiserate. If you can’t find something related to your field, try starting a mailing list or group yourself, or propose a session at a conference.

There is a long tradition of mentorship in machine learning, especially around deep learning, but I think we should be doing a much better job of capturing all that oral tradition. As someone who was recently an outsider myself, I want to see the field democratized. I think the reliance on word-of-mouth is more about poor written communication than anything inherent in the subject.

Write Documentation

On that topic, my TensorFlow for Poets post came out of work I was doing to help myself understand how to reliably retrain the top layer of a deep network. I didn’t know how before I started, but by carefully documenting the process and making sure I could reproduce it consistently, I learned a lot about how it all works. I also got a lot of helpful feedback as I shared drafts of the guide with colleagues.
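For anyone wondering what "retraining the top layer" means in practice, here is a minimal sketch in plain Python. It is not the code from the TensorFlow for Poets post. It assumes a toy setup where a fixed random projection with a ReLU stands in for the pretrained layers, and every name in it (`frozen_features`, `retrain_top_layer`, `mean_loss`) is hypothetical. The idea it illustrates is that the base weights are never touched, and only a small softmax layer on top is trained with gradient descent.

```python
import math
import random


def frozen_features(inputs, base_weights):
    """Stand-in for the pretrained layers: fixed weights plus ReLU, never updated."""
    return [
        [max(0.0, sum(x * w for x, w in zip(row, column))) for column in base_weights]
        for row in inputs
    ]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(feature_row, top_weights, top_bias):
    logits = [
        sum(f * w for f, w in zip(feature_row, class_weights)) + b
        for class_weights, b in zip(top_weights, top_bias)
    ]
    return softmax(logits)


def mean_loss(features, labels, top_weights, top_bias):
    """Average cross-entropy of the top layer on the given examples."""
    total = 0.0
    for row, label in zip(features, labels):
        total -= math.log(predict(row, top_weights, top_bias)[label] + 1e-12)
    return total / len(labels)


def retrain_top_layer(features, labels, num_classes, steps=200, learning_rate=0.1):
    """Train only the new top layer with full-batch gradient descent."""
    num_features = len(features[0])
    top_weights = [[0.0] * num_features for _ in range(num_classes)]
    top_bias = [0.0] * num_classes
    for _ in range(steps):
        grad_w = [[0.0] * num_features for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for row, label in zip(features, labels):
            probs = predict(row, top_weights, top_bias)
            for c in range(num_classes):
                error = probs[c] - (1.0 if c == label else 0.0)
                grad_b[c] += error
                for j, f in enumerate(row):
                    grad_w[c][j] += error * f
        scale = learning_rate / len(labels)
        for c in range(num_classes):
            top_bias[c] -= scale * grad_b[c]
            for j in range(num_features):
                top_weights[c][j] -= scale * grad_w[c][j]
    return top_weights, top_bias


# Toy data (an assumption for illustration, not a real dataset).
rng = random.Random(0)
inputs = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(60)]
base_weights = [[rng.gauss(0, 0.5) for _ in range(4)] for _ in range(6)]
features = frozen_features(inputs, base_weights)
labels = [0 if row[0] > row[1] else 1 for row in features]

top_weights, top_bias = retrain_top_layer(features, labels, num_classes=2)
print("loss before:", mean_loss(features, labels, [[0.0] * 6, [0.0] * 6], [0.0, 0.0]))
print("loss after: ", mean_loss(features, labels, top_weights, top_bias))
```

Because the expensive layers stay frozen, their outputs can be computed once and reused for every training step, which is a large part of why this kind of retraining is cheap enough to experiment with.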

One interesting thing about human nature is that people are a lot more willing to correct somebody else’s mistaken ideas than they are to propose their own. As long as you’re happy to keep eating humble pie, that means writing up your own tentative understanding and getting it reviewed is a much more effective way of getting others to share their knowledge than asking flat out! That’s another reason I try to write documentation: purely for the corrections.

Don’ts

Unless you’re doing a degree at a recognized university, I personally don’t recommend going for a credential in machine learning. I do love courses like the Udacity Deep Learning program, but for the content, not as a résumé builder. Having practical experience, even just on competitions like Kaggle, will be a lot more helpful in interviews.

As an engineer, I also find many machine learning research papers hard to get much benefit from. They tend to assume a lot of prior knowledge from the academic world, and prefer presenting their ideas in math rather than code. They can be useful once you’re experienced, but don’t worry if you’re left baffled by them at first.

Anyway, I hope some of these ideas are useful. Definitely read them with a skeptical eye, since nobody really knows anything in this field, and I’ll be interested to hear what other suggestions people have!

Nano-computers are coming!

Photo by Steve Jurvetson

A few days ago I got an email from a journalist asking about the Starshot project. Of course he was looking for my much more famous namesake Pete Worden, but I’ve been fascinated by the effort too. Its whole foundation is that we’ll soon be able to miniaturize space probes down to a few grams and have them function on tiny amounts of power. Over the past few years I’ve come to realize that’s the future of computing.

Imagine having a self-contained system that costs a few cents, is only a couple of millimeters wide, and has its own battery, processor, and basic CCD image sensor. Using modern deep learning techniques, you could train it to recognize crop pests or diseases on leaves and then scatter a few thousand across a field. Or sprinkle them through a jungle to help spot endangered wildlife. They could be spread over our bridges to spot corrosion before it gets started, or for any of the Semantic Sensor uses I’ve talked about before.

I know how useful these systems will be once they exist, but there are some major engineering challenges to solve before we get there. That’s why I’m excited to be going to the Embedded Vision Summit in a couple of weeks. Jeff Bier has gathered together a fantastic group of developers and industry leaders who are working on making this future happen. We’ll also have a strong presence from the TensorFlow team, to show how important embedded devices are to us. Jeff Dean will be keynoting and I’ll be discussing the nitty-gritty of using the framework on tiny devices.

If you’re intrigued by the idea of these “nano-computers” and want to find out more (or, even better, if you’re already working on them, like several folks I know!), I highly recommend joining me at the Summit in Santa Clara, May 2nd to 4th.