For the past few months I’ve been working with Zain Asgar and Keyi Zhang on EE292D, Machine Learning on Embedded Systems, at Stanford. We’re hoping to open source all the materials after the course is done, but I’ve been including some of the lectures I’m leading as part of my TinyML YouTube series. Since I’ve talked a lot about quantization on this blog over the years, I thought it would be worth including the latest episode here too.
It’s over an hour long, mostly because quantization is still evolving so fast, and I’ve included a couple of Colabs you can play with to back up the lesson. The first lets you load a pretrained Inception v3 model and inspect the weights, and the second shows how you can load a TensorFlow Lite model file, modify the weights, save it out again and check the accuracy, so you can see for yourself how quantization affects the overall results.
The slides themselves are available too, and this is one area where I go into more depth on the screencast than I do in the TinyML book, since that has more of a focus on concrete exercises. I’m working with some other academics to see if we can come up with a shared syllabus around embedded ML, so if you’re trying to put together something similar for undergraduates or graduates at your college, please do get in touch. The TensorFlow team even has grants available to help with the expenses of machine learning courses, especially for traditionally overlooked students.