When I launched the Speech Commands dataset last year I wasn’t quite sure what to expect, but I’ve been very happy to see all the creative ways people have used it, like guiding embedded optimizations or testing new model architectures. The best part has been all the conversations I’ve ended up having because of it, and how much I’ve learned about the area of microcontroller machine learning from other people in the field.
Having a lot of eyes on the data (especially through the Kaggle competition) gave me a lot more insight into how to improve its quality, and there’s been a steady stream of volunteers donating their voices to expand the number of utterances. I also had a lot of requests for a paper giving more details on the dataset, especially covering how it was collected and what the best approaches to benchmarking accuracy are. With all of that in mind, I spent the past few weeks gathering the voice data that had been donated recently, improving the labeling process, and documenting it all in much more depth. I’m pleased to say that the resulting paper is now up on arXiv, and you can download the expanded and improved archive of over one hundred thousand utterances. The folder layout is still compatible with the first version, so to run the example training script from the tutorial, you can just execute:
python tensorflow/examples/speech_commands/train.py \
  --data_url=http://download.tensorflow.org/data/speech_commands_v0.02.tar.gz
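Since the archive keeps the same folder layout as the first version (one directory per word, each holding WAV utterances), a quick way to sanity-check a download is to count the clips per word. Here's a minimal sketch; `count_utterances` and the local `data_dir` path are illustrative names, not part of the official tooling:

```python
import os

def count_utterances(data_dir):
    """Count WAV files in each word folder of an extracted
    Speech Commands archive (one directory per word)."""
    counts = {}
    for word in sorted(os.listdir(data_dir)):
        word_path = os.path.join(data_dir, word)
        if os.path.isdir(word_path):
            counts[word] = sum(
                1 for f in os.listdir(word_path) if f.endswith(".wav")
            )
    return counts

# Example: count_utterances("speech_commands_v0.02") would map each
# word folder (like "yes" or "no") to its number of recordings.
```

Folders such as `_background_noise_` will show up in the counts too, so filter them out if you only want the spoken words.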
I’m looking forward to hearing more about how you’re using the dataset, and continuing the conversations it has already sparked, so I hope you have as much fun with it as I have!