Speech Commands Recognition Model Trainer

Get your speech commands recognition model in 10 minutes.

Check this YouTube video for a step-by-step tutorial.

Step 1: determine up to 5 speech commands to be recognized.

Step 2: add 20 one-second audio samples for each speech command, and label them.

Step 3: train the model, and check training accuracy.

Step 4: add audio samples, and check predictions.

Step 5: re-train the model with more audio samples.

[Optional] Step 6: download the model, and use it on an ESP32 device.

[Optional] Step 7: re-train the model with samples captured on the device.
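The accuracy check in Steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the trainer's actual code: it assumes the model returns one probability per speech command, and the names `argmax`, `LabelledResult`, and `accuracy` are hypothetical.

```typescript
// Hypothetical shape: the label given to a sample plus the model's
// output probabilities, one per speech command (assumed layout).
interface LabelledResult {
  label: string;
  probabilities: number[];
}

// Index of the largest value in a probability vector.
function argmax(values: number[]): number {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) {
      best = i;
    }
  }
  return best;
}

// Fraction of samples whose most probable command matches the label.
function accuracy(commands: string[], results: LabelledResult[]): number {
  if (results.length === 0) {
    return 0;
  }
  let correct = 0;
  for (const result of results) {
    if (commands[argmax(result.probabilities)] === result.label) {
      correct++;
    }
  }
  return correct / results.length;
}
```

Running the same function over the samples added in Step 4 gives a rough view of how the model handles audio it was not trained on.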

Tip: over 100 audio samples are needed to train a decent model.

Tip: a low training accuracy indicates that audio samples are of poor quality or mislabelled.

Tip: be prepared before clicking the Add Audio Sample button. You have 1 second to say the speech command.
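The sample counts mentioned above (20 per speech command to start) can be checked with a small helper. This is a sketch under assumptions: the `Sample` shape and the function name `underfilledCommands` are hypothetical and not part of the trainer.

```typescript
// Hypothetical record for one labelled audio sample.
interface Sample {
  command: string;
}

// Returns the commands that have fewer than `minPerCommand` labelled samples.
function underfilledCommands(
  commands: string[],
  samples: Sample[],
  minPerCommand: number = 20
): string[] {
  const counts: Record<string, number> = {};
  for (const sample of samples) {
    counts[sample.command] = (counts[sample.command] || 0) + 1;
  }
  return commands.filter((command) => (counts[command] || 0) < minPerCommand);
}
```

An empty result means every command has reached the minimum. With 5 commands at 20 samples each, the total is 100 samples.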

Model training is based on TensorFlow.js and runs in the browser. Your audio samples are not saved to any server.

Creation Time | Audio Sample | Speech Command | Prediction | Probability | Quantized Prediction | Quantized Probability
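The Probability and Quantized Probability columns can differ slightly, because a quantized model stores values as small integers. The sketch below illustrates the effect for a single probability. It assumes an int8 output with scale 1/256 and zero point -128; the page does not state which scheme the trainer uses, and the function names are hypothetical.

```typescript
// Assumed quantization parameters (not confirmed by the page).
const OUTPUT_SCALE = 1 / 256;
const OUTPUT_ZERO_POINT = -128;

// Map a float probability in [0, 1] to an int8 value, clamped to [-128, 127].
function quantizeProbability(p: number): number {
  const q = Math.round(p / OUTPUT_SCALE) + OUTPUT_ZERO_POINT;
  return Math.max(-128, Math.min(127, q));
}

// Map an int8 value back to the float probability it represents.
function dequantizeProbability(q: number): number {
  return (q - OUTPUT_ZERO_POINT) * OUTPUT_SCALE;
}
```

Under these assumptions a probability of 1.0 is stored as 127 and reads back as 255/256, about 0.996, so small gaps between the two columns are expected. A different predicted command in the quantized column is worth a closer look.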