I want to create an application that trains a neural network on environmental sounds.
Can you please give me a high-level idea of how to train the neural network on any platform and generate the model?
Hi @jayz ,
What you can do is:
1) Record the sounds (voices or whatever you need). You can do this with the SensorTile (https://www.st.com/en/evaluation-tools/steval-stlkt01v1.html)
2) Label the sound data
3) Convert each sound clip into a spectrogram
4) Train a Convolutional Neural Network (CNN) to learn the mapping from spectrogram to label
5) As the real final step, use X-Cube-AI to convert your CNN into optimized STM32 code 😉
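To make step 3 a bit more concrete, here is a minimal sketch of turning a short audio clip into a log-mel spectrogram with NumPy only. Everything specific in it is an assumption for illustration and not taken from any ST tool: the function names, the 22050 Hz sample rate, the 1024-sample FFT window, the 512-sample hop and the 60 mel bands. The test tone stands in for a real recording. In practice an audio library such as librosa is commonly used for this step instead.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style mel scale
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, center, hi = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=22050, n_fft=1024, hop=512, n_mels=60):
    """Return a log-mel spectrogram in dB, shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft // 2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T     # (n_frames, n_mels)
    return 10.0 * np.log10(mel + 1e-10).T                 # small floor avoids log(0)

# Stand-in for a recorded clip: a 1 kHz tone, 16384 samples at 22050 Hz (about 0.74 s)
sr = 22050
t = np.arange(16384) / sr
clip = np.sin(2.0 * np.pi * 1000.0 * t)

spec = log_mel_spectrogram(clip, sr=sr)
print(spec.shape)  # (60, 31)
```

Each labelled clip then becomes one small 2D "image" like `spec`, which is what the CNN in step 4 is trained on.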
Check this link to learn more about what you can do with audio and AI on STM32: https://www.st.com/content/st_com/en/products/embedded-software/mcus-embedded-software/stm32-embedded-software/stm32-ode-function-pack-sw/fp-ai-sensing1.html
Hi @jayz ,
there are several existing datasets available for Environmental Sound. Some of the more established ones are:
If you search for these in the literature, you will find many proposed neural network models that can be used as a starting point. However, most of them will be too large to fit on an STM32.
I have implemented a large number of these in Keras at https://github.com/jonnor/ESC-CNN-microcontroller/tree/master/microesc/models
With an input of 60 mel bands by 31 frames, from audio sampled at 22 kHz, all of these can fit within the constraints of an STM32L476. I am currently investigating which kind of model gives the best performance.
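As a rough way to check whether a candidate model fits, you can count weights and activations by hand before training anything. The sketch below does this for a made-up toy CNN on a 60 x 31 input. The layer sizes, the 10 output classes and the float32 storage are assumptions for illustration only, not one of the models in the repository. The budget figures are the STM32L476 maximums (up to 1 MB flash, 128 KB SRAM). A real deployment also needs room for code, stack and audio buffers, so treat the result as an optimistic estimate.

```python
FLASH_BYTES = 1024 * 1024   # STM32L476: up to 1 MB flash
RAM_BYTES = 128 * 1024      # STM32L476: 128 KB SRAM
BYTES_PER_VALUE = 4         # assumption: float32 weights and activations

def conv2d_params(kernel_h, kernel_w, c_in, c_out):
    # weights plus one bias per output channel
    return (kernel_h * kernel_w * c_in + 1) * c_out

def dense_params(n_in, n_out):
    return (n_in + 1) * n_out

n_mels, n_frames = 60, 31
n_classes = 10              # assumption: depends on your labels

# Hypothetical toy network, 'same' padding for convolutions:
# conv 3x3 (8 filters) -> maxpool 2x2 -> conv 3x3 (16 filters) -> maxpool 2x2 -> dense
h1, w1 = n_mels // 2, n_frames // 2      # after first pool: 30 x 15
h2, w2 = h1 // 2, w1 // 2                # after second pool: 15 x 7
flat = h2 * w2 * 16

total_params = (conv2d_params(3, 3, 1, 8)
                + conv2d_params(3, 3, 8, 16)
                + dense_params(flat, n_classes))
weight_bytes = total_params * BYTES_PER_VALUE

# Number of values in each tensor, from input to output
activations = [n_mels * n_frames, n_mels * n_frames * 8, h1 * w1 * 8,
               h1 * w1 * 16, flat, n_classes]
# Crude peak: a layer's input and output buffers live at the same time
peak_values = max(a + b for a, b in zip(activations, activations[1:]))
peak_activation_bytes = peak_values * BYTES_PER_VALUE

print("weights:", weight_bytes, "bytes, fits flash:", weight_bytes < FLASH_BYTES)
print("activations:", peak_activation_bytes, "bytes, fits RAM:",
      peak_activation_bytes < RAM_BYTES)
```

Tools such as X-Cube-AI report the actual flash and RAM usage of a converted model, so a hand count like this is only useful as an early sanity check.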
If you decide to use some of the code in that repository, please attribute my work accordingly.
Cheers, Jon Nordby
Machine Learning Engineer & Independent Consultant