How to train the neural network based on environmental sound?

jayz · ‎2019-01-10

I want to create an application that trains a neural network on environmental sound.

Could you please give me a high-level idea of how to train the neural network on any platform and generate the model?

Amel NASRI · ‎2019-01-15

Hi @jayz ,

I updated your post to give it a meaningful title and a body; the whole question cannot go in the title.
All resources related to the STM32 solutions for Artificial Neural Networks are gathered on the following page: https://www.st.com/content/st_com/en/stm32-ann.html. You need to start there and follow the 5 steps to AI.
Data selection, collection and labeling, together with the selection of the right neural network topology and its training, are prerequisite skills for using our STM32Cube.AI plug-in. If you would like more support on this, we have a dedicated AI partner program that provides this kind of engineering service. Please find the link here: https://www.st.com/content/st_com/en/partner/partner-program/partnerpage.html?key=STM32CubeAI. The list will grow to cover more regions and fields of application.
As for platforms for training neural networks, here is one well-documented example: https://keras.io/

-Amel


Romain LE DONGE · ‎2019-01-31

Hi!

Here is what you can do:

1) Record the sounds, voices or whatever you want to recognize; you can do this with the SensorTile (https://www.st.com/en/evaluation-tools/steval-stlkt01v1.html)

2) You will have to label the sound data

3) Then you have to convert the sound into a spectrogram

4) And finally you can train a Convolutional Neural Network to learn the link between your spectrogram and the label

5) Actually, the real final step: you can use X-CUBE-AI to convert your CNN into optimized STM32 code ;)
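
Steps 2 and 3 can be sketched in a few lines of Python. This is only a minimal sketch, assuming NumPy is available; the function name `log_spectrogram`, the frame sizes, the synthetic 1 kHz tone standing in for a real recording, and the label value are all illustrative and not taken from any ST example. It computes a plain log-magnitude spectrogram (not a mel spectrogram), which is the kind of 2-D array a CNN can take as input in step 4.

```python
import numpy as np


def log_spectrogram(signal, n_fft=512, hop=256):
    """Return a log-magnitude spectrogram of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Cut the signal into overlapping frames and apply the window to each one.
    frames = np.stack([
        signal[i * hop: i * hop + n_fft] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT of every frame: one row per frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # A small offset avoids log(0); transpose so rows are frequency bins.
    return np.log(magnitude + 1e-6).T


# Synthetic stand-in for a recording: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clip = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(clip)  # step 3: sound -> spectrogram
label = 0                     # step 2: hypothetical integer class id
print(spec.shape)
```

Each (spec, label) pair would then be one training example for the CNN in step 4; the frame size and hop would normally be tuned to the sounds you care about.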

Check this link to learn more about what you can do with audio and AI on STM32: https://www.st.com/content/st_com/en/products/embedded-software/mcus-embedded-software/stm32-embedded-software/stm32-ode-function-pack-sw/fp-ai-sensing1.html

Regards,

Romain

JNord · ‎2019-04-02

Hi @jayz ,

There are several existing datasets available for environmental sound. Some of the more established ones are:

UrbanSound8K: https://urbansounddataset.weebly.com/urbansound8k.html
ESC-50: https://github.com/karoldvl/ESC-50

If you search for these in the literature, you will find many proposed neural network models that can be used as a starting point. However, most of them will be too large to fit on an STM32.

I have implemented a large number of these in Keras at https://github.com/jonnor/ESC-CNN-microcontroller/tree/master/microesc/models

With an input of 60 mel bands and 31 frames at 22 kHz, all of these can fit within the constraints of an STM32L476. I am currently investigating which kind of model gives the best performance.
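
To give a rough feel for what that input size means, the arithmetic can be written out. This is a sketch under assumptions: the hop length of 512 samples and reading "22kHz" as 22050 Hz are not stated above, and the helper name `input_bytes` is hypothetical. Only the 60 x 31 input shape comes from the post.

```python
N_MELS = 60          # from the post: 60 mel bands
N_FRAMES = 31        # from the post: 31 frames per input window
SAMPLE_RATE = 22050  # assumption: "22kHz" read as 22050 Hz
HOP_LENGTH = 512     # assumption: hop length is not stated in the post


def input_bytes(n_mels, n_frames, bytes_per_value):
    """Size of one input spectrogram window in bytes."""
    return n_mels * n_frames * bytes_per_value


values = N_MELS * N_FRAMES
float32_bytes = input_bytes(N_MELS, N_FRAMES, 4)
int8_bytes = input_bytes(N_MELS, N_FRAMES, 1)
# Approximate audio duration covered by one window, given the assumed hop.
window_seconds = N_FRAMES * HOP_LENGTH / SAMPLE_RATE

print(values, float32_bytes, int8_bytes, round(window_seconds, 2))
```

The input buffer itself is small; on a microcontroller most of the memory budget is likely to go to the model's weights and intermediate activations, which is why the choice of model matters.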

If you decide to use some of this, please attribute my work accordingly.

Cheers, Jon Nordby

Machine Learning Engineer & Independent Consultant