STM Model Zoo Audio Event Detection

jzzunn · 2025-03-14

I want to train one of the pretrained AI models (let's say YAMNet) on my own data. I want to use the MIMII dataset, which consists of .wav recordings of different industrial equipment in normal and anomalous working conditions. I only want to test on the valve sounds, so only two classes are needed: normal and anomalous. How do I convert the data to the ESC-50 format mentioned in the tutorial? I see that the ESC-50 dataset contains .csv files as well as .wav files.

Julian E. · 2025-03-14

Hello @jzzunn,

Thank you for your question. Here is what you need to do:

Put all WAV files in a single folder (e.g., audio/).
Ensure each file is in a consistent format (e.g., 44.1 kHz, mono, 16-bit WAV).
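As a rough illustration of the format step, here is a minimal sketch that uses only the Python standard library to turn a multi-channel 16-bit WAV file into a mono one by keeping a single channel. The function name `to_mono_16bit` and the choice of channel 0 are assumptions made for this example, not part of the model zoo. The sketch does not resample, so the sample rate of the source file is kept as-is.

```python
import array
import wave


def to_mono_16bit(src, dst, channel=0):
    """Copy one channel of a 16-bit PCM WAV file into a new mono WAV file."""
    with wave.open(str(src), "rb") as reader:
        n_channels = reader.getnchannels()
        sample_width = reader.getsampwidth()
        frame_rate = reader.getframerate()
        frames = reader.readframes(reader.getnframes())

    if sample_width != 2:
        raise ValueError(f"{src}: expected 16-bit PCM, got {8 * sample_width}-bit")
    if not 0 <= channel < n_channels:
        raise ValueError(f"{src}: channel {channel} not in 0..{n_channels - 1}")

    # Samples are interleaved (ch0, ch1, ..., ch0, ch1, ...), so a strided
    # slice picks out every sample that belongs to the requested channel.
    samples = array.array("h")
    samples.frombytes(frames)
    mono = samples[channel::n_channels]

    with wave.open(str(dst), "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(frame_rate)  # sample rate is left unchanged
        writer.writeframes(mono.tobytes())
```

Calling `to_mono_16bit("in.wav", "audio/out.wav")` on each file would give a folder of mono 16-bit files. If the files need a different sample rate, a dedicated audio library or tool would be needed for the resampling.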

Then you need to create a meta.csv. Each row in meta.csv should contain:

The filename (filename)
The category name (category) → Use "normal" or "abnormal".

Example of meta.csv:
filename,category
0001.wav,normal
0002.wav,abnormal
0003.wav,normal
0004.wav,abnormal
...
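The two steps above (one flat audio folder plus a meta.csv) can be sketched in a few lines of Python. This is only a sketch under assumptions: it assumes the valve recordings are unpacked as `<root>/<machine id>/normal/*.wav` and `<root>/<machine id>/abnormal/*.wav`, and the function name `build_dataset` is made up for this example. Because files in different source folders may share the same name, each copy is prefixed with its machine id and label so nothing is overwritten in the flat folder.

```python
import csv
import shutil
from pathlib import Path

LABELS = ("normal", "abnormal")


def build_dataset(src_root, audio_dir, csv_path):
    """Copy labelled WAV files into one folder and write a filename,category CSV."""
    src_root, audio_dir = Path(src_root), Path(audio_dir)
    audio_dir.mkdir(parents=True, exist_ok=True)

    rows = []
    for wav in sorted(src_root.rglob("*.wav")):
        label = wav.parent.name  # assumed layout: .../<machine id>/<label>/<file>.wav
        if label not in LABELS:
            continue  # skip anything that is not in a normal/abnormal folder
        machine_id = wav.parent.parent.name
        new_name = f"{machine_id}_{label}_{wav.name}"  # unique name in the flat folder
        shutil.copy2(wav, audio_dir / new_name)
        rows.append((new_name, label))

    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["filename", "category"])
        writer.writerows(rows)
    return rows
```

For example, `build_dataset("valve", "dataset/audio", "dataset/meta.csv")` would fill `dataset/audio/` and write the CSV next to it. Adjust the paths and the naming scheme to whatever the training configuration expects.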

Then to retrain the model, follow this tutorial:

stm32ai-modelzoo-services/audio_event_detection/src/training/README.md at main · STMicroelectronics/stm32ai-modelzoo-services

Have a good day,

Julian
