STM Model Zoo Audio Event Detection
2025-03-14 8:07 AM
I want to retrain one of the pretrained AI models (say, YAMNet) on my own data. I want to use the MIMII dataset, which consists of .wav sound files of different industrial equipment in normal and anomalous working conditions. I only want to test on valve sounds, so only two classes are needed: normal and anomalous. How do I convert the data to the ESC-50 format mentioned in the tutorial? I see that the ESC-50 dataset contains .csv files as well as .wav files.
Labels: Model Zoo
2025-03-14 8:31 AM - edited 2025-03-14 8:32 AM
Hello @jzzunn ,
Thank you for your question. Here is what you need to do:
Put all WAV files in a single folder (e.g., audio/).
Ensure each file is in a consistent format (e.g., 44.1 kHz, mono, 16-bit WAV).
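As a rough illustration of the format step, here is a minimal sketch that keeps one channel of a 16-bit PCM WAV file and writes it out as mono. It assumes the MIMII recordings are multi-channel 16-bit files, which you should check on your own copy. The function name `to_mono_16bit` and its `channel` parameter are made up for this example, and the sketch does not resample, so the sample rate stays whatever the source file uses.

```python
import array
import wave
from pathlib import Path


def to_mono_16bit(src: Path, dst: Path, channel: int = 0) -> None:
    """Copy one channel of a 16-bit PCM WAV file into a new mono WAV file.

    Hypothetical helper for illustration; it does not change the sample rate.
    """
    with wave.open(str(src), "rb") as reader:
        n_channels = reader.getnchannels()
        sample_rate = reader.getframerate()
        if reader.getsampwidth() != 2:
            raise ValueError(f"{src} is not 16-bit PCM")
        if not 0 <= channel < n_channels:
            raise ValueError(f"{src} has no channel {channel}")
        frames = reader.readframes(reader.getnframes())

    # Samples are interleaved (ch0, ch1, ..., ch0, ch1, ...), so a strided
    # slice picks out a single channel without touching the sample values.
    samples = array.array("h")
    samples.frombytes(frames)
    mono = samples[channel::n_channels]

    with wave.open(str(dst), "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(sample_rate)
        writer.writeframes(mono.tobytes())
```

Running this over every file before building meta.csv should give a consistent mono, 16-bit set; if the training configuration expects a different sample rate, a resampling step would still be needed.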
Then you need to create a meta.csv. Each row in meta.csv should contain:
- The filename (filename)
- The category name (category) → Use "normal" or "abnormal".
Example of meta.csv:
filename,category
0001.wav,normal
0002.wav,abnormal
0003.wav,normal
0004.wav,abnormal
...
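The steps above could be scripted roughly as follows. This is only a sketch: it assumes the valve data is unpacked as `<root>/id_XX/normal/*.wav` and `<root>/id_XX/abnormal/*.wav`, which is how I understand the MIMII archives to be laid out, so please check it against your copy. The function name `build_dataset` and the renaming scheme are made up for illustration. Files are renamed while copying because the same file name can appear in several subfolders. Only the two columns described above are written; whether the training scripts need any further ESC-50 columns is not covered here.

```python
import csv
import shutil
from pathlib import Path


def build_dataset(mimii_root: Path, out_dir: Path) -> int:
    """Copy WAV files into out_dir/audio and write out_dir/meta.csv.

    Assumes a layout of mimii_root/id_XX/{normal,abnormal}/*.wav.
    Returns the number of files copied.
    """
    audio_dir = out_dir / "audio"
    audio_dir.mkdir(parents=True, exist_ok=True)

    rows = []
    for wav in sorted(mimii_root.glob("id_*/*/*.wav")):
        category = wav.parent.name           # "normal" or "abnormal"
        if category not in ("normal", "abnormal"):
            continue
        machine_id = wav.parent.parent.name  # e.g. "id_00"
        # Prefix the name so files from different folders cannot collide.
        new_name = f"{machine_id}_{category}_{wav.name}"
        shutil.copy2(wav, audio_dir / new_name)
        rows.append((new_name, category))

    with open(out_dir / "meta.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "category"])
        writer.writerows(rows)
    return len(rows)
```

For example, `build_dataset(Path("valve"), Path("my_dataset"))` would leave a `my_dataset/audio/` folder and a `my_dataset/meta.csv` next to it, with both paths being placeholders for your own.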
Then to retrain the model, follow this tutorial:
Have a good day,
Julian
