
NanoEdge AI Studio 5.0.2 high offline accuracy; poor on STEVAL-STWINBX1

dzf
Associate II

Dear ST Support Team,

I am currently working on a binary voice classification task on the STEVAL-STWINBX1 (for example, “echo” vs “other”) using NanoEdge AI Studio 5.0.2, and I would like to ask for your advice.

My workflow

  1. Data collection

    • I use STEVAL-STWINBX1 to collect voice data.

    • Each sample is 1 second long.

    • I collected more than 200 samples per class.

    • Data acquisition firmware:

      • fp-sns-datalog2\fp-sns-datalog2\STM32CubeFunctionPack_DATALOG2_V3.1.0\Projects\STM32U585AI-STWIN.box\Applications\DATALOG2

  2. Dataset conversion

    • I use batch_to_nanoedge.bat to batch-convert the collected files into NanoEdge-compatible format.

  3. Training in NanoEdge AI Studio 5.0.2

    • I import the converted data into NanoEdge AI Studio 5.0.2

    • Perform Data Management (DM)

    • Train a classification model

    • Many generated models show accuracy above 97% in Studio

  4. Deployment to MCU
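The conversion in step 2 can be sketched as follows. This is not the actual logic of batch_to_nanoedge.bat; it is a minimal illustration of producing the one-signal-per-line CSV layout that NanoEdge AI Studio imports, assuming 1-second int16 PCM recordings at a hypothetical 16 kHz rate (file names and rate are assumptions):

```python
# Sketch (NOT the actual batch_to_nanoedge.bat logic): convert 1-second
# int16 PCM recordings into a one-signal-per-line CSV for NanoEdge AI
# Studio. The 16 kHz rate and file layout are assumptions for illustration.
import struct
from pathlib import Path

def pcm_to_nanoedge_csv(pcm_files, out_csv, samples_per_signal=16000):
    lines = []
    for f in pcm_files:
        raw = Path(f).read_bytes()
        values = struct.unpack(f"<{len(raw) // 2}h", raw)  # little-endian int16
        if len(values) < samples_per_signal:
            continue  # skip truncated recordings
        lines.append(",".join(str(v) for v in values[:samples_per_signal]))
    Path(out_csv).write_text("\n".join(lines) + "\n")
    return len(lines)  # number of signals written
```

Each output line is one complete 1-second signal, which is how Studio treats one training example.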

Problem

Although the classification accuracy in NanoEdge AI Studio is very high (often >97%), the real-time classification accuracy on the MCU is very poor.

My question / suspicion

I suspect that the data used by NanoEdge AI Studio for training/classification may not be the same representation as the raw data sent to the NanoEdge library on the MCU. For example:

  • Studio-side data may be normalized, or

  • converted using microphone sensitivity scaling,

while the MCU-side classifier may be receiving raw sensor data directly.

This possible mismatch might explain the large accuracy gap between Studio and MCU deployment.
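The effect of such a mismatch is easy to demonstrate numerically. The sketch below shows the same waveform seen as normalized floats versus raw int16 samples; the 1/32768 scaling is an assumption used purely to illustrate the size of the gap, not NanoEdge's documented behavior:

```python
# Sketch of the suspected representation mismatch: identical audio data,
# seen once as normalized floats (a common host-side convention, assumed
# here) and once as raw int16 samples, differs by ~4-5 orders of magnitude.
import numpy as np

rng = np.random.default_rng(0)
raw = (rng.standard_normal(16000) * 3000).astype(np.int16)  # PCM-like signal

studio_view = raw.astype(np.float32) / 32768.0  # normalized to [-1, 1)
mcu_view = raw.astype(np.float32)               # raw samples, unscaled

def rms(x):
    return float(np.sqrt(np.mean(x ** 2)))

# A model trained on one representation has never seen the other's range.
print(rms(studio_view), rms(mcu_view))
```

If something like this is happening, the model is effectively being asked to classify inputs from a value range it never saw during training.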

Questions

  1. Is there any issue with the workflow I am using?

  2. For voice classification on STEVAL-STWINBX1, what is the recommended way to ensure that the training data format and MCU runtime input format are strictly consistent?

  3. Does NanoEdge AI Studio expect raw sensor samples, sensitivity-scaled values, or normalized inputs for this type of workflow?

  4. Have you seen similar cases (high Studio accuracy but poor MCU accuracy), and what are the common causes / best practices to solve them?

Any guidance would be greatly appreciated.

Thank you very much for your support.

Best regards,

Julian E.
ST Employee

Hi @dzf,

 

Your workflow seems correct. I suspect overfitting or an issue in your firmware.

 

Could you please split your data into a train set and a test set (70%/30%)? Please shuffle the samples before the split.

Then run a benchmark with the train data, and use the "Validation step" to test some of the resulting libraries against the test data.
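The shuffled 70/30 split described above can be sketched like this, done per class so the benchmark file and the validation file keep the same class balance (the function name and seed are illustrative choices):

```python
# Sketch of a shuffled 70/30 split for one class of signals.
# `signals` is a list of CSV lines (one signal per line).
import random

def split_signals(signals, train_ratio=0.7, seed=42):
    shuffled = list(signals)               # copy; leave the input untouched
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]  # (train, test)

# e.g. with the 200+ samples per class mentioned above:
train, test = split_signals([f"signal_{i}" for i in range(200)])
```

Run the split once per class, feed the train halves to the benchmark, and keep the test halves strictly for the Validation step.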

 

What can happen is that the last libraries overfit the data, making them perform well on the training data but much worse on "new"/test data.

 

If you have bad validation results, this is probably an overfitting problem:

  • Try to use more data for the benchmark.
  • Check whether "worse" libraries, or libraries found earlier in the benchmark, generalize better.

If you have good validation results, then the problem is not coming from the library:

  • Make sure that the data you acquire in the firmware to run the inference is exactly the same (same format, scaling, and sampling rate) as the data you collected to train the model.
  • Make sure that your sensor is collecting data correctly.
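One quick way to check the first point: log a few inference buffers from the firmware (e.g. printed over UART) and compare their RMS to signals from the training CSV. A systematic ratio such as ~32768x points to a scaling/format mismatch rather than a model problem. A minimal sketch (the helper names and the 4x tolerance are arbitrary choices for illustration):

```python
# Sketch: compare the magnitude of a training signal against a buffer
# logged at inference time. Same order of magnitude -> plausibly the same
# representation; a huge systematic ratio -> likely a scaling mismatch.
import math

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

def same_order_of_magnitude(train_signal, runtime_signal, max_ratio=4.0):
    a, b = rms(train_signal), rms(runtime_signal)
    ratio = max(a, b) / max(min(a, b), 1e-12)  # guard against zero RMS
    return ratio <= max_ratio
```

If this check fails, fix the firmware-side conversion before revisiting the model.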

Have a good day,

Julian

