Some problems about using the NPU inference of the STM32N6 series

Z-YF
Associate III

Hello, we encountered the following issues when deploying a binary classification model using the LL library. Is there a solution?

1. The output of the AI model deployed according to the example (without quantization) does not match the values computed locally on the PC (before softmax).
2. We compared the model's outputs on the PC with those on the STM32N6 layer by layer and found that a mismatch before and after the reshape (in the low-level DMA-to-NPU code module) causes the final output mismatch. The results of the other layers (conv, maxpool, relu) are all correct.
3. How can we resolve this deployment issue?
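For context, the layer-by-layer comparison described in point 2 can be sketched with NumPy, assuming the intermediate outputs from the PC model and from the board have been dumped as arrays (the layer names and data here are hypothetical; the transposed "reshape" entry mimics a memory-layout mismatch like the one reported):

```python
import numpy as np

def max_abs_diff(ref, dev):
    """Largest element-wise difference between two layer outputs."""
    return float(np.max(np.abs(ref.astype(np.float64) - dev.astype(np.float64))))

# Hypothetical per-layer dumps: in practice these would be exported from the
# PC model and read back from the target (e.g. over UART or a debugger).
layers = {
    "conv1":   (np.ones((4, 4)), np.ones((4, 4))),         # matching layer
    "reshape": (np.arange(16.0).reshape(4, 4),
                np.arange(16.0).reshape(4, 4).T),          # layout mismatch
}
for name, (ref, dev) in layers.items():
    print(f"{name}: max |diff| = {max_abs_diff(ref, dev):.3f}")
```

A per-layer maximum absolute difference of exactly zero for conv/maxpool/relu but a large value around the reshape is consistent with a data-layout (not arithmetic) problem.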

Julian E.
ST Employee

Hello @Z-YF,

 

The C model and the Python model are not bit-for-bit identical, so you may see differences. The real question is whether these differences affect the performance of the model or not.

 

In the validation report, do you get a high COS (>0.99)? If so, your model should behave as expected (i.e., like the Python model). If not, it may be a bug on our side.

If you get a bad COS, please share the model with us.
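As a rough illustration, the COS metric is a cosine similarity computed between the flattened reference and on-target output tensors. A minimal sketch (this is the standard cosine-similarity formula, not necessarily the exact computation used by the ST validation tools):

```python
import numpy as np

def cos(a, b):
    """Cosine similarity between flattened reference and device outputs."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

ref = np.array([0.1, 0.9])    # e.g. logits from the Python model
dev = np.array([0.11, 0.88])  # e.g. logits read back from the target
print(cos(ref, dev))          # close to 1.0: only small numerical differences
```

A value very close to 1.0 means the two outputs point in the same direction, so small element-wise deviations are unlikely to change the classification result.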

 

Also, in case you are not aware: the NPU can only run int8 operations, so with a non-quantized model most of your epochs are probably executed in software (SW) on the CPU rather than on the NPU.

 

As explained above, there are differences between the results of the Python model and the C (MCU) model.

But note that, because the code that runs on the NPU is not the same as the code that runs on the MCU (HW vs. SW epochs), you may also see differences later, when deploying the quantized model. Again, check the COS to see whether these differences are significant or not.

 

Have a good day,

Julian


In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.