2025-06-23 6:16 AM
Hello everyone,
I am using X-Cube-AI 10.1.0 and ST Edge AI 2.1 to analyze my model for STM32N6, and I encountered an issue where some epochs show ?? instead of the expected results. Here's the log:
In epoch 22 and epoch 24, the result is shown as ??, and I couldn't retrieve any computation results. I have a few questions:
Does ?? indicate that some operators or operations failed to execute during these epochs? Does it mean those operators are not supported on STM32N6, or could it be due to hardware resource limitations?
If an epoch shows ??, will it affect the final recognition or inference accuracy of the model? Should I be concerned that this issue may lead to unreliable results?
The official documentation mentions that PReLU is supported on STM32N6, but after model quantization, the computation for PReLU is still executed in software rather than on the hardware. Why is that? Is it due to hardware limitations, or is STM32N6's hardware acceleration for this operator not fully optimized? Is there any other reason why PReLU still runs in software?
If these issues occur, are there any recommended optimization methods or adjustment strategies to address them and ensure that the model runs smoothly and gives accurate results? Should I consider simplifying the model or replacing PReLU with another activation function to avoid the operator being executed in software?
Thank you in advance for your help and suggestions!
Solved! Go to Solution.
2025-06-24 6:07 AM
Hello @qiqi,
Could you please share your model in a .zip file?
Concerning the PReLU, it is indeed supported. As for why it ends up in a SW epoch, it could be that the compiler decided it is faster to run it in SW. I will look at it in more detail if you share your model.
Have a good day,
Julian
2025-06-24 6:45 AM
Dear Julian,
Thank you so much for your help! I have packed the models into a .zip file and attached it for your review. The zip file contains three models: mobilefacenet.onnx, ONet.onnx, and RNet.onnx, all of which are quantized models. During the analysis, both ONet.onnx and RNet.onnx showed ?? epochs. Could you kindly take a look and help identify any issues and suggest possible solutions?
Additionally, if you don't mind, I would like to ask you one more question. The mobilefacenet.onnx feature extraction model has a relatively large number of parameters, and the analysis shows a total of 164 epochs, of which 111 are implemented in software. In empirical testing, the inference time is around 100 ms, which feels a bit long. Is there a way to move more epochs to hardware execution instead of software?
Furthermore, the model's activations are 3.062 MB, so beyond npuRAM3, npuRAM4, npuRAM5, and npuRAM6, they must also occupy some space in hyperRAM. According to the official documentation I reviewed, this might affect the inference speed. Is that the case? If so, can it be mitigated by adjusting the options in the user_neuralart.json file?
Apologies for all the questions, and I really appreciate your help in answering them and optimizing the model.
Thanks again for your support, and I look forward to your reply!
Best regards,
QiQi
2025-06-24 7:06 AM
Hello @qiqi,
Thank you for the models, I will first take a look at this ?? issue.
Regarding optimization: if the activations do not fit into internal RAM, it will indeed have a big impact on inference time. The weights live in external flash, but since each weight is read only once, when it is needed, the impact is limited. Activations, however, are read and written multiple times, so placing them in external memory causes this increase in inference time.
I will take a look with my colleague to see if we can provide you with some tips to help you.
In the meantime, you can look at this piece of information, if you have not already seen it:
https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_neural_art_compiler.html#tips-variations-around-the-basic-use-case
Have a good day,
Julian
2025-06-24 7:19 AM
Dear Julian,
Thank you for your prompt and helpful response! I will carefully review the documentation you provided and look forward to hearing from you with any tips or insights you and your colleague may have.
Thanks again for your support, and I appreciate your assistance in helping me optimize the models.
Have a great day!
Best regards,
QiQi
2025-06-25 1:10 AM - edited 2025-06-25 7:42 AM
Hello @qiqi,
I cannot reproduce the ?? issue.
Concerning the epochs that run in SW: if we take the PReLU in your RNet as an example, we can see that it uses float32, but an epoch can only be mapped to HW if the operation is both supported and in int8.
You do get warnings at the beginning of the report telling you that some nodes are not quantized. Could you please first try to quantize your model and see whether that helps?
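To illustrate the int8 requirement mentioned above, here is a minimal, pure-Python sketch of symmetric per-tensor int8 quantization. This is not the ST tooling, just the underlying arithmetic: float values are mapped to 8-bit codes in [-128, 127] with a single scale factor, which is the representation a HW epoch expects.

```python
def quantize_int8(values, scale):
    """Map floats to int8 codes: q = clamp(round(v / scale), -128, 127)."""
    q = []
    for v in values:
        code = int(round(v / scale))
        q.append(max(-128, min(127, code)))
    return q

def dequantize_int8(codes, scale):
    """Recover approximate floats: v ~= q * scale."""
    return [c * scale for c in codes]

# Toy activation values and a simple max-abs calibration of the scale.
acts = [-1.5, -0.3, 0.0, 0.7, 2.4]
scale = max(abs(v) for v in acts) / 127.0
codes = quantize_int8(acts, scale)
approx = dequantize_int8(codes, scale)
```

A real quantization flow (e.g. post-training quantization of the ONNX model) also calibrates scales per tensor from representative data, but the mapping per value is the same idea.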
Have a good day,
Julian
2025-06-25 3:43 AM
Hello,
After further testing, I found that the issue with ?? was caused by the --Oalt-sched optimization option. Once I removed this option, the problem disappeared. Do you know what might be causing this issue? Currently, I am using the following optimization options:
Do you think further optimization is needed with these settings?
Additionally, we tried quantizing the PReLU operator, but with ONNX-based quantization it seems it can only be quantized to uint8. After quantization, the analysis fails with an error, and we have not been able to resolve this. Could you provide guidance on how to handle this, or suggest another approach for quantizing PReLU?
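For reference, the difference between PReLU and a plain ReLU (the substitution considered earlier in the thread) is only in how negative inputs are treated, as this small sketch shows. Swapping the activation therefore changes the model's outputs on negative pre-activations, so the model would need retraining or at least re-validation afterwards.

```python
def prelu(x, alpha):
    """PReLU: identity for positive inputs, alpha-scaled for negative inputs."""
    return x if x > 0 else alpha * x

def relu(x):
    """ReLU: identity for positive inputs, zero for negative inputs."""
    return max(0.0, x)

xs = [-2.0, -0.5, 0.0, 1.0, 3.0]
prelu_out = [prelu(x, 0.25) for x in xs]  # 0.25 is an example learned slope
relu_out = [relu(x) for x in xs]
# The two agree for x >= 0 and differ only on negative inputs.
```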
Thank you for your assistance!
Best regards,
QiQi
2025-06-25 7:40 AM - edited 2025-06-25 7:41 AM
Hello @qiqi,
So, the PReLU being in software is a bug in the CLI front end; the operator is supported by the ATON compiler.
The bug is fixed and will be part of the next version (2.2), planned for early-to-mid July.
Concerning the ?? bug, I opened an internal ticket, and I will update you.
Until I know more, I would suggest either avoiding the option that causes the issue, or running validate on target both with and without the option to compare the results and make sure they are correct.
Have a good day,
Julian
2025-06-26 11:57 PM
Hello Julian,
I encountered a new issue while using the ST Edge AI optimization options. If you have some time, could you please take a look and help me with it? Your assistance would mean a lot to me. The issue is posted at the following link:
Thank you so much for your help!