2025-10-24 2:05 AM
I’m currently working with the STM32N6570-DK board and ST Edge AI Core v2.2.0.
I would like to know whether a lightweight Transformer model (for example, a small Vision Transformer or a minimal Transformer encoder) can be successfully converted and executed on the STM32N6. I verified that batch_matmul, transpose, etc. are listed as supported operators here: https://stm32ai-cs.st.com/assets/embedded-docs/stneuralart_operator_support.html
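For context on why those two operators matter, here is a minimal numpy sketch (not the poster's model) of the single-head attention computation at the core of any Transformer encoder; it is exactly the batch-matmul + transpose + softmax pattern that the Neural-ART compiler has to map:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head attention: the batch_matmul + transpose + softmax
    pattern a Transformer encoder relies on."""
    d = q.shape[-1]
    # batch_matmul(Q, transpose(K)) -> attention scores
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    # numerically stable softmax over the last axis
    scores -= scores.max(axis=-1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    # batch_matmul(weights, V) -> attended output
    return w @ v

# Toy shapes: batch=1, tokens=16, embedding dim=32 (illustrative only).
q = k = v = np.random.rand(1, 16, 32).astype(np.float32)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (1, 16, 32)
```

Every multi-head attention layer repeats this block per head, so a converter that cannot lower Transpose or batched MatMul will fail on any ViT variant.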
After converting the lightweight ViT model to TFLite or ONNX, I tried to convert it with stedgeai and STM32CubeIDE, but the conversion fails with only the following error messages.
Code, model file, and error text: https://github.com/minchoCoin/lightweight_vit
These models were successfully analyzed with the STM32Cube.AI MCU runtime in STM32CubeIDE.
The following error messages appear when I try to convert them (TFLite and ONNX) with the STM32Cube.AI Neural-ART runtime in STM32CubeIDE or with stedgeai on the command line:
1. tflite
stedgeai generate --model custom_vit_int8.tflite --target stm32n6 --st-neural-art default@user_neuralart_STM32N6570-DK.json
ST Edge AI Core v2.2.0-20266 2adc00962
WARNING: nl_8 is not quantized
...
STEDGEAI_BuildAtonnExe_Win/git/onnx_backend/platform_passes/transform_gemm_fc_into_conv.cc:203: runTransform: Assertion `(b_shape.size() == 1) || ((b_shape[0].dim == M) || b_shape[0].dim == N)` failed.
Warning: Missing Quantization info for Pow_35_exp; will consider Pow_35_exp as a native Float
...
Warning: Lowering of node=Transpose_52 kind=Transpose not yet supported. the generated code will not compile
terminate called after throwing an instance of 'std::runtime_error'
what(): SW mapping failed:
Node Transpose_88 not mapped
Internal compiler error (signo=6), please report it
2. onnx
ST Edge AI Core v2.2.0-20266 2adc00962
INTERNAL ERROR: Exported ONNX could be malformed since ONNX shape inference fails
thank you for your support in advance!
2025-10-24 5:50 AM
Hello @mincho00,
You may try the --use-onnx-simplifier option with your ONNX model (it may also help with the TFLite model, since it can affect the intermediary ONNX model that is generated).
Something like:
stedgeai generate --model custom_vit_int8.tflite --target stm32n6 --st-neural-art default@user_neuralart_STM32N6570-DK.json --use-onnx-simplifier
If it does not work, please upload your model in a .zip if you can share it. It will be useful to the dev team for future updates. Thanks!
Have a good day,
Julian
2025-10-24 6:39 AM
Dear [Recipient],
Thank you for your reply.
Unfortunately, the --use-onnx-simplifier option did not resolve the issue.
I’ve uploaded the Colab notebook (.ipynb, runnable on Colab) and the model files for reference.
(I’ve been successfully using the STM32N6 Neural-ART accelerator with CNN models, but it has been challenging to get a Transformer or multi-head attention running on the STM32N6570-DK device.)
torch==2.8.0+cu126
tensorflow==2.19.0
onnx==1.19.1
onnxruntime==1.23.2
Thank you again for your time and assistance.
Best regards,
Taehun Kim
2025-10-26 6:51 PM - edited 2025-10-26 6:52 PM
Thank you @Julian E.
Unfortunately, the '--use-onnx-simplifier' option didn’t help with converting the models.
I’ve uploaded the code, model file, and error log for your reference.
Thank you.