Transformer model inference on the STM32MP257F using its NPU

ramkumarkoppu · ‎2025-08-25

Is the NPU of the STM32MP257F capable of running inference for Transformer models, such as small VLMs? Which frameworks are supported, for example ExecuTorch or ONNX Runtime?

Pwxn · ‎2025-08-26

Hello @ramkumarkoppu

I recommend you have a look at the X-LINUX-AI expansion package.

On MPUs, ST provides the STAI_mpu API, which unifies several frameworks behind one interface: ONNX, TFLite™, and Network Binary Graph (NBG).
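
To illustrate what "unified" means in practice, here is a minimal sketch of picking a model format from the file name before handing the model to a runtime. It is not the STAI_mpu API itself: `MODEL_FORMATS` and `detect_model_format` are hypothetical names, and the extension mapping (in particular `.nb` for NBG) is an assumption you should check against the wiki.

```python
from pathlib import Path

# Hypothetical mapping for illustration only: file extension -> model format.
# The ".nb" extension for NBG is an assumption, not taken from this thread.
MODEL_FORMATS = {
    ".onnx": "ONNX",
    ".tflite": "TFLite",
    ".nb": "NBG",
}


def detect_model_format(model_path: str) -> str:
    """Return the model format implied by the file extension.

    Raises ValueError when the extension is not one of the three formats
    mentioned above.
    """
    suffix = Path(model_path).suffix.lower()
    if suffix not in MODEL_FORMATS:
        raise ValueError(f"Unsupported model file: {model_path!r}")
    return MODEL_FORMATS[suffix]


if __name__ == "__main__":
    for name in ("vit_tiny.onnx", "mobilenet.tflite", "mobilenet.nb"):
        print(name, "->", detect_model_format(name))
```

A unified API does this kind of dispatch for you, so the application code stays the same whichever of the three formats the model is exported to.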

In the wiki you will find various references, from implementation to benchmarking:

https://wiki.st.com/stm32mpu/wiki/Category:X-LINUX-AI_expansion_package

Regards,