2025-12-12 6:22 AM
Hello. I am currently trying to execute a MatMul operation on an NPU. I implemented a simple TFLite model with a MatMul operation, as shown in the image below.
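For illustration, here is a minimal sketch of how such a model can be built, assuming a tf.matmul of two runtime inputs; the shapes (a batch of 4, 16x32 by 32x8 matrices) are placeholders, not taken from the original model:

```python
import tensorflow as tf

# Hypothetical shapes: a batch of 4 matrix products, both operands runtime inputs.
class BatchMatMul(tf.Module):
    @tf.function(input_signature=[
        tf.TensorSpec([1, 4, 16, 32], tf.float32),
        tf.TensorSpec([1, 4, 32, 8], tf.float32),
    ])
    def __call__(self, a, b):
        # Both operands are runtime tensors, so the TFLite converter emits a
        # (batch) MatMul op instead of folding the second operand into weights.
        return tf.matmul(a, b)

m = BatchMatMul()
converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [m.__call__.get_concrete_function()], m)
with open("matmul.tflite", "wb") as f:
    f.write(converter.convert())
```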
When converting the model with ST Edge AI, I observed that the MatMul operator is mapped to the MCU instead of the NPU, and the operation is rewritten as a convolution. Furthermore, inspecting the generated network.c file shows that it becomes a convolution layer with an extremely large stride.
https://stm32ai-cs.st.com/assets/embedded-docs/stneuralart_operator_support.html states that the MatMul (batch MatMul) operator is supported on the ST Neural-ART accelerator.
How can I resolve this issue? Thank you.
2026-01-12 12:18 AM
Hi @mincho00,
In the documentation, the MatMul entry carries this constraint: it runs on HW only if the second input is constant; otherwise a SW fallback is considered.
https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_operator_support.html
In your case, you use BatchMatMul. BatchMatMul and MatMul are two different operators: a batch MatMul can be seen as a concatenation of N MatMuls, with N = batch. That's why you end up with 4 different convolutions, and for the moment the same constraint applies as for MatMul: the second input should be constant.
So you should try making the second input constant to see if it helps, as in the sketch below.
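A minimal sketch of that fix, under the same assumed shapes as before: the second operand is baked into the graph as a tf.constant, so after conversion it should appear as constant weights rather than a runtime input.

```python
import numpy as np
import tensorflow as tf

# Hypothetical constant weights for the second operand (assumed shapes).
RHS = tf.constant(
    np.random.default_rng(0).standard_normal((1, 4, 32, 8)).astype(np.float32))

class ConstRhsMatMul(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([1, 4, 16, 32], tf.float32)])
    def __call__(self, a):
        # Only `a` is a runtime input; the second operand is a graph constant,
        # which is the condition the documentation states for HW offload.
        return tf.matmul(a, RHS)

m = ConstRhsMatMul()
converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [m.__call__.get_concrete_function()], m)
with open("matmul_const_rhs.tflite", "wb") as f:
    f.write(converter.convert())
```

With the second input constant, the documented HW constraint should be met; you can check the generated network.c to confirm the layers are now mapped to the Neural-ART accelerator.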
Have a good day,
Julian