
MatMul operation is mapped to the MCU target and converted to a Convolution

mincho00
Associate II

Hello. I am currently trying to run a MatMul operation on the NPU. I implemented a simple TFLite model containing a single MatMul operation, as shown in the image below.

mincho00_3-1765548820534.png

When converting the model with ST Edge AI, I observed that the MatMul operator is mapped to the MCU instead of the NPU and is converted into a Convolution. Furthermore, inspecting the generated network.c file shows that it becomes a convolution layer with an extremely large stride.

mincho00_1-1765548640104.png

mincho00_4-1765548858606.png

 

https://stm32ai-cs.st.com/assets/embedded-docs/stneuralart_operator_support.html states that the MatMul (batch MatMul) operator is supported on the ST Neural-ART accelerator.

How can I resolve this issue? Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
Julian E.
ST Employee

Hi @mincho00;

 

The documentation lists this requirement for MatMul:

HW: the second input must be constant; otherwise an SW fallback is used.

https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_operator_support.html

 

In your case, you use BatchMatMul. BatchMatMul and MatMul are two different operators: a batch MatMul can be seen as a concatenation of N MatMuls, where N is the batch size. That is why you end up with 4 separate convolutions. For the moment, batch MatMul has the same constraint as MatMul: the second input must be constant.
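To illustrate the decomposition: a batch MatMul over a batch of size N produces the same result as N independent plain MatMuls, one per batch entry, which is what the compiler then lowers to N convolutions. A minimal NumPy sketch (shapes N=4, M=2, K=3, P=5 are arbitrary examples, not your model's shapes):

```python
import numpy as np

# Batch MatMul: (N, M, K) x (N, K, P) -> (N, M, P)
N, M, K, P = 4, 2, 3, 5
rng = np.random.default_rng(0)
A = rng.random((N, M, K))
B = rng.random((N, K, P))

batched = np.matmul(A, B)                           # one batch MatMul
split = np.stack([A[i] @ B[i] for i in range(N)])   # N plain MatMuls

assert np.allclose(batched, split)  # identical results
```

This is why a batch size of 4 in your model shows up as 4 separate convolution layers in network.c.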

 

So you should try making the second input constant and see if that helps.
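One way to get a constant second input is to bake the weight tensor into the graph before conversion, so the TFLite converter emits a MatMul whose second operand is a constant rather than a runtime input. A minimal sketch, assuming TensorFlow 2.x (the shapes and file name are arbitrary examples, not taken from your model):

```python
import numpy as np
import tensorflow as tf

# Example weight matrix baked in as a graph constant (K=3, P=5 are arbitrary).
W = np.random.rand(3, 5).astype(np.float32)

# Single runtime input; the second MatMul operand is tf.constant(W),
# so the converter can fold it as constant weights.
inp = tf.keras.Input(shape=(4, 3), batch_size=1)
out = tf.linalg.matmul(inp, tf.constant(W))
model = tf.keras.Model(inp, out)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open("matmul_const_b.tflite", "wb") as f:
    f.write(tflite_model)
```

With the second operand constant, the operator should satisfy the HW constraint from the operator-support table instead of falling back to SW.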

 

Have a good day,

Julian


