
Batch Inference in X-CUBE-AI 9.0.0 (STM32CubeAI)

AM-adben
Associate

I am using X-CUBE-AI 9.0.0 in my project and need to run a quantized mobilenet_v2.0.35 model (.tflite). The model takes a 4*56*56*3 (batch * width * height * channel) input, but I am unable to find a way to run it with that input shape. The generated code sets the input_1 size to only 56*56*3 bytes, so I can't figure out how to provide a batched input.

I am using the STM32CubeAI runtime. I also see differences in the input/output parameters between the original model and the one optimized by X-CUBE-AI.
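For reference, here is roughly how I invoke the generated network today. This is a sketch using the default "network" template names that X-CUBE-AI generates (ai_network_create_and_init, ai_network_run, and the AI_NETWORK_* size macros); the names will differ with your model name. As you can see, the generated input buffer only has room for a single image:

#include "network.h"        /* generated by X-CUBE-AI */
#include "network_data.h"

AI_ALIGNED(32) static ai_u8 activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];
static ai_u8    in_data[AI_NETWORK_IN_1_SIZE_BYTES];  /* 56*56*3 bytes, batch of 1 only */
static ai_float out_data[AI_NETWORK_OUT_1_SIZE];      /* 1280 floats */

void ai_init_and_run_once(void)
{
  static ai_handle network = AI_HANDLE_NULL;
  const ai_handle acts[] = { activations };

  ai_error err = ai_network_create_and_init(&network, acts, NULL);
  if (err.type != AI_ERROR_NONE) return;             /* handle error */

  ai_buffer *ai_input  = ai_network_inputs_get(network, NULL);
  ai_buffer *ai_output = ai_network_outputs_get(network, NULL);
  ai_input[0].data  = AI_HANDLE_PTR(in_data);
  ai_output[0].data = AI_HANDLE_PTR(out_data);

  /* ... fill in_data with ONE uint8 56x56x3 image ... */

  ai_i32 n_batch = ai_network_run(network, ai_input, ai_output);
  /* n_batch is 1 on success; I see no way to pass 4 images at once */
}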

I'd highly appreciate any guidance on how to run inference with batches.

I am attaching a Netron screenshot of the comparison below:

[Image: Comparison between the original (left) and optimized model]

 

EDIT:

In the generated report file, I can see that a batch size of 4 is used in the layer breakdown table, but the input and output sizes don't reflect it.

 

-----------------------------------------------------------------------------------------------------------------
input 1/1          :   'serving_default_input_10', uint8(1x56x56x3), 9.19 KBytes, QLinear(0.007843138,127,uint8), user 
output 1/1         :   'conversion_68', f32(1x1280), 5.00 KBytes, user                                                 
------ -------------------------------------------- ----------------------
m_id   layer (type,original)                        oshape                
------ -------------------------------------------- ----------------------
0      serving_default_input_10 (Input, )           [b:4,h:56,w:56,c:3]   
       conversion_0 (Conversion, QUANTIZE)          [b:4,h:56,w:56,c:3]   
------ -------------------------------------------- ----------------------

 

 I have also attached the generated report text file.

1 REPLY
Julian E.
ST Employee

Hello @AM-adben ,

 

It is as you say: the generated code works with a batch size of one, but not with 4.

We have opened a ticket to investigate this matter.

 

For the moment, unfortunately, that means there is no supported way to achieve what you want.
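In the meantime, a possible workaround (just a sketch reusing the default "network" names from your snippet above, not an official recommendation) is to keep the batch-of-one network and simply call the run function once per image:

/* Sketch: emulate a batch of 4 with four single-image inferences. */
#define BATCH_SIZE 4

void run_pseudo_batch(ai_handle network, ai_buffer *ai_input, ai_buffer *ai_output,
                      const ai_u8 images[BATCH_SIZE][AI_NETWORK_IN_1_SIZE_BYTES],
                      ai_float results[BATCH_SIZE][AI_NETWORK_OUT_1_SIZE])
{
  for (int n = 0; n < BATCH_SIZE; n++) {
    ai_input[0].data  = AI_HANDLE_PTR((void *)images[n]);   /* point at image n */
    ai_output[0].data = AI_HANDLE_PTR(results[n]);          /* per-image output */
    ai_network_run(network, ai_input, ai_output);           /* one image per call */
  }
}

This should give the same results as a true batch of 4, only without any batching speed-up.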

Also, I don't know the details of your model or application, but you could try "stacking" the images along the channel axis instead of batching them. So instead of an input size of (4,56,56,3), maybe rework your model to use an input size of (1,56,56,12)?
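If you go that route, the four images would have to be interleaved channel-wise when you fill the input buffer on the device. A sketch of that packing, assuming uint8 HWC images and that the retrained model expects the 12 channels ordered image0-RGB, image1-RGB, and so on (the exact ordering must match how you retrained the model):

#include <stdint.h>

#define IMG_H 56
#define IMG_W 56
#define IMG_C 3
#define N_IMG 4

/* Pack four HxWx3 images into one HxWx12 buffer (channel stacking). */
void stack_images(const uint8_t src[N_IMG][IMG_H * IMG_W * IMG_C],
                  uint8_t dst[IMG_H * IMG_W * IMG_C * N_IMG])
{
  for (int y = 0; y < IMG_H; y++)
    for (int x = 0; x < IMG_W; x++)
      for (int n = 0; n < N_IMG; n++)
        for (int c = 0; c < IMG_C; c++)
          dst[(y * IMG_W + x) * (IMG_C * N_IMG) + n * IMG_C + c] =
              src[n][(y * IMG_W + x) * IMG_C + c];
}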

 

Have a good day

Julian


In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.