
Batch Inference in X-CUBE-AI 9.0.0 (STM32CubeAI)

AM-adben
Associate

I am using X-CUBE-AI 9.0.0 in my project and need to run a quantized mobilenet_v2.0.35 model (.tflite). The model takes a 4*56*56*3 (batch * width * height * channel) input, but I am unable to find a way to run it with that input shape. The generated code sets the input_1 size to only 56*56*3 bytes, so I can't figure out how to provide a batched input.

I am using the STM32CubeAI runtime. I also see differences in the input/output parameters between the original model and the one optimized by X-CUBE-AI.
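For reference, here is roughly how I invoke the generated network today. This is a sketch using the default "network" template names that X-CUBE-AI generates (ai_network_create_and_init, ai_network_run, and the AI_NETWORK_* size macros); the names will differ with your model name. As you can see, the generated input buffer only has room for a single image:

#include "network.h"        /* generated by X-CUBE-AI */
#include "network_data.h"

AI_ALIGNED(32) static ai_u8 activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];
static ai_u8    in_data[AI_NETWORK_IN_1_SIZE_BYTES];  /* 56*56*3 bytes, batch of 1 only */
static ai_float out_data[AI_NETWORK_OUT_1_SIZE];      /* 1280 floats */

void ai_init_and_run_once(void)
{
  static ai_handle network = AI_HANDLE_NULL;
  const ai_handle acts[] = { activations };

  ai_error err = ai_network_create_and_init(&network, acts, NULL);
  if (err.type != AI_ERROR_NONE) return;             /* handle error */

  ai_buffer *ai_input  = ai_network_inputs_get(network, NULL);
  ai_buffer *ai_output = ai_network_outputs_get(network, NULL);
  ai_input[0].data  = AI_HANDLE_PTR(in_data);
  ai_output[0].data = AI_HANDLE_PTR(out_data);

  /* ... fill in_data with ONE uint8 56x56x3 image ... */

  ai_i32 n_batch = ai_network_run(network, ai_input, ai_output);
  /* n_batch is 1 on success; I see no way to pass 4 images at once */
}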

I'd highly appreciate any guidance on how to run inference with batches.

I am attaching a Netron screenshot of the comparison below:

[Image: Comparison between the original (left) and optimized model]

 

EDIT:

In the generated report file, I can see that a batch size of 4 is used in the layer breakdown table, but the input and output sizes don't reflect it.

 

-----------------------------------------------------------------------------------------------------------------
input 1/1          :   'serving_default_input_10', uint8(1x56x56x3), 9.19 KBytes, QLinear(0.007843138,127,uint8), user 
output 1/1         :   'conversion_68', f32(1x1280), 5.00 KBytes, user                                                 
------ -------------------------------------------- ----------------------
m_id   layer (type,original)                        oshape                
------ -------------------------------------------- ----------------------
0      serving_default_input_10 (Input, )           [b:4,h:56,w:56,c:3]   
       conversion_0 (Conversion, QUANTIZE)          [b:4,h:56,w:56,c:3]   
------ -------------------------------------------- ----------------------

 

 I have also attached the generated report text file.

1 REPLY
Julian E.
ST Employee

Hello @AM-adben ,

 

It is as you say: the generated code works with a batch size of one, but not with 4.

We have opened a ticket to investigate this matter.

 

For the moment, unfortunately, that means there is no supported way to achieve what you want.
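In the meantime, a possible workaround (just a sketch reusing the default "network" names from your snippet above, not an official recommendation) is to keep the batch-of-one network and simply call the run function once per image:

/* Sketch: emulate a batch of 4 with four single-image inferences. */
#define BATCH_SIZE 4

void run_pseudo_batch(ai_handle network, ai_buffer *ai_input, ai_buffer *ai_output,
                      const ai_u8 images[BATCH_SIZE][AI_NETWORK_IN_1_SIZE_BYTES],
                      ai_float results[BATCH_SIZE][AI_NETWORK_OUT_1_SIZE])
{
  for (int n = 0; n < BATCH_SIZE; n++) {
    ai_input[0].data  = AI_HANDLE_PTR((void *)images[n]);   /* point at image n */
    ai_output[0].data = AI_HANDLE_PTR(results[n]);          /* per-image output */
    ai_network_run(network, ai_input, ai_output);           /* one image per call */
  }
}

This should give the same results as a true batch of 4, only without any batching speed-up.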

Also, I don't know the details of your model or application, but you could try "stacking" the images along the channel axis instead of batching them. So instead of an input size of (4,56,56,3), maybe rework your model to use an input size of (1,56,56,12)?
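If you go that route, the four images would have to be interleaved channel-wise when you fill the input buffer on the device. A sketch of that packing, assuming uint8 HWC images and that the retrained model expects the 12 channels ordered image0-RGB, image1-RGB, and so on (the exact ordering must match how you retrained the model):

#include <stdint.h>

#define IMG_H 56
#define IMG_W 56
#define IMG_C 3
#define N_IMG 4

/* Pack four HxWx3 images into one HxWx12 buffer (channel stacking). */
void stack_images(const uint8_t src[N_IMG][IMG_H * IMG_W * IMG_C],
                  uint8_t dst[IMG_H * IMG_W * IMG_C * N_IMG])
{
  for (int y = 0; y < IMG_H; y++)
    for (int x = 0; x < IMG_W; x++)
      for (int n = 0; n < N_IMG; n++)
        for (int c = 0; c < IMG_C; c++)
          dst[(y * IMG_W + x) * (IMG_C * N_IMG) + n * IMG_C + c] =
              src[n][(y * IMG_W + x) * IMG_C + c];
}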

 

Have a good day

Julian


In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.