2025-04-01 12:35 PM
Hi,
I'm trying to test a model on the STM32N6 but stedgeai tool fails to analyse the model:
ST Edge AI Core v2.0.0-20049
TOOL ERROR: list index out of range
I'm using X-CUBE-AI 10.0.0, STM32CubeN6 v1.1.0 and STM32CubeIDE 1.18.0.
The model I'm trying to use is in attachment.
I've also tried with STM32Cube.AI MCU Runtime mode but the error is the same.
How can I fix the issue?
Thanks,
Alexis Murzeau
2025-04-17 2:41 AM
Hello @AMurz.1,
Sorry for the late answer.
I have tried multiple things, but I could not solve your issue.
The GRU layers are certainly the cause of the error, but it is a bug in the stedgeai core.
GRUs and LSTMs are poorly supported as of today.
Here are the steps I followed, in case they help you with future work:
In your case, here is what I did:
import onnx
from onnx import shape_inference

def fix_batch_size_to_one(model_path, output_path):
    # Load model
    model = onnx.load(model_path)
    graph = model.graph

    def fix_tensor_shape(tensor):
        # Only touch tensors that declare at least one dimension
        if tensor.type.tensor_type.shape.dim:
            # Replace the symbolic batch dimension with a fixed size of 1
            tensor.type.tensor_type.shape.dim[0].dim_value = 1
            # tensor.type.tensor_type.shape.dim[0].dim_param = ""

    # Fix inputs and outputs
    for input_tensor in graph.input:
        fix_tensor_shape(input_tensor)
    for output_tensor in graph.output:
        fix_tensor_shape(output_tensor)

    # Save intermediate model
    onnx.save(model, output_path)
    print(f"Saved model with fixed batch size to {output_path}")
    return output_path

def infer_shapes(model_path, inferred_path):
    # Run ONNX shape inference
    model = onnx.load(model_path)
    inferred_model = shape_inference.infer_shapes(model)
    onnx.save(inferred_model, inferred_path)
    print(f"Inferred shapes saved to {inferred_path}")
    return inferred_model

def print_value_info(model):
    print("\n== Intermediate Value Info ==")
    for vi in model.graph.value_info:
        # Show "?" for dimensions that are still symbolic or unknown
        shape = [
            d.dim_value if d.HasField("dim_value") else "?"
            for d in vi.type.tensor_type.shape.dim
        ]
        print(f"{vi.name}: {shape}")

# === Usage ===
fixed_model_path = "model_fixed.onnx"
inferred_model_path = "model_inferred.onnx"

# Step 1: Fix batch size
fix_batch_size_to_one("denoiser_GRU_dns.onnx", fixed_model_path)

# Step 2: Infer shapes
inferred_model = infer_shapes(fixed_model_path, inferred_model_path)

# Step 3: Print intermediate shapes
print_value_info(inferred_model)
Then I ran onnx-simplifier from the command line:
!onnxsim model_fixed.onnx denoiser_GRU_dns_output.onnx --no-large-tensor
I still get an error: ValueError: operands could not be broadcast together with shapes (1024,) (512,)
It most likely comes from the shapes of your GRUs.
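For reference, the ValueError above is the usual NumPy-style broadcasting check failing: two shapes can only be combined elementwise if, axis by axis from the right, the sizes match or one of them is 1. A minimal pure-Python sketch of that rule follows; the helper name `can_broadcast` is hypothetical and is not part of NumPy or stedgeai.

```python
from itertools import zip_longest

def can_broadcast(shape_a, shape_b):
    """Return True if two shapes are compatible under NumPy-style broadcasting."""
    # Compare dimensions starting from the trailing axis; missing axes count as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True

print(can_broadcast((1024,), (512,)))  # False: the pair reported in the error
print(can_broadcast((1024,), (1,)))    # True: a size-1 axis can be stretched
```

So the tool is trying to combine a 1024-element vector with a 512-element one, which points at a size mismatch between two tensors that it expects to have the same length.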
Have a good day,
Julian
2025-04-22 2:11 PM
Hi,
Thanks for your reply.
_DEBUG=2 is very useful :)
I've also tried this, which further reduces the graph (I think it preprocesses the weight data like onnx simplifier does):
from onnxruntime.quantization import preprocess
preprocess.quant_pre_process("denoiser_GRU_dns_nobatchsize.onnx", "denoiser_GRU_dns_nobatchsize_preprocess.onnx")
And I get the same error: "ValueError: operands could not be broadcast together with shapes (1024,) (512,)".
But I see that the previous Transpose node seems to be misinterpreted:
Computing all activation shapes of node_32 (Transpose)
In shapes [(BATCH: 1, CH: 257, H: 1)] - In values shape [None]
Resetting shape of node_32
Computing remapping starting from (2, 0, 1)
Computed remapping is
BATCH <- H
CH <- BATCH
H <- CH
BATCH <- H
CH <- BATCH
H <- CH
Found new output shapes: [(BATCH: 1, CH: 257, H: 1)]
The output shape should be [1, 1, 257] instead of [1, 257, 1].
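To double-check the expected shape: in ONNX, output axis i of a Transpose takes its size from input axis perm[i]. A small sketch of that rule (the helper name `transposed_shape` is made up for this illustration):

```python
def transposed_shape(shape, perm):
    """Output shape of an ONNX Transpose: output axis i takes input axis perm[i]."""
    if sorted(perm) != list(range(len(shape))):
        raise ValueError(f"perm {perm} is not a permutation of the axes of {shape}")
    return [shape[axis] for axis in perm]

# Input shape and perm taken from the debug log above
print(transposed_shape([1, 257, 1], [2, 0, 1]))  # [1, 1, 257]

# Same perm on a shape with distinct sizes, to make the axis movement visible
print(transposed_shape([2, 3, 4], [2, 0, 1]))    # [4, 2, 3]
```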
But the error is the same at the GRU node. Maybe it expects the same weight dimensions as for an LSTM (which expects [num_directions, 4*hidden_size, input_size] for weights), whereas a GRU needs 3*hidden_size instead of 4.
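For comparison, here is a sketch of the tensor shapes the ONNX operator specification defines for the W, R and B inputs of LSTM and GRU nodes. The helper name `expected_rnn_shapes` is hypothetical. hidden_size=256 is taken from the LSTM node of the alternative model and is only assumed to apply to the GRU model as well; input_size=257 is a placeholder, not a value read from either model.

```python
def expected_rnn_shapes(op_type, hidden_size, input_size, num_directions=1):
    """Shapes of the W, R and B inputs per the ONNX LSTM/GRU operator definitions."""
    # An LSTM has 4 gates (i, o, f, c); a GRU has 3 (z, r, h).
    gates = {"LSTM": 4, "GRU": 3}[op_type]
    return {
        "W": [num_directions, gates * hidden_size, input_size],
        "R": [num_directions, gates * hidden_size, hidden_size],
        # B concatenates the input bias and the recurrent bias.
        "B": [num_directions, 2 * gates * hidden_size],
    }

# hidden_size=256 is assumed for the GRU model; input_size=257 is a placeholder
for op in ("LSTM", "GRU"):
    print(op, expected_rnn_shapes(op, hidden_size=256, input_size=257))
```

Comparing these against the initializers actually attached to the GRU node is one way to see whether the model or the importer has the unexpected sizes.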
There is an alternative model using LSTM instead here: https://github.com/GreenWaves-Technologies/tiny_denoiser_v2/blob/public/model/denoiser_LSTM_Valetini.onnx
I got further, but I still get blocked at some point.
The Transpose adjustment is definitely needed to get further.
This is what I needed to change using onnx2py:
--- a/denoiser_LSTM_Valetini.py
+++ b/denoiser_LSTM_Valetini.py
@@ -41,8 +41,8 @@ model = helper.make_model(
producer_version='1.5',
graph=make_graph(
name='torch-jit-export',
- inputs=[helper.make_tensor_value_info('input', TensorProto.FLOAT, shape=['batch_size', 257, 1])],
- outputs=[helper.make_tensor_value_info('output', TensorProto.FLOAT, shape=['batch_size', 257, 1])],
+ inputs=[helper.make_tensor_value_info('input', TensorProto.FLOAT, shape=[1, 257, 1])],
+ outputs=[helper.make_tensor_value_info('output', TensorProto.FLOAT, shape=[1, 257, 1])],
initializer=[
numpy_helper.from_array(np.load(os.path.join(DATA_DIR, 'const0_enhance.bias_hh_l0.npy')).astype('float32').reshape([1024]), name='enhance.bias_hh_l0'),
numpy_helper.from_array(np.load(os.path.join(DATA_DIR, 'const1_enhance.bias_hh_l1.npy')).astype('float32').reshape([1024]), name='enhance.bias_hh_l1'),
@@ -84,7 +84,7 @@ model = helper.make_model(
momentum=0.8999999761581421,
),
make_node('Relu', inputs=['30'], outputs=['31'], name='Relu_2'),
- make_node('Transpose', inputs=['31'], outputs=['32'], name='Transpose_3', perm=[2, 0, 1]),
+ make_node('Transpose', inputs=['31'], outputs=['32'], name='Transpose_3', perm=[0, 2, 1]),
make_node('Shape', inputs=['32'], outputs=['33'], name='Shape_4'),
make_node('Constant', inputs=[], outputs=['34'], name='Constant_5', value=numpy_helper.from_array(np.array(1, dtype='int64'), name='')),
make_node('Gather', inputs=['33', '34'], outputs=['35'], name='Gather_6', axis=0),
@@ -227,7 +227,7 @@ model = helper.make_model(
make_node('Slice', inputs=['42', '173', '174', '172'], outputs=['175'], name='Slice_143'),
make_node('LSTM', inputs=['111', '165', '166', '167', '', '171', '175'], outputs=['176', '177', '178'], name='LSTM_144', hidden_size=256),
make_node('Squeeze', inputs=['176'], outputs=['179'], name='Squeeze_145', axes=[1]),
- make_node('Transpose', inputs=['179'], outputs=['180'], name='Transpose_146', perm=[1, 2, 0]),
+ make_node('Transpose', inputs=['179'], outputs=['180'], name='Transpose_146', perm=[0, 2, 1]),
make_node('Conv', inputs=['180', 'fc1.weight', 'fc1.bias'], outputs=['181'], name='Conv_147', dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]),
make_node(
'BatchNormalization',
This makes stedgeai use the correct shapes for Transpose operations.
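As a sanity check on the two perm changes in the diff, the sketch below compares the output shapes of the original and replacement perms once the batch size is fixed to 1. The input shapes are assumptions: [1, 257, 1] is borrowed from the Transpose log of the GRU model, and [1, 1, 256] assumes the squeezed LSTM output is [seq_length, batch_size, hidden_size] with a sequence length and batch size of 1. `transposed_shape` is a helper written for this example.

```python
def transposed_shape(shape, perm):
    """Output shape of an ONNX Transpose: output axis i takes input axis perm[i]."""
    return [shape[axis] for axis in perm]

conv_out = [1, 257, 1]  # assumed [batch, channels, time] before Transpose_3
lstm_out = [1, 1, 256]  # assumed [seq, batch, hidden] before Transpose_146

# Transpose_3: original perm vs. replacement perm
print(transposed_shape(conv_out, [2, 0, 1]), transposed_shape(conv_out, [0, 2, 1]))

# Transpose_146: original perm vs. replacement perm
print(transposed_shape(lstm_out, [1, 2, 0]), transposed_shape(lstm_out, [0, 2, 1]))
```

Under these assumptions each pair of perms gives the same output shape, and since only size-1 axes change position the element order should be unchanged too, so the edit should not alter what the model computes for batch size 1.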
Then stedgeai stops later with this error:
[...]
Generating ONNX nodes
layer name: Input_0, type: Input
layer name: input_0_reshaped, type: Reshape
Adding as input Input_0
layer name: node_30, type: Conv2D
Adding as input input_0_reshaped
layer name: node_30_0_reshaped, type: Reshape
Adding as input node_30
layer name: node_31, type: Nonlinearity
Adding as input node_30_0_reshaped
layer name: transpose_0_out, type: Transpose
Adding as input node_31
ONNX output shape map is [BATCH, CH, H]
Remapping is {BATCH: CH, CH: BATCH, H: H}
Computing perm attribute
For BATCH set CH which is 1
For CH set BATCH which is 0
For H set H which is 2
Perm attribute is [1, 0, 2]
layer name: node_108_forward, type: LSTM
Adding as input transpose_0_out
Error in execution of pass type(ONNX_AND_JSON_EXPORTER) id(74)
To increase its debug level: "export _DEBUG_CLASSES=ONNX_AND_JSON_EXPORTER"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "build/scripts/st_ai_cli/st_ai.py", line 304, in ild.scripts.st_ai_cli.st_ai.main
File "build/scripts/st_ai_cli/st_ai.py", line 164, in ild.scripts.st_ai_cli.st_ai.launch_cli
File "build/scripts/st_ai_cli/st_ai.py", line 148, in ild.scripts.st_ai_cli.st_ai.launch_cli
File "build/scripts/st_ai_cli/st_ai.py", line 128, in ild.scripts.st_ai_cli.st_ai.cmd_launch
File "build/scripts/st_ai_cli/st_ai.py", line 129, in ild.scripts.st_ai_cli.st_ai.cmd_launch
File "build/scripts/st_ai_cli/st_ai.py", line 70, in ild.scripts.st_ai_cli.st_ai.cmd_generate
File "build/passes/ai_pass_factory.py", line 138, in ild.passes.ai_pass_factory.AIPassFactory.exec
File "build/passes/ai_pass_factory.py", line 133, in ild.passes.ai_pass_factory.AIPassFactory.exec
File "build/passes/ai_pass.py", line 408, in ild.passes.ai_pass.AIPass.exec
File "build/passes/middleend/onnx_and_json_exporter.py", line 1767, in ild.passes.middleend.onnx_and_json_exporter.OnnxAndJsonExporter._exec
File "build/irs/objects/layers/lstm.py", line 649, in ild.irs.objects.layers.lstm.Lstm.get_onnx_tensor_node
AssertionError: Export of LSTM node_108_forward with grouped weights is not supported
I've tried various modifications to the model, but I always get that error, and I don't understand what "grouped weights" means.
Do you think something can be done on the model to fix or work around this error?
I'm attaching the resulting model and python script that gives me this error.
Usage: python denoiser_LSTM_Valetini.py denoiser_LSTM_Valetini_nobatchsize.onnx
This will also generate denoiser_LSTM_Valetini_nobatchsize_preprocess.onnx, which is the simplified version with intermediate tensor shapes.
Regards,
Alexis Murzeau