Analysing onnx model error: TOOL ERROR: list index out of range

AMurz.1 · ‎2025-04-01

Hi,

I'm trying to test a model on the STM32N6 but stedgeai tool fails to analyse the model:

ST Edge AI Core v2.0.0-20049

TOOL ERROR: list index out of range

I'm using X-CUBE-AI 10.0.0, STM32CubeN6 v1.1.0 and STM32CubeIDE 1.18.0.

The model I'm trying to use is attached.

I've also tried with STM32Cube.AI MCU Runtime mode but the error is the same.

How can I fix the issue?

Thanks,

Alexis Murzeau

Julian E. · ‎2025-04-17

Hello @AMurz.1,

Sorry for the late answer.

I have tried multiple things, but I could not solve your issue.

The GRU layers are definitely what triggers the error, but the underlying problem is a bug in the stedgeai core.

GRUs and LSTMs are poorly supported as of today.

Here are the steps I followed, in case they help with future work:

Expand the error log by running export _DEBUG=2 in a Git Bash shell before stedgeai generate.
Fixing the batch size to 1 generally helps.
Use onnx shape_inference to make sure all shapes are correct (or just to have them all displayed in Netron).
You can also try onnx simplifier (onnx-simplifier · PyPI).
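
For anyone launching the tool from a script rather than from a shell, the first tip can be sketched in Python as follows. This is only a sketch: run_with_debug is a hypothetical helper, and the child command used here is a stand-in that simply echoes the variable. Only the _DEBUG variable itself comes from the steps above.

```python
import os
import subprocess
import sys


def run_with_debug(cmd, level=2):
    """Run cmd with _DEBUG set in the child environment (hypothetical helper)."""
    # Copy the current environment so the parent process is left untouched.
    env = dict(os.environ, _DEBUG=str(level))
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Stand-in child process that just echoes the variable; replace the list with
# your usual stedgeai command line (its arguments are deliberately not shown).
echo_cmd = [sys.executable, "-c", "import os; print(os.environ['_DEBUG'])"]
result = run_with_debug(echo_cmd)
print(result.stdout.strip())
```

Setting the variable only for the child process avoids leaving verbose logging enabled for the rest of the session.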

In your case, here is what I did:

import onnx
from onnx import shape_inference

def fix_batch_size_to_one(model_path, output_path):
    # Load model
    model = onnx.load(model_path)
    graph = model.graph

    def fix_tensor_shape(tensor):
        if tensor.type.tensor_type.shape.dim:
            # Force the leading (batch) dimension to 1. Setting dim_value also
            # clears any symbolic dim_param, since both share a protobuf oneof.
            tensor.type.tensor_type.shape.dim[0].dim_value = 1

    # Fix inputs and outputs
    for input_tensor in graph.input:
        fix_tensor_shape(input_tensor)
    for output_tensor in graph.output:
        fix_tensor_shape(output_tensor)

    # Save intermediate model
    onnx.save(model, output_path)
    print(f"Saved model with fixed batch size to {output_path}")

    return output_path

def infer_shapes(model_path, inferred_path):
    # Run ONNX shape inference
    model = onnx.load(model_path)
    inferred_model = shape_inference.infer_shapes(model)
    onnx.save(inferred_model, inferred_path)
    print(f"Inferred shapes saved to {inferred_path}")
    return inferred_model

def print_value_info(model):
    print("\n== Intermediate Value Info ==")
    for vi in model.graph.value_info:
        shape = [
            d.dim_value if (d.HasField("dim_value")) else "?" 
            for d in vi.type.tensor_type.shape.dim
        ]
        print(f"{vi.name}: {shape}")

# === Usage ===
fixed_model_path = "model_fixed.onnx"
inferred_model_path = "model_inferred.onnx"

# Step 1: Fix batch size
fix_batch_size_to_one("denoiser_GRU_dns.onnx", fixed_model_path)

# Step 2: Infer shapes
inferred_model = infer_shapes(fixed_model_path, inferred_model_path)

# Step 3: Print intermediate shapes
print_value_info(inferred_model)

And onnx simplifier, run from a notebook cell (drop the leading ! in a regular shell):

!onnxsim model_fixed.onnx denoiser_GRU_dns_output.onnx --no-large-tensor

I still get an error: ValueError: operands could not be broadcast together with shapes (1024,) (512,)

It most likely comes from the shapes of your GRUs.
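
To make the error message concrete, here is a minimal pure-Python sketch of the NumPy-style broadcasting rule (two dimensions are compatible only if they are equal or one of them is 1). The function broadcast_shape is a hypothetical helper written for illustration; it is not taken from stedgeai, whose internal code is not visible here.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or raise ValueError (NumPy-style rule)."""
    result = []
    # Compare dimensions from the trailing axis backwards; missing axes count as 1.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} cannot be broadcast together")
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


print(broadcast_shape((1, 1024), (1024,)))  # compatible shapes

try:
    broadcast_shape((1024,), (512,))
except ValueError as exc:
    print("incompatible:", exc)
```

A (1024,) versus (512,) mismatch therefore means two tensors that should have the same length (for example a bias and the gate activations it is added to) were sized differently by the tool.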

Have a good day,

Julian


AMurz.1 · ‎2025-04-22

Hi,

Thanks for your reply.

_DEBUG=2 is very useful :)

I've also tried this, which further reduces the graph (I think it preprocesses the weight data like onnx simplifier does):

from onnxruntime.quantization import preprocess
preprocess.quant_pre_process("denoiser_GRU_dns_nobatchsize.onnx", "denoiser_GRU_dns_nobatchsize_preprocess.onnx")

And I get the same error: "ValueError: operands could not be broadcast together with shapes (1024,) (512,)".

But I see that the previous Transpose node seems to be misinterpreted:

Computing all activation shapes of node_32 (Transpose)
    In shapes [(BATCH: 1, CH: 257, H: 1)] - In values shape [None]
        Resetting shape of node_32
        Computing remapping starting from (2, 0, 1)
            Computed remapping is
            BATCH <- H
            CH <- BATCH
            H <- CH
        BATCH <- H
        CH <- BATCH
        H <- CH
    Found new output shapes: [(BATCH: 1, CH: 257, H: 1)]

With perm (2, 0, 1), the output shape should be [1, 1, 257] instead of [1, 257, 1].
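
The expected shape can be checked by hand with the ONNX Transpose rule, where output axis i takes its size from input axis perm[i]. A minimal sketch (transposed_shape is a hypothetical helper name; the shape and perm values are the ones in the log above):

```python
def transposed_shape(shape, perm):
    """Output axis i of an ONNX Transpose takes its size from input axis perm[i]."""
    return tuple(shape[axis] for axis in perm)


in_shape = (1, 257, 1)  # input shape reported in the log
perm = (2, 0, 1)        # remapping reported in the log
print(transposed_shape(in_shape, perm))  # (1, 1, 257)
```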

But the error at the GRU node is the same. Maybe stedgeai expects the same weight dimensions as for an LSTM (which expects [num_directions, 4*hidden_size, input_size]), whereas a GRU needs 3*hidden_size instead of 4*hidden_size.
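
For reference, the difference between the two operators' weight layouts can be written out as a small sketch based on the ONNX operator definitions. The helper expected_shapes and the hidden_size/input_size values are illustrative assumptions, not values read from the GRU model.

```python
# Number of gates stacked along the weight axis, per the ONNX operator definitions.
GATES = {"LSTM": 4, "GRU": 3}


def expected_shapes(op_type, hidden_size, input_size, num_directions=1):
    """Shapes of the W, R and B inputs for an ONNX LSTM or GRU node."""
    g = GATES[op_type]
    return {
        "W": (num_directions, g * hidden_size, input_size),
        "R": (num_directions, g * hidden_size, hidden_size),
        # B concatenates the input and recurrent biases, hence the factor 2.
        "B": (num_directions, 2 * g * hidden_size),
    }


# hidden_size and input_size are illustrative, not taken from the GRU model.
for op in ("LSTM", "GRU"):
    print(op, expected_shapes(op, hidden_size=256, input_size=257))
```

If a tool sizes its buffers with the LSTM factor while the node carries GRU-sized tensors, a length mismatch at the GRU node is the kind of symptom one would expect, though that remains a guess about stedgeai's internals.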

There is an alternative model that uses LSTM instead: https://github.com/GreenWaves-Technologies/tiny_denoiser_v2/blob/public/model/denoiser_LSTM_Valetini.onnx

I got further with it, but I still get blocked at some point.

The Transpose adjustment is definitely needed to get further.

This is what I had to change, using onnx2py:

--- a/denoiser_LSTM_Valetini.py
+++ b/denoiser_LSTM_Valetini.py
@@ -41,8 +41,8 @@ model = helper.make_model(
     producer_version='1.5',
     graph=make_graph(
         name='torch-jit-export',
-        inputs=[helper.make_tensor_value_info('input', TensorProto.FLOAT, shape=['batch_size', 257, 1])],
-        outputs=[helper.make_tensor_value_info('output', TensorProto.FLOAT, shape=['batch_size', 257, 1])],
+        inputs=[helper.make_tensor_value_info('input', TensorProto.FLOAT, shape=[1, 257, 1])],
+        outputs=[helper.make_tensor_value_info('output', TensorProto.FLOAT, shape=[1, 257, 1])],
         initializer=[
             numpy_helper.from_array(np.load(os.path.join(DATA_DIR, 'const0_enhance.bias_hh_l0.npy')).astype('float32').reshape([1024]), name='enhance.bias_hh_l0'),
             numpy_helper.from_array(np.load(os.path.join(DATA_DIR, 'const1_enhance.bias_hh_l1.npy')).astype('float32').reshape([1024]), name='enhance.bias_hh_l1'),
@@ -84,7 +84,7 @@ model = helper.make_model(
                 momentum=0.8999999761581421,
             ),
             make_node('Relu', inputs=['30'], outputs=['31'], name='Relu_2'),
-            make_node('Transpose', inputs=['31'], outputs=['32'], name='Transpose_3', perm=[2, 0, 1]),
+            make_node('Transpose', inputs=['31'], outputs=['32'], name='Transpose_3', perm=[0, 2, 1]),
             make_node('Shape', inputs=['32'], outputs=['33'], name='Shape_4'),
             make_node('Constant', inputs=[], outputs=['34'], name='Constant_5', value=numpy_helper.from_array(np.array(1, dtype='int64'), name='')),
             make_node('Gather', inputs=['33', '34'], outputs=['35'], name='Gather_6', axis=0),
@@ -227,7 +227,7 @@ model = helper.make_model(
             make_node('Slice', inputs=['42', '173', '174', '172'], outputs=['175'], name='Slice_143'),
             make_node('LSTM', inputs=['111', '165', '166', '167', '', '171', '175'], outputs=['176', '177', '178'], name='LSTM_144', hidden_size=256),
             make_node('Squeeze', inputs=['176'], outputs=['179'], name='Squeeze_145', axes=[1]),
-            make_node('Transpose', inputs=['179'], outputs=['180'], name='Transpose_146', perm=[1, 2, 0]),
+            make_node('Transpose', inputs=['179'], outputs=['180'], name='Transpose_146', perm=[0, 2, 1]),
             make_node('Conv', inputs=['180', 'fc1.weight', 'fc1.bias'], outputs=['181'], name='Conv_147', dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]),
             make_node(
                 'BatchNormalization',

This makes stedgeai use the correct shapes for Transpose operations.
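
Changing a perm attribute normally changes what a model computes, so it is worth checking that the replacement is harmless here. The sketch below compares two permutations on a given shape by looking at both the output shape and the order in which input elements are read. The helper names are hypothetical, and the shape (1, 1, 256) for the second Transpose input is an assumption (sequence length 1, batch 1, hidden_size 256).

```python
from itertools import product


def transposed_shape(shape, perm):
    return tuple(shape[axis] for axis in perm)


def source_order(shape, perm):
    """Flat input index of each output element, in row-major output order."""
    # Row-major strides of the input tensor.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out_shape = transposed_shape(shape, perm)
    order = []
    for out_index in product(*(range(n) for n in out_shape)):
        # Output axis k walks along input axis perm[k].
        order.append(sum(out_index[k] * strides[perm[k]] for k in range(len(perm))))
    return order


def perms_equivalent(shape, perm_a, perm_b):
    """True if both perms give the same output shape and element order."""
    return (transposed_shape(shape, perm_a) == transposed_shape(shape, perm_b)
            and source_order(shape, perm_a) == source_order(shape, perm_b))


print(perms_equivalent((1, 257, 1), (2, 0, 1), (0, 2, 1)))  # Transpose_3
print(perms_equivalent((1, 1, 256), (1, 2, 0), (0, 2, 1)))  # Transpose_146 (assumed shape)
print(perms_equivalent((2, 257, 1), (2, 0, 1), (0, 2, 1)))  # batch size 2
```

Under these assumptions the two permutations agree only because two of the three dimensions are 1; with a batch size or sequence length greater than 1 the patched model would no longer match the original.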

Then stedgeai stops later at this error:

[...]
      Generating ONNX nodes
         layer name: Input_0,  type: Input
         layer name: input_0_reshaped,  type: Reshape
            Adding as input Input_0
         layer name: node_30,  type: Conv2D
            Adding as input input_0_reshaped
         layer name: node_30_0_reshaped,  type: Reshape
            Adding as input node_30
         layer name: node_31,  type: Nonlinearity
            Adding as input node_30_0_reshaped
         layer name: transpose_0_out,  type: Transpose
            Adding as input node_31
            ONNX output shape map is [BATCH, CH, H]
            Remapping is {BATCH: CH, CH: BATCH, H: H}
            Computing perm attribute
               For BATCH set CH which is 1
               For CH set BATCH which is 0
               For H set H which is 2
            Perm attribute is [1, 0, 2]
         layer name: node_108_forward,  type: LSTM
            Adding as input transpose_0_out
            Error in execution of pass type(ONNX_AND_JSON_EXPORTER) id(74)
            To increase its debug level: "export _DEBUG_CLASSES=ONNX_AND_JSON_EXPORTER"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "build/scripts/st_ai_cli/st_ai.py", line 304, in ild.scripts.st_ai_cli.st_ai.main
  File "build/scripts/st_ai_cli/st_ai.py", line 164, in ild.scripts.st_ai_cli.st_ai.launch_cli
  File "build/scripts/st_ai_cli/st_ai.py", line 148, in ild.scripts.st_ai_cli.st_ai.launch_cli
  File "build/scripts/st_ai_cli/st_ai.py", line 128, in ild.scripts.st_ai_cli.st_ai.cmd_launch
  File "build/scripts/st_ai_cli/st_ai.py", line 129, in ild.scripts.st_ai_cli.st_ai.cmd_launch
  File "build/scripts/st_ai_cli/st_ai.py", line 70, in ild.scripts.st_ai_cli.st_ai.cmd_generate
  File "build/passes/ai_pass_factory.py", line 138, in ild.passes.ai_pass_factory.AIPassFactory.exec
  File "build/passes/ai_pass_factory.py", line 133, in ild.passes.ai_pass_factory.AIPassFactory.exec
  File "build/passes/ai_pass.py", line 408, in ild.passes.ai_pass.AIPass.exec
  File "build/passes/middleend/onnx_and_json_exporter.py", line 1767, in ild.passes.middleend.onnx_and_json_exporter.OnnxAndJsonExporter._exec
  File "build/irs/objects/layers/lstm.py", line 649, in ild.irs.objects.layers.lstm.Lstm.get_onnx_tensor_node
AssertionError: Export of LSTM node_108_forward with grouped weights is not supported

I've tried various modifications to the model, but I always get that error, and I don't understand what "grouped weights" means.

Do you think something can be done to the model to fix or work around this error?

I'm attaching the resulting model and the Python script that give me this error.

Usage: python denoiser_LSTM_Valetini.py denoiser_LSTM_Valetini_nobatchsize.onnx

This will also generate denoiser_LSTM_Valetini_nobatchsize_preprocess.onnx, which is the simplified version with intermediate tensor shapes.

Regards,

Alexis Murzeau