2024-11-04 05:54 AM
Hi community members,
I’m currently working on a project where I want to deploy a Convolutional Recurrent Neural Network (CRNN) on an STM32 MCU using X-Cube-AI. The network consists of convolutional layers followed by GRU layers, with a permute and reshape in between to convert the 4D convolutional output into the 3D input tensor the recurrent layer expects. I first implemented this in PyTorch (exported to ONNX) and then tried TensorFlow Keras (exported to TFLite). Unfortunately, I’ve run into errors with both approaches when importing the models into X-Cube-AI.
When importing the ONNX model from PyTorch, I received the following error message:
TOOL ERROR: Shape and shape map lengths must be the same: [192] vs. (CH_IN, CH)
I built the same architecture using TensorFlow Keras to see if this would work, then exported it as a TFLite model. However, importing this TFLite model into X-Cube-AI produced the following error:
INTERNAL ERROR: Inconsistent in/out tensor shapes in transpose_6_output: (H: 41, W: 4, CH: 64) and (H: 1, CH: 256)
Both errors seem to be related to incorrect reshaping of dimensions. The GRU in the ONNX model has an input size of 256 and a hidden_size of 64, and is bidirectional. So the 192 comes from 3 * hidden_size (the three stacked GRU gates), which might give a clue as to where and why this problem occurs. In the TFLite model, the dimensions seem to be mixed up. The GRU layer isn't supported in TFLite, so I had to unroll it. I also have stateful=False and return_sequences=True, FYI.
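For reference, the 3 * hidden_size stacking is visible directly in the ONNX GRU weight shapes. A quick inspection sketch (the file name and initializer names are illustrative and depend on the export):

import onnx

# Print the shapes of all weight initializers in the exported model
m = onnx.load("crnn.onnx")
for init in m.graph.initializer:
    print(init.name, list(init.dims))

# Per the ONNX GRU spec, with hidden_size=64 and bidirectional=True:
#   W: [num_directions, 3*hidden_size, input_size]  = [2, 192, 256]
#   R: [num_directions, 3*hidden_size, hidden_size] = [2, 192, 64]
#   B: [num_directions, 6*hidden_size]              = [2, 384]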
Since both errors seem related to the reshaping, I should explain this part in more detail.
Both models take a spectrogram input: of size (1, 1, 64, 41) in PyTorch, where the dimensions are (Batch, Channels, Height, Width), and of size (1, 64, 41, 1) in TensorFlow with its channels-last convention (Batch, Height, Width, Channels). After the convolutional part there are 64 channels, the height is reduced from 64 to 4, and the width stays the same at 41. The GRU takes a tensor of shape (batch, timesteps, features), so I permute the dimensions into the order (batch, timesteps, height, channels), using the width axis as the timesteps, and then merge the height with the channels to obtain a (1, 41, 256) input for the GRU layer.
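In PyTorch, this step looks roughly as follows (a minimal sketch with a dummy tensor; the actual module code may differ):

import torch

# Dummy conv output: (batch, channels, height, width) = (1, 64, 4, 41)
x = torch.randn(1, 64, 4, 41)
# Reorder to (batch, width, height, channels) = (1, 41, 4, 64),
# using the width axis as the timesteps
x = x.permute(0, 3, 2, 1)
# Merge height and channels into one feature axis: (1, 41, 4 * 64) = (1, 41, 256)
x = x.reshape(1, 41, 256)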
I've attached the summaries of both models, as well as the model files for reproducibility!
I really appreciate any help or suggestions from the community, especially if you’ve faced similar issues or have experience with deploying CRNN models on STM32.
Thank you!
Have a wonderful day if you're reading this!
2024-11-07 05:45 AM
Hello @Tsu ,
Sadly, GRU layers (even one-dimensional ones) are not supported.
Your error, even if the message is not very clear, is probably due to this.
You can try to decompose the recurrent layer into its constituent operations (e.g., for an LSTM: Dense layers, Tanh, Sigmoid, etc.) for a single timestep; a rough GRU version is sketched below.
Here is an example for an LSTM that could help you (see my last response in that thread): https://community.st.com/t5/edge-ai/e010-invalidmodelerror-couldn-t-load-keras-model/td-p/738099
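For a GRU, a single timestep could be rebuilt from elementary layers roughly like this (a sketch of the classic GRU equations; Keras's default reset_after=True variant applies the reset gate slightly differently, and the sizes simply match the model in this thread):

import tensorflow as tf
from tensorflow.keras import layers

units, features = 64, 256  # hidden size and per-timestep feature size

x_t = layers.Input(shape=(features,), batch_size=1)  # input at timestep t
h_prev = layers.Input(shape=(units,), batch_size=1)  # previous hidden state

xh = layers.Concatenate()([x_t, h_prev])
z = layers.Dense(units, activation='sigmoid')(xh)  # update gate
r = layers.Dense(units, activation='sigmoid')(xh)  # reset gate
h_cand = layers.Dense(units, activation='tanh')(
    layers.Concatenate()([x_t, layers.Multiply()([r, h_prev])]))  # candidate state

# h_t = z * h_prev + (1 - z) * h_cand, written as h_cand + z * (h_prev - h_cand)
# so that only Multiply/Add/Subtract layers are needed
h_t = layers.Add()([h_cand,
                    layers.Multiply()([z, layers.Subtract()([h_prev, h_cand])])])

cell = tf.keras.Model([x_t, h_prev], h_t)
cell.summary()

The host application (or an outer loop over timesteps) then calls this cell once per timestep, feeding the output state back in as h_prev.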
Have a good day,
Julian
2024-11-08 06:05 AM
Hi @Julian E. ,
Thank you very much for your answer!
I must have misunderstood the documentation then. Would you mind clearing up my confusion as to why (1) lists GRU/LSTM as supported operations, and (2) says there is initial support for stateful LSTM/GRUs? The documentation files are found at:
(1) STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/9.1.0/Documentation/supported_ops_onnx.html
(2) STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/9.1.0/Documentation/keras_lstm_stateful.html
I've chosen a GRU for the recurrent part because it is smaller than an LSTM, which matters since quantization is not supported (yet) for either layer. But is one of the two preferred over the other in Cube AI? I might switch if one is less buggy.
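For a rough sense of the size difference, here is some illustrative arithmetic for the dimensions in this thread (features = 256, units = 64; Keras's reset_after variant adds a second recurrent bias, ignored here):

features, units = 256, 64
gru_params  = 3 * (units * (features + units) + units)  # 3 gates -> 61,632
lstm_params = 4 * (units * (features + units) + units)  # 4 gates -> 82,176

So the GRU is about 25% smaller.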
If there's no simpler solution, then I'll try to write the custom layer. Appreciate the help!
2024-11-08 07:02 AM - edited 2024-11-08 07:19 AM
Hello @Tsu ,
Your TFLite model should be usable in the next version of ST Edge AI (or Cube AI in CubeMX), so not 9.1.0 but the one after (which should be released soon).
For the ONNX model, however, I also get the shape error; I will investigate with our experts.
Based on your screenshot, I coded your model in TensorFlow (regular Keras, not TFLite):
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Permute, Reshape, GRU, Dense, Activation
# Define the model
model = tf.keras.Sequential()
# Input layer
model.add(tf.keras.layers.Input(shape=(64, 41, 1), batch_size=1))
# First Conv2D layer
model.add(Conv2D(16, (3, 3), padding='same', activation='relu'))
# First MaxPooling2D layer
model.add(MaxPooling2D((4, 1), padding='same'))
# Second Conv2D layer
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
# Second MaxPooling2D layer
model.add(MaxPooling2D((2, 1), padding='same'))
# Third Conv2D layer
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
# Third MaxPooling2D layer
model.add(MaxPooling2D((2, 1), padding='same'))
# Permute layer: (4, 41, 64) -> (41, 4, 64), i.e. (timesteps, height, channels)
model.add(Permute((2, 1, 3)))
# Reshape layer: merge height and channels -> (41, 256)
model.add(Reshape((41, 256)))
# GRU layer
model.add(GRU(64, return_sequences=True, stateful=True))
# Dense layer
model.add(Dense(1))
# Activation layer
model.add(Activation('sigmoid'))
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Save the model as an H5 file
model.save('model.h5')
# Summary of the model
model.summary()
(As explained in the documentation, you need to define a batch size of 1 and set stateful=True in the GRU.)
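A quick usage sketch with dummy data: since the model is stateful, the GRU keeps its hidden state across calls, so reset it between independent spectrograms.

import numpy as np

x = np.random.rand(1, 64, 41, 1).astype(np.float32)  # one dummy spectrogram
y = model.predict(x)  # output shape (1, 41, 1): one probability per timestep
model.reset_states()  # clear the GRU hidden state before the next sample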
When I uploaded it to the ST Edge AI Developer Cloud, I could not quantize it, but the optimization pass worked and I succeeded in benchmarking it on a real board.
So you should be able to use it as of now, non-quantized, if you want; it may be useful to you.
Have a good day,
Julian
2024-11-11 03:23 AM
Hi @Julian E. ,
That's really amazing to hear! It's so incredibly helpful to be able to experiment with the h5 file for now and wait for the next update to use the tflite format!
Thank you so much for the code example and good news Julian! You really made my week!
Kind regards,
Tsu