2025-01-30 08:26 AM
Hello everyone,
I’m trying to run a model on an STM32 MCU with Neural-ART using the ST Edge AI Developer Cloud.
The final model includes GRU layers, but when I attempt to quantize and then optimize it, I encounter issues. To debug this, I created a minimal test model, which is as follows:
import torch
import torch.nn as nn

class GRUTEST(nn.Module):
    def __init__(self, hidden_layer_size, n_layers, output_size, dropout, device):
        super(GRUTEST, self).__init__()
        self.device = device
        self.hidden_layer_size = hidden_layer_size
        self.n_layers = n_layers
        # 256 input features, unidirectional, batch-first
        self.rnn = nn.GRU(256, hidden_layer_size, n_layers, batch_first=True,
                          dropout=dropout, bidirectional=False)
        self.fc = nn.Linear(64, output_size)  # 64 == hidden_layer_size

    def forward(self, x):
        x, _ = self.rnn(x)        # default zero initial hidden state
        x = self.fc(x[:, -1, :])  # last timestep only
        return x
I then convert this model to ONNX as follows:
torch_model = GRUTEST(
    hidden_layer_size=64,
    n_layers=2,
    output_size=1,
    dropout=0.2,
    device='cpu'
)
torch_model.eval()

# Dummy input is (batch, seq_len, features) because batch_first=True
torch.onnx.export(
    torch_model,
    torch.randn(1, 1, 256),
    "modelGRU.onnx",
    opset_version=15,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
    export_params=True,
    keep_initializers_as_inputs=False
)
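For reference, the per-channel quantization can be reproduced locally with ONNX Runtime's static quantization, roughly as below; the random-data calibration reader and file names are placeholders for my actual setup (the Developer Cloud runs its own quantizer):

import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantType, quantize_static

class DummyReader(CalibrationDataReader):
    """Placeholder calibration reader feeding random samples."""
    def __init__(self, n_samples=16):
        self._samples = iter(
            [{"input": np.random.randn(1, 1, 256).astype(np.float32)}
             for _ in range(n_samples)]
        )

    def get_next(self):
        return next(self._samples, None)

quantize_static(
    "modelGRU.onnx",          # exported model from above
    "modelGRU_int8.onnx",     # quantized output
    DummyReader(),
    per_channel=True,         # per-channel weight quantization
    weight_type=QuantType.QInt8,
)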
I successfully perform per-channel quantization, but as soon as I reach the optimization step (with default optimization options), I get the following error—both with the quantized and non-quantized versions of the model:
TOOL ERROR: operands could not be broadcast together with shapes (256,) (128,).
If I replace the GRU with an LSTM:
class LSTMTEST(nn.Module):
    def __init__(self, hidden_layer_size, n_layers, output_size, dropout, device):
        super(LSTMTEST, self).__init__()
        self.device = device
        self.hidden_layer_size = hidden_layer_size
        self.n_layers = n_layers
        self.rnn = nn.LSTM(256, hidden_layer_size, n_layers, batch_first=True,
                           dropout=dropout, bidirectional=False)
        self.fc = nn.Linear(64, output_size)  # 64 == hidden_layer_size

    def forward(self, x):
        batch_size = x.size(0)
        # Explicit zero initial states, so the exporter sees them defined
        h_0 = torch.zeros(self.n_layers, batch_size, self.hidden_layer_size, device=x.device)
        c_0 = torch.zeros(self.n_layers, batch_size, self.hidden_layer_size, device=x.device)
        x, _ = self.rnn(x, (h_0, c_0))
        x = self.fc(x[:, -1, :])
        return x
Even though I explicitly define h_0 and c_0, I still get the following error:
NOT IMPLEMENTED: Sixth input (initial_h) of LSTM _rnn_LSTM_output_0_forward is not constant or constant propagation was not able to compute it
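For what it's worth, I also sketched a variant where the initial states are registered as module buffers, so they should be exported as constant initializers rather than traced tensors; note this pins the batch dimension of the states to 1, and I don't know whether the tool's constant propagation would accept it:

class LSTMBUF(nn.Module):
    """Hypothetical variant: initial states stored as buffers so that
    torch.onnx.export emits them as constant initializers."""
    def __init__(self, hidden_layer_size, n_layers, output_size, dropout):
        super(LSTMBUF, self).__init__()
        self.rnn = nn.LSTM(256, hidden_layer_size, n_layers,
                           batch_first=True, dropout=dropout)
        self.fc = nn.Linear(hidden_layer_size, output_size)
        # Buffers become ONNX initializers; batch size is fixed to 1 here
        self.register_buffer("h_0", torch.zeros(n_layers, 1, hidden_layer_size))
        self.register_buffer("c_0", torch.zeros(n_layers, 1, hidden_layer_size))

    def forward(self, x):
        x, _ = self.rnn(x, (self.h_0, self.c_0))
        return self.fc(x[:, -1, :])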
Would anyone be able to point out what I’m doing wrong?
Thanks in advance!
2025-01-31 06:59 AM
Hello @Dresult,
Neural-ART does not support GRU or LSTM layers. You can find the list of supported operators here:
https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_operator_support.html
Have a good day,
Julian
2025-02-03 03:27 AM
Hello @Julian E.
Thanks for your answer, I hadn't checked the page dedicated to the NPU. Now it makes sense.
However, I am also trying to deploy the model on STM32 MCUs and MPUs, but the process hangs at the optimization step. I tried with both ST Edge AI Core 2.0 and STM32Cube.AI 9.0. Do GRU and LSTM layers remain unsupported even on these platforms?