
Issue Running a GRU/LSTM Model on STM32 with Neural-ART

Dresult
Associate III

Hello everyone,

I’m trying to run a model on an STM32 MCU with Neural-ART using the ST Edge AI Developer Cloud.

The final model includes GRU layers, but when I attempt to quantize and then optimize it, I encounter issues. To debug this, I created a minimal test model, which is as follows:

import torch
import torch.nn as nn

class GRUTEST(nn.Module):
    def __init__(self, hidden_layer_size, n_layers, output_size, dropout, device):
        super(GRUTEST, self).__init__()
        self.device = device
        self.hidden_layer_size = hidden_layer_size
        self.n_layers = n_layers
        self.rnn = nn.GRU(256, hidden_layer_size, n_layers, batch_first=True,
                          dropout=dropout, bidirectional=False)
        # 64 matches the hidden_layer_size used when the model is instantiated below
        self.fc = nn.Linear(64, output_size)

    def forward(self, x):
        x, _ = self.rnn(x)        # x: (batch, seq_len, hidden_layer_size)
        x = self.fc(x[:, -1, :])  # keep only the last timestep
        return x

I then convert this model to ONNX as follows:

torch_model = GRUTEST(
    hidden_layer_size=64,
    n_layers=2,
    output_size=1,
    dropout=0.2,
    device='cpu'
)
torch_model.eval()
torch.onnx.export(
    torch_model,
    torch.randn(1, 1, 256),
    "modelGRU.onnx",
    opset_version=15,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
    export_params=True,
    keep_initializers_as_inputs=False
)
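
As a sanity check, the exported graph can be verified outside the ST tools with ONNX Runtime. This is a minimal sketch, assuming onnxruntime is installed; the random input is just a placeholder:

import numpy as np
import onnxruntime as ort

# Run the exported model once to confirm the graph itself is valid.
sess = ort.InferenceSession("modelGRU.onnx")
out = sess.run(["output"], {"input": np.random.randn(1, 1, 256).astype(np.float32)})
print(out[0].shape)  # expected: (1, 1), i.e. (batch_size, output_size)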

I successfully perform per-channel quantization, but as soon as I reach the optimization step (with default optimization options), I get the following error—both with the quantized and non-quantized versions of the model:

TOOL ERROR: operands could not be broadcast together with shapes (256,) (128,).
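
For reference, the per-channel quantization step that does succeed can be reproduced offline with ONNX Runtime's static quantization. This is only a sketch, assuming onnxruntime is installed; the random calibration reader is a hypothetical stand-in for real calibration data:

import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantType, quantize_static

class RandomCalibrationReader(CalibrationDataReader):
    """Feeds a few random samples as calibration data (placeholder only)."""
    def __init__(self, n_samples=16):
        self._data = iter(
            {"input": np.random.randn(1, 1, 256).astype(np.float32)}
            for _ in range(n_samples)
        )

    def get_next(self):
        return next(self._data, None)

# Per-channel static quantization of the exported model.
quantize_static(
    "modelGRU.onnx",
    "modelGRU_int8.onnx",
    RandomCalibrationReader(),
    per_channel=True,
    weight_type=QuantType.QInt8,
)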

If I replace the GRU with an LSTM model:

class LSTMTEST(nn.Module):
    def __init__(self, hidden_layer_size, n_layers, output_size, dropout, device):
        super(LSTMTEST, self).__init__()
        self.device = device
        self.hidden_layer_size = hidden_layer_size
        self.n_layers = n_layers
        self.rnn = nn.LSTM(256, hidden_layer_size, n_layers, batch_first=True,
                           dropout=dropout, bidirectional=False)
        self.fc = nn.Linear(64, output_size)

    def forward(self, x):
        batch_size = x.size(0)
        h_0 = torch.zeros(self.n_layers, batch_size, self.hidden_layer_size, device=x.device)
        c_0 = torch.zeros(self.n_layers, batch_size, self.hidden_layer_size, device=x.device)
        x, _ = self.rnn(x, (h_0, c_0))
        x = self.fc(x[:, -1, :])
        return x

Even though I explicitly define h_0 and c_0, I still get the following error:

NOT IMPLEMENTED: Sixth input (initial_h) of LSTM _rnn_LSTM_output_0_forward is not constant or constant propagation was not able to compute it
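
One way to make the initial states fold to constants is to register them as fixed-shape buffers instead of building them from x.size(0) in forward, so the exporter emits them as ONNX initializers. This is only a sketch (the class name is hypothetical), assuming the batch size is fixed to 1 at export time so the dynamic batch axis is dropped, and it only addresses this particular constant-propagation message:

class LSTMConstInit(nn.Module):
    def __init__(self, hidden_layer_size, n_layers, output_size, dropout):
        super().__init__()
        self.rnn = nn.LSTM(256, hidden_layer_size, n_layers, batch_first=True,
                           dropout=dropout, bidirectional=False)
        self.fc = nn.Linear(hidden_layer_size, output_size)
        # Buffers have a fixed shape, so they export as constant initializers.
        self.register_buffer("h_0", torch.zeros(n_layers, 1, hidden_layer_size))
        self.register_buffer("c_0", torch.zeros(n_layers, 1, hidden_layer_size))

    def forward(self, x):
        x, _ = self.rnn(x, (self.h_0, self.c_0))
        return self.fc(x[:, -1, :])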

Would anyone be able to point out what I’m doing wrong?

Thanks in advance!

2 Replies
Julian E.
ST Employee

Hello @Dresult,


Neural-ART does not support GRU or LSTM layers. You can find the list of supported operators here:

https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_operator_support.html 


Have a good day,

Julian


Dresult
Associate III

Hello @Julian E.,

Thanks for your answer. I hadn't checked the page dedicated to the NPU; now it makes sense.

However, I am also trying to deploy the model on plain STM32 MCUs and MPUs, but when it comes to the optimization step, the process hangs. I tried with both ST Edge AI Core 2.0 and STM32Cube.AI 9.0. Do GRU and LSTM layers remain unsupported on these platforms as well?