Problem with LSTM from PyTorch
2024-02-15 8:46 AM - edited 2024-02-15 8:47 AM
Hi,
I am trying to deploy a network based on an LSTM created with PyTorch. The model is analyzed correctly, but validation on desktop gives the following error:
LOAD ERROR: exception: access violation reading 0x0000000000000004
On the MCU, a hard fault is raised at inference.
I noticed that the error happens because I am accessing the 'hidden_state' of the first LSTM layer to feed it to the second one.
Any ideas about why this is happening?
This is a simple version of the model that reproduces the problem:
import torch
from torch import nn

class LSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(LSTMModel, self).__init__()
        self.lstm1 = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.lstm2 = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Tapping the hidden state of the first layer and feeding it
        # to the second one is what triggers the error.
        out, (hidden_state, _) = self.lstm1(x)
        out, _ = self.lstm2(hidden_state)
        out = self.linear(out[:, -1, :])
        return out

if __name__ == '__main__':
    model = LSTMModel(14, 8, 14)
    input_data = torch.randn(1, 1, 14)
    _ = model(input_data)
    output_file = 'test-model.onnx'
    model.eval()
    dummy_input = torch.randn(1, 1, 14)
    torch.onnx.export(
        model,
        dummy_input,
        output_file,
        verbose=False)
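For comparison, the conventional way to stack two LSTM layers is to chain them through the full output sequence of the first layer rather than through its hidden state. A sketch of that variant is below; it avoids the hidden-state access entirely, so it may also avoid the crash, though I have not confirmed this against the tool:

import torch
from torch import nn

class StackedLSTMModel(nn.Module):
    # Same layers, but the second LSTM consumes the output sequence
    # of the first instead of its hidden state.
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.lstm1 = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.lstm2 = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.lstm1(x)    # out: (batch, timesteps, hidden_size)
        out, _ = self.lstm2(out)  # chained through the sequence output
        return self.linear(out[:, -1, :])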
Labels: STM32 ML & AI, STM32CubeAI
2024-07-17 9:13 AM
Here is an extract of the documentation that explains how to use a stateful LSTM on the target; a sketch of the two routines follows the extract.
LSTM
Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence (batch=1, timesteps, features)
- category: recurrent layer
- input data types: float32
- output data types: float32
Specific constraints/recommendations:
- stateless and stateful (batch=1 only) mode support
- in stateful mode the user is requested to define two C routines to allocate and deallocate internal layer state.
- initial state must be provided as part of the allocation routine implementation
- the two functions to implement are:
void _allocate_lstm_states(ai_float **states, ai_u32 size_in_bytes)
void _deallocate_lstm_states(ai_float **states)
- fused activation: gelu, linear, relu, relu_n1_to_1, leaky_relu, relu6, elu, selu, sigmoid, hard_sigmoid, hard_swish, exponential, tanh, softmax, softplus, softsign
- fused recurrent activation: gelu, linear, relu, relu_n1_to_1, leaky_relu, relu6, elu, selu, sigmoid, hard_sigmoid, hard_swish, exponential, tanh, softmax, softplus, softsign
- return_state: not supported
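As a minimal sketch of what an implementation of those two routines could look like, assuming a heap is available and a zero-filled initial state is wanted (the ai_float and ai_u32 types come from the ai_platform.h header shipped with the runtime):

#include <stdlib.h>
#include <string.h>
#include "ai_platform.h"  /* ai_float, ai_u32 */

/* Sketch only: heap-backed state buffer with a zero initial state.
   Replace malloc/free with a static buffer on targets without a heap. */
void _allocate_lstm_states(ai_float **states, ai_u32 size_in_bytes)
{
    *states = (ai_float *)malloc(size_in_bytes);
    if (*states != NULL) {
        /* The initial state must be provided here: zeros in this sketch. */
        memset(*states, 0, size_in_bytes);
    }
}

void _deallocate_lstm_states(ai_float **states)
{
    if (*states != NULL) {
        free(*states);
        *states = NULL;
    }
}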
To give better visibility to answered topics, please click 'Accept as Solution' on the reply that solved your issue or answered your question.
