RAM Overflow and X-CUBE-AI Quantization Analysis Error with ONNX Model on STM32F401RE

SR1218
Associate II

Hello everyone,

I'm working on deploying a PyTorch model to an STM32F401RE NUCLEO board and running into memory and quantization issues that I hope the community can help me resolve.

Project Context

My project involves running a custom PyTorch model (converted to ONNX format) on the STM32F401RE NUCLEO board. The system already has USB Host (Audio Class) library and FreeRTOS integrated as essential components for my application, which means I need to work within the remaining available memory space.

Development Environment

  • Board: STM32F401RE NUCLEO (96KB SRAM)
  • IDE: STM32CubeIDE 1.18.1
  • X-CUBE-AI: 10.2.0
  • Additional Libraries: USB Host (Audio Class), FreeRTOS
  • Model: PyTorch → ONNX converted

Problem 1: RAM Overflow During Build

When I configure X-CUBE-AI with compression set to high and optimization set to ram, the build process fails with a linker error:

C:/ST/STM32CubeIDE_1.18.1/STM32CubeIDE/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.13.3.rel1.win32_1.0.0.202411081344/tools/bin/../lib/gcc/arm-none-eabi/13.3.1/../../../../arm-none-eabi/bin/ld.exe: Xcube.elf section `.bss' will not fit in region `RAM'

C:/ST/STM32CubeIDE_1.18.1/STM32CubeIDE/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.13.3.rel1.win32_1.0.0.202411081344/tools/bin/../lib/gcc/arm-none-eabi/13.3.1/../../../../arm-none-eabi/bin/ld.exe: region `RAM' overflowed by 10800 bytes

Given that the STM32F401RE has only 96KB of SRAM and I already have USB Host and FreeRTOS consuming memory, this overflow isn't entirely surprising.
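For context, here is a rough back-of-the-envelope SRAM budget. The FreeRTOS heap, USB Host buffer, and stack sizes below are illustrative assumptions, not measured values from my project; only the 96 KB total and the 10800-byte overflow come from the board spec and the linker output:

```python
# Rough SRAM budget sketch for the STM32F401RE (96 KB total).
# FreeRTOS heap, USB buffer, and stack sizes are assumed values.
SRAM_TOTAL       = 96 * 1024   # STM32F401RE SRAM
FREERTOS_HEAP    = 15 * 1024   # assumed configTOTAL_HEAP_SIZE
USB_HOST_BUFFERS = 4 * 1024    # assumed audio-class buffers
APP_STACKS_BSS   = 8 * 1024    # assumed stacks + other .bss/.data
OVERFLOW         = 10800       # reported by the linker

available_for_ai = SRAM_TOTAL - FREERTOS_HEAP - USB_HOST_BUFFERS - APP_STACKS_BSS
print(f"left for AI activations/buffers: {available_for_ai} bytes")
# The network currently needs at least this much more to link:
print(f"minimum needed: {available_for_ai + OVERFLOW} bytes")
```

Under these assumed numbers the network's working set would have to shrink by a bit over 10 KB to link, which is why I'm pursuing quantization.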

Problem 2: X-CUBE-AI Quantization Analysis Failure

To address the memory constraints, I attempted INT8 quantization using ONNX Runtime. Here's the quantization code I used:

import numpy as np
import onnxruntime
from onnxruntime.quantization import (
    CalibrationDataReader,
    QuantFormat,
    QuantType,
    quantize_static,
)

class MyCalibrationDataReader(CalibrationDataReader):
    def __init__(self, data, model_path):
        self.enum_data = None
        self.data = data

        # Use an inference session to get the input name and shape.
        session = onnxruntime.InferenceSession(model_path, None)
        batch_size, channel, length = session.get_inputs()[0].shape
        self.input_name = session.get_inputs()[0].name
        self.datasize = len(data)

    def get_next(self):
        if self.enum_data is None:
            self.enum_data = iter([
                {self.input_name: sample[np.newaxis, np.newaxis, :].astype(np.float32)}  # (2048,) → (1, 1, 2048)
                for sample in self.data
            ])
        return next(self.enum_data, None)

    def rewind(self):
        self.enum_data = None  # Reset the enumeration of calibration data

dr = MyCalibrationDataReader(cali_data, model_fp32_prep)

quantize_static(
    model_fp32_prep,
    model_quant,
    dr,
    quant_format=QuantFormat.QDQ,
    per_channel=True,
    weight_type=QuantType.QInt8,
    activation_type=QuantType.QInt8,
    reduce_range=True,
    extra_options={'WeightSymmetric': True, 'ActivationSymmetric': False},
)
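For anyone checking my settings: my understanding is that the per-channel symmetric weight scheme I selected (weight_type=QuantType.QInt8, per_channel=True, WeightSymmetric=True, reduce_range=True) corresponds roughly to the numpy arithmetic below. This is only a sketch of the math as I understand it, not ONNX Runtime's actual implementation:

```python
import numpy as np

def quantize_weights_per_channel(w, reduce_range=True):
    """Symmetric per-channel int8 quantization (channel = axis 0).

    Sketch of the selected scheme; with reduce_range=True the int8
    range is narrowed to 7 bits for better kernel compatibility.
    """
    qmax = 63 if reduce_range else 127
    # One scale per output channel; symmetric -> zero_point = 0.
    max_abs = np.max(np.abs(w), axis=tuple(range(1, w.ndim)), keepdims=True)
    scale = np.maximum(max_abs, 1e-12) / qmax  # guard all-zero channels
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

w = np.random.randn(8, 1, 16).astype(np.float32)  # e.g. a Conv1d weight
q, scale = quantize_weights_per_channel(w)
dequant = q.astype(np.float32) * scale
print("max abs round-trip error:", np.max(np.abs(w - dequant)))
```

The round-trip error stays within half a quantization step per channel, so the scheme itself looks sound to me; the analyzer failure below is what I can't explain.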

However, when I try to analyze the quantized model with X-CUBE-AI, I encounter this error:

Analyzing model C:/Users/user/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/10.2.0/Utilities/windows/stedgeai.exe analyze --target stm32f4 --name network -m C:/Users/user/Downloads/FRFconv-TDS_onnx.quant.onnx --compression high --verbosity 1 --no-inputs-allocation -O ram --no-outputs-allocation --memory-pool C:\Users\user\AppData\Local\Temp\mxAI_workspace1712638268650010180404439894557726\mempools.json --workspace C:/Users/user/AppData/Local/Temp/mxAI_workspace1712638268650010180404439894557726 --output C:/Users/user/.stm32cubemx/network_output 

ST Edge AI Core v2.2.0-20266 2adc00962 
INTERNAL ERROR: 'NoneType' object is not subscriptable

Problem 3: Same Error on ST Edge AI Developer Cloud

I also tried using ST Edge AI Developer Cloud for quantization, but encountered the same issue:

>>> stedgeai analyze --model FRFconv-TDS_onnx_PerTensor_quant_random_2.onnx --optimization ram --target stm32f4 --name network --workspace workspace --output output 

ST Edge AI Core v2.2.0-20266 2adc00962 
INTERNAL ERROR: 'NoneType' object is not subscriptable

My Questions

I'm quite attached to my current model architecture as it's specifically designed for my application requirements, so I'd prefer not to change the model structure if possible.

  1. Memory Optimization: Has anyone successfully deployed AI models on STM32F401RE with other libraries like USB Host and FreeRTOS running simultaneously? Are there additional memory optimization techniques beyond X-CUBE-AI's high compression and RAM optimization that I could try?

  2. Quantization Error: Have you encountered the 'NoneType' error when analyzing quantized ONNX models in X-CUBE-AI? This seems to occur both locally and on the cloud platform. Could this be a compatibility issue with my quantization approach or the ONNX model format?

  3. Alternative Approaches: Are there other strategies to make my model fit within the available memory constraints without modifying the model architecture?

Additional Information Available

I have the following resources available if they would help with troubleshooting:

  • Original PyTorch model code
  • ONNX conversion and quantization scripts
  • Original ONNX model file (before quantization)

Please let me know if you need any additional information to help diagnose these issues.

Thank you in advance for your assistance!
