
Issue running larger models on STM32H747I-DISCO

Dresult
Associate II

Hi everyone,

I’m having trouble generating a project for the STM32H747I-DISCO board because my AI model exceeds the 1024 KB flash limit allocated for the M7 core (it takes up around 1.09 MB). To solve this, I’m trying to use more memory by following this guide: How to run larger models on STM32H747I-DISCO 

Here are the steps I followed:

  1. Create an empty project for the board, selecting not to initialize the default peripherals.
  2. Enable X-CUBE-AI and set the application mode to "Validation".
  3. Modify the memory pool from the Tools tab.
    1. Reduce the flash allocated to the M4 to 64 KB and change its boot address.
    2. Increase the flash available to the M7.
  4. In the X-CUBE-AI configuration screen, I load the model (at this step I select the option to optimize the clock and the peripherals). When I try to generate the code, I get a warning stating that the model exceeds the 1024 KB flash limit. If I ignore the warning and proceed, the generated project appears incomplete and I can't proceed with building it.

(Of course, if everything had worked correctly, I would then have changed the M4 boot address using CubeProgrammer in order to perform the validation on target.)
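For reference, the flash split described in step 3 corresponds to linker MEMORY regions along these lines. This is only an illustrative sketch using the 1984 KB / 64 KB values discussed in this thread; the exact origins and lengths must match the memory pool set in the CubeMX Tools tab and the M4 boot address:

```ld
/* Sketch: 2 MB internal flash repartitioned between the two cores.
   The M7 takes everything except the last 64 KB, which is left to the M4. */
MEMORY
{
  FLASH_CM7 (rx) : ORIGIN = 0x08000000, LENGTH = 1984K  /* Cortex-M7 code + model weights */
  FLASH_CM4 (rx) : ORIGIN = 0x081F0000, LENGTH = 64K    /* Cortex-M4 code */
}
```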

Does anyone have any suggestions on how to fix this?
Thanks in advance!

1 ACCEPTED SOLUTION

hamitiya
ST Employee

Hello @Dresult 

It looks like an issue in X-CUBE-AI, which is not forcing the code generation as it should, so my earlier assumption was wrong; we are investigating that part. Sorry for the inconvenience.

As a workaround, here is what you could try to run everything on Internal Flash:

- Open X-CUBE-AI

- Update your flash memory pool as you did before (to maximize the region for the Cortex-M7)

- Use a model of the same type but with a smaller flash footprint

After that, your project should work with the smaller model.

Now:

- Generate the associated C code from your bigger model:

$HOME/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/10.0.0/Utilities/linux/stedgeai generate --model path/to/model --workspace /tmp/ws --output /path/to/X-CUBE-AI/folder/in/your/IDE/project --name network

You will replace your previous model's files with the newly generated ones.

Your files should be in CM7/X-CUBE-AI/App or equivalent.

It will replace:

- network_data_params.[c/h]

- network.[c/h]

- network_config.[c/h]

 

I agree it is not very convenient, but it should do what you expect.

If you have any question feel free to ask!

 

Best regards,

Yanis

 


In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.


5 REPLIES
hamitiya
ST Employee

Hello @Dresult,

 

Could you confirm that you enabled X-CUBE-AI for the Cortex-M7 context?

Your procedure is right; however, X-CUBE-AI should not block your generation even if the model exceeds the flash constraints. You get the warning because X-CUBE-AI still sees 1024 KB for each region rather than what you changed on the "Tools" panel, but the generation should proceed accordingly.

 

Do you have enough RAM for your model?

 

When building in your IDE (I will assume STM32CubeIDE), do you see an X-CUBE-AI folder in the file tree under the CM7 folder?

In both the CM4 and CM7 folders, can you retrieve the values you've set?

 

/* For CM7 */
FLASH   (rx)   : ORIGIN = 0x08000000, LENGTH = 1984K    /* Memory is divided. Actual start is 0x08000000 and actual length is 2048K */
/* For CM4 */
FLASH   (rx)   : ORIGIN = 0x081F0000, LENGTH = 64K

 

Best regards,

Yanis

 



Hello Yanis, first of all, thanks for your support! :)

I confirm that I've enabled X-CUBE-AI for the M7; it is set as in this image:

Screenshot from 2025-03-27 16-10-00.png

I can generate the project, even though I get this warning: Screenshot from 2025-03-27 16-11-47.png

However, something is missing. If I try to compile the project for the M7, I get this output:

In file included from ../X-CUBE-AI/App/app_x-cube-ai.c:37:
../X-CUBE-AI/App/app_x-cube-ai.h:28:8: error: unknown type name 'ai_i8'
   28 | extern ai_i8* data_ins[];
      |        ^~~~~
../X-CUBE-AI/App/app_x-cube-ai.h:29:8: error: unknown type name 'ai_i8'
   29 | extern ai_i8* data_outs[];
      |        ^~~~~
../X-CUBE-AI/App/app_x-cube-ai.h:31:8: error: unknown type name 'ai_handle'
   31 | extern ai_handle data_activations0[];
      |        ^~~~~~~~~
make: *** [X-CUBE-AI/App/subdir.mk:19: X-CUBE-AI/App/app_x-cube-ai.o] Error 1
make: *** Waiting for unfinished jobs....
"make -j16 all" terminated with exit code 2. Build might be incomplete.

and I can see that there is no directory called "Middleware", unlike in other generated projects. This is the part of the directory tree showing the content of the X-CUBE-AI directories:

Screenshot from 2025-03-27 16-22-15.png

 

The memory sectors in the linker files are:

M7:

MEMORY
{
  RAM_D1 (xrw)   : ORIGIN = 0x24000000, LENGTH = 512K
  FLASH   (rx)   : ORIGIN = 0x08000000, LENGTH = 1984K    /* Memory is divided. Actual start is 0x08000000 and actual length is 2048K */
  DTCMRAM (xrw)  : ORIGIN = 0x20000000, LENGTH = 128K
  RAM_D2 (xrw)   : ORIGIN = 0x30000000, LENGTH = 288K
  RAM_D3 (xrw)   : ORIGIN = 0x38000000, LENGTH = 64K
  ITCMRAM (xrw)  : ORIGIN = 0x00000000, LENGTH = 64K
}

M4:

MEMORY
{
FLASH (rx)     : ORIGIN = 0x08100000, LENGTH = 64K
RAM (xrw)      : ORIGIN = 0x10000000, LENGTH = 288K
}

 

hamitiya
ST Employee

Hello,

To confirm what I observed on my side could you please:

 

- Go to the $HOME/.stm32cubemx folder

- Clear the STM32CubeMX.log file

- Run the code generation step

- Share the content of this file with me

 

On my side, I observed an exception while copying a template file, which should not happen. If it is the same on your side, we will open a ticket.

 

Best regards,

Yanis



Good morning Yanis,
I did the steps as you described, and I'm attaching the log file to this reply :)

Meanwhile, I tried to move the model to external flash by enabling the QUADSPI memory.
If I choose to split the weights between internal and external flash, I can generate the project and compile it, but when I run it I get a "Load FAIL" error. If instead I choose to use a separate bin file for the weights, I am able to generate, compile, and run the project on the board, and I can also perform validation on target.
I can use this as a temporary solution; however, I am waiting to hear whether it is possible to keep everything in internal flash.
