
STM32N6 X-CUBE-AI multiple models

Tuomas95
Associate III

Hello,

I'm trying to use 2 networks from the STM32 model zoo on the STM32N6. I would like to load the weights of both models into my external RAM, and the activations into the npuRAM3,4,5 and 6. I have selected the n6-extram profile for both models.

I cannot figure out how to configure from X-CUBE-AI the address for the activations and weights, such that they don't overlap. The compiler reports:

Compilation details
------------------------------------------------------------------------------------
Compiler version: 1.1.1-14
Compiler arguments:  -i st_yolo_x_nano_256_0.5_0.4_int8_OE_3_3_0.onnx --json-quant-file st_yolo_x_nano_256_0.5_0.4_int8_OE_3_3_0_Q.json -g image_network.c --load-mdesc stm32n6.mdesc --load-mpool stm32n6__extRam.mpool --save-mpool-file stm32n6__extRam.mpool --out-dir-prefix neural_art__image_network/ --optimization 3 --all-buffers-info --mvei --cache-maintenance --Oauto-sched --native-float --enable-virtual-mem-pools --Omax-ca-pipe 4 --Ocache-opt --Os --output-info-file c_info.json --network-name image_network
====================================================================================

Memory usage information  (input/output buffers are included in activations)
------------------------------------------------------------------------------------
	flexMEM    [0x34000000 - 0x34000000]:          0  B /          0  B  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	cpuRAM1    [0x34064000 - 0x34064000]:          0  B /          0  B  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	cpuRAM2    [0x34100000 - 0x34200000]:          0  B /      1.000 MB  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	npuRAM3    [0x34200000 - 0x34270000]:    100.000 kB /    448.000 kB  ( 22.32 % used) -- weights:          0  B (  0.00 % used)  activations:    100.000 kB ( 22.32 % used)
	npuRAM4    [0x34270000 - 0x342E0000]:    431.000 kB /    448.000 kB  ( 96.21 % used) -- weights:          0  B (  0.00 % used)  activations:    431.000 kB ( 96.21 % used)
	npuRAM5    [0x342E0000 - 0x34350000]:    440.625 kB /    448.000 kB  ( 98.35 % used) -- weights:          0  B (  0.00 % used)  activations:    440.625 kB ( 98.35 % used)
	npuRAM6    [0x34350000 - 0x343C0000]:          0  B /    448.000 kB  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	octoFlash  [0x71000000 - 0x71000000]:          0  B /          0  B  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	hyperRAM   [0x90000000 - 0x92000000]:      2.487 MB /     32.000 MB  (  7.77 % used) -- weights:      2.487 MB (  7.77 % used)  activations:          0  B (  0.00 % used)
---
Total:                                             3.436 MB                                  -- weights:      2.487 MB                  activations:    971.625 kB                   
====================================================================================



Compilation details
------------------------------------------------------------------------------------
Compiler version: 1.1.1-14
Compiler arguments:  -i yamnet_1024_64x96_tl_qdq_int8_OE_3_3_0.onnx --json-quant-file yamnet_1024_64x96_tl_qdq_int8_OE_3_3_0_Q.json -g audio_network.c --load-mdesc stm32n6.mdesc --load-mpool stm32n6__extRam.mpool --save-mpool-file stm32n6__extRam.mpool --out-dir-prefix neural_art__audio_network/ --optimization 3 --all-buffers-info --mvei --cache-maintenance --Oauto-sched --native-float --enable-virtual-mem-pools --Omax-ca-pipe 4 --Ocache-opt --Os --output-info-file c_info.json --network-name audio_network
====================================================================================

Memory usage information  (input/output buffers are included in activations)
------------------------------------------------------------------------------------
	flexMEM    [0x34000000 - 0x34000000]:          0  B /          0  B  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	cpuRAM1    [0x34064000 - 0x34064000]:          0  B /          0  B  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	cpuRAM2    [0x34100000 - 0x34200000]:          0  B /      1.000 MB  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	npuRAM3    [0x34200000 - 0x34270000]:          0  B /    448.000 kB  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	npuRAM4    [0x34270000 - 0x342E0000]:          0  B /    448.000 kB  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	npuRAM5    [0x342E0000 - 0x34350000]:    144.000 kB /    448.000 kB  ( 32.14 % used) -- weights:          0  B (  0.00 % used)  activations:    144.000 kB ( 32.14 % used)
	npuRAM6    [0x34350000 - 0x343C0000]:          0  B /    448.000 kB  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	octoFlash  [0x71000000 - 0x71000000]:          0  B /          0  B  (  0.00 % used) -- weights:          0  B (  0.00 % used)  activations:          0  B (  0.00 % used)
	hyperRAM   [0x90000000 - 0x92000000]:      3.415 MB /     32.000 MB  ( 10.67 % used) -- weights:      3.415 MB ( 10.67 % used)  activations:          0  B (  0.00 % used)
---
Total:                                             3.556 MB                                  -- weights:      3.415 MB                  activations:    144.000 kB                   
====================================================================================

I would like to place the activations of the yamnet model in npuRAM6, and those of yolox in npuRAM3, 4 and 5. Currently the yamnet activations are in npuRAM5, overlapping the yolox activations, as the reports above show. Both models' weights are in hyperRAM at the same start address, 0x90000000.

When I try to change the yolox weights address to 0x9036a4e1 (just after the yamnet weights) in Advanced Settings -> External RAM: "Use External RAM (checked)", "Memory: Custom", "Start Address: 0x9036a4e1" (I cannot attach a screenshot of this because I get a "Permission Denied" error), the selection automatically reverts to "Memory: xSPI1", "Start Address: 0x90000000". I also cannot find any menu where I could configure where the activations go.
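For what it's worth, one workaround I am considering (an assumption on my part, not something I have verified): since both compilations load and save the same stm32n6__extRam.mpool file via --load-mdesc/--load-mpool/--save-mpool-file (see the compiler arguments above), each model could perhaps be compiled against its own copy of the memory-pool file, edited so that the pools available to the two models do not overlap. A hypothetical sketch; the copied file names and the manual edit step are my assumptions, only the flags shown in the logs above are known to exist:

```shell
# Hypothetical workaround (unverified): give each model its own copy of the
# memory-pool description, then restrict the pools in each copy by hand so
# the two placements cannot collide.
cp stm32n6__extRam.mpool stm32n6__extRam_yolox.mpool   # to keep npuRAM3/4/5 only
cp stm32n6__extRam.mpool stm32n6__extRam_yamnet.mpool  # to keep npuRAM6 only
# (edit each copy manually: remove or zero-size the pools that model must not
#  use, and shift the hyperRAM start address in the yamnet copy if needed)

# Then recompile each network against its own pool file, e.g.:
# ... --load-mpool stm32n6__extRam_yolox.mpool  --save-mpool-file stm32n6__extRam_yolox.mpool  ... --network-name image_network
# ... --load-mpool stm32n6__extRam_yamnet.mpool --save-mpool-file stm32n6__extRam_yamnet.mpool ... --network-name audio_network
```

I have not found a way to make X-CUBE-AI do this from the GUI, which is what my question is about.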

Figure 28 in document UM2526 shows a "Use activation buffer" option with a configurable start address, and a "Copy weight to RAM" option with a configurable start address. I don't see these options in my X-CUBE-AI?

Since the addresses are baked into the generated x_network.c files, there must be a way to configure them from X-CUBE-AI?

I'm using X-CUBE-AI version 10.2.0 and CubeMX version 6.16.1.

Thanks!

 

Tuomas95
Associate III

@Julian E. Would you happen to know whether the new tool you were talking about here will also solve this issue?

Best regards,

Tuomas

Hi @Tuomas95,

 

I am talking about the new desktop app that replaces X-CUBE-AI. It should come out today, but it is still being finalized.

I will give you more details later.

 

The thing is that multiple models are not handled natively, and I don't know exactly how to do it.

I also need to understand how the advanced options you describe are implemented in the new tool (if they are in the new tool at all).

 

I am looking for an answer; in any case, we should have a tutorial on this (I mean: how to use multiple models).

 

I will update you,

Julian

 

