STM32N6 CPU backend reports unexpectedly small weights(ro) size

ayaaaa
Associate III

 

Hi ST community,

I’m investigating unexpected behavior in the size of the model weights generated by the STM32N6 CPU backend compared with the Neural-ART backend.

Model tested:

  • Visual Wake Word (MobileNet-like INT8 model)

Observations:

CPU backend

Report shows:

  • params # ≈ 210,850
  • weights(ro) ≈ 42 KB

Neural-ART backend

Report shows:

  • model weights ≈ 219 KB
  • octoFlash weights ≈ 206 KB

This ~200 KB size is consistent with the actual INT8 parameter count (~210k params at 1 byte each).
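
For reference, here is the arithmetic behind that estimate as a minimal sketch. It assumes every parameter is stored as a single INT8 byte (biases and any per-channel scale data are ignored) and that the report's "KB" means 1024 bytes; both are my assumptions, not something stated in the reports.

```python
# Rough sanity check of the expected raw weight size.
PARAMS = 210_850        # "params #" value from the CPU backend report
BYTES_PER_PARAM = 1     # assumption: each parameter is one INT8 byte
REPORTED_CPU_KB = 42    # weights(ro) value from the CPU backend report


def raw_size_kib(params, bytes_per_param=BYTES_PER_PARAM):
    """Uncompressed weight size in KiB (assumes 1 KiB = 1024 bytes)."""
    return params * bytes_per_param / 1024


expected_kib = raw_size_kib(PARAMS)
ratio = expected_kib / REPORTED_CPU_KB

print(f"expected raw size: ~{expected_kib:.1f} KiB")
print(f"CPU backend report is ~{ratio:.1f}x smaller")
```

Under those assumptions the expected size lands close to the Neural-ART octoFlash figure, and the CPU backend figure is roughly five times smaller.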

What is surprising is that only the ST CPU backend reports ~42 KB.

I also tested:

--compression none

 

but the generated CPU backend still reports ~42 KB.

So this does NOT appear to be related to the documented CLI compression options (lossless, medium, high, etc.).

Questions:

  1. What exactly does weights(ro) represent in the CPU backend report?
  2. Is the CPU backend internally using another packed/compressed representation even when --compression none is selected?
  3. Is there any undocumented optimization for the CPU backend specific to MobileNet-style, 1x1-convolution-heavy architectures? (I found documentation describing special cases for 1x1 convolutions, but only for the ST Neural-ART backend.)
  4. Why do the Neural-ART weights remain close to the raw parameter size (~200 KB), while the ST CPU backend drops to ~42 KB?

I could not find documentation explaining this behavior, so any clarification would be very appreciated.

I will attach the three generated reports.
