2026-05-26 7:04 AM - last edited on 2026-05-26 7:08 AM by Andrew Neil
Hi ST community,
I’m investigating an unexpected difference in the size of the model weights generated by the STM32N6 CPU backend versus the Neural-ART backend.
Model tested:
Observations:
Report shows:
Report shows:
This ~200 KB size is consistent with the real INT8 parameter count (~210k params, at 1 byte per parameter).
What is surprising is that only the ST CPU backend reports ~42 KB.
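As a rough sanity check on these numbers, here is a minimal sketch of the arithmetic. It assumes 1 byte per INT8 weight and 1 KB = 1024 bytes, and it uses the approximate figures quoted above (~210k params, ~42 KB). The helper names are just for illustration and are not part of any ST tool.

```python
# Back-of-the-envelope check of the reported weight sizes.
# Assumptions: approximate figures from the post, 1 KB = 1024 bytes.

def weights_size_kib(param_count: int, bits_per_param: float) -> float:
    """Expected weight storage in KiB for a given bit width per parameter."""
    return param_count * bits_per_param / 8 / 1024


def effective_bits_per_param(size_kib: float, param_count: int) -> float:
    """Average number of bits per parameter implied by a reported size."""
    return size_kib * 1024 * 8 / param_count


PARAMS = 210_000  # approximate INT8 parameter count

# Plain INT8 storage: one byte per parameter.
print(round(weights_size_kib(PARAMS, 8), 1))           # 205.1

# What the ~42 KB CPU-backend figure would imply per parameter.
print(round(effective_bits_per_param(42, PARAMS), 2))  # 1.64
```

If the ~42 KB figure really covered all ~210k parameters, it would correspond to fewer than 2 bits per weight, which is why it does not look like plain INT8 storage to me.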
I also tested:
--compression none
but the generated CPU backend still reports ~42 KB.
So this does NOT appear to be related to the documented CLI compression options (lossless, medium, high, etc.).
Questions:
I could not find documentation explaining this behavior, so any clarification would be much appreciated.
I will attach the three generated reports.