
INT16 Quantization for YOLOv8 models on STM32N6570-DK

athern27
Associate III

Hi everyone,

I'm working on deploying an object detection model on the STM32N6570-DK board. While I have successfully deployed it, I noticed a significant drop in accuracy during live testing compared to the floating-point YOLOv8 model.

I'm now considering using INT16 quantization instead of INT8. My question is: How can I quantize the model to INT16? The tutorial I’m following states that quantization is only available for UINT8 and INT8.

Additionally, is it possible to deploy an object detection model using X-CUBE-AI in STM32CubeMX? I haven’t been able to find any tutorials on object detection with X-CUBE-AI.

Any guidance or resources would be greatly appreciated!

Thanks!

Julian E.
ST Employee

Hello @athern27,

 

The Neural-ART accelerator on the N6 only supports INT8.

You can probably find on your own how to convert your ONNX or H5 model to float16 with TensorFlow Lite; I don't think we provide this solution.
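If you want to try, here is a minimal sketch of a float16 conversion with the TensorFlow Lite converter, assuming a Keras .h5 model (the file names are placeholders; an ONNX model would first need to be converted to a TensorFlow format):

```python
import tensorflow as tf

# Load the trained Keras model (placeholder file name)
model = tf.keras.models.load_model("yolov8.h5")

# Standard TFLite float16 post-training quantization:
# weights are stored as float16, compute stays in float32
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

with open("yolov8_fp16.tflite", "wb") as f:
    f.write(tflite_model)
```

Keep in mind that a float16 model will not run on the Neural-ART accelerator, which expects INT8.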

 

X-CUBE-AI is a CubeMX add-on that uses the ST Edge AI Core to convert your model to C code. It also generates a basic template application that runs an inference with a random input.

In other words, the model can be anything; X-CUBE-AI will do what I described regardless.

 

It is up to you to create your object detection application.

 

That being said, we have the getting started application that you can use to deploy your own model (kind of).

You can use it as a standalone application or drive it with the ST Model Zoo scripts.

AI Getting Started N6: STM32N6-AI - AI software ecosystem for STM32N6 with Neural-ART accelerator - STMicroelectronics

With ST Model Zoo: stm32ai-modelzoo-services/object_detection/deployment/README.md at main · STMicroelectronics/stm32ai-modelzoo-services

 

Regarding the poor accuracy of your quantized model: make sure to quantize it with real data, and a good amount of it.
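For example, with the TensorFlow Lite converter, INT8 post-training quantization is calibrated through a representative dataset. A hedged sketch (the model file, calibration_image_paths, the 256x256 input size, and the /255 preprocessing are placeholder assumptions; reuse exactly what your training pipeline does):

```python
import tensorflow as tf

model = tf.keras.models.load_model("yolov8.h5")  # placeholder file name

def representative_dataset():
    # Yield a few hundred REAL preprocessed images, not random tensors;
    # calibration data quality drives the accuracy of the INT8 model.
    # calibration_image_paths is a hypothetical list of image file paths.
    for path in calibration_image_paths:
        img = tf.io.decode_image(tf.io.read_file(path), channels=3,
                                 expand_animations=False)
        img = tf.image.resize(img, (256, 256)) / 255.0  # match your model's input
        yield [tf.expand_dims(img, 0)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()
```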

You can also run the validation procedure and check the COS (cosine similarity).

The COS is a metric that compares the output of your quantized Python model to the output of the generated C model.
If you have a low COS, it means that you lose accuracy in the conversion of your model to C.

https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_getting_started.html
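For intuition, the COS is simply the cosine similarity between the two models' output tensors, something like the sketch below (illustrative only; the validation tool computes it for you, and the tensors here are random placeholders):

```python
import numpy as np

def cos_metric(ref, test):
    # Cosine similarity between the reference (Python) model output and
    # the C model output: 1.0 means the two outputs match perfectly.
    ref, test = ref.ravel().astype(np.float64), test.ravel().astype(np.float64)
    return float(np.dot(ref, test) / (np.linalg.norm(ref) * np.linalg.norm(test)))

# Two nearly identical output tensors give a COS close to 1.0
ref = np.random.rand(1, 64, 64, 16)
test = ref + np.random.normal(scale=1e-3, size=ref.shape)
print(cos_metric(ref, test))  # ~0.99999
```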

 

Have a good day

Julian


In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.