How do I get the bounding boxes from the output of an object detection model?

jetty · ‎2024-11-24

Hi everyone,

I'm trying to integrate the st_yolo_lc_v1_192 pretrained model into my project on an STM32H750 MCU.

I'm new to the framework and AI tech.

I see the output shape is 12x12x30. I understand that 12x12 is the grid size, but I don't know what the '30' means. Is it 5 * [x, y, w, h, confidence, class]?

Also, do I need to apply an activation function to these values?

Thanks!

MCHTO.1 · ‎2024-11-25

Hi,

Indeed, 30 is 5 * [x, y, w, h, confidence, class]: by default st_yololc_v1 has 5 anchors, and each anchor predicts 6 values per grid cell.

To decode the raw output of the model we use this function:
https://github.com/STMicroelectronics/stm32ai-modelzoo/blob/e5361e76f8427b0907b67d9815101d05c32e7407/object_detection/src/postprocessing/tiny_yolo_v2_postprocess.py#L12

Then you need to apply non-maximum suppression (NMS); an example can be found here:

https://github.com/STMicroelectronics/stm32ai-modelzoo/blob/e5361e76f8427b0907b67d9815101d05c32e7407/object_detection/src/postprocessing/tiny_yolo_v2_postprocess.py#L104
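
For orientation, here is a minimal pure-Python sketch of the two steps, written from the standard YOLOv2 decoding scheme rather than copied from the linked file, so treat it as an illustration and check the details against the model zoo code. It assumes each cell stores its values anchor by anchor as [x, y, w, h, confidence, class], that the anchors are expressed in grid-cell units, and that there is a single class (so the class softmax is always 1 and the score reduces to the confidence). The function names, thresholds, and anchor values are hypothetical. It also shows the activations involved: a sigmoid on x, y and confidence, and an exponential on w and h.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def decode(output, anchors, conf_threshold=0.5):
    """Decode raw YOLOv2-style output into normalized corner boxes.

    output[row][col] is a flat list of len(anchors) * 6 raw values
    (assumed layout per anchor: x, y, w, h, confidence, class).
    Returns tuples (x_min, y_min, x_max, y_max, score) in the 0..1 range.
    """
    grid_h, grid_w = len(output), len(output[0])
    boxes = []
    for row in range(grid_h):
        for col in range(grid_w):
            cell = output[row][col]
            for a, (anchor_w, anchor_h) in enumerate(anchors):
                tx, ty, tw, th, to, _tc = cell[a * 6:(a + 1) * 6]
                # Single class assumed: softmax over one logit is 1,
                # so the score is just the objectness confidence.
                score = sigmoid(to)
                if score < conf_threshold:
                    continue
                # Center is an offset inside the cell; size scales the anchor.
                cx = (col + sigmoid(tx)) / grid_w
                cy = (row + sigmoid(ty)) / grid_h
                w = anchor_w * math.exp(tw) / grid_w
                h = anchor_h * math.exp(th) / grid_h
                boxes.append((cx - w / 2, cy - h / 2,
                              cx + w / 2, cy + h / 2, score))
    return boxes

def iou(a, b):
    """Intersection over union of two corner-format boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop those overlapping it too much."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept
```

With a 12x12x30 output reshaped into nested lists, `nms(decode(output, anchors))` would return the final boxes in normalized image coordinates; multiply by the input width and height (192 for this model) to get pixels. The anchor values themselves must come from the model's training configuration.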
