Build failed error when deploying pose detection model

athern27 · 2025-02-28

Hi everyone,
I am trying to deploy a pose detection model on the STM32N6570-DK board, but I get the following error.


(st_zoo) kartikkhandewal@ATL-HPZG14-99:~/stm32packages/stm32ai-modelzoo-services/pose_estimation/src$ python3 stm32ai_main.py --config-path ./config_file_examples --config-name deployment_n6_yolo_mpe_config.yaml
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1740729082.336274  136562 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1740729082.339482  136562 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[WARNING] The usable GPU memory is unlimited.
Please consider setting the 'gpu_memory_limit' attribute in the 'general' section of your configuration file.
[INFO] : Running `deployment` operation mode
[INFO] : The random seed for this simulation is 123
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
[INFO] : Generating C header file for Getting Started...
loading model.. model_path="yolov8n_256_quant_pc_uf_pose_pallet.tflite"
loading conf file.. "../../application_code/pose_estimation/STM32N6/stmaic_STM32N6570-DK.conf" config="None"
"n6 release" configuration is used
[INFO] : Selected board :  "STM32N6570-DK Getting Started Pose Estimation (STM32CubeIDE)" (stm32_cube_ide/n6 release/stm32n6)
[INFO] : Compiling the model and generating optimized C code + Lib/Inc files:  yolov8n_256_quant_pc_uf_pose_pallet.tflite
setting STM.AI tools.. root_dir="", req_version=""
 Cube AI Path: "/home/kartikkhandewal/STEdgeAI/2.0/Utilities/linux/stedgeai".
[INFO] : Offline CubeAI used; Selected tools:  10.0.0 (x-cube-ai pack)
loading conf file.. "../../application_code/pose_estimation/STM32N6/stmaic_STM32N6570-DK.conf" config="None"
"n6 release" configuration is used
compiling... "yolov8n_256_quant_pc_uf_pose_pallet_tflite" session
 model_path  : ['yolov8n_256_quant_pc_uf_pose_pallet.tflite']
 tools       : 10.0.0 (x-cube-ai pack)
 target      : "STM32N6570-DK Getting Started Pose Estimation (STM32CubeIDE)" (stm32_cube_ide/n6 release/stm32n6)
 options     : --st-neural-art default@../../application_code/pose_estimation/STM32N6/Model/user_neuralart.json --input-data-type uint8 --inputs-ch-position chlast
"series" value is not coherent.. stm32n6 != stm32n6npu
 results -> RAM=1,675,264 IO=196,608:220,416 WEIGHTS=3,243,377 MACC=0 RT_RAM=1,893 RT_FLASH=504,138 LATENCY=0.000
[INFO] : Optimized C code + Lib/Inc files generation done.
[INFO] : Building the STM32 c-project..
deploying the c-project.. "STM32N6570-DK Getting Started Pose Estimation (STM32CubeIDE)" (stm32_cube_ide/n6 release/stm32n6)
updating.. n6 release
 -> s:copying file.. "network.c" to ../../application_code/pose_estimation/STM32N6/Model/network.c
 -> s:copying file.. "network_ecblobs.h" to ../../application_code/pose_estimation/STM32N6/Model/network_ecblobs.h
 -> s:copying file.. "network_atonbuf.xSPI2.raw" to ../../application_code/pose_estimation/STM32N6/Model/network_atonbuf.xSPI2.raw
 -> s:removing dir.. ../../application_code/pose_estimation/STM32N6/Middlewares/AI_Runtime/Lib/GCC/ARMCortexM55
 -> s:copying dir.. "ARMCortexM55" to ../../application_code/pose_estimation/STM32N6/Middlewares/AI_Runtime/Lib/GCC/ARMCortexM55
 -> s:removing dir.. ../../application_code/pose_estimation/STM32N6/Middlewares/AI_Runtime/Inc
 -> s:copying dir.. "Inc" to ../../application_code/pose_estimation/STM32N6/Middlewares/AI_Runtime/Inc
 -> s:removing dir.. ../../application_code/pose_estimation/STM32N6/Middlewares/AI_Runtime/Npu/ll_aton
 -> s:copying dir.. "ll_aton" to ../../application_code/pose_estimation/STM32N6/Middlewares/AI_Runtime/Npu/ll_aton
 -> u:copying file.. "app_config.h" to ../../application_code/pose_estimation/STM32N6/Inc/app_config.h
 -> updating cproject file "/home/kartikkhandewal/stm32packages/stm32ai-modelzoo-services/application_code/pose_estimation/STM32N6/STM32CubeIDE" with "NetworkRuntime1000_CM55_GCC.a"
building.. n6 release
[returned code = 1 - FAILED]
flashing.. n6 release STM32N6570-DK
[returned code = 1 - FAILED]
Board programming failed: " Error: binary file does not exist:  Debug/STM32N6_GettingStarted_PoseEstimation.bin"
[returned code = 1 - FAILED]
Board programming failed: "Error: File does not exist: STM32N6_GettingStarted_PoseEstimation_signed.bin"
Board programming failed: "Error: File does not exist: STM32N6_GettingStarted_PoseEstimation_signed.bin"
Board programming failed: "Error: File does not exist: STM32N6_GettingStarted_PoseEstimation_signed.bin"
[INFO] : Deployment complete.
[INFO] : Please on STM32N6570-DK toggle the boot switches to the left and power cycle the board.

I have deployed object detection models on this board and they worked fine, but I am not able to deploy this one.
Steps I followed:
best.pt is my original pose detection model (it detects 12 keypoints).
I converted it to TFLite, which gave me a saved_model, using:


from ultralytics import YOLO

model_path = 'best.pt'
model = YOLO(model_path)
results = model.export(format='tflite', int8=True, imgsz=[256, 256])

I quantized the model following this tutorial.

I deployed it using this tutorial.

Kindly help.

Julian E. · 2025-03-04

Hello @athern27 ,

The issue may come from the 12 keypoints to detect. It seems that we only support models with 13 or 17 keypoints.

Can you try to use a model with 13 or 17 keypoints?
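
For context, the keypoint count shows up directly in the size of the model's output tensor. Here is a minimal sketch, assuming the standard Ultralytics YOLOv8 pose head layout (4 box values, one score per class, then x, y and visibility for each keypoint). The function names are illustrative only and are not part of any ST or Ultralytics API.

```python
def pose_output_channels(num_keypoints, num_classes=1, dims_per_keypoint=3):
    """Channels per candidate box: 4 box values + class scores + keypoint values."""
    return 4 + num_classes + num_keypoints * dims_per_keypoint


def keypoints_from_channels(channels, num_classes=1, dims_per_keypoint=3):
    """Invert the layout above to recover the keypoint count from a channel count."""
    remainder = channels - 4 - num_classes
    if remainder <= 0 or remainder % dims_per_keypoint != 0:
        raise ValueError(f"{channels} channels does not fit the assumed layout")
    return remainder // dims_per_keypoint


for k in (12, 13, 17):
    print(k, "keypoints ->", pose_output_channels(k), "channels")
# 12 keypoints -> 41 channels
# 13 keypoints -> 44 channels
# 17 keypoints -> 56 channels
```

If the deployed application is built for 13 or 17 keypoints, a 41-channel output would not match what its post-processing expects, which is consistent with the suspicion above.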

You can find Ultralytics YOLO model examples here:

ultralytics/examples/YOLOv8-STEdgeAI at main · stm32-hotspot/ultralytics

If it doesn't help, it may be an issue in the deployment scripts, so do not hesitate to come back to me if you still have an issue.

Have a good day,

Julian

In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.

athern27 · 2025-03-04

Hi @Julian E. ,

I tried running the yolov8n-pose.pt model, which is designed to detect 17 body keypoints. After quantizing it, I was able to run it successfully with good accuracy.

Is there no way to deploy a 12-keypoint model on the board?

Kindly help.

Julian E. · 2025-03-06

Hello @athern27,

I am not exactly sure about the procedure. Could you please try adapting postprocess_conf.h with:

#define AI_MPE_YOLOV8_PP_TOTAL_BOXES (1200)
#define AI_POSE_PP_POSE_KEYPOINTS_NB (12)

Then look at app.c and see whether anything there is related to these values.
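
One way to sanity-check these two values is to derive them from the model itself. The sketch below assumes the usual YOLOv8 detection strides (8, 16, 32), a float32 output tensor, and that the `IO=196,608:220,416` field in the build log above reports input:output buffer sizes in bytes. The helper name is hypothetical.

```python
def total_boxes(img_size, strides=(8, 16, 32)):
    """Number of candidate boxes: one per grid cell at each detection stride."""
    return sum((img_size // s) ** 2 for s in strides)


IMG_SIZE = 256          # from imgsz=[256, 256] in the export call
NUM_KEYPOINTS = 12      # from the original best.pt model
BYTES_PER_FLOAT32 = 4   # assumption: the output tensor is float32

boxes = total_boxes(IMG_SIZE)            # 32*32 + 16*16 + 8*8
channels = 4 + 1 + NUM_KEYPOINTS * 3     # box + 1 class score + (x, y, visibility) per keypoint
output_bytes = boxes * channels * BYTES_PER_FLOAT32
input_bytes = IMG_SIZE * IMG_SIZE * 3    # uint8 RGB input

print(boxes, channels, input_bytes, output_bytes)
# 1344 41 196608 220416
```

Under those assumptions the sizes match the `IO=196,608:220,416` line exactly, which suggests the model outputs 1344 boxes of 41 values each. It may therefore be worth trying 1344 for AI_MPE_YOLOV8_PP_TOTAL_BOXES if 1200 does not work.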

I have been told that it is not a big deal to change the getting started application, but I don't have a clear guideline.

If you can generate models with 17 and 13 keypoints and compare what differs between them, you may be able to identify the part related to the keypoints.

Have a good day,

Julian
