2025-10-10 11:19 PM
2025-10-21 1:24 AM
Hello @fanronghua0123456,
To run inference with an .nb model, you can look at this article:
https://wiki.st.com/stm32mpu/wiki/How_to_run_inference_using_the_STAI_MPU_Python_API
In your case, you will need to edit the code to print the outputs the way you expect, since by default it is written for image classification.
The last 6 lines are:
top_k = results.argsort()[-5:][::-1]
labels = load_labels(args.label_file)
for i in top_k:
    if output_tensor_dtype == np.uint8:
        print('{:08.6f}: {}'.format(float(results[i] / 255.0), labels[i]))
    else:
        print('{:08.6f}: {}'.format(float(results[i]), labels[i]))
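If your model is not a classifier, a simple first step is to replace those lines with something that just prints the raw network output (a minimal sketch, using the same stai_model object as in the wiki example):
output_data = stai_model.get_output(index=0)
results = np.squeeze(output_data)
# Print the raw output shape and its largest value to see what the model actually returns
print("Output shape:", results.shape)
print("Max output value:", float(np.max(results)))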
Have a good day,
Julian
2025-10-21 5:03 AM
Thanks for your reply.
I ran it according to your method and encountered the following error. Could you tell me what caused this and how it should be fixed?
It looks like the error occurs at stai_model.set_input(0, input_data).
root@ATK-DLMP257:/opt/ui/src/apps/resource/x-linux-ai/object-detection# python3 testpy.py -m /home/root/best_integer_quant.nb -i /home/root/hander1_1.jpg -l /home/root/best.txt
Loading dynamically: /usr/lib/libstai_mpu_ovx.so.6
[OVX]: Loading nbg model
**Input node: 0 -Input_name: -Input_dims:4 - input_type:float16 -Input_shape:(1, 640, 640, 3)
**Output node: 0 -Output_name: -Output_dims:3 - Output_type:float16 -Output_shape:(1, 11, 8400)
640
640
----1
Segmentation fault (core dumped)
My testpy.py script is as follows:
from stai_mpu import stai_mpu_network
from numpy.typing import NDArray
from typing import Any, List
from pathlib import Path
from PIL import Image
from argparse import ArgumentParser
from timeit import default_timer as timer
import cv2 as cv
import numpy as np
import time
def load_labels(filename):
    with open(filename, 'r') as f:
        return [line.strip() for line in f.readlines()]
if __name__ == '__main__':
    parser = ArgumentParser()
    parser.add_argument('-i','--image', help='image to be classified.')
    parser.add_argument('-m','--model_file',help='model to be executed.')
    parser.add_argument('-l','--label_file', help='name of labels file.')
    parser.add_argument('--input_mean', default=127.5, help='input_mean')
    parser.add_argument('--input_std', default=127.5,help='input stddev')
    args = parser.parse_args()
    stai_model = stai_mpu_network(model_path=args.model_file, use_hw_acceleration=True)
    # Read input tensor information
    num_inputs = stai_model.get_num_inputs()
    input_tensor_infos = stai_model.get_input_infos()
    for i in range(0, num_inputs):
        input_tensor_shape = input_tensor_infos[i].get_shape()
        input_tensor_name = input_tensor_infos[i].get_name()
        input_tensor_rank = input_tensor_infos[i].get_rank()
        input_tensor_dtype = input_tensor_infos[i].get_dtype()
        print("**Input node: {} -Input_name:{} -Input_dims:{} - input_type:{} -Input_shape:{}".format(i, input_tensor_name,
                                                                                                      input_tensor_rank,
                                                                                                      input_tensor_dtype,
                                                                                                      input_tensor_shape))
        if input_tensor_infos[i].get_qtype() == "staticAffine":
            # Reading the input scale and zero point variables
            input_tensor_scale = input_tensor_infos[i].get_scale()
            input_tensor_zp = input_tensor_infos[i].get_zero_point()
        if input_tensor_infos[i].get_qtype() == "dynamicFixedPoint":
            # Reading the dynamic fixed point position
            input_tensor_dfp_pos = input_tensor_infos[i].get_fixed_point_pos()
    # Read output tensor information
    num_outputs = stai_model.get_num_outputs()
    output_tensor_infos = stai_model.get_output_infos()
    for i in range(0, num_outputs):
        output_tensor_shape = output_tensor_infos[i].get_shape()
        output_tensor_name = output_tensor_infos[i].get_name()
        output_tensor_rank = output_tensor_infos[i].get_rank()
        output_tensor_dtype = output_tensor_infos[i].get_dtype()
        print("**Output node: {} -Output_name:{} -Output_dims:{} - Output_type:{} -Output_shape:{}".format(i, output_tensor_name,
                                                                                                           output_tensor_rank,
                                                                                                           output_tensor_dtype,
                                                                                                           output_tensor_shape))
        if output_tensor_infos[i].get_qtype() == "staticAffine":
            # Reading the output scale and zero point variables
            output_tensor_scale = output_tensor_infos[i].get_scale()
            output_tensor_zp = output_tensor_infos[i].get_zero_point()
        if output_tensor_infos[i].get_qtype() == "dynamicFixedPoint":
            # Reading the dynamic fixed point position
            output_tensor_dfp_pos = output_tensor_infos[i].get_fixed_point_pos()
    # Reading input image
    input_width = input_tensor_shape[1]
    print(input_width)
    input_height = input_tensor_shape[2]
    print(input_height)
    input_image = Image.open(args.image).resize((input_width,input_height))
    input_data = np.expand_dims(input_image, axis=0)
    if input_tensor_dtype == np.float32:
        input_data = (np.float32(input_data) - args.input_mean) /args.input_std
    print("----1")
    stai_model.set_input(0, input_data)
    print("----2")
    start = timer()
    stai_model.run()
    end = timer()
    print("Inference time: ", (end - start) *1000, "ms")
    output_data = stai_model.get_output(index=0)
    results = np.squeeze(output_data)
    top_k = results.argsort()[-5:][::-1]
    labels = load_labels(args.label_file)
    for i in top_k:
        if output_tensor_dtype == np.uint8:
            print('{:08.6f}: {}'.format(float(results[i] / 255.0), labels[i]))
        else:
            print('{:08.6f}: {}'.format(float(results[i]), labels[i]))
Thank you very much for your help!
2025-10-21 6:19 AM
Can you please add this to your code, after the resize, to make sure the shape and type of the input are correct:
---
img_array_after = np.array(input_data)
print("Dtype after resize: ", img_array_after.dtype)
print("Shape after resize: ", img_array_after.shape)
---
stai_model.set_input(0, input_data)
Your model input is float16, so your data must also be in float16, which means these lines can be the source of the error:
if input_tensor_dtype == np.float32:
    input_data = (np.float32(input_data) - args.input_mean) /args.input_std
We only convert the input to float32 if the input dtype is float32. But in your case it is float16, so we still need a conversion to float16, something like:
if input_tensor_dtype == np.float16:
    print("input data float 16")
    input_data = (np.float16(input_data) - args.input_mean) /args.input_std
Have a good day,
Julian
2025-10-21 5:30 PM - edited 2025-10-21 6:29 PM
Thanks for your reply.
Using your method, I added the following code.
# Reading input image
input_width = input_tensor_shape[1]
print(input_width)
input_height = input_tensor_shape[2]
print(input_height)
input_image = Image.open(args.image).resize((input_width,input_height))
input_data = np.expand_dims(input_image, axis=0)
if input_tensor_dtype == np.float32:
    input_data = (np.float32(input_data) - args.input_mean) /args.input_std
print("----1")
img_array_after = np.array(input_data)
print("Dtype after resize: ", img_array_after.dtype)
print("Shape after resize: ", img_array_after.shape)
print("----1test")
if input_tensor_dtype == np.float32:
    print("float32")
if input_tensor_dtype == np.float16:
    print("float16")
    input_data = (np.float16(input_data) - args.input_mean) /args.input_std
print("----2test")
stai_model.set_input(0, input_data)
print("----2")
start = timer()
stai_model.run()
end = timer()
It looks like my input data is of uint8 type after the resize. Do I need to convert it?
root@ATK-DLMP257:/opt/ui/src/apps/resource/x-linux-ai/object-detection# python3 testpy.py -m /home/root/best_integer_quant.nb -i /home/root/hander1_1.jpg -l /home/root/best.txt
Loading dynamically: /usr/lib/libstai_mpu_ovx.so.6
[OVX]: Loading nbg model
**Input node: 0 -Input_name: -Input_dims:4 - input_type:float16 -Input_shape:(1, 640, 640, 3)
**Output node: 0 -Output_name: -Output_dims:3 - Output_type:float16 -Output_shape:(1, 11, 8400)
640
640
----1
Dtype after resize: uint8
Shape after resize: (1, 640, 640, 3)
----1test
float16
----2test
Segmentation fault (core dumped)
root@ATK-DLMP257:/opt/ui/src/apps/resource/x-linux-ai/object-detection#
Best regards
Charles Fan
2025-10-27 6:17 AM - edited 2025-10-27 6:17 AM
Our expert took a look and ran tests with your zip. It seems that you need to convert the input to float32 even if the model input is float16, because float16 input is not well supported by the stedgeai core.
So something like this:
# Reading input image
input_width = input_tensor_shape[1]
input_height = input_tensor_shape[2]
input_image = Image.open(args.image).resize((input_width,input_height))
input_data = np.expand_dims(input_image, axis=0)
if input_tensor_dtype == np.float32:
    print("input data float")
    input_data = (np.float32(input_data) - args.input_mean) /args.input_std
if input_tensor_dtype == np.float16:
    print("input data float")
    input_data = (np.float32(input_data) - args.input_mean) /args.input_std
img_array_after = np.array(input_data)
print("Dtype after resize: ", img_array_after.dtype)
print("Shape after resize: ", img_array_after.shape)
stai_model.set_input(0, input_data)
start = timer()
stai_model.run()
end = timer()
And this is the output that you should get:
root@stm32mp2-e3-c3-c9:~/test_mobilenet_alexis/yolo# python3 test.py -i hander1_1.jpg -m best_integer_quant.nb -l best.txt
Loading dynamically: /usr/lib/libstai_mpu_ovx.so.6
[OVX]: Loading nbg model
**Input node: 0 -Input_name: -Input_dims:4 - input_type:float16 -Input_shape:(1, 640, 640, 3)
**Output node: 0 -Output_name: -Output_dims:3 - Output_type:float16 -Output_shape:(1, 11, 8400)
input data float
Dtype after resize: float32
Shape after resize: (1, 640, 640, 3)
Inference time: 108.06444752961397 ms
Have a good day,
Julian
2025-10-28 11:22 PM - edited 2025-10-29 2:00 AM
Hi, thanks for your reply.
I was able to run inference using your code, but I still do not get any usable results. My model output is 4 (box position) + 7 (class confidence) values per candidate, but I could not find any item with a confidence greater than 0 (a quick check of the raw .nb output is sketched after the snippet below), which matches the result I got when verifying with C++ code.
However, I can run inference with the best_integer_quant.tflite file on Ubuntu without any problems.
The attachment is the inference result.
from ultralytics import YOLO
tflite_model = YOLO("/home/alientek/best_saved_model/best_integer_quant.tflite")
results = tflite_model("/home/alientek/hander4_Left_379.jpg")
plotted_img = results[0].plot()
from PIL import Image
im = Image.fromarray(plotted_img)
im.show();
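As a quick sanity check on the .nb side, the maximum class confidence can be inspected with something like this sketch (assuming the (1, 11, 8400) output layout shown earlier, rows 0-3 being box coordinates and rows 4-10 the 7 class scores):
output = np.squeeze(stai_model.get_output(index=0))  # -> (11, 8400)
class_scores = output[4:, :]                         # 7 class confidences per candidate box
print("Highest class confidence in .nb output:", float(class_scores.max()))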
So, could there be an error in the conversion from best_integer_quant.tflite to best_integer_quant.nb? I used this command:
./stedgeai generate -m /home/alientek/best_saved_model/best_integer_quant.tflite --target stm32mp25
Thanks for your help.
Charles Fan
2025-10-29 8:25 AM
Could you share your tflite models in a .zip?
Have a good day,
Julian
2025-10-29 5:22 PM
Hi, Thanks for your reply!
Of course, no problem!
I first use the following script to convert the .pt file into a TFLite model. Attached are the files before and after the conversion and the output log.
from onnxruntime.quantization import quantize_dynamic, QuantType, quantize_static
import onnx
from ultralytics import YOLO
from onnxruntime.quantization import quantize_static, QuantType, CalibrationMethod, CalibrationDataReader
if __name__ == '__main__':
    model = YOLO('/home/alientek/best.pt')  # replace with your model path
    # 1. First export the model
    model.export(format='tflite', imgsz=640, int8=True)
    # 2. Load and check the model
    # onnx_model = onnx.load('runs/detect/train12/weights/best.onnx')
    # onnx.checker.check_model(onnx_model)
    # 3. Perform dynamic quantization
    # quantize_dynamic(
    #     'runs/detect/train12/weights/best.onnx',
    #     'runs/detect/train12/weights/best_int8.onnx',
    #     weight_type=QuantType.QUInt8,
    #     # optimize_model=True
    # )
    # quantize_static(
    #     'runs/detect/train6/weights/yolo11n.onnx',
    #     'runs/detect/train6/weights/yolo11n_int8.onnx',
    #     weight_type=QuantType.QUInt8,
    #     # optimize_model=True
    # )
    print("INT8 quantization complete!")
Thanks for your help!
Charles Fan
2025-10-30 3:53 AM
Using the saved_model.pb and this quantization script:
import tensorflow as tf
import numpy as np
def representative_dataset():
    for _ in range(10):
        data = np.random.rand(1, 640, 640, 3)
        yield [data.astype(np.float32)]
# Convert the model
converter = tf.lite.TFLiteConverter.from_saved_model("./saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8 # or tf.int8
converter.inference_output_type = tf.float32 # or tf.int8
converter.representative_dataset = representative_dataset
converter._experimental_disable_per_channel = True
tflite_model = converter.convert()
# Save the model
with open("model.tflite", 'wb') as f:
    f.write(tflite_model)
then converting it to model.nb with the ST Edge AI tool (stedgeai generate --target stm32mp25),
and testing on the board with this script:
from stai_mpu import stai_mpu_network
from numpy.typing import NDArray
from typing import Any, List
from pathlib import Path
from PIL import Image
from argparse import ArgumentParser
from timeit import default_timer as timer
import cv2 as cv
import numpy as np
import time
def intersection(rect1, rect2):
    """
    This method returns the intersection area of two rectangles
    """
    rect1_x1,rect1_y1,rect1_x2,rect1_y2 = rect1[:4]
    rect2_x1,rect2_y1,rect2_x2,rect2_y2 = rect2[:4]
    x1 = max(rect1_x1,rect2_x1)
    y1 = max(rect1_y1,rect2_y1)
    x2 = min(rect1_x2,rect2_x2)
    y2 = min(rect1_y2,rect2_y2)
    # Clamp to 0 so non-overlapping boxes do not produce a spurious positive area
    return max(0, x2-x1)*max(0, y2-y1)
def union(rect1,rect2):
    """
    This method returns the union area of two rectangles
    """
    rect1_x1,rect1_y1,rect1_x2,rect1_y2 = rect1[:4]
    rect2_x1,rect2_y1,rect2_x2,rect2_y2 = rect2[:4]
    rect1_area = (rect1_x2-rect1_x1)*(rect1_y2-rect1_y1)
    rect2_area = (rect2_x2-rect2_x1)*(rect2_y2-rect2_y1)
    return rect1_area + rect2_area - intersection(rect1,rect2)
def iou(rect1,rect2):
    """
    This method computes the IoU of two rectangles
    """
    return intersection(rect1,rect2)/union(rect1,rect2)
def get_results(stai_mpu_model, threshold, iou_threshold):
    # Lists to hold respective values while unwrapping.
    base_objects_list = []
    final_dets = []
    # Output rows (0-3: box coordinates, 4-10: class confidences, 7 classes here)
    output = stai_mpu_model.get_output(index=0)
    output = np.transpose(np.squeeze(output))
    #output = np.squeeze(output)
    print("output shape: ", output.shape)
    # Split output -> [:, 0:4]: box coordinates, [:, 4:]: class confidences
    confidence_level = output[:, 4:] # Shape: (8400, 7)
    print("confidence shape: ", confidence_level.shape)
    print(np.max(confidence_level, axis=0))
    print(np.max(confidence_level, axis=1))
    indices = np.where(confidence_level > threshold)[0]
    print(indices)
    filtered_output = output[indices]
    print(filtered_output.shape)
    for i in range(filtered_output.shape[0]):
        x_center, y_center, width, height = filtered_output[i][:4]
        left = (x_center - width/2)
        top = (y_center - height/2)
        right = (x_center + width/2)
        bottom = (y_center + height/2)
        score = np.max(filtered_output[i][4:]) # filtered_output[i][4]
        class_id = 0
        base_objects_list.append([left, top, right, bottom, score, class_id])
    # Do NMS
    base_objects_list.sort(key=lambda x: x[4], reverse=True)
    while len(base_objects_list)>0:
        final_dets.append(base_objects_list[0])
        base_objects_list = [objects for objects in base_objects_list if iou(objects,base_objects_list[0]) < iou_threshold]
    return final_dets
def load_labels(filename):
    with open(filename, 'r') as f:
        return [line.strip() for line in f.readlines()]
if __name__ == '__main__':
    parser = ArgumentParser()
    parser.add_argument('-i','--image', help='image to be classified.')
    parser.add_argument('-m','--model_file',help='model to be executed.')
    parser.add_argument('-l','--label_file', help='name of labels file.')
    parser.add_argument('--input_mean', default=127.5, help='input_mean')
    parser.add_argument('--input_std', default=127.5,help='input stddev')
    args = parser.parse_args()
    stai_model = stai_mpu_network(model_path=args.model_file, use_hw_acceleration=True)
    # Read input tensor information
    num_inputs = stai_model.get_num_inputs()
    input_tensor_infos = stai_model.get_input_infos()
    for i in range(0, num_inputs):
        input_tensor_shape = input_tensor_infos[i].get_shape()
        input_tensor_name = input_tensor_infos[i].get_name()
        input_tensor_rank = input_tensor_infos[i].get_rank()
        input_tensor_dtype = input_tensor_infos[i].get_dtype()
        print("**Input node: {} -Input_name:{} -Input_dims:{} - input_type:{} -Input_shape:{}".format(i, input_tensor_name,
                                                                                                      input_tensor_rank,
                                                                                                      input_tensor_dtype,
                                                                                                      input_tensor_shape))
        if input_tensor_infos[i].get_qtype() == "staticAffine":
            # Reading the input scale and zero point variables
            input_tensor_scale = input_tensor_infos[i].get_scale()
            input_tensor_zp = input_tensor_infos[i].get_zero_point()
        if input_tensor_infos[i].get_qtype() == "dynamicFixedPoint":
            # Reading the dynamic fixed point position
            input_tensor_dfp_pos = input_tensor_infos[i].get_fixed_point_pos()
    # Read output tensor information
    num_outputs = stai_model.get_num_outputs()
    output_tensor_infos = stai_model.get_output_infos()
    for i in range(0, num_outputs):
        output_tensor_shape = output_tensor_infos[i].get_shape()
        output_tensor_name = output_tensor_infos[i].get_name()
        output_tensor_rank = output_tensor_infos[i].get_rank()
        output_tensor_dtype = output_tensor_infos[i].get_dtype()
        print("**Output node: {} -Output_name:{} -Output_dims:{} - Output_type:{} -Output_shape:{}".format(i, output_tensor_name,
                                                                                                           output_tensor_rank,
                                                                                                           output_tensor_dtype,
                                                                                                           output_tensor_shape))
        if output_tensor_infos[i].get_qtype() == "staticAffine":
            # Reading the output scale and zero point variables
            output_tensor_scale = output_tensor_infos[i].get_scale()
            output_tensor_zp = output_tensor_infos[i].get_zero_point()
        if output_tensor_infos[i].get_qtype() == "dynamicFixedPoint":
            # Reading the dynamic fixed point position
            output_tensor_dfp_pos = output_tensor_infos[i].get_fixed_point_pos()
    # Reading input image
    input_width = input_tensor_shape[1]
    print(input_width)
    input_height = input_tensor_shape[2]
    print(input_height)
    input_image = Image.open(args.image).resize((input_width,input_height))
    input_data = np.expand_dims(input_image, axis=0)
    if input_tensor_dtype == np.float32:
        input_data = (np.float32(input_data) - args.input_mean) /args.input_std
    print("----1")
    img_array_after = np.array(input_data)
    print("Dtype after resize: ", img_array_after.dtype)
    print("Shape after resize: ", img_array_after.shape)
    print("----1test")
    if input_tensor_dtype == np.float32:
        print("float32")
    if input_tensor_dtype == np.float16:
        print("float16")
        #input_data = (np.float32(input_data) - args.input_mean) /args.input_std
        input_data = np.float32(input_data)
    print("----2test")
    stai_model.set_input(0, input_data)
    print("----2")
    start = timer()
    stai_model.run()
    end = timer()
    print("Inference time: ", (end - start) *1000, "ms")
    final_dets = get_results(stai_model, 0.5, 0.5)
    print(final_dets)
We get this output:
Loading dynamically: /usr/lib/libstai_mpu_ovx.so.6
[OVX]: Loading nbg model
**Input node: 0 -Input_name: -Input_dims:4 - input_type:uint8 -Input_shape:(1, 640, 640, 3)
**Output node: 0 -Output_name: -Output_dims:3 - Output_type:float16 -Output_shape:(1, 11, 8400)
640
640
----1
Dtype after resize: uint8
Shape after resize: (1, 640, 640, 3)
----1test
----2test
----2
Inference time: 118.66035591810942 ms
output shape: (8400, 11)
confidence shape: (8400, 7)
[0.75878906 0.01785278 0.54003906 0. 0. 0.0044632
0. ]
[0. 0. 0. ... 0. 0. 0.]
[8250 8251 8252 8270 8272 8291]
(6, 11)
[[0.3995361328125, 0.265625, 0.7967529296875, 1.001953125, 0.75878906, 0]]
Have a good day,
Julian
2025-10-31 12:56 AM
Thanks for your help! It's running.