2025-04-22 7:11 AM
Hello, I'm writing my bachelor thesis on deploying a model on the STM32, and I just want to use CubeIDE to run my own program. It takes in the image data through the UART, puts the data into the model input buffer, then calls LL_ATON_RT_Main(&NN_Instance_Default); and finally gives out the output.
I really did everything I can. I already wrote the parameters generated by Cube AI into the external flash (exactly the same place as in the Cube AI report, which is 0x70000000). The model is from the official website: mobilenet_v1_0.25_224_fft_int8.tflite.
The file comes from Neural ART (Cube AI); its original name is network_atonbuf.xSPI2.raw. I copied it and renamed it to .BIN so that I can download it with the Cube Programmer instead of using the command line or some Python files of Neural ART.
After that, I run my program. Below is my ST program. It works without any "warnings or bugs", and I can inspect the output by monitoring the float buffer[101] in debug mode. The input data is also exactly the same as on the PC side; I even wrote a CRC test for it, tried many ways, and made sure that it is exactly the same! But the output is very different from that of the model I run on the computer (see the part after the code):
/* USER CODE BEGIN Header */
/**
******************************************************************************
* @file : main.c
* @brief : Main program body
******************************************************************************
* @attention
*
* Copyright (c) 2025 STMicroelectronics.
* All rights reserved.
*
* This software is licensed under terms that can be found in the LICENSE file
* in the root directory of this software component.
* If no LICENSE file comes with this software, it is provided AS-IS.
*
******************************************************************************
*/
/* USER CODE END Header */
/* Includes ------------------------------------------------------------------*/
#include "main.h"
/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include <string.h>
#include "ll_aton_runtime.h"
#include "stm32n6xx_nucleo_xspi.h"
/* USER CODE END Includes */
/* Private typedef -----------------------------------------------------------*/
/* USER CODE BEGIN PTD */
#define UART_TIMEOUT 20000
/* USER CODE END PTD */
/* Private define ------------------------------------------------------------*/
/* USER CODE BEGIN PD */
/* USER CODE END PD */
/* Private macro -------------------------------------------------------------*/
/* USER CODE BEGIN PM */
/* USER CODE END PM */
/* Private variables ---------------------------------------------------------*/
CACHEAXI_HandleTypeDef hcacheaxi;
CRC_HandleTypeDef hcrc;
//RAMCFG_HandleTypeDef hramcfg_SRAM1;
//RAMCFG_HandleTypeDef hramcfg_SRAM2;
//RAMCFG_HandleTypeDef hramcfg_SRAM3;
//RAMCFG_HandleTypeDef hramcfg_SRAM4;
//RAMCFG_HandleTypeDef hramcfg_SRAM5;
//RAMCFG_HandleTypeDef hramcfg_SRAM6;
UART_HandleTypeDef huart1;
//XSPI_HandleTypeDef hxspi2;
/* USER CODE BEGIN PV */
// PV for the AI function
uint32_t IMG_SIZE = 224 * 224 * 3; // 150,528 Bytes
LL_ATON_DECLARE_NAMED_NN_INSTANCE_AND_INTERFACE(Default);
// PV for the UART function
uint8_t wait_flag = 1;
uint8_t* ram_pointer;
uint16_t chunk_len = 65532;
int32_t remain_counter = 224*224*3;
uint16_t valid_len = 65532;
//uint8_t buffer[12];
uint32_t CRC_code;
/* USER CODE END PV */
/* Private function prototypes -----------------------------------------------*/
void SystemClock_Config(void);
static void MX_GPIO_Init(void);
static void MX_CACHEAXI_Init(void);
static void MX_CRC_Init(void);
//static void MX_RAMCFG_Init(void);
static void MX_USART1_UART_Init(void);
//static void MX_XSPI2_Init(void);
/* USER CODE BEGIN PFP */
/* USER CODE END PFP */
/* Private user code ---------------------------------------------------------*/
/* USER CODE BEGIN 0 */
static void NPURam_enable()
{
__HAL_RCC_NPU_CLK_ENABLE();
__HAL_RCC_NPU_FORCE_RESET();
__HAL_RCC_NPU_RELEASE_RESET();
/* Enable NPU RAMs (4x448KB) */
__HAL_RCC_AXISRAM3_MEM_CLK_ENABLE();
__HAL_RCC_AXISRAM4_MEM_CLK_ENABLE();
__HAL_RCC_AXISRAM5_MEM_CLK_ENABLE();
__HAL_RCC_AXISRAM6_MEM_CLK_ENABLE();
__HAL_RCC_RAMCFG_CLK_ENABLE();
RAMCFG_HandleTypeDef hramcfg = {0};
hramcfg.Instance = RAMCFG_SRAM3_AXI;
HAL_RAMCFG_EnableAXISRAM(&hramcfg);
hramcfg.Instance = RAMCFG_SRAM4_AXI;
HAL_RAMCFG_EnableAXISRAM(&hramcfg);
hramcfg.Instance = RAMCFG_SRAM5_AXI;
HAL_RAMCFG_EnableAXISRAM(&hramcfg);
hramcfg.Instance = RAMCFG_SRAM6_AXI;
HAL_RAMCFG_EnableAXISRAM(&hramcfg);
}
static void set_clk_sleep_mode(void)
{
/*** Enable sleep mode support during NPU inference *************************/
/* Configure peripheral clocks to remain active during sleep mode */
/* Keep all IP's enabled during WFE so they can wake up CPU. Fine tune
* this if you want to save maximum power
*/
__HAL_RCC_XSPI1_CLK_SLEEP_ENABLE(); /* For display frame buffer */
__HAL_RCC_XSPI2_CLK_SLEEP_ENABLE(); /* For NN weights */
__HAL_RCC_NPU_CLK_SLEEP_ENABLE(); /* For NN inference */
__HAL_RCC_CACHEAXI_CLK_SLEEP_ENABLE(); /* For NN inference */
__HAL_RCC_LTDC_CLK_SLEEP_ENABLE(); /* For display */
__HAL_RCC_DMA2D_CLK_SLEEP_ENABLE(); /* For display */
__HAL_RCC_DCMIPP_CLK_SLEEP_ENABLE(); /* For camera configuration retention */
__HAL_RCC_CSI_CLK_SLEEP_ENABLE(); /* For camera configuration retention */
__HAL_RCC_FLEXRAM_MEM_CLK_SLEEP_ENABLE();
__HAL_RCC_AXISRAM1_MEM_CLK_SLEEP_ENABLE();
__HAL_RCC_AXISRAM2_MEM_CLK_SLEEP_ENABLE();
__HAL_RCC_AXISRAM3_MEM_CLK_SLEEP_ENABLE();
__HAL_RCC_AXISRAM4_MEM_CLK_SLEEP_ENABLE();
__HAL_RCC_AXISRAM5_MEM_CLK_SLEEP_ENABLE();
__HAL_RCC_AXISRAM6_MEM_CLK_SLEEP_ENABLE();
}
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
if(ram_pointer[0] == 0xAA && ram_pointer[1] == 0xBB && ram_pointer[2] == 0xCC)
{
remain_counter = remain_counter - chunk_len;
uint8_t accept_buffer[3] = {0xAA,0xBB,0xCC};
HAL_UART_Transmit(&huart1, accept_buffer, 3, HAL_MAX_DELAY);
memmove(ram_pointer, ram_pointer + 3, valid_len);
ram_pointer = ram_pointer + chunk_len;
if (remain_counter>0)
{
if(remain_counter >chunk_len)
{
HAL_UART_Receive_IT(&huart1, ram_pointer, (chunk_len + 3));
}
else
{
valid_len = remain_counter;
HAL_UART_Receive_IT(&huart1, ram_pointer, (remain_counter + 3));
}
}
else
{
CRC_code = HAL_CRC_Calculate(&hcrc, (void*) 0x342E0000 , (224*224*3));
HAL_UART_Transmit(&huart1, (uint8_t*)&CRC_code, 4, HAL_MAX_DELAY);
wait_flag = 0;
}
}
else
{
uint8_t deny_buffer[3] = {0xAA,0xBB,0xDD};
HAL_UART_Transmit(&huart1, deny_buffer, 3, HAL_MAX_DELAY);
}
}
/* USER CODE END 0 */
/**
* @brief The application entry point.
* @retval int
*/
int main(void)
{
/* USER CODE BEGIN 1 */
/* USER CODE END 1 */
/* MCU Configuration--------------------------------------------------------*/
HAL_Init();
/* USER CODE BEGIN Init */
/* USER CODE END Init */
/* Configure the system clock */
SystemClock_Config();
/* USER CODE BEGIN SysInit */
/* USER CODE END SysInit */
/* Initialize all configured peripherals */
MX_GPIO_Init();
MX_CACHEAXI_Init();
MX_CRC_Init();
// MX_RAMCFG_Init();
MX_USART1_UART_Init();
// MX_XSPI2_Init();
/* USER CODE BEGIN 2 */
NPURam_enable();
//Enable Flash memory map
BSP_XSPI_NOR_Init_t NOR_Init;
NOR_Init.InterfaceMode = BSP_XSPI_NOR_OPI_MODE;
NOR_Init.TransferRate = BSP_XSPI_NOR_DTR_TRANSFER;
BSP_XSPI_NOR_Init(0, &NOR_Init);
//These are the same to the MX_XSPI2_Init(must be configured right)
BSP_XSPI_NOR_EnableMemoryMappedMode(0);
set_clk_sleep_mode();
//Input buffers information are in NN_Interface_Default->input_buffers_info()
//Get address of the first input:
const LL_Buffer_InfoTypeDef * input_infos = NN_Interface_Default.input_buffers_info();
void* i0_start_address = LL_Buffer_addr_start(&input_infos[0]);
// void* i0_end_address = LL_Buffer_addr_end(&input_infos[0]);
//Get address of the first output:
const LL_Buffer_InfoTypeDef * output_infos = NN_Interface_Default.output_buffers_info();
void* o0_start_address = LL_Buffer_addr_start(&output_infos[0]);
void* o0_end_address = LL_Buffer_addr_end(&output_infos[0]);
ram_pointer = (uint8_t*)i0_start_address;
/* USER CODE END 2 */
/* Infinite loop */
/* USER CODE BEGIN WHILE */
while (1)
{
HAL_UART_Receive_IT(&huart1, ram_pointer, (chunk_len + 3));
// wait for receive complete
while(wait_flag)
{
HAL_Delay(10);
}
wait_flag = 1;
ram_pointer = (uint8_t*)i0_start_address;
remain_counter = 224*224*3;
valid_len = 65532;
SCB_CleanDCache_by_Addr((uint32_t*)i0_start_address, 224*224*3);
// complete and set dynamic parameters back
LL_ATON_RT_Main(&NN_Instance_Default);
SCB_InvalidateDCache_by_Addr((uint32_t*)o0_start_address, sizeof(float) * 101);
// see the after number
size_t num_floats = ((uintptr_t)o0_end_address - (uintptr_t)o0_start_address) / sizeof(float);
float* output_data = (float*)o0_start_address;
float buffer[101] = {0};
memcpy(buffer, output_data, num_floats * sizeof(float));
HAL_Delay(1000);
/* USER CODE END WHILE */
/* USER CODE BEGIN 3 */
}
/* USER CODE END 3 */
}
/* USER CODE BEGIN CLK 1 */
/* USER CODE END CLK 1 */
/**
* @brief System Clock Configuration
* @retval None
*/
void SystemClock_Config(void)
{
RCC_OscInitTypeDef RCC_OscInitStruct = {0};
RCC_ClkInitTypeDef RCC_ClkInitStruct = {0};
/** Configure the System Power Supply
*/
if (HAL_PWREx_ConfigSupply(PWR_EXTERNAL_SOURCE_SUPPLY) != HAL_OK)
{
Error_Handler();
}
/* Enable HSI */
RCC_OscInitStruct.OscillatorType = RCC_OSCILLATORTYPE_HSI;
RCC_OscInitStruct.HSIState = RCC_HSI_ON;
RCC_OscInitStruct.HSIDiv = RCC_HSI_DIV1;
RCC_OscInitStruct.HSICalibrationValue = RCC_HSICALIBRATION_DEFAULT;
RCC_OscInitStruct.PLL1.PLLState = RCC_PLL_NONE;
RCC_OscInitStruct.PLL2.PLLState = RCC_PLL_NONE;
RCC_OscInitStruct.PLL3.PLLState = RCC_PLL_NONE;
RCC_OscInitStruct.PLL4.PLLState = RCC_PLL_NONE;
if (HAL_RCC_OscConfig(&RCC_OscInitStruct) != HAL_OK)
{
Error_Handler();
}
/** Get current CPU/System buses clocks configuration and if necessary switch
to intermediate HSI clock to ensure target clock can be set
*/
HAL_RCC_GetClockConfig(&RCC_ClkInitStruct);
if ((RCC_ClkInitStruct.CPUCLKSource == RCC_CPUCLKSOURCE_IC1) ||
(RCC_ClkInitStruct.SYSCLKSource == RCC_SYSCLKSOURCE_IC2_IC6_IC11))
{
RCC_ClkInitStruct.ClockType = (RCC_CLOCKTYPE_CPUCLK | RCC_CLOCKTYPE_SYSCLK);
RCC_ClkInitStruct.CPUCLKSource = RCC_CPUCLKSOURCE_HSI;
RCC_ClkInitStruct.SYSCLKSource = RCC_SYSCLKSOURCE_HSI;
if (HAL_RCC_ClockConfig(&RCC_ClkInitStruct) != HAL_OK)
{
/* Initialization Error */
Error_Handler();
}
}
/** Initializes the RCC Oscillators according to the specified parameters
* in the RCC_OscInitTypeDef structure.
*/
RCC_OscInitStruct.OscillatorType = RCC_OSCILLATORTYPE_NONE;
RCC_OscInitStruct.PLL1.PLLState = RCC_PLL_ON;
RCC_OscInitStruct.PLL1.PLLSource = RCC_PLLSOURCE_HSI;
RCC_OscInitStruct.PLL1.PLLM = 1;
RCC_OscInitStruct.PLL1.PLLN = 25;
RCC_OscInitStruct.PLL1.PLLFractional = 0;
RCC_OscInitStruct.PLL1.PLLP1 = 1;
RCC_OscInitStruct.PLL1.PLLP2 = 1;
RCC_OscInitStruct.PLL2.PLLState = RCC_PLL_ON;
RCC_OscInitStruct.PLL2.PLLSource = RCC_PLLSOURCE_HSI;
RCC_OscInitStruct.PLL2.PLLM = 1;
RCC_OscInitStruct.PLL2.PLLN = 25;
RCC_OscInitStruct.PLL2.PLLFractional = 0;
RCC_OscInitStruct.PLL2.PLLP1 = 1;
RCC_OscInitStruct.PLL2.PLLP2 = 1;
RCC_OscInitStruct.PLL3.PLLState = RCC_PLL_NONE;
RCC_OscInitStruct.PLL4.PLLState = RCC_PLL_NONE;
if (HAL_RCC_OscConfig(&RCC_OscInitStruct) != HAL_OK)
{
Error_Handler();
}
/** Initializes the CPU, AHB and APB buses clocks
*/
RCC_ClkInitStruct.ClockType = RCC_CLOCKTYPE_CPUCLK|RCC_CLOCKTYPE_HCLK
|RCC_CLOCKTYPE_SYSCLK|RCC_CLOCKTYPE_PCLK1
|RCC_CLOCKTYPE_PCLK2|RCC_CLOCKTYPE_PCLK5
|RCC_CLOCKTYPE_PCLK4;
RCC_ClkInitStruct.CPUCLKSource = RCC_CPUCLKSOURCE_IC1;
RCC_ClkInitStruct.SYSCLKSource = RCC_SYSCLKSOURCE_IC2_IC6_IC11;
RCC_ClkInitStruct.AHBCLKDivider = RCC_HCLK_DIV2;
RCC_ClkInitStruct.APB1CLKDivider = RCC_APB1_DIV1;
RCC_ClkInitStruct.APB2CLKDivider = RCC_APB2_DIV1;
RCC_ClkInitStruct.APB4CLKDivider = RCC_APB4_DIV1;
RCC_ClkInitStruct.APB5CLKDivider = RCC_APB5_DIV1;
RCC_ClkInitStruct.IC1Selection.ClockSelection = RCC_ICCLKSOURCE_PLL2;
RCC_ClkInitStruct.IC1Selection.ClockDivider = 2;
RCC_ClkInitStruct.IC2Selection.ClockSelection = RCC_ICCLKSOURCE_PLL1;
RCC_ClkInitStruct.IC2Selection.ClockDivider = 4;
RCC_ClkInitStruct.IC6Selection.ClockSelection = RCC_ICCLKSOURCE_PLL1;
RCC_ClkInitStruct.IC6Selection.ClockDivider = 4;
RCC_ClkInitStruct.IC11Selection.ClockSelection = RCC_ICCLKSOURCE_PLL1;
RCC_ClkInitStruct.IC11Selection.ClockDivider = 8;
if (HAL_RCC_ClockConfig(&RCC_ClkInitStruct) != HAL_OK)
{
Error_Handler();
}
}
/**
* @brief CACHEAXI Initialization Function
* @param None
* @retval None
*/
static void MX_CACHEAXI_Init(void)
{
/* USER CODE BEGIN CACHEAXI_Init 0 */
/* USER CODE END CACHEAXI_Init 0 */
/* USER CODE BEGIN CACHEAXI_Init 1 */
/* USER CODE END CACHEAXI_Init 1 */
hcacheaxi.Instance = CACHEAXI;
if (HAL_CACHEAXI_Init(&hcacheaxi) != HAL_OK)
{
Error_Handler();
}
/* USER CODE BEGIN CACHEAXI_Init 2 */
/* USER CODE END CACHEAXI_Init 2 */
}
/**
* @brief CRC Initialization Function
* @param None
* @retval None
*/
static void MX_CRC_Init(void)
{
/* USER CODE BEGIN CRC_Init 0 */
/* USER CODE END CRC_Init 0 */
/* USER CODE BEGIN CRC_Init 1 */
/* USER CODE END CRC_Init 1 */
hcrc.Instance = CRC;
hcrc.Init.DefaultPolynomialUse = DEFAULT_POLYNOMIAL_ENABLE;
hcrc.Init.DefaultInitValueUse = DEFAULT_INIT_VALUE_ENABLE;
hcrc.Init.InputDataInversionMode = CRC_INPUTDATA_INVERSION_NONE;
hcrc.Init.OutputDataInversionMode = CRC_OUTPUTDATA_INVERSION_DISABLE;
hcrc.InputDataFormat = CRC_INPUTDATA_FORMAT_BYTES;
if (HAL_CRC_Init(&hcrc) != HAL_OK)
{
Error_Handler();
}
/* USER CODE BEGIN CRC_Init 2 */
/* USER CODE END CRC_Init 2 */
}
/**
* @brief USART1 Initialization Function
* @param None
* @retval None
*/
static void MX_USART1_UART_Init(void)
{
/* USER CODE BEGIN USART1_Init 0 */
/* USER CODE END USART1_Init 0 */
/* USER CODE BEGIN USART1_Init 1 */
/* USER CODE END USART1_Init 1 */
huart1.Instance = USART1;
huart1.Init.BaudRate = 921600;
huart1.Init.WordLength = UART_WORDLENGTH_8B;
huart1.Init.StopBits = UART_STOPBITS_1;
huart1.Init.Parity = UART_PARITY_NONE;
huart1.Init.Mode = UART_MODE_TX_RX;
huart1.Init.HwFlowCtl = UART_HWCONTROL_NONE;
huart1.Init.OverSampling = UART_OVERSAMPLING_16;
huart1.Init.OneBitSampling = UART_ONE_BIT_SAMPLE_DISABLE;
huart1.Init.ClockPrescaler = UART_PRESCALER_DIV1;
huart1.AdvancedInit.AdvFeatureInit = UART_ADVFEATURE_NO_INIT;
if (HAL_UART_Init(&huart1) != HAL_OK)
{
Error_Handler();
}
if (HAL_UARTEx_SetTxFifoThreshold(&huart1, UART_TXFIFO_THRESHOLD_1_8) != HAL_OK)
{
Error_Handler();
}
if (HAL_UARTEx_SetRxFifoThreshold(&huart1, UART_RXFIFO_THRESHOLD_1_8) != HAL_OK)
{
Error_Handler();
}
if (HAL_UARTEx_DisableFifoMode(&huart1) != HAL_OK)
{
Error_Handler();
}
/* USER CODE BEGIN USART1_Init 2 */
/* USER CODE END USART1_Init 2 */
}
/**
* @brief GPIO Initialization Function
* @param None
* @retval None
*/
static void MX_GPIO_Init(void)
{
/* USER CODE BEGIN MX_GPIO_Init_1 */
/* USER CODE END MX_GPIO_Init_1 */
/* GPIO Ports Clock Enable */
__HAL_RCC_GPIOE_CLK_ENABLE();
__HAL_RCC_GPION_CLK_ENABLE();
__HAL_RCC_GPIOA_CLK_ENABLE();
/* USER CODE BEGIN MX_GPIO_Init_2 */
/* USER CODE END MX_GPIO_Init_2 */
}
/* USER CODE BEGIN 4 */
/* USER CODE END 4 */
/**
* @brief This function is executed in case of error occurrence.
* @retval None
*/
void Error_Handler(void)
{
/* USER CODE BEGIN Error_Handler_Debug */
/* User can add his own implementation to report the HAL error return state */
__disable_irq();
while (1)
{
}
/* USER CODE END Error_Handler_Debug */
}
#ifdef USE_FULL_ASSERT
/**
* @brief Reports the name of the source file and the source line number
* where the assert_param error has occurred.
* @param file: pointer to the source file name
* @param line: assert_param error line source number
* @retval None
*/
void assert_failed(uint8_t *file, uint32_t line)
{
/* USER CODE BEGIN 6 */
/* User can add his own implementation to report the file name and line number,
ex: printf("Wrong parameters value: file %s on line %d\r\n", file, line) */
/* USER CODE END 6 */
}
#endif /* USE_FULL_ASSERT */
This is the Python file that I use to transfer the image data from the computer to the N6. It takes a picture from the Food-101 dataset, resizes it to the right size, and sends it to the N6. It also runs the model on the computer and prints the result. Below is the Python code:
import serial
import time
import sys
import numpy as np
import tensorflow as tf
from PIL import Image
import tkinter as tk
from tkinter import ttk
# save hex image
def save_hex_data(filename, data):
    with open(filename, "w") as f:
        hex_str_list = [f"0x{byte:02X}" for byte in data]
        f.write(", ".join(hex_str_list))
# load tflite
interpreter = tf.lite.Interpreter(
    model_path="C:/workspace/BA_related_files/mobilenet_v1_0.25_224_fft_int8.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]['shape'][1:3]
# preprocess the image
image_path = "C:/workspace/BA_related_files/Picture/913020.jpg"
image = Image.open(image_path).convert('RGB')
image = image.resize(input_shape, Image.Resampling.LANCZOS)
input_data = np.array(image, dtype=np.uint8)
input_data = np.expand_dims(input_data, axis=0)
flatten_data = input_data.flatten()
save_hex_data(f"C:/workspace/BA_related_files/input_data_raw.txt", flatten_data)
# input_tensor = flatten_data.reshape(input_data.shape) might need to rearrange for the mcu
input_tensor = flatten_data.reshape(input_data.shape)
interpreter.set_tensor(input_details[0]['index'], input_tensor)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(f"model output:", output_data)
#STM32 default CRC
def stm32_crc(data: bytes) -> int:
    poly = 0x04C11DB7
    crc = 0xFFFFFFFF
    for b in data:
        crc ^= (b << 24)
        for _ in range(8):
            if crc & 0x80000000:
                crc = (crc << 1) ^ poly
            else:
                crc <<= 1
            crc &= 0xFFFFFFFF
    return crc
# Data send
ser = serial.Serial('COM14', 921600, timeout=5) # Adjust COM port and baudrate
send_data = flatten_data.tobytes()
start_sequence = accept_sequence= bytes([0xAA, 0xBB, 0xCC])
deny_sequence = bytes([0xAA, 0xBB, 0xDD])
remain_len = 224*224*3
chunk_len = 65532
crc_value = stm32_crc(send_data)
crc_bytes = crc_value.to_bytes(4, byteorder='little')
send_chunk = []
buffer = bytearray()
while_flag = 1
while while_flag:
    time.sleep(1)
    if remain_len >= chunk_len :
        send_chunk = start_sequence + send_data[0:chunk_len]
        send_data = send_data[chunk_len:]
        remain_len = remain_len - chunk_len
        print(f"Sent: {send_chunk}")
        ser.write(send_chunk)
    else:
        send_chunk = start_sequence + send_data
        print(f"Sent: {send_chunk}")
        ser.write(send_chunk)
        while_flag = 0
    while True:
        print('receiving')
        byte = ser.read(3)
        if byte:
            if bytes(byte) == accept_sequence:
                print("Accept sequence detected.")
                break
            elif bytes(byte) == deny_sequence:
                print("Deny sequence detected.")
                sys.exit(1)
            else:
                print("No right sequence detected.")
                sys.exit(1)
print(f"STM32 CRC32 should be : 0x{crc_value:08X}")
while True:
    byte = ser.read(4)
    if byte:
        if bytes(byte) == crc_bytes:
            print("CRC right value.")
            print(f"model output:", output_data)
            with open("C:/workspace/BA_related_files/food-101/meta/classes.txt", "r") as f:
                labels = [line.strip() for line in f.readlines()] # list of 101 strings
            # generate list for the data
            data = [(i, labels[i], float(f"{output_data[0][i]:.4f}")) for i in range(101)]
            # generate windows
            root = tk.Tk()
            root.title("Food101 result")
            root.geometry("500x600")
            # generate table
            tree = ttk.Treeview(root, columns=("Nubmer", "Food", "Possiblity"), show="headings", height=25)
            tree.heading("Nubmer", text="Nubmer")
            tree.heading("Food", text="Food")
            tree.heading("Possiblity", text="Possiblity")
            # set wide
            tree.column("Nubmer", width=60, anchor="center")
            tree.column("Food", width=200, anchor="center")
            tree.column("Possiblity", width=100, anchor="center")
            # insert data
            for row in data:
                tree.insert("", "end", values=row)
            # show the table
            tree.pack(fill="both", expand=True)
            root.mainloop()
            sys.exit(1)
        else:
            print("CRC wrong value.")
            sys.exit(1)
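As a side note on the framing used by the script and the firmware callback above, the small sketch below works out how the 224*224*3 byte image is split into frames with a 3-byte header. The helper name `plan_frames` is made up for this illustration and is not part of the original code. It reproduces the split only for this image size (the script's own loop uses a slightly different condition), and it shows why a payload of 65532 bytes is the largest that still fits the `uint16_t` size argument of `HAL_UART_Receive_IT` once the header is added.

```python
def plan_frames(total_len, chunk_len, header_len=3):
    """Return (payload_len, frame_len) pairs for one image transfer.

    Hypothetical helper: each frame is a header of `header_len` bytes
    followed by at most `chunk_len` payload bytes.
    """
    frames = []
    remaining = total_len
    while remaining > 0:
        payload = min(chunk_len, remaining)
        frames.append((payload, payload + header_len))
        remaining -= payload
    return frames


frames = plan_frames(224 * 224 * 3, 65532)
for payload, frame in frames:
    # HAL_UART_Receive_IT takes a uint16_t size, so a frame must fit in 65535.
    assert frame <= 0xFFFF
    print(f"payload={payload} bytes, frame on the wire={frame} bytes")
```

For this image the plan is two full frames plus one shorter final frame, which matches the three `HAL_UART_Receive_IT` calls the firmware issues per image.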
The result on the PC is below (without problem):
However, the output buffer of the N6 is different (very different!):
(others are all 0)
This is my bachelor thesis, so I have to solve this, but now I am stuck. My professor and supervisor also don't know how to fix this; they have never done this before either, and they also cannot debug it or find any logical error in it.
Thanks a lot in advance for helping me. I really need some help from the ST side.
2025-04-22 8:36 AM
Hello @Einstein_rookie_version,
Your variable wait_flag seems to be modified from an interrupt callback function, so it is safer to declare it as volatile.
But I don't think it is the cause of your problem, since the init part of your code seems to work and you compare the CRC before inference. Maybe it is a problem with the image format? The image needs to be sent in binary RGB888 format. You can easily check the image format by trying to open the binary image file (saved as a .data file) with GIMP, then selecting the RGB888 format and the right resolution.
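One way to produce such a file on the PC side is sketched below. It is only an illustration: the helper name `dump_rgb888` and the output file name are made up here, and the pixel buffer is assumed to be the same flat 224x224x3 byte sequence that the Python script sends over the UART (`flatten_data.tobytes()`). The sketch uses a solid red stand-in image so that it runs on its own.

```python
import os
import tempfile


def dump_rgb888(path, pixels, width, height):
    """Write a flat RGB888 buffer (one R, G, B byte triple per pixel,
    row-major) to `path` and return the number of bytes written."""
    expected = width * height * 3
    data = bytes(pixels)
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    with open(path, "wb") as f:
        f.write(data)
    return expected


if __name__ == "__main__":
    # Stand-in image: a solid red 224x224 frame. In the real script this
    # would be flatten_data.tobytes() (assumption, see the text above).
    width, height = 224, 224
    pixels = bytes([255, 0, 0]) * (width * height)
    out_path = os.path.join(tempfile.gettempdir(), "input_224x224.data")
    written = dump_rgb888(out_path, pixels, width, height)
    print(f"wrote {written} bytes to {out_path}")
```

If the file opens in GIMP as a recognizable picture at 224x224 in RGB mode, the byte order and layout of the buffer are as expected; a scrambled or color-shifted picture would point to a format problem.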
If you are working on an STM32N6-DK board, I suggest you run the deployment script of the ModelZoo to get a working example. From this point, you can run the code until the LL_ATON_RT_Main function, then restore your image in the NN input buffer (in RGB888 binary format) using the gdb command "restore" (it is easier than using the UART). Clean the cache. Then you can run the inference and check the output. This way should work, and it will give you a working starting point to debug your code.
Guillaume
2025-04-22 9:07 AM
Hello @Einstein_rookie_version,
Here is another possible source of issue:
When you convert a Python model (tflite here) to a C model, for x86 (PC) and Arm (MCU), there will be differences because the layers are recoded in C. For example, rounding differences.
You have differences between the C code and the Python code, but also between the C x86 code and C Arm code.
So, independently of your implementation, you can observe differences between the model running on the PC and on the target.
Let's make sure that the differences you observe are coming from the model, using the validation on target from Cube AI.
Open CubeMX and X-CUBE-AI, upload your model and click on validation on target:
You need to have a COS as close to 1 as possible.
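For a quick check outside the tool, the same kind of metric can be computed by hand on the two output vectors. The sketch below assumes that COS in the validation report refers to the cosine similarity between the reference output and the target output; the function name and the sample values are made up for illustration and are not taken from the real outputs in this thread.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equally long sequences of numbers."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero vector")
    return dot / (norm_a * norm_b)


# Made-up example values, standing in for the PC and N6 output vectors.
pc_output = [0.01, 0.90, 0.09]
target_output = [0.02, 0.88, 0.10]
print(f"COS = {cosine_similarity(pc_output, target_output):.4f}")
```

Small rounding differences between the PC and the target should leave this value very close to 1, while an output that is mostly zeros or points at a different class would give a clearly lower value.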
Have a good day,
Julian