2021-02-10 08:19 PM
I'm trying to deploy an object detection system on the STM32H743ZI2 board using CubeIDE and X-CUBE AI. For my evaluation process, I want to load images (stored as flattened text file arrays) from a USB flash drive.
To be clear, the USB appears to be set up properly as I can read/write and mount/unmount the USB flash drive without any issues. However, if I mount the USB, it causes the following line of code to hard fault:
nbatch = ai_sine_model_run(sine_model, &ai_input[0], &ai_output[0]);
I've reproduced the rest of the main() code below:
int main(void)
{
/* USER CODE BEGIN 1 */
char buf[50];
int buf_len = 0;
ai_error ai_err;
ai_i32 nbatch;
uint32_t timestamp;
float y_val;
// Chunk of memory used to hold intermediate values for neural network
AI_ALIGNED(4) ai_u8 activations[AI_SINE_MODEL_DATA_ACTIVATIONS_SIZE];
// Buffers used to store input and output tensors
AI_ALIGNED(4) ai_i8 in_data[AI_SINE_MODEL_IN_1_SIZE_BYTES];
AI_ALIGNED(4) ai_i8 out_data[AI_SINE_MODEL_OUT_1_SIZE_BYTES];
// Pointer to our model
ai_handle sine_model = AI_HANDLE_NULL;
// Initialize wrapper structs that hold pointers to data and info about the
// data (tensor height, width, channels)
ai_buffer ai_input[AI_SINE_MODEL_IN_NUM] = AI_SINE_MODEL_IN;
ai_buffer ai_output[AI_SINE_MODEL_OUT_NUM] = AI_SINE_MODEL_OUT;
// Set working memory and get weights/biases from model
ai_network_params ai_params = {
AI_SINE_MODEL_DATA_WEIGHTS(ai_sine_model_data_weights_get()),
AI_SINE_MODEL_DATA_ACTIVATIONS(activations)
};
// Set pointers wrapper structs to our data buffers
ai_input[0].n_batches = 1;
ai_input[0].data = AI_HANDLE_PTR(in_data);
ai_output[0].n_batches = 1;
ai_output[0].data = AI_HANDLE_PTR(out_data);
/* USER CODE END 1 */
/* Enable I-Cache---------------------------------------------------------*/
SCB_EnableICache();
/* Enable D-Cache---------------------------------------------------------*/
SCB_EnableDCache();
/* MCU Configuration--------------------------------------------------------*/
/* Reset of all peripherals, Initializes the Flash interface and the Systick. */
HAL_Init();
/* USER CODE BEGIN Init */
/* USER CODE END Init */
/* Configure the system clock */
SystemClock_Config();
/* USER CODE BEGIN SysInit */
/* USER CODE END SysInit */
/* Initialize all configured peripherals */
MX_GPIO_Init();
MX_ETH_Init();
MX_USART3_UART_Init();
MX_FATFS_Init();
MX_USB_HOST_Init();
MX_CRC_Init();
MX_TIM16_Init();
/* USER CODE BEGIN 2 */
// Start timer/counter
HAL_TIM_Base_Start(&htim16);
// Greetings!
buf_len = sprintf(buf, "\r\n\r\nSTM32 X-Cube-AI test\r\n");
printf(buf);
// Create instance of neural network
ai_err = ai_sine_model_create(&sine_model, AI_SINE_MODEL_DATA_CONFIG);
if (ai_err.type != AI_ERROR_NONE)
{
buf_len = sprintf(buf, "Error: could not create NN instance\r\n");
printf(buf);
while(1);
}
// Initialize neural network
if (!ai_sine_model_init(sine_model, &ai_params))
{
buf_len = sprintf(buf, "Error: could not initialize NN\r\n");
printf(buf);
while(1);
}
/* USER CODE END 2 */
/* Infinite loop */
/* USER CODE BEGIN WHILE */
int first_while_iter = 1;
while (1)
{
if (first_while_iter)
{
sprintf(buf, "Mounting USB...\r\n");
printf(buf);
while(1)
{
MX_USB_HOST_Process();
sprintf(name,"num5.bmp");
file_ready = 0;
read_bmp(name);
if(file_ready == 1){
// out_img = ProcessBmp(rtext);
break;
}
}
sprintf(buf, "USB Mounted Successfully\r\n");
printf(buf);
first_while_iter = 0;
}
sprintf(name,"b1.txt");
float * b1 = read_txt(name, 2);
printf("%.2f ", b1[0]);
printf("%.2f\n", b1[1]);
free(b1);
if (f_mount(NULL,USBHPath,0) != FR_OK)
{
Error_Handler();
}
// Fill input buffer (use test value)
for (uint32_t i = 0; i < AI_SINE_MODEL_IN_1_SIZE; i++)
{
((ai_float *)in_data)[i] = (ai_float)2.0f;
}
// Get current timestamp
timestamp = htim16.Instance->CNT;
// Perform inference
nbatch = ai_sine_model_run(sine_model, &ai_input[0], &ai_output[0]);
if (nbatch != 1) {
buf_len = sprintf(buf, "Error: could not run inference\r\n");
printf(buf);
}
// Read output (predicted y) of neural network
y_val = ((float *)out_data)[0];
// Print output of neural network along with inference time (microseconds)
buf_len = sprintf(buf,
"Output: %f | Duration: %lu\r\n",
y_val,
htim16.Instance->CNT - timestamp);
printf(buf);
// Wait before doing it again
HAL_Delay(1000);
/* USER CODE END WHILE */
/* USER CODE BEGIN 3 */
}
/* USER CODE END 3 */
}
Because I'm new to X-CUBE-AI, I followed this tutorial in addition to the documentation.
The code above just mounts the USB drive and loads a dummy text file before performing inference with a very basic model. The model is supposed to approximate sin(x), and the input is hard-coded to 2.0. (The data loaded from the USB drive is only for testing purposes; it is not used during inference.)
The purpose of the test is to make sure that I didn't mess up the hardware configuration ('.ioc') file, and to spot problematic interactions like this.
The USB code works perfectly with and without the X-CUBE AI inference code.
If I comment out the USB code, X-CUBE-AI works perfectly. However, this is not a solution, as I need a way to load images onto the board.
Any help/experience with USB OTG and X-Cube AI conflicts would be greatly appreciated.
2021-02-11 12:42 AM
It may be a problem with the stack size, please try to increase it.
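To check whether the stack is really the culprit before resizing it, one option is "stack painting": fill the unused part of the stack with a known pattern at startup, run the application, and then count how many pattern bytes survived. The sketch below only illustrates the idea on an ordinary buffer. The names stack_paint, stack_unused_bytes and STACK_PAINT_BYTE are hypothetical, and on the target you would have to apply them to the stack region defined by your own linker script (painting only the part below the current stack pointer), which is an assumption about your setup.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_PAINT_BYTE 0xA5u /* hypothetical fill pattern */

/* Fill a memory region with the paint pattern. */
void stack_paint(uint8_t *region, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        region[i] = STACK_PAINT_BYTE;
    }
}

/* Count the bytes, starting at the lowest address, that still hold the
 * pattern. On a descending stack this is the margin that was never used. */
size_t stack_unused_bytes(const uint8_t *region, size_t size)
{
    size_t n = 0;
    while (n < size && region[n] == STACK_PAINT_BYTE) {
        n++;
    }
    return n;
}
```

If the count comes back as zero (or close to it) after the fault, the stack has most likely been exhausted; if a large margin remains, the cause is probably elsewhere.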
Regards
Daniel
2021-02-11 01:04 AM
Hello @SDosh.1
You declared the "activations" buffer inside your main() function, so it lives on the stack. This buffer can be very large and may cause a stack overflow. I would recommend allocating these buffers statically, e.g. as global variables:
AI_ALIGNED(4) ai_u8 activations[AI_SINE_MODEL_DATA_ACTIVATIONS_SIZE];
AI_ALIGNED(4) ai_i8 in_data[AI_SINE_MODEL_IN_1_SIZE_BYTES];
AI_ALIGNED(4) ai_i8 out_data[AI_SINE_MODEL_OUT_1_SIZE_BYTES];
int main(void)
{ ... }
Also, the USB library usually requires some heap space, more than the default size. You can check for malloc calls to see for yourself. The X-CUBE-AI NetworkRuntime library may also be using some heap space for VLA allocation (stack or heap depending on your compiler).
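As a rough way to see how much heap is actually requested, you could temporarily route the allocations you control through a small counting wrapper. This is only a sketch: tracked_malloc, tracked_free and the two accessor functions are hypothetical names, not part of the USB Host library or X-CUBE-AI, and it only counts requested bytes (not allocator overhead or allocations made inside prebuilt libraries).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header stored in front of each block to remember its size. */
typedef union {
    size_t size;
    max_align_t align; /* keeps the returned pointer suitably aligned */
} heap_hdr_t;

static size_t heap_in_use = 0; /* bytes currently allocated via the wrapper */
static size_t heap_peak = 0;   /* highest value heap_in_use has reached */

void *tracked_malloc(size_t size)
{
    heap_hdr_t *hdr = malloc(sizeof(heap_hdr_t) + size);
    if (hdr == NULL) {
        return NULL; /* heap exhausted: the caller must check for this */
    }
    hdr->size = size;
    heap_in_use += size;
    if (heap_in_use > heap_peak) {
        heap_peak = heap_in_use;
    }
    return hdr + 1; /* user data starts right after the header */
}

void tracked_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    heap_hdr_t *hdr = (heap_hdr_t *)ptr - 1;
    heap_in_use -= hdr->size;
    free(hdr);
}

size_t tracked_heap_in_use(void) { return heap_in_use; }
size_t tracked_heap_peak(void) { return heap_peak; }
```

Comparing the reported peak with the heap size configured for the project gives a first indication of whether the default heap is too small.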
Regards,
Guillaume
2021-02-12 02:20 PM
Thanks for your suggestions!
I've tried statically allocating the buffers as global variables, but the same issue occurs.
I've also tried increasing the stack size, and no matter how large I make the stack/heap (see images below), the exact same hard fault issue occurs.
Are there other suggestions as to what may be causing the issue?
2021-02-17 01:51 AM
Hello @SDosh.1 ,
Does the following code of yours work with both USB OTG and AI enabled?
// Fill input buffer (use test value)
for (uint32_t i = 0; i < AI_SINE_MODEL_IN_1_SIZE; i++)
{
((ai_float *)in_data)[i] = (ai_float)2.0f;
}
Could the hard fault come from a data buffer fetched from the USB flash drive that has already been freed (or overwritten) by the time it is passed to "ai_sine_model_run(...)"?
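If data read from the drive is ever used as network input, one way to rule this out is to copy it into the statically allocated input buffer before the temporary buffer is freed. The helper below is a minimal sketch under that assumption; fill_input is a hypothetical name and not part of X-CUBE-AI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n_src floats into a fixed-size input buffer.
 * Returns false and copies nothing if a pointer is NULL or the element
 * counts differ, so a failed file read cannot reach the inference call. */
bool fill_input(float *dst, size_t n_dst, const float *src, size_t n_src)
{
    if (dst == NULL || src == NULL || n_src != n_dst) {
        return false;
    }
    memcpy(dst, src, n_src * sizeof(float));
    return true;
}
```

After a successful copy the temporary buffer can be freed safely, because the inference call only sees the static buffer.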
Regards,
Romain
2021-02-17 12:10 PM
Thanks for the suggestion!
Unfortunately, that data buffer is not fetched from the USB flash drive. For now, the network input is hard-coded, and the USB code only loads dummy data. The purpose of this test was to spot potential issues between X-CUBE-AI and USB, as we later want to load images (stored as text files) from the USB drive.