2024-03-21 09:13 AM
I want to run AI machine vision applications using STCubeAI. After training the model with TensorFlow components, I obtained a tflite file. Then, I imported and analyzed the model using the CubeAI toolkit in STM32CubeMX, and subsequently generated CubeAI C code. The first strange phenomenon I encountered here is that, although I have configured the use of external RAM and external flash and specified addresses in the CubeAI section of CubeMX, the compiler still reports insufficient RAM during compilation in Keil. This indicates that the external RAM settings I configured in CubeAI did not take effect.
Despite setting the usage of external RAM as mentioned above, the generated code still throws an error indicating insufficient RAM
So I had to use the 'attribute' keyword to specify that some variables used by CubeAI should be stored in external RAM, which is inefficient and error-prone.
Then the program compiled successfully. However, at runtime, the program triggered a hard fault at the 'return ai_platform_network_process(network, input, output);' line and further debugging was not possible. By examining the call stack, it was evident that the issue occurred at 'st_sssa8_ch_convolve_rank1upd'. How should I troubleshoot this problem? A common suggestion is to confirm if CRC is enabled. The answer is yes, of course; STM32CubeMX now automatically enables CRC when using CubeAI.
/* Global handle to reference the instantiated C-model */
static ai_handle network = AI_HANDLE_NULL;
/* Global c-array to handle the activations buffer */
AI_ALIGNED(32)
static ai_u8 activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE]__attribute__((at(0XC1200000))); //
/* Array to store the data of the input tensor */
AI_ALIGNED(32)
static ai_float in_data[AI_NETWORK_IN_1_SIZE]__attribute__((at(0XC1A00000))); //
/* or static ai_u8 in_data[AI_NETWORK_IN_1_SIZE_BYTES]; */
/* c-array to store the data of the output tensor */
AI_ALIGNED(32)
static ai_float out_data[AI_NETWORK_OUT_1_SIZE];
/* static ai_u8 out_data[AI_NETWORK_OUT_1_SIZE_BYTES]; */
/* Array of pointer to manage the model's input/output tensors */
static ai_buffer *ai_input;
static ai_buffer *ai_output;
int aiInit(void) {
ai_error err;
/* Create and initialize the c-model */
const ai_handle acts[] = { activations };
err = ai_network_create_and_init(&network, acts, NULL);
if (err.type != AI_ERROR_NONE) {};
/* Reteive pointers to the model's input/output tensors */
ai_input = ai_network_inputs_get(network, NULL);
ai_output = ai_network_outputs_get(network, NULL);
return 0;
}
int aiRun(const void *in_data, void *out_data) {
ai_i32 n_batch;
ai_error err;
/* 1 - Update IO handlers with the data payload */
ai_input[0].data = AI_HANDLE_PTR(in_data);
ai_output[0].data = AI_HANDLE_PTR(out_data);
/* 2 - Perform the inference */
n_batch = ai_network_run(network, &ai_input[0], &ai_output[0]);
if (n_batch != 1) {
err = ai_network_get_error(network);
};
return 0;
}
void main_loop()
{
/* The STM32 CRC IP clock should be enabled to use the network runtime library */
__HAL_RCC_CRC_CLK_ENABLE();
aiInit();
while (1)
{
/* 1 - Acquire, pre-process and fill the input buffers */
//acquire_and_process_data(in_data);
memcpy(in_data, in_buf, OUT_HEIGHT*OUT_WIDTH*3);
/* 2 - Call inference engine */
aiRun(in_data, out_data);
/* 3 - Post-process the predictions */
//post_process(out_data);
}
}
Solved! Go to Solution.
2024-03-25 01:00 AM
For activation buffer to be automatically allocated partially in external RAM you have to define memory pools where to specify the amount of internal memory that can be used and the amount of external memory.
In the generated code one array per memory pool will be defined.
Regards
2024-03-25 01:00 AM
For activation buffer to be automatically allocated partially in external RAM you have to define memory pools where to specify the amount of internal memory that can be used and the amount of external memory.
In the generated code one array per memory pool will be defined.
Regards
2024-03-25 01:08 AM
This is a minor issue; I have already addressed the memory allocation problem using the 'attribute at' keyword. The critical issue lies in the hard fault. I have ensured that the external SDRAM is available, performing full chip read/write verification three times every power-up. Even if the data type provided to it is incorrect and the allocated RAM space far exceeds the size required for the AI application, I can't think of any scenario that would lead to a hard fault.