‎2021-10-14 11:42 PM
Hello,
I have developed an application based on lwIP and FreeRTOS for the STM32F746NGH6. The code was generated using STM32CubeMX, FreeRTOS version 10.2.1, CMSIS-RTOS version 2.00 and Cube FW F7 V1.16.1.
The application runs a web server and receives data from a client at a rate of X Bytes/min.
The program crashes after around 60 hours. To reproduce this faster, I shortened the reception interval so that data arrives at X Bytes/4sec, and the program then crashed after around 10 minutes. The fault analyzer of STM32CubeIDE showed that the cause of the crash is a precise data bus error (PRECISERR).
The error seems to originate in the function HAL_ETH_GetReceivedFrame_IT. Please see the attached snapshots (good case and bad case). As far as I can tell, the address of RXDesc is invalid at the time of the crash, which causes the PRECISERR. Normally the address of RXDesc is 0x2004C000 but I do not understand how this is getting corrupted.
This is an excerpt from my linker script:
/* Memories definition */
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 320K
FLASH (rx) : ORIGIN = 0x08008000, LENGTH = 992K
Memory_B1(xrw) : ORIGIN = 0x2004C000, LENGTH = 0x80
Memory_B2(xrw) : ORIGIN = 0x2004C080, LENGTH = 0x80
Memory_B3(xrw) : ORIGIN = 0x2004C100, LENGTH = 0x17d0
Memory_B4(xrw) : ORIGIN = 0x2004D8D0, LENGTH = 0x17d0
}
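As a side check on the excerpt above, the following sketch compares the address ranges of the MEMORY entries. It only does arithmetic on the values quoted in the post; the type and function names (`mem_region_t`, `regions_overlap`) are hypothetical and not part of any ST or linker tooling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One MEMORY entry: start address and size in bytes. */
typedef struct {
    const char *name;
    uint32_t origin;
    uint32_t length;
} mem_region_t;

/* Values copied from the linker script excerpt. */
const mem_region_t RAM       = { "RAM",       0x20000000u, 320u * 1024u };
const mem_region_t MEMORY_B1 = { "Memory_B1", 0x2004C000u, 0x80u };
const mem_region_t MEMORY_B2 = { "Memory_B2", 0x2004C080u, 0x80u };
const mem_region_t MEMORY_B3 = { "Memory_B3", 0x2004C100u, 0x17d0u };
const mem_region_t MEMORY_B4 = { "Memory_B4", 0x2004D8D0u, 0x17d0u };

/* True when the half-open ranges [origin, origin + length) intersect. */
bool regions_overlap(const mem_region_t *a, const mem_region_t *b)
{
    assert(a->length > 0u && b->length > 0u);

    /* 64-bit end addresses avoid wrap-around near the top of the map. */
    uint64_t a_end = (uint64_t)a->origin + a->length;
    uint64_t b_end = (uint64_t)b->origin + b->length;

    return (a->origin < b_end) && (b->origin < a_end);
}
```

With these values, RAM spans 0x20000000 to 0x2004FFFF, so each Memory_B region reports an overlap with RAM, while the Memory_B regions do not overlap each other. Whether that overlap matters depends on where the sections, heap, and stack are actually placed, which the excerpt does not show.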
User defined FreeRTOS Tasks:
The lwIP port of ST creates two more tasks
As an attempt to solve this issue, I tried increasing the stack size of the RTOS tasks. But this unfortunately led to IMPRECISERR.
Could you please suggest how I could resolve this issue?
Thanks & Regards,
Abhijeet
‎2021-10-15 07:10 AM
> Normally the address of RXDesc is 0x2004C000 but I do not understand how this is getting corrupted.
Most likely an out of bounds write somewhere in the program.
If it's repeatable, set up a hardware watchpoint for RXDesc and you can see exactly when/where it gets changed.
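If keeping a debugger attached for a long run is impractical, a software range check before the descriptor is dereferenced can act as a rough substitute for the watchpoint. It only detects the corruption after the fact, not the write that caused it. A minimal sketch, assuming the RX descriptor table occupies the 0x80 bytes of Memory_B1 at 0x2004C000 and that descriptors are word-aligned; the macro names and `rx_desc_addr_is_plausible` are hypothetical and not part of the HAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: RX descriptors live in Memory_B1 from the linker script. */
#define RX_DESC_REGION_BASE 0x2004C000u
#define RX_DESC_REGION_SIZE 0x80u
#define RX_DESC_ALIGNMENT   4u

static_assert((RX_DESC_REGION_SIZE % RX_DESC_ALIGNMENT) == 0u,
              "region size must be a multiple of the alignment");

/* Returns true when addr lies inside the assumed descriptor region
 * and is word-aligned; false for anything else, including NULL. */
bool rx_desc_addr_is_plausible(uint32_t addr)
{
    if (addr < RX_DESC_REGION_BASE) {
        return false;
    }
    if ((addr - RX_DESC_REGION_BASE) >= RX_DESC_REGION_SIZE) {
        return false;
    }
    return (addr % RX_DESC_ALIGNMENT) == 0u;
}
```

On the target, the value to check would be the RXDesc pointer mentioned in the question, cast to `uint32_t`; a failed check is a convenient place for a breakpoint or a log entry.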
> As an attempt to solve this issue, I tried increasing the stack size of the RTOS tasks. But this unfortunately led to IMPRECISERR.
Also not good, and suggests there are memory management issues somewhere.
‎2021-10-17 10:35 PM
Thanks for the hint. I set a watchpoint and could see that the DMA descriptor pointers are corrupted in the HAL_ETH_GetReceivedFrame_IT function while it returns (at return HAL_OK).
/* Set HAL State to Ready */
heth->State = HAL_ETH_STATE_READY;
/* Process Unlocked */
__HAL_UNLOCK(heth);
/* Return function status */
return HAL_OK;
}
I assumed that it could be a problem with the HAL mutex mechanism and applied the solution from this thread: https://community.st.com/s/question/0D50X0000C5Tns8SQC/bug-stm32-hal-driver-lock-mechanism-is-not-interrupt-safe?t=1634426786423
Now I get an IMPRECISERR. In the fault analyzer, the PC points to a memcpy instruction and the link register to an instruction in the low_level_output function. Here is the excerpt from that function:
/* Copy the remaining bytes */
memcpy( (uint8_t*)((uint8_t*)buffer + bufferoffset), (uint8_t*)((uint8_t*)q->payload + payloadoffset), byteslefttocopy );
bufferoffset = bufferoffset + byteslefttocopy;
framelength = framelength + byteslefttocopy;
I am using the configUSE_NEWLIB_REENTRANT=1 option in FreeRTOS. I don't know why it crashes in the memcpy function.
‎2021-10-19 12:54 AM
It was found that there was a memory leak in another part of the code. Releasing this memory resolved the issue.
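A leak like this can be made visible by counting allocations that have not been released yet. The following is a minimal sketch built on the standard `malloc`/`free`; the wrapper names (`tracked_malloc`, `tracked_free`, `tracked_outstanding`) are hypothetical, it is not the code from this project, and it is not thread-safe, so under an RTOS the counter would need protection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Blocks handed out by tracked_malloc() and not yet freed. */
static size_t outstanding_blocks = 0;

/* Allocate like malloc() and count the block if it succeeded. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL) {
        outstanding_blocks++;
    }
    return p;
}

/* Free like free() and uncount the block; NULL is ignored. */
void tracked_free(void *p)
{
    if (p == NULL) {
        return;
    }
    assert(outstanding_blocks > 0);
    outstanding_blocks--;
    free(p);
}

/* Current number of live blocks allocated through the wrappers. */
size_t tracked_outstanding(void)
{
    return outstanding_blocks;
}
```

If the value returned by `tracked_outstanding()` keeps growing while the application is in a steady state, some code path is allocating without releasing.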
‎2021-10-19 03:07 AM
Thank you for sharing the solution 😊