2023-12-06 06:40 AM
Hi all,
We have a sporadic issue which is difficult to debug. Maybe you can give some advice / solutions / hints? Any input is welcome!
We are working on an application based on the project LoRaWAN_End_Node_FreeRTOS of STM32CubeWL V1.3.0, board: Nucleo-WL55JC1. We frequently add small updates and re-build the application (build system: CMake). The application usually works for many days without issues. However, in some builds we observe sporadic errors: the application hangs with interrupts disabled, so that the watchdog (IWDG) resets the MCU. We are positive that the program does not get stuck in an interrupt/exception-handler, because we instrumented them all.
Software components used:
One specific build of the application shows the following behavior: Reset; LoRaWAN join; transmit/receive data packets every 60 s; at uptime = 541 s: application hangs after waking up from STOP mode -> IWDG reset. We ran some experiments with this build to try to catch the bug. Observations:
Thanks!
2023-12-13 05:27 AM
Update:
We managed to connect the debugger and pause the target after the application hung (at uptime = 541 seconds) and before the watchdog triggered. (See above: with the debugger connected to the target, the application does not hang.)
We found that the application gets stuck in an endless loop in xTaskResumeAll() (Middlewares/Third_Party/FreeRTOS/Source/tasks.c).
Any ideas?
Did anyone successfully build an application from the above components?
Thanks!
2024-01-31 03:46 AM
@JBive.1 wrote: We found that the application gets stuck in xTaskResumeAll() in tasks.c in an endless loop
So did you go on to find what was causing it to get stuck there?
What prevents it from exiting that loop?
Have you tried the FreeRTOS forums for help with that?
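One thing that may be worth ruling out (nothing in this thread establishes that it is the cause here): on Cortex-M ports, an ISR that calls a FreeRTOS "...FromISR()" function while running at a priority above configMAX_SYSCALL_INTERRUPT_PRIORITY is not masked by the kernel's critical sections and can corrupt the kernel's lists. Defining configASSERT() in FreeRTOSConfig.h normally lets the port catch this at run time. The host-side sketch below only illustrates the rule; MAX_SYSCALL_PRIO, PRIO_BITS and isr_may_use_freertos_api() are hypothetical names, and the values 5 and 4 are assumptions that have to be checked against your own FreeRTOSConfig.h and device header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: same value as configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY
 * in the project's FreeRTOSConfig.h (5 is only a placeholder here). */
#define MAX_SYSCALL_PRIO 5u

/* Assumption: number of implemented NVIC priority bits on the core. */
#define PRIO_BITS 4u

/* Returns true if an ISR configured with preemption priority `prio` may
 * call FreeRTOS "...FromISR()" functions. On Cortex-M a lower number means
 * a more urgent interrupt, so the ISR priority must be numerically greater
 * than or equal to the syscall threshold. */
static bool isr_may_use_freertos_api(uint32_t prio)
{
    assert(prio < (1u << PRIO_BITS)); /* value must fit the priority field */
    return prio >= MAX_SYSCALL_PRIO;
}
```

With these assumed values, an ISR at priority 5 to 15 may use the "...FromISR()" API, while an ISR at priority 0 to 4 must not call into the kernel at all.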
2024-01-31 03:49 AM - edited 2024-01-31 03:53 AM
@JBive.1 wrote: We are positive that the program does not get stuck in an interrupt/exception-handler, because we instrumented them all.
Have you instrumented the whole application?
Does that show anything different between working & non-working scenarios?
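If it helps, here is a minimal sketch of the kind of instrumentation I mean: a small ring buffer of checkpoint IDs that you mark at interesting places (before entering STOP mode, after wake-up, around vTaskSuspendAll()/xTaskResumeAll(), and so on) and read back after the reset. All names (trace_buf, trace_mark, trace_recent, TRACE_DEPTH) are made up for illustration. On the target the buffer would have to live in RAM that is not cleared at startup, and the update would need a critical section or an atomic increment to be interrupt-safe; neither is shown here.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_DEPTH 16u /* hypothetical depth; must be a power of two */

/* Hypothetical trace record. Assumption: on target this is placed in a
 * RAM section that survives the IWDG reset (not shown). */
struct trace_buf {
    uint32_t head;                /* total number of events logged */
    uint16_t events[TRACE_DEPTH]; /* most recent checkpoint IDs */
};

static struct trace_buf g_trace;

/* Record one checkpoint ID. Not interrupt-safe as written. */
static void trace_mark(uint16_t id)
{
    /* The index mask below only works for power-of-two depths. */
    assert((TRACE_DEPTH & (TRACE_DEPTH - 1u)) == 0u);
    g_trace.events[g_trace.head & (TRACE_DEPTH - 1u)] = id;
    g_trace.head++;
}

/* Return the checkpoint logged `age` steps ago (0 = newest), or 0xFFFF
 * if that entry was never written or has already been overwritten. */
static uint16_t trace_recent(uint32_t age)
{
    if (age >= g_trace.head || age >= TRACE_DEPTH) {
        return 0xFFFFu;
    }
    return g_trace.events[(g_trace.head - 1u - age) & (TRACE_DEPTH - 1u)];
}
```

Comparing the last few checkpoints from a build that hangs with those from a build that runs fine should show where the two start to differ.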
@JBive.1 wrote:
- When a debugger (STM32CubeIDE) is connected -> no error or error happens later (?).
When you say "connected", is that just when it's physically attached, or during an active debug session?
Remember that, during an active debug session, the device won't actually be going to sleep - so that may be a clue ... ?