2023-06-05 03:27 PM
I have been stuck on a tricky little issue for some time now.
I am using an STM32H753 in a reasonably large and complex application. When I try to erase the upper flash bank, the watchdog expires and resets the chip!
My suspicion is that the CPU stalls during the erase (~5 s) while the watchdog keeps running until it resets the chip.
I am running low on ideas for debugging this and am open to any help. Here are some things of significance:
It seems most likely something is trying to access the flash while it is being erased causing the CPU to stall. I guess due to a code bug. I am after ideas, or any insight into the STM32H7 that could help track this issue to its cause.
Thanks in advance...
2023-06-05 06:05 PM
Some ideas...
2023-06-05 06:18 PM
Thanks for your ideas. Here is my progress on them so far:
I have messed around with the MPU a lot and added test cases. The test cases always trigger a MemManage fault, but one never occurs during normal operation. The region is set to all access disabled, so any access to any address in that region by the CPU should raise a MemManage fault. I also checked the map file, and the linker script exposes only the lower bank, so the compiler places nothing in the upper bank.
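For anyone wanting to reproduce the "all access disabled" region, it comes down to two ARMv7-M MPU register values. Below is a minimal sketch, assuming bank 2 of the STM32H753 starts at 0x08100000 and spans 1 MB. The function names are hypothetical and only illustrate the bit layout; they are not taken from the actual project.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed STM32H753 layout: bank 2 at 0x08100000, 1 MB long. */
#define BANK2_BASE  0x08100000u
#define BANK2_LOG2  20u            /* 1 MB = 2^20 bytes */

/* MPU_RBAR: base address | VALID (bit 4) | region number (bits 3:0). */
uint32_t mpu_rbar(uint32_t base, uint32_t region)
{
    return (base & 0xFFFFFFE0u) | (1u << 4) | (region & 0xFu);
}

/* MPU_RASR for a no-access, execute-never region of 2^log2_size bytes.
 * AP (bits 26:24) = 000 blocks privileged and unprivileged access alike. */
uint32_t mpu_rasr_no_access(uint32_t log2_size)
{
    /* The smallest ARMv7-M region is 32 bytes (2^5), the largest 4 GB (2^32). */
    assert(log2_size >= 5u && log2_size <= 32u);

    uint32_t xn     = 1u << 28;                /* XN: no instruction fetches */
    uint32_t ap     = 0u << 24;                /* AP = 000: no access        */
    uint32_t size   = (log2_size - 1u) << 1;   /* region size = 2^(SIZE+1)   */
    uint32_t enable = 1u;                      /* region enable bit          */

    return xn | ap | size | enable;
}
```

On the target, the two values would be written to MPU->RBAR and MPU->RASR (CMSIS names), for example mpu_rbar(BANK2_BASE, 7) and mpu_rasr_no_access(BANK2_LOG2), where region 7 is an arbitrary choice for this sketch.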
Currently working through this. Unfortunately changing almost anything stops the problem from happening!
I can see the erase duration in the SWV graph. Approx 5-6 seconds.
Yup, rock solid. This is also an existing product that is pretty solid. Literally tens of thousands of flash erases have been done before, never with any problem. It has only happened since a big recent code restructure. Even now, adding a single nop can cause or fix the problem, so I think it is firmware/timing related.
Yup, watchdog reset flag is set. Also disabling the watchdog stops the reset happening, leaving just the CPU stall.
The watchdog is serviced every 16 ms (10-20 ms window) by a system thread that runs during the erase operation. The thread that initiates the erase is suspended during the erase.
There are safety-related compliance aspects that mean the watchdog needs to be left enabled at all times, and on a short leash!
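One way to see how far the servicing drifts from that window is to record the gap between consecutive refreshes. This is a minimal sketch, assuming a free-running 32-bit microsecond counter is available on the target; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for the gap between watchdog refreshes. */
typedef struct {
    uint32_t last_us;     /* timestamp of the previous refresh            */
    uint32_t max_gap_us;  /* longest gap observed so far                  */
    uint32_t violations;  /* number of gaps that fell outside the window  */
    int      primed;      /* 0 until the first refresh has been recorded  */
} wdg_monitor_t;

void wdg_monitor_init(wdg_monitor_t *m)
{
    m->last_us = 0u;
    m->max_gap_us = 0u;
    m->violations = 0u;
    m->primed = 0;
}

/* Call immediately before each watchdog refresh.
 * now_us: current value of the free-running microsecond counter (assumed).
 * Returns the gap since the previous refresh, or 0 for the first call. */
uint32_t wdg_monitor_kick(wdg_monitor_t *m, uint32_t now_us,
                          uint32_t window_min_us, uint32_t window_max_us)
{
    uint32_t gap = 0u;

    if (m->primed) {
        /* Unsigned subtraction stays correct across counter wraparound. */
        gap = now_us - m->last_us;
        if (gap > m->max_gap_us) {
            m->max_gap_us = gap;
        }
        if (gap < window_min_us || gap > window_max_us) {
            m->violations++;
        }
    }
    m->last_us = now_us;
    m->primed = 1;
    return gap;
}
```

With the window from this post, the call would be wdg_monitor_kick(&mon, now, 10000, 20000). Keeping the struct in RAM that is not cleared at startup would let the values be read back after the watchdog reset, though whether that is practical depends on the startup code.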
Thanks for the feedback. Helps to discuss it and makes me think down alternate paths.
The CPU should not halt at all during the erase, as the flash is in dual bank mode, so the main code, including watchdog servicing, should truck along as normal during the erase.
I am thinking there is a memory overrun or pointer issue somewhere and at runtime there is an access into that address space by accident. But buggered if I can find or prove that!
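If the upper bank is covered by a no-access MPU region, a stray access should end up in the MemManage handler, and the fault registers can then say whether it hit that bank. Here is a minimal sketch of the decoding, assuming bank 2 occupies 0x08100000 to 0x081FFFFF. The enum and function names are hypothetical; the bit positions are the standard Cortex-M MMFSR bits in the low byte of CFSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* MemManage status bits in the low byte of CFSR (Cortex-M). */
#define CFSR_IACCVIOL   (1u << 0)   /* instruction access violation */
#define CFSR_DACCVIOL   (1u << 1)   /* data access violation        */
#define CFSR_MMARVALID  (1u << 7)   /* MMFAR holds a valid address  */

/* Assumed STM32H753 bank 2 address range. */
#define BANK2_BASE  0x08100000u
#define BANK2_SIZE  0x00100000u

typedef enum {
    STRAY_NONE,          /* no MemManage violation flagged          */
    STRAY_DATA_BANK2,    /* data access into bank 2                 */
    STRAY_FETCH_BANK2,   /* instruction fetch from bank 2           */
    STRAY_OTHER          /* violation elsewhere, or address unknown */
} stray_kind_t;

bool addr_in_bank2(uint32_t addr)
{
    /* Addresses below BANK2_BASE wrap to a large value and fail the test. */
    return (uint32_t)(addr - BANK2_BASE) < BANK2_SIZE;
}

/* cfsr and mmfar are snapshots of SCB->CFSR and SCB->MMFAR; stacked_pc is
 * the PC from the exception stack frame. For an instruction violation the
 * MMFAR is not written, so the stacked PC is checked instead. */
stray_kind_t classify_memmanage(uint32_t cfsr, uint32_t mmfar,
                                uint32_t stacked_pc)
{
    if ((cfsr & CFSR_DACCVIOL) && (cfsr & CFSR_MMARVALID) &&
        addr_in_bank2(mmfar)) {
        return STRAY_DATA_BANK2;
    }
    if ((cfsr & CFSR_IACCVIOL) && addr_in_bank2(stacked_pc)) {
        return STRAY_FETCH_BANK2;
    }
    if (cfsr & (CFSR_IACCVIOL | CFSR_DACCVIOL)) {
        return STRAY_OTHER;
    }
    return STRAY_NONE;
}
```

In a handler, the stacked PC is the seventh word of the exception frame, and logging it alongside the result would point at the offending instruction.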
2023-11-23 04:21 AM
Hi ADunc.1,
Did you find the issue?
We see similar behavior: our FW runs from bank 1 while it erases bank 2.
The CPU does not stall during the flash erase operation, although we noticed that DMA interrupts were significantly delayed.
Does your execution time depend on interrupts?
Does anyone know if a flash erase operation affects interrupt execution?
2024-02-26 01:40 PM
Hi,
I would like to know if you found the root cause. I had a similar issue: the watchdog resets the chip when erasing flash. It is not a watchdog problem.
I suspected this "It seems most likely something is trying to access the flash while it is being erased causing the CPU to stall." but I have not found the proof yet.
Thanks,