STM32F746 Firmware and Debugger Lockup

JD1800
Associate II
Posted on August 03, 2017 at 19:09

We are seeing a lockup issue with an STM32F746 design. On the surface it looks like a normal firmware lockup: the watchdog triggers a reset and it recovers. But if I disable the watchdog and try to use the debugger to see what has happened, I cannot get the debugger to connect. Every time it hangs, the debug port is also inoperable until the processor is reset. From my testing, it appears that the code is completely stopped. Using GPIO toggles, I cannot detect any interrupt activity, but the LCD controller keeps refreshing the LCD and is reading the display data from SDRAM.

The issue can be quite infrequent. The device uses a CAN connection, and I was not able to see it lock up until we had several other CAN devices on the network. It also seems to be worse in the end application, where there is much more CAN traffic, but if I generate CAN traffic using a USB-CAN adapter, I cannot get it to fail more frequently. It will sometimes lock up a few times within an hour, and sometimes takes more than a day to occur. Yet it does not seem to depend on the actual CAN traffic, since if I have two of the boards connected to the same CAN network, they do not lock up at the same time.

I have tried using the debugger ITM interface to output various debug info, but doing so seems to prevent the lockup, or at least make it much less frequent. Most recently I created a trace buffer that gets dumped over ITM following a watchdog reset, but it has now run for 2 days without a lockup.

Has anyone else seen something like this? Does anyone know if there is anything firmware can do that could cause the debugger to fail to connect? I did use the GPIO lock register to lock the configuration for the SWD pins, but the debugger still fails to connect once the board locks up. I did not see any other registers that looked like they could prevent the debugger from working. The product is already in production and installed in the end application, so any assistance would be appreciated.

#stm32f7 #debugger-connection-failure #stm32f7-lockup
12 REPLIES
Posted on August 03, 2017 at 19:30

If you power down the core you'll lose debugger connectivity, ditto interfering with the pins used by the interface.

Check the DBGMCU settings.

Get telemetry via a serial port so you aren't reliant on the debugger. Add interactivity so you can probe state. Have a Hard Fault handler that outputs useful data.
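
For the Hard Fault suggestion, here is a minimal sketch of the formatting half of such a handler. It assumes the standard Cortex-M exception stack frame (r0-r3, r12, lr, pc, xPSR); the names `fault_frame_t` and `fault_frame_format` are hypothetical, not from any ST library. On the target, the handler would still need to pick MSP or PSP (bit 2 of EXC_RETURN) to find the frame, and then push the resulting line out of a polled UART; neither part is shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* The eight words a Cortex-M core pushes on exception entry, in stack order. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} fault_frame_t;

/* Hypothetical helper: render the most useful fields as one text line.
 * Returns what snprintf returns (length needed, excluding the terminator). */
int fault_frame_format(const fault_frame_t *f, char *out, size_t n)
{
    assert(f != NULL && out != NULL);
    return snprintf(out, n, "PC=%08lX LR=%08lX R0=%08lX xPSR=%08lX",
                    (unsigned long)f->pc, (unsigned long)f->lr,
                    (unsigned long)f->r0, (unsigned long)f->xpsr);
}
```

The stacked PC is usually the most valuable field, since it points at (or just past) the faulting instruction and can be looked up in the map file.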

JD1800
Associate II
Posted on August 08, 2017 at 18:19

I now have a build that does not drop the debugger when it hangs.  I am not quite sure what the difference is.  So I have found that the cause of the lock up is that something is trashing the QSPI flash control registers, so when the code tries to access the memory mapped QSPI flash, it never gets a response and hangs.  I am still trying to figure out what is trashing the QSPI registers.

Posted on August 09, 2017 at 02:59

Usually it's a string or table write where the index is outside the table, obliterating initialised structures.
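
One way to catch that kind of overrun is sketched below: a bounds-checked write plus a guard word placed directly after the table. Everything here (`guarded_table_t`, `TABLE_LEN`, `GUARD_VALUE`) is made up for illustration and is not from the original poster's code. A guard word only detects writes that run straight off the end of this particular table, so it is a diagnostic aid rather than a fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_LEN   16u          /* hypothetical table size */
#define GUARD_VALUE 0xDEADBEEFu  /* arbitrary canary pattern */

typedef struct {
    uint8_t  table[TABLE_LEN];
    uint32_t guard;              /* sits right after the table in memory */
} guarded_table_t;

void table_init(guarded_table_t *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < TABLE_LEN; i++) {
        t->table[i] = 0;
    }
    t->guard = GUARD_VALUE;
}

/* Refuses out-of-range indices instead of writing past the table. */
bool table_write(guarded_table_t *t, size_t idx, uint8_t value)
{
    if (idx >= TABLE_LEN) {
        return false;
    }
    t->table[idx] = value;
    return true;
}

/* Call periodically (or from the watchdog path) to spot an overrun. */
bool table_guard_intact(const guarded_table_t *t)
{
    return t->guard == GUARD_VALUE;
}
```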

Posted on January 11, 2018 at 00:51

Did you get to the root cause of this issue? I'm seeing a similar lockup when accessing QSPI flash through memory mapped mode. However, as far as I can tell, the QSPI registers aren't being trashed. Also, I'm running single threaded, so there is no chance that another thread is trashing anything.

Posted on January 16, 2018 at 06:08

I never did get to the bottom of this.  When I added a trace buffer in the backup RAM that could be dumped after reset, the problem stopped.  Even when I disabled the trace write function so it simply returned, it would no longer fail.  Since the customers wanted a solution, we ended up releasing this version.
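
For readers who want to try the same approach, a reset-surviving trace buffer could look roughly like the sketch below. This is not the code from this project; the names, the buffer size, and the magic value are assumptions. On the target, the struct would be placed in backup SRAM (or another region the startup code does not zero) through a linker section, which is not shown. The writer is also not interrupt-safe as written, so calls from several ISRs would need a critical section or an atomic increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRACE_MAGIC 0x54524143u  /* arbitrary marker, ASCII "TRAC" */
#define TRACE_LEN   64u          /* hypothetical size, power of two */

typedef struct {
    uint32_t magic;              /* valid only if equal to TRACE_MAGIC */
    uint32_t head;               /* total entries written so far */
    uint32_t entries[TRACE_LEN];
} trace_buf_t;

/* Call once at boot. Returns true if the buffer holds data from before
 * the reset; otherwise initialises it and returns false. */
bool trace_attach(trace_buf_t *t)
{
    assert(t != NULL);
    if (t->magic == TRACE_MAGIC) {
        return true;
    }
    t->magic = TRACE_MAGIC;
    t->head = 0;
    return false;
}

/* Record one event code, overwriting the oldest entry when full. */
void trace_write(trace_buf_t *t, uint32_t code)
{
    t->entries[t->head % TRACE_LEN] = code;
    t->head++;
}

/* Copy up to `max` of the most recent entries into `out`, oldest first.
 * Does not handle `head` wrapping past 2^32 writes. */
size_t trace_dump(const trace_buf_t *t, uint32_t *out, size_t max)
{
    uint32_t count = (t->head < TRACE_LEN) ? t->head : TRACE_LEN;
    if (count > max) {
        count = (uint32_t)max;
    }
    uint32_t start = t->head - count;
    for (uint32_t i = 0; i < count; i++) {
        out[i] = t->entries[(start + i) % TRACE_LEN];
    }
    return count;
}
```

After a watchdog reset, the boot code would call `trace_attach()` and, if it returns true, send the result of `trace_dump()` over ITM or a UART before clearing the buffer.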

Since the trace function was being called from a number of ISRs, it is possible that the slight change in interrupt timing prevented the issue from occurring. I am also not sure that the QSPI flash registers were getting trashed. When I was reading the registers the debugger was able to break and step, but I later discovered that reading any memory location seemed to give bogus values. But it did appear that the lockup was due to an access to the memory mapped QSPI space. I wonder whether the QSPI was inserting continuous wait states on the memory bus, and whether that could have prevented the debugger from accessing the memory as well.

I wanted to do more tests to try to determine which change seemed to be preventing the lockup, but in the end priorities did not allow for it.

Posted on January 22, 2018 at 18:52

Which RTOS are you running?  In theory, firmware shouldn't be able to cause this kind of issue.  But it would be interesting to know if we are running the same RTOS.  I am using eCos.

Posted on January 22, 2018 at 22:40

Ours is using FreeRTOS.

Michael Stauffer
Associate II
Posted on February 13, 2018 at 21:50

We have found that disabling the QSPI chip select timeout feature seems to resolve the issue for us.  I would recommend giving this a try.
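
The post does not say how the timeout was disabled. One plausible reading is clearing the timeout counter enable bit (TCEN) in QUADSPI_CR, sketched below as a helper that takes a pointer to the register so it can be exercised off-target. The bit position (bit 3) is taken from the STM32F7 reference manual as best I recall and should be verified there; the function name is hypothetical. On the target the argument would be `&QUADSPI->CR`. HAL users can probably get the same effect by setting `TimeOutActivation = QSPI_TIMEOUT_COUNTER_DISABLE` in the `QSPI_MemoryMappedTypeDef` passed to `HAL_QSPI_MemoryMapped()`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed from the reference manual: TCEN is bit 3 of QUADSPI_CR. */
#define QSPI_CR_TCEN (1u << 3)

/* Hypothetical helper: clear the timeout counter enable bit and leave
 * every other control bit untouched. */
void qspi_disable_cs_timeout(volatile uint32_t *cr)
{
    assert(cr != 0);
    *cr &= ~QSPI_CR_TCEN;
}
```

I believe this bit should only be changed while the peripheral is not busy, so the safest place for it is likely the memory-mapped mode setup, before the first access to the mapped region. With the timeout disabled, chip select stays asserted between accesses, which may raise the flash's idle power draw.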

Posted on February 13, 2018 at 22:10

Thanks for the tip.  I'll give it a try when I'm back into that project.