
guidance needed to debug hard fault

BTrem.1
Senior II

My code, which is in progress, stopped working with a hard fault. This is on an STM32G431 with several timers running and two DMA channels. I can single-step through the code with breakpoints on the timer IRQs and the DMA IRQs, and it works for several minutes, then jumps to the hard fault.

I've been trying to use an App note from Segger, AN00016, to debug this.

Here are the general registers:

General Registers	General Purpose and FPU Register Group	
	r0	0x1af0 (Hex)
	r1	0x48000000 (Hex)
	r2	1
	r3	0x602b5300 (Hex)
	r4	0x1eb (Hex)
	r5	0
	r6	8
	r7	0x40013400 (Hex)
	r8	0x400 (Hex)
	r9	0x800 (Hex)
	r10	0x100 (Hex)
	r11	1
	r12	0
	sp	0x20007ed0
	lr	0xfffffff1 (Hex)
	pc	0x800413c <HardFault_Handler>
	xpsr	1627389955
	msp	536903376
	psp	0

Per the app note, bit 2 of the lr register is 0, so the fault information was stacked on the main stack. The sp is at 0x20007ed0 and the stack contents are:
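
For reference, the EXC_RETURN check described above can be sketched in C. This is only an illustration that runs on a PC; the function names are made up for this example and are not from the app note.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of EXC_RETURN selects the stack that holds the exception frame:
   0 = main stack (MSP), 1 = process stack (PSP). */
static bool exc_return_frame_on_psp(uint32_t exc_return)
{
    return (exc_return & (UINT32_C(1) << 2)) != 0;
}

/* Bit 3 of EXC_RETURN: 0 = the fault interrupted Handler mode (an ISR),
   1 = it interrupted Thread mode. */
static bool exc_return_from_thread_mode(uint32_t exc_return)
{
    return (exc_return & (UINT32_C(1) << 3)) != 0;
}

/* Pick the stack pointer that addresses the stacked r0..xPSR frame. */
static uint32_t fault_frame_address(uint32_t exc_return,
                                    uint32_t msp, uint32_t psp)
{
    return exc_return_frame_on_psp(exc_return) ? psp : msp;
}
```

With the posted lr of 0xfffffff1 this selects the MSP value, and bit 3 being clear would suggest the fault hit while another handler was already running.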

memory	        value	       app note ref
0x20007ED0	00001AF0	r0
0x20007ED4	48000000	r1
0x20007ED8	00000001	r2
0x20007EDC	602B5300	r3
0x20007EE0	00000000	r12
0x20007EE4	080086B7	lr
0x20007EE8	602B5300	pc
0x20007EEC	6000000F	xPSR

Per the app note, the first four values are r0-r3 and the fifth value is r12.
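
The frame layout can be written down as a C struct, which makes it harder to mislabel a word. A minimal sketch, assuming the basic eight-word frame (no FPU state stacked); the type and function names are hypothetical, and the sample array is just the stack dump from this post.

```c
#include <stdint.h>
#include <string.h>

/* Basic exception frame, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} exception_frame_t;

/* The eight words posted above, starting at 0x20007ED0. */
static const uint32_t posted_stack[8] = {
    0x00001AF0u, 0x48000000u, 0x00000001u, 0x602B5300u,
    0x00000000u, 0x080086B7u, 0x602B5300u, 0x6000000Fu
};

/* Copy eight consecutive stack words into the named fields. */
static exception_frame_t read_exception_frame(const uint32_t *sp)
{
    exception_frame_t f;
    memcpy(&f, sp, sizeof f);
    return f;
}

/* IPSR field (xPSR bits 8:0): exception number that was active when the
   fault was taken; 0 means Thread mode. */
static uint32_t frame_active_exception(const exception_frame_t *f)
{
    return f->xpsr & 0x1FFu;
}

/* T bit (xPSR bit 24): must be 1 for Thumb code to execute. */
static int frame_thumb_bit(const exception_frame_t *f)
{
    return (int)((f->xpsr >> 24) & 1u);
}
```

Decoding the posted words this way gives an active exception number of 15 (SysTick on Cortex-M) and a cleared T bit in the stacked xPSR, which may be worth checking against what the code does in that handler.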

If I use this, the previous pc value is 0x602B5300. This is not in code; it is in the FSMC bank 1 memory area, which puts me at a dead end ... ;(

Is there any guidance on how to better debug a hard fault? I realize it is most likely an uninitialized pointer, a peripheral access to invalid memory, or a buffer overrun. I'm looking for these things, but it is a needle in a haystack. I was looking for a procedure that is a little more deterministic.

Any suggestions are appreciated.

Thanks,

3 REPLIES

There are routines I've posted to automatically output register/stack content.
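
As an illustration of what such a routine can look like (this is a generic sketch, not the routines referred to above): a small assembly stub picks MSP or PSP from lr and hands the frame pointer to a C function that prints the eight stacked words. `debug_puts` is a hypothetical output function you would have to supply (UART, SWO, RTT, ...), and the handler part is only compiled for Cortex-M4/M7 GCC builds.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format the eight stacked words (r0..xPSR) as text.
   Returns the number of characters written, or -1 if the buffer is too small. */
static int format_fault_frame(char *out, size_t n, const uint32_t *frame)
{
    static const char *const names[8] = {
        "r0", "r1", "r2", "r3", "r12", "lr", "pc", "xpsr"
    };
    size_t used = 0;

    for (int i = 0; i < 8; i++) {
        int w = snprintf(out + used, n - used, "%-4s = 0x%08" PRIX32 "\n",
                         names[i], frame[i]);
        if (w < 0 || (size_t)w >= n - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}

#if defined(__ARM_ARCH_7EM__)   /* target builds only */
extern void debug_puts(const char *s);   /* hypothetical output routine */

__attribute__((used)) void hard_fault_report(const uint32_t *frame)
{
    char text[160];
    if (format_fault_frame(text, sizeof text, frame) > 0)
        debug_puts(text);
    for (;;) { }   /* park here so the debugger can inspect state */
}

__attribute__((naked)) void HardFault_Handler(void)
{
    __asm volatile(
        "tst   lr, #4            \n"   /* EXC_RETURN bit 2: which stack? */
        "ite   eq                \n"
        "mrseq r0, msp           \n"   /* 0: frame is on the main stack  */
        "mrsne r0, psp           \n"   /* 1: frame is on the process stack */
        "b     hard_fault_report \n");
}
#endif
```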

PC suggests you perhaps popped something off the stack, or did a 'blx r3' using ASCII data, or something else unhelpful.

Look at what subroutine LR suggests is the origin, and what function is being called.

Walk back up the stack, identifying pointers and subroutines (PC and LR pushes). This might help you understand the call tree and the parameters passed.
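
One rough way to do that walk is to scan the stack words for values that look like Thumb return addresses: odd, and inside flash. The sketch below assumes a 128 KB flash window at 0x08000000; adjust both constants for your part. The names are hypothetical, and a hit is only a candidate, since stale data can look like an address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flash window; change to match the actual device. */
#define CODE_BASE_ADDR  0x08000000u
#define CODE_SIZE_BYTES (128u * 1024u)

/* A pushed LR has bit 0 set (Thumb) and points into code. */
static bool looks_like_return_address(uint32_t word)
{
    return (word & 1u) != 0 &&
           word >= CODE_BASE_ADDR &&
           word < CODE_BASE_ADDR + CODE_SIZE_BYTES;
}

/* Record the indices of stack words that look like return addresses.
   Returns how many indices were stored in hits[]. */
static size_t find_return_candidates(const uint32_t *stack, size_t words,
                                     size_t *hits, size_t max_hits)
{
    size_t found = 0;
    for (size_t i = 0; i < words && found < max_hits; i++) {
        if (looks_like_return_address(stack[i]))
            hits[found++] = i;
    }
    return found;
}
```

Run over the eight posted words, the only candidate is 0x080086B7 at offset 5 (the stacked lr); scanning further up the stack from there should show the callers.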

Walk your own code to understand the flow/logic.

Add sanity checking in the routines/logic implicated.

Add telemetry output so you can establish flow, stack depth, and general integrity as it approaches the fault.
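
For the stack depth part, one common trick is stack painting: fill the unused stack with a known pattern at startup and later count how much of the pattern is still intact. A minimal sketch; the pattern value and function names are arbitrary choices for this example, and on a real target you must only paint memory below the current stack pointer.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xA5A5A5A5u

/* Fill a stack region with the pattern. 'lowest' is the lowest address of
   the region; the Cortex-M stack grows down towards it. */
static void stack_paint(uint32_t *lowest, size_t words)
{
    for (size_t i = 0; i < words; i++)
        lowest[i] = STACK_FILL_PATTERN;
}

/* Count words, from the lowest address upwards, that still hold the
   pattern. A small or zero result means the stack came close to, or
   reached, the end of its region. */
static size_t stack_unused_words(const uint32_t *lowest, size_t words)
{
    size_t n = 0;
    while (n < words && lowest[n] == STACK_FILL_PATTERN)
        n++;
    return n;
}
```

Logging the result of `stack_unused_words` periodically gives a high-water mark, which helps rule a stack overflow in or out.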

TDK
Guru

The SCB fault status registers (CFSR, HFSR, BFAR, MMFAR) will typically provide the best info in the case of an uninitialized pointer or reading out of bounds. They will likely give you the address of the offending instruction.
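
To make the register values readable, the CFSR bits can be turned into names. A sketch using the Armv7-M bit positions; the helper names are hypothetical, and on the target you would pass in the value read from SCB->CFSR (address 0xE000ED28).

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct { uint32_t mask; const char *name; } cfsr_flag_t;

/* CFSR = MMFSR (bits 7:0) | BFSR (bits 15:8) | UFSR (bits 31:16). */
static const cfsr_flag_t cfsr_flags[] = {
    {1u << 0,  "IACCVIOL"},   {1u << 1,  "DACCVIOL"},
    {1u << 3,  "MUNSTKERR"},  {1u << 4,  "MSTKERR"},
    {1u << 5,  "MLSPERR"},    {1u << 7,  "MMARVALID"},
    {1u << 8,  "IBUSERR"},    {1u << 9,  "PRECISERR"},
    {1u << 10, "IMPRECISERR"},{1u << 11, "UNSTKERR"},
    {1u << 12, "STKERR"},     {1u << 13, "LSPERR"},
    {1u << 15, "BFARVALID"},  {1u << 16, "UNDEFINSTR"},
    {1u << 17, "INVSTATE"},   {1u << 18, "INVPC"},
    {1u << 19, "NOCP"},       {1u << 24, "UNALIGNED"},
    {1u << 25, "DIVBYZERO"},
};

/* Write the names of all set flags into out, separated by spaces.
   Returns the number of names written. */
static int decode_cfsr(uint32_t cfsr, char *out, size_t n)
{
    int count = 0;
    size_t used = 0;

    if (n > 0)
        out[0] = '\0';
    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if ((cfsr & cfsr_flags[i].mask) == 0)
            continue;
        int w = snprintf(out + used, n - used, "%s%s",
                         count ? " " : "", cfsr_flags[i].name);
        if (w < 0 || (size_t)w >= n - used)
            break;   /* buffer full: stop appending */
        used += (size_t)w;
        count++;
    }
    return count;
}

/* BFAR / MMFAR only hold a meaningful address when these bits are set. */
static bool bfar_is_valid(uint32_t cfsr)  { return (cfsr & (1u << 15)) != 0; }
static bool mmfar_is_valid(uint32_t cfsr) { return (cfsr & (1u << 7)) != 0; }

/* HFSR bit 30 (FORCED): a configurable fault escalated to HardFault. */
static bool hfsr_is_escalated(uint32_t hfsr) { return (hfsr & (1u << 30)) != 0; }
```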


One way to proceed is to look at the instruction just before the address in lr (which has its LSB set because Cortex-M runs permanently in Thumb mode). That instruction is a subroutine call, and the target of that call is the routine which caused the problem. I'm quite willing to bet that it's a bx r3, the result of a function pointer call. From the mixed source/disassembly view, find out which routine that is in.
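
The address arithmetic for that is small enough to sketch. The return address is lr with bit 0 cleared, and the call sits 2 bytes before it if it is a 16-bit instruction (register-indirect call) or 4 bytes before it if it is a 32-bit bl. The function names are made up for this example.

```c
#include <stdint.h>

/* Strip the Thumb bit to get the real return address. */
static uint32_t return_address_from_lr(uint32_t lr)
{
    return lr & ~UINT32_C(1);
}

/* Address of the call if it was a 16-bit instruction (e.g. blx rN). */
static uint32_t call_site_if_16bit(uint32_t lr)
{
    return return_address_from_lr(lr) - 2u;
}

/* Address of the call if it was a 32-bit instruction (e.g. bl label). */
static uint32_t call_site_if_32bit(uint32_t lr)
{
    return return_address_from_lr(lr) - 4u;
}
```

For the stacked lr of 0x080086B7 that means the return address is 0x080086B6, so the disassembly at 0x080086B4 (16-bit case) or 0x080086B2 (32-bit case) is where to look.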

JW