How to debug faults which only occur with the debugger disconnected. STM32H7B3I-DK
- December 6, 2021
Hi,
I had a difficult problem: my project ran faultlessly when started from the debugger, but failed after a power cycle.
I first thought that a peripheral was not initializing because of a delay in the power-up phase. Adding a HAL_Delay of 2000 ms did not fix this, and in any event the supply was reaching 3.3 V in under 300 µs.
I then suspected a hard fault, so I added a GPIO output in the hard fault handler code:
void HardFault_Handler(void)
{
#if DEBUG_NMI_FAULTS == 1
    for(uint32_t i = 1000000; i > 0 ; i--)
    {
        HAL_GPIO_WritePin(ImpulseLED_GPIO_Port, ImpulseLED_Pin, GPIO_PIN_SET);
    }
    HAL_GPIO_WritePin(ImpulseLED_GPIO_Port, ImpulseLED_Pin, GPIO_PIN_RESET);
    NVIC_SystemReset();
#endif
}
Power cycling caused the GPIO to go high, proving the hard fault.
The remaining problem was where the hard fault was occurring, and why.
Following application note AN4989 (STM32 microcontroller debug toolbox), I set up UART 1, which is routed to the ST-LINK microcontroller, to send data to the ST-LINK Virtual COM port.
I then modified the startup file to include separate handlers for the different types of faults.
See the snippet below:
/*****************************************************************************/
.section .text.Reset_Handler
/*****************************************************************************/
.weak HardFault_Handler
.type HardFault_Handler, %function
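HardFault_Handler:
    movs r0,#4              /* mask for EXC_RETURN bit 2 */
    movs r1, lr             /* lr holds EXC_RETURN on exception entry */
    tst r0, r1
    beq _MSP1               /* bit 2 clear: frame is on the main stack */
    mrs r0, psp             /* bit 2 set: frame is on the process stack */
    b _HALT1
_MSP1:
    mrs r0, msp
_HALT1:
    ldr r1,[r0,#20]         /* offset 20 is the stacked LR; the stacked PC is at offset 24 */
    b HardFault_Handler_C   /* r0 = pointer to the stacked frame */
    bkpt #0                 /* not reached */
.size HardFault_Handler, .-HardFault_Handler
This code replaces the standard hard fault handlers in the interrupt file, which must be commented out or removed so that the assembly versions are the ones that get linked. You must also force the linker to use them by making the following modifications to another part of the startup file, viz: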
.weak HardFault_Handler
.thumb_set HardFault_Handler,HardFault_Handler
.weak MemManage_Handler
.thumb_set MemManage_Handler,MemManage_Handler
.weak BusFault_Handler
.thumb_set BusFault_Handler,BusFault_Handler
.weak UsageFault_Handler
.thumb_set UsageFault_Handler,UsageFault_Handler
The assembly code calls a different C handler depending on which type of fault has occurred, viz:
void HardFault_Handler_C(unsigned int *hardfault_args)
{
    /* Capture the fault status and address registers */
    _BFAR = SCB->BFAR;
    _MMAR = SCB->MMFAR;
    _CFSR = SCB->CFSR;
    /* Registers stacked by the core on exception entry */
    stacked_r0 = ((unsigned int) hardfault_args[0]);
    stacked_r1 = ((unsigned int) hardfault_args[1]);
    stacked_r2 = ((unsigned int) hardfault_args[2]);
    stacked_r3 = ((unsigned int) hardfault_args[3]);
    stacked_r12 = ((unsigned int) hardfault_args[4]);
    stacked_lr = ((unsigned int) hardfault_args[5]);
    stacked_pc = ((unsigned int) hardfault_args[6]);
    stacked_psr = ((unsigned int) hardfault_args[7]);
#if(PRINT_HARD_FAULTS == 1)
    printf("\n\r==== [HardFault_Handler] ====\n\r");
    printOutFault();
#endif
#if(LOOP_AT_COMPLETION == 1)
    __asm("BKPT #0\n") ; // Break into the debugger
#else
    for(uint32_t i = 1000000; i > 0 ; i--)
    {
        HAL_GPIO_WritePin(ImpulseLED_GPIO_Port, ImpulseLED_Pin, GPIO_PIN_SET);
    }
    HAL_GPIO_WritePin(ImpulseLED_GPIO_Port, ImpulseLED_Pin, GPIO_PIN_RESET);
    NVIC_SystemReset();
#endif
}
The hard fault handler prints details of the fault to a terminal program such as Tera Term, connected to the ST-LINK Virtual COM port, which appears in the COM port list in Windows Device Manager.
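For anyone reading the printed CFSR value by hand, here is a minimal sketch of a decoder that turns the raw register value into flag names. It is not taken from the attached files: decode_cfsr and the table are hypothetical names of mine, and the bit positions are those of the ARMv7-M Configurable Fault Status Register. It is plain C with no HAL dependency, so it can be tried on a PC before going into the handler.

```c
#include <stdint.h>
#include <stdio.h>

/* One CFSR flag: its bit mask and the architectural name (ARMv7-M). */
typedef struct {
    uint32_t    mask;
    const char *name;
} cfsr_flag_t;

/* Hypothetical lookup table; bits 0-7 are MMFSR, 8-15 BFSR, 16-31 UFSR. */
static const cfsr_flag_t cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL: instruction access violation" },
    { 1u << 1,  "DACCVIOL: data access violation" },
    { 1u << 3,  "MUNSTKERR: MemManage fault on exception return unstacking" },
    { 1u << 4,  "MSTKERR: MemManage fault on exception entry stacking" },
    { 1u << 5,  "MLSPERR: MemManage fault during lazy FP state save" },
    { 1u << 7,  "MMARVALID: MMFAR holds the faulting address" },
    { 1u << 8,  "IBUSERR: instruction bus error" },
    { 1u << 9,  "PRECISERR: precise data bus error" },
    { 1u << 10, "IMPRECISERR: imprecise data bus error" },
    { 1u << 11, "UNSTKERR: bus fault on exception return unstacking" },
    { 1u << 12, "STKERR: bus fault on exception entry stacking" },
    { 1u << 13, "LSPERR: bus fault during lazy FP state save" },
    { 1u << 15, "BFARVALID: BFAR holds the faulting address" },
    { 1u << 16, "UNDEFINSTR: undefined instruction" },
    { 1u << 17, "INVSTATE: invalid EPSR state" },
    { 1u << 18, "INVPC: invalid PC load on exception return" },
    { 1u << 19, "NOCP: coprocessor access not permitted" },
    { 1u << 24, "UNALIGNED: unaligned access" },
    { 1u << 25, "DIVBYZERO: divide by zero" },
};

/* Print every flag set in cfsr and return how many were set. */
int decode_cfsr(uint32_t cfsr)
{
    int count = 0;
    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if (cfsr & cfsr_flags[i].mask) {
            printf("  %s\n\r", cfsr_flags[i].name);
            count++;
        }
    }
    return count;
}
```

Calling decode_cfsr(_CFSR) from the handler after the banner line would list the active flags. If MMARVALID or BFARVALID is among them, the _MMAR or _BFAR value captured above is the address that caused the fault.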
In my case the fault occurred only in a FIR filter, and only when the FIR filter's circular buffer was placed in DTCM. I found the location from the program counter value that the fault handler printed to the terminal.
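To make the "program counter value" step concrete, this is a small sketch of the layout of the eight words the core stacks on exception entry, which is what hardfault_args points at. The enum and helper names are hypothetical, and it assumes the basic frame without floating-point context. The EXC_RETURN test mirrors the bit 2 check done in the assembly stub.

```c
#include <stdint.h>

/* Word indices into the basic exception frame (no FPU context assumed). */
enum {
    FRAME_R0  = 0,
    FRAME_R1  = 1,
    FRAME_R2  = 2,
    FRAME_R3  = 3,
    FRAME_R12 = 4,
    FRAME_LR  = 5,   /* byte offset 20: return address of the interrupted function */
    FRAME_PC  = 6,   /* byte offset 24: address of the faulting instruction */
    FRAME_PSR = 7
};

/* EXC_RETURN bit 2 selects the stack that holds the frame:
   0 = main stack (MSP), 1 = process stack (PSP). */
int frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & 0x4u) != 0u;
}

/* Return the stacked program counter from a frame pointer. */
uint32_t faulting_pc(const uint32_t *frame)
{
    return frame[FRAME_PC];
}
```

The address returned by faulting_pc can be mapped back to a source line with the toolchain's addr2line, for example `arm-none-eabi-addr2line -e firmware.elf 0x08004560`, where the ELF name and the address are placeholders. For an imprecise bus fault the stacked PC may point a few instructions past the real culprit.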
Once I inspected the code, the problem was obvious. When the buffer was placed in DTCM RAM it was not zero-initialised, so the pointer into the buffer held an invalid value. When the same buffer pointer defaulted to AXI RAM, it was initialised. I am unsure why it did not fail when started from the debugger, but I suspect the DTCM RAM must get initialised in the startup phase of a debug session. Has anyone a theory on this?
Initializing the buffer pointer to zero fixed the problem.
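As a rough illustration of the fix (not the code from the attached zip), here is a sketch of a circular-buffer state that is explicitly initialised before first use instead of relying on the startup code to zero it. It assumes, as a typical startup file does, that only .bss is zeroed, so a variable placed in a custom DTCM section holds whatever the RAM powered up with. All names, the tap count, and the section name in the comment are hypothetical, and it uses an index rather than a raw pointer.

```c
#include <stdint.h>
#include <string.h>

#define FIR_TAPS 8u   /* hypothetical filter length */

/* Circular buffer state for a FIR filter. */
typedef struct {
    float    samples[FIR_TAPS];
    uint32_t head;                 /* index of the next write position */
} fir_state_t;

/* On the target this object could be placed in DTCM with something like
   __attribute__((section(".dtcm_data"))); the section name is an assumption
   and depends on the linker script. */
fir_state_t fir_state;

/* Put the state into a known condition; call once before the first sample. */
void fir_init(fir_state_t *s)
{
    memset(s, 0, sizeof *s);
}

/* Store one sample and advance the write index with wrap-around. */
void fir_push(fir_state_t *s, float x)
{
    s->samples[s->head] = x;
    s->head = (s->head + 1u) % FIR_TAPS;
}
```

Calling fir_init(&fir_state) early in main makes the behaviour independent of where the linker places the object and of whatever the RAM contained at power-up.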
I have attached the relevant files as a zip to assist others. I have included a flowchart (PDF) of the problem I experienced to assist in understanding.
I didn't produce all the code from scratch; there is a lot of other people's work in this. I just pulled it together to find a particular bug.