HardFault Debug in STM32CubeIDE

rrnicolay · ‎2019-12-17

Some time ago, I was getting a HardFault in an STM32F103 bare-metal firmware. I even posted the question here, but I wasn't able to fix it. I have since moved to FreeRTOS with CMSIS (not to get rid of the problem; I was migrating anyway), and the issue is still there.

I think it is related to printing some floating point numbers.

I checked "Use float with printf from newlib-nano (-u _print_float)" in the project.

One task prints the G-force values for the 3 axes as floats to a UART (redirected through _write()) every 10 ms.

printf("%5.2f, %5.2f, %5.2f\r\n", accScaled[0], accScaled[1], accScaled[2]);

/* Redirection of printf */
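int _write(int file, char *ptr, int len)
{
	static const char warn[] = "[WARN] Tx size exceeded\r\n";
	int txLen = len;

	(void)file;

	/* Wait for the previous transaction to complete */
	while (HAL_UART_GetState(uartHandler) != HAL_UART_STATE_READY) {}

	/* Fill buffer */
	if (len < TX_BUF_SZ)
	{
		memcpy(txBuffer, ptr, len);
	}
	else
	{
		/* Payload too large: send the warning with its own length,
		   not the oversized len, to avoid reading past txBuffer */
		txLen = (int)sizeof(warn) - 1;
		memcpy(txBuffer, warn, txLen);
	}

	/* Transmit */
	HAL_UART_Transmit_DMA(uartHandler, (uint8_t *)txBuffer, txLen);

	/* Report the full length as consumed so the caller does not retry */
	return len;
}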

It doesn't seem to be a stack overflow, since it runs for several seconds (sometimes minutes) and uxTaskGetStackHighWaterMark() reports 90 words of stack left on the task.

I started debugging the HardFault using this commit.

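.section  .text.Reset_Handler
.weak  HardFault_Handler
.type  HardFault_Handler, %function
HardFault_Handler:
  movs r0,#4              /* bit 2 of EXC_RETURN selects the stack in use */
  movs r1, lr             /* LR holds EXC_RETURN on exception entry */
  tst r0, r1
  beq _MSP                /* bit clear: frame was pushed on MSP */
  mrs r0, psp             /* bit set: frame was pushed on PSP */
  b _HALT
_MSP:
  mrs r0, msp
_HALT:
  ldr r1,[r0,#20]         /* r1 = stacked LR (frame offset 20) */
  b hard_fault_handler_c  /* r0 = pointer to the stacked frame */
  bkpt #0                 /* not reached: the branch above does not come back here */

.size  HardFault_Handler, .-HardFault_Handler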
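
For readers following the handler, the stack selection it performs can be sketched in plain C. This is only an illustration that runs on a host PC; the names `frame_is_on_psp`, `EXC_RETURN_SPSEL` and the `FRAME_*` indices are made up for this sketch and are not part of CMSIS or the HAL.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of EXC_RETURN (the value found in LR on exception entry) tells
 * which stack received the exception frame: 0 = MSP, 1 = PSP. */
#define EXC_RETURN_SPSEL (1u << 2)

/* Returns true when the stacked frame has to be read through PSP. */
bool frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & EXC_RETURN_SPSEL) != 0u;
}

/* Word indices into the 8-word frame pushed by the core on exception
 * entry. Multiply by 4 to get the byte offset used in the assembly. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_PSR
};
```

With these indices, the `ldr r1,[r0,#20]` in the handler reads the stacked LR (index 5), while the faulting PC sits at index 6, byte offset 24. Since the FreeRTOS Cortex-M ports run tasks on the process stack, a fault raised inside a task should take the PSP branch.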

void hard_fault_handler_c(unsigned long *hardfault_args){
  volatile unsigned long stacked_r0 ;
  volatile unsigned long stacked_r1 ;
  volatile unsigned long stacked_r2 ;
  volatile unsigned long stacked_r3 ;
  volatile unsigned long stacked_r12 ;
  volatile unsigned long stacked_lr ;
  volatile unsigned long stacked_pc ;
  volatile unsigned long stacked_psr ;
  volatile unsigned long _CFSR ;
  volatile unsigned long _HFSR ;
  volatile unsigned long _DFSR ;
  volatile unsigned long _AFSR ;
  volatile unsigned long _BFAR ;
  volatile unsigned long _MMAR ;
 
  stacked_r0 = ((unsigned long)hardfault_args[0]) ;
  stacked_r1 = ((unsigned long)hardfault_args[1]) ;
  stacked_r2 = ((unsigned long)hardfault_args[2]) ;
  stacked_r3 = ((unsigned long)hardfault_args[3]) ;
  stacked_r12 = ((unsigned long)hardfault_args[4]) ;
  stacked_lr = ((unsigned long)hardfault_args[5]) ;
  stacked_pc = ((unsigned long)hardfault_args[6]) ;
  stacked_psr = ((unsigned long)hardfault_args[7]) ;
 
  // Configurable Fault Status Register
  // Consists of MMSR, BFSR and UFSR
  _CFSR = (*((volatile unsigned long *)(0xE000ED28))) ;
 
  // Hard Fault Status Register
  _HFSR = (*((volatile unsigned long *)(0xE000ED2C))) ;
 
  // Debug Fault Status Register
  _DFSR = (*((volatile unsigned long *)(0xE000ED30))) ;
 
  // Auxiliary Fault Status Register
  _AFSR = (*((volatile unsigned long *)(0xE000ED3C))) ;
 
  // Read the Fault Address Registers. These may not contain valid values.
  // Check BFARVALID/MMARVALID to see if they are valid values
  // MemManage Fault Address Register
  _MMAR = (*((volatile unsigned long *)(0xE000ED34))) ;
  // Bus Fault Address Register
  _BFAR = (*((volatile unsigned long *)(0xE000ED38))) ;
 
  (void) stacked_r0 ;
  (void) stacked_r1 ;
  (void) stacked_r2 ;
  (void) stacked_r3 ;
  (void) stacked_r12 ;
  (void) stacked_lr ;
  (void) stacked_pc ;
  (void) stacked_psr ;
  (void) _CFSR ;
  (void) _HFSR ;
  (void) _DFSR ;
  (void) _AFSR ;
  (void) _BFAR ;
  (void) _MMAR ;
 
  __asm("BKPT #0\n") ; // Break into the debugger
}
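
Once the handler has captured _CFSR, the raw value still has to be decoded by hand. A small helper along the following lines can name the bits that are set. It is a sketch that runs on a host PC: `cfsr_describe` and `fault_bit_t` are hypothetical names, and the table lists only the Cortex-M3 flags of the Configurable Fault Status Register.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint32_t    mask;
    const char *name;
} fault_bit_t;

/* CFSR = MMFSR (bits 0-7) | BFSR (bits 8-15) | UFSR (bits 16-31). */
static const fault_bit_t cfsr_bits[] = {
    { 1u << 0,  "IACCVIOL"    }, { 1u << 1,  "DACCVIOL"  },
    { 1u << 3,  "MUNSTKERR"   }, { 1u << 4,  "MSTKERR"   },
    { 1u << 7,  "MMARVALID"   },
    { 1u << 8,  "IBUSERR"     }, { 1u << 9,  "PRECISERR" },
    { 1u << 10, "IMPRECISERR" }, { 1u << 11, "UNSTKERR"  },
    { 1u << 12, "STKERR"      }, { 1u << 15, "BFARVALID" },
    { 1u << 16, "UNDEFINSTR"  }, { 1u << 17, "INVSTATE"  },
    { 1u << 18, "INVPC"       }, { 1u << 19, "NOCP"      },
    { 1u << 24, "UNALIGNED"   }, { 1u << 25, "DIVBYZERO" },
};

/* Writes the names of the set flags into out, separated by spaces.
 * Returns how many names were written; stops early if out is too small. */
int cfsr_describe(uint32_t cfsr, char *out, size_t out_sz)
{
    size_t used = 0;
    int count = 0;

    if (out_sz > 0) {
        out[0] = '\0';
    }
    for (size_t i = 0; i < sizeof cfsr_bits / sizeof cfsr_bits[0]; i++) {
        if ((cfsr & cfsr_bits[i].mask) == 0u) {
            continue;
        }
        int n = snprintf(out + used, out_sz - used, "%s%s",
                         count ? " " : "", cfsr_bits[i].name);
        if (n < 0 || (size_t)n >= out_sz - used) {
            break;  /* no room left for this name */
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

When IMPRECISERR is reported, the stacked PC may point a few instructions past the store that actually failed, and _BFAR is only meaningful if BFARVALID is set.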

The disassembly points to this code (instruction at 0x080105a2):

__i2b:
08010594:   push    {r4, lr}
08010596:   mov     r4, r1
08010598:   movs    r1, #1
0801059a:   bl      0x8010370 <_Balloc>
0801059e:   movs    r2, #1
080105a0:   str     r4, [r0, #20]
080105a2:   str     r2, [r0, #16]
080105a4:   pop     {r4, pc}
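
One thing the listing itself shows: the pointer returned by _Balloc (in r0) is written through at offsets 20 and 16 with no check between the call and the two stores. If _Balloc ever returned NULL, those stores would target addresses 0x14 and 0x10, which are normally not writable RAM, so a fault reported around this spot would be consistent with a failed allocation. That is an inference from the disassembly, not a confirmed diagnosis. A host-side model of the same two stores with the missing check added might look like this; `bigint_model`, `i2b_model` and the two allocators are hypothetical names for illustration, not newlib's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the object the listing writes into. Only the
 * two offsets used by the stores are taken from the disassembly. */
typedef struct {
    uint32_t header[4];  /* offsets 0..15: not touched by the listing */
    int32_t  count;      /* offset 16: receives the constant 1 (r2)   */
    uint32_t word0;      /* offset 20: receives the argument (r4)     */
} bigint_model;

static_assert(offsetof(bigint_model, count) == 16, "count must be at offset 16");
static_assert(offsetof(bigint_model, word0) == 20, "word0 must be at offset 20");

typedef bigint_model *(*alloc_fn)(void);

/* Stand-in allocators: one that succeeds, one that simulates exhaustion. */
bigint_model *alloc_ok(void)   { static bigint_model slot; return &slot; }
bigint_model *alloc_fail(void) { return NULL; }

/* Same two stores as the listing, guarded against a failed allocation. */
bigint_model *i2b_model(alloc_fn alloc, uint32_t value)
{
    bigint_model *b = alloc();
    if (b == NULL) {
        return NULL;  /* the listing has no equivalent of this check */
    }
    b->word0 = value;  /* str r4, [r0, #20] */
    b->count = 1;      /* str r2, [r0, #16] */
    return b;
}
```

If this reading is right, the place to look is why the allocation behind printf's float formatting would fail, rather than the stores themselves.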

I know I could stop printing floating-point numbers and the fault would probably go away, but it is very handy for debugging, and it should work fine.

Right now, I'm stuck on this. Can anyone give me a hand?

Let me know if there is any info I could post to better expose the problem.

STM32F103

CMSIS v1 (1.02)

FreeRTOS 10.0.1

shorai · ‎2019-12-19

I often run into problems with sprintf overflowing a buffer, and with the RTOS not allocating enough space for a task; both are my own fault.

Running in the debugger and catching the place where the error occurs is often the best approach, as the stacks (and memory) may be badly mangled before the fault transpires. That is, don't necessarily look at the HardFault registers; by then it may be too late. Rather, look at what is going on immediately before the fault.

Try switching tasks off and isolate the problem to a task or short piece of code.

Formatting and parsing floating point is non-trivial and may consume yards of stack with all its ifs, buts and maybes. You may want to try an alternative sprintf or even ftoa.

In embedded work you generally know the range of your FP numbers, so workarounds are easy, e.g. multiply or divide by 1000s and print as an integer with metric suffixes. I once wrote a function to auto-scale from 10^24 to 10^-24 (yotta to yocto); it was very simple and quick.
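
A minimal sketch of that kind of workaround, assuming two decimals are enough and the scaled value fits in a long: scale to hundredths, then print with integer conversions only, so printf's floating-point path is never entered. `format_centi` is a hypothetical helper name, not a library function.

```c
#include <assert.h>
#include <stdio.h>

/* Formats value with two decimals using only integer conversions.
 * Returns what snprintf returns (length of the formatted text).
 * Assumes value * 100 fits in a long. */
int format_centi(char *buf, size_t size, float value)
{
    assert(buf != NULL && size > 0);

    /* Scale to hundredths and round half away from zero. */
    float scaled = value * 100.0f;
    long centi = (long)(scaled + (scaled < 0.0f ? -0.5f : 0.5f));

    /* Handle the sign separately so values such as -0.50 keep it. */
    const char *sign = (centi < 0) ? "-" : "";
    unsigned long mag = (centi < 0) ? (unsigned long)(-centi)
                                    : (unsigned long)centi;

    return snprintf(buf, size, "%s%lu.%02lu", sign, mag / 100u, mag % 100u);
}
```

Unlike "%5.2f" this does not pad to a field width, so column alignment would need to be added separately if the receiving end depends on it.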

For embedded work, I prefer to limit usage of malloc and free as sooner or later I get into garbage collection of some form.

As a result I usually use stack space or pre-allocated buffers. It's surprising how little you can get away with without adding complexity to the code.

My code simplified substantially once I learned to work with tasks and static buffers.

BBasn.1 · ‎2024-01-22

I had the same problem, which kept me baffled for a very long time.

I just doubled the "Stack Size(Words)" from 128 to 256 under RTOS -> "Tasks and Queues", and my problem was solved.