HardFault Exception: Why oh why??!

relaxe · ‎2010-01-13

Posted on January 14, 2010 at 08:41

ping · ‎2011-05-17

Posted on May 17, 2011 at 12:55

Usually, a HardFault exception is caused by an access to a nonexistent memory area.

To debug it, put a breakpoint on the exception handler and examine the stack to see where execution came from when the exception was triggered. Knowing the fault location will help you find the root cause.

;)
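
For anyone who wants to examine the stack from code rather than in the debugger, here is a minimal sketch. On exception entry the Cortex-M3 pushes eight words (R0-R3, R12, LR, PC, xPSR), so a handler that is given the stack pointer can read them back. The names `fault_frame_t` and `fault_frame_read` are made up for illustration, and the small assembly shim that passes MSP or PSP to the C code is only described in a comment because it depends on the toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Registers stacked by the core on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} fault_frame_t;

/*
 * Copy the eight stacked words starting at `sp` into a struct.
 * `sp` is the stack pointer value at exception entry. The handler's
 * entry code (not shown) would test bit 2 of LR and pass MSP when it
 * is 0 or PSP when it is 1.
 */
fault_frame_t fault_frame_read(const uint32_t *sp)
{
    assert(sp != NULL);
    fault_frame_t f;
    f.r0  = sp[0];
    f.r1  = sp[1];
    f.r2  = sp[2];
    f.r3  = sp[3];
    f.r12 = sp[4];
    f.lr  = sp[5];  /* return address of the interrupted function's caller */
    f.pc  = sp[6];  /* address of the interrupted (often faulting) code */
    f.psr = sp[7];
    return f;
}
```

The `pc` field is the value to look up in the map file or disassembly. Note that the words are read through the pointer; printing the pointer itself gives stack addresses instead of register contents.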

16-32micros · ‎2011-05-17

Posted on May 17, 2011 at 12:55

Dear Joseph,

I'm facing an issue with a Usage Fault. I have implemented the method you posted in our forum and described in your book, which is really very helpful and better than the ARM Cortex-M3 manuals at describing these kinds of issues { comments already escalated to your colleagues @ ARM :) }

The issue happens after 5 or 6 hours of running code built at -O1 with the RVCT compiler and linker from Keil, which are the same as RVDS/ARM. A screenshot is attached; could you please give some pointers as to what is going wrong? With the compiler option changed to -O2, the problem disappears. It looks to me like a corrupted stack in an IRQ event. I've used SWV, but it does not help since the asynchronous trace is not efficient. As a next step, I will use real trace with SignumJtrace-Cortex-M3. Are you aware of such issues with RVCT/Keil?

Thank you a lot in advance.

Cheers,

STOne-32.

joseph239955 · ‎2011-05-17

Posted on May 17, 2011 at 12:55

Hi STOne-32,

Thanks for your kind words :)

[LR = 0xFFFFFFF1]

Fault occurred in an exception handler.
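
As a small aside on reading that LR value: inside a handler, LR holds an EXC_RETURN code, and the Cortex-M3 defines three of them. A minimal sketch of a decoder follows; the enum and function names are invented for this example.

```c
#include <stdint.h>

typedef enum {
    EXC_RETURN_HANDLER_MSP, /* 0xFFFFFFF1: return to Handler mode, main stack */
    EXC_RETURN_THREAD_MSP,  /* 0xFFFFFFF9: return to Thread mode, main stack */
    EXC_RETURN_THREAD_PSP,  /* 0xFFFFFFFD: return to Thread mode, process stack */
    EXC_RETURN_INVALID      /* anything else is not an EXC_RETURN code on the M3 */
} exc_return_kind_t;

/* Classify the LR value observed on entry to an exception handler. */
exc_return_kind_t exc_return_decode(uint32_t lr)
{
    switch (lr) {
    case 0xFFFFFFF1u: return EXC_RETURN_HANDLER_MSP;
    case 0xFFFFFFF9u: return EXC_RETURN_THREAD_MSP;
    case 0xFFFFFFFDu: return EXC_RETURN_THREAD_PSP;
    default:          return EXC_RETURN_INVALID;
    }
}
```

With LR = 0xFFFFFFF1 the decoder reports a return to Handler mode, which is why the fault must have happened while another exception handler was running.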

[PC = 0x08000E4C]

Is that part of the hard fault handler?

[Stacked_PC value = 0]

I guess this is the cause. Somehow an interrupt handler's stack frame got corrupted, so the return address is 0. When the exception return was carried out, it caused an INVSTATE fault because the T bit is 0 (ARM state).
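
The two bits involved can be tested as in the sketch below. It assumes the ARMv7-M layout, where INVSTATE is bit 1 of UFSR (bit 17 when read through the 32-bit CFSR at 0xE000ED28) and the T bit is bit 24 of xPSR. The macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFSR_INVSTATE (1u << 17) /* UFSR bit 1, seen through the 32-bit CFSR */
#define XPSR_T_BIT    (1u << 24) /* Thumb state bit in xPSR */

/* True when the usage fault was an attempt to execute in ARM state. */
bool cfsr_has_invstate(uint32_t cfsr)
{
    return (cfsr & CFSR_INVSTATE) != 0u;
}

/* True when a stacked xPSR value would resume execution in Thumb state. */
bool xpsr_is_thumb(uint32_t xpsr)
{
    return (xpsr & XPSR_T_BIT) != 0u;
}
```

A stack frame that has been wiped to zero fails `xpsr_is_thumb`, so returning through it leaves the core in an invalid state and raises INVSTATE.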

[Stacked_LR = 0x08005587] (code shown in image)

It is likely that this part of the code was executed shortly before the crash. When BL.W messagehandler is executed, 0x08005587 is stored in LR. But the actual fault can happen some time after that. The fault could be in a different interrupt handler (if the crash is caused by an exception handler), or it could be in messagehandler.

It might be worth checking the stack memory to see if any address values were pushed onto the stack during the interrupt handler. That might give you a hint where the problem is. But then, an address value could have been put there ages ago and be unrelated to the fault.

Have you tried the interrupt trace feature in RealView-MDK? It would be useful to know the interrupt sequence just before the crash:

http://www.keil.com/support/man/docs/ulink2/ulink2_trace_exception.htm

The stacked_PSR shows exception 35 (SPI1 ?). It may be worth checking which exceptions have a higher priority than this one; then we can narrow the cause of the issue down to a small number of exception handlers.
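
For reference, the exception number sits in the low nine bits (IPSR) of the stacked PSR, and external interrupts start at exception 16. A minimal sketch with hypothetical function names; which peripheral a given IRQ number belongs to depends on the device's vector table and is not encoded here.

```c
#include <stdint.h>

/* IPSR occupies bits [8:0] of xPSR. */
uint32_t xpsr_exception_number(uint32_t xpsr)
{
    return xpsr & 0x1FFu;
}

/*
 * Exceptions 16 and up are external interrupts (IRQ0 is exception 16).
 * Returns the IRQ number, or -1 for system exceptions such as HardFault (3).
 */
int32_t exception_to_irq(uint32_t exception_number)
{
    return (exception_number >= 16u) ? (int32_t)(exception_number - 16u) : -1;
}
```

An exception number of 35 corresponds to IRQ 19 under this rule, so it is worth confirming against the device header whether the 35 read from the screenshot is an exception number or an IRQ number.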

When the code is compiled with different optimizations, it is possible that some local variables are kept in the register bank rather than being put on the stack. As a result, the stack corruption problem could disappear when you use a higher optimization level (even if the memory location is still corrupted, it might not cause a problem because less stack is used).
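
One way to see how much stack each build actually uses is to paint the stack region with a known pattern and later count how many words were never overwritten. This is only a sketch: the fill pattern and function names are arbitrary, and on a real target `base` and `words` would come from the linker script, with the painting done before the region is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u /* arbitrary pattern chosen for this sketch */

/* Fill the whole stack region with the pattern. */
void stack_paint(uint32_t *base, size_t words)
{
    assert(base != NULL);
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/*
 * The stack grows downward, so untouched words sit at the low addresses.
 * Returns how many words, counted from `base`, still hold the pattern.
 */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    assert(base != NULL);
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

A result of 0 means the deepest word has been written, which points to an overflow or to something else scribbling on the stack. Comparing the number between the -O1 and -O2 builds would show whether the higher optimization level really uses less stack.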

Yes, getting a full instruction trace would be ideal.

Cheers

Joseph

[ This message was edited by: joseph.yiu on 10-12-2008 17:17 ]

stm324 · ‎2011-05-17

Posted on May 17, 2011 at 12:55

Dear All,

I know this is quite an old thread, but I have run into similar problems, so I will continue here.

I also have problems with the hard fault exception on an STM32F103VET6. I extended the hard fault exception handler as described by joseph.yiu and relaxe. Now I have some results from the exception handler:

[Hard fault handler]

R0 = 200003f8

R1 = 200003f8

R2 = 20000400

R3 = 20000400

R12 = 20000408

LR = 20000408

PC = 20000410

PSR = 20000410

BFAR = e000ed38

CFSR = 400

HFSR = 40000000

DFSR = b

AFSR = 0

But what does this mean? How can I locate the cause of the fault?

I'm new to the STM32 and would appreciate your assistance.

Best regards

Tom

domen2 · ‎2011-05-17

Posted on May 17, 2011 at 12:55

Your registers seem... wrong? It looks like you're printing addresses instead of contents.

I debug it like this:

- check PC for the code that caused the fault (and hope it's not "imprecise", in which case PC might not be exactly at the faulting instruction).

- check BFAR (if valid, meaning not containing its own address) for the address that was read or written and caused the fault.
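
Whether PC and BFAR can be trusted is recorded in the fault status registers, so the two checks above can be backed by a few bit tests. A minimal sketch, assuming the ARMv7-M bit positions in CFSR (0xE000ED28) and HFSR (0xE000ED2C); the macro and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFSR_PRECISERR   (1u << 9)  /* BFSR bit 1: stacked PC points at the access */
#define CFSR_IMPRECISERR (1u << 10) /* BFSR bit 2: fault reported late, PC is approximate */
#define CFSR_BFARVALID   (1u << 15) /* BFSR bit 7: BFAR holds the faulting address */
#define HFSR_FORCED      (1u << 30) /* HardFault escalated from another fault */

bool bus_fault_is_precise(uint32_t cfsr)   { return (cfsr & CFSR_PRECISERR) != 0u; }
bool bus_fault_is_imprecise(uint32_t cfsr) { return (cfsr & CFSR_IMPRECISERR) != 0u; }
bool bfar_is_valid(uint32_t cfsr)          { return (cfsr & CFSR_BFARVALID) != 0u; }
bool hard_fault_is_forced(uint32_t hfsr)   { return (hfsr & HFSR_FORCED) != 0u; }
```

Taking the posted values at face value, CFSR = 400 is an imprecise bus fault with BFARVALID clear, so BFAR holds nothing useful and the stacked PC may be a few instructions past the culprit. HFSR = 40000000 has FORCED set, meaning the hard fault was escalated from that bus fault. A BFSR of 0x82, as posted further down the thread, corresponds to CFSR 0x8200: a precise error with a valid BFAR.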

matic2 · ‎2012-09-14

Posted on September 14, 2012 at 16:17

Hi,

I had exactly the same problem. The hard fault occurred when I declared and initialized a local variable in the same statement inside a function. If I used a global variable or declared the local variable as static, everything was OK.

The hard fault was caused at the local variable initialization by an unknown instruction.

I found the solution in this thread:

https://my.st.com/public/STe2ecommunities/mcu/Lists/STM32Discovery/DispForm.aspx?ID=2593

After I added the flags -mcpu=cortex-m4 -mthumb to the LDFLAGS variable (linker flags), there were no more hard faults. I am using an stm32f407xx microcontroller and the arm-none-eabi toolchain.

Regards,

Matic

engenharia9 · ‎2013-08-22

Posted on August 22, 2013 at 16:13

Hi Joseph

I have a big problem with hard faults on the STM32F103RTB7.

I built a lot of boards (PCBs): five work well and seventeen do not. The failing ones always end in a hard fault, but each board reaches the hard fault at a different moment.

I read this topic and included the hard fault code; see the output below.

R0 = 0x7261485B

R1 = 0x61662064

R2 = 0x20746C75

R3 = 0x646E6168

R12 = 0x5D72656C

LR = 0x62363400

PC = 0x386638

PSR = 0x305247C0

BFAR = 0x42993B01

MMSR = 0x0

HFSR = 0x40000000

DFSR = 0xB

AFSR = 0x0

BFSR = 0x82

UFSR = 0x0

Tesla DeLorean · ‎2013-08-22

Posted on August 22, 2013 at 16:46

The date on the thread is deceptive: the May date is when the forum melted down and was reconstructed. The thread is from 2010, or perhaps 2008. We don't see Joseph here that often, but his books are certainly recommended reading.

Looks like the register data is full of ASCII? Review the stack, or have the debugger fired up to look at the actual system in failure.
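
To check the ASCII suspicion, the register words can be unpacked byte by byte in little-endian order. A minimal sketch; the function name is invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write the bytes of `words` into `out` in little-endian order, replacing
 * non-printable bytes with '.'. `out` must hold 4 * count + 1 bytes.
 */
void words_to_ascii(const uint32_t *words, size_t count, char *out)
{
    assert(words != NULL && out != NULL);
    size_t k = 0;
    for (size_t i = 0; i < count; i++) {
        for (unsigned shift = 0; shift < 32; shift += 8) {
            unsigned char c = (unsigned char)(words[i] >> shift);
            out[k++] = (c >= 0x20 && c <= 0x7E) ? (char)c : '.';
        }
    }
    out[k] = '\0';
}
```

Fed the posted R0, R1, R2, R3 and R12 values (0x7261485B, 0x61662064, 0x20746C75, 0x646E6168, 0x5D72656C), this yields "[Hard fault handler]". That looks like the handler's own banner text, which suggests the dump routine is reading a string buffer, or a stack that a string has overwritten, rather than the real exception frame.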

The PC looks bogus, as does LR. It's going to be hard to pinpoint the faulting instruction here, which is critical. As a first step you're going to want to validate that your fault routine works properly, as I'm not convinced it does, i.e. generate a fault at a known address.

Your Hard Faults likely occur for the same reasons everyone else's do, namely:

- Stack corruption, or inadequate stack size, causing immediate or latent failure.

- Reads/writes to inappropriate memory addresses.

- Executing 32-bit ARM (non-Thumb) instructions.
