The program enters HardFault_Handler, but no valid stack information can be found
‎2025-04-28 11:48 PM
Hello, my program ends up in HardFault_Handler. I usually halt in the debugger and check whether a variable in the function has overflowed, but this time the place it jumps from is irregular and I do not know how to diagnose it. Please help; I will provide any information you need.
The call stack finally ends in a DMA function, but I have no idea why that would trigger HardFault_Handler, let alone why UART_WaitOnFlagUntilTimeout is being called.
Sometimes the program jumps directly from UART_WaitOnFlagUntilTimeout to HardFault_Handler at the line marked below. From debugging I know that the huart's Instance is NULL at that point, which I do not understand.
if (Timeout != HAL_MAX_DELAY)
{
    if (((HAL_GetTick() - Tickstart) > Timeout) || (Timeout == 0U)) // this line jumps to HardFault_Handler
    {
        return HAL_TIMEOUT;
    }
I am using an STM32G431RBT6 with three UARTs and their DMA channels.
Thanks.
Labels: DMA, STM32G4 Series
‎2025-04-29 12:01 AM
Do you know this document? https://www.keil.com/appnotes/files/apnt209.pdf
The SCB fault status registers described there will tell you more about the cause and location of the fault.
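To make the register check concrete, here is a minimal sketch that turns a raw CFSR value (SCB->CFSR, address 0xE000ED28 on Cortex-M3/M4) into flag names. The bit names follow Arm's documentation; the helper name `cfsr_describe` and the table type are made up for this illustration, and it assumes you have some way to print (or you can simply read the value in the debugger and decode it on a PC).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One entry per documented CFSR flag (hypothetical helper type). */
typedef struct {
    uint32_t    mask;
    const char *name;
} fault_flag_t;

static const fault_flag_t cfsr_flags[] = {
    /* MemManage fault status, bits 0..7 */
    { 1UL << 0,  "IACCVIOL"    }, { 1UL << 1,  "DACCVIOL"  },
    { 1UL << 3,  "MUNSTKERR"   }, { 1UL << 4,  "MSTKERR"   },
    { 1UL << 5,  "MLSPERR"     }, { 1UL << 7,  "MMARVALID" },
    /* BusFault status, bits 8..15 */
    { 1UL << 8,  "IBUSERR"     }, { 1UL << 9,  "PRECISERR" },
    { 1UL << 10, "IMPRECISERR" }, { 1UL << 11, "UNSTKERR"  },
    { 1UL << 12, "STKERR"      }, { 1UL << 13, "LSPERR"    },
    { 1UL << 15, "BFARVALID"   },
    /* UsageFault status, bits 16..31 */
    { 1UL << 16, "UNDEFINSTR"  }, { 1UL << 17, "INVSTATE"  },
    { 1UL << 18, "INVPC"       }, { 1UL << 19, "NOCP"      },
    { 1UL << 24, "UNALIGNED"   }, { 1UL << 25, "DIVBYZERO" },
};

/* Print the name of every flag set in 'cfsr'; return how many were set. */
int cfsr_describe(uint32_t cfsr, FILE *out)
{
    int count = 0;
    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if (cfsr & cfsr_flags[i].mask) {
            fprintf(out, "%s\n", cfsr_flags[i].name);
            count++;
        }
    }
    return count;
}
```

On the target you would pass `SCB->CFSR`; the flags are sticky, so read them before anything clears them.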
‎2025-04-29 1:15 AM - edited ‎2025-04-29 1:18 AM
Some compilers optimize no-return functions so that the call stack cannot be seen in the debugger.
Try to change HardFault_Handler:
void HardFault_Handler(void)
{
    // make compiler believe this can return
    static volatile int junk = 0;
    while (!junk) { __NOP(); }
}
or:
void HardFault_Handler(void)
{
    __BKPT(0);
}
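If the debugger still shows no usable call stack with either variant, a common complement is to copy the register frame that the core pushes on exception entry into a global you can inspect. The following is a sketch only: `stacked_frame_t`, `g_fault_frame` and `fault_capture` are names invented here, the assembly shim assumes GCC on a Cortex-M3/M4, and it assumes the stack pointer itself is still valid at the time of the fault.

```c
#include <stdint.h>

/* The eight words the core stacks on exception entry (basic frame). */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} stacked_frame_t;

/* Last captured frame; volatile so the debugger can always read it. */
volatile stacked_frame_t g_fault_frame;

/* Copy the stacked frame; 'sp' is MSP or PSP as it was at fault entry. */
void fault_capture(const uint32_t *sp)
{
    g_fault_frame.r0   = sp[0];
    g_fault_frame.r1   = sp[1];
    g_fault_frame.r2   = sp[2];
    g_fault_frame.r3   = sp[3];
    g_fault_frame.r12  = sp[4];
    g_fault_frame.lr   = sp[5];  /* return address of the faulting function */
    g_fault_frame.pc   = sp[6];  /* instruction that was executing */
    g_fault_frame.xpsr = sp[7];
}

#if defined(__GNUC__) && (defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__))
/* Bit 2 of EXC_RETURN (in LR) says which stack holds the frame. */
__attribute__((naked)) void HardFault_Handler(void)
{
    __asm volatile(
        "tst   lr, #4         \n"
        "ite   eq             \n"
        "mrseq r0, msp        \n"
        "mrsne r0, psp        \n"
        "bl    fault_capture  \n"
        "bkpt  #0             \n"  /* needs an attached debugger */
        "b     .              \n");
}
#endif
```

After the breakpoint hits, `g_fault_frame.pc` and `g_fault_frame.lr` can be fed to addr2line or looked up in the map file.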
‎2025-04-29 1:59 AM
First of all, thank you for your reply. I tested the two functions you gave, and the behavior did not change. I captured a more detailed screenshot of the register state, including a view showing Instance as NULL. Incidentally, I use the GCC compiler with the HAL G4 v1.6.1 package. I can trigger this error reliably.
‎2025-04-29 2:20 AM
I experimented further in Keil: if I run there without disabling another DMA transfer that samples in a loop (yes, I forgot to mention that I have one), the fault occurs sooner.
After commenting out the ADC sampling startup code, I can still trigger HardFault_Handler with the same steps as before in the CLion + GCC build, but the stack still gives no clue. By the way, I have read the app note. If I have missed anything, please tell me.
https://www.keil.com/appnotes/files/apnt209.pdf
‎2025-04-29 4:00 AM
This is what the app note says:
If the occurrences are seemingly random, you might have a stack size problem.
BTW, reasons 'a' and 'c' in this screenshot mean your code tried to call ARM code (instead of Thumb-2). But that would happen rather synchronously, at the same place every time, so it seems unlikely.
> I tried more on Keil, and if I run on Keil without closing a DMA loop sampling ( yes, I forgot to say I have another DMA loop sampling ), it will get into Fault faster.
Built with another toolchain (Keil)?
Anyway, you can try to increase the stack size, somewhere in the project settings.
printf-style formatting library functions, semihosting, and FPU usage in particular can need a lot of stack.
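Before simply enlarging the stack, you can measure how much of it is actually used with the classic "stack painting" trick: fill the stack region with a known pattern at startup and later count how much of the pattern survives. This is a sketch under assumptions: the function names and the pattern value are arbitrary, the stack is assumed to grow downward (as it does on Cortex-M), and where the region starts and ends depends on your linker script (in CubeMX-generated GCC projects the reservation is usually set by `_Min_Stack_Size`, but check your own project).

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xDEADBEEFu

/* Fill 'words' words starting at the lowest stack address with the pattern.
 * Call this very early, and only over the part of the stack not yet in use. */
void stack_paint(uint32_t *lowest, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        lowest[i] = STACK_FILL_PATTERN;
    }
}

/* Count the words, from the lowest address upward, that still hold the
 * pattern. Because the stack grows downward, these were never touched. */
size_t stack_unused_words(const uint32_t *lowest, size_t words)
{
    size_t n = 0;
    while (n < words && lowest[n] == STACK_FILL_PATTERN) {
        n++;
    }
    return n;
}
```

If `stack_unused_words()` returns 0 or a very small number after the application has run for a while, the stack has very likely overflowed into whatever sits below it.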
‎2025-04-29 9:04 AM - edited ‎2025-04-29 9:11 AM
Great, so you confirmed the call stack: something is going on in HAL_DMA_Abort (line 248), called while UART_WaitOnFlagUntilTimeout is on the stack. This is something to ponder.
DMA is dangerous: a misconfigured transfer can overwrite memory it was never meant to touch.
‎2025-04-29 7:28 PM
I will try stopping all DMA and check again.
‎2025-04-29 7:33 PM
Yes, when I use Keil, the ADC's DMA makes my program hard-fault sooner.
By the way, the error trace I get from the CmBacktrace library is as follows:
Firmware name: Hello, hardware version: 1, software version: 2
Fault on interrupt or bare metal(no OS) environment
=================== Registers information ====================
R0 : 00000000 R1 : 00000040 R2 : 0000000d R3 : 00000020
R12: ffffffff LR : 080034bb PC : 08002286 PSR: 200b0000
==============================================================
Usage fault is caused by attempts to switch to an invalid state (e.g., ARM)
Show more call stack info by run: addr2line -e Hello.elf -afpiC 08002286 080034ba 0800596e
The result of addr2line parsing is consistent with the following figure
‎2025-04-29 10:29 PM
> The result of addr2line parsing is consistent with the following figure ...
The relevant information in this picture is unreadably small. Anyway ...
> Usage fault is caused by attempts to switch to an invalid state (e.g., ARM)
> Show more call stack info by run: addr2line -e Hello.elf -afpiC 08002286 080034ba 0800596e
I suppose this refers to the same lines as in the post timestamped "2025-04-29 2:20 AM" (it's a shame ST doesn't number posts in a thread).
As you might know, ARM cores only fetch instructions from even addresses, so the LSB of a branch target or vector is used to denote the instruction set of the routine (ARM or Thumb). And Cortex-M only supports Thumb.
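The Thumb-bit convention described above can be checked mechanically. A small sketch (the helper names are invented for illustration): a branch target such as LR or a function pointer should have bit 0 set, and the T bit (bit 24) of the stacked xPSR should be 1 on a Cortex-M.

```c
#include <stdbool.h>
#include <stdint.h>

#define XPSR_T_BIT ((uint32_t)1u << 24)

/* True if the T (Thumb state) bit of an xPSR value is set. */
bool xpsr_is_thumb(uint32_t xpsr)
{
    return (xpsr & XPSR_T_BIT) != 0u;
}

/* True if a branch target (LR, function pointer, vector entry) selects
 * Thumb state. Note: a stacked PC has bit 0 clear, so do not apply this
 * check to the PC value itself. */
bool branch_target_is_thumb(uint32_t addr)
{
    return (addr & 1u) != 0u;
}
```

Applied to the register dump earlier in the thread: LR 080034bb has bit 0 set, as expected, but PSR 200b0000 has bit 24 clear, which matches the reported "invalid state" usage fault.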
Almost certainly your stack is being corrupted, and a return address overwritten with a "random" value.
As said, try increasing the stack size.
You could profile your code to see what exactly happens. There are some good (but expensive) tools to do that.
Or go the cheaper route and use GPIO pins and a scope or logic analyzer, instrumenting the relevant routines and calls.
The sporadic nature of the fault suggests it is related to interrupts.
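For the GPIO route, the instrumentation can be as small as one register write on entry and one on exit. A sketch, assuming an STM32-style BSRR register (bits 0..15 set a pin, bits 16..31 reset it); the function names are invented here, and which port and pin you use is up to you.

```c
#include <stdint.h>

/* Drive the trace pin high: writing bit 'pin' of BSRR sets the output. */
static inline void trace_high(volatile uint32_t *bsrr, unsigned pin)
{
    *bsrr = (uint32_t)1u << pin;
}

/* Drive the trace pin low: writing bit 'pin + 16' of BSRR resets it. */
static inline void trace_low(volatile uint32_t *bsrr, unsigned pin)
{
    *bsrr = (uint32_t)1u << (pin + 16u);
}
```

On the target you might call `trace_high(&GPIOB->BSRR, 0);` at the top of an interrupt handler and `trace_low(&GPIOB->BSRR, 0);` at the end (port B, pin 0 is only an example, and the pin must already be configured as an output). One pin per routine then shows on the analyzer which handlers were active, and how they nested, just before the fault.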
