2023-10-27 04:27 AM
Hello,
I have developed an application that saves accelerometer data when a machine starts up. This data is stored in the last page of the controller's memory (page 31).
I'm encountering a HardFault exception that occurs intermittently, roughly 1 in 20 to 50 times when running the code with the debugger attached. However, it consistently happens when flashing and running the application without the debugger.
I'm seeking guidance on how to effectively debug this issue. Here's a snippet of the relevant code:
The page parameter is 31, with a size of 1 when this function is called.
Any assistance in troubleshooting this problem would be greatly appreciated.
static _Bool flash_erase_pages(uint8_t page, uint8_t size)
{
    static FLASH_EraseInitTypeDef EraseInitStruct;
    uint8_t bResult = 0;
    uint8_t retry = 0;
    static volatile uint32_t flasherror;
    uint32_t PAGEError;

    do {
        /* Unlock the Flash to enable the flash control register access */
        HAL_FLASH_Unlock();

        /* Clear OPTVERR bit set on virgin samples */
        __HAL_FLASH_CLEAR_FLAG(FLASH_FLAG_OPTVERR);

        /* Fill EraseInit structure */
        EraseInitStruct.TypeErase = FLASH_TYPEERASE_PAGES;
        EraseInitStruct.Banks = FLASH_BANK_1;
        EraseInitStruct.Page = (uint32_t)page;
        //EraseInitStruct.NbPages = ((EndPage - StartPage)) + 1;
        EraseInitStruct.NbPages = size;

        if (HAL_FLASHEx_Erase(&EraseInitStruct, &PAGEError) != HAL_OK)
        {
            /* Error occurred while page erase. */
            flasherror = HAL_FLASH_GetError();
        }
        else
        {
            bResult = 1;
        }

        HAL_FLASH_Lock();

        if (bResult) return true;
        HAL_Delay(1);
        if (++retry > 5) return false;
    } while (1);
}
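As a cheap sanity check before calling the erase routine, the requested page range can be validated against the device layout. The sketch below is not part of the original code. The names `sketch_page_range_valid` and `sketch_page_address` are hypothetical, and the constants assume an STM32G030K8 with 64 KB of flash in 2 KB pages, so they should be checked against the reference manual or the HAL's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for an STM32G030K8 (64 KB flash, 2 KB pages).
 * These are illustrative constants, not taken from the HAL headers. */
#define SKETCH_FLASH_BASE  0x08000000u
#define SKETCH_PAGE_SIZE   2048u
#define SKETCH_PAGE_COUNT  32u

/* Returns true when pages [page, page + count) all exist on the device. */
static bool sketch_page_range_valid(uint32_t page, uint32_t count)
{
    if (count == 0u || page >= SKETCH_PAGE_COUNT) {
        return false;
    }
    return count <= (SKETCH_PAGE_COUNT - page);
}

/* Start address of a page; only meaningful for a valid page index. */
static uint32_t sketch_page_address(uint32_t page)
{
    return SKETCH_FLASH_BASE + page * SKETCH_PAGE_SIZE;
}
```

With these assumptions, page 31 with a size of 1 is a valid request and starts at 0x0800F800, so the arguments themselves look fine and the fault has to come from somewhere else.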
On the few occasions I managed to reproduce this with the debugger attached, I collected the following data:
The HardFault happens in HAL_FLASH_Lock(), at SET_BIT(FLASH->CR, FLASH_CR_LOCK);
Just before entering the HardFault, I see these values in the flash_erase_pages function:
PAGEError = 2103
PAGEError = 536876972 (another time)
Which seems to make no sense as there are only 32 pages.
flasherror = 0
Upon a successful flash write, the value of PAGEError is consistently 0xFFFFFFFF. I'm perplexed by what might be causing this unexpected behavior.
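It can help to look at those PAGEError values in hex. 536876972 is 0x200017AC, which falls inside the SRAM range if the part has 8 KB of SRAM starting at 0x20000000 (an assumption for the STM32G030K8). PAGEError is also an uninitialized local, so if the HAL never wrote it, it would simply show whatever was on the stack. The helper below is a hypothetical sketch for sorting such values into categories; it is not HAL code and does not prove what went wrong.

```c
#include <stdint.h>

/* Assumed memory layout (STM32G030K8): 32 flash pages, 8 KB SRAM. */
#define SKETCH_PAGE_COUNT  32u
#define SKETCH_SRAM_BASE   0x20000000u
#define SKETCH_SRAM_END    0x20002000u  /* one past the last SRAM byte */

typedef enum {
    PAGE_ERR_NONE,        /* 0xFFFFFFFF: no faulty page reported */
    PAGE_ERR_PAGE_INDEX,  /* a page number that exists on the device */
    PAGE_ERR_SRAM_ADDR,   /* value lies inside the assumed SRAM range */
    PAGE_ERR_UNEXPECTED   /* none of the above */
} page_err_kind_t;

/* Classify a PAGEError value read back in the debugger. */
static page_err_kind_t classify_page_error(uint32_t value)
{
    if (value == 0xFFFFFFFFu) {
        return PAGE_ERR_NONE;
    }
    if (value < SKETCH_PAGE_COUNT) {
        return PAGE_ERR_PAGE_INDEX;
    }
    if (value >= SKETCH_SRAM_BASE && value < SKETCH_SRAM_END) {
        return PAGE_ERR_SRAM_ADDR;
    }
    return PAGE_ERR_UNEXPECTED;
}
```

Under these assumptions, 536876972 classifies as an SRAM address and 2103 as unexpected. Neither is a plausible page number, which fits the suspicion that the variable held stale or overwritten stack data rather than a result from the HAL.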
Here's the structure of EraseInitStruct for your reference:
Thank you
2023-10-30 06:05 AM
The STM32G030K8 has only one bank.
2023-10-30 06:18 AM
I've made some modifications to the code to initiate a page wipe 1 second after startup and then every 4 seconds subsequently. Additionally, I've set up an LED on my board to light up immediately before and right after the page erasure. The LED's behavior serves as an indicator for me: if it lights up briefly, the page erasure was successful. However, if it remains illuminated, it suggests a hard fault has occurred.
The page erases correctly in the following scenarios:
1) When I run the program with a debugger.
2) When I've just bootloaded the code, (the application starts automatically after bootload), the page wipe performs as expected.
However, a hard fault occurs when:
1) I bootload the code and then power cycle the controller (turning it off and back on).
I'm perplexed as to why the system behaves differently after a power cycle, especially since it operates smoothly after bootloading or while debugging without any power interruption.
Would you be open to assisting me in investigating this matter? I'd be happy to treat you to a few cups of coffee or more :) in appreciation for your help!
2023-10-30 09:37 AM
Hmm, I'm not sure I can help much more than what I'm writing here. I don't have a G030 board and I'll be gone for the next few weeks.
If you really need the problem solved, here's what I would do if it were my project:
2023-10-30 11:51 AM
Great advice!
1) "You can attach a debugger to a chip in such a state without resetting it".
I did not know this, but it can be a great help!
2) After removing the retry logic and introducing a brief delay after initializing the SPI CS pin, I noticed the print started to work again. However, I'm puzzled as to why an SPI read attempt leads to a Hard Fault when erasing the last page, especially given that this fault appears several instructions later. I double-checked the code that talks to the SPI chip, and it seems OK.
// In main.c, I used this retry code to initialize the SPI interface.
// The initialization did not work on the first attempt because I attempted
// communication too quickly after initializing the CS pin. After introducing
// a brief delay, the retry became unnecessary, and the chip no longer enters
// a Hard Fault state.
uint8_t retry = 0;
for (retry = 0; retry < 3; retry++) {
    paccelero_handle = lis2dw12_init(&hspi2, SPI_CS);
    if (!paccelero_handle) {
        pApp_h->status.flags.bAcceleroError = 1;
        SetLedPattern(led_Error);
    } else {
        SetLedPattern(led_Alive);
        break;
    }
}

lis2dw12_t *lis2dw12_init(SPI_HandleTypeDef *phspi, Pins_t cs_pin) {
    lis2dw12.pSPIinterface = phspi;
    lis2dw12.cs_pin = cs_pin;
    WritePin(lis2dw12.cs_pin, 1); // SPI mode disable
    HAL_Delay(2); // -> adding this delay solved the Hard fault at page wipe
    if (!lis2dw12_read_device_info()) {
        return 0;
    }
    return &lis2dw12;
}
3) I've incorporated USART output and added several printf statements in the code. These were the results I observed when the Hard Fault occurred. However, my limited knowledge of STM assembler and its inner mechanisms prevents me from comprehending the entire situation. I'll need more time to study and analyze this thoroughly.
R0 = 0xFFFFFFFF
R1 = 0xFFFFFFFF
R2 = 0xFFFFFFFF
R3 = 0xFFFFFFFF
R12 = 0x20000410
LR [R14] = 0x200003EC subroutine call return address
PC [R15] = 0x00000001 program counter
PSR = 0xFFFFFFF9
ICSR = 0x0440F003
AIRCR = 0xFA050000
SCR = 0x00000000
CCR = 0x00000208
SHCSR = 0x00000000
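For anyone reading this dump, a few of the values can be decoded mechanically. The sketch below uses the ICSR bit layout from the CMSIS Cortex-M0+ header (VECTACTIVE in bits 8:0, VECTPENDING in bits 20:12, ISRPENDING in bit 22, PENDSTSET in bit 26). The struct and function names are hypothetical helpers, not CMSIS or HAL APIs.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t vect_active;     /* exception number currently executing */
    uint32_t vect_pending;    /* highest-priority pending exception */
    bool     isr_pending;     /* ISRPENDING flag (bit 22) */
    bool     systick_pending; /* PENDSTSET flag (bit 26) */
} icsr_fields_t;

/* Split a raw SCB->ICSR value into the fields of interest. */
static icsr_fields_t decode_icsr(uint32_t icsr)
{
    icsr_fields_t f;
    f.vect_active     = icsr & 0x1FFu;
    f.vect_pending    = (icsr >> 12) & 0x1FFu;
    f.isr_pending     = ((icsr >> 22) & 1u) != 0u;
    f.systick_pending = ((icsr >> 26) & 1u) != 0u;
    return f;
}

/* True for the three EXC_RETURN values a Cortex-M0+ can load into LR
 * on exception entry. */
static bool looks_like_exc_return(uint32_t value)
{
    return value == 0xFFFFFFF1u   /* return to handler mode, MSP */
        || value == 0xFFFFFFF9u   /* return to thread mode, MSP  */
        || value == 0xFFFFFFFDu;  /* return to thread mode, PSP  */
}
```

Applied to the dump, ICSR = 0x0440F003 decodes to active exception 3 (HardFault) with exception 15 (SysTick) pending. The value printed as PSR, 0xFFFFFFF9, matches an EXC_RETURN code rather than a plausible xPSR. That may mean the dump read the stack frame at the wrong offset, or that the stack contents were already corrupted when the fault was taken, so the PC and LR values should be treated with caution.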
2023-10-30 01:17 PM
> LR [R14] = 0x200003EC subroutine call return address
Does your code run in the RAM?
2023-10-30 01:58 PM - last edited on 2023-11-13 04:34 AM by Lina_DABASINSKA
Please check whether your program occupies the second bank of flash. If so, you cannot program the flash.
You are wrong and it has nothing to do with read-while-write. Learn how these things work before you "consult" other people!
2023-10-31 12:46 AM
no
2023-10-31 11:50 AM
Then how come that return address in LR points to RAM? Stack overwrite, again?
2023-10-31 01:55 PM
If I had a clear answer, I would provide it. To truly understand what happens when a Hard Fault occurs, I need to delve deeper into the study of STM registers, assembly, and their inner workings. For now I have the issue fixed, thanks to checking all my code and making sure no repeats are happening, just like TDK proposed, but I do not know what triggered the fault.