Imprecise error flag after writing to the flash

aga
Associate II

Hi experts,


I'm adding a function to my existing code (which runs on an STM32F105) to store a word in the flash.
I basically copied the code from the official HAL-based example. Here is my function:

uint32_t FLASH_Write_Data(uint32_t StartPageAddress, uint32_t *Data, uint16_t NWords)
{
	static FLASH_EraseInitTypeDef EraseInitStruct;
	uint32_t PageError;

	/* Unlock the Flash memory to enable the flash control register access */
	HAL_FLASH_Unlock();
	/* Erase the FLASH area*/
	EraseInitStruct.TypeErase   = FLASH_TYPEERASE_PAGES;
	EraseInitStruct.PageAddress = FLASH_USER_START_ADDR;
	EraseInitStruct.NbPages     = (FLASH_USER_END_ADDR - FLASH_USER_START_ADDR) / FLASH_PAGE_SIZE;

	if (HAL_FLASHEx_Erase(&EraseInitStruct, &PageError) != HAL_OK)
	{
		/*Error occurred while page erase.*/
		return HAL_FLASH_GetError();
	}
	/* Program the user FLASH area word by word*/
	uint32_t i = 0;
	while (i < NWords)
	{
		if (HAL_FLASH_Program(FLASH_TYPEPROGRAM_WORD, StartPageAddress, Data[i]) == HAL_OK)
		{
			StartPageAddress += MEMORY_OFFSET;
			i++;
		}
		else
		{
			/* Error occurred while writing data in Flash memory*/
			return HAL_FLASH_GetError();
		}
	}
	HAL_FLASH_Lock();
	return HAL_OK;
}

The very first strange thing is that as soon as I start debugging, the program counter often jumps to the HardFault_Handler function.

From there, I can reset the chip and restart the debug session without any apparent problem.

However, when I execute the program, it consistently ends up in the HardFault_Handler function.

If I comment out the flash write function, the program works as expected.

I started commenting out part of the flash function code, and I noticed that when the HAL_FLASHEx_Erase and the HAL_FLASH_Program functions are not executed, the program continues to work as expected.

I stepped through the code and didn't notice anything obviously wrong: I can erase the page and even write the passed value to the passed memory address without any immediate error!

The only thing that happens before the program counter jumps into the HardFault_Handler function is that, after several instructions, the IMPRECISERR flag of the CFSR register is set.

I tried moving the flash function into the main code, either just after SystemClock_Config or just before the while(1), but surprisingly, the instruction where the IMPRECISERR flag is set does not change.

This happens at the closing brace (yes, on the final "}") of this function:

bool Shaft_Measure(void)
{
	static int32_t aShaft_old;
	int32_t aShaft;
	msg.a_shaft = (int16_t)aShaft;
	if (ABS(aShaft - aShaft_old) > 1){
		bRead_speed = true;
	}
	else{
		bRead_speed = false;
	}
	aShaft_old = aShaft;
	return bRead_speed;
}

I'm sure that the problem is not there, but unfortunately I don't know how to proceed to identify the issue.

The MCU I'm using is the STM32F105RC, which has 256 kB of flash, and the value I'm trying to save is just a uint32_t number.
The constants used by the flash functions are defined in my flash.h, reported here:

#define FLASH_ADDR_PAGE_127     ((uint32_t)0x0803F800)
#define FLASH_USER_START_ADDR   FLASH_ADDR_PAGE_127
#define FLASH_USER_END_ADDR     FLASH_ADDR_PAGE_127 + FLASH_PAGE_SIZE
#define MEMORY_OFFSET           ((uint32_t)0x4U)

These addresses have been copied from the device datasheet.
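
For reference, the page address and page count above can be cross-checked on a PC with a small sketch like the one below. It assumes the usual STM32F105 layout (flash starting at 0x08000000, 2 KB pages); every SKETCH_/sketch_ name is hypothetical and not part of the HAL. It also shows a fully parenthesised end-address macro, which is the safer way to write FLASH_USER_END_ADDR even though the expression used in the erase call happens to evaluate correctly as written.

```c
#include <stdint.h>

/* Assumed values for an STM32F105RC: flash base 0x08000000, 2 KB pages.
 * Verify both against the reference manual for the exact part. */
#define SKETCH_FLASH_BASE       0x08000000UL
#define SKETCH_FLASH_PAGE_SIZE  0x800UL

/* Hypothetical helper: start address of flash page number `page`. */
static uint32_t sketch_page_address(uint32_t page)
{
    return (uint32_t)(SKETCH_FLASH_BASE + page * SKETCH_FLASH_PAGE_SIZE);
}

/* Parenthesised macros expand safely inside larger expressions. */
#define SKETCH_USER_START_ADDR  (sketch_page_address(127U))
#define SKETCH_USER_END_ADDR    (SKETCH_USER_START_ADDR + SKETCH_FLASH_PAGE_SIZE)

/* Number of whole pages in the half-open range [start, end). */
static uint32_t sketch_page_count(uint32_t start, uint32_t end)
{
    return (uint32_t)((end - start) / SKETCH_FLASH_PAGE_SIZE);
}
```

With these assumptions, page 127 starts at 0x0803F800 and the user area spans exactly one page, which matches the values in flash.h.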

Here is my system:

Segger JLink base (with the latest ver: 7.96l)

STM32CubeIDE (ver: 1.15.1)

STM32CubeMX (ver: 6.11.1)

HAL libraries for STM32F1 (ver: 1.8.5)

 

Any help would be greatly appreciated!

12 REPLIES 12

In this context, "imprecise" means the fault came from a deferred write, i.e. one that went through the write buffer.

The reported code address is therefore not exact: the write was issued a few cycles earlier in the pipeline, and the core is already executing later instructions.

Look at what's actually reported, look at the address of the failed write, which will be correct, and look a little earlier in the code instructions.

Addresses must be 4-byte / 32-bit aligned, and can't be written more than once per erase cycle.
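
The two conditions above can be expressed as a pre-check before each program operation. This is only a sketch: the function names are hypothetical, and it assumes an erased word reads back as 0xFFFFFFFF, which is the usual behaviour for this kind of flash.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: an erased flash word reads back as all ones. */
#define SKETCH_ERASED_WORD 0xFFFFFFFFUL

/* True when `address` is a multiple of 4 (32-bit aligned). */
static bool sketch_is_word_aligned(uint32_t address)
{
    return (address & 0x3U) == 0U;
}

/* Hypothetical pre-check: `current` is the value read back from the
 * target location just before programming it. */
static bool sketch_can_program_word(uint32_t address, uint32_t current)
{
    return sketch_is_word_aligned(address) && (current == SKETCH_ERASED_WORD);
}
```

On the target, `current` would be obtained by reading the destination address directly, and a false result means the page needs erasing first or the address is wrong.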

Have a Hard Fault handler that outputs actionable data, both during development and in the field, so support techs can actually identify and fix issues.

https://github.com/cturvey/RandomNinjaChef/blob/main/KeilHardFault.c
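
As a rough illustration of what "actionable data" can look like (this is not the code behind the link, just a minimal sketch with hypothetical names), the formatter below takes the eight words a Cortex-M core pushes on exception entry, in the order r0, r1, r2, r3, r12, lr, pc, xPSR, plus the CFSR and HFSR values, and produces one printable line. On the target, the frame pointer would come from MSP or PSP depending on bit 2 of the EXC_RETURN value in LR, and the string would go to a UART or a log buffer.

```c
#include <stdint.h>
#include <stdio.h>

/* Order in which a Cortex-M core stacks registers on exception entry. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_PSR,
    FRAME_WORDS
};

/* Hypothetical formatter: writes the most useful fault registers into
 * `out` and returns what snprintf returns. */
static int sketch_format_fault(const uint32_t frame[FRAME_WORDS],
                               uint32_t cfsr, uint32_t hfsr,
                               char *out, size_t out_size)
{
    return snprintf(out, out_size,
                    "pc=%08lX lr=%08lX r12=%08lX psr=%08lX cfsr=%08lX hfsr=%08lX",
                    (unsigned long)frame[FRAME_PC],
                    (unsigned long)frame[FRAME_LR],
                    (unsigned long)frame[FRAME_R12],
                    (unsigned long)frame[FRAME_PSR],
                    (unsigned long)cfsr,
                    (unsigned long)hfsr);
}
```

Keeping the formatting separate from the register capture means the same routine can be exercised on a PC with a recorded frame.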

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

Hi @Tesla DeLorean ,

first of all, many thanks for your message and for providing the code to catch the HardFault exceptions!

I spent the entire day integrating the code you suggested into my project and I'm sure that it was time well spent.

Here's what I ended up with:

[Hard Fault]
CPU registers dump:
r0 = 00000000, r1 = 20000578, r2 = 200009EC, r3 = 00000000
r4 = 20000A18, r5 = 24264CEE, r6 = 8715A7C8, sp = 2000FFE8
r12= 0803F802, lr = 080045EB, pc = 080045F0, psr= 01000000
bfar=E000ED38, cfsr=00000400, hfsr=40000000, dfsr=00000000, afsr=00000000
Stack dump:
00000001
0800DC81
20000A18
24264CEE
2000FF90
08005833
Instructions dump:
B2DB 2B00 D00A F001 F8A9 4603 71FB 79FB (4618) F001 F8D1 4B3B 2200 701A 
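
For anyone reading the dump above, the cfsr and hfsr values can be decoded with a few bit masks. The sketch below uses the bit positions defined by the ARMv7-M architecture (the bus fault status byte occupies CFSR[15:8]); the SKETCH_/sketch_ names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bus fault bits inside CFSR (ARMv7-M: BFSR is CFSR[15:8]). */
#define SKETCH_CFSR_IBUSERR      (1UL << 8)
#define SKETCH_CFSR_PRECISERR    (1UL << 9)
#define SKETCH_CFSR_IMPRECISERR  (1UL << 10)
#define SKETCH_CFSR_UNSTKERR     (1UL << 11)
#define SKETCH_CFSR_STKERR       (1UL << 12)
#define SKETCH_CFSR_BFARVALID    (1UL << 15)

/* Hard fault status bits. */
#define SKETCH_HFSR_VECTTBL      (1UL << 1)
#define SKETCH_HFSR_FORCED       (1UL << 30)

/* True when the fault was an imprecise data bus error. */
static bool sketch_is_imprecise_bus_fault(uint32_t cfsr)
{
    return (cfsr & SKETCH_CFSR_IMPRECISERR) != 0U;
}

/* True when BFAR holds the faulting address. */
static bool sketch_bfar_is_valid(uint32_t cfsr)
{
    return (cfsr & SKETCH_CFSR_BFARVALID) != 0U;
}

/* True when a configurable fault was escalated to a hard fault. */
static bool sketch_is_escalated(uint32_t hfsr)
{
    return (hfsr & SKETCH_HFSR_FORCED) != 0U;
}
```

Applied to this dump, cfsr=00000400 has only IMPRECISERR set and BFARVALID clear, so the bfar value is not a usable fault address here, and hfsr=40000000 is the FORCED bit, meaning the bus fault was escalated to a hard fault.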

 

As I have Joseph Yiu's book "The definitive guide to ARM Cortex-M3 and Cortex-M4 processors...", I started reading about this fault and discovered that I could disable the write buffer to pinpoint the instruction that triggered the bus fault. Unfortunately, I couldn't set the DISDEFWBUF bit (SCnSCB->ACTLR), as my MCU (ARM Cortex-M3 revision r0p1) does not have it.

So, I went through the stack starting from the last address (0x08005833) and I found these instructions (in the disassembly view):

          Reset_Handler:
08005800:   bl      0x80051a4 <SystemInit>
 68         ldr r0, =_sdata
08005804:   ldr     r0, [pc, #44]   @ (0x8005834 <Reset_Handler+51>)
 69         ldr r1, =_edata
08005806:   ldr     r1, [pc, #48]   @ (0x8005838 <LoopFillZerobss+18>)
 70         ldr r2, =_sidata
08005808:   ldr     r2, [pc, #48]   @ (0x800583c <LoopFillZerobss+22>)
 71         movs r3, #0
0800580a:   movs    r3, #0
 72         b LoopCopyDataInit
0800580c:   b.n     0x8005814 <Reset_Handler+19>
 75         ldr r4, [r2, r3]
0800580e:   ldr     r4, [r2, r3]
 76         str r4, [r0, r3]
08005810:   str     r4, [r0, r3]
 77         adds r3, r3, #4
08005812:   adds    r3, #4
 80         adds r4, r0, r3
08005814:   adds    r4, r0, r3
 81         cmp r4, r1
08005816:   cmp     r4, r1
 82         bcc CopyDataInit
08005818:   bcc.n   0x800580e <Reset_Handler+13>
 85         ldr r2, =_sbss
0800581a:   ldr     r2, [pc, #36]   @ (0x8005840 <LoopFillZerobss+26>)
 86         ldr r4, =_ebss
0800581c:   ldr     r4, [pc, #36]   @ (0x8005844 <LoopFillZerobss+30>)
 87         movs r3, #0
0800581e:   movs    r3, #0
 88         b LoopFillZerobss
08005820:   b.n     0x8005826 <Reset_Handler+37>
 91         str  r3, [r2]
08005822:   str     r3, [r2, #0]
 92         adds r2, r2, #4
08005824:   adds    r2, #4
 95         cmp r2, r4
08005826:   cmp     r2, r4
 96         bcc FillZerobss
08005828:   bcc.n   0x8005822 <Reset_Handler+33>
 99         bl __libc_init_array
0800582a:   bl      0x800dc4c <__libc_init_array>
101         bl main
0800582e:   bl      0x80044d4 <main>
102         bx lr
08005832:   bx      lr
 68         ldr r0, =_sdata
08005834:   movs    r0, r0
08005836:   movs    r0, #0
 69         ldr r1, =_edata
08005838:   movs    r4, r1
0800583a:   movs    r0, #0
 70         ldr r2, =_sidata
0800583c:   b.n     0x80050f8 <HAL_CAN_RxFifo0MsgPendingCallback+92>
0800583e:   lsrs    r0, r0, #32
 85         ldr r2, =_sbss
08005840:   movs    r0, r2
08005842:   movs    r0, #0
 86         ldr r4, =_ebss
08005844:   lsrs    r0, r3, #8
08005846:   movs    r0, #0
115         b Infinite_Loop
          WWDG_IRQHandler:
08005848:   b.n     0x8005848 <WWDG_IRQHandler>
266         TST lr, #4

 

To me the location (0x08005833) looks like the reset handler, although the address does not perfectly match.
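
A likely reason for the off-by-one: on Cortex-M, return addresses saved in LR or on the stack have bit 0 set to mark Thumb state, so the real code address is the stacked value with that bit cleared. A minimal sketch (hypothetical helper names) of the conversion, plus the call site when the call was made with a 32-bit BL:

```c
#include <stdint.h>

/* Clear the Thumb state bit to get the real instruction address. */
static uint32_t sketch_strip_thumb_bit(uint32_t code_address)
{
    return code_address & ~(uint32_t)1U;
}

/* Address of the calling BL, assuming the call used the 32-bit BL
 * encoding (4 bytes long). Not valid for a 16-bit BLX <reg>. */
static uint32_t sketch_bl_call_site(uint32_t return_address)
{
    return sketch_strip_thumb_bit(return_address) - 4U;
}
```

With the value above, 0x08005833 becomes 0x08005832, the instruction right after `bl main` at 0x0800582e in the Reset_Handler listing. So this stack entry is simply the return address of main(), not the location of the fault.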

Moreover, I don't know how to read the instructions dump.

What steps should I take next?

Many thanks! 🙏

 

pc = 080045F0

 

What you want is to look at disasm a couple of instructions before this address, and from content of other registers in the fault handler discern, which was the offending instruction. If you have mixed disasm/C view, it's usually quite obvious what's the problem in the source.

JW

Hi @waclawek.jan,

thanks for your help.

Here is the register dump, and below it the disassembly around the address the PC points to after the HardFault is triggered.

[Hard Fault]
CPU registers dump:
r0 = 00000000, r1 = 20000578, r2 = 200009E8, r3 = 00000000
r4 = 20000A10, r5 = 64264CEE, r6 = 8715A5C8, sp = 2000FFE8
r12= 0803F802, lr = 0800460F, pc = 08004614, psr= 01000000
bfar=E000ED38, cfsr=00000400, hfsr=40000000, dfsr=00000000, afsr=00000000
Stack dump:
00000001
0800DCA9
20000A10
64264CEE
2000FF90
0800585B
Instructions dump:
B2DB 2B00 D00A F001 F8A9 4603 71FB 79FB (4618) F001 F8D3 4B3B 2200 701A 

Disassembly around the address: 0x08004614

279       		if (DEV_VAR_X == NDevice_variant){
080045f8:   ldr     r3, [pc, #252]  @ (0x80046f8 <main+512>)
080045fa:   ldrb    r3, [r3, #0]
080045fc:   cmp     r3, #0
080045fe:   bne.n   0x8004620 <main+296>
282       			if (bRun_task_shaft_meas){
08004600:   ldr     r3, [pc, #260]  @ (0x8004708 <main+528>)
08004602:   ldrb    r3, [r3, #0]
08004604:   uxtb    r3, r3
08004606:   cmp     r3, #0
08004608:   beq.n   0x8004620 <main+296>
285       				bool bRead_speed = Shaft_Measure();
0800460a:   bl      0x8005760 <Shaft_Measure>
0800460e:   mov     r3, r0
08004610:   strb    r3, [r7, #7]
286       				Shaft_Speed(bRead_speed);
08004612:   ldrb    r3, [r7, #7]
08004614:   mov     r0, r3
08004616:   bl      0x80057c0 <Shaft_Speed>
289       				bRun_task_shaft_meas = false;
0800461a:   ldr     r3, [pc, #236]  @ (0x8004708 <main+528>)
0800461c:   movs    r2, #0
0800461e:   strb    r2, [r3, #0]
298       		if (DEV_VAR_X == NDevice_variant){
08004620:   ldr     r3, [pc, #212]  @ (0x80046f8 <main+512>)
08004622:   ldrb    r3, [r3, #0]
08004624:   cmp     r3, #0
08004626:   bne.n   0x8004638 <main+320>
301       			if (bRun_task_LEDs){
08004628:   ldr     r3, [pc, #224]  @ (0x800470c <main+532>)
0800462a:   ldrb    r3, [r3, #0]
0800462c:   uxtb    r3, r3
0800462e:   cmp     r3, #0
08004630:   beq.n   0x8004638 <main+320>
306       				bRun_task_LEDs = false;
08004632:   ldr     r3, [pc, #216]  @ (0x800470c <main+532>)
08004634:   movs    r2, #0
08004636:   strb    r2, [r3, #0]

The corresponding C code is this:

#ifdef	USE_HALL_SENSORS

		if (DEV_VAR_X == NDevice_variant){

			/* Run the shaft measurement task - triggered by HAL_TIM_IC_CaptureCallback() */
			if (bRun_task_shaft_meas){

				bool bRead_speed = Shaft_Measure();
				Shaft_Speed(bRead_speed);

				bRun_task_shaft_meas = false;
			}
		}

#endif	// USE_HALL_SENSORS

Following your suggestion, since the PC points at the call to Shaft_Speed(bRead_speed), the error should be around here.

void Shaft_Speed(bool bRead_speed)
{
	if (bRead_speed){
		WG_TX_1_msg.n_shaft = (uint16_t)(1000000/htim4.Instance->CCR1);
	}
	else{
		WG_TX_1_msg.n_shaft = 0U;
	}
}

Here I don't really see anything wrong, as the code is very simple. Looking a bit earlier, I have the other function, Shaft_Measure (reported in the first post), which also looks good to me, so I have no idea where the issue could be.

Any other help please?

Many thanks!

Would look at what R7 points at

00000000 B2DB			uxtb	r3, r3
00000002 2B00			cmp	r3, #0
00000004 D00A			beq.n	loc_00001C
00000006 F001 F8A9		bl	sub_00115C
0000000A 4603			mov	r3, r0
0000000C 71FB			strb	r3, [r7, #7]  <<<<
0000000E 79FB			ldrb	r3, [r7, #7]
00000010 4618			mov	r0, r3
00000012 F001 F8D3		bl	sub_0011BC
00000016 4B3B			ldr	r3, [pc, #236]	; ($000104)
00000018 2200			movs	r2, #0
0000001A 701A			strb	r2, [r3, #0]
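
The listing above can be reproduced by hand from the "Instructions dump" halfwords. As a sketch (hypothetical function names, covering only the BL T1 encoding from the ARMv7-M architecture), the code below tells 16-bit from 32-bit Thumb instructions and computes the target of a BL from its two halfwords and its address:

```c
#include <stdbool.h>
#include <stdint.h>

/* A halfword starts a 32-bit Thumb-2 instruction when its bits [15:11]
 * are 0b11101, 0b11110 or 0b11111. */
static bool sketch_is_thumb32(uint16_t hw1)
{
    return ((hw1 & 0xE000U) == 0xE000U) && ((hw1 & 0x1800U) != 0U);
}

/* Target of a Thumb-2 BL (encoding T1) located at `address`. */
static uint32_t sketch_bl_target(uint32_t address, uint16_t hw1, uint16_t hw2)
{
    uint32_t s     = ((uint32_t)hw1 >> 10) & 1U;
    uint32_t imm10 = (uint32_t)hw1 & 0x3FFU;
    uint32_t j1    = ((uint32_t)hw2 >> 13) & 1U;
    uint32_t j2    = ((uint32_t)hw2 >> 11) & 1U;
    uint32_t imm11 = (uint32_t)hw2 & 0x7FFU;
    uint32_t i1    = (~(j1 ^ s)) & 1U;   /* I1 = NOT(J1 XOR S) */
    uint32_t i2    = (~(j2 ^ s)) & 1U;   /* I2 = NOT(J2 XOR S) */
    uint32_t imm   = (s << 24) | (i1 << 23) | (i2 << 22) |
                     (imm10 << 12) | (imm11 << 1);

    if (s != 0U) {
        imm |= 0xFE000000U;              /* sign-extend the 25-bit offset */
    }
    /* In Thumb state the PC reads as the instruction address plus 4. */
    return address + 4U + imm;
}
```

For example, the pair F001 F8A9 placed at 0x0800460a resolves to 0x08005760, which is Shaft_Measure in the disassembly posted earlier, and F001 F8D3 at 0x08004616 resolves to 0x080057c0, which is Shaft_Speed. The halfword in parentheses, 4618, is a 16-bit instruction and is the one at the stacked pc.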

I'd say the problem happens in

08004610: strb r3, [r7, #7]

Can we get r7 from the hard fault handler?

OTOH, it's a local variable, so it should be located on the stack; I don't quite understand what might have happened to r7 (which is probably the local frame pointer, i.e. it points to the stack area where local variables are allocated).

Is the fault reproducible? Can you get content of r7?

JW

Hi both,

Here is what I found while stepping through the disassembly around the instruction you both mentioned.
I took some time to read the values to provide more detailed information.

All values were read before the execution of the line they annotate.

265       		if (bRun_task_analog){
08003bde:   ldr     r3, [pc, #296]  @ (0x8003d08 <main+524>) --> pc = 0x8003BDE
08003be0:   ldrb    r3, [r3, #0]    --> r3 = 0x200007DC (contains 0x0)
08003be2:   uxtb    r3, r3          --> r3 = 0x1
08003be4:   cmp     r3, #0
08003be6:   beq.n   0x8003bfc <main+256>
267       			ADC_Measure();
08003be8:   bl      0x8002214 <ADC_Measure>
269       			ADC_Copy_Results(NDevice_variant); --> pc = 0x8003BEC
08003bec:   ldr     r3, [pc, #268]  @ (0x8003cfc <main+512>) --> r3 = 0x200007E8
08003bee:   ldrb    r3, [r3, #0]    --> r3 = 0x0
08003bf0:   mov     r0, r3
08003bf2:   bl      0x80022bc <ADC_Copy_Results>
272       			bRun_task_analog = false;
08003bf6:   ldr     r3, [pc, #272]  @ (0x8003d08 <main+524>) --> pc = 0x8003BF6
08003bf8:   movs    r2, #0
08003bfa:   strb    r2, [r3, #0]    --> r3 = 0x200007dc (contains 01010000)
280       		if (DEV_VAR_X == NDevice_variant){
08003bfc:   ldr     r3, [pc, #252]  @ (0x8003cfc <main+512>) --> pc = 0x8003BFC
08003bfe:   ldrb    r3, [r3, #0]    --> r3 = 0x200007E8 (contains 0x0)
08003c00:   cmp     r3, #0
08003c02:   bne.n   0x8003c24 <main+296>
283       			if (bRun_task_shaft_meas){
08003c04:   ldr     r3, [pc, #260]  @ (0x8003d0c <main+528>) --> pc =  0x8003C04
08003c06:   ldrb    r3, [r3, #0]    --> r3 = 0x200007DD (contains 0x00010100)
08003c08:   uxtb    r3, r3          --> r3 = 0x1
08003c0a:   cmp     r3, #0
08003c0c:   beq.n   0x8003c24 <main+296>
286       				bool bRead_speed = Shaft_Measure();
-----------------------------------------------------------> jumps to code below



          Shaft_Measure:
08004d64:   push    {r7}            --> r7 = 0xFFFFFFFF
08004d66:   sub     sp, #12         --> sp = 0x2000FFE4 (contains 0xFFFFFFFF)
08004d68:   add     r7, sp, #0      --> sp = 0x2000FFD8 (contains 0x02F80308)
448       	bool bRead_speed = false;
08004d6a:   movs    r3, #0
08004d6c:   strb    r3, [r7, #7]    --> r7 = 0x2000FFD8 (contains 0x02F80308)
451       	if (htim1.Instance == NULL) {
08004d6e:   ldr     r3, [pc, #72]   @ (0x8004db8 <Shaft_Measure+84>) --> pc = 0x8004D6E
08004d70:   ldr     r3, [r3, #0]    --> r3 =  0x20000884 (contains 002C0140)
08004d72:   cmp     r3, #0
08004d74:   bne.n   0x8004d7a <Shaft_Measure+22> -> jumps to 0x08004d7a
-------
452       		return false; // or handle the error as appropriate
08004d76:   movs    r3, #0
08004d78:   b.n     0x8004dae <Shaft_Measure+74>
458       	aShaft = (int32_t)htim1.Instance->CNT;
-------
08004d7a:   ldr     r3, [pc, #60]   @ (0x8004db8 <Shaft_Measure+84>) --> pc = 0x8004D7A
08004d7c:   ldr     r3, [r3, #0]    --> r3 =  0x20000884 (contains 002C0140)
08004d7e:   ldr     r3, [r3, #36]   @ 0x24 --> r3 = 0x0
08004d80:   str     r3, [r7, #0]
459       	WG_TX_1_msg.a_shaft = (int16_t)aShaft;
08004d82:   ldr     r3, [r7, #0]    --> r7 = 0x2000FFD8 (contains 0x0)
08004d84:   sxth    r2, r3
08004d86:   ldr     r3, [pc, #52]   @ (0x8004dbc <Shaft_Measure+88>) --> pc = 0x8004D86
08004d88:   strh    r2, [r3, #6]    --> r3 = 0x20000614 (contains 0x0)
469       	if ((int32_t)abs(aShaft - aShaft_old) > 1){
08004d8a:   ldr     r3, [pc, #52]   @ (0x8004dc0 <Shaft_Measure+92>) --> pc = 0x8004D8A
08004d8c:   ldr     r3, [r3, #0]    --> r3 = 0x200009E8 (contains 0x0)
08004d8e:   ldr     r2, [r7, #0]    --> r7 = 0x2000ffd8 (contains 0x0)
08004d90:   subs    r3, r2, r3      --> both r2 and r3 = 0x0
08004d92:   cmp     r3, #0
08004d94:   it      lt
08004d96:   neglt   r3, r3
08004d98:   cmp     r3, #1
08004d9a:   ble.n   0x8004da2 <Shaft_Measure+62> -> jumps to 08004da2
-------
470       		bRead_speed = true;
08004d9c:   movs    r3, #1
08004d9e:   strb    r3, [r7, #7]
08004da0:   b.n     0x8004da6 <Shaft_Measure+66>
473       		bRead_speed = false;
-------
08004da2:   movs    r3, #0          --> r3 = 0x0
08004da4:   strb    r3, [r7, #7]    --> r7 = 0x2000FFD8 (contains 0x0)
475       	aShaft_old = aShaft;
08004da6:   ldr     r2, [pc, #24]   @ (0x8004dc0 <Shaft_Measure+92>) --> pc = 0x8004DA6
08004da8:   ldr     r3, [r7, #0]    --> r7 = 0x2000FFD8 (contains 0x0)
08004daa:   str     r3, [r2, #0]    --> r2 = 0x200009E8 (contains 0x0)
476       	return bRead_speed;
08004dac:   ldrb    r3, [r7, #7]    --> r7 = 0x2000FFD8 (contains 0x0)
477       }
08004dae:   mov     r0, r3          --> r3 = 0x0
08004db0:   adds    r7, #12         --> r7 = 0x2000FFD8 (contains 0x0)
08004db2:   mov     sp, r7          --> r7 = 0x2000FFE4 (contains 0xFFFFFFFF)
08004db4:   pop     {r7}
08004db6:   bx      lr              --> lr = 0x8003C13 (jumps to the code below)
08004db8:   lsrs    r4, r0, #2
08004dba:   movs    r0, #0
08004dbc:   lsls    r4, r2, #24
08004dbe:   movs    r0, #0
08004dc0:   lsrs    r0, r5, #7
08004dc2:   movs    r0, #0
482       {


-----------------------------------------------------------
08003c0e:   bl      0x8004d64 <Shaft_Measure>
08003c12:   mov     r3, r0          --> r0 = 0x0
08003c14:   strb    r3, [r7, #7]    --> r7 = 0xFFFFFFFF
287       				Shaft_Speed(bRead_speed);
08003c16:   ldrb    r3, [r7, #7]    --> r3 now contains 0x0 and r7 still 0xFFFFFFFF
08003c18:   mov     r0, r3
08003c1a:   bl      0x8004dc4 <Shaft_Speed>
290       				bRun_task_shaft_meas = false;

Why did you point to that particular instruction?
I continued to step through the code, and although the error was triggered at the same point, I was still able to proceed to other instructions below the point that normally jumps to the HardFault handler.
Why is this happening?

Still many thanks for further help! 🙏

Hi @waclawek.jan,

the fault happens every time I run the code, but I'm not sure if that would be reproducible... It would take some time (that I do not have) to create a similar program.

While Shaft_Measure executes, r7 is always 0x2000FFD8 (which contains 0x02F80308) and never changes.

Then, once the HardFault handler in the startup assembly is invoked, it changes to 0xFFFFFFFF.

When it executes the hard_fault_handler_c function it changes again to 0x2000F90 which now contains 0xC8A50587 (that was missing from the first dump - sorry).

I really hope that this could help to find the issue.

Many thanks!

Why does the offending code's address keep changing?

Are you adding/removing code for the various experiments? That makes it a moving target, harder to aim and hit.

In the above code, Shaft_Measure() is irrelevant: r7 is pushed to the stack at the beginning and popped back at the end, i.e. it's unchanged. As your comment also indicates, r7 was already 0xFFFFFFFF at the point where Shaft_Measure() was called, and that's an incorrect value, as r7 is used as the stack frame pointer, i.e. it should point somewhere near the top of the stack.

In other words, go back to the beginning of the calling function (i.e. top of function which called Shaft_Measure()), and observe how r7 is set up there.
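
As a quick sanity check while stepping, the frame pointer can be tested against the stack region. The sketch below is only an illustration: the function name is hypothetical, and the bounds passed in are assumptions that should be taken from the linker script of the actual project.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: true when `fp` is word aligned and lies inside
 * the stack region [stack_limit, stack_top]. */
static bool sketch_fp_in_stack(uint32_t fp, uint32_t stack_limit,
                               uint32_t stack_top)
{
    return ((fp & 0x3U) == 0U) && (fp >= stack_limit) && (fp <= stack_top);
}
```

Assuming a 64 KB RAM ending at 0x20010000 and, say, a stack limit of 0x2000F000, the value 0x2000FFD8 seen inside Shaft_Measure() passes the check, while 0xFFFFFFFF seen in the caller does not.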

JW