IAP procedure doesn't work - strange behaviour

Roberto Giovinetti · ‎2018-08-10

Hello everyone

I'm using an STM32F072 microcontroller. I have to update my software by reading the whole HEX program from an external memory via SPI and then programming it into the flash. To do so, I follow these steps:

Create a section in the linker file called ramfunc
Mark the functions I need (SPI communication, flash reprogramming) with __attribute__((section("ramfunc"))) so that they are copied into SRAM at startup
When I need to update the software, call the function in RAM to reprogram the entire flash

Everything works if I "simulate" the update procedure: if I don't mass erase, I can debug the process and see that all the functions are in RAM. I don't see any branches to flash in the assembly, the code executes from RAM, and it works normally.

If I mass erase the flash and debug, I can do some steps, but after a while (at a CMP assembler instruction) I see the program jump to 0xFFFFFFFE, as if there were a branch to a flash location!

Before calling the RAM function I disable interrupts. I put some breakpoints to see whether there are any jumps to flash while the function runs in RAM, but there are none. I can't understand it! I also checked that the optimizer doesn't insert jumps to helper routines (for example __gnu_thumb1_case_uqi): the compiler does the optimizations and creates a veneer in RAM, so the whole process resides in RAM.

The only possibility may be a hard fault, but I can't debug it, and I don't know why the hard fault happens a few instructions after the mass erase completes, before the first bytes of the flash are reprogrammed, while I'm parsing the HEX file from SPI.

Any ideas?!

EDIT1

I also checked the option bytes: the protection level is set to 0, so I can read/write/debug. The mass erase procedure doesn't change the information block of the address space, so the option bytes keep their default ST values.

EDIT2

OK: I've moved the ISR vector table to SRAM and performed __HAL_SYSCFG_REMAPMEMORY_SRAM(). I've also moved HardFault_Handler to SRAM.

Now when the error is triggered, my code jumps to HardFault, and I have to investigate why it happens!
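
The Cortex-M0 has no vector table offset register, so the relocation described in EDIT2 is usually done by copying the table to the start of SRAM before remapping. The following is a minimal sketch of that copy, not the poster's actual code. It assumes 48 vector entries and 16 KB of SRAM for the STM32F072; the names copy_vector_table and first_vector_outside_sram are hypothetical. The second helper checks that every handler entry in the copied table points into SRAM, because an entry that still points to flash becomes invalid after a mass erase.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumption: 16 core entries + 32 IRQ entries = 48 words on STM32F072. */
#define VECTOR_COUNT   48u
#define SRAM_BASE_ADDR 0x20000000u
#define SRAM_SIZE      0x00004000u  /* assumed 16 KB of SRAM */

/* Copy the vector table word by word.
 * On target, src would be 0x08000000 and dst 0x20000000, followed by
 * __HAL_SYSCFG_REMAPMEMORY_SRAM(). Plain pointers keep the sketch testable. */
static void copy_vector_table(volatile uint32_t *dst, const uint32_t *src)
{
    for (size_t i = 0; i < VECTOR_COUNT; i++) {
        dst[i] = src[i];
    }
}

/* Return the index of the first handler entry that does not point into
 * SRAM, or -1 if all of them do. Entry 0 is the initial stack pointer and
 * zero entries are reserved slots, so both are skipped. */
static int first_vector_outside_sram(const volatile uint32_t *table)
{
    for (size_t i = 1; i < VECTOR_COUNT; i++) {
        uint32_t entry = table[i];
        if (entry == 0u) {
            continue;
        }
        uint32_t target = entry & ~1u;  /* strip the Thumb bit */
        if (target < SRAM_BASE_ADDR || target >= SRAM_BASE_ADDR + SRAM_SIZE) {
            return (int)i;
        }
    }
    return -1;
}
```

Running a check like this before the erase would show whether any handler other than HardFault_Handler is still located in flash.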

EDIT3

I debugged the application and obtained the following Fault Analyzer image.

I notice that SP and PC point to valid memory regions. I don't have any other information that can help me.

EDIT4

Now I've found that during various attempts the pc register contains 0x08000108, which is the address of the __gnu_thumb1_case_sqi routine. I can't understand why! The linker created the __gnu_thumb1_case_sqi veneer at address 0x20000700 (SRAM). The only thing I can think of is that the problem is in the snippet that follows:

20000700 <____gnu_thumb1_case_sqi_veneer>:
20000700:	b401      	push	{r0}
20000702:	4802      	ldr	r0, [pc, #8]	; (2000070c <____gnu_thumb1_case_sqi_veneer+0xc>)
20000704:	4684      	mov	ip, r0
20000706:	bc01      	pop	{r0}
20000708:	4760      	bx	ip
2000070a:	bf00      	nop
2000070c:	08000109 	.word	0x08000109
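
The literal at the end of the listing can be decoded by hand: bit 0 of a BX target only selects Thumb state, so 0x08000109 branches to 0x08000108. The sketch below does that decoding and checks which memory region the target falls in. It assumes a 128 KB flash starting at 0x08000000; the helper names are hypothetical. If this reading is right, the veneer itself sits in SRAM but hands control to a routine in flash, which would be one remaining dependency on flash.

```c
#include <stdint.h>
#include <stdbool.h>

#define FLASH_START 0x08000000u
#define FLASH_LEN   0x00020000u  /* assumed 128 KB part */

/* Address actually loaded into PC by "bx": bit 0 is the Thumb flag. */
static uint32_t branch_target(uint32_t bx_value)
{
    return bx_value & ~1u;
}

/* True if the branch stays in Thumb state (the only state on Cortex-M). */
static bool is_thumb(uint32_t bx_value)
{
    return (bx_value & 1u) != 0u;
}

/* True if addr lies inside the assumed flash range. */
static bool in_flash(uint32_t addr)
{
    return addr >= FLASH_START && addr < FLASH_START + FLASH_LEN;
}
```

For the word in the listing, branch_target(0x08000109) gives 0x08000108, which in_flash() reports as a flash address, while the veneer address 0x20000700 is not.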

EDIT5

The __gnu_thumb1_case_sqi veneer is not the problem. I tried building without optimizations, so the snippet was removed from the code, but a hard fault is still triggered. I can't understand why. Now I find 0xFFFFFFFF in lr when we have to jump to the function that erases the whole flash. Strange behaviour...

Tesla DeLorean · ‎2018-08-10

Behaviour does suggest it still has some dependencies on code in flash. The CM0 is more difficult because it uses library functions to perform even simple operations (math).

Suggest you do a disassembly of the code generated, and walk all of it manually.

I tend to write my Flashing routines in assembler where I can keep some address independent code in a short linear block, and I know what I'm calling.

Pavel A. · ‎2018-08-10

Jump to 0xFFFFFFFE looks like a return from interrupt handler. Do you have active interrupts while executing code in RAM? If yes, where are the interrupt vectors?
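
To narrow this down, it may help to compare the observed value against the exception return codes. The sketch below lists the three EXC_RETURN values defined for ARMv6-M and shows what PC looks like after the core loads a word from erased flash, assuming erased flash reads as 0xFFFFFFFF. The function names are hypothetical. Note that 0xFFFFFFFE is not itself an EXC_RETURN value; it is what a fetched 0xFFFFFFFF becomes once the Thumb bit is dropped, and it is also the PC value the core reports in lockup.

```c
#include <stdint.h>
#include <stdbool.h>

/* The three EXC_RETURN values defined for ARMv6-M (Cortex-M0):
 * 0xFFFFFFF1 return to handler mode, 0xFFFFFFF9 return to thread mode
 * using MSP, 0xFFFFFFFD return to thread mode using PSP. */
static bool is_exc_return(uint32_t value)
{
    return value == 0xFFFFFFF1u ||
           value == 0xFFFFFFF9u ||
           value == 0xFFFFFFFDu;
}

/* PC after loading a vector or branch word: bit 0 is not part of PC. */
static uint32_t pc_from_word(uint32_t word)
{
    return word & ~1u;
}
```

With these, pc_from_word(0xFFFFFFFF) yields 0xFFFFFFFE, which would fit an exception whose vector was fetched from erased flash.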

IMHO doing mass erase for IAP update is an interesting exercise, but too risky for production. A separate bootloader & updater in flash is more reliable. This also is simpler because copying code to RAM is not needed. You can erase and write a flash page from code running in a different page.

-- pa

Tesla DeLorean · ‎2018-08-10

Agree with @Pavel A.: split the loader and app, selectively erase only the blocks being updated, and have the loader do a full-span integrity check of the app before jumping in, going into a recovery mode if the check fails.
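
A full-span integrity check can be as simple as a CRC over the application image compared with a stored value. The following is a minimal sketch, assuming a standard reflected CRC-32 (polynomial 0xEDB88320) computed in software and an expected value stored somewhere the loader can read. The function names and the way the expected CRC is supplied are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320, initial value and
 * final XOR of 0xFFFFFFFF). Slow but small, which suits a boot loader. */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return ~crc;
}

/* True if the image matches the expected CRC. The loader would jump to
 * the app only when this returns true, and otherwise stay in recovery. */
static bool app_image_is_valid(const uint8_t *app, size_t len,
                               uint32_t expected_crc)
{
    return crc32_calc(app, len) == expected_crc;
}
```

On target, app would point at the start of the application region and len would cover the whole span the updater writes.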

Roberto Giovinetti · ‎2018-08-10

Hello Pavel and Clive

I agree that it is a risky operation and that a separate bootloader would be more flexible, but in that case, when I "parse" the HEX file of the new version, I would need to know in advance which addresses need to be programmed. At this point I'm looking for a solution that "works" for testing purposes, and I would like to understand why a full erase doesn't work.
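
The address question can be handled record by record while parsing. The sketch below assumes the file is in Intel HEX format and that the application region starts at 0x08004000, a hypothetical layout with a 16 KB bootloader below it. It parses one record, verifies its checksum, and reports whether a data record falls entirely inside the application region. All names are hypothetical, and upper16 stands for the upper address bits taken from the most recent type 04 record.

```c
#include <stdint.h>
#include <stdbool.h>

#define APP_START 0x08004000u  /* hypothetical: first address after loader */
#define APP_END   0x08020000u  /* assumed end of a 128 KB flash */

typedef struct {
    uint8_t  length;     /* number of data bytes */
    uint16_t offset;     /* lower 16 bits of the load address */
    uint8_t  type;       /* 0x00 data, 0x01 end of file, 0x04 upper address */
    uint8_t  data[255];
} hex_record_t;

static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Read two hex digits; returns 0..255, or -1 on a bad or missing digit. */
static int hex_byte(const char *s)
{
    int hi = hex_nibble(s[0]);
    if (hi < 0) return -1;
    int lo = hex_nibble(s[1]);
    if (lo < 0) return -1;
    return (hi << 4) | lo;
}

/* Parse ":LLAAAATT<data>CC". Returns true only if the record is well
 * formed and all bytes including the checksum sum to zero modulo 256. */
static bool hex_parse_record(const char *line, hex_record_t *rec)
{
    if (line[0] != ':') return false;
    const char *p = line + 1;
    uint8_t sum = 0;
    int hdr[4];
    for (int i = 0; i < 4; i++, p += 2) {
        hdr[i] = hex_byte(p);
        if (hdr[i] < 0) return false;
        sum = (uint8_t)(sum + hdr[i]);
    }
    rec->length = (uint8_t)hdr[0];
    rec->offset = (uint16_t)((hdr[1] << 8) | hdr[2]);
    rec->type   = (uint8_t)hdr[3];
    for (int i = 0; i < rec->length; i++, p += 2) {
        int b = hex_byte(p);
        if (b < 0) return false;
        rec->data[i] = (uint8_t)b;
        sum = (uint8_t)(sum + b);
    }
    int chk = hex_byte(p);
    if (chk < 0) return false;
    return (uint8_t)(sum + chk) == 0u;
}

/* True if a data record lies entirely inside the application region. */
static bool record_in_app_region(uint32_t upper16, const hex_record_t *rec)
{
    uint32_t start = (upper16 << 16) | rec->offset;
    uint32_t end   = start + rec->length;
    return rec->type == 0x00u && start >= APP_START && end <= APP_END;
}
```

An updater built this way could skip or reject any record outside the application region instead of needing a list of addresses up front.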

Maybe in the future I can create different sections in the linker file and put the bootloader at the original start address, with the rest of the program in the sections I've created.

Roberto

Pavel A. · ‎2018-08-13

So... Do you have active interrupts while executing code in RAM?