2018-04-28 12:00 PM
I have developed a Firmware upgrade Over The Air (FOTA) mechanism for my custom F7 board that seems to work properly, but the main application hard faults during the subsequent reboot.
Thanks to Clive1's suggestions I was able to fashion a reliable bootloader that checks a pre-defined RAM location for a pre-defined value. This value sends the program to 1) the main application (at 0x8040000), 2) the firmware upgrade flash function (which co-resides with the bootloader sector at 0x8000000), or 3) the DFU (0x1FF00000).
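The dispatch described above can be sketched roughly as follows. This is only an illustration, not the bootloader's actual code: 0xDEADF00D is the flasher value used later in this thread, while `BOOT_MAGIC_DFU`, the enum, and both function names are hypothetical. It assumes any unrecognised value (for example RAM garbage after power-on) should fall through to the main application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 0xDEADF00D comes from this thread; the DFU value is a hypothetical placeholder. */
#define BOOT_MAGIC_FLASHER 0xDEADF00Du
#define BOOT_MAGIC_DFU     0xDEADBEEFu

typedef enum {
    BOOT_TARGET_APP,      /* main application at 0x8040000 */
    BOOT_TARGET_FLASHER,  /* firmware upgrade routine in the bootloader sector */
    BOOT_TARGET_DFU       /* system memory DFU at 0x1FF00000 */
} boot_target_t;

/* Map the magic word to a destination. Unknown values boot the app. */
boot_target_t boot_select_target(uint32_t magic)
{
    switch (magic) {
    case BOOT_MAGIC_FLASHER: return BOOT_TARGET_FLASHER;
    case BOOT_MAGIC_DFU:     return BOOT_TARGET_DFU;
    default:                 return BOOT_TARGET_APP;
    }
}

/* Read the request, then clear it so a later reset does not repeat it. */
boot_target_t boot_consume_request(volatile uint32_t *mailbox)
{
    assert(mailbox != NULL);
    uint32_t magic = *mailbox;
    *mailbox = 0u;
    return boot_select_target(magic);
}
```

On the target, `mailbox` would be the pre-defined RAM location, e.g. `(volatile uint32_t *)0x2007FEF0`. Clearing the word after reading it is a design choice so that an unrelated reset later on does not re-trigger the flasher.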
The FOTA function in the main app downloads a binary image file over the air to RAM. The RAM start location is pre-defined so that both the app and bootloader know its location. Once the binary image has been successfully downloaded to RAM, the file size is also stored in a pre-defined RAM location. The app then does a system reset passing control back to the bootloader.
void FOTAEndSession()
{
    SCB_DisableDCache();
    *((unsigned long *)0x2007FEE0) = fw_upgrade_file_size;
    *((unsigned long *)0x2007FEF0) = 0xDEADF00D; // Boot to Flasher
    __DSB();
    NVIC_SystemReset();
}

The bootloader then flashes the binary image to the app location (0x8040000). It then sets the stack pointer and VTOR to the appropriate application address and jumps to this address.
case 0xDEADF00D: //Do FW upgrade then jump to main app
{
    SaveFlashApplication(size);
    appStack = (uint32_t) *((__IO uint32_t*)APPLICATON_ADDRESS);
    appEntry = (pFunction) *(__IO uint32_t*)(APPLICATON_ADDRESS + 4);
    SCB->VTOR = APPLICATON_ADDRESS;
    __set_MSP(appStack);
    appEntry();
    while(1);
    break;
}

The code successfully arrives at the app location and begins to run the startup code. However, something causes a hard fault in or around the call to bl __libc_init_array in the startup code (startup_stm32f765xx.s). I have stepped through the code and it appears that the processor is skipping over this array initialization. The TrueStudio Fault Analyzer says:
This seems to have something to do with data alignment, but I have no clue where. I have seen other threads that talk about DFU data alignment issues, but I am not using DFU. I am simply programming the image directly to flash. The size of the binary image that gets flashed is 0x2B1EF, so I pass the size value of 0x2B1F0 to the flasher in the bootloader, hoping that it may avoid some sort of data alignment/boundary issue with the flasher, to no avail.
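The rounding from 0x2B1EF to 0x2B1F0 mentioned above can be written as a small helper. This is a sketch assuming the flasher programs in 4-byte words; `align_up` is a hypothetical name, not something from the bootloader.

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of align.
 * align must be a non-zero power of two. */
uint32_t align_up(uint32_t size, uint32_t align)
{
    assert(align != 0u && (align & (align - 1u)) == 0u);
    return (size + align - 1u) & ~(align - 1u);
}
```

With these assumptions, `align_up(0x2B1EF, 4)` gives 0x2B1F0, matching the value passed to the flasher. The padding byte that gets programmed is then whatever follows the image in the RAM buffer, so it may be worth filling it with a known value first.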
The other curious thing is, if I load flash manually using J-Link, the __libc_init_array code executes and the application runs properly.

I'm pulling what's left of my hair out trying to figure out what's going on. If anyone has any suggestions, I'm all ears.
2018-04-28 01:35 PM
Make sure the Boot Loader side initialization doesn't trash the data you are going to Flash, i.e. zeroing statics, stack, etc.
I'm presuming the data is in RAM, not to a secondary FLASH area.
When I do a reboot with constants in RAM it is to code that I have in the Reset_Handler, not C code deeper in.
Do a CRC or checksum of the Application FLASH image and compare it with the value computed on the PC side. I package app firmware with a CRC so the loader can check it for validity before calling it; this way a broken image doesn't brick the device.
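A minimal sketch of such a check, assuming a plain software CRC-32 (the reflected 0xEDB88320 polynomial used by zlib and Ethernet) and a hypothetical image layout in which the expected CRC is appended as the last four bytes, little-endian. Neither the layout nor the function names come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320. Pass crc = 0 to start. */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0u);
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial if the bit shifted out was 1. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Hypothetical layout: [image body][CRC-32 of body, 4 bytes, little-endian].
 * Returns 1 if the stored CRC matches the computed one, 0 otherwise. */
int image_is_valid(const uint8_t *image, size_t total_len)
{
    if (image == NULL || total_len <= 4u) {
        return 0;
    }
    size_t body = total_len - 4u;
    uint32_t stored = (uint32_t)image[body]
                    | ((uint32_t)image[body + 1u] << 8)
                    | ((uint32_t)image[body + 2u] << 16)
                    | ((uint32_t)image[body + 3u] << 24);
    return crc32_update(0u, image, body) == stored;
}
```

The loader could call `image_is_valid()` on the RAM copy before flashing and again on the flash copy before jumping, so a damaged download is rejected instead of being started.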
You could also try reading back the current data using ST-LINK Utilities and compare against the valid/working image.
Check the SP/PC values being seen by the loader.
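One way to check the SP/PC values in code is to look at the first two words of the application's vector table before jumping. The sketch below is an assumption-laden illustration: the memory map constants assume 512 KB of RAM at 0x20000000 and 2 MB of flash with the app at 0x08040000, and `app_vectors_look_sane` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed memory map; adjust for the actual part. */
#define RAM_START 0x20000000u
#define RAM_END   0x20080000u  /* one past the last RAM byte (512 KB assumed) */
#define APP_START 0x08040000u
#define APP_END   0x08200000u  /* one past the last flash byte (2 MB assumed) */

/* vectors[0] is the initial stack pointer, vectors[1] the reset handler.
 * Returns 1 if both look plausible, 0 otherwise. */
int app_vectors_look_sane(const uint32_t *vectors)
{
    assert(vectors != NULL);
    uint32_t sp = vectors[0];
    uint32_t pc = vectors[1];

    /* The initial SP may legitimately point one past the end of RAM. */
    if (sp <= RAM_START || sp > RAM_END) return 0;
    if ((sp & 0x3u) != 0u) return 0;       /* stack must be word aligned */
    if ((pc & 0x1u) == 0u) return 0;       /* Thumb bit must be set */
    if (pc < APP_START || pc >= APP_END) return 0;
    return 1;
}
```

On the target this would be called with `(const uint32_t *)0x08040000`. An erased sector reads as 0xFFFFFFFF in both words and is rejected by the range checks.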
2018-04-30 09:54 AM
Clive,
Many thanks for your thoughts.
Make sure the Boot Loader side initialization doesn't trash the data you are going to Flash, i.e. zeroing statics, stack, etc.
Check.
I'm presuming the data is in RAM, not to a secondary FLASH area.
Yes.
When I do a reboot with constants in RAM it is to code that I have in the Reset_Handler, not C code deeper in.
Yes. I realize that. But I am doing some additional functions in the bootloader that are much easier to generate with C code.
You could also try reading back the current data using ST-LINK Utilities and compare against the valid/working image.
Check. Did this with my Segger J-Link.
Check the SP/PC values being seen by the loader.
Check.
Do a CRC or checksum of the Application FLASH image and compare it with the value computed on the PC side. I package app firmware with a CRC so the loader can check it for validity before calling it; this way a broken image doesn't brick the device.
This is where the problem was. My radio does a CRC on each received packet so that my code doesn't need to, but my code had an error in the packet counter, so every 50th byte in the RAM image had a 2-byte gap. UGH! Fixing this makes everything work beautifully!
Clive, thanks again for your valuable contribution! Very much appreciated.
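For anyone hitting the same kind of packet counter bug, one way to guard against gaps in the RAM image is to keep a single running byte count as the write offset and to reject any packet whose sequence number is not the one expected. This is a sketch only, not the code from this project: the 50-byte payload size, the struct, and the function name are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_SIZE 50u  /* hypothetical maximum payload per packet */

typedef struct {
    uint8_t *buf;       /* start of the RAM image buffer */
    size_t   capacity;  /* bytes available in buf */
    size_t   length;    /* bytes written so far; also the next write offset */
    uint32_t next_seq;  /* sequence number expected next */
} fota_rx_t;

/* Append one packet to the image.
 * Returns 0 on success, -1 on a sequence gap, bad length, or overflow. */
int fota_rx_packet(fota_rx_t *rx, uint32_t seq, const uint8_t *payload, size_t len)
{
    assert(rx != NULL && payload != NULL);
    if (seq != rx->next_seq) return -1;                 /* lost or repeated packet */
    if (len == 0u || len > PAYLOAD_SIZE) return -1;     /* malformed length */
    if (len > rx->capacity - rx->length) return -1;     /* would overflow buffer */

    /* The offset is the running byte count, so packets are always contiguous. */
    memcpy(rx->buf + rx->length, payload, len);
    rx->length += len;
    rx->next_seq++;
    return 0;
}
```

At the end of the session, `rx.length` is the file size to hand to the bootloader, and a whole-image CRC on top of this would still catch anything the per-packet radio CRC cannot see.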