2023-11-07 03:32 PM - edited 2023-11-07 03:35 PM
I have a project using the STM32L072CZU6, with a custom bootloader that computes CRCs over blocks of the app part in the flash and compares them with the block CRCs stored in the app header.
This has worked for this project for 2 years or so, from time to time adding a bit of functionality to the app code.
... ah, forget about the long story of what I did so far, I narrowed this down a lot ...
My current firmware image is logically divided into 7 blocks of 16K, the last block is only ~14K here.
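For readers following along, the per-block check described above can be sketched roughly like this in Python. This is only an illustration under assumptions: `zlib.crc32` stands in for whatever CRC the bootloader really uses (the post does not say, and the STM32 CRC peripheral's default configuration is not the same variant as zlib's), and `block_crcs` / `first_bad_block` are hypothetical names.

```python
import zlib
from typing import List, Optional

BLOCK_SIZE = 16 * 1024  # 16K blocks, as described in the post


def block_crcs(image: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return one CRC per block; the last block may be shorter than block_size.

    zlib.crc32 is a stand-in here, not necessarily the bootloader's CRC.
    """
    return [
        zlib.crc32(image[off:off + block_size])
        for off in range(0, len(image), block_size)
    ]


def first_bad_block(image: bytes, expected: List[int]) -> Optional[int]:
    """Return the index of the first block whose CRC differs, or None if all match."""
    actual = block_crcs(image)
    if len(actual) != len(expected):
        # Different block count: report the first index that cannot match.
        return min(len(actual), len(expected))
    for index, (got, want) in enumerate(zip(actual, expected)):
        if got != want:
            return index
    return None
```

With an image of 6 full blocks plus a partial one, `block_crcs` yields 7 values, and a corruption anywhere in the last ~14K shows up as block index 6.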
Now I extracted, via the debugger, that last block, which has the bad CRC, and compared it to the same block from my source image. At a certain offset, the hex editor reveals that 4 extra bytes are "inserted" in what I read back from flash, compared to the original image, such that the rest of the image is shifted 4 bytes forwards in memory. The offset where that happens is, from app image start: 6*16K + 864 bytes. The debug config is set up to flash the 32K bootloader first, but IIRC the loader and the app are handled separately, as it has 2 .ELF files for debugging.
So, it's not broken flash or so.
I deleted the 4 bytes in the editor, and the contents then all matched, save for the missing 4 bytes at the end - and all the CRCs on the bootloader and header generator ends check out here, so it's not that my debugger had a hiccup when extracting the memory - this really was what caused the (exact same value) bad CRC.
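The manual hex-editor comparison above could be automated along these lines. This is a minimal sketch with hypothetical helper names; it assumes the readback has the same length as the original block, so an insertion pushes the tail off the end.

```python
def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if a == b."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # No mismatch in the common prefix: either equal, or one is longer.
    return None if len(a) == len(b) else min(len(a), len(b))


def inserted_run(original: bytes, readback: bytes, max_shift: int = 16):
    """Detect 'readback == original with n bytes inserted'.

    Returns (offset, n) if everything after the first difference in the
    readback matches the original shifted by n bytes, else None.
    """
    off = first_difference(original, readback)
    if off is None:
        return None
    for n in range(1, max_shift + 1):
        tail_len = len(readback) - (off + n)
        if tail_len <= 0:
            break
        if readback[off + n:] == original[off:off + tail_len]:
            return off, n
    return None
```

For the situation described in the post, this would be expected to report a shift of 4 at the offset where the readback starts to diverge.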
I see only 2 possible branches of how this can happen:
- when my app header generator program calculates the CRC for the last block, the data this is based on differs from the saved .ELF file that CubeIDE uses to flash: I use an .elf read/write library to get the data, compute CRC, and enter the CRC values into a header section, which I added in the linkerscript.
- or, the CubeIDE flash algorithm does weird stuff
What's bizarre to me is that:
- this happens only for a version of my app code that has some 650 bytes of change in them, where particular code is linked in, because the new module gets called in the code
- when I comment out the call, so the linker throws it out, and add nonsense code to inflate the image artificially, to the exact same size as the bad image - the CRC values are all good and all is fine - so it's not an issue with the app image size, but the content... which I find rather puzzling.
Has anyone seen such an effect before?
I will, of course, examine the "bad elf handling" part on my side, but the 3rd party ELF library, which I have never updated, has not given me any trouble so far, and I just load the original .ELF as output by CubeIDE, write something into my header section, then save an .ELF again... which has all worked for 34 versions of my app software, without changing this process in between.
2023-11-07 04:56 PM
Probably something in the object and how a section is handled by the linker script.
Use objcopy to make a .HEX file, check that it shows the memory shift you see in FLASH
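Once a .HEX file exists, a quick way to see whether it contains an address gap is to list the contiguous ranges it covers. The following is a rough sketch of an Intel HEX range scanner (hypothetical helper name `hex_ranges`); it only handles the common record types 00, 01, 02 and 04.

```python
def hex_ranges(hex_text: str):
    """Parse Intel HEX text; return merged (start, end) ranges, end exclusive."""
    base = 0
    chunks = []
    for line in hex_text.splitlines():
        line = line.strip()
        if not line.startswith(":"):
            continue
        rec = bytes.fromhex(line[1:])
        # All record bytes, including the trailing checksum, sum to 0 mod 256.
        if sum(rec) & 0xFF != 0:
            raise ValueError("bad checksum in record: " + line)
        count = rec[0]
        addr = int.from_bytes(rec[1:3], "big")
        rtype = rec[3]
        data = rec[4:4 + count]
        if rtype == 0x00:      # data record
            chunks.append((base + addr, base + addr + count))
        elif rtype == 0x04:    # extended linear address (upper 16 bits)
            base = int.from_bytes(data, "big") << 16
        elif rtype == 0x02:    # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif rtype == 0x01:    # end of file
            break
    merged = []
    for start, end in sorted(chunks):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

If the result has more than one range inside the app region, the distance between one range's end and the next range's start is the size of the gap.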
2023-11-08 09:25 AM - edited 2023-11-08 09:25 AM
Ah, that format. The ELFs actually agree before vs. after my editing with the library, so the 4 bytes are not extra, but missing from the .bin file I create by gluing together the sections' data.
The odd thing being that that's done in a straightforward way that generally "works".
And those 4 bytes are at the very beginning of a section much larger than 4 bytes, so it's not an entire section that was ditched for some reason.
...digging further...
2023-11-08 11:10 AM
Then likely an ". = ALIGN(8)" in the Linker Script between sections, especially in the latter sections used for fixup or constructor / destructor lists. You might be able to unify those separate sections.
The ELF Format can get a bit involved if you have to pick out the PROGBITS sections.
I tend to do my signing at a .BIN level, although I've built tools to process ELF and HEX files too, but obviously the object files can have a lot of chaff and voids in them, so you need to unpack / stage in a form that mirrors your FLASH, Internal, External, etc.
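To illustrate the "stage in a form that mirrors your FLASH" idea, here is a minimal Python sketch that places each section at its load address instead of concatenating the data. The names `stage_image` and `naive_concat` are hypothetical, and the fill value is an assumption: it has to match whatever actually ends up in the unprogrammed gap bytes on the target (check the reference manual for the erased-flash value of the part).

```python
def stage_image(sections, base, fill=0x00):
    """Build a flat image starting at `base` from (load_address, data) pairs.

    Gaps between sections are filled with `fill` (an assumed value).
    """
    sections = list(sections)
    end = max(addr + len(data) for addr, data in sections)
    image = bytearray([fill]) * (end - base)
    for addr, data in sections:
        if addr < base:
            raise ValueError("section below image base")
        image[addr - base:addr - base + len(data)] = data
    return bytes(image)


def naive_concat(sections):
    """Glue section data together in address order, ignoring gaps."""
    return b"".join(data for _, data in sorted(sections, key=lambda s: s[0]))
```

When an alignment gap exists between two sections, `naive_concat` comes out shorter than `stage_image` by exactly the gap size, and everything after the gap is shifted, which is the symptom seen in this thread.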
2023-11-09 07:06 PM - edited 2023-11-09 07:07 PM
Thanks! After taking a closer look at the differences between the sections present in the ELFs of the working vs. non-working builds, I noticed that the .rodata section was at different addresses. The .text section comes right before it; I changed the ALIGN(4) at the .text section's end to ALIGN(8), and this now seems to work.
Could you elaborate on why this is necessary? My naive idea was that 4-byte alignment would be fine for a 32-bit machine.
And then I also noticed that the alignment of .rodata itself was 8 instead of 4 in the ELF in the bad build, though the linkerscript states (4) in that section, no changes there. Odd.
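A note on the arithmetic: the linker gives an output section the strictest alignment of any input section placed in it, and an ALIGN(4) inside the script does not lower that. So if the newly linked module contributes an 8-byte-aligned read-only object (an assumption here, e.g. a 64-bit constant), .rodata becomes 8-aligned and may start 4 bytes after the end of .text. A small sketch with hypothetical addresses:

```python
def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def padding_before(prev_end: int, alignment: int) -> int:
    """Number of gap bytes between the previous section's end and the next start."""
    return align_up(prev_end, alignment) - prev_end


# Hypothetical end address of .text: 4-byte aligned, but not 8-byte aligned.
text_end = 0x0802035C
gap_if_rodata_align_4 = padding_before(text_end, 4)  # no gap
gap_if_rodata_align_8 = padding_before(text_end, 8)  # 4-byte gap
```

Ending .text with ALIGN(8) makes the padding part of .text itself, so a tool that concatenates section data no longer loses those bytes.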
Also interesting is that when I call objcopy to convert the ELF to a bin, without the hack from above, on the "bad build", it looks correct. So objcopy must do something that I don't. [NOTE to self: perhaps I should look at the source code of objcopy]
I chose to use ELF that way because that's what the IDE eats for debugging, and if I want to debug, everything in flash needs to be "like normal", i.e. with the header & CRCs. It's actually in the flash, so that the bootloader can check that a valid app image is in the flash. While I can see other ways to store somewhere that a firmware update was started, and completed - this CRC-over-app-image at startup is so unnoticeably fast at these flash sizes that it seemed like a good idea, and covers more possible problems, incl. gone-bad flash. (I don't expect anywhere near that many updates, to wear out the flash. A broken "update only when needs to" could have trashed the flash early in development, though)