STM32H750 Quad SPI Flash Dual Image

ESawa.1 · ‎2023-08-02

Hello everyone,

We have a board with an STM32H750I connected to a W25Q128 Quad SPI flash. Our application code is stored on the Quad SPI flash and runs via XiP. This works fine. We are now working on a bootloader to update the application code in the field. There is a lot of free space on the external flash, so it would be interesting to keep two images on it. Here is our idea of how the memory map could look.

QUAD-SPI

0x9000 0000 – 0x9000 2000 Bootflags (decides where to jump to)

0x9000 2000 – 0x9080 1000 Application 1

0x9080 1000 – 0x9100 0000 Application 2
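
For reference, the layout above can be written down as constants with a few compile-time checks. This is only a sketch: the names (`QSPI_BASE`, `APP1_BASE`, `select_image_base`, and so on) and the convention that a single flag word selects the image are assumptions for illustration and are not specified in the post. The 4 KiB value is the smallest erase sector of the W25Q128.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses taken from the memory map above; all names are hypothetical. */
#define QSPI_BASE      0x90000000u            /* start of the memory-mapped region */
#define QSPI_SIZE      0x01000000u            /* W25Q128: 128 Mbit = 16 MiB */
#define QSPI_END       (QSPI_BASE + QSPI_SIZE)
#define BOOTFLAGS_BASE QSPI_BASE
#define APP1_BASE      0x90002000u
#define APP2_BASE      0x90801000u
#define APP1_SIZE      (APP2_BASE - APP1_BASE)
#define APP2_SIZE      (QSPI_END - APP2_BASE)
#define FLASH_SECTOR   0x1000u                /* 4 KiB erase sector */

/* Both slots must be able to hold the same image. */
static_assert(APP1_SIZE == APP2_SIZE, "slots differ in size");
/* Slots should start on an erase-sector boundary so they can be erased independently. */
static_assert((APP1_BASE % FLASH_SECTOR) == 0u, "APP1 not sector aligned");
static_assert((APP2_BASE % FLASH_SECTOR) == 0u, "APP2 not sector aligned");

/*
 * Assumed convention: a flag value of 2 selects Application 2, anything else
 * (including erased flash, which reads 0xFFFFFFFF) selects Application 1.
 */
static uint32_t select_image_base(uint32_t boot_flag)
{
    return (boot_flag == 2u) ? APP2_BASE : APP1_BASE;
}
```

With these numbers both slots come out at 0x7FF000 bytes each.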

But we know that the application code is always built for location 0x9000 2000. So when we copy the code to location 0x9000 2000 it works correctly. But when we copy it to location 0x9080 1000 and jump there, it will of course fail. Is there any way to remap the Quad SPI location? Or is the only way to compile it as "Position Independent Code" and take on the burden of copying the GOT and PLT?

Thanks to all of you for your support.

FBL · ‎2023-08-02

Hello @ESawa.1,

Could you explain how you jump to application 2? In other words, how do you copy it to location 0x9080 1000 and jump there? Do you mean in the linker script?

Pavel A. · ‎2023-08-02

No, the H750 does not have any means to remap external flash. So yes, PIC. Another possibility is to run from RAM, if the program code is small enough. Build the program for the RAM address, of course.

Tesla DeLorean · ‎2023-08-02

The Vector Table has to contain Absolute Addresses. You can however copy a version to RAM and fix-up any offsets in the copy. Generally you'd likely want the code to be built in a position independent fashion, something you can check by linking at different addresses, and then diff-ing the output binaries.
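
The check described above (link at two addresses, then diff the binaries) could be automated on the build host roughly as follows. This is a sketch under assumptions: the function name `find_fixups` and the return codes are made up for illustration, and it only recognises aligned 32-bit absolute words such as vector-table entries and literal-pool addresses. Addresses encoded inside instructions (for example MOVW/MOVT pairs) would be reported as unexplained differences rather than handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one little-endian 32-bit word from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Compare two builds of the same image, img_b linked `delta` bytes above img_a.
 * Every word that differs by exactly `delta` is recorded as a fix-up offset.
 * Returns the number of fix-ups found,
 *   -1 if a word differs by something other than `delta` (not a plain address),
 *   -2 if the output table is too small.
 * A return value of 0 means the two builds are identical.
 */
static int find_fixups(const uint8_t *img_a, const uint8_t *img_b, size_t len,
                       uint32_t delta, uint32_t *out_offsets, size_t max_out)
{
    size_t n = 0;

    assert(len % 4u == 0u);  /* images are assumed padded to whole words */

    for (size_t off = 0; off < len; off += 4u) {
        uint32_t a = read_le32(img_a + off);
        uint32_t b = read_le32(img_b + off);

        if (a == b) {
            continue;                 /* position-independent word */
        }
        if ((uint32_t)(b - a) != delta) {
            return -1;                /* unexplained difference */
        }
        if (n == max_out) {
            return -2;                /* table full */
        }
        out_offsets[n++] = (uint32_t)off;
    }
    return (int)n;
}
```

For the memory map in this thread, `delta` would be 0x9080 1000 - 0x9000 2000 = 0x7FF000.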

One of the general difficulties in staging data in external flash is that you can't concurrently read/write/erase and use in memory-mapped mode. Best to handle this in the Internal Flash loader, as well as system clock and interface bring-up.

ESawa.1 · ‎2023-08-02

Hey all of you,

thanks a lot for your fast support. Let me answer some of your questions.

@FBL: The jump would happen in the bootloader. While the bootloader is running, it writes the new application image to address 0x9080 1000. Then it jumps from the bootloader to 0x9080 1000 to run the new application. In the application code the VTOR is relocated to this position. But of course, the application must be built for that position.
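
A minimal sketch of such a jump is shown below, assuming the image starts with a standard Cortex-M vector table (initial stack pointer, then reset handler). The function names are hypothetical. The plausibility check also catches the problem discussed in this thread: an image linked for 0x9000 2000 but stored at 0x9080 1000 still has a reset vector pointing into the first slot. The jump itself is only compiled for the target; the bootloader writing VTOR is optional here, since the application relocates it as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * vt[0] is the initial main stack pointer, vt[1] the reset handler address.
 * Returns true if the reset handler is a Thumb address inside the given slot.
 */
static bool image_matches_slot(const uint32_t vt[2],
                               uint32_t slot_base, uint32_t slot_size)
{
    uint32_t reset = vt[1];

    assert(slot_size != 0u);

    if ((reset & 1u) == 0u) {
        return false;               /* Cortex-M vectors must have bit 0 set */
    }
    reset &= ~1u;
    return reset >= slot_base && (reset - slot_base) < slot_size;
}

#if defined(__ARM_ARCH_7EM__)       /* only built for the Cortex-M target */
static void jump_to_image(uint32_t slot_base)
{
    const uint32_t *vt = (const uint32_t *)slot_base;
    volatile uint32_t *vtor = (volatile uint32_t *)0xE000ED08u; /* SCB->VTOR */
    uint32_t sp = vt[0];
    uint32_t pc = vt[1];

    __asm volatile ("cpsid i");     /* the application must re-enable interrupts */
    *vtor = slot_base;
    __asm volatile ("dsb\n\tisb");
    __asm volatile ("msr msp, %0\n\tbx %1" : : "r"(sp), "r"(pc) : "memory");
}
#endif
```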

@Pavel A.: That's very sad. I had hoped there was an easy way to do it. This feature would be very powerful, as I can imagine that more people would love to place multiple images on the external flash, since it is quite cheap. The RAM is not large enough, so this option is not possible. I am a bit worried about using PIC, as I guess it will have a big impact on performance. Is this assumption right?

@Tesla DeLorean: Let me check if I got you right. You would suggest building the application image once for 0x9000 2000 and once for 0x9080 1000, then diffing the binary output to see where the two files differ. The bootloader will always receive the image built for 0x9000 2000, plus the diff information in some form. When the bootloader writes to 0x9080 1000, it needs to apply the diff afterwards to fix up the differences. Then I can jump to 0x9080 1000 and the image will run. Is this what you suggested? This idea is very cool.
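
The bootloader side of this scheme could look roughly like the sketch below. It assumes the diff information is delivered as a list of byte offsets of 32-bit words that hold absolute addresses; `apply_fixups` and that list format are assumptions for illustration. Because the flash cannot be written while it is memory-mapped, the patching here is done on a RAM buffer before the data is programmed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian helpers so the code does not depend on buffer alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Add `delta` to every 32-bit word listed in `offsets`.
 * All offsets are validated first, so the image is never partially patched.
 * Returns 0 on success, -1 if any offset is misaligned or out of range.
 */
static int apply_fixups(uint8_t *image, size_t len,
                        const uint32_t *offsets, size_t count, uint32_t delta)
{
    assert(image != NULL);

    for (size_t i = 0; i < count; i++) {
        uint32_t off = offsets[i];
        if ((off % 4u) != 0u || len < 4u || off > len - 4u) {
            return -1;
        }
    }
    for (size_t i = 0; i < count; i++) {
        uint8_t *word = image + offsets[i];
        write_le32(word, read_le32(word) + delta);
    }
    return 0;
}
```

For the second slot in this thread, `delta` would be 0x7FF000. Any image checksum would have to be verified before the fix-ups are applied, or be computed over the patched data.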

Best regards,

Eric

Pavel A. · ‎2023-08-03

> The RAM is not large enough, so this option is not possible.

Can you divide the application into several smaller "overlays"? For example, all the initialization and self-test code goes into one overlay, etc.

> To use PIC I am a bit worried, as I guess this will have a big impact to performance. Is this assumption right?

Someone did a test and posted results in the forum (before migration). I cannot find it. IIRC the impact of PIC is noticeable but not too large. XIP in QSPI flash has a bigger impact by itself.

ESawa.1 · ‎2023-08-06

Hey Pavel,

thanks for your reply. Yes, splitting it into chunks could also be a solution. This is a bit difficult now with an almost finished application, but we will think about it. I have also contacted ST to ask for their proposal, as I think this could be a common use case, especially with external flash. Let's see what they say. If they provide good information, I will copy it here. I will also try to find the performance test you mentioned. If we find it, we can link it here as well.

Andreas Bolsch · ‎2023-08-09

Apart from the suggestions above I see only two options, but both are probably not feasible for you:
1) Use two flash devices in parallel except for NCS, and select between both either via Bank1/Bank2 NCS and FSEL in QUADSPI_CR, or via two different pins for Bank1 NCS.

2) Use W25Q512 or similar in 3-byte address mode, and use the flash's internal bank register to supply the topmost address bit. Or use 4-byte address mode, but configure QUADSPI interface to 3-byte address and supply the fourth byte as alternate byte. Of course, that's a hack working only as long as your application fits into 16 MByte. The latter could even be scaled down to 2-byte address and one alternate byte for the W25Q128, but 64 kByte windows would be a pain.

ESawa.1 · ‎2023-08-09

Hey Andreas,

wow, option 2 is a nice suggestion and probably a very good solution for further projects in this field. Of course W25Q512 devices are way more expensive than the W25Q128, but it could be very interesting if the dual image is 100% necessary. Thanks to all of you for the support.