2024-02-27 5:26 AM
There is this erratum for the STM32G0B1:
2.2.10 Prefetch failure when branching across flash memory banks
Description
In rare cases, the code prefetch may fail upon branching and function calls across flash memory banks,
regardless of the DUAL_BANK and nSWAP_BANK option byte settings. The failing prefetch then provides an
incorrect data to the CPU, which causes code execution corruption and may lead to HardFault interrupt.
The workaround is obvious: either disable prefetch when jumping to code in the other bank and enable it again afterwards, or use a small "trampoline" function in RAM. We can live with that, as our software is split into separately compiled "core" and "application" parts loaded into separate banks, with only a handful of interface functions between them, so there would be, e.g., no jumps into library functions in the other bank.
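A minimal sketch of the first workaround (prefetch off around a cross-bank call), only to make the idea concrete. The helper name call_cross_bank and the stand-in register variable are hypothetical; on the target the register would be FLASH->ACR from the CMSIS device header, and the PRFTEN bit position (bit 8) is an assumption to be checked against the reference manual. Whether this fully avoids the erratum is not confirmed by ST, and it does not cover an interrupt arriving while prefetch is still on.

```c
#include <stdint.h>

/* Assumption: PRFTEN is bit 8 of FLASH_ACR on STM32G0; verify in the reference manual. */
#define ACR_PRFTEN (1u << 8)

/* Stand-in for FLASH->ACR so the sketch builds on a host PC.
   On the target, replace with the real register from the CMSIS header. */
static volatile uint32_t fake_flash_acr = ACR_PRFTEN | 2u;
#define FLASH_ACR_REG fake_flash_acr

typedef void (*bank_fn_t)(void);

/* Hypothetical helper: call fn (located in the other bank) with prefetch
   disabled, then restore the previous PRFTEN state after it returns. */
static void call_cross_bank(bank_fn_t fn)
{
    uint32_t saved = FLASH_ACR_REG & ACR_PRFTEN; /* remember previous state */
    FLASH_ACR_REG &= ~ACR_PRFTEN;                /* prefetch off before the branch */
    fn();                                        /* call and return both happen with prefetch off */
    FLASH_ACR_REG |= saved;                      /* restore only if it was on before */
}

/* Demo callee that records what it sees in the register. */
static uint32_t acr_seen_by_callee;
static void demo_callee(void)
{
    acr_seen_by_callee = FLASH_ACR_REG;
}
```

Note that prefetch stays off for the whole duration of the callee, so this only makes sense for short or rarely called interface functions.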
What about exceptions and interrupts, how are they affected?
The ARM literature generally refers to loading the exception handler address from the vector table as fetching. It makes sense on architectures where there is a separate instruction and data bus, as the core can stack a few registers while waiting for the address from the slower flash. This is not the case on Cortex-M0+; however, there could still be an internal distinction between data loads and instruction fetches.
So, what happens when an exception occurs while code is running in bank 1 and SCB->VTOR points to bank 0?
What happens when an interrupt handler running in bank 0 returns to code in bank 1?
Are those kinds of branches affected by this erratum, or only literal BX, BLX and BL instructions? All of them?
Having the vector table and a small handler function in RAM would solve that problem, right?
Oh, and to complicate matters further there is a RTOS (ThreadX) running in bank 0 that might want to return to a different address and mess with the stack.
2024-03-02 1:14 PM - edited 2024-03-02 1:15 PM
2024-03-12 3:34 PM - edited 2024-03-12 3:37 PM
Yeah, this looks super messy, would really like to see a worked example of the failure.
I'm guessing it would impact a lot of things. Sounds like there would need to be a safety region between the two banks at the boundary, so no code crosses it, and no branch/call targets lie within it. This will be a pig to catch, as the linker/user might want to concatenate the banks into a larger linear region.
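As a rough aid for spotting such crossings (e.g. when checking addresses from a map file), a sketch of a bank-crossing test. The name crosses_bank is hypothetical, the flash base of 0x08000000 is the usual STM32 value, and the bank size is passed in because it depends on the device and option bytes; both addresses are assumed to lie inside flash.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumption: main flash is mapped at 0x08000000, as on other STM32 parts. */
#define FLASH_BASE_ADDR 0x08000000u

/* Hypothetical helper: true if a branch from 'from' to 'to' lands in a
   different bank. bank_size is in bytes; both addresses must be in flash. */
static bool crosses_bank(uint32_t from, uint32_t to, uint32_t bank_size)
{
    uint32_t from_bank = (from - FLASH_BASE_ADDR) / bank_size;
    uint32_t to_bank   = (to   - FLASH_BASE_ADDR) / bank_size;
    return from_bank != to_bank;
}
```

This only classifies a single source/target pair; it says nothing about code that runs linearly across the boundary, which would still need a guard region in the linker script.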
Back linking to this thread so it's less hard to find later.. https://community.st.com/t5/stm32-mcus-products/facing-system-config-crystal-and-rtc-crystal-problem-in/td-p/649325
ie large size induced weirdness
2025-11-25 3:11 AM
Just putting a cross-link here to a thread where consequences of this erratum were probably seen in the wild.
2025-11-28 3:37 AM
In my understanding, the prefetch failure on a cross-bank branch is a known issue described in erratum 2.2.10. It can lead to code execution corruption and HardFault interrupts, and it also affects the vector fetch on exception entry and return. Normal branch instructions crossing banks are affected as well. Even on 256KB devices, setting DUAL_BANK=0 will not solve the issue.
Disabling prefetch is the most reliable solution, though it impacts performance. Jumping to RAM would also be a valid workaround if you really know where your code crosses banks (I mean, it's not obvious in all applications).
To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.
2025-11-28 5:45 AM
Hi @FBL ,
Throughout the following discussion, let's assume prefetch is always on.
The main issue is that ST does not publish enough information on the nature of the flaw for users to be able to judge what exactly constitutes "branching and function calls across flash memory banks", under which the prefetch provides incorrect data (data? or just instructions?) to the CPU. There are nuances to this which severely impact the resulting workable options.
For example, your assertion above, that jumping to RAM would also be a valid workaround, does not explain whether we can use short stubs in RAM as trampolines, i.e. intermediate code between code executed from one FLASH bank and code executed from the other FLASH bank. In other words, we don't know if the prefetch unit "knows" about execution from memory other than FLASH, and if that is sufficient to prevent the error from occurring. So, that's my question number 1 (Q1).
But let's assume it does.
The usage mode proposed by @berendi is to consciously partition the code to the two banks, and provide trampolines in RAM to jump between them. For example, most functions would go (by default) into Bank0, and a hand-picked few would go into Bank1; the latter would be accessed in the following way:
// prototypes carry the placement attributes (GCC does not accept them after the declarator of a definition)
void Function2(void) __attribute__((noinline, section(".trampoline")));
void RealFunction2(void) __attribute__((section(".bank1")));

void Function1(void) {
  Function2();
}
// following is the trampoline in RAM
void Function2(void) {
  RealFunction2();
}
// following is the real function
void RealFunction2(void) {
  /* do whatever this function needs to do here */
}
(The linker script ensures that .trampoline is located in RAM (plus the functions are loaded there in the startup code) and .bank1 is located in FLASH Bank1, while unmarked functions are located in FLASH Bank0.)
Q2. The trampoline code, as written above, may be optimized by the compiler into a single direct branch to RealFunction2. When RealFunction2 returns (either through bx lr or some form of pop pc), it effectively jumps from Bank1 to Bank0, but does that constitute "branching and function calls across flash memory banks" as affected by the erratum?
Such a scenario assumes that functions from Bank1 never call functions from Bank0, but that is a relatively common case; as @berendi outlined, it would be a quite natural "application" and "library" partitioning.
(Of course, the other direction can be routed also through trampolines, and it's not impossible to automate the compilation tools to do so for both directions; but the outlined scenario is one which would increase the 'G0Bx usability significantly instantly, just with writing some small extra code, without investing into modified tools).
Q3. As under this scenario code would run from both banks, what happens when interrupts occur? Again, we don't know if reading the interrupt vector is considered data or code from the prefetch unit's viewpoint, and whether reading the interrupt vector and reading the first instruction of the ISR occur before, after, or in between the automatic stacking/unstacking of the registers. From the user's standpoint, there are several possible options:
A. vector table is in FLASH Bank X, ISR code is in either of FLASH Banks
B. vector table is in FLASH Bank X, ISR code is only in FLASH Bank X
C. vector table is in FLASH Bank X, ISR code is in either FLASH Bank, but called through a trampoline in RAM (B. and C. are distinguished by exactly the same issue as in Q2)
D. vector table is in FLASH Bank X, ISR code is entirely in RAM
E. vector table is in RAM, ISR code in either FLASH Bank but called through a trampoline in RAM
F. vector table is in RAM, ISR code is entirely in RAM
Which of these scenarios is usable, without potentially running into the error?
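For options E and F, relocating the vector table to RAM could look roughly like the sketch below. The names relocate_vectors and ram_vectors are hypothetical; the vector count of 48 (16 system entries plus 32 interrupt slots) and the resulting 256-byte alignment are assumptions to be checked against the device's vector table. The returned address is what would then be written to SCB->VTOR on the target; that write is left out so the sketch builds on a host.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: 16 system exceptions + 32 IRQ slots; check the device's vector table. */
#define VECTOR_COUNT 48u

/* RAM copy of the vector table. 48 words = 192 bytes, rounded up to the
   next power of two (256) for the VTOR alignment requirement. */
static uint32_t ram_vectors[VECTOR_COUNT] __attribute__((aligned(256)));

/* Hypothetical helper: copy the table from flash into RAM and return the
   address that would be written to SCB->VTOR on the target. */
static uintptr_t relocate_vectors(const uint32_t *flash_vectors)
{
    memcpy(ram_vectors, flash_vectors, sizeof ram_vectors);
    return (uintptr_t)ram_vectors;
}
```

With the table in RAM, individual entries can then point either at handlers located entirely in RAM (option F) or at RAM trampolines that branch into FLASH (option E); whether E is actually safe depends on the answer to Q2.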
Thanks,
JW
2025-11-28 8:44 AM
Thank you for the explanations and detailed scenarios. An internal ticket (222805) has been submitted to the dedicated team for an explanation from the Flash standpoint on G0.