Why might setting BFB2 cause printf to stop working?

Brian H · ‎2022-11-04

I've written a fairly simple test case for using the system bootloader to boot from flash bank 2 on a STM32F429ZIT6U (developing on a Nucleo F429ZI board). With stdio redirected to a UART, I'm able to direct the code to:

Mass-Erase Bank 2
Copy the entirety of Bank 1 to Bank 2
Set the BFB2 bit

The function that sets BFB2 looks like this:

void boot_bank_select(uint8_t bank_bit) {
   FLASH_AdvOBProgramInitTypeDef optionbits = {
      .OptionType = OPTIONBYTE_BOOTCONFIG,
      .BootConfig = (bank_bit) ? OB_DUAL_BOOT_ENABLE : OB_DUAL_BOOT_DISABLE
   };
   HAL_FLASH_OB_Unlock();
   HAL_FLASHEx_AdvOBProgram(&optionbits);
   printf("Waiting for option bit write...\r\n");
   HAL_StatusTypeDef rc = HAL_FLASH_OB_Launch();
   if(rc == HAL_OK)
      printf("Bank select completed.\r\n");
   else
      printf("Bank select failed.\r\n");
}

By placing a few strategic breakpoints, I can see that once the option bit write takes place, printf starts causing an endless reboot cycle. Code in main() executes up to the first place printf() is called, then it reboots.

Looking at the disassembly, I can see a point where a "bx lr" jumps into the middle of the boilerplate startup code, so I don't think an actual fault or reset is occurring; it's more likely a case of stack corruption. However, the situation persists through hard resets; I can only regain control of the chip by using STProg to clear the BFB2 option bit.

I'm at a loss. Bank 2 is identical to Bank 1; I know this because the code itself copies it directly, and I can download and compare the two banks via STProg and see that they're identical. So why would having the bank switched cause the code to jump to strange places?

I can provide all of the code if necessary.

Brian H · ‎2022-11-14

OH MY $DEITY I SOLVED IT.

Based on this one short paragraph in AN4767 that I managed to gloss over before:

"Beware that the VTOR reset value is zero, in case of BFB2 option active, it will by default point to system memory."

So yes, of course, if BFB2 is set and VTOR isn't re-pointed appropriately, the very first interrupt that comes along after the bank switch will send execution back into system memory. That's why I first saw it in the context of printf, which led to UART interrupts.

If the first instruction in main (or, really, any instruction that executes before the first interrupt) points VTOR back at 0x0800 0000, the core stays happy in application code.
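
For anyone landing here later, a minimal sketch of that fix. The helper name `vtor_relocate` and the macro `APP_VECTOR_BASE` are made up for illustration; the only facts taken from the thread are the base address 0x0800 0000 and the need to write VTOR before the first interrupt. The register is passed in as a pointer so the helper can also be exercised off-target.

```c
#include <stdint.h>

/* Address the application is linked at. With BFB2 set, the booted bank is
   aliased here (value taken from the post above). */
#define APP_VECTOR_BASE 0x08000000UL

/* Hypothetical helper: writes the vector table base through a pointer.
   On the MCU, pass &SCB->VTOR (CMSIS). */
static inline void vtor_relocate(volatile uint32_t *vtor, uint32_t base)
{
    *vtor = base;
    /* On target, a __DSB() / __ISB() pair after this write is a common
       precaution before any interrupt is enabled. */
}
```

On the target this would presumably be called as `vtor_relocate(&SCB->VTOR, APP_VECTOR_BASE);` as the very first statement of main(), ahead of HAL_Init(). HAL_Init() starts the SysTick interrupt, which would fit the later observation in this thread that even a bare blink loop with no printf ended up back in the reset handler.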


gbm · ‎2022-11-05

It may be something related to cache and prefetch buffer operation. Try disabling all of these mechanisms before bank switching. If that doesn't help, move the bank switch routine to RAM and execute it from there. In such a case the routine should not call anything located in Flash, neither HAL nor printf; just write your own equivalent of HAL_FLASH_OB_Launch.

My STM32 stuff on github - compact USB device stack and more: https://github.com/gbm-ii/gbmUSBdevice

Pavel A. · ‎2022-11-05

While waiting for better replies, replace the printfs with direct UART output (HAL_UART_Transmit...).

and call __ISB() after HAL_FLASH_OB_Launch (this may be too late, though...)

Brian H · ‎2022-11-07

Thank you both for your answers. They are sensible suggestions, but there's a wrinkle that perhaps I didn't express clearly: the problem persists after a hard reset / power cycle. I wouldn't expect cache / prefetch / etc. problems to remain after the chip has been fully reset. At that point, the cache and pipeline are flushed anyway, no? Starting from a hard reset, with an identical image in bank 2, I don't understand why the behavior should be any different from bank 1.

I'll still try moving the bank select into RAM (and removing all calls to external methods) just to see what happens.

Pavel A. · ‎2022-11-08

> once the option bit write takes place, printf starts causing an endless reboot cycle. Code in main() executes up to the first place printf() is called

Do you mean that the first printf in your snippet, line 8, already causes reboot?

Are there earlier prints in main() that work?

Brian H · ‎2022-11-08

Here's the first few bits of main(), up to the first printf:

int main(void) {
   HAL_Init();
   SystemClock_Config();
   MX_GPIO_Init();
   MX_CRC_Init();
   MX_USART3_UART_Init();
   printf("Good morning!\r\n");
   printf("Waiting for debugger...\r\n");
   // ...

If BFB2 is set and I set a breakpoint at line 5, this happens:

Hit breakpoint at line 5
"Step over" and hit line 6
"Step over" and hit line 7
"Step over" and back to 1 (the breakpoint at line 5 is hit)

If I use ST-Prog to clear the BFB2 option bit, the entire code runs as expected.

Brian H · ‎2022-11-08

Ok, the printf thing is definitely a red herring. Here's my latest modification to main, which examines the SYSCFG->MEMRMP register to see which bank is booted, and does nothing but run a timer and blink an LED:

int main(void) {
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    MX_CRC_Init();
    MX_USART3_UART_Init();
    uint32_t timer = 0;
    if(SYSCFG->MEMRMP & 0x100) {
        while (1) {
            if ((++timer % BLINK_DELAY) == 0) {
                  uint32_t odr = LD1_GPIO_Port->ODR;
                  LD1_GPIO_Port->BSRR = ((odr & LD1_Pin) << 16) | (~odr & LD1_Pin);
            }
        }
    } else {
        while ((huart3.Instance->SR & 0x20) == 0) {
            if ((++timer % BLINK_DELAY) == 0) {
                  uint32_t odr = LD3_GPIO_Port->ODR;
                  LD3_GPIO_Port->BSRR = ((odr & LD3_Pin) << 16) | (~odr & LD3_Pin);
            }
        }
        peek_char();
    }

The not-remapped version is a bit more complicated so that I can break out of the blink phase and move on to the rest of the application.
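
The bank check itself can be pulled out into a small predicate, sketched below. The mask 0x100 is taken directly from the snippet above; the macro and function names are invented for this example, and the register value is passed in as a plain argument so the check can be tried off-target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask used in the snippet above (bit 8 of SYSCFG->MEMRMP).
   The macro name is made up for this sketch. */
#define MEMRMP_BANK_SWAP_MASK 0x100UL

/* Returns true when the bank-swap bit is set in the given MEMRMP value. */
static inline bool booted_with_banks_swapped(uint32_t memrmp)
{
    return (memrmp & MEMRMP_BANK_SWAP_MASK) != 0U;
}
```

On the target the call would be `booted_with_banks_swapped(SYSCFG->MEMRMP)`, replacing the inline `SYSCFG->MEMRMP & 0x100` test.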

Once again, when BFB2 is cleared, the application runs completely as expected. If BFB2 is set, a breakpoint at line 11 never gets hit and execution winds up back at the top of main. A breakpoint at line 10 does get hit.

I'm really stymied. It doesn't seem to be related to crossing compilation unit boundaries, because HAL_Init() is in a separate compilation unit.

Pavel A. · ‎2022-11-08

Set breakpoints in disassembly view, to be sure.

Even use a hardcoded breakpoint: __BKPT(n)

Then step by instruction.

Brian H · ‎2022-11-08

Thanks for the input, Pavel. That is actually exactly what I've been doing lately. Here's a snippet:

167       			if ((++timer % BLINK_DELAY) == 0) {
0800075e:   ldr     r3, [r7, #36]   ; 0x24
08000760:   adds    r3, #1
08000762:   str     r3, [r7, #36]   ; 0x24
08000764:   ldr     r2, [r7, #36]   ; 0x24
08000766:   lsrs    r3, r2, #3
08000768:   ldr     r1, [pc, #540]  ; (0x8000988 <main+592>)
0800076a:   umull   r1, r3, r1, r3
0800076e:   lsrs    r3, r3, #8
08000770:   movw    r1, #25000      ; 0x61a8
08000774:   mul.w   r3, r1, r3
08000778:   subs    r3, r2, r3
0800077a:   cmp     r3, #0
0800077c:   bne.n   0x800075e <main+38>
169       				  uint32_t odr = LD1_GPIO_Port->ODR;
0800077e:   ldr     r3, [pc, #524]  ; (0x800098c <main+596>)
08000780:   ldr     r3, [r3, #20]
08000782:   str     r3, [r7, #4]

I can put a breakpoint at line 13 (the cmp) and line 16 (the instruction if the branch is not taken). I also put a breakpoint at the top of ResetHandler. If, while the debugger is stopped at line 13, I manually change the value in r3 to 0, single-step, then I wind up at line 16 like I should and, a few single-steps later, the LED changes state. But if I clear the breakpoint at line 13 and just hit "continue", I wind up back in the reset handler. Further, no bits are set in any of the fault registers.

Edit: Expanded the amount of disassembly; updated line number references in the text

Pavel A. · ‎2022-11-09

What does the disassembly show at 0x800075e? Is the stack pointer good?