
[solved][STM32H7-33] HardFault_Handler in application jumping from custom bootloader

simosilva
Senior

Hi!

I'm currently having trouble jumping from a custom bootloader to the application.

Working on STM32H733 MCU, the boot code that makes the jump is the following:

void vSetupAndJumpToAddr(uint32_t flashStartAddr)
{
    uint32_t i = 0;

    // Disable all interrupts
    __disable_irq(); // corresponds to the assembly instruction: CPSID i
    HAL_MPU_Disable();

    HAL_RCC_DeInit();
    HAL_DeInit();

    /* Clear Interrupt Enable Register & Interrupt Pending Register */
    for (i = 0; i < 8; i++)
    {
        NVIC->ICER[i] = 0xFFFFFFFF;
        NVIC->ICPR[i] = 0xFFFFFFFF;
    }

    // Disable I-Cache
    SCB_DisableICache();
    // Disable D-Cache
    SCB_DisableDCache();

    __enable_irq();
    // Jump to user application
    uint32_t JumpAddress = *(__IO uint32_t*)(flashStartAddr + 4);
    pFunction Jump_To_Adress = (pFunction)JumpAddress;

    // re-init stack pointer (first entry of the vector table)
    __set_MSP(*(__IO uint32_t*)flashStartAddr);
    Jump_To_Adress();

    // *** line never reached ***
    Error_Handler();
}

After that, I correctly jump to application with start address 0x0804 0200.

In the Application project, the define "VECT_TAB_OFFSET=0x40200" was inserted to offset the Interrupt Vector Table.

The application starts correctly (I checked both the entry into the startup code startup_stm32h723xx.s and the execution of SCB->VTOR = FLASH_BANK1_BASE | VECT_TAB_OFFSET; which correctly sets the application vector table address to 0x08040200), but then the same application that worked standalone when placed at 0x0800 0000 enters a hard fault.

The hard fault is reached while main.c is still initializing things, when HAL_TIM_PeriodElapsedCallback(htim); is triggered by htim7, which in my case is set as the Timebase Source of the SYS in the CubeMX project.

The application's main.c init code looks like this:

int main(void)
{
  /* USER CODE BEGIN 1 */
  
  __HAL_DBGMCU_FREEZE_IWDG1();		// must be declared using debug session
  __HAL_DBGMCU_FREEZE_TIM1 ();
  __HAL_DBGMCU_FREEZE_TIM4 ();
  __HAL_DBGMCU_FREEZE_TIM6 ();
  __HAL_DBGMCU_FREEZE_TIM12 ();
  __HAL_DBGMCU_FREEZE_TIM15 ();
  __HAL_DBGMCU_FREEZE_TIM16 ();
  __HAL_DBGMCU_FREEZE_TIM23 ();
  __HAL_DBGMCU_FREEZE_TIM3 ();
  __HAL_DBGMCU_FREEZE_I2C5 ();
  __HAL_DBGMCU_FREEZE_WWDG1 ();
  __HAL_DBGMCU_FREEZE_IWDG1 ();
  
  
  /* USER CODE END 1 */
 
  /* MPU Configuration--------------------------------------------------------*/
  MPU_Config();
 
  /* Enable I-Cache---------------------------------------------------------*/
  SCB_EnableICache();
 
  /* Enable D-Cache---------------------------------------------------------*/
  SCB_EnableDCache();
 
  /* MCU Configuration--------------------------------------------------------*/
 
  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
 
  /* USER CODE BEGIN Init */
 
  memset ((void *)0x30004000, 0xEE, 0x00004000); //LWIP_RAM_HEAP_POINTER area
 
  memset ((void *)0x30004640, 0xAA, 0x000039C0); //LWIP_RAM_HEAP_POINTER area
 
  /* USER CODE END Init */
 
  /* Configure the system clock */
  SystemClock_Config();
 
  /* Configure the peripherals common clocks */
  PeriphCommonClock_Config();
 
  /* USER CODE BEGIN SysInit */
 
  /* USER CODE END SysInit */
 
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  MX_DMA_Init();
  MX_ADC1_Init();
  MX_ADC2_Init();
  MX_ADC3_Init();
  MX_FDCAN3_Init();
  MX_I2C5_Init();
  MX_OCTOSPI1_Init();
  MX_TIM1_Init();
  MX_TIM4_Init();
  MX_TIM12_Init();
  MX_TIM15_Init();
  MX_TIM16_Init();
  MX_TIM23_Init();
  MX_UART4_Init();
  MX_UART5_Init();
  MX_UART9_Init();
  MX_USART3_Init();
  MX_USART10_Init();
  MX_TIM3_Init();
  MX_DAC1_Init();
  MX_TIM6_Init();
  MX_CRC_Init();
  MX_RTC_Init();
  /* USER CODE BEGIN 2 */
 
  //__enable_irq();
 
	// HW initialization 
  
  HAL_GPIO_WritePin([blablabla]_GPIO_Port, [blablabla]_Pin, GPIO_PIN_RESET);
   
  HAL_TIM_Base_Start (&htim6);                                                          // Start TIM6.
 
  BSP_Capture_Timers_Init ();
  BSP_Analog_Init();			// 	HW board support package for Analog Components	
 
 
[ ... other code never reached before the error ... ]

Any clue on the problem?

I tried enabling and disabling IRQs in different ways without success; it might be something small I cannot see.

It is for sure an interrupt problem: if __disable_irq() is called at the beginning of the code, the hard fault is never reached.

Thanks 🙏

1 ACCEPTED SOLUTION

simosilva
Senior

EUREKA!

The problem was in the offset of the VTOR.

Looking at the size of __Vectors for the STM32H733, the length is 0x2CC, so I had one last idea: try a different address for the application. I moved it from 0x08040200 to 0x08040400, AND IT WORKED FIRST TRY! (Obviously I changed the jump in the bootloader and all the scatter files accordingly.)

Note: the vector table is loaded starting FROM the specified address, so it had all the space it needed in memory to hold the complete __Vectors, and it was on a multiple of 0x200 as specified in the manuals, so I was following all the rules written there.

The big issue is that in the ST reference/programming manuals the reserved area is always 9 bits, not 10!

In PM0253, section 4.3.4: "The table alignment requirements mean that bits [8:0] of the table offset are always zero." So a multiple of 0x200, not 0x400!
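For anyone landing here later, the alignment arithmetic can be sketched on a PC. This assumes the general Arm guidance for ARMv7-M cores as I understand it: the vector table has to be aligned to its own size rounded up to the next power of two (number of entries rounded up, times 4 bytes). The function names are made up for illustration and are not part of CMSIS or the HAL.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= n (n must be > 0). */
static uint32_t next_pow2(uint32_t n)
{
    uint32_t p = 1u;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* Size-derived alignment (in bytes) for a vector table of table_bytes bytes.
 * Hypothetical helper: entries are 4 bytes each, and the entry count is
 * rounded up to the next power of two. */
uint32_t vtor_required_alignment(uint32_t table_bytes)
{
    assert(table_bytes > 0u && table_bytes <= 0x80000000u);
    uint32_t entries = (table_bytes + 3u) / 4u;
    return next_pow2(entries) * 4u;
}

/* Returns 1 if addr satisfies the size-derived alignment, 0 otherwise. */
int vtor_address_ok(uint32_t addr, uint32_t table_bytes)
{
    return (addr & (vtor_required_alignment(table_bytes) - 1u)) == 0u;
}
```

With the 0x2CC-byte table from this thread (179 entries, rounded up to 256), the sketch gives 0x400, so 0x08040200 fails the check and 0x08040400 passes. It only models the size-derived requirement; any minimum alignment stated in the core documentation still applies on top.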

@Community member​ maybe you know some good ST employee to tag here to correct this precious info?

I searched so many different blogs and forums for the solution, but no one had the same issue, simply because typically the application starts at addresses like 0x08###000, with the last three hex digits at 0, so this issue just didn't appear...

14 REPLIES
TDK
Guru

What do the SCB registers indicate as the reason for the hard fault?

> VECT_TAB_OFFSET=0x40200

> that correctly setup the Application VIT address at 0x08040000

Shouldn't it be 0x08040200?


The control transfer looks convoluted.

The GOAL is to tear-down the interrupts you have running, do it purposefully rather than shot-gun the entire thing.

Sounds like you still have one or more interrupts still enabled on the peripheral side, and code that depends on structures that the startup.s code zeros out.

You should perhaps move the __enable_irq() to the far-side of the SystemInit() code AFTER it's pointed SCB->VTOR at the correct location.

Any benefit to disabling the caches here? Do you ensure the content is flushed to memory?

Any benefit to standing up the clocks and PLLs a second time? Perhaps the loader can own that?

alister
Lead

>After that, I correctly jump to application with start address 0x0804 0200.

>In the Application project, the define "VECT_TAB_OFFSET=0x40200" was inserted to offset the Interrupt Vector Table.

>Application VIT address at 0x08040000), but then the same application that worked alone placed at 0x0800 0000 enter in hard fault.

Your terms "start address" etc above are contradictory and confusing.

The app's linker script file defines where the app's vector table is; alternatively, consult your link map file.

Typically it is at the start of the app. But doesn't have to be.

Ref PM0253 section 4.3.4, VTOR bits 0..8 are reserved, so the vector table must be 0x200-byte aligned.

If flashStartAddr in your bootloader's vSetupAndJumpToAddr function is 0x8040200, then it should find your app's entry point (reset vector) ok.
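As a rough illustration of "finding the entry point ok", a bootloader could sanity-check the two words it is about to use before jumping. The sketch below is plain C that can be tried on a PC; the type and function names are hypothetical, and the RAM and flash ranges are supplied by the caller because they depend on the part and the linker setup.

```c
#include <assert.h>
#include <stdint.h>

/* Address range, end exclusive. Hypothetical helper type. */
typedef struct {
    uint32_t start;
    uint32_t end;
} addr_range;

/* Checks the first two vector table words of an application image:
 * word 0 is the initial stack pointer, word 1 is the reset vector.
 * Returns 1 if both look plausible, 0 otherwise. */
int app_vectors_plausible(uint32_t initial_sp, uint32_t reset_vector,
                          addr_range ram, addr_range app_flash)
{
    assert(ram.start < ram.end && app_flash.start < app_flash.end);

    /* The stack grows downwards, so the initial SP may equal ram.end. */
    if (initial_sp <= ram.start || initial_sp > ram.end) {
        return 0;
    }
    /* The stack pointer must be at least word aligned. */
    if ((initial_sp & 0x3u) != 0u) {
        return 0;
    }
    /* Cortex-M executes Thumb code only, so bit 0 of the vector must be set. */
    if ((reset_vector & 0x1u) == 0u) {
        return 0;
    }
    /* The handler itself must live inside the application image. */
    uint32_t handler = reset_vector & ~0x1u;
    return handler >= app_flash.start && handler < app_flash.end;
}
```

Erased flash (both words 0xFFFFFFFF) is rejected by the stack pointer check. Passing this test only shows that the reset vector is reachable; it says nothing about whether the table address is valid for SCB->VTOR.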

>The application starts correctly (I checked both the entrance in the startup code startup_stm32h723xx.s, the execution of SCB->VTOR = FLASH_BANK1_BASE | VECT_TAB_OFFSET; that correctly setup the Application VIT address at 0x08040000), but then the same application that worked alone placed at 0x0800 0000 enter in hard fault.

The app should be assigning SCB->VTOR = 0x8040200.

>Hard fault is reached when main.c is initializing stuffs, when HAL_TIM_PeriodElapsedCallback(htim); is hit by the htim7, ...

Best practice is for the bootloader to launch the app as though it were coming out of reset, e.g. CPS interrupts disabled (__disable_irq), NVIC all interrupts disabled, SysTick disabled (SysTick->CTRL = 0), caches disabled.

Likely, as others are saying, your app is taking an interrupt for something that is not vectored or not initialised properly.

Your vSetupAndJumpToAddr does this incorrectly.

    for (i=0;iICER[i]=0xFFFFFFFF;
	    NVIC->ICPR[i]=0xFFFFFFFF;
    }

This instead.

  for (i = 0; i < 8; i++)
  {
    NVIC->ICER[i] = 0xFFFFFFFF;
    NVIC->ICPR[i] = 0xFFFFFFFF;
  }

The app should enable interrupts as it completes its initialisation.

If I understood the logic, the offset is correct: it is just the offset from the start address of flash bank 1, which is 0x0800 0000. I corrected 0x08040000 to 0x08040200 in the post; I wrote it wrong, thanks!

For the registers, here is all the information I can gather from Keil:

0693W00000Dmd1lQAB.jpg 

I will continue debugging and checking things in the meanwhile.

Thanks for the quick answer!

Hi @Community member​ , nice to see you there!

I'm trying to "shot-gun" everything so that the code is easily reusable in other similar projects on the same MCU without any adjustments; as I can see, understanding and tearing down the specific case (which I'm very interested in) will eventually become necessary, as in this case.

Moving the __enable_irq() to the application side, just before BSP_Capture_Timers_Init(); (which requires it), also results in a HardFault.

The caches are disabled following the best practices I have found (and also the comments below, for example). No data is shared between bootloader and application on the MCU, so I think not flushing the caches is not a problem either.

If I understood your question, the clocks and PLLs are set up a second time simply because of the CubeMX code generation, that's it.

simosilva
Senior

Hi @alister​ !

The confusion came from the mismatch of addresses, which is now corrected to 0x08040200, the flash address where the application is actually placed.

My scatter file looks like that:

LR_IROM1 0x08040000 0x00080000     ; load region in Flash1
{
  [ProjectName].bin 0x08040000 0x00000200   ; Application header
  {
   .ANY (+RO)
  }
 
  ER_IROM2 0x08040200 0x0007FE00  {  ; load address = execution address
   *.o (RESET, +First)
   *(InRoot$$Sections)
   .ANY (+RO)
   .ANY (+XO)
  }
 
[... other negligible stuff ...]

The first area of 0x200 is a header used in other projects, forced here by "__attribute__((at(0x08040000)))" and correctly filled in.

I'm confident that the application is in the right place, since the jump from the custom bootloader is made and in debug I see "LDR   R0, =SystemInit" correctly reached in the startup file, and from there the jump into main.c.

SCB->VTOR = 0x8040200 is verified, correct.

The NVIC->ICER[i] and NVIC->ICPR[i] clear you mentioned is clearly wrong in my copy-pasted code; I actually have the same for loop as the one you suggested. It is now edited in the initial question, thanks for pointing it out.

I then added the SysTick disable, but the problem is not solved.

More troubleshooting is coming! 😎

Good reasons for the bootloader to not "__enable_irq();" before launching the app:

  1. bootloader may not have disabled something that interrupts
  2. some parts of the app init may assume one or all interrupts are disabled, and may be left in an invalid state if a peripheral interrupts before its initialisation completes
  3. an RTOS scheduler would typically enable interrupts when it is ready for its first task to execute

SCB -> HFSR -> FORCED = 1

BFSR -> PRECISERR = 1

BFSR -> BFARVALID = 1

BFAR = 0x00070210, pointing to the Reserved area 0x0001000 - 0x07FFFFFF
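For reference, the flags above can be decoded from the raw register values with a few masks. This is a minimal host-side sketch, assuming the standard ARMv7-M layout (FORCED is bit 30 of HFSR; BFSR is byte 1 of CFSR, so PRECISERR is CFSR bit 9 and BFARVALID is CFSR bit 15). The struct and function names are invented for illustration; on target you would pass SCB->HFSR, SCB->CFSR and SCB->BFAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decoded subset of the fault status flags (hypothetical helper type). */
typedef struct {
    int forced;     /* HFSR.FORCED: a configurable fault escalated to HardFault */
    int precise;    /* BFSR.PRECISERR: stacked PC points at the faulting instruction */
    int bfar_valid; /* BFSR.BFARVALID: BFAR holds the faulting address */
} bus_fault_info;

bus_fault_info decode_bus_fault(uint32_t hfsr, uint32_t cfsr)
{
    bus_fault_info f;
    f.forced     = (int)((hfsr >> 30) & 1u);
    f.precise    = (int)((cfsr >> 9) & 1u);
    f.bfar_valid = (int)((cfsr >> 15) & 1u);
    return f;
}

/* Writes a one-line summary into buf and returns the snprintf result. */
int format_bus_fault(char *buf, size_t len,
                     uint32_t hfsr, uint32_t cfsr, uint32_t bfar)
{
    assert(buf != NULL && len > 0);
    bus_fault_info f = decode_bus_fault(hfsr, cfsr);
    if (f.bfar_valid) {
        return snprintf(buf, len, "forced=%d precise=%d bfar=0x%08lX",
                        f.forced, f.precise, (unsigned long)bfar);
    }
    return snprintf(buf, len, "forced=%d precise=%d bfar=invalid",
                    f.forced, f.precise);
}
```

Feeding it HFSR = 0x40000000 and CFSR = 0x00008200 reproduces the three flags listed above; those raw values are constructed from the reported flags, not read from the screenshot.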

simosilva
Senior

Update: still blocked on this hard fault.

Checking the registers, the fault is a bus fault with PRECISERR=1; from the application note: "the PC value stacked for the exception return points to the instruction that caused the fault."

BFARVALID = 1 and the address stored in BFAR is 0x00070210.

In debug mode, I see strange behaviour.

0x080454E2 BDF8   POP      {r3-r7,pc}

The POP is executed as the last operation of "HAL_ADCEx_Calibration_Start"; here is the local stack:

0693W00000Dn0dwQAB.jpg 

The faulting code sits at a POP instruction: the POP is executed when the stack is empty, which results in r4 (the register that holds the address of timer 7 in this case) becoming 0xE000ED14. The timer handler is then called with the wrong address, and the HardFault occurs!

I tried adding more instructions on the bootloader side to disable all the running peripherals. All of them were found in different forums and threads about custom bootloader jumps, and none of them seems to solve the issue.

For this reason, I think it's something purely on the application side.

Checked SCB->VTOR: correctly offset.

Checked the .map file, and __Vectors is in the correct place:

__Vectors                                0x08040200   Data           4  startup_stm32h723xx.o(RESET)

RESET also seems to be in the correct place:

RESET                                    0x08040200   Section      716  startup_stm32h723xx.o(RESET)
!!!main                                  0x080404cc   Section        8  __main.o(!!!main)
!!!scatter                               0x080404d4   Section       52  __scatter.o(!!!scatter)

Any clue?