
When using a custom bootloader, application hard faults

geirlandunico
Associate II

Hi all,

 

I put together a custom bootloader for my application (to allow the program to update itself).  The bootloader itself is incredibly simple.  It just reads from a known location in flash to determine which address to load from and then attempts to boot to that address.  The main() function for that bootloader is included below (the rest of the code is auto generated minus some defines):

int main(void)
{

  /* USER CODE BEGIN 1 */

  /* USER CODE END 1 */

  /* MPU Configuration--------------------------------------------------------*/
  MPU_Config();

  /* MCU Configuration--------------------------------------------------------*/

  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();

  /* USER CODE BEGIN Init */

  /* USER CODE END Init */

  /* Configure the system clock */
  SystemClock_Config();

  /* USER CODE BEGIN SysInit */

  /* USER CODE END SysInit */

  /* Initialize all configured peripherals */
  /* USER CODE BEGIN 2 */

  uint32_t load_addr;
  void (*appl_ptr)(void);

  // If the magic number address contains our magic number (saying it's ok to use that
  // program space) then use it; if not, it's corrupted and we need to boot from the
  // gold working space
  if ((*(__IO uint32_t*)SEMAPHORE_ADDR) == 1) {
	  load_addr = PROG1_ADDR;
  } else if ((*(__IO uint32_t*)SEMAPHORE_ADDR) == 2) {
	  load_addr = PROG2_ADDR;
  } else {
	  load_addr = GOLD_ADDR;
  }
  // disable irq and set vector table location
  __disable_irq();
  SCB->VTOR = load_addr;
  // Clear interrupts?
	for (uint8_t i = 0; i < 8; i++)
	{
		NVIC->ICER[i]=0xFFFFFFFF;
		NVIC->ICPR[i]=0xFFFFFFFF;
	}
  // Force hardcoded address (+4 for first command) to application pointer
  appl_ptr = (void (*)())(load_addr + 4);
  // Other stuff a bootloader example did
  HAL_RCC_DeInit();
  HAL_DeInit();
  SysTick->CTRL = 0;
  SysTick->LOAD = 0;
  SysTick->VAL = 0;
  __HAL_RCC_SYSCFG_CLK_ENABLE();
  // Set stack pointer to the value at the beginning of that space
  __set_MSP(*(__IO uint32_t*) load_addr);
  // Jump!
  appl_ptr();

  /* USER CODE END 2 */

  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
  while (1)
  {
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
	  // SHOULD NEVER GET HERE
  }
  /* USER CODE END 3 */
}

 

Right now that semaphore location is set to 0 so it attempts to boot from the GOLD_ADDR location (0x080C0000).
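
The slot selection described above can be sketched as a small pure function, which makes it easy to check on a host before it goes into the bootloader. This is only a sketch: GOLD_ADDR matches the 0x080C0000 value given in the post, but the PROG1_ADDR and PROG2_ADDR values below are placeholders invented for illustration, and select_load_addr is a hypothetical helper name. Unlike the posted code, it takes the semaphore value as an argument, so the flash location only needs to be read once.

```c
#include <assert.h>
#include <stdint.h>

#define GOLD_ADDR  0x080C0000u  /* value given in the post */
#define PROG1_ADDR 0x08040000u  /* placeholder value, not from the post */
#define PROG2_ADDR 0x08080000u  /* placeholder value, not from the post */

/* Map the semaphore word to a load address.
 * Anything other than 1 or 2 falls back to the gold image. */
uint32_t select_load_addr(uint32_t semaphore)
{
    switch (semaphore) {
    case 1u:  return PROG1_ADDR;
    case 2u:  return PROG2_ADDR;
    default:  return GOLD_ADDR;
    }
}

/* Host-side check of the fallback behaviour. */
void select_load_addr_example(void)
{
    assert(select_load_addr(0u) == GOLD_ADDR);          /* the case in the post */
    assert(select_load_addr(0xFFFFFFFFu) == GOLD_ADDR); /* erased flash word */
}
```

In the bootloader itself, the argument would be the single volatile read of SEMAPHORE_ADDR.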


If I program the application by itself (after changing the STM32H735VGHX_FLASH.ld file so that flash starts at 0x080C0000 and changing VECT_TAB_OFFSET in system_stm32h7xx.c to 0x000C0000U), it works perfectly fine.  When debugging directly, I can verify from the PC that we're executing in the 0x080C0000+ space.  I can also verify that VECT_TAB_OFFSET is being applied correctly: when I set it back to 0, interrupts no longer function correctly, and at 0xC0000 they do.

 

If I attempt to boot with that bootloader code above though, I end up hitting the application's hard fault handler (shown below):

geirlandunico_0-1729176631462.png

I can verify in the .map file that the application's hard fault handler is at 0x080c0a54.  The stack trace seems to imply the hard fault triggers upon calling the instruction at 0x80C0004.  This is confusing to me because the hard fault seems to happen sometime after jumping but before reaching my application's main function.  I think that because after the __set_MSP() call in the bootloader main function above, the stack pointer is set to 0x24050000, but when the hard fault occurs it's equal to 0x2404ffe0 (the same value it seems to have when first reaching the main function).  If I put an infinite loop at the start of my application's main function, though, I still end up in the hard fault when using the bootloader instead of stuck in that loop.

 

On the other hand, looking at 0x80C0004, it contains 0x80C0D3D, which seems like the address of the reset handler (0x80C0D3C) but not word aligned.  Is it possible that this is where the issue is coming from?  If so, why does that value get generated when compiling and programming the application project, and why does it work when running directly but not when using the bootloader?  Looking at the bootloader's equivalent address (0x8000004), it contains 0x8000691, which is also the reset handler + 1.

 

Is there something else I'm missing?

Accepted Solutions
geirlandunico
Associate II

Ah I got it!

 

I was calling the 0x80C0004 location as if it were an instruction instead of calling the address stored at that location.  I needed to replace the appl_ptr assignment in that function like so:

 

//appl_ptr = (void (*)())(load_addr + 4);
appl_ptr = (void (*)())*((uint32_t*)(load_addr + 4));
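
For anyone landing here later, the difference between the two lines can be tried out on a host. The sketch below is not bootloader code: it uses a two-word array as a stand-in for the start of a Cortex-M vector table (word 0 is the initial stack pointer, word 1 is the reset vector), and fake_reset_handler, vector_initial_sp and vector_reset_entry are hypothetical names made up for the example. The point is that the entry point is the value stored in word 1, not the address of word 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*entry_fn)(void);

/* Stand-in for the application's Reset_Handler (hypothetical). */
static int reset_calls = 0;
static void fake_reset_handler(void) { reset_calls++; }

/* Word 0 of the table: the value the bootloader would pass to __set_MSP(). */
uintptr_t vector_initial_sp(const uintptr_t *vector_table)
{
    return vector_table[0];
}

/* Word 1 of the table: read the stored pointer, then treat it as a function. */
entry_fn vector_reset_entry(const uintptr_t *vector_table)
{
    return (entry_fn)vector_table[1];
}

/* Builds a fake table, fetches the entry point and calls it once. */
int jump_example(void)
{
    const uintptr_t table[2] = {
        (uintptr_t)0x24050000u,          /* initial SP value seen in the thread */
        (uintptr_t)&fake_reset_handler   /* reset vector */
    };
    entry_fn entry = vector_reset_entry(table);

    assert(entry != NULL);
    assert(vector_initial_sp(table) == (uintptr_t)0x24050000u);
    entry();                             /* "jump" to the application */
    return reset_calls;
}
```

On the target, the table pointer would be load_addr cast to a pointer to 32-bit words, which is what the corrected line above does with index 1 written as load_addr + 4.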

 

Thank you for your help!

8 REPLIES

>>Looking at the stack trace it seems to imply the hardfault triggers upon the calling the instruction at 0x80C0004. 

It's NOT an INSTRUCTION, it's a pointer in a Vector Table

>>On the other hand though, looking at 0x80C0004, it contains 0x80C0D3D which seems like the address to the reset handler (0x80C0D3C) but not word aligned.

Correct, and it's ODD because it's 16-bit THUMB(2) code. Even addresses imply 32-bit ARM code, which the CMx MCUs don't run.
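
As a rough illustration of the odd/even point, bit 0 of a vector entry can be separated from the code address like this. The helper names are made up for the example; the values are the ones quoted in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a vector entry selects Thumb state; it is not part of the address. */
bool vector_is_thumb(uint32_t vector)
{
    return (vector & 1u) != 0u;
}

/* The handler's first instruction sits at the entry with bit 0 cleared. */
uint32_t vector_code_address(uint32_t vector)
{
    return vector & ~UINT32_C(1);
}

void thumb_bit_example(void)
{
    assert(vector_is_thumb(0x080C0D3Du));                    /* application reset vector */
    assert(vector_code_address(0x080C0D3Du) == 0x080C0D3Cu); /* Reset_Handler address */
    assert(vector_is_thumb(0x08000691u));                    /* bootloader reset vector */
    assert(!vector_is_thumb(0x080C0004u));                   /* address of the table slot: even */
}
```

Branching through a function pointer with bit 0 clear asks a Cortex-M core to leave Thumb state, which it cannot do, so an even target such as 0x080C0004 is expected to fault.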

Consider a more helpful HardFault_Handler() outputting actionable data

Treat the vector table as holding Function Pointers (like you'd use with qsort())

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

https://www.st.com/resource/en/programming_manual/pm0253-stm32f7-series-and-stm32h7-series-cortexm7-processor-programming-manual-stmicroelectronics.pdf

Joseph Yiu has other books on assorted Cortex-Mx, but not M7, but these would provide at least a foundation on the architecture.

https://github.com/cturvey/RandomNinjaChef/blob/main/KeilHardFault.c

 

geirlandunico
Associate II

Ah I see my bad on the understanding of the value at 0x80C0004.

 

Looking at the data grabbed in that KeilHardFault example:

geirlandunico_0-1729184700600.png

geirlandunico_1-1729184800217.png

The msp is 0x24050000 so I'm assuming the sp being 2404ffe0 in the hardfault handler is due to dumping the register info onto the stack.  The CFSR points towards this being related to the EPSR.  The register values list the iepsr when in the hardfault as being the default value of 0x1000000.

 

I'm not sure I understand enough of the ARM instruction set and core structure to make much of this.  From what I read of someone's description of this error online, it seems to often come from mistakes in assembly code (which I shouldn't have touched).  Is it possible I'm calling the wrong code or jumping to the wrong location?

Yes, I'm not sure what instructions it's running.

Typically one might step the debugger through the transfer point, and see how far into the Reset_Handler() on the app side it gets.

Generally you'd set the SCB->VTOR in SystemInit(), ST uses defines, I prefer to use the symbol for the Vector Table, ie g_pfnVectors, to establish where the build/linker put it.

You likely don't want to initialize the clocks/PLLs again, as a) they should already be running, and b) you can't change the PLL whilst running, so it needs a two-stage process to move the MCU off the PLL, then back to a PLL with potentially different settings. Ditto External Memories, you usually don't want to bring them up twice.

Watch for code ending up in Error_Handler(), I prefer the Error_Handler(__FILE__,__LINE__) variant for tracking down the source of the thrown error.

The SP pre-decrements, so it being just beyond the end of RAM should be fine.

The HardFault will occur if you touch memory that's not currently addressable.

Watch also reading unprogrammed FLASH, or situations where the ECC will fail.
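
One way to act on the last few points is to sanity-check the first two words of the image before jumping. The following is only a sketch under assumptions: the RAM window is inferred from the 0x24050000 stack pointer seen in this thread, the flash window assumes a 1 MB part starting at 0x08000000, and image_looks_bootable is a hypothetical helper. The real limits should come from your linker script.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed memory windows; adjust to match the actual linker script. */
#define APP_FLASH_BASE 0x08000000u
#define APP_FLASH_END  0x08100000u  /* exclusive */
#define APP_RAM_BASE   0x24000000u
#define APP_RAM_END    0x24050000u  /* one past the last RAM byte */

bool image_looks_bootable(uint32_t initial_sp, uint32_t reset_vector)
{
    /* Erased flash reads back as all ones. */
    if (initial_sp == 0xFFFFFFFFu || reset_vector == 0xFFFFFFFFu) {
        return false;
    }
    /* SP pre-decrements, so pointing just past the end of RAM is acceptable. */
    if (initial_sp <= APP_RAM_BASE || initial_sp > APP_RAM_END) {
        return false;
    }
    /* The reset vector must have the Thumb bit set. */
    if ((reset_vector & 1u) == 0u) {
        return false;
    }
    /* The handler itself must live inside flash. */
    uint32_t code_addr = reset_vector & ~UINT32_C(1);
    return code_addr >= APP_FLASH_BASE && code_addr < APP_FLASH_END;
}

void image_check_example(void)
{
    assert(image_looks_bootable(0x24050000u, 0x080C0D3Du));  /* values from this thread */
    assert(!image_looks_bootable(0xFFFFFFFFu, 0xFFFFFFFFu)); /* unprogrammed slot */
    assert(!image_looks_bootable(0x24050000u, 0x080C0004u)); /* even "entry point" */
}
```

A check like this does not prove the image is good, it only rejects the obviously empty or malformed cases before control is handed over.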

geirlandunico
Associate II

I did try to follow along with the debugger, but stepping into the application pointer just immediately hard faults too.  I'm not sure if that's just because I have the STM32 IDE improperly set up or if it's truly failing on the initial jump itself.  What I just noticed, though, is that when stepping it seems to hard fault at a different instruction address, 0x80c02cc (which seems to be empty space between the isr_vector and the .text section), than if I just run it.  Not sure what to make of that at all.

geirlandunico_0-1729195720671.png

geirlandunico_1-1729197430308.png

 

The CFSR lists a different reason for hard faulting in that case.  Instead of an EPSR issue, it's an undefined instruction fault. (CFSR of 0x10000).  That makes perfect sense as 0x80c02cc is unwritten (it contains 0xFFFFFFFF).  
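
For reference, the UsageFault flags sit in the upper half-word of CFSR, so the two values discussed here can be told apart with a small decoder. This is a host-side sketch with made-up names (usage_fault_t, classify_usage_fault); it only reports the lowest-numbered UsageFault bit that is set and ignores the MemManage and BusFault bytes. It assumes the earlier EPSR-related fault was INVSTATE, which would read as 0x00020000.

```c
#include <assert.h>
#include <stdint.h>

/* UsageFault status bits within the 32-bit CFSR. */
#define CFSR_UNDEFINSTR (UINT32_C(1) << 16)
#define CFSR_INVSTATE   (UINT32_C(1) << 17)
#define CFSR_INVPC      (UINT32_C(1) << 18)
#define CFSR_NOCP       (UINT32_C(1) << 19)
#define CFSR_UNALIGNED  (UINT32_C(1) << 24)
#define CFSR_DIVBYZERO  (UINT32_C(1) << 25)

typedef enum {
    UF_NONE, UF_UNDEFINSTR, UF_INVSTATE, UF_INVPC,
    UF_NOCP, UF_UNALIGNED, UF_DIVBYZERO
} usage_fault_t;

/* Return the first UsageFault cause found in a raw CFSR value. */
usage_fault_t classify_usage_fault(uint32_t cfsr)
{
    if (cfsr & CFSR_UNDEFINSTR) return UF_UNDEFINSTR;
    if (cfsr & CFSR_INVSTATE)   return UF_INVSTATE;
    if (cfsr & CFSR_INVPC)      return UF_INVPC;
    if (cfsr & CFSR_NOCP)       return UF_NOCP;
    if (cfsr & CFSR_UNALIGNED)  return UF_UNALIGNED;
    if (cfsr & CFSR_DIVBYZERO)  return UF_DIVBYZERO;
    return UF_NONE;
}

void cfsr_example(void)
{
    assert(classify_usage_fault(0x00010000u) == UF_UNDEFINSTR); /* value when stepping */
    assert(classify_usage_fault(0x00020000u) == UF_INVSTATE);   /* invalid EPSR state */
    assert(classify_usage_fault(0u) == UF_NONE);
}
```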

 

I will definitely make a note of that for the clocks/PLLS and remove those from the application.

 

I'm not sure why doing the step into and just running would have different results like this.

Dump all the vectors in the 0x080C0000..0x080C02CB vector table as uint32_t

It would appear you've transitioned SCB->VTOR from 0x08000000 to 0x080C0000 at this point.

Perhaps you have a whole bunch of IRQ's running, including SysTick, which are going to trigger into code that's not ready/initialized to handle them.

Go and disable all your current interrupt sources.


Shouldn't the bootloader code I posted be disabling all IRQs?  I'm not only running __disable_irq(), I'm also setting:

 

  HAL_RCC_DeInit();
  HAL_DeInit();
  SysTick->CTRL = 0;
  SysTick->LOAD = 0;
  SysTick->VAL = 0;
  __HAL_RCC_SYSCFG_CLK_ENABLE();

and I think I should be clearing all interrupts with:

	for (uint8_t i = 0; i < 8; i++)
	{
		NVIC->ICER[i]=0xFFFFFFFF;
		NVIC->ICPR[i]=0xFFFFFFFF;
	}

 

Here is the full vector table:

	0x80c0000[0]	int	0x24050000 (Hex)	
	0x80c0000[1]	int	0x80c0d3d (Hex)	
	0x80c0000[2]	int	0x80c0a4d (Hex)	
	0x80c0000[3]	int	0x80c0a55 (Hex)	
	0x80c0000[4]	int	0x80c0a5d (Hex)	
	0x80c0000[5]	int	0x80c0a65 (Hex)	
	0x80c0000[6]	int	0x80c0a6d (Hex)	
	0x80c0000[7]	int	0x0 (Hex)	
	0x80c0000[8]	int	0x0 (Hex)	
	0x80c0000[9]	int	0x0 (Hex)	
	0x80c0000[10]	int	0x0 (Hex)	
	0x80c0000[11]	int	0x80c0a75 (Hex)	
	0x80c0000[12]	int	0x80c0a83 (Hex)	
	0x80c0000[13]	int	0x0 (Hex)	
	0x80c0000[14]	int	0x80c0a91 (Hex)	
	0x80c0000[15]	int	0x80c0a9f (Hex)	
	0x80c0000[16]	int	0x80c0d8d (Hex)	
	0x80c0000[17]	int	0x80c0d8d (Hex)	
	0x80c0000[18]	int	0x80c0d8d (Hex)	
	0x80c0000[19]	int	0x80c0d8d (Hex)	
	0x80c0000[20]	int	0x80c0d8d (Hex)	
	0x80c0000[21]	int	0x80c0d8d (Hex)	
	0x80c0000[22]	int	0x80c0d8d (Hex)	
	0x80c0000[23]	int	0x80c0d8d (Hex)	
	0x80c0000[24]	int	0x80c0d8d (Hex)	
	0x80c0000[25]	int	0x80c0d8d (Hex)	
	0x80c0000[26]	int	0x80c0d8d (Hex)	
	0x80c0000[27]	int	0x80c0d8d (Hex)	
	0x80c0000[28]	int	0x80c0d8d (Hex)	
	0x80c0000[29]	int	0x80c0d8d (Hex)	
	0x80c0000[30]	int	0x80c0d8d (Hex)	
	0x80c0000[31]	int	0x80c0d8d (Hex)	
	0x80c0000[32]	int	0x80c0d8d (Hex)	
	0x80c0000[33]	int	0x80c0d8d (Hex)	
	0x80c0000[34]	int	0x80c0d8d (Hex)	
	0x80c0000[35]	int	0x80c0d8d (Hex)	
	0x80c0000[36]	int	0x80c0d8d (Hex)	
	0x80c0000[37]	int	0x80c0d8d (Hex)	
	0x80c0000[38]	int	0x80c0d8d (Hex)	
	0x80c0000[39]	int	0x80c0d8d (Hex)	
	0x80c0000[40]	int	0x80c0d8d (Hex)	
	0x80c0000[41]	int	0x80c0d8d (Hex)	
	0x80c0000[42]	int	0x80c0d8d (Hex)	
	0x80c0000[43]	int	0x80c0d8d (Hex)	
	0x80c0000[44]	int	0x80c0d8d (Hex)	
	0x80c0000[45]	int	0x80c0d8d (Hex)	
	0x80c0000[46]	int	0x80c0d8d (Hex)	
	0x80c0000[47]	int	0x80c0d8d (Hex)	
	0x80c0000[48]	int	0x80c0d8d (Hex)	
	0x80c0000[49]	int	0x80c0d8d (Hex)	
	0x80c0000[50]	int	0x80c0d8d (Hex)	
	0x80c0000[51]	int	0x80c0d8d (Hex)	
	0x80c0000[52]	int	0x80c0d8d (Hex)	
	0x80c0000[53]	int	0x80c0d8d (Hex)	
	0x80c0000[54]	int	0x80c0d8d (Hex)	
	0x80c0000[55]	int	0x80c0d8d (Hex)	
	0x80c0000[56]	int	0x80c0d8d (Hex)	
	0x80c0000[57]	int	0x80c0d8d (Hex)	
	0x80c0000[58]	int	0x0 (Hex)	
	0x80c0000[59]	int	0x80c0d8d (Hex)	
	0x80c0000[60]	int	0x80c0d8d (Hex)	
	0x80c0000[61]	int	0x80c0d8d (Hex)	
	0x80c0000[62]	int	0x80c0d8d (Hex)	
	0x80c0000[63]	int	0x80c0d8d (Hex)	
	0x80c0000[64]	int	0x80c0d8d (Hex)	
	0x80c0000[65]	int	0x80c0d8d (Hex)	
	0x80c0000[66]	int	0x80c0d8d (Hex)	
	0x80c0000[67]	int	0x80c0d8d (Hex)	
	0x80c0000[68]	int	0x80c0d8d (Hex)	
	0x80c0000[69]	int	0x80c0d8d (Hex)	
	0x80c0000[70]	int	0x80c0d8d (Hex)	
	0x80c0000[71]	int	0x80c0d8d (Hex)	
	0x80c0000[72]	int	0x80c0d8d (Hex)	
	0x80c0000[73]	int	0x80c0d8d (Hex)	
	0x80c0000[74]	int	0x80c0d8d (Hex)	
	0x80c0000[75]	int	0x80c0d8d (Hex)	
	0x80c0000[76]	int	0x80c0d8d (Hex)	
	0x80c0000[77]	int	0x80c0d8d (Hex)	
	0x80c0000[78]	int	0x80c0d8d (Hex)	
	0x80c0000[79]	int	0x80c0d8d (Hex)	
	0x80c0000[80]	int	0x0 (Hex)	
	0x80c0000[81]	int	0x0 (Hex)	
	0x80c0000[82]	int	0x0 (Hex)	
	0x80c0000[83]	int	0x0 (Hex)	
	0x80c0000[84]	int	0x80c0d8d (Hex)	
	0x80c0000[85]	int	0x80c0d8d (Hex)	
	0x80c0000[86]	int	0x80c0d8d (Hex)	
	0x80c0000[87]	int	0x80c0d8d (Hex)	
	0x80c0000[88]	int	0x80c0d8d (Hex)	
	0x80c0000[89]	int	0x80c0d8d (Hex)	
	0x80c0000[90]	int	0x80c0d8d (Hex)	
	0x80c0000[91]	int	0x80c0d8d (Hex)	
	0x80c0000[92]	int	0x80c0d8d (Hex)	
	0x80c0000[93]	int	0x80c0d8d (Hex)	
	0x80c0000[94]	int	0x80c0d8d (Hex)	
	0x80c0000[95]	int	0x80c0d8d (Hex)	
	0x80c0000[96]	int	0x80c0d8d (Hex)	
	0x80c0000[97]	int	0x80c0d8d (Hex)	
	0x80c0000[98]	int	0x80c0d8d (Hex)	
	0x80c0000[99]	int	0x80c0d8d (Hex)		
	0x80c0000[100]	int	0x80c0d8d (Hex)	
	0x80c0000[101]	int	0x80c0d8d (Hex)	
	0x80c0000[102]	int	0x80c0d8d (Hex)	
	0x80c0000[103]	int	0x80c0d8d (Hex)	
	0x80c0000[104]	int	0x80c0d8d (Hex)	
	0x80c0000[105]	int	0x80c0d8d (Hex)	
	0x80c0000[106]	int	0x80c0d8d (Hex)	
	0x80c0000[107]	int	0x0 (Hex)	
	0x80c0000[108]	int	0x80c0d8d (Hex)	
	0x80c0000[109]	int	0x80c0d8d (Hex)	
	0x80c0000[110]	int	0x80c0d8d (Hex)	
	0x80c0000[111]	int	0x80c0d8d (Hex)	
	0x80c0000[112]	int	0x80c0d8d (Hex)	
	0x80c0000[113]	int	0x80c0d8d (Hex)	
	0x80c0000[114]	int	0x0 (Hex)	
	0x80c0000[115]	int	0x0 (Hex)	
	0x80c0000[116]	int	0x0 (Hex)	
	0x80c0000[117]	int	0x0 (Hex)	
	0x80c0000[118]	int	0x80c0d8d (Hex)	
	0x80c0000[119]	int	0x0 (Hex)	
	0x80c0000[120]	int	0x0 (Hex)	
	0x80c0000[121]	int	0x0 (Hex)	
	0x80c0000[122]	int	0x0 (Hex)	
	0x80c0000[123]	int	0x0 (Hex)	
	0x80c0000[124]	int	0x0 (Hex)	
	0x80c0000[125]	int	0x0 (Hex)	
	0x80c0000[126]	int	0x80c0d8d (Hex)	
	0x80c0000[127]	int	0x80c0d8d (Hex)	
	0x80c0000[128]	int	0x80c0d8d (Hex)	
	0x80c0000[129]	int	0x80c0d8d (Hex)	
	0x80c0000[130]	int	0x0 (Hex)	
	0x80c0000[131]	int	0x80c0d8d (Hex)	
	0x80c0000[132]	int	0x80c0d8d (Hex)	
	0x80c0000[133]	int	0x80c0d8d (Hex)	
	0x80c0000[134]	int	0x80c0d8d (Hex)	
	0x80c0000[135]	int	0x80c0d8d (Hex)	
	0x80c0000[136]	int	0x80c0d8d (Hex)	
	0x80c0000[137]	int	0x0 (Hex)	
	0x80c0000[138]	int	0x80c0d8d (Hex)	
	0x80c0000[139]	int	0x0 (Hex)	
	0x80c0000[140]	int	0x80c0d8d (Hex)	
	0x80c0000[141]	int	0x80c0d8d (Hex)	
	0x80c0000[142]	int	0x0 (Hex)	
	0x80c0000[143]	int	0x80c0d8d (Hex)	
	0x80c0000[144]	int	0x80c0d8d (Hex)	
	0x80c0000[145]	int	0x80c0d8d (Hex)	
	0x80c0000[146]	int	0x80c0d8d (Hex)	
	0x80c0000[147]	int	0x80c0d8d (Hex)	
	0x80c0000[148]	int	0x80c0d8d (Hex)	
	0x80c0000[149]	int	0x80c0d8d (Hex)	
	0x80c0000[150]	int	0x80c0d8d (Hex)	
	0x80c0000[151]	int	0x80c0d8d (Hex)	
	0x80c0000[152]	int	0x80c0d8d (Hex)	
	0x80c0000[153]	int	0x80c0d8d (Hex)	
	0x80c0000[154]	int	0x80c0d8d (Hex)	
	0x80c0000[155]	int	0x80c0d8d (Hex)	
	0x80c0000[156]	int	0x80c0d8d (Hex)	
	0x80c0000[157]	int	0x80c0d8d (Hex)	
	0x80c0000[158]	int	0x80c0d8d (Hex)	
	0x80c0000[159]	int	0x0 (Hex)	
	0x80c0000[160]	int	0x80c0d8d (Hex)	
	0x80c0000[161]	int	0x80c0d8d (Hex)	
	0x80c0000[162]	int	0x80c0d8d (Hex)	
	0x80c0000[163]	int	0x80c0d8d (Hex)	
	0x80c0000[164]	int	0x0 (Hex)	
	0x80c0000[165]	int	0x80c0d8d (Hex)	
	0x80c0000[166]	int	0x80c0d8d (Hex)	
	0x80c0000[167]	int	0x80c0d8d (Hex)	
	0x80c0000[168]	int	0x80c0d8d (Hex)	
	0x80c0000[169]	int	0x80c0d8d (Hex)	
	0x80c0000[170]	int	0x80c0d8d (Hex)	
	0x80c0000[171]	int	0x80c0d8d (Hex)	
	0x80c0000[172]	int	0x80c0d8d (Hex)	
	0x80c0000[173]	int	0x80c0d8d (Hex)	
	0x80c0000[174]	int	0x80c0d8d (Hex)	
	0x80c0000[175]	int	0x80c0d8d (Hex)	
	0x80c0000[176]	int	0x80c0d8d (Hex)	
	0x80c0000[177]	int	0x80c0d8d (Hex)	
	0x80c0000[178]	int	0x80c0d8d (Hex)	