Need for invalidating cache when executing from flash that has been erased and written to

DanielPi
Associate III

Hi!

I am writing a bootloader for an STM32H7S3L8 that receives the application binary over UART and places it into flash, specifically at address 0x08008000, and I am getting hard faults when jumping to that address.

If I "install" the application AND bootloader through STM32CubeIDE with SWD, I can successfully jump from the bootloader to the application at runtime. Even if I copy all the application bytes from flash to a RAM section, I can successfully jump to the application and execute it there (from RAM).

But as soon as I try to execute from flash that has been erased and then written to during the same runtime session, I get hard faults. I even tried just copying the application bytes to RAM, erasing the flash and copying the application bytes back to the same memory location, and execution from there doesn't work.

From googling around, I suspect that the D cache and/or I cache might be the problem. I can disable the I cache, but if I disable the D cache I can no longer write to and read from flash properly (i.e. I don't even reach the line where I try to jump to the application).

I have also tried various combinations of invalidating D cache (using SCB_CleanInvalidateDCache_by_Addr/SCB_InvalidateDCache_by_Addr) after erasing and writing, but to no avail.
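One detail that is easy to get wrong with the by-address cache maintenance calls mentioned above is alignment: the Cortex-M7 D cache works on 32-byte lines, so the range is normally widened to whole lines first. The sketch below shows that rounding in plain C so it can be checked on a host. The type `cache_range_t` and the function `cache_align_range` are hypothetical names, not part of CMSIS, and some CMSIS versions may already do this rounding internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cortex-M7 data cache line size in bytes. */
#define DCACHE_LINE_SIZE 32U

/* Hypothetical helper type: an address range widened to whole cache lines. */
typedef struct {
    uintptr_t start; /* rounded down to a cache line boundary */
    size_t    size;  /* multiple of DCACHE_LINE_SIZE */
} cache_range_t;

/* Widen [addr, addr + len) so that it starts and ends on cache line boundaries. */
cache_range_t cache_align_range(uintptr_t addr, size_t len)
{
    assert(len > 0U);

    const uintptr_t mask  = (uintptr_t)DCACHE_LINE_SIZE - 1U;
    const uintptr_t start = addr & ~mask;
    const uintptr_t end   = (addr + len + mask) & ~mask;

    cache_range_t range = { start, (size_t)(end - start) };
    return range;
}

/* On the target (not compiled here), the widened range could be used like this:
 *
 *   cache_range_t r = cache_align_range(FLASH_APP_START_ADDRESS, application_size);
 *   SCB_InvalidateDCache_by_Addr((void *)r.start, (int32_t)r.size);
 *   SCB_InvalidateICache();
 *   __DSB();
 *   __ISB();
 */
```

For the 8232-byte image used later in this thread, starting at 0x08008000, the widened range is 8256 bytes. Whether cache maintenance alone resolves the fault on this device is not established here.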

So to put it simply, how do I robustly execute from flash that I have erased and then written to? What do you need to do/set up before you try to execute from flash memory that you have once erased and written to?

The code is something along these lines:

typedef void (*pFunction)(void);

#define FLASH_APP_START_ADDRESS ((uint32_t)0x8008000U)

void JumpToApp(void)
{
  const uint32_t jump_address = *(__IO uint32_t*) (FLASH_APP_START_ADDRESS + 4U);
  const pFunction jump_to_application = (pFunction) jump_address;
  SCB->VTOR = FLASH_APP_START_ADDRESS;

  uint32_t sp_addr = *(__IO uint32_t*)FLASH_APP_START_ADDRESS;
  __set_MSP(sp_addr);

  __enable_irq();
  jump_to_application();
}

int main()
{
  SCB_EnableDCache();
  EraseFlash();

  uint8_t* application_data = (uint8_t*)malloc(...);
  ReceiveApplication(application_data);

  // 'application_data' now contains application bytes
  const flash_status status = FlashWriteBytes(FLASH_APP_START_ADDRESS, application_data, application_size);


  JumpToApp();
}
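Before jumping, it can help to check that the first two words of the application's vector table look plausible, since erased flash reads back as 0xFFFFFFFF and would be loaded straight into MSP and PC. The following is a minimal host-testable sketch. The type `image_limits_t`, the function `vector_table_looks_valid` and the address ranges passed to it are hypothetical and must be adapted to the actual memory map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of where a valid image may place its stack and code. */
typedef struct {
    uint32_t ram_start; /* first RAM address usable for the stack */
    uint32_t ram_end;   /* one past the last RAM byte; the initial SP may equal it */
    uint32_t app_start; /* first address of the application region in flash */
    uint32_t app_end;   /* last address of the application region in flash */
} image_limits_t;

/* vt[0] is the initial stack pointer, vt[1] is the reset handler address. */
bool vector_table_looks_valid(const uint32_t vt[2], const image_limits_t *lim)
{
    assert(vt != NULL && lim != NULL);

    const uint32_t initial_sp = vt[0];
    const uint32_t reset      = vt[1];

    /* Erased flash reads as all ones. */
    if (initial_sp == 0xFFFFFFFFU || reset == 0xFFFFFFFFU) {
        return false;
    }
    /* The stack pointer must be word aligned and inside RAM. */
    if ((initial_sp & 0x3U) != 0U) {
        return false;
    }
    if (initial_sp < lim->ram_start || initial_sp > lim->ram_end) {
        return false;
    }
    /* Cortex-M only executes Thumb code, so bit 0 of the handler must be set. */
    if ((reset & 0x1U) == 0U) {
        return false;
    }
    const uint32_t target = reset & ~0x1U;
    return (target >= lim->app_start) && (target <= lim->app_end);
}
```

On the target, the table would be read with something like `(const uint32_t *)FLASH_APP_START_ADDRESS`, and the bootloader would stay in place instead of jumping when the check fails.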

 

9 REPLIES
DanielPi
Associate III

@Tesla DeLorean  @Pavel A. @Piranha  I have seen many replies from you in other threads on this topic. Do you have any input?

SofLit
ST Employee

Hello,

In my opinion, you need to invalidate the ICache with SCB_InvalidateICache() before jumping, so that the CPU fetches the correct instructions from flash.

See for example FLASH_SwapBank: https://github.com/STMicroelectronics/STM32CubeH7/tree/master/Projects/STM32H743I-EVAL/Examples/FLASH/FLASH_SwapBank

The example invalidates the I cache before executing from the new Bank.


To give better visibility on the answered topics, please click on "Accept as Solution" on the reply which solved your issue or answered your question.
DanielPi
Associate III

Thanks @SofLit, unfortunately I didn't get it to work. I tried adding SCB_InvalidateICache() in several different places. Here is my code in brief:

#include "main.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "stm32h7rsxx_hal_flash.h"

uint32_t GetSector(uint32_t Address)
{
  return (Address - FLASH_BASE) / FLASH_SECTOR_SIZE;
}

typedef enum {
  FLASH_OK              = 0x00U,
  FLASH_ERROR_SIZE      = 0x01U,
  FLASH_ERROR_WRITE     = 0x02U,
  FLASH_ERROR_READBACK  = 0x04U,
  FLASH_ERROR           = 0xFFU 
} FlashStatus;

typedef void (*p_function)(void);

#define FLASH_APP_START_ADDRESS ((uint32_t)0x8008000U)
#define FLASH_APP_END_ADDRESS   ((uint32_t)0x8010000U - 1U)

void JumpToApp(void)
{
  const uint32_t jump_address = *(__IO uint32_t*) (FLASH_APP_START_ADDRESS + 4U);
  const p_function jump_to_application = (p_function) jump_address;

  SCB->VTOR = FLASH_APP_START_ADDRESS;

  uint32_t sp_addr = *(__IO uint32_t*)FLASH_APP_START_ADDRESS;
  __set_MSP(sp_addr);

  __enable_irq();

  SCB_InvalidateICache();

  jump_to_application();
}

void EraseFlash()
{
  HAL_FLASH_Unlock();

  uint32_t FirstSector = 0U, NbOfSectors = 0U;
  uint32_t sector_error = 0U;
  FLASH_EraseInitTypeDef erase_init_struct;

  FirstSector = GetSector(FLASH_APP_START_ADDRESS);
  NbOfSectors = GetSector(FLASH_APP_END_ADDRESS) - FirstSector + 1;

  erase_init_struct.TypeErase     = FLASH_TYPEERASE_SECTORS;
  erase_init_struct.Sector        = FirstSector;
  erase_init_struct.NbSectors     = NbOfSectors;

  if (HAL_FLASHEx_Erase(&erase_init_struct, &sector_error) != HAL_OK)
  {
    FlashEraseError();
  }

  HAL_FLASH_Lock();
}

FlashStatus FlashWrite8(uint32_t address, uint8_t *data, uint32_t length)
{
  FlashStatus status = FLASH_OK;

  HAL_FLASH_Unlock();

  for (uint32_t i = 0U; (i < length) && (FLASH_OK == status); i++)
  {
    if (address > FLASH_APP_END_ADDRESS)
    {
      status |= FLASH_ERROR_SIZE;
    }
    else
    {
      if (HAL_OK != HAL_FLASH_Program(FLASH_TYPEPROGRAM_BYTE, address, (uint32_t)(data) + i * sizeof(uint8_t)))
      {
        status |= FLASH_ERROR_WRITE;
      }

      volatile const uint8_t read_back_data = (*(volatile uint8_t*)address);
      if (data[i] != read_back_data)
      {
        status |= FLASH_ERROR_READBACK;
      }

      address += 1U;
    }
  }

  HAL_FLASH_Lock();

  return status;
}


int main(void)
{
  SCB_EnableICache();
  SCB_EnableDCache();

  HAL_Init();

  SystemClock_Config();

  uint32_t application_size = 8232U;
  uint8_t* application_data = (uint8_t*)malloc(sizeof(uint8_t) * application_size);
  if (application_data == NULL)
  {
    while(1) {
      // Blink red LED or something
    }
  }

  uint8_t* flash_address_to_read = (uint8_t*)FLASH_APP_START_ADDRESS;

  for(uint32_t k = 0; k < application_size; k++)
  {
    application_data[k] = (*flash_address_to_read);
    flash_address_to_read += 1U;
  }

  SCB_DisableICache();

  EraseFlash();

  FLASH_WaitForLastOperation(100000U);

  const FlashStatus status = FlashWrite8(FLASH_APP_START_ADDRESS, application_data, application_size);

  FLASH_WaitForLastOperation(100000U);
  SCB_DisableICache();

  JumpToApp();
}

What if you disable the cache while handling the flash (erase/program) and enable it in the application that you jump to? (Inspired by the example in STM32CubeH7: Projects\STM32H743I-EVAL\Applications\ExtMem_CodeExecution\ExtMem_Boot)

Are you also enabling the cache in the application that you want to jump to?

To give better visibility on the answered topics, please click on "Accept as Solution" on the reply which solved your issue or answered your question.
DanielPi
Associate III

Yeah that doesn't work either unfortunately... I don't need I Cache or D Cache for my bootloader (or at least I have no knowledge of anything in particular that I need them for)...

It seems like I have to have the D cache enabled in order to write to and read from flash. But even if I disable the D cache just before I make the jump to the application, I get the same hard faults... The PC is set to 0xFFFFFFFF and everything goes bananas...

DanielPi
Associate III

I have enabled the cache in the application I'm jumping to, but I'm not getting past the first instruction in the Application ResetHandler, so I don't think that matters right now...

DanielPi
Associate III

Interesting example that you provided, but if I do the same thing, I get an error on the line where I'm getting the address of the Application ResetHandler:

void JumpToApp(void)
{
  SCB_DisableICache();
  SCB_DisableDCache();

  __DSB();
  __ISB();

  const uint32_t jump_address = *(__IO uint32_t*) (FLASH_APP_START_ADDRESS + 4U);  // <- This line gives a HardFault
  const pFunction jump_to_application = (pFunction) jump_address;

  SCB->VTOR = FLASH_APP_START_ADDRESS;

  uint32_t sp_addr = *(__IO uint32_t*)FLASH_APP_START_ADDRESS;
  __set_MSP(sp_addr);

  __enable_irq();

  jump_to_application();
}

And in Control->CFSR the BFARVALID bit and PRECISERR bit are set to 1.

Having the disabling of the caches just before jump_to_application doesn't make a difference...
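The BFARVALID and PRECISERR bits reported above sit in the BusFault part of CFSR (bits 15:8). When both are set, the BFAR register holds the address whose access faulted, which shows exactly which read failed. Below is a small host-testable sketch of that decoding. The macro and function names are local to this example (CMSIS provides equivalent `SCB_CFSR_*_Msk` definitions), and on the target the inputs would come from `SCB->CFSR` and `SCB->BFAR`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* BusFault status bits inside CFSR (ARMv7-M). Names are local to this sketch. */
#define CFSR_IBUSERR     (1UL << 8)  /* bus error on instruction fetch */
#define CFSR_PRECISERR   (1UL << 9)  /* precise data bus error */
#define CFSR_IMPRECISERR (1UL << 10) /* imprecise data bus error */
#define CFSR_UNSTKERR    (1UL << 11) /* bus error on exception return unstacking */
#define CFSR_STKERR      (1UL << 12) /* bus error on exception entry stacking */
#define CFSR_BFARVALID   (1UL << 15) /* BFAR holds a valid fault address */

/* Returns true and stores the faulting address if the fault was a precise
 * data bus error with a valid BFAR; returns false otherwise. */
bool precise_bus_fault_address(uint32_t cfsr, uint32_t bfar, uint32_t *fault_addr)
{
    assert(fault_addr != NULL);

    const uint32_t needed = (uint32_t)(CFSR_PRECISERR | CFSR_BFARVALID);
    if ((cfsr & needed) != needed) {
        return false;
    }
    *fault_addr = bfar;
    return true;
}
```

If the returned address falls inside the freshly programmed region, the fault is raised by the data read from flash itself rather than by the jump.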

Pavel A.
Evangelist III

@DanielPi I don't have a board with an H7S. This MCU is the newest in the H7 family, so its cache behavior may well differ. I don't understand why the D cache cannot be disabled. Besides the CM7 caches, there is some ST proprietary accelerator or cache. Anyway, before jumping to modified memory you need an ISB instruction/intrinsic.

You can try to prevent the hard fault by enabling more specific exceptions, such as memfault, and masking them. If this works, you can then find the root cause of the fault, and either ignore it or dig further to fix it. It may well be an ECC error.
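As a rough illustration of enabling the more specific exceptions: on ARMv7-M the MemManage, BusFault and UsageFault handlers are switched on through three bits in SHCSR, after which such faults reach their own handlers instead of escalating to HardFault. The sketch below takes the register as a pointer so that it can be exercised on a host. The function name `enable_configurable_faults` is hypothetical, and on the target the argument would be `&SCB->SHCSR`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fault enable bits in SHCSR (ARMv7-M). Names are local to this sketch. */
#define SHCSR_MEMFAULTENA (1UL << 16)
#define SHCSR_BUSFAULTENA (1UL << 17)
#define SHCSR_USGFAULTENA (1UL << 18)

/* Enable the three configurable fault handlers, leaving other bits untouched. */
void enable_configurable_faults(volatile uint32_t *shcsr)
{
    assert(shcsr != NULL);

    *shcsr |= (uint32_t)(SHCSR_MEMFAULTENA |
                         SHCSR_BUSFAULTENA |
                         SHCSR_USGFAULTENA);
}
```

With these enabled, a breakpoint in the BusFault handler combined with the CFSR and BFAR values narrows down whether the failing access is a data read or an instruction fetch.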

Hey Pavel. Thanks a bunch 🙂