2024-06-20 12:58 AM - edited 2024-06-20 01:05 AM
Hi!
I am writing a bootloader for an STM32H7S3L8 that receives the application binary over UART and places it into flash, specifically at address 0x08008000, and I am getting hard faults when jumping to that address.
If I "install" the application AND bootloader through STM32CubeIDE with SWD, I can successfully jump from the bootloader to the application at runtime. Even if I copy all the application bytes from flash to a RAM section, I can successfully jump to the application and execute it there (from RAM).
But as soon as I try to execute from flash that has been erased and then written to during the same runtime session, I get hard faults. I even tried just copying the application bytes to RAM, erasing the flash and copying the application bytes back to the same memory location, and execution from there doesn't work.
From googling around, I suspect that the D and/or I cache might be the problem. I can disable the I cache, but I can't disable the D cache: if I do, I'm not able to write to and read from flash properly (i.e. I don't even reach the line where I am trying to jump to the application).
I have also tried various combinations of invalidating D cache (using SCB_CleanInvalidateDCache_by_Addr/SCB_InvalidateDCache_by_Addr) after erasing and writing, but to no avail.
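For reference, the by-address cache maintenance calls mentioned above work on whole cache lines, and the Cortex-M7 data cache line is 32 bytes. CMSIS documents the address argument as 32-byte aligned, so one thing worth ruling out is a range that starts or ends mid-line. The sketch below is only an illustration of that rounding: `cache_range_t` and `cache_align_range` are hypothetical names, not part of CMSIS or the HAL, and nothing in this thread confirms that alignment was the actual cause.

```c
#include <stdint.h>

/* Cortex-M7 data cache line size in bytes. */
#define DCACHE_LINE_SIZE 32U

/* Hypothetical helper type: a range that covers whole cache lines. */
typedef struct {
    uint32_t addr; /* start, rounded down to a cache-line boundary */
    uint32_t size; /* length in bytes, a multiple of the line size */
} cache_range_t;

/* Expand [addr, addr + size) so that it starts and ends on line boundaries. */
cache_range_t cache_align_range(uint32_t addr, uint32_t size)
{
    const uint32_t mask = DCACHE_LINE_SIZE - 1U;
    const uint32_t end = addr + size;             /* exclusive end of the request */
    const uint32_t aligned_end = (end + mask) & ~mask;
    cache_range_t r;

    r.addr = addr & ~mask;
    r.size = aligned_end - r.addr;
    return r;
}
```

On the target, `r.addr` and `r.size` would then be the values passed to SCB_CleanInvalidateDCache_by_Addr or SCB_InvalidateDCache_by_Addr for the region that was just programmed.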
So to put it simply, how do I robustly execute from flash that I have erased and then written to? What do you need to do/set up before you try to execute from flash memory that you have once erased and written to?
Code is something along the lines of this:
typedef void (*pFunction)(void);
#define FLASH_APP_START_ADDRESS ((uint32_t)0x8008000U)
void JumpToApp(void)
{
const uint32_t jump_address = *(__IO uint32_t*) (FLASH_APP_START_ADDRESS + 4U);
const pFunction jump_to_application = (pFunction) jump_address;
SCB->VTOR = FLASH_APP_START_ADDRESS;
uint32_t sp_addr = *(__IO uint32_t*)FLASH_APP_START_ADDRESS;
__set_MSP(sp_addr);
__enable_irq();
jump_to_application();
}
int main()
{
SCB_EnableDCache();
EraseFlash();
uint8_t* application_data = (uint8_t*)malloc(...);
ReceiveApplication(application_data);
// 'application_data' now contains application bytes
const flash_status status = FlashWriteBytes(FLASH_APP_START_ADDRESS, application_data, application_size);
JumpToApp();
}
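A cheap safeguard that could be added before a jump like the one above is to sanity-check the first two words of the application's vector table, since erased flash reads back as 0xFFFFFFFF and a stale or half-written image can produce an entry point that faults immediately. This is a minimal sketch, not part of the original code: `image_limits_t` and `vector_table_looks_valid` are hypothetical names, and the RAM limits have to come from the real linker script.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of where a valid image may live. */
typedef struct {
    uint32_t flash_start; /* first address of the application region */
    uint32_t flash_end;   /* last address of the application region (inclusive) */
    uint32_t ram_start;   /* first RAM address usable for the stack */
    uint32_t ram_end;     /* one past the last RAM address */
} image_limits_t;

/* Return true if the initial SP and reset vector look plausible. */
bool vector_table_looks_valid(uint32_t initial_sp, uint32_t reset_handler,
                              const image_limits_t *lim)
{
    /* Erased flash reads back as all ones. */
    if (initial_sp == 0xFFFFFFFFU || reset_handler == 0xFFFFFFFFU) {
        return false;
    }
    /* The stack grows downwards, so the initial SP may equal ram_end. */
    if ((initial_sp & 0x3U) != 0U ||
        initial_sp <= lim->ram_start || initial_sp > lim->ram_end) {
        return false;
    }
    /* Cortex-M only executes Thumb code, so bit 0 of the vector must be set. */
    if ((reset_handler & 0x1U) == 0U) {
        return false;
    }
    const uint32_t entry = reset_handler & ~0x1U;
    return entry >= lim->flash_start && entry <= lim->flash_end;
}
```

In the bootloader this would be called with the two words read from FLASH_APP_START_ADDRESS and FLASH_APP_START_ADDRESS + 4U, and the jump skipped if it returns false.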
2024-07-17 12:36 PM
Yes of course @STOne-32 !
So using the ST online help I was provided with the solution. Basically it was cache invalidation that was missing, which I had tried before to some extent, but not in the correct order or with the correct type of cache invalidation. The flow now is basically:
EraseFlash(); // Erase the relevant flash sectors before writing to flash
SCB_CleanInvalidateDCache(); // Clean and Invalidate D Cache now that flash has been erased
// Write application data to flash (i.e. the application binary)
FlashWrite(FLASH_APP_START_ADDRESS, application_data, application_size);
SCB_CleanInvalidateDCache();
// Invalidate instruction cache, so new code is visible to CPU
SCB_InvalidateICache();
JumpToApplication();
Now it works, though my understanding of the cache is very limited, so I can't say WHY this worked, as opposed to all the other various combinations that I tried. I should add that I had previously used the Clean/InvalidateD/ICache_by_Addr functions instead of the whole-cache (non-address) functions, and the whole-cache functions turned out to be the correct solution.
2024-06-20 02:37 AM
@Tesla DeLorean @Pavel A. @Piranha I have seen many replies from you in other threads that regard this topic, do you have any input?
2024-06-20 03:21 AM
Hello,
In my opinion, you need to invalidate the ICache with SCB_InvalidateICache() before jumping, so that the CPU fetches the correct instructions from flash.
See for example FLASH_SwapBank: https://github.com/STMicroelectronics/STM32CubeH7/tree/master/Projects/STM32H743I-EVAL/Examples/FLASH/FLASH_SwapBank
The example invalidates the I cache before executing from the new Bank.
2024-06-20 04:32 AM
Thanks @SofLit, unfortunately I didn't get it to work. I tried adding SCB_InvalidateICache() in several different places. Here is my code in brief:
#include "main.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "stm32h7rsxx_hal_flash.h"
uint32_t GetSector(uint32_t Address)
{
return (Address - FLASH_BASE) / FLASH_SECTOR_SIZE;
}
typedef enum {
FLASH_OK = 0x00U,
FLASH_ERROR_SIZE = 0x01U,
FLASH_ERROR_WRITE = 0x02U,
FLASH_ERROR_READBACK = 0x04U,
FLASH_ERROR = 0xFFU
} FlashStatus;
typedef void (*p_function)(void);
#define FLASH_APP_START_ADDRESS ((uint32_t)0x8008000U)
#define FLASH_APP_END_ADDRESS ((uint32_t)0x8010000U - 1U)
void JumpToApp(void)
{
const uint32_t jump_address = *(__IO uint32_t*) (FLASH_APP_START_ADDRESS + 4U);
const p_function jump_to_application = (p_function) jump_address;
SCB->VTOR = FLASH_APP_START_ADDRESS;
uint32_t sp_addr = *(__IO uint32_t*)FLASH_APP_START_ADDRESS;
__set_MSP(sp_addr);
__enable_irq();
SCB_InvalidateICache();
jump_to_application();
}
void EraseFlash()
{
HAL_FLASH_Unlock();
uint32_t FirstSector = 0U, NbOfSectors = 0U;
uint32_t sector_error = 0U;
FLASH_EraseInitTypeDef erase_init_struct;
FirstSector = GetSector(FLASH_APP_START_ADDRESS);
NbOfSectors = GetSector(FLASH_APP_END_ADDRESS) - FirstSector + 1;
erase_init_struct.TypeErase = FLASH_TYPEERASE_SECTORS;
erase_init_struct.Sector = FirstSector;
erase_init_struct.NbSectors = NbOfSectors;
if (HAL_FLASHEx_Erase(&erase_init_struct, &sector_error) != HAL_OK)
{
FlashEraseError();
}
HAL_FLASH_Lock();
}
FlashStatus FlashWrite8(uint32_t address, uint8_t *data, uint32_t length)
{
FlashStatus status = FLASH_OK;
HAL_FLASH_Unlock();
for (uint32_t i = 0U; (i < length) && (FLASH_OK == status); i++)
{
if (address > FLASH_APP_END_ADDRESS)
{
status |= FLASH_ERROR_SIZE;
}
else
{
if (HAL_OK != HAL_FLASH_Program(FLASH_TYPEPROGRAM_BYTE, address, (uint32_t)(data) + i * sizeof(uint8_t)))
{
status |= FLASH_ERROR_WRITE;
}
volatile const uint8_t read_back_data = (*(volatile uint8_t*)address);
if (data[i] != read_back_data)
{
status |= FLASH_ERROR_READBACK;
}
address += 1U;
}
}
HAL_FLASH_Lock();
return status;
}
int main(void)
{
SCB_EnableICache();
SCB_EnableDCache();
HAL_Init();
SystemClock_Config();
uint32_t application_size = 8232U;
uint8_t* application_data = (uint8_t*)malloc(sizeof(uint8_t) * application_size);
if((!application_data) || (application_data == NULL))
{
while(1) {
// Blink red LED or something
}
}
uint8_t* flash_address_to_read = (uint8_t*)FLASH_APP_START_ADDRESS;
for(uint32_t k = 0; k < application_size; k++)
{
application_data[k] = (*flash_address_to_read);
flash_address_to_read += 1U;
}
SCB_DisableICache();
EraseFlash();
FLASH_WaitForLastOperation(100000U);
const FlashStatus status = FlashWrite8(FLASH_APP_START_ADDRESS, application_data, application_size);
FLASH_WaitForLastOperation(100000U);
SCB_DisableICache();
JumpToApp();
}
2024-06-20 05:00 AM - edited 2024-06-20 05:10 AM
What if you disable the cache while handling the flash (erase/program) and enable it in the application that you jump to? (Inspired by the example in STM32CubeH7: Projects\STM32H743I-EVAL\Applications\ExtMem_CodeExecution\ExtMem_Boot)
Are you also enabling the cache in the application that you want to jump to?
2024-06-20 05:22 AM
Yeah that doesn't work either unfortunately... I don't need I Cache or D Cache for my bootloader (or at least I have no knowledge of anything in particular that I need them for)...
It seems like I have to have D Cache enabled in order to write to and read from flash. But even if I disable D Cache just before I make the jump to the Application, I get the same hard faults... The PC is set to 0xFFFFFFFF and everything goes bananas...
2024-06-20 05:24 AM
I have enabled the cache in the application I'm jumping to, but I'm not getting past the first instruction in the Application ResetHandler, so I don't think that matters right now...
2024-06-20 05:34 AM
Interesting example that you provided, but if I do the same thing, I get an error on the line where I'm getting the address of the Application ResetHandler:
void JumpToApp(void)
{
SCB_DisableICache();
SCB_DisableDCache();
__DSB();
__ISB();
const uint32_t jump_address = *(__IO uint32_t*) (FLASH_APP_START_ADDRESS + 4U); // <- This line gives a HardFault
const pFunction jump_to_application = (pFunction) jump_address;
SCB->VTOR = FLASH_APP_START_ADDRESS;
uint32_t sp_addr = *(__IO uint32_t*)FLASH_APP_START_ADDRESS;
__set_MSP(sp_addr);
__enable_irq();
jump_to_application();
}
And in Control->CFSR the BFARVALID bit and PRECISERR bit are set to 1.
Having the disabling of the caches just before jump_to_application doesn't make a difference...
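For context on the CFSR bits reported above: on ARMv7-M, PRECISERR means a precise data bus error, and BFARVALID means SCB->BFAR holds the address that faulted, so reading BFAR in the fault handler shows exactly which access failed. The sketch below decodes a raw CFSR value into flag names. The bit positions are the architectural ones, but `cfsr_decode` and its table are hypothetical helpers written for this illustration, not HAL or CMSIS functions.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint32_t mask;
    const char *name;
} cfsr_flag_t;

/* CFSR layout: MMFSR in bits 7:0, BFSR in bits 15:8, UFSR in bits 31:16. */
static const cfsr_flag_t cfsr_flags[] = {
    { 1UL << 0,  "IACCVIOL" },   { 1UL << 1,  "DACCVIOL" },
    { 1UL << 3,  "MUNSTKERR" },  { 1UL << 4,  "MSTKERR" },
    { 1UL << 5,  "MLSPERR" },    { 1UL << 7,  "MMARVALID" },
    { 1UL << 8,  "IBUSERR" },    { 1UL << 9,  "PRECISERR" },
    { 1UL << 10, "IMPRECISERR" },{ 1UL << 11, "UNSTKERR" },
    { 1UL << 12, "STKERR" },     { 1UL << 13, "LSPERR" },
    { 1UL << 15, "BFARVALID" },  { 1UL << 16, "UNDEFINSTR" },
    { 1UL << 17, "INVSTATE" },   { 1UL << 18, "INVPC" },
    { 1UL << 19, "NOCP" },       { 1UL << 24, "UNALIGNED" },
    { 1UL << 25, "DIVBYZERO" },
};

/* Write the names of all set flags, space separated, into out.
 * Returns the number of flags that were set. */
int cfsr_decode(uint32_t cfsr, char *out, size_t out_len)
{
    int count = 0;
    size_t used = 0;

    if (out_len > 0U) {
        out[0] = '\0';
    }
    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if ((cfsr & cfsr_flags[i].mask) == 0U) {
            continue;
        }
        count++;
        if (used < out_len) {
            /* snprintf truncates safely if the buffer is too small. */
            int n = snprintf(out + used, out_len - used, "%s%s",
                             (used > 0U) ? " " : "", cfsr_flags[i].name);
            if (n > 0) {
                used += (size_t)n;
            }
        }
    }
    return count;
}
```

On the target, the input would be the value of SCB->CFSR captured in the HardFault handler, ideally together with SCB->BFAR.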
2024-06-20 02:10 PM - edited 2024-06-20 02:22 PM
@DanielPi I don't have a board with an H7S. This MCU is the newest in the H7 family, so its cache behavior may well differ. I don't understand why the D cache cannot be disabled. Besides the CM7 caches, there is some ST proprietary accelerator or cache. Anyway, before jumping to modified memory you need an ISB instruction/intrinsic.
You can try to prevent the hard fault by enabling more specific exceptions, such as memfault, and masking it. If this works, you can then understand the root cause of the fault, and either ignore it or dig further to fix it. It may well be an ECC error.
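As a rough illustration of the "enable more specific exceptions" suggestion: on ARMv7-M the MemManage, BusFault and UsageFault handlers are enabled through bits 16, 17 and 18 of SCB->SHCSR, and while they are disabled those faults escalate to HardFault. The sketch below only computes the register value so that it can be checked off-target. The macro and function names are hypothetical; on the device the CMSIS masks SCB_SHCSR_MEMFAULTENA_Msk, SCB_SHCSR_BUSFAULTENA_Msk and SCB_SHCSR_USGFAULTENA_Msk would normally be used.

```c
#include <stdint.h>

/* Architectural bit positions in the System Handler Control and State Register. */
#define SHCSR_MEMFAULTENA (UINT32_C(1) << 16)
#define SHCSR_BUSFAULTENA (UINT32_C(1) << 17)
#define SHCSR_USGFAULTENA (UINT32_C(1) << 18)

/* Return the SHCSR value with the three configurable fault handlers enabled,
 * leaving every other bit unchanged. */
uint32_t shcsr_enable_fault_handlers(uint32_t shcsr)
{
    return shcsr | SHCSR_MEMFAULTENA | SHCSR_BUSFAULTENA | SHCSR_USGFAULTENA;
}
```

On the target this would correspond to something like `SCB->SHCSR = shcsr_enable_fault_handlers(SCB->SHCSR);` early in the bootloader, with matching MemManage_Handler, BusFault_Handler and UsageFault_Handler functions in place to catch the fault.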
2024-06-21 01:05 AM
Hey Pavel. Thanks a bunch :)