STM32N6 Hard_Fault when accessing TCM

filipxsikora · ‎2025-01-01

Hello,

I have a problem with the STM32N6. When I try to access any TCM (ITCM/DTCM), the CPU immediately crashes into the HardFault handler. Do the TCM memories need to be explicitly enabled? In my experience with the STM32H7, that was never needed. I've looked through the RM for the N6 but cannot find anything related. The examples (FLEXMEM_Configurations) show that a HardFault occurs when accessing beyond the configured TCM size, but I'm getting the crash at the base addresses.

uint16_t* _tcmData = (uint16_t*)0x30000000;

int main()
{
    ... //system init etc..
    _tcmData[0] = 0x1234; //Hard_Fault crash here
}

RomainR. · ‎2025-01-04

Hi @filipxsikora

A Cortex-M55 is still different from a Cortex-M7. Hardfaults are probably not caused by optimization.

First you should check the Auxiliary Fault Status Register (AFSR) and check if your code does not cause a PECC Precise fault caused by uncorrectable ECC error. It is in AFSR bit 17. https://developer.arm.com/documentation/101273/0101/Cortex-M55-Processor-level-components-and-system-registers---Reference-Material/System-Control-and-Implementation-Control-Block/Auxiliary-Fault-Status-Register?lang=en
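
A minimal sketch of that check, assuming the bit position quoted in this reply (bit 17); please confirm it against the linked Arm documentation. The helper name `afsr_reports_pecc` is hypothetical. On target, the raw value would typically be read from `SCB->AFSR` (CMSIS) inside the HardFault handler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position quoted in this thread for the PECC flag; verify it against
 * the Arm Cortex-M55 documentation before relying on it. */
#define AFSR_PECC_BIT 17u

/* Returns true when the given raw AFSR value has the PECC bit set.
 * On target, pass SCB->AFSR (CMSIS) as read inside HardFault_Handler. */
bool afsr_reports_pecc(uint32_t afsr_value)
{
    return ((afsr_value >> AFSR_PECC_BIT) & 1u) != 0u;
}
```

If this returns true for the value captured in the handler, the fault is consistent with the ECC explanation in this reply rather than with a security or optimization problem.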

There is ECC in the TCM memory and it is under the control of the CPU. Please refer to the Arm Cortex-M55 Processor Technical Reference Manual r0p2. See at the bottom of the page below (Error processing in the TCMs): https://developer.arm.com/documentation/101051/0002/Reliability--Availability--and-Serviceability-Extension-support/ECC-memory-protection-behavior/Error-detection-and-processing

The TCM needs to be initialized before use. Here is an assembler snippet that you will need to add to the reset handler in your startup_stm32n657xx.s file (replace the Reset_Handler function).
It zero-initializes your DTCM baseline region at address 0x30000000, so that subsequent LDR operations no longer raise a PECC error and the resulting HardFault.

Reset_Handler:
  ldr   r0, =_sstack
  msr   MSPLIM, r0
  ldr   r0, =_estack
  mov   sp, r0          /* set stack pointer */
/* Clear D-TCM */
  ldr R0, = 0x30000000
  ldr R1, = 0x30020000
  mov R2, #0
clear_dtcm:
  str R2, [R0]
  add R0, R0, #4
  cmp R0, R1
  bcc clear_dtcm
/* Call the clock system initialization function.*/
  bl  SystemInit

And with my code:

/* USER CODE BEGIN PV */
uint16_t* _tcmData0 = (uint16_t*)0x30000000;
uint16_t* _tcmData1 = (uint16_t*)0x30000010;
uint16_t* _tcmData2 = (uint16_t*)0x30000020;
/* USER CODE END PV */
...
int main(void)
{

  /* USER CODE BEGIN 1 */
  /* USER CODE END 1 */

  /* MCU Configuration--------------------------------------------------------*/
  HAL_Init();

  /* USER CODE BEGIN Init */
  _tcmData0[0] = 0x1234;
  _tcmData1[0] = 0x1234;
  _tcmData2[0] = 0x1234;
  /* USER CODE END Init */

  /* Configure the system clock */
  SystemClock_Config();

After that, whatever you do with the TCM memory, you should also define the corresponding regions in the linker file. This is how it should be done.
Remember that on one side there is an STM32N6 device, which is a complex SoC, and on the other side there are the compiler tools. For both to match and work together, these best practices must be applied.
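
A minimal sketch of the C side of that advice, assuming a GCC-compatible toolchain. The section name `.dtcm_data` and the buffer are hypothetical: the project's linker file still has to map that section to the DTCM region, and the startup code still has to initialize the TCM as shown above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section name: the linker script must place ".dtcm_data"
 * into the DTCM region (0x30000000 in this thread) for this to take effect. */
#define DTCM_DATA __attribute__((section(".dtcm_data")))

/* Buffer placed by the linker instead of through a hard-coded pointer. */
DTCM_DATA uint32_t dtcm_buffer[64];

/* Fills the buffer with one value using full 32-bit stores. */
void dtcm_buffer_fill(uint32_t value)
{
    for (size_t i = 0; i < sizeof dtcm_buffer / sizeof dtcm_buffer[0]; i++) {
        dtcm_buffer[i] = value;
    }
}
```

Letting the linker own the addresses avoids clashes between hand-written pointers and anything else (such as the stack) that the linker file also places in TCM.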

I hope it will fix your hardfault issues when you use TCM with STM32N6. Let me know.

Best regards,

Romain,


tjaekel · ‎2025-01-03

VERY INTERESTING (and confusing)!

I am also playing with the STM32N6 right now. I tried your issue on my project... and YES: it can crash in HardFault_Handler (but see below: it depends just on how you compile with debug options!)

First, I thought:
A non-secure vs. secure issue (DTCM 0x30000000 is Secure, 0x20000000 is Non-Secure). So, accessing DTCM as secure when main() is launched as non-secure (potentially) could cause such an issue. But it does NOT!

It just depends on:
How you set Optimization level! VERY STRANGE:

if I set Optimization level to "none (-O0)" - it crashes in HardFault_Handler!
if I set "Optimize for debug (-Og)" - it WORKS!

I use an STM32N6 project with FSBL build. It really just depends on how I set the debug level!
Secure and non-secure address location does not have an effect! (it crashes with both address locations if not -Og!)

What is this?
(I have realized already: the STM32N6 is very "delicate": sometimes it works, sometimes not. It seems to depend on "how" I generate the code: with vs. without debug optimization, FSBL vs. Appli, ...)

Here are the details.
I also do this in main():

int main(void)
{
  int no = 6;
  /* USER CODE BEGIN 1 */
  XSPI_RegularCmdTypeDef sCommand = {0};
  XSPI_MemoryMappedTypeDef sMemMappedCfg = {0};

  uint16_t index = 0;
  uint16_t res = 0;
  uint32_t address = 0;
  __IO uint8_t step = 0;
  __IO uint8_t *mem_addr;

#if 1
  {
	  uint16_t* _tcmData = (uint16_t*)0x20000000;	//0x30000000 is secure memory, 0x20000000 is non-secure
	  _tcmData[0] = 0x1234;
	  if (_tcmData[0] == 0x1234)
		  testVar = 1;
  }
#endif

Just toggle debug level for code generation!

This is a debug session with debug level -Og:

This works fine!: I can write and read DTCM as secure and non-secure (0x30000000 vs. 0x20000000).

This is a debug session with none, as -O0:

This CRASHES: both addresses accessed result in a HardFault_Handler.

What is this?

I assume: the debugger has an effect on secure vs. non-secure access permission or on the system configuration (done by the debugger instead of by code).
(The system behavior should not depend on whether the code was built for debugging and/or whether the debugger is used to execute my code.)

I am pretty "shocked" being able to replicate this issue (and I am very confused, losing my confidence in this N6 MCU...)

filipxsikora · ‎2025-01-04

Very interesting findings @tjaekel .

I can confirm, if I switch to -O2, then it does not crash into the HardFault when accessing the memory. However, this is far from over.

To address your points:

> (I have realized already: the STM32N6 is very "delicate": sometimes it works, sometimes not - it seems to depend meanwhile on "how" I generate the code (with vs. without debug optimization, FSBL vs. Appli, ...)).

Same here. The N6 goes wild on me like 50 times a day, when I have to physically reconnect the USB to reset the ST-Link and Power. Reset button does not "fix" things.

(I'm using the N6-Nucleo board)

> A non-secure vs. secure issue (DTCM 0x30000000 is Secure, 0x20000000 is Non-Secure). So, accessing DTCM as secure when main() is launched as non-secure (potentially) could cause such an issue. But it does NOT!

I'm no expert in this secure/non-secure thing. Actually, it just bothers me, I don't want it, but it is what it is. But I was thinking: my code and data run from secure memory (using the Appli LRUN.ld, which places the code and data into the AXI SRAM at 0x34000400, which is secure). When I checked the addresses of the peripherals, they were all in the 0x50000000 range, again secure. So my thinking was: I'm in secure, I shall use secure.

I hope these are mostly ST-Link issues, fixable by a FW update of the ST-Link. It is a new product, it is very complex, so yeah, I'm expecting some newborn pains.

Because - when I move my stack into the DTCM, it works.

What does not work for me is ITCM. I just cannot make it run no matter what. I have my code and approach from STM32H7:

1. Create a section in the linker script for ITCM funcs
2. Create symbols at the section boundaries
3. Load the ITCM mem from ROM (right now in N6_LRUN.ld loading from RAM) at startup
4. Done

This has always worked for me. But not in N6 case, I just always get a HardFault when calling any func placed at ITCM.
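
The steps above can be sketched in C roughly as follows. This only illustrates the generic copy step; the function name and the linker symbol names mentioned in the comment are hypothetical and must match whatever the linker script actually defines.

```c
#include <stdint.h>

/* Copies a section word by word from its load address to its run address.
 * Full 32-bit stores are used on purpose. */
void copy_section_words(uint32_t *dst, const uint32_t *src, const uint32_t *src_end)
{
    while (src < src_end) {
        *dst++ = *src++;
    }
}

/* On target this would be called from startup code, for example:
 *   extern uint32_t __itcm_load_start[], __itcm_load_end[], __itcm_run_start[];
 *   copy_section_words(__itcm_run_start, __itcm_load_start, __itcm_load_end);
 * The symbol names are hypothetical. Barriers such as __DSB() and __ISB()
 * (CMSIS) are commonly placed after the copy, before calling copied code. */
```

Note that, per the accepted answer in this thread, the whole TCM should be written once on the N6 before anything is read or executed from it; copying only the used part may leave the rest uninitialized.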

filipxsikora · ‎2025-01-04

Also tried accessing the memory through STM32CubeProgrammer. I can access basically any address:

0x24000000/0x34000000 - AXI RAM NSEC/SEC
0x40000000/0x50000000 - Peripherals NSEC/SEC
0x70000000 - XSPI Flash
0x08000000/0x18000000 - BootROM NSEC/SEC

But, when I try to access 0x00000000 or 0x10000000 or 0x20000000 or 0x30000000 it gives me an error.

So I can't check or confirm, what is in the ITCM (if anything).
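
In the address pairs listed above, the secure alias is the non-secure address with bit 28 set. A small sketch of that relationship, based only on the pairs quoted in this thread (the helper names are hypothetical, and the rule is not claimed to hold for regions listed without a pair, such as the XSPI flash):

```c
#include <stdint.h>

/* In the NSEC/SEC pairs quoted in this thread, the two aliases differ
 * only in bit 28 (for example 0x24000000 and 0x34000000). */
#define SECURE_ALIAS_BIT 0x10000000u

/* Returns the secure alias of an address from one of those pairs. */
uint32_t to_secure_alias(uint32_t addr)
{
    return addr | SECURE_ALIAS_BIT;
}

/* Returns the non-secure alias of an address from one of those pairs. */
uint32_t to_nonsecure_alias(uint32_t addr)
{
    return addr & ~SECURE_ALIAS_BIT;
}
```

By the same rule, the four failing addresses are two alias pairs: 0x00000000/0x10000000 (ITCM) and 0x20000000/0x30000000 (DTCM).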

Please ST, have a look at this.

filipxsikora · ‎2025-01-04

Update: I have also tried to load the ITCM mem "directly", without the intermediate step of loading from ROM/RAM and this works, surprisingly. When debugging, I can see the SP switch to 0x10000000 and executes the code in ITCM normally. Really, really weird and unpredictable behaviour. Also it is useless.

Edit: Even better, when I load the ITCM directly, it looks like the "connection" to the ITCM gets somehow "enabled" and as long as I don't disconnect the power, I can then do the normal approach with loading the contents to the ITCM from RAM/ROM. And it works. But, when I disconnect the power, reconnect and upload exactly the same code again -> HardFault.

filipxsikora · ‎2025-01-04

@RomainR. Thank you! This works.

I realize that the M55 core is different from the M7 core. But I never knew that the ITCM/DTCM memories need some kind of "init". I've modified your snippet to also init the ITCM memory:

/* Clear I-TCM */
  ldr R0, = 0x10000000
  ldr R1, = 0x10010000
  mov R2, #0
clear_itcm:
  str R2, [R0]
  add R0, R0, #4
  cmp R0, R1
  bcc clear_itcm

And now DTCM and ITCM are both available, readable and accessible. Awesome, thanks again.
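
For reference, the same clearing loop could be written in C roughly as below. This is a sketch, not a replacement for the assembler version: it assumes it runs before any stack or data lives in the range being cleared, and `tcm_zero_fill` is a hypothetical name. The ranges in the comment are the ones used by the snippets in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero-fills [base, base + size_bytes) with 32-bit stores only, mirroring
 * the assembler loops in this thread. */
void tcm_zero_fill(volatile uint32_t *base, size_t size_bytes)
{
    assert(size_bytes % sizeof(uint32_t) == 0u); /* whole words only */
    for (size_t i = 0; i < size_bytes / sizeof(uint32_t); i++) {
        base[i] = 0u;
    }
}

/* Ranges used by the snippets in this thread:
 *   tcm_zero_fill((volatile uint32_t *)0x30000000u, 0x20000u);   DTCM
 *   tcm_zero_fill((volatile uint32_t *)0x10000000u, 0x10000u);   ITCM
 */
```

If the TCM sizes are changed through the FLEXMEM configuration, the sizes passed here would need to change with them.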

tjaekel · ‎2025-01-04

Great!
What is the conclusion? My impression: the TCMs have ECC, always on. On startup, it looks like there is a mismatch between the random RAM content and the ECC bits. So it needs a "clear" cycle to initialize the TCMs before first use (to bring RAM content and ECC into "sync").
Anyway: an "interesting feature" (compared to H7 MCUs).
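
To make that conclusion concrete, here is a toy model in C. It is only an illustration: the check byte is a simple XOR of the data bytes, not the real Cortex-M55 ECC code, and treating a halfword write as a read-modify-write of the whole word is an assumption that happens to match the behaviour reported in this thread. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: one 32-bit word plus a check byte. */
typedef struct {
    uint32_t data;
    uint8_t  check;
} toy_word;

/* XOR of the four data bytes; stands in for the real ECC computation. */
uint8_t toy_check(uint32_t d)
{
    return (uint8_t)(d ^ (d >> 8) ^ (d >> 16) ^ (d >> 24));
}

/* Full-word write: data and check are both replaced, nothing is read. */
void toy_write32(toy_word *w, uint32_t value)
{
    w->data = value;
    w->check = toy_check(value);
}

/* Read: fails (returns false) when stored data and check disagree. */
bool toy_read32(const toy_word *w, uint32_t *out)
{
    if (toy_check(w->data) != w->check) {
        return false; /* would surface as an ECC fault on hardware */
    }
    *out = w->data;
    return true;
}

/* Halfword write modelled as read-modify-write: the read can already fail. */
bool toy_write16_low(toy_word *w, uint16_t value)
{
    uint32_t old;
    if (!toy_read32(w, &old)) {
        return false;
    }
    toy_write32(w, (old & 0xFFFF0000u) | value);
    return true;
}
```

In this model a word with power-up garbage fails on both a read and a 16-bit write, while a single 32-bit write makes every later access succeed, which is the pattern seen with the clearing loop.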

tjaekel · ‎2025-01-04

I cannot stop thinking about this "issue". I want to understand completely what is going on. What I come across is this:

Yes, you have to initialize DTCM and ITCM first (before you read back anything or execute code):
This seems to be done by initializing (writing) N words (assuming it is related to the row length for the ECC calculation, complete "pages" have to be written so that ECC is "in sync").
In my experience, it is important to write 32-bit values (not uint16_t), assuming the ECC "update" is done only on 32-bit write cycles (otherwise it still crashes with HardFault_Handler).
I can initialize and use both, secure and non-secure memories (DTCM, ITCM): a bit strange, because I was expecting that my main() code runs in DEV boot option also as non-secure (but it seems to be executed as secure or TrustZone is disabled in DEV debug mode). Never mind: in DEV boot my main() can access both memory regions (secure and non-secure).
ATT: when the GCC compiler sees an access to address 0x00000000 - it will generate a UDF instruction (undefined instruction trigger). This triggers a HardFault_Handler. The code looks completely different (with UDF) when GCC sees access to address 0x00000000.
(I compile and debug in assembly code mode, with no debug and -O3)
But: I can write to address 0x00000000 and I can even execute code on this address (as Thumb code, +1).
It is just a bit tricky to tell the compiler that 0x00000000 is a valid memory address:
It creates a warning for "index out of range" when the base pointer is 0. It creates a UDF instruction when the compiler sees that address 0x00000000 is used (for any operation that reads from it, but not for writing).
I guess: the GCC watches if you try to access 0x00000000. If so: it generates code with UDF instruction (resulting in HardFault_Handler). It assumes that address 0x00000000 is for SP value, not for any instruction or data (assuming the vector table is there).
It could become important when you load code on address 0x00000000 (ITCM) and you want to let it execute from there.

Here is the test code I use (in main(), booting in DEV mode, i.e. not reading and not booting this code from the external octal SPI flash):

  {
	  int i;
	  uint32_t* _dtcmData = (uint32_t*)0x30000000;	//0x30000000 is secure memory, 0x20000000 is non-secure - both OK
	  for (i = 0; i < 512; i++)
		  _dtcmData[i] = 0x1234;
	  if (_dtcmData[0] == 0x1234)
		  testVar = 1;
	  uint32_t* _itcmData = (uint32_t*)0x00000000;	//0x00000000 is code non-secure, 0x10000000 is code secure - both OK
	  for (i = 0; i < 512; i++)
		  _itcmData[i] = 0x47702001;
	  ////if (_itcmData[0] == 0x1234)				//HardFault_Handler - WHY? --> it is an UDF instruction generated when 0x00000000!
	  if (_itcmData[1] == 0x47702001)				//THIS WORKS! compiler generates warning about pointer = 0 and any index in array
		  testVar = 2;
	  /* it works with address 0x00000004 - but not with reading from address 0x00000000! - compiler creates UDF instruction! */
	  uint32_t* xPtr = (uint32_t*)0x00000004;		//0x00000000 not possible - it generates UDF instruction!
	  /* but possible to modify register and read from address 0x0 - with debugger */
	  if (*xPtr == 0x47702001)						//0x4770, 0x2001 is MOV R0, #1, BX LR instruction code - see below calling it
		  testVar = 3;

	  {
		  /* this works: code can be executed at address 0x00000000 - but not generated (compiler) */
		  FPTR fptr = (FPTR)0x00000001;				//it must be thumb code address!
		  if (fptr() == 1)
		  {
			  testVar = 4;							//it works!
		  }
	  }
  }
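
A note on the UDF seen with address 0x00000000: it is most likely GCC treating a constant address 0 as a null pointer, whose dereference is undefined behaviour in C, rather than anything specific to the vector table. GCC documents the `-fno-delete-null-pointer-checks` option for targets where address 0 holds valid code or data. A source-level alternative, sketched below with the hypothetical helper `opaque_word_ptr`, is to pass the address through a volatile variable so the compiler cannot prove that it is zero.

```c
#include <stdint.h>

/* Returns a pointer to the given address without letting the optimizer
 * treat it as a known constant (and therefore as a known null pointer). */
volatile uint32_t *opaque_word_ptr(uintptr_t address)
{
    volatile uintptr_t hidden = address; /* value is re-read, not folded */
    return (volatile uint32_t *)hidden;
}

/* On target, reading the first ITCM word might then look like:
 *   uint32_t first = *opaque_word_ptr(0x00000000u);
 * Using the secure alias 0x10000000 instead avoids the problem entirely. */
```

This is a sketch under those assumptions, not a verified fix for this project; checking the generated assembly for the UDF instruction is still the reliable test.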

OK, much clearer now for me (and fine, as long as we have the "constraints" in mind).