2025-02-17 03:03 AM
Hi everyone!
I’m facing random NMI errors in my project using the STM32H563 MCU, and I’m trying to identify the root cause.
System Overview
I am by no means an expert in NMI errors, so I want to ensure I am not overlooking other possible causes.
Could these errors be caused by my custom memory partitioning?
MEMORY
{
  /* RAM     (xrw) : ORIGIN = 0x20000000, LENGTH = 640K */
  RAM1       (xrw) : ORIGIN = 0x20000000, LENGTH = 100K
  RAM_APP    (xrw) : ORIGIN = 0x20019000, LENGTH = 210K
  RAM_NEXDUO (xrw) : ORIGIN = 0x2004D800, LENGTH = 330K
  FLASH      (rx)  : ORIGIN = 0x08000000, LENGTH = 946K
  COLORS     (rx)  : ORIGIN = 0x080EC800, LENGTH = 70K
  ROOT_CA    (rx)  : ORIGIN = 0x080FE000, LENGTH = 4K
  DEV_CERT   (rx)  : ORIGIN = 0x080FF000, LENGTH = 4K
}
How can I debug the exact source of the NMI?
I’d really appreciate any insights, suggestions, or debugging strategies to help pinpoint the issue. Thanks in advance for your help!
I am happy to provide any further details or relevant information!
2025-02-17 03:41 AM - edited 2025-02-17 05:33 AM
Hello,
From your linker file, it seems you are handling some Flash stuff (read/write) in your application:
FLASH ( rx ) : ORIGIN = 0x08000000, LENGTH = 946K
COLORS ( rx ) : ORIGIN = 0x080EC800, LENGTH = 70K
ROOT_CA ( rx ) : ORIGIN = 0x080FE000, LENGTH = 4K
DEV_CERT ( rx ) : ORIGIN = 0x080FF000, LENGTH = 4K
}
So first check whether a Flash ECCD (double ECC error) has been detected:
2025-02-17 05:30 AM
Hello @massimoperdigo,
Also, from searching the RM0481, NMIs are linked to 3 cases:
In each case, you need to monitor the status of the relevant registers when the NMI occurs!
Also, you can check this article in case you'll be using MPU in the future: How to avoid a HardFault when ICACHE is enabled on... - STMicroelectronics Community
2025-02-17 07:36 AM
Hello,
Thank you for your quick response.
Why would I have this type of error?
Would this be one way to check it?
void NMI_Handler(void) {
    // Check if an ECC double error occurred in SRAM2, SRAM3, or BKPSRAM
    // (although I do not have this type of RAM activated)
    if (RAMCFG->MISR & RAMCFG_MISR_DED) {
        uint32_t error_address = RAMCFG->MDEAR; // Read the failing address
        RAMCFG->MICR |= RAMCFG_MICR_CDED;       // Clear the ECC error flag
        SEGGER_RTT_printf(0, "SRAM ECC Double Error at 0x%08X\n", error_address);
    }

    // Check if the NMI was caused by an HSE clock security failure
    // (same as before: I am using HSI)
    if (RCC->CIFR & RCC_CIFR_HSECSSF) {
        RCC->CICR |= RCC_CICR_HSECSSC; // Clear HSE clock security flag
    }

    // Handle FLASH ECC Double Error (ECCD)
    if (FLASH->ECCR & FLASH_ECCR_ECCD) {
        uint32_t error_address = FLASH->ECCDR; // Read failing address from ECCDR
        // Clear the FLASH ECC error flag
        FLASH->ECCR |= FLASH_ECCR_ECCD;
        SEGGER_RTT_printf(0, "FLASH ECC Double Error at 0x%08X\n", error_address);
        // If the error is caused by an unprogrammed OTP read:
        if ((error_address >= 0x08FFF000) && (error_address <= 0x08FFF7FF)) {
            SEGGER_RTT_printf(0, "Virgin OTP read error detected!\n");
        }
    }
}
Thank you!
2025-02-17 07:39 AM
Hello,
@massimoperdigo wrote:
Why would I have this type of error?
Maybe you wrote to the Flash without erasing it first.
Also refer to this article pointed out by @Sarra.S
2025-02-17 07:42 AM
Hello, Sarra,
Thanks for your point of view!
I was wondering how I can monitor the last error:
Thank you!
2025-02-17 07:55 AM
Oh, thank you for this article!
Do you mean writing the Flash when flashing with the debugger?
In the end, everything is stored in Flash using attributes, except for ROOT_CA, whose first 6 bytes are flashed with the J-Link tools like this:
device STM32H563VG
SelectInterface SWD
Speed 4000
erase
; Load application
loadfile build/Program.hex
; Load MAC address binary file (6 bytes)
loadfile build/mac_address.bin, 0x080FE000
; Verify contents in memory
mem 0x080FE000, 20
As you can see, I always do an erase before flashing.
2025-02-19 08:02 AM
Hey Sarra,
Could you please tell me whether I am accessing the registers correctly with this implementation?
Is anything incorrect or missing?
void NMI_Handler(void) {
    // Check if an ECC double error occurred in SRAM2, SRAM3, or BKPSRAM
    // (although I do not have this type of RAM activated)
    if (RAMCFG->MISR & RAMCFG_MISR_DED) {
        uint32_t error_address = RAMCFG->MDEAR; // Read the failing address
        RAMCFG->MICR |= RAMCFG_MICR_CDED;       // Clear the ECC error flag
        SEGGER_RTT_printf(0, "SRAM ECC Double Error at 0x%08X\n", error_address);
    }

    // Check if the NMI was caused by an HSE clock security failure
    // (same as before: I am using HSI)
    if (RCC->CIFR & RCC_CIFR_HSECSSF) {
        RCC->CICR |= RCC_CICR_HSECSSC; // Clear HSE clock security flag
    }

    // Handle FLASH ECC Double Error (ECCD)
    if (FLASH->ECCR & FLASH_ECCR_ECCD) {
        uint32_t error_address = FLASH->ECCDR; // Read failing address from ECCDR
        // Clear the FLASH ECC error flag
        FLASH->ECCR |= FLASH_ECCR_ECCD;
        SEGGER_RTT_printf(0, "FLASH ECC Double Error at 0x%08X\n", error_address);
        // If the error is caused by an unprogrammed OTP read:
        if ((error_address >= 0x08FFF000) && (error_address <= 0x08FFF7FF)) {
            SEGGER_RTT_printf(0, "Virgin OTP read error detected!\n");
        }
    }
}
Thanks
2025-02-19 08:30 AM
Hi, SofLit.
I have tried what Sarra explained in their post:
void NMI_Handler(void)
{
    if ((FLASH->ECCDR && 0xFF))
    {
        // The memory is empty:
        // ECC error due to access to uninitialized memory.
        // Clear the ECCD flag
        FLASH->ECCDETR |= (1 << 31);
    }
    else
    {
        // ECC error detected a true failure
        while (1)
        {
        }
    }
}
However, when the NMI fires, the first condition is not met and my program gets stuck.
On the other hand, when I force the ECCD flag to be cleared unconditionally, the NMI does not seem to occur again:
void NMI_Handler(void)
{
    FLASH->ECCDETR |= (1 << 31);
}
Could you please explain this to me? I do not fully understand it.
Thanks!