2023-07-30 11:55 PM
Hi,
We are developing a device based on the STM32H7B0. We use this microcontroller with external QSPI flash. The internal flash holds our bootloader and the application code resides in the external flash. We are now testing our devices and have had 3 cases in which the internal flash got erased. As a result the devices appear "dead" because there is no code to initialize and start the application from the external flash.
Why does this happen? Nowhere in the bootloader or in the application code do we use instructions to unlock the internal flash, let alone instructions to erase it. Is there any other way for the internal flash to get erased?
Regards, Piotr
2023-08-01 01:22 AM
>found out that only the first sector of the internal flash got erased (first 8KB of flash).
Are we still talking about the STM32H7B0? So, the first 8K of the 128K sector turned into FFs?
2023-08-01 01:58 AM
Exactly. You understood correctly.
2023-08-01 02:32 AM - edited 2023-08-01 02:34 AM
Ok. While waiting for more helpful replies, can you recall anything else? High ambient temperature, EMI, radiation, ultrasound?
Consider the brown-out detector (configured in the option bytes).
2023-08-01 02:37 AM
Not really. The only thing that comes to my mind is power supply problems. That seems very strange to me, because I considered flash memory to be nonvolatile. I would understand it if these supply drops took place during flash operations such as programming or erasing, but nothing like that happens. We erase and program the internal flash once during production and that's all. There is no place in the firmware where we unlock the internal flash to perform such operations.
2023-08-01 05:46 AM
Hi @FBL,
What do you mean by "share your software inbox"?
2023-08-01 06:03 AM - edited 2023-08-01 06:06 AM
Allow private messages to be sent to you: forum settings -> Private messenger -> turn on private messages
2023-08-01 09:14 AM
If you're seeing this a lot, and can induce it relatively easily, it clearly warrants more investigation.
You could eliminate potential causes, say a cascading failure of resetting and unpredictable execution, by having an effective POR with a hard threshold. Say 3.0 or 2.7V below which the MCU is clamped in reset, so no code or operation is possible, at all.
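A hard reset threshold has to come from hardware (an external supervisor or the BOR), but the same idea can be approximated in a bootloader as a rough sketch: do not continue until the supply has stayed above a threshold for several consecutive samples. The thresholds, the sample count, and the availability of a supply reading in millivolts (for example from an ADC) are assumptions for illustration and are not taken from the firmware discussed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values; tune them for the real board and battery. */
#define SUPPLY_OK_MV    3000u  /* readings at or above this count as good   */
#define SUPPLY_BAD_MV   2700u  /* readings below this restart the counting  */
#define STABLE_SAMPLES  5u     /* consecutive good readings required        */

typedef struct {
    uint32_t good_count;       /* good readings seen since the last dropout */
} supply_gate_t;

/* Feed one supply reading (mV); returns true once the rail looks stable. */
static bool supply_gate_update(supply_gate_t *gate, uint32_t supply_mv)
{
    if (supply_mv >= SUPPLY_OK_MV) {
        if (gate->good_count < STABLE_SAMPLES) {
            gate->good_count++;
        }
    } else if (supply_mv < SUPPLY_BAD_MV) {
        gate->good_count = 0u; /* dropout: start counting again */
    }
    /* Readings between the two thresholds leave the count unchanged,
       which gives the gate some hysteresis. */
    return gate->good_count >= STABLE_SAMPLES;
}
```

A bootloader would call `supply_gate_update()` in a loop with fresh readings and only proceed once it returns true. This does not stop the core from executing during a brown-out, so it can complement a real reset threshold but not replace one.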
Unfortunately the firewall / Chinese wall (a metaphor for barriers) against the flash lock/unlock codes being present on the MCU is violated by the System ROM.
Is there any way to instrument or log flash ECC errors? I.e., devices under long-term observation flag issues via a GPIO, which is recorded by external/independent monitoring equipment. Monitor the clamped reset state, and perhaps the voltage if that can be done in a relatively non-invasive way.
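As one possible form of such instrumentation, the application running from external flash could periodically check the bootloader sector and raise a flag when it reads back as all FFs or no longer matches a known checksum. The sketch below shows only the checking logic. The function names, the choice of CRC-32, and the idea of storing an expected CRC somewhere are assumptions for illustration; it does not read the flash ECC status registers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SECTOR_OK, SECTOR_ERASED, SECTOR_CORRUPT } sector_state_t;

/* Returns true if every byte in the region reads as 0xFF (erased flash). */
static bool region_is_erased(const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (data[i] != 0xFFu) {
            return false;
        }
    }
    return true;
}

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320. */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return ~crc;
}

/* Classify a region: erased, matching the expected CRC, or corrupt. */
static sector_state_t check_sector(const uint8_t *data, size_t len,
                                   uint32_t expected_crc)
{
    if (region_is_erased(data, len)) {
        return SECTOR_ERASED;
    }
    return (crc32_calc(data, len) == expected_crc) ? SECTOR_OK
                                                   : SECTOR_CORRUPT;
}
```

On the target, `data` would point at the sector under observation (the thread reports 0x08000000 to 0x08002000 being erased) and any result other than `SECTOR_OK` would drive the GPIO watched by the external logger.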
2023-08-07 01:18 PM
Hi @Tesla DeLorean ,
Thank you for your answer.
Unfortunately the issue is hard to reproduce. I tried plugging and unplugging the device from the power source for several hours and finally managed to induce it.
Using an oscilloscope I plotted the MCU supply voltage. At the end of the battery discharge it goes below 2.6 V. At the very end the voltage starts to oscillate. I guess that during that time the uC starts and stops several times. Sometimes it ends with an erased memory.
As I wrote, I managed to reproduce this issue on my desk. Again the first sector of the microcontroller's internal flash was erased (0x08000000 - 0x08002000). The STM32H7 has an option byte that sets the start address of the program. I changed it to 0x08010000, hoping that even if the issue happened again the erased memory wouldn't matter. Unfortunately I was wrong. The erase happened again and this time the sector at 0x08010000 got erased.
Conclusion: it is not the first sector of the memory but the first sector of the program that gets erased. I guess that while the power is failing the uC reads the first sector of the program and then stops because of the voltage drop, again and again. Somehow this leads to flash memory corruption.
Unfortunately we have already produced thousands of PCBs and they are waiting for soldering. Because of that, no hardware modification is possible.
Is there any hope for us? Any way to bypass this issue?
Thank you for your help,
Piotr
2023-08-07 01:52 PM
You can do a somewhat dumb test - make an absolutely minimal blinky, run it and try to catch the issue. But erase the whole FLASH and put the blinky as the only code on the MCU - no custom bootloaders or anything else.
2023-08-07 02:27 PM
Are there more aggressive BOR / POR settings you can configure in the Option Bytes?
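To make the question concrete, here is a small sketch that picks the lowest BOR level whose threshold is at or above a desired reset voltage. The nominal thresholds in the table (about 2.1 V, 2.4 V and 2.7 V for levels 1 to 3) are approximate values assumed for illustration and should be checked against the datasheet; the helper itself is hypothetical and does not program anything.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t  level;       /* BOR level number                       */
    uint16_t nominal_mv;  /* ASSUMED nominal threshold, millivolts  */
} bor_level_t;

/* Approximate thresholds; verify against the device datasheet. */
static const bor_level_t BOR_LEVELS[] = {
    { 1u, 2100u },
    { 2u, 2400u },
    { 3u, 2700u },
};

/* Returns the lowest BOR level whose nominal threshold is at or above
   min_reset_mv, or -1 if no level is high enough. */
static int pick_bor_level(uint16_t min_reset_mv)
{
    for (size_t i = 0; i < sizeof BOR_LEVELS / sizeof BOR_LEVELS[0]; i++) {
        if (BOR_LEVELS[i].nominal_mv >= min_reset_mv) {
            return (int)BOR_LEVELS[i].level;
        }
    }
    return -1;
}
```

With the supply observed to sag below 2.6 V before it starts oscillating, `pick_bor_level(2600)` selects level 3 under these assumed thresholds. The chosen level would still have to be written to the option bytes, for example during production programming, which is not shown here.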
Most of the defensive design stuff I can think of needs to be done externally. Even if you can't modify more than a handful of boards it might highlight or eliminate a particular cause from being something to investigate further.
I think you should start a conversation with a local ST support engineer to review the issue and find out whether it is more prevalent. I see occasional reports here, so I would agree it's a thing that can and does occur, but I don't have any specific insight into the causes or frequency of the failure.