STM32F417 flash spontaneously erased?

CBrad.2
Associate

Hi,

We have a product in the field using STM32F417ZG, and it was returned to us after it mysteriously stopped working. On receiving the board, we noticed that nothing was running on the STM32F417. We read out the flash memory and it appeared to be erased (0xFFs). Important to note that we don't enable read-out protection. We reprogrammed the MCU and it started working again as normal, and a read-out of the flash was as expected. Does anyone know how this might have occurred?
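One check that can narrow this down is whether the whole array is blank or only some sectors. A full blank is consistent with a mass erase, while a partial blank points more toward a sector erase issued by application code. Below is a minimal sketch for scanning a raw dump on a PC, assuming the 1 MB F40x/F41x sector layout (4 x 16 KB, 1 x 64 KB, 7 x 128 KB) and a dump that starts at 0x08000000. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sector sizes in KB for a 1 MB STM32F40x/F41x part:
 * sectors 0-3 are 16 KB, sector 4 is 64 KB, sectors 5-11 are 128 KB. */
static const uint32_t sector_kb[12] = {
    16, 16, 16, 16, 64, 128, 128, 128, 128, 128, 128, 128
};

/* Return 1 if every byte in [buf, buf + len) reads 0xFF (erased state). */
static int region_is_blank(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0xFF) {
            return 0;
        }
    }
    return 1;
}

/* Return a bitmask with bit n set when sector n of the dump is blank.
 * The dump is assumed to start at the flash base address.
 * Sectors that extend past the end of the dump are not flagged. */
static uint32_t blank_sector_mask(const uint8_t *dump, size_t len)
{
    assert(dump != NULL);

    uint32_t mask = 0;
    size_t offset = 0;

    for (unsigned n = 0; n < 12; n++) {
        size_t size = (size_t)sector_kb[n] * 1024u;
        if (offset + size > len) {
            break;
        }
        if (region_is_blank(dump + offset, size)) {
            mask |= 1u << n;
        }
        offset += size;
    }
    return mask;
}
```

A result of 0xFFF on a full 1 MB dump means all twelve sectors are blank. Any other value shows which sectors still hold data.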

We have also encountered one other device where the read-out protection was enabled but we have never explicitly set it during programming, although we can't rule out that someone enabled it accidentally during initial programming.

We are using the built-in DFU bootloader via USB. Originally we used the DfuSe Demo app, before STM32CubeProgrammer was released.

Thanks,

Chris

11 REPLIES
TDK
Guru

Seems like the most likely scenario is that someone inadvertently erased the flash. If you have this functionality built into your program, it could have been caused by a program bug.

If you feel a post has answered your question, please click "Accept as Solution".
CBrad.2
Associate

There is no ability to do this in our program. The circuit board is also physically secured behind a lock and key.

It's very mysterious. There may be a power outage in the logs but we fail to see how this could erase the flash.

Seem to get sporadic reports of this type of thing.

The ROM System Loader has code capable of doing a mass-erase.

Most probable cause is indeterminate execution at low voltages (core runs at 1.2V, semi-viable below that).

I would make sure BOOT0 is pulled low, and that you have thresholding POR device(s) ensuring all supplies are up and within 5-10% of nominal.
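Alongside an external supervisor, it may be worth checking how the on-chip brown-out reset (BOR) is configured, since the factory default option bytes (FLASH_OPTCR = 0x0FFFAAED) leave it off and only the lower POR/PDR threshold applies. This is a related check, not something mentioned above. A minimal sketch for decoding the BOR_LEV field from an FLASH_OPTCR value read back from a unit, with helper names invented for this example:

```c
#include <assert.h>
#include <stdint.h>

/* BOR_LEV occupies bits [3:2] of FLASH_OPTCR on STM32F4. */
#define OPTCR_BOR_LEV_SHIFT 2u
#define OPTCR_BOR_LEV_MASK  0x3u

/* Field encoding: 00 = level 3 (highest threshold), 11 = BOR off. */
typedef enum {
    BOR_LEVEL3 = 0,
    BOR_LEVEL2 = 1,
    BOR_LEVEL1 = 2,
    BOR_OFF    = 3
} bor_level_t;

/* Extract the BOR level from a raw FLASH_OPTCR value. */
static bor_level_t optcr_bor_level(uint32_t optcr)
{
    return (bor_level_t)((optcr >> OPTCR_BOR_LEV_SHIFT) & OPTCR_BOR_LEV_MASK);
}

/* Return 1 when any BOR threshold is active, 0 when BOR is off. */
static int optcr_bor_enabled(uint32_t optcr)
{
    return optcr_bor_level(optcr) != BOR_OFF;
}
```

If this reports BOR off on the returned units, the core can keep executing while VDD sags toward the POR/PDR threshold. The exact threshold voltages for each level are in the datasheet.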

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..
MHAJJ
Associate II

Hello CBrad,

Were the option bytes corrupted too?

A mass erase is triggered when switching the RDP level from 1 to 0, so maybe that.
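To make the option-byte check concrete: on the F4 the RDP byte sits in bits [15:8] of FLASH_OPTCR, where 0xAA means Level 0, 0xCC means Level 2, and every other value means Level 1. That also means a single corrupted bit in a 0xAA byte reads back as Level 1, which may be relevant to the unit that came back with read-out protection set. Below is a rough sketch of a decoder for a value read back from a returned unit. The function and type names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { RDP_LEVEL0, RDP_LEVEL1, RDP_LEVEL2 } rdp_level_t;

/* Decode the read-out protection level from a raw FLASH_OPTCR value.
 * RDP is bits [15:8]: 0xAA = Level 0, 0xCC = Level 2, anything else = Level 1. */
static rdp_level_t optcr_rdp_level(uint32_t optcr)
{
    uint8_t rdp = (uint8_t)((optcr >> 8) & 0xFFu);

    if (rdp == 0xAA) {
        return RDP_LEVEL0;
    }
    if (rdp == 0xCC) {
        return RDP_LEVEL2;
    }
    return RDP_LEVEL1;
}

/* Return a bitmask of write-protected sectors (bit n = sector n).
 * nWRP is bits [27:16] and is active low: a 0 bit means protected. */
static uint32_t optcr_write_protected_sectors(uint32_t optcr)
{
    return (~(optcr >> 16)) & 0xFFFu;
}
```

With the factory default 0x0FFFAAED this gives Level 0 and no protected sectors. A unit that reads 0x0FFFABED, for example, would decode as Level 1.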

Regards,

Mahmoud.

PChao.2
Associate II

Have you ever figured out what caused the flash erase on your STM32F417ZG? We are seeing the same problem now: several customers are returning batches of our products, which use the STM32F407VGT6, under RMA. The returned units show the exact same issue, with the flash memory reading all 0xFF. Reloading the firmware brings the board back to life, so it's an easy fix.

About 80% of the RMA units coming back to us have this issue. It's an easy thing for us to deal with, but customers can't load the FW without an SWD interface connection. The board designer also never exposed BOOT0 to a button, so a field-programmable option via USB is not possible. We have added a BOOT0 switch to all the new boards we design to address this problem, but we still have thousands of these old board designs out there and no easy solution for the end users to do their own FW loading.

I wish we understood how it could happen out in the field.

In our case, we have a software interface program that updates the flash data via USB HID.  Maybe an invalid report write could erase the flash?  I don't understand how the entire flash memory could be written to 0xFF though.
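One way to rule out the HID path is to check that the firmware refuses any flash request that falls outside the data area. Below is a minimal sketch of such a guard. Everything in it is an assumption made for illustration: the request layout, the key value, and the choice of sector 11 (0x080E0000 to 0x080FFFFF) as the data region are not taken from our actual firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: user data lives in sector 11 of a 1 MB part. */
#define DATA_REGION_START 0x080E0000u
#define DATA_REGION_END   0x08100000u /* exclusive */
#define CMD_WRITE_DATA    0x01u       /* hypothetical command code */
#define WRITE_KEY         0x5AFEC0DEu /* hypothetical unlock key */

/* Hypothetical request, already parsed from an incoming HID report. */
typedef struct {
    uint8_t  cmd;
    uint32_t key;
    uint32_t addr;
    uint16_t len;
} flash_request_t;

typedef enum { REQ_OK, REQ_BAD_CMD, REQ_BAD_KEY, REQ_BAD_RANGE } req_status_t;

/* Accept a request only if the command is known, the key matches,
 * and [addr, addr + len) lies entirely inside the data region. */
static req_status_t validate_flash_request(const flash_request_t *req)
{
    assert(req != NULL);

    if (req->cmd != CMD_WRITE_DATA) {
        return REQ_BAD_CMD;
    }
    if (req->key != WRITE_KEY) {
        return REQ_BAD_KEY;
    }
    if (req->len == 0 ||
        req->addr < DATA_REGION_START ||
        req->addr >= DATA_REGION_END) {
        return REQ_BAD_RANGE;
    }
    /* Compare against the remaining space to avoid overflow in addr + len. */
    if (req->len > DATA_REGION_END - req->addr) {
        return REQ_BAD_RANGE;
    }
    return REQ_OK;
}
```

A guard like this only covers erases and writes issued by the application. It would not stop a mass erase coming from the system bootloader or from an RDP regression.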

IMHO it is likely caused by RDP regression from Level 1, which could in turn be triggered by tampering with the option bytes?

And these things had worked properly at some point since delivery?

Any commonality in the failure use cases? Brands of power supply, method of supply, location?

Failure during other update or maintenance procedure?

I do encounter this occasionally on L4 parts I get back, but the incidence is very low. Like I said, there is code on these devices that can mass-erase them, and there's not a good interlock to preclude it. Other devices I work with need the loaders to be furnished with keys externally to unlock flash, but there's no such firewall here.

The majority of the devices probably work fine (or at least we don't get any RMAs for them). So far, we have received maybe 200 or so back from the 4,000 units out there. About 160 of these 200 units had this problem. It's less than 5% of the total units we manufactured and sold. So, it's not too horrible but it's still concerning that we have this issue at all.

The majority of the RMAs came from the same research group that conducts field studies out in the national forests in California. They have used the devices since 2019 and sent the units back in 2021 and 2022. So, the devices have been working well for many years. Maybe something to do with the last software that interfaced with the device via USB HID protocol. Not sure how it would trigger a device erase though.

Our new devices are using the L4 and we haven't seen this issue.  We are using the same USB HID connection protocol.

In any case, the F4 is now a legacy product so it's not a big deal but the problem just makes me curious what the root cause could be.