STM32F756NG - cannot write every other 128 bits of flash?

anotherandrew · 2019-05-01

I have multiple prototypes using the STM32F756NGH6. Some (4/10) have an odd STM32 flash failure where I can erase flash and I can write flash, but only the first 128 bits of every 256 bits. The "upper" 128 bits of flash are all 1s and cannot be written.

Using OpenOCD with a J-Link Pro, this is what the first bit of flash looks like:

> mdw 0x200000 0x40
0x00200000: 20050000 00201129 00201f49 00201ef9 ffffffff ffffffff ffffffff ffffffff
0x00200020: 00000000 00000000 00000000 00204eb1 ffffffff ffffffff ffffffff ffffffff
0x00200040: 00201179 00201179 00201179 00201179 ffffffff ffffffff ffffffff ffffffff
0x00200060: 00201179 00201179 00201179 00201179 ffffffff ffffffff ffffffff ffffffff
0x00200080: 00201179 00201179 00201179 00201179 ffffffff ffffffff ffffffff ffffffff
0x002000a0: 00201179 00201179 00201179 00201179 ffffffff ffffffff ffffffff ffffffff
0x002000c0: 00201179 00201179 00201179 00201179 ffffffff ffffffff ffffffff ffffffff
0x002000e0: 00201179 00201179 00201179 00201179 ffffffff ffffffff ffffffff ffffffff

The lower four 32-bit words of each row are correct (this is just the vector table for the application, which is why so many ISRs point to 0x00201179), but the upper four words are all 0xffffffff. This pattern (unable to write to the upper 128 bits of every 256 bits) persists throughout the entire 1MB region.
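To check whether the pattern really holds across a larger dump, the `mdw` output can be scanned with a short script. The following is only a minimal sketch: the helper names `parse_mdw` and `blank_upper_halves` are made up for illustration, and it assumes the dump uses the eight-words-per-line layout shown above with each line starting on a 32-byte boundary.

```python
import re

# Matches OpenOCD "mdw" lines such as "0x00200000: 20050000 00201129 ..."
LINE_RE = re.compile(r"^(0x[0-9a-fA-F]+):\s+((?:[0-9a-fA-F]{8}\s*)+)$")

ERASED = 0xFFFFFFFF


def parse_mdw(text):
    """Parse mdw output into a list of (address, [32-bit words]) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            addr = int(m.group(1), 16)
            words = [int(w, 16) for w in m.group(2).split()]
            rows.append((addr, words))
    return rows


def blank_upper_halves(rows):
    """Return addresses of 256-bit rows whose lower 128 bits hold data
    while the upper 128 bits still read as erased (all ones)."""
    bad = []
    for addr, words in rows:
        # Only consider complete, 32-byte-aligned rows (assumed dump layout).
        if len(words) != 8 or addr % 32 != 0:
            continue
        lower, upper = words[:4], words[4:]
        if all(w == ERASED for w in upper) and any(w != ERASED for w in lower):
            bad.append(addr)
    return bad


DUMP = """\
0x00200000: 20050000 00201129 00201f49 00201ef9 ffffffff ffffffff ffffffff ffffffff
0x00200020: 00000000 00000000 00000000 00204eb1 ffffffff ffffffff ffffffff ffffffff
"""

print([hex(a) for a in blank_upper_halves(parse_mdw(DUMP))])
```

A row that is fully erased is not reported, so untouched flash beyond the end of the image does not count as a failure.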

Suspecting poor soldering, I reflow the STM32, taking the time to use a board heater to get the 6 layer board up to a reasonable temperature to ensure good solder joints. Most times this does not work, and sometimes it causes RDP level 1 (read protection) to become enabled. Other times it does nothing. I can usually undo the read protection, which allows me to try to write again, but the same failure occurs.

If I replace the STM32 with a new one then the problem may or may not be present.

I fear I am damaging these devices when soldering them, but I'd like to know if anyone has seen this issue before.

A few notes about the specific wiring of the STM32:

All supplies are 3.3V, well-bypassed with 100nF 100V X5R capacitors (less than 50mV ripple on the 3.3V supply)
Due to an Altium vault component error, I have mistakenly wired PDR_ON (ball E5) to GND and BYPASS_REG (ball L5) to 3.3V. I have corrected this by using a local 1.2V supply for the nearby FPGA on VCAP_1 and VCAP_2
Also since BYPASS_REG is disabling the internal regulator, I have PA0 strapped to 3.3V as per 2.18.2 in the datasheet.
I'm using a 12MHz CMOS oscillator for clock on PH0 (ball G1). There is no 32kHz oscillator.
I'm using SWD for all debug and flash operations

Any suggestions or hints or tests you'd like me to run to try to correct this would be very much appreciated.

Thanks,

-A.

waclawek.jan · 2019-05-01

Sounds out of normal, but so does

> Due to an Altium vault component error, I have mistakenly wired PDR_ON (ball E5) to GND and BYPASS_REG (ball L5) to 3.3V.

> I have corrected this by using a local 1.2V supply for the nearby FPGA on VCAP_1 and VCAP_2

I don't pretend I understand what is going on. Just some random thoughts:

are all ground pins connected (including analog ones)?
the 1.2V is exactly how much, including ripple?
has correct reset been applied, when all power supplies were on?
does the programming application attempt to run the MCU at a frequency higher than 144MHz? (see V12 in the General operating conditions table for Regulator OFF, in the datasheet)
what parallelism is used for programming?

JW

anotherandrew · 2019-05-02

Yes, all supply pins are connected; all supplies (analog, digital, vbat, vref) are connected to 3.3V. Supply ripple on the 3.3V rail is under 50mV pk-pk, and under 10mV on the 1.2V rail. I do have a proper reset on NRST, but PA0 (which is the 1.2V domain reset) is just connected to 3.3V. This isn't necessarily right, but I did put a slow RC on it to try to give an actual sequenced reset, and it made no difference in behaviour.

The devices have never been (successfully) programmed; this was initial bringup and programming using a Segger J-Link Pro. I use both the J-Link Commander executable as well as using the J-Link as a "dumb" SWD interface with OpenOCD. I'm not sure what parallelism is being used by their RAM loaders, but they work on all the other boards so I suspect it's not a part issue.
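For reference, the programming parallelism is selected by the PSIZE field of FLASH_CR, so the value a loader left behind can in principle be read back with the debugger after a programming attempt. The snippet below is a rough sketch rather than a verified procedure: it assumes PSIZE sits in bits [9:8] of FLASH_CR and that the register is at 0x40023C10, both of which should be checked against the reference manual for this part. The function name `program_width_bits` and the sample register value are invented for illustration.

```python
# Assumed layout (verify against the reference manual): FLASH_CR.PSIZE = bits [9:8]
PSIZE_SHIFT = 8
PSIZE_MASK = 0x3

# PSIZE encoding -> programming width in bits
PSIZE_BITS = {0b00: 8, 0b01: 16, 0b10: 32, 0b11: 64}


def program_width_bits(flash_cr):
    """Decode the programming parallelism from a raw FLASH_CR value."""
    return PSIZE_BITS[(flash_cr >> PSIZE_SHIFT) & PSIZE_MASK]


# Hypothetical value as it might be read with "mdw 0x40023C10":
# PG (bit 0) set and PSIZE = 0b10.
print(program_width_bits(0x00000201))  # 32
```

If a loader used a width that the supply configuration does not support, that would be one thing to rule out, although it would not by itself explain why only some boards are affected.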

I did receive some help from an ST FAE. He advised making sure that the 1.2V rail does not come up before 3.3V, to prevent current injection which may damage the chip. My 1.2V is sourced from the 3.3V supply so it's unlikely to come up first, although it is entirely possible that it comes up before the 3.3V has reached its minimum voltage.

What is interesting is that after digging out my old STLinkv2 and upgrading its firmware so it works with the STM32CubeProgrammer, I was able to get the last two stubborn boards to work. Both were showing the weird flashing issue described in my original post, sometimes showing up as read locked. The STLinkv2 was able to unlock and program one board right away; the other would not work even after another reflow attempt. That second board would be read-protected and would not unlock. I let the board sit powered but ignored for an hour and tried again... it unlocked, but had the flash issue. I cycled power a few times but it wouldn't budge. I left it again for another while (probably 40-60 minutes, I wasn't paying attention), unlocked it, and this time it accepted the programming.

I can't explain it. They're both stable now (they aren't reverting to this weird behaviour, even after power cycling), they accept new programming... Who knows?