2024-04-23 03:41 AM
Hi all,
We have a product containing the STM32G030K8T6 MCU. The code is working as expected and the product meets all specifications. The product is battery powered and can be charged via USB. So far so good.
Once assembled, each product goes into a burn-in test in which 70-plus units are charged and discharged between 3 and 10 times. During this test 1 to 3 units fail, and by failing I mean the MCU simply stops operating.
It stops at the following line of code:
We found a lot of high dV/dt on the power line and think this is causing the MCU to hang, although we cannot reproduce it on a single unit.
If we test products at random on the bench or at our desks, they don't show any issues.
Currently we are at a dead end. Does anybody have any suggestions?
2024-04-23 03:50 AM
Make sure HardFault_Handler and Error_Handler output actionable data and assert a GPIO or LED, rather than just spinning silently in while(1).
Make sure you're immediately ready to deal with interrupts coming from the TIM or other sources as you initialize pins and clocks.
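One possible way to make HardFault_Handler report something actionable is to decode the eight-word frame that a Cortex-M core pushes onto the stack on exception entry, so the faulting PC and LR can be logged or kept in RAM across a reset. The sketch below covers only that decoding step. The names fault_frame_t, fault_frame_capture and fault_frame_format are hypothetical, and the small assembly shim that passes the active stack pointer into C, as well as the output channel (UART, LED blink code, no-init RAM), are assumptions left to the project.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Registers stacked automatically by a Cortex-M core on exception entry,
   in memory order (lowest address first). */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} fault_frame_t;

/* Hypothetical helper: copy the stacked frame that 'sp' points to.
   'sp' is the stack pointer value at exception entry, handed over by
   the fault handler's entry code (not shown here). */
static fault_frame_t fault_frame_capture(const uint32_t *sp)
{
    fault_frame_t f = { sp[0], sp[1], sp[2], sp[3],
                        sp[4], sp[5], sp[6], sp[7] };
    return f;
}

/* Hypothetical helper: render the most useful fields as one text line.
   Returns the snprintf result (characters that would be written). */
static int fault_frame_format(const fault_frame_t *f, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "HF pc=%08lX lr=%08lX xpsr=%08lX r0=%08lX",
                    (unsigned long)f->pc, (unsigned long)f->lr,
                    (unsigned long)f->xpsr, (unsigned long)f->r0);
}
```

Looking up the reported pc in the map or listing file shows which instruction faulted, and lr shows where it was called from. That is usually enough to tell a genuine fault from a loop that is merely waiting on something.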
2024-04-23 04:20 AM - edited 2024-04-23 04:23 AM
What development tools do you have?
It should be possible to connect to a running target without downloading & resetting:
@PLER wrote: During this test 1 to 3 products are failing
Is it consistently the same 1-3 units that exhibit this "hang"?
Is it consistent where in the test sequence the "hangs" start occurring?
@PLER wrote: We found a lot of high dV/dt on the powerline and think this is causing the MCU to hang
Have you tried addressing that?
i.e., if you prevent the high dV/dt, do the failures stop?
@PLER wrote:products will be charged and de-charged between 3 to 10 times.
So, apart from high dV/dt, you could also be having brown-outs? Have you looked into that?
EDIT:
What does it take to recover from a "hang" - just a reset? a full power-cycle? other?
Please use this button to properly post source code:
2024-04-23 05:51 AM
In line 423, are you sure that pin named STATUS_LED_PIN should be in mode AF1_TIM3?
2024-04-23 07:25 AM
@PLER wrote: During this test 1 to 3 products are failing
Is it consistently the same 1-3 units that exhibit this "hang"?
Is it consistent where in the test sequence the "hangs" start occurring?
It is random, there is no pattern.
@PLER wrote: We found a lot of high dV/dt on the powerline and think this is causing the MCU to hang
Have you tried addressing that?
i.e., if you prevent the high dV/dt, do the failures stop?
@PLER wrote:products will be charged and de-charged between 3 to 10 times.
So, apart from high dV/dt, you could also be having brown-outs? Have you looked into that?
EDIT:
What does it take to recover from a "hang" - just a reset? a full power-cycle? other?
If I apply high dV/dt to a single product, nothing happens to it. The problem only arises at the burn-in facility (which is at a different location), where the products are connected as a group to the same mains.
In our lab I built a test setup with the same simple USB charger. Interrupting the mains at irregular intervals generates spikes that propagate through the power path into the system. The ringing is visible on the Vdd of the MCU. Apart from the ringing, the supply voltage is flat and stable.
Once the MCU is in the hung state, the only way to recover is to re-flash the code. Neither resetting nor power-cycling the MCU helps.
The error occurs at random and no pattern is visible, which makes it very hard to track down.
The burn-in cannot be changed.
2024-04-23 07:45 AM
How are option bytes set? Namely, BOR_EN and BOR_LEV bits.
JW
2024-04-23 07:54 AM
@PLER wrote: In case the MCU comes into a hang state the only way to get out is by re-flashing the code. Resetting or power-cycling the MCU are not doing the job.
Can you read back the flash content of a "hung" chip?
If so, does it verify against the original code image?
Does the code write to flash?
2024-04-23 08:43 AM
Yes, we can read back. In one reported case there was a mismatch between the read-back data and the original file.
Yes, it writes to flash.
2024-04-23 09:19 AM - edited 2024-04-23 09:23 AM
BOOT0 pulled LOW or Floating?
Hung? Like a Latch-Up condition, or inexplicably dead in a while(1) loop, or in an infinite IRQ storm because the source hasn't been cleared? Ponder how you might determine this.
2024-04-23 09:22 AM - edited 2024-04-23 09:26 AM
Can you be more specific?
How much different? Which bytes, how many, an entire page/sector?
Do you see this same self-destructive situation if you remove the flash writing for the purposes of diagnosing / bisecting the failure mode?
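To answer the "which bytes, how many, an entire page" questions, the read-back dump can be compared against the original image on a PC. The following is a minimal host-side sketch, assuming both images are already loaded into memory as raw binary of equal length. The names image_diff_t and image_diff are hypothetical, and the 2048-byte constant reflects the STM32G0 flash page (erase unit) size.

```c
#include <stddef.h>
#include <stdint.h>

#define FLASH_PAGE_BYTES 2048u  /* STM32G0 flash page (erase unit) size */

/* Summary of how a read-back dump differs from the reference image. */
typedef struct {
    size_t first_offset;   /* offset of first differing byte (if any)     */
    size_t last_offset;    /* offset of last differing byte (if any)      */
    size_t mismatches;     /* number of differing bytes                   */
    size_t pages_touched;  /* number of distinct pages with a difference  */
    size_t erased_like;    /* differing bytes that read 0xFF in the dump  */
} image_diff_t;

/* Hypothetical helper: compare 'len' bytes of reference and dump. */
static image_diff_t image_diff(const uint8_t *ref, const uint8_t *dump,
                               size_t len)
{
    image_diff_t d = {0, 0, 0, 0, 0};
    size_t last_page = (size_t)-1;  /* no page counted yet */

    for (size_t i = 0; i < len; i++) {
        if (ref[i] == dump[i])
            continue;
        if (d.mismatches == 0)
            d.first_offset = i;
        d.last_offset = i;
        d.mismatches++;
        if (dump[i] == 0xFFu)
            d.erased_like++;
        /* Offsets are visited in ascending order, so a change of page
           index means a page that has not been counted before. */
        size_t page = i / FLASH_PAGE_BYTES;
        if (page != last_page) {
            d.pages_touched++;
            last_page = page;
        }
    }
    return d;
}
```

A whole page reading 0xFF would point towards an interrupted erase, while a handful of altered bytes inside otherwise intact pages would suggest an interrupted or stray program operation. Either pattern narrows down whether the firmware's own flash-write routine is involved.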