
The MCU requires NRST to be toggled after power on

RobertK
Associate III

Hi all, I have a problem with a new version of an existing PCB using the STM32F334C8T6 microcontroller.

Known working code is loaded onto the new PCB, power is applied, and the MCU starts up the clock and then seems to sit idle. If the NRST pin is then driven low and released, the MCU starts up and begins operating normally. Holding NRST low on power up and then releasing it after a delay does not work; the MCU must attempt to start up and fail first. NRST is connected to +3V3 through a 10kΩ resistor and to 0V through a 100nF capacitor. BOOT0 is pulled down to 0V through a 10kΩ resistor.

PXL_20230922_103355658.jpg

The signals here (top to bottom) are:
old PCB +3V3
old PCB NRST
new PCB +3V3
new PCB NRST
Both boards were powered at the same time, but the new board has an additional SMPS stage, which adds a small delay to the rise of the +3V3 rail.

PXL_20230927_100625674.jpg

The signals here (top to bottom) are:
old PCB NRST
old PCB 8.000MHz crystal
new PCB NRST
new PCB 8.000MHz crystal
I've gone through a couple of stages of being suspicious of the clock: it gets enabled by the code, but then the code seems to stall. However, the scope traces don't show any significant problems. I've also compared the clock signal after the reset, with the MCU running code, and it looks identical to the above scope trace where the MCU is stalled.

The new PCB does have a different 8.000MHz crystal so I may have got the load capacitors wrong. The new board is using this crystal from JLC with 2x 22pF 0402 C0G load capacitors. Is this correct? Should I be using a different value?
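For readers checking the same question, here is a minimal sketch of the usual load-capacitor arithmetic: the crystal sees the two external capacitors in series, plus stray capacitance. The function names are illustrative, and the 4 pF stray figure used in the note below is an assumption (it depends on the layout and the MCU pins), not a measured value from this board.

```c
#include <assert.h>

/* Effective load capacitance seen by the crystal, in pF.
 * c1_pf, c2_pf : the two external load capacitors
 * c_stray_pf   : PCB trace + MCU pin capacitance (assumed, board dependent) */
double crystal_load_pf(double c1_pf, double c2_pf, double c_stray_pf)
{
    assert(c1_pf > 0.0 && c2_pf > 0.0 && c_stray_pf >= 0.0);
    return (c1_pf * c2_pf) / (c1_pf + c2_pf) + c_stray_pf;
}

/* Inverse problem: value of two equal capacitors that presents
 * the crystal's specified load cl_pf, given the same stray estimate. */
double matched_cap_pf(double cl_pf, double c_stray_pf)
{
    assert(cl_pf > c_stray_pf);
    return 2.0 * (cl_pf - c_stray_pf);
}
```

With an assumed 4 pF of stray capacitance, `crystal_load_pf(22.0, 22.0, 4.0)` gives 15 pF, while `matched_cap_pf(8.0, 4.0)` gives 8 pF per capacitor for a crystal specified at Cload = 8 pF, the figure reported later in this thread.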

There are other minor differences between the two boards, but nothing that would suggest this issue. For example, some pins no longer have a 1kΩ resistor linking them, even though one pin was never enabled/driven in the code.

I'm running out of hair to tear out, so I welcome any suggestions of what the issue might be or further debugging steps. Thanks!

38 REPLIES

I haven't had time to go through the whole thread yet. At first I only noticed the layout, which reminded me a lot of similar cases of non-functioning systems.

In order to give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.
RobertK
Associate III

Hi Peter BENSCH,

Again, I'm extremely leery of claims that my PCB layout is the source of my problems. While I've not slavishly followed AN2867, I feel I've got the gist of it, and it's 100 times better than what was previously designed. Also, to confirm: this is the HSE oscillator, 8MHz.

After my previous post I bodged the crystal from my predecessor's design, an HC-49/U-S SMD package, onto the crystal pads of a working board. It has been tested to start up on power on with 8.2pF, 15pF and 47pF. I assume it'd work with the intermediate capacitors too. In my mind this rules out problems with the rest of the board, and with the layout.

The biggest difference after the package size seems to be the ESR. I've been using 120Ω and 500Ω crystals with minimal success, but the 60Ω crystal just seems to work regardless. So my takeaway so far is that you need crystal Cload to be around 16-18pF and ESR to be <100Ω. Unfortunately that doesn't seem to be a combo that's available in the SMD package size of 3.2x2.5mm.

> The biggest difference after the package size seems to be the ESR.

 

Very interesting. This is the crystal used in the NUCLEO-F334R8: ESR 50Ω (for 8MHz). So that's another data point in support.

 

AN2867 talks about negative resistance and the recommended safety factor for ESR based on it, but I've not found where the negative resistance of the internal oscillator circuit is actually specified in the DS. AN2867 only describes a procedure for measuring it.

 

> So my takeaway so far is that you need crystal Cload to be around 16-18pF and ESR to be <100Ω.

> Unfortunately that doesn't seem to be a combo that's available in the SMD package size of 3.2x2.5mm.

 

Abracon seems to offer several in 5x3.2mm, which is still much smaller than the previous design.

 

 

- If someone's post helped resolve your issue, please thank them by clicking "Accept as Solution".
- Please post an update with details once you've solved your issue. Your experience may help others.
Peter BENSCH
ST Employee

I have recalculated for the STM32F334: assuming the specifications in the Chinese data sheet are correct (120 ohms, C0=7pF, CL=6...20pF), you should have sufficient margin for it to run stably even with 20pF capacitors, assuming CL=20pF. This always assumes the layout is OK and the crystal has the promised parameters. With unchanged capacitors, an ESR of up to 500 ohms should still work, though smaller is always better.
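For anyone wanting to repeat this kind of check, a minimal sketch of the AN2867 gain-margin calculation follows: gm_crit = 4 · ESR · (2πF)² · (C0 + CL)², and the oscillator transconductance gm divided by gm_crit should exceed 5. The function names are illustrative. The gm value is not stated anywhere in this thread, so it is left as a parameter to be read from the datasheet's HSE oscillator characteristics; the 10 mA/V used in the note below is purely a placeholder assumption.

```c
#include <assert.h>

#define PI 3.14159265358979323846

/* Critical transconductance in A/V, per AN2867:
 * gm_crit = 4 * ESR * (2*pi*F)^2 * (C0 + CL)^2
 * esr_ohm in ohms, f_hz in Hz, c0_f and cl_f in farads. */
double gm_crit(double esr_ohm, double f_hz, double c0_f, double cl_f)
{
    assert(esr_ohm > 0.0 && f_hz > 0.0);
    double w = 2.0 * PI * f_hz;   /* angular frequency */
    double c = c0_f + cl_f;       /* shunt + load capacitance */
    return 4.0 * esr_ohm * w * w * c * c;
}

/* Gain margin = gm / gm_crit. AN2867 asks for a value above 5.
 * gm_a_per_v must come from the MCU datasheet (assumed input here). */
double gain_margin(double gm_a_per_v, double esr_ohm, double f_hz,
                   double c0_f, double cl_f)
{
    return gm_a_per_v / gm_crit(esr_ohm, f_hz, c0_f, cl_f);
}
```

With the figures quoted above (120 ohms, 8MHz, C0=7pF, CL=20pF), `gm_crit` comes out at roughly 0.88 mA/V. With the placeholder gm of 10 mA/V that would be a margin of about 11, but the real margin depends on the datasheet gm value.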

Even if you have serious doubts: the guard ring mentioned by @BarryWhit is an important part of a stable oscillator. The guard ring should surround the crystal as much as possible and be underlaid by the separate GND area shown in AN2867. Both together must then be connected to the nearest GND pin of the STM32, not necessarily to two GND pins. This procedure at least prevents many possible faults. Differential routing of the crystal tracks is good, but not necessary, as these are not steep-edged digital signals.

What do you think about putting a crystal and its CL on a small test board that follows the rule mentioned before (if necessary, you can realise the underlaying GND with some copper foil or solder and connect the guard ring to it with several vias along the ring) and connecting this board to pins 5, 6 and 8 with short wires (of course cutting the tracks to the original crystal)?

BarryWhit
Lead II

A less solid but possibly easier (or less destructive) test may be simply to eliminate (Hold reset/disconnect VDD/remove) any switching chips from the board (perhaps large decaps for them as well?), and see if the problem disappears. If it does, you've learned something, if it doesn't, you don't really know. 

 

If the assumption is that dI/dt from switching currents in the ground plane beneath the crystal package is causing issues, perhaps eliminating the elements that generate those currents from the board might cure the issue and serve as proof.

 

You could make the argument that the OP description is consistent with this. Power-up means charging up the decaps, for example, which can cause large currents right at power-up. By the time you toggle reset a few seconds later, the transient currents from startup have settled down, so the environment is quieter. Admittedly, this is a rather "hand-waving" argument.

RobertK
Associate III

Trying a 5.0x3.2mm crystal is a nice idea, BarryWhit, and I would love to try it. However, I have run out of time and will have to put the research/experimentation down and go with the old, known working, HC-49/U-S package crystal. An unsatisfactory end, but I need working boards sooner rather than later. I may be able to come back to this later in 2025.

Thanks for running the calculations for my original replacement crystal, Peter BENSCH. I tested it and 22pF didn't work at all: 30/30 failures. I too originally assumed the crystal had a Cload of ~20pF but, as I've mentioned a few times, Cload is 8pF. Even with this info, going to 6.8pF gets me 20/30 working boards, and 8.2pF gets me 25/30 working boards. If I order 200 I can't really afford to have 60 boards requiring rework. And even with the rework I've found it not all that reliable, with 8.2pF boards sometimes still requiring a reset.

With respect to power up disturbances I originally tried holding the MCU in reset on power up for longer and longer time periods (I eventually got up to minutes before I got bored). But the MCU needed to try to come up and then be reset, holding it in reset did nothing but delay that first come up attempt. In my mind this rules out any power supply or start up disturbances. It's a good suggestion though and was my first thought/hope.

 

Thanks again,

RobertK

 

BarryWhit
Lead II

I hope you've tested extensively with the old crystal before committing.

 

The one thing that really bothers me is that no explanation has been found for why a manual reset after a botched power-up caused the MCU to successfully start up. Yes, ESR might explain why it won't start up at power up, but it doesn't explain why a manual reset fixes it. Because of this, I'm still not sure the answer has been found.

From experience, when you really find the root cause of an issue everything else seems to miraculously fall into place, every previously strange observation suddenly makes sense. That's how you know you've really figured out what the problem was.

 

But I realize you're under time constraints, not to mention the fact that several of us here have looked at this and have not come up with a convincing answer, so I guess you have no choice but to embrace optimism. I hope the new board will give you no further trouble.

 

If you ever figure out the answer to this riddle (and truly, the answer sometimes falls in your lap by sheer coincidence 3 years later), please do post an update. 

BarryWhit
Lead II

@RobertK, I'm still thinking about this.

BarryWhit_0-1721827107605.png

One thing you didn't mention was whether you compared the oscillator startup waveform between power-up (cold reset) and NRST (warm reset). The RM snippet shown suggests that it's possible for the clock to be running, but for the STM32 clock unit to deem it "unstable" and therefore gate it off from the rest of the chip. You can test for this condition by enabling the HSE while keeping the processor clocked from the internal HSI. If you see that HSERDY really is stuck at 0 after power up, but not after NRST, that's something concrete which can be debugged. ST could then possibly provide more specific guidance/explanation. Since this flag is supposed to stay clear until the clock becomes stable, comparing the startup waveform between the two cases would be an obvious follow-up experiment.
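A minimal sketch of that experiment, for illustration only: enable the HSE, leave the system clock on the internal oscillator, and poll HSERDY with a bounded loop. The function name and the poll bound are made up for this example. The register pointer is passed in so the routine can also be exercised off-target. The bit positions (HSEON = bit 16, HSERDY = bit 17 of RCC_CR) should be confirmed against the reference manual for the exact part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RCC_CR_HSEON   (1u << 16)  /* HSE oscillator enable */
#define RCC_CR_HSERDY  (1u << 17)  /* set by hardware once HSE is judged stable */

/* Enable HSE and wait for HSERDY without switching the system clock,
 * so the core keeps running from the internal oscillator throughout.
 * rcc_cr    : pointer to the RCC_CR register
 * max_polls : arbitrary loop bound, not a calibrated time
 * Returns true if HSERDY was seen set, false if the bound was reached. */
bool hse_start_and_check(volatile uint32_t *rcc_cr, uint32_t max_polls)
{
    assert(rcc_cr != 0);
    *rcc_cr |= RCC_CR_HSEON;
    while (max_polls-- > 0u) {
        if ((*rcc_cr & RCC_CR_HSERDY) != 0u) {
            return true;
        }
    }
    return false;
}
```

On target this would be called early in `main`, with the address of RCC_CR (for example `&RCC->CR` when using the CMSIS device header), before any code that switches the system clock. Recording the result on a spare GPIO or in a RAM variable read over the debugger would let the power-up and NRST cases be compared directly.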

 

Perhaps you've moved on and if so - I understand. But personally, I suffer from a medical condition that prevents me from leaving mysteries unresolved. If you get a chance (and still have an interest) in testing this idea, please do and give us an update. No rush.

 

Update: One thing I'm noticing now in the scope shot from the OP is that in the old board the start-up oscillation amplitude grows monotonically. But in the new board, it has a "lump in its throat". It grows quickly, then diminishes quickly, and only then increases monotonically to steady state. That's not how it's supposed to look. And not only is the startup profile odd but, as a consequence, the effective start-up time is also 2-3 times longer in the new board vs. the old.

BarryWhit
Lead II

Still thinking about this from time to time.

The screenshot in the OP shows a Rigol DS1104Z Plus 100MHz scope. The standard probe that comes with this scope is the RIGOL pvp3150 passive probe. Its input capacitance is 10pF +/-5pF. Compared to the load capacitor values used, that's far too much to treat the screenshot shown in the OP as reliable.

 

I suggested that the oscillation be monitored via the MCO pin, but unfortunately this never happened. It would also have been interesting to zoom into the waveform to see if the oscillation itself is distorted, as this might have given us an additional hint.

 

The new crystal used is specified with a typical drive level of 10-50µW. The original crystal is specified as 50µW typical and 300µW max. It's at least possible that the old crystal was more tolerant of a higher drive level (it was also physically much larger). According to this crystal vendor's page on choosing STM32 crystals:
"""
Overdriving is an issue for the stability of the oscillations and is manifested in no oscillation or high jitter. Equally we need to consider drive through the crystal. The spec will detail the max drive level, and testing should be done to see these criteria are not exceeded.
"""

and too high a drive level can also damage the crystal.

The drive level is proportional to crystal ESR (for a given crystal current), so a lower-ESR crystal also lowers the drive level, which may be another factor (beyond increased gain margin, which was ample to begin with) in why a lower-ESR crystal works better.

 

It would have been interesting to try and follow AN2867 to see if adding a limiting resistor Rext would have fixed the issue.
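To put rough numbers on both points, here is a small sketch of the two AN2867 relations involved: drive level DL = ESR · I², where I is the RMS current through the crystal, and a first estimate for Rext equal to the reactance of the output-side load capacitor, Rext ≈ 1 / (2πF · CL2). The function names are illustrative, and the crystal current has to be measured (AN2867 describes using a current probe); none was reported in this thread.

```c
#include <assert.h>

#define PI 3.14159265358979323846

/* Drive level in watts: DL = ESR * I_rms^2.
 * i_rms_a is the measured RMS crystal current in amperes. */
double drive_level_w(double esr_ohm, double i_rms_a)
{
    assert(esr_ohm >= 0.0);
    return esr_ohm * i_rms_a * i_rms_a;
}

/* First estimate for the series limiting resistor Rext, in ohms:
 * the capacitive reactance of CL2 at the crystal frequency. */
double rext_estimate_ohm(double f_hz, double cl2_f)
{
    assert(f_hz > 0.0 && cl2_f > 0.0);
    return 1.0 / (2.0 * PI * f_hz * cl2_f);
}
```

As a purely hypothetical illustration, 1 mA RMS through a 120Ω crystal would dissipate 120µW, and `rext_estimate_ohm(8e6, 8.2e-12)` suggests a starting value of roughly 2.4kΩ for an 8.2pF capacitor at 8MHz. AN2867 then has you recheck the gain margin with Rext added to the ESR, since the resistor eats into it.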

BarryWhit_0-1724012259805.png

 

I'm happily still learning new things by chasing this thread... :face_savoring_food:
