2026-03-16 3:49 PM
Hello all,
I've designed a board around an STM32H743 (LQFP-100) and am running into the following issue.
On 2 boards out of 10, the HSE seems not to start properly. The HSE_RDY flag is never set after enabling HSE. I have checked it's not in bypass mode. On the other boards, it works perfectly.
The crystal is an ABM3B-8.000MHZ-10-1-U-T and the load caps are 10 pF. We checked for possible soldering issues, tested different load cap values just to see, and even replaced the crystal on the failing boards, all to no avail.
Looking at the OSC_IN/OSC_OUT signals (PH0/PH1) on a scope does not show a significant difference between the working and non-working boards. Maybe just a very slight difference in p-p amplitude.
I have looked at the AN2867 app note, and the gain margin seems sufficient according to the formulas given there. I have heard that this crystal may have a bit too high ESR (200 ohm) for the STM32H7 HSE, but again, the gain margin seems alright.
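For reference, the gain-margin check mentioned above can be sketched as a small host-side calculation. This is only a sketch of the AN2867 formula gm_crit = 4 x ESR x (2*pi*F)^2 x (C0 + CL)^2; the function names are made up for illustration, and the shunt capacitance C0 and load capacitance CL in the usage note are assumptions that should be taken from the crystal datasheet.

```c
/* Critical transconductance of the crystal, in A/V.
 * Formula as given in AN2867:
 *   gm_crit = 4 * ESR * (2*pi*F)^2 * (C0 + CL)^2
 * esr_ohm: crystal ESR, freq_hz: crystal frequency,
 * c0_farad: shunt capacitance, cl_farad: nominal load capacitance. */
double gm_crit(double esr_ohm, double freq_hz, double c0_farad, double cl_farad)
{
    const double pi = 3.14159265358979323846;
    const double omega = 2.0 * pi * freq_hz;
    const double c_total = c0_farad + cl_farad;
    return 4.0 * esr_ohm * omega * omega * c_total * c_total;
}

/* Gain margin = oscillator gm / gm_crit. AN2867 asks for a margin above 5
 * when the datasheet specifies the oscillator transconductance gm. */
double gain_margin(double gm_oscillator, double gm_crit_value)
{
    return gm_oscillator / gm_crit_value;
}
```

With ESR = 200 ohm, F = 8 MHz, and assumed values C0 = 7 pF and CL = 10 pF, gm_crit(200.0, 8.0e6, 7.0e-12, 10.0e-12) comes out at roughly 0.58 mA/V. That figure then has to be compared against the HSE transconductance figure in the MCU datasheet.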
Any idea? I have tested with a minimal firmware that just sets up the power config, turns on HSE and waits (forever) for the HSE_RDY flag. It never gets out of the loop on the failing boards (works fine on the working boards). If using HSI instead on the failing boards, everything seems to work ok, so it's definitely just a problem with the HSE.
Thanks!
2026-03-17 10:11 AM
You :
>Looking at the OSC_IN/OSC_OUT signals (PH0/PH1) on a scope does not show a significant difference between the working and non-working boards.
But then on both -working and non-working boards- you see the 8M signal ??
-> so all have working clocks , just the css "is_ready" check not working... - right ?
2026-03-17 12:33 PM
@AScha.3 wrote:
You :
>Looking at the OSC_IN/OSC_OUT signals (PH0/PH1) on a scope does not show a significant difference between the working and non-working boards.
But then on both -working and non-working boards- you see the 8M signal ??
-> so all have working clocks , just the css "is_ready" check not working... - right ?
Well, after more testing, things get more interesting: it turns out that HSE_RDY does get set, and the flag itself seems "stable" after enabling HSE. It's the subsequent switch to HSE as system clock that gets stuck.
Basically just that:
LL_RCC_SetSysClkSource(LL_RCC_SYS_CLKSOURCE_HSE);
while (LL_RCC_GetSysClkSource() != LL_RCC_SYS_CLKSOURCE_HSE) {}
The busy loop gets stuck.
And, I have also enabled HSE on MCO2, and it seems to output it properly.
So: MCU runs fine on HSI, HSE is enabled, but LL_RCC_GetSysClkSource() never returns LL_RCC_SYS_CLKSOURCE_HSE after LL_RCC_SetSysClkSource(LL_RCC_SYS_CLKSOURCE_HSE);
So the switching to HSE for system clock fails, but the HSE otherwise "seems" to operate normally.
Curious to understand what is really happening.
2026-03-17 12:34 PM
@TDK wrote:
>Reflowing the pins or swapping the MCU between working and non-working boards seems like a logical next step as well.
Reflowing has been attempted, but swapping the MCU, not yet. I agree it would be an interesting next step.
2026-03-17 1:43 PM - edited 2026-03-17 1:45 PM
Did you try just the "standard" clock/RCC init with HAL (not LL) ?
As i always use it , never had a problem with H743 (on maybe 1000 boards, or more).
And here seems to be no problem with the hardware , cpu , just with the used lib.
btw
Why using LL lib here ? speed ? :)
2026-03-18 3:01 AM - edited 2026-03-18 3:15 AM
LL_RCC_SetSysClkSource(LL_RCC_SYS_CLKSOURCE_HSE);
while (LL_RCC_GetSysClkSource() != LL_RCC_SYS_CLKSOURCE_HSE) {}
What do these incantations do?
What if you write directly to registers without these?
What does the RM say about switching system clock?
JW
[EDIT] I looked up the answer to the first question for you. The highlighted line is not a coincidence.
2026-03-18 5:16 AM
What Jan said, don't use the obscure LL_ functions, but read / write the registers directly.
Or at least check the used LL_ functions.
That it fails with LL_ might indicate there's a #define problem, there are lots of #if in the basic controller defines for registers and bits. Maybe you copied stuff from another STM32 type?
2026-03-18 11:14 AM - edited 2026-03-18 11:17 AM
So - a bit of follow-up, not in any particular order.
Regarding LL libraries, they are not "obscure" and have proven to work well while being lightweight (compared to the very bloated HAL). Directly using registers is always possible, but it takes a lot more time, and the LL layer makes porting to other STM32 families much easier while also being much easier to read than direct register access. So, that part is irrelevant. Of course, like all libraries, they can have their own quirks, such as the macros not being symmetric between LL_RCC_GetSysClkSource() and LL_RCC_SetSysClkSource(). All libraries have quirks.
Just to give a bit more context, I did not have direct access to the "failing" boards, so that made it more difficult to debug remotely. As mentioned, I first suspected HSE_RDY not getting set (while the oscillator was looking like it at least oscillated), but it turned out to be something else. Indeed the
while (LL_RCC_GetSysClkSource() != LL_RCC_SYS_CLKSOURCE_HSE)
test used the wrong macro, which should have been LL_RCC_SYS_CLKSOURCE_STATUS_HSE rather than LL_RCC_SYS_CLKSOURCE_HSE. Obvious in hindsight but close enough not to be picked up when you're right in the middle of debugging something. Typos happen.
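For anyone hitting the same thing, here is a host-side sketch (plain C, no hardware) of why that comparison can never succeed. It assumes the RCC_CFGR layout as I read it in the H7 reference manual: SW in bits [2:0], SWS in bits [5:3], HSE encoded as 2 in both fields. The fake_rcc type and the two helper functions are hypothetical stand-ins, not the LL implementation.

```c
#include <stdint.h>

/* Assumed RCC_CFGR field layout (check against the reference manual). */
#define SW_MASK   0x07u                 /* "set" field, bits [2:0]    */
#define SWS_POS   3u
#define SWS_MASK  (0x07u << SWS_POS)    /* "status" field, bits [5:3] */
#define SW_HSE    0x02u                 /* value written to select HSE */
#define SWS_HSE   (0x02u << SWS_POS)    /* value read back when HSE is in use */

/* Hypothetical stand-in for the RCC peripheral: a single CFGR word. */
typedef struct { uint32_t cfgr; } fake_rcc;

/* Writes the SW field. The fake "hardware" mirrors it into SWS at once,
 * i.e. it models a switch that always succeeds. */
void set_sysclk_source(fake_rcc *rcc, uint32_t sw)
{
    uint32_t field = sw & SW_MASK;
    rcc->cfgr = (rcc->cfgr & ~(SW_MASK | SWS_MASK)) | field | (field << SWS_POS);
}

/* Returns the SWS field unshifted, so the result must be compared
 * against SWS_* values, never against SW_* values. */
uint32_t get_sysclk_source(const fake_rcc *rcc)
{
    return rcc->cfgr & SWS_MASK;
}
```

Even with a switch that always succeeds, get_sysclk_source() returns 0x10 after selecting HSE, so a loop waiting for it to equal SW_HSE (0x02) spins forever, while comparing against SWS_HSE exits immediately.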
So that explained why the switch to HSE appeared not to work in the minimal test firmware, and that was misleading. But after fixing that, the root issue (the one that triggered the writing of this minimal test firmware to begin with) was finally found. As I said, not having direct access to the boards made it a bit hard to locate the exact cause. It turned out to be unrelated to the HSE or PLLs; it was the LSE.
Indeed, the original firmware initialized the LSE, but the tested boards were not fitted with LSE crystals; the pins were just meant to be used as spare GPIOs. So when the LSE was enabled (due to a config error), the OSC32_IN/OUT pins were simply floating. The unexpected part, which hid the cause, is that on 8 boards out of 10 the LSE init code got out of the wait loop (for LSE_RDY) without a problem, even with no crystal on the OSC32 pins. I certainly wasn't expecting that. My best guess as of now is that slight differences in MCU silicon, and possibly flux residues on the boards, etc., let the apparently "good" boards pick up enough noise for the LSE logic to set the LSE_RDY flag, while on the other ("failing") boards LSE_RDY was never set. It could be seen as just an oddity, but it could be interesting for someone to test it on STM32H7s (attempting to enable the LSE with nothing connected to OSC32_IN/OUT) and see what happens. Not that it should be seen as a problem either way; it's an oddity, but one that made me miss this issue at first.
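One way to keep this kind of thing from hanging silently is to bound every "wait for ready" loop. Below is a minimal host-side sketch of that idea, not the firmware actually used here: wait_ready(), fake_osc and fake_is_ready() are hypothetical names, and the poll-count limit stands in for what would more sensibly be a time-based timeout on the target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Callback that reports whether the oscillator is ready. */
typedef bool (*ready_fn)(void *ctx);

/* Polls is_ready() at most max_polls times.
 * Returns true as soon as it reports ready, false if the budget runs out. */
bool wait_ready(ready_fn is_ready, void *ctx, uint32_t max_polls)
{
    for (uint32_t i = 0u; i < max_polls; ++i) {
        if (is_ready(ctx)) {
            return true;
        }
    }
    return false;
}

/* Hypothetical oscillator model for host-side testing:
 * becomes ready once it has been polled ready_after times. */
typedef struct {
    uint32_t polls;
    uint32_t ready_after;
} fake_osc;

bool fake_is_ready(void *ctx)
{
    fake_osc *osc = (fake_osc *)ctx;
    osc->polls++;
    return osc->polls >= osc->ready_after;
}
```

On the target, the callback would wrap something like LL_RCC_LSE_IsReady(), and a false return could be logged or turned into an error code, so a board with no crystal fitted reports the problem instead of blocking in the init code.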
And while debugging this, since one can read various stories of marginal HSE oscillator designs not working on all boards, and even that the STM32H7 is supposedly more finicky with its HSE, I also investigated that part, but it was unrelated in my case. I can even tell you that it looks more forgiving than many seem to say.
So, case closed. With a bit of this odd LSE_RDY thing, which, while it was misleading during this debug session, is of course of no consequence otherwise.
2026-03-18 11:44 AM
Hold on, you said you're using this code:
while (! LL_RCC_HSE_IsReady()) {} // Would block here on non-working boards.
which definitely uses the right macros:
But then you actually were using something different? Why post code you're not using?
Defend LL if you want, but the obfuscation of what it's doing certainly contributed to the problem here. HAL would have worked correctly, "bloated" or not. Direct register access with CMSIS would have also worked.
2026-03-18 11:57 AM
You're confusing two steps: enabling the HSE (and checking for readiness) and switching the system clock to HSE. One obviously comes before the other. Enabling the HSE and checking with LL_RCC_HSE_IsReady() worked (I mistakenly assumed it didn't on some boards because I had to debug this remotely, but it did work, as I explained earlier). It's the test after switching the system clock to HSE, LL_RCC_SetSysClkSource() followed by while (LL_RCC_GetSysClkSource() != LL_RCC_SYS_CLKSOURCE_HSE), that was using the wrong macro, and that's actually where it was blocking in the minimal test firmware.
The root cause was unrelated to this. It was a small typo that made the minimal test misleading, but it eventually pointed me to something else entirely, namely the LSE, as I explained above. That happens.
As for CMSIS, I could have used the wrong macro when checking RCC_SYS as well, since the "status" and "set" bits are separate. That would have made no difference. Typos happen.
2026-03-19 2:33 AM
Hi @OpusOne ,
Thanks for coming back with the solution. Please mark your post as Solution so that the thread is marked as solved.
I also want to congratulate you on finding the root cause. Remote debugging is quite a challenge and I personally always struggle with it. It's so much easier when things are on one's desk... :)
With regards to Cube/LL, IMO it's mostly just renaming of the terms and registers/bits/bitfield names in RM, creating an unnecessary obstacle (and in that sense, indeed, obscuring) to mapping actions from code to RM and vice versa.
The real reason why users do use it, is, IMO, that ST provides examples and the clicky generator for it.
Unfortunately, ST refuses to provide raw examples, in part pointing out that resources are being spent on the Cubes. The illusion of easy migration and quick coding by clicking of course also makes a nice appeal to the managerial side of their customers. And of course there's also the factor of locking in the users in some way.
But then I despise Cube/HAL, for much the same reasons, too. :)
JW