2024-12-22 05:10 AM
Hi all,
I discovered a very strange and critical behaviour of the STM32H750 chip. Maybe you can help me confirm this issue.
We have a firmware that is split into two parts, a bootloader (BL) and an application (APP). Both BL and APP use the UARTs for internal communication. For testing, we created a new APP that changes some clock mux settings for the UART. The important change is the UART clock source, from PCLK2 to PLLQ2; before that, both BL and APP used PCLK2.
But now, when we jump from the APP to the BL via NRST, we get stuck from time to time in "HAL_UART_Init()", at line 371:
return (UART_CheckIdleState(huart));
The UART is waiting for the TEACK flag after the TE bit of the UART has been set. This polling is important because the TE bit is set by the CPU, while the TEACK flag is set by the UART peripheral, which is clocked through the clock mux. Here is a link to a description:
EFTON - STM32 gotchas - Gotcha 141
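To make the TE/TEACK handshake above a bit more concrete, here is a minimal sketch of a bounded poll. It is not the HAL implementation: the helper name `sketch_wait_flag_set` and the `SKETCH_` macros are made up for illustration, and the bit positions are as I recall them from RM0433, so please check them against your reference manual. On real hardware `reg` would point at the USART ISR register; here it is just a pointer so the logic can be tried on a PC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as recalled from RM0433 (USART_CR1 / USART_ISR).
 * Assumption: verify against the reference manual for your part. */
#define SKETCH_USART_CR1_TE     (1u << 3)   /* transmitter enable, written by the CPU */
#define SKETCH_USART_ISR_TEACK  (1u << 21)  /* acknowledge, set from the UART kernel clock domain */

/* Hypothetical helper: read *reg until any bit in mask is set, giving up
 * after max_polls reads instead of spinning forever. */
bool sketch_wait_flag_set(const volatile uint32_t *reg, uint32_t mask,
                          uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        if ((*reg & mask) != 0u) {
            return true;   /* the peripheral acknowledged */
        }
    }
    return false;          /* no acknowledge: the kernel clock may be missing */
}
```

If TEACK never shows up within the budget, a bootloader could at least report the problem or try another recovery path instead of hanging silently.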
But once you are stuck here, you can do whatever you want: pull NRST low for as long as you like, or even reflash the same bootloader, and you will still be stuck. The only way out is either to power cycle the whole chip or to reflash a bootloader that uses the same clock mux setting the UART had in the APP (in this case PLLQ2). Then the UART can reinitialize and work again. I verified this on multiple PCBs and with different clock settings.
This is easy to reproduce:
- Create a BL which uses PCLK2.
- Create an APP which uses PLLQ2 and jumps into the BL using a GPIO.
It does not always happen, but often enough to catch it.
So it seems that NRST does not completely reset the UART peripheral, and a change of the clock source then prevents the peripheral from setting TEACK. This is super dangerous, as it makes the BL dependent on the APP.
Just imagine that during a firmware update a corrupted file is programmed which uses PLLQ2. You want to run a new firmware update, you jump into the BL, and you are stuck in the UART initialisation.
Has anyone else seen this behaviour?
HAL Version STM32H7 v1.11.1-0
2024-12-22 05:19 AM
Well the H7 does treat power cycle and software reset differently.
But consider other things that don't have an async reset like external oscillators.
Instrument so you can see initial settings and how they differ. Check that clocks and PLLs start as expected. Unpack clock gearing. And have HardFault_Handler and Error_Handler that output actionable detail.
2024-12-22 05:23 AM
Hey @Tesla DeLorean,
you are totally right, we have to measure a bit more. But even using the HSI as the UART clock source in the BL does not help when the application used (for example) PLLQ2 before.
This dependency makes me nervous, and it shouldn't exist, as it makes the BL dependent on the APP. What do you think?
2024-12-22 06:03 AM
You're using NVIC_SystemReset to enter Boot Loader?
Do you connect NRST pin to anything else in your circuit?
2024-12-22 06:59 AM
2024-12-23 01:52 AM - edited 2024-12-23 01:59 AM
Hello,
Most probably this is due to this behavior (from RM0433) :
Try to switch the clock source from PCLK2 (clock source of the BL) to PLLQ2 before jumping to the application.
2024-12-23 03:20 AM
Hey @SofLit,
that's it, thanks a lot for your reply. But this is a very dangerous dependency, right? It means that for every peripheral we use (the UARTs in this case) we have to provide all possible clock sources in the bootloader, just to make sure we can recover if a malicious firmware used a different clock source.
Is it the same for other peripherals such as QUADSPI?
Best regards,
Eric
2024-12-23 03:33 AM
This is related to the RCC, not to the peripherals.
So you need to take that behavior into account when switching from one clock source to another, no matter which peripheral it is.
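A toy model may help to picture why the BL gets stuck. This is only a sketch of how I read this thread: it assumes the kernel clock mux completes a switch only while both the old and the new source are running, and that NRST puts the selection field and PLL2 back to their defaults without moving the mux itself. All names (`sketch_mux_t` and friends) are hypothetical, nothing here is register-accurate, and the real effect is intermittent while this model is deterministic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified software model, NOT register-accurate. */
typedef enum { SRC_PCLK2 = 0, SRC_PLL2Q = 1 } sketch_src_t;

typedef struct {
    bool running[2];        /* is each source clock currently toggling? */
    sketch_src_t requested; /* what the RCC selection field asks for */
    sketch_src_t actual;    /* what the mux is really routing */
} sketch_mux_t;

/* Assumption: the switch only completes while both clocks are present. */
void sketch_mux_update(sketch_mux_t *m)
{
    if (m->requested != m->actual &&
        m->running[m->actual] && m->running[m->requested]) {
        m->actual = m->requested;
    }
}

/* The peripheral only gets a kernel clock if the routed source is running. */
bool sketch_kernel_clock_present(const sketch_mux_t *m)
{
    return m->running[m->actual];
}

/* Model of NRST as described in this thread: the selection field and PLL2
 * return to their defaults, but the mux stays where it was. */
void sketch_system_reset(sketch_mux_t *m)
{
    m->requested = SRC_PCLK2;
    m->running[SRC_PCLK2] = true;
    m->running[SRC_PLL2Q] = false;
    sketch_mux_update(m);
}
```

In this model, an APP that ran the UART from PLLQ2 leaves the mux on a source that is dead after reset, so the UART sees no kernel clock until the BL starts PLL2 again and the pending switch can complete. An APP that requests PCLK2 again while PLL2 is still running, before triggering the reset, never enters that state.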
2024-12-23 03:45 AM
Hey @SofLit,
okay, thanks for the hint. I found the section you showed above. So we have to make sure that every peripheral used in the bootloader can be given any of its clock sources, so that we can always recover if the application, for whatever reason, switched to a different clock source. In the BL we did not activate PLL2; that's why we got stuck.
Just imagine the application enables CSI or LSE because of a firmware issue: we would have to provide that clock in the bootloader too, to avoid getting stuck.
I still find it very dangerous but now we know how to deal with it.
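As a rough sketch of what "provide every clock source in the bootloader" could look like as a checklist: the enum below lists the USART1/6 kernel clock options as I recall the USART16SEL field from RM0433 (an assumption, please verify), and the hypothetical helper `sketch_missing_sources` returns which of them the BL is not yet running. Actually starting each oscillator or PLL is hardware specific and left out here.

```c
#include <stdint.h>

/* USART16SEL encodings as recalled from RM0433 (RCC_D2CCIP2R).
 * Assumption: check the reference manual before relying on these values. */
typedef enum {
    SKETCH_SEL_PCLK2 = 0,
    SKETCH_SEL_PLL2Q = 1,
    SKETCH_SEL_PLL3Q = 2,
    SKETCH_SEL_HSI   = 3,
    SKETCH_SEL_CSI   = 4,
    SKETCH_SEL_LSE   = 5,
    SKETCH_SEL_COUNT
} sketch_usart16sel_t;

/* Bit n of a mask stands for source n of the enum above. */
#define SKETCH_ALL_SOURCES ((1u << SKETCH_SEL_COUNT) - 1u)

/* Hypothetical helper: given the sources the BL already has running,
 * return the ones a previous APP might still have the mux parked on. */
uint32_t sketch_missing_sources(uint32_t running_mask)
{
    return SKETCH_ALL_SOURCES & ~running_mask;
}
```

For a BL that only runs PCLK2 and HSI, the result would have the PLL2Q, PLL3Q, CSI and LSE bits set. Whether all of them really need to be started depends on which sources an application could have been running on your board.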
I wish you a merry Christmas,
Eric