Clock config fails if using debug and external loader (HAL_RCC_OscConfig() problem?)

TMack.1 · ‎2022-09-08

I have a TouchGFX project (created using the STM32G071RB Nucleo AZ1 board setup) which fails during SystemClock_Config() when run under the debugger in STM32CubeIDE. The project needs an external loader to program the external flash.

If I perform a hardware reset the code runs fine.

After a bit of debugging I have found that the problem appears to be with HAL_RCC_OscConfig(). The application is setting the clock to HSI with PLL. If HAL_RCC_OscConfig detects that HSI with PLL is already set then it only checks if the PLL is set correctly and fails with HAL_ERROR if it is not. It would appear that something (I assume the external loader) is also using HSI with PLL but with different PLL settings. So the application clock setup fails.
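To make the behaviour described above concrete, here is a small host-side model of that decision. It is only a sketch: the types and the function model_pll_config() are hypothetical stand-ins, not HAL code, and the real driver compares more PLL fields than the four shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; these are not HAL types. */
typedef enum { SYSCLK_FROM_HSI, SYSCLK_FROM_PLL } sysclk_src_t;
typedef enum { CFG_OK, CFG_ERROR } cfg_status_t;

typedef struct {
    uint32_t source; /* PLL input clock selector */
    uint32_t m;      /* input divider */
    uint32_t n;      /* multiplier */
    uint32_t r;      /* system clock output divider */
} pll_cfg_t;

static bool pll_cfg_equal(const pll_cfg_t *a, const pll_cfg_t *b)
{
    return a->source == b->source && a->m == b->m &&
           a->n == b->n && a->r == b->r;
}

/*
 * 'live' represents what is currently programmed in the RCC. It is
 * rewritten only when the PLL is not driving the system clock.
 */
cfg_status_t model_pll_config(sysclk_src_t current_sysclk,
                              pll_cfg_t *live,
                              const pll_cfg_t *requested)
{
    if (current_sysclk != SYSCLK_FROM_PLL) {
        *live = *requested;   /* PLL is idle: safe to reprogram */
        return CFG_OK;
    }
    /* PLL is in use: it cannot be changed here, only compared. */
    return pll_cfg_equal(live, requested) ? CFG_OK : CFG_ERROR;
}
```

With the system clock on the PLL and a different request, the model returns CFG_ERROR and leaves the live settings untouched, which matches the failure seen after the loader has run.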

If I do a hardware reset then the loader is obviously not used and the clocks are in the reset state so HAL_RCC_OscConfig works ok.

To fix this I have set the clock source to HSI (no PLL) before SystemClock_Config() (e.g. RCC->CFGR = 0;) to force HAL_RCC_OscConfig() to update the RCC correctly.
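A slightly more defensive variant of this workaround might wait for the clock switch to take effect and then turn the PLL off. The sketch below is an assumption-laden illustration, not vendor code: leave_pll() and clk_fields_t are hypothetical names, and the masks should come from the device header (RCC_CFGR_SW, RCC_CFGR_SWS, RCC_CR_PLLON, RCC_CR_PLLRDY). It takes register pointers so it can also be compiled on a host. As far as I can tell from the reference manuals, a zero in the SW field selects HSISYS on the STM32G0 but MSI on the STM32L5, so "source 0" is not HSI on every family.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the clock-switch fields; take the real
 * masks from the device header for the part in use. */
typedef struct {
    uint32_t sw_mask;    /* system clock switch request field in CFGR */
    uint32_t sws_mask;   /* system clock switch status field in CFGR */
    uint32_t pllon_bit;  /* PLL enable bit in CR */
    uint32_t pllrdy_bit; /* PLL ready flag in CR */
} clk_fields_t;

/*
 * Select clock source 0 as system clock, wait until the hardware reports
 * the switch, then disable the PLL. Returns false if either step does not
 * complete within max_polls register reads.
 */
bool leave_pll(volatile uint32_t *cfgr, volatile uint32_t *cr,
               const clk_fields_t *f, uint32_t max_polls)
{
    uint32_t polls;

    *cfgr &= ~f->sw_mask;                  /* request source 0 */
    for (polls = 0; (*cfgr & f->sws_mask) != 0U; polls++) {
        if (polls >= max_polls) {
            return false;                  /* switch never took effect */
        }
    }

    *cr &= ~f->pllon_bit;                  /* stop the PLL */
    for (polls = 0; (*cr & f->pllrdy_bit) != 0U; polls++) {
        if (polls >= max_polls) {
            return false;                  /* PLL did not report unlocked */
        }
    }
    return true;
}
```

On the target this would be called as leave_pll(&RCC->CFGR, &RCC->CR, &fields, 100000U) just before SystemClock_Config(), with the poll limit chosen to suit the application.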

Am I missing something? Is this a bug in HAL_RCC_OscConfig or is there a reason it behaves this way?

Thanks, Toby

Tesla DeLorean · ‎2022-09-08

Probably because the PLL can't be changed on-the-fly.

Rather than make the libraries massively complex for all corner cases, there's an expectation that you'll call it in Reset conditions, or walk it through some safe intermediate states, based on your understanding of the available clocks, and if a boot loader already brought the system up.

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

TMack.1 · ‎2022-09-08

That makes sense, thanks.

fondue1 · ‎2024-09-21

Hi

Luckily I just stumbled upon this older discussion since I had the same issue. Note that setting RCC->CFGR = 0; before calling SystemClock_Config() solved my issues but I do have some follow-up questions.

I am using a NUCLEO-L552ZE-Q (default configuration of the jumpers) and I flash/debug with the ST-Link that's included in the nucleo board.

The firmware is generated in STM32CubeMX where I selected the NUCLEO-L552ZE-Q board and generated the code (i.e. I am using the default clock configurations). Afterwards, I only added two lines in the main loop to toggle the blue user LED.

When I run this firmware, no hard fault is encountered and the user LED does indeed blink forever. However, when I debug (SWD with CLion and OpenOCD), I encounter exactly the same issue that the original author described.

Can you explain in a bit more detail why this is the case and how the debugger interferes with the STM32's clock configuration? Additionally, I'm interested in why this seems to be a rare issue even though I followed the simplest approach of debugging the "example firmware" (i.e. just generating the code in STM32CubeMX for the Nucleo board) with the "easiest hardware" (a Nucleo board instead of a custom PCB). Is there something that's recommended to do differently in order to avoid this issue?

Tesla DeLorean · ‎2024-09-21

I think this is a different issue to the OP's, where an External Loader was used by the debugger to instantiate the external memory, via memory mapping and hardware initialization. Part of that Init() process was to bring up the clocks and PLL, so a second call to set up the PLL failed, as it was already set and running. In that situation you'd need an intermediate step not using the PLL, so clocking from HSI, MSI or HSE directly.

In your case, look to see if any Debug Scripts are being run in the bring-up process. Often the debugger has its own list/agenda where it might bring up RCC, GPIOA, DBGMCU, PWR, etc. to facilitate connectivity. You could dump some registers to understand what's different.
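One way to do such a register dump is to capture a few RCC registers at the top of main() in both a plain-reset run and a debug run, then compare the two. The following is a minimal sketch: rcc_snapshot_t and rcc_diff() are hypothetical names, and the choice of CR, CFGR and PLLCFGR is an assumption about which registers matter here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot type; on target fill it from RCC->CR,
 * RCC->CFGR and RCC->PLLCFGR before any clock setup runs. */
typedef struct {
    uint32_t cr;
    uint32_t cfgr;
    uint32_t pllcfgr;
} rcc_snapshot_t;

/* Write one line per differing register into out and return the
 * number of registers that differ. */
int rcc_diff(const rcc_snapshot_t *ref, const rcc_snapshot_t *now,
             char *out, size_t out_size)
{
    const char *names[] = { "CR", "CFGR", "PLLCFGR" };
    const uint32_t r[] = { ref->cr, ref->cfgr, ref->pllcfgr };
    const uint32_t n[] = { now->cr, now->cfgr, now->pllcfgr };
    size_t used = 0;
    int diffs = 0;

    if (out_size > 0U) {
        out[0] = '\0';
    }
    for (int i = 0; i < 3; i++) {
        if (r[i] == n[i]) {
            continue;
        }
        diffs++;
        if (used < out_size) {   /* stop writing once the buffer is full */
            int w = snprintf(out + used, out_size - used,
                             "%s: 0x%08lX -> 0x%08lX\n", names[i],
                             (unsigned long)r[i], (unsigned long)n[i]);
            if (w > 0) {
                used += (size_t)w;
            }
        }
    }
    return diffs;
}
```

The resulting text can be sent over a USART or inspected in a debugger watch window; a difference in CFGR or PLLCFGR between the two runs would point at something configuring the clocks before main().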

Also, the debugger is going to take some time, and distort time, while the initial code runs. It might also cause a significant amount of code to be run before it gains control, stops the processor, and sets SP/PC back to their initial states.

Watch what your code is doing; perhaps put a million-cycle spin loop in Reset_Handler and see if that gives more predictable results.

Initialize a USART in Reset_Handler, implement a HardFault_Handler() that outputs actionable data, implement an Error_Handler() with file/line output so you know how it got there.
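As a rough illustration of the file/line idea, the call site can be captured with __FILE__ and __LINE__ and formatted into a buffer. This is only a sketch: format_error_site() and REPORT_ERROR() are hypothetical names, and on the target the formatted text would be pushed out of the USART rather than kept in memory.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a file/line report into buf.
 * Returns the snprintf() result (length the full message needs). */
int format_error_site(char *buf, size_t size, const char *file, int line)
{
    return snprintf(buf, size, "Error_Handler: %s:%d\r\n", file, line);
}

/* Hypothetical macro that records where it was invoked. */
#define REPORT_ERROR(buf, size) \
    format_error_site((buf), (size), __FILE__, __LINE__)
```

A failing HAL_RCC_OscConfig() call could then report its own location instead of landing in an anonymous infinite loop.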

https://github.com/cturvey/RandomNinjaChef/blob/main/KeilHardFault.c

fondue1 · ‎2024-09-21

Thanks for the fast reply. I know that the hard fault comes from the very end of the function HAL_RCC_OscConfig(), where the following if statement is true:

// ... inside HAL_RCC_OscConfig()...

/* Do not return HAL_ERROR if request repeats the current configuration */
if ((READ_BIT(pll_config, RCC_PLLCFGR_PLLSRC)  != RCC_OscInitStruct->PLL.PLLSource) ||
    (READ_BIT(pll_config, RCC_PLLCFGR_PLLM)    != ((RCC_OscInitStruct->PLL.PLLM - 1U) << RCC_PLLCFGR_PLLM_Pos)) ||
    (READ_BIT(pll_config, RCC_PLLCFGR_PLLN)    != (RCC_OscInitStruct->PLL.PLLN << RCC_PLLCFGR_PLLN_Pos)) ||
    (READ_BIT(pll_config, RCC_PLLCFGR_PLLPDIV) != (RCC_OscInitStruct->PLL.PLLP << RCC_PLLCFGR_PLLPDIV_Pos)) ||
    (READ_BIT(pll_config, RCC_PLLCFGR_PLLQ)    != ((((RCC_OscInitStruct->PLL.PLLQ) >> 1U) - 1U) << RCC_PLLCFGR_PLLQ_Pos)) ||
    (READ_BIT(pll_config, RCC_PLLCFGR_PLLR)    != ((((RCC_OscInitStruct->PLL.PLLR) >> 1U) - 1U) << RCC_PLLCFGR_PLLR_Pos)))
{
  return HAL_ERROR;
}
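Reading the comparison above, PLLM is stored in the register as the divider minus one, and PLLQ and PLLR as half the divider minus one, while PLLN and PLLP are compared unmodified. When matching a dumped PLLCFGR against the CubeMX settings, small helpers like these can convert in both directions. The function names are hypothetical, and the values are the unshifted field contents; the shift positions still have to come from the device header.

```c
#include <stdint.h>

/* Divider value -> raw field content, as used in the check above. */
uint32_t pllm_to_field(uint32_t pllm)  { return pllm - 1U; }
uint32_t pllqr_to_field(uint32_t div)  { return (div >> 1U) - 1U; }

/* Raw field content -> divider value, for reading a register dump. */
uint32_t field_to_pllm(uint32_t field)  { return field + 1U; }
uint32_t field_to_pllqr(uint32_t field) { return (field + 1U) << 1U; }
```

For example, a PLLR divider of 2 is stored as 0 and a PLLM divider of 4 as 3, so a raw field of 0 does not mean "divide by 0".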

I also understand that the debugger probably does quite a lot before main.c gets executed but I don't know what and how I should approach this question since I obviously can't debug anything before the debugger runs.

Can you maybe give me some insights on how the debugger handles clock management? Do you think there will be more issues in the future if I just stick with RCC->CFGR = 0; before calling SystemClock_Config() instead of really diving into this rabbit hole? I am more of a beginner and most things beyond user code "above HAL" are black magic to me.