
Application runs very slow using the debugger

Debugging via SWD using either STLink or JLinkEdu from STM32CubeIDE to an STM32F405, I've noticed the application runs around 16 times slower when it's run under the debugger.

Also, the USB (Virtual COM device) does not even appear on my host PC when the debugger is launched from inside STM32CubeIDE.

To attempt to isolate the problem, I've tried an STLink as well as a JLinkEdu, and the result was the same.

If I run JLinkGDBServerCLI using the command line cut and pasted from the Debug Configurations in STM32CubeIDE and separately run GDB (arm-none-eabi-gdb), the application runs at full speed, and the USB device works fine.

I've looked at the Debug Configuration settings in STM32CubeIDE and tried changing various things, e.g. Live Expressions, but this made no difference.

Can anyone advise how to resolve this problem?

29 REPLIES

@mattias norlander

Was this the information you needed?

It doesn't appear to dump or decode the registers; it just shows some general code.

He was asking for information in the fast/slow cases.

Dump out the RCC registers.

What board are you using? Custom?

BOOT0 pulled LOW?

A TCXO type clock source?

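Perhaps unpack what the MCU thinks it's doing:

 /* These values are uint32_t, which is unsigned long on arm-none-eabi, so print with %lu */
 printf("\n\nCore=%lu, %lu MHz\n", (unsigned long)SystemCoreClock, (unsigned long)(SystemCoreClock / 1000000));

 printf("APB1=%lu\n", (unsigned long)HAL_RCC_GetPCLK1Freq());

 printf("APB2=%lu\n", (unsigned long)HAL_RCC_GetPCLK2Freq());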

Perhaps pipe the internal clocks to the PA8/MCO pin so you can observe with a scope
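
For the register dump suggested above, the raw RCC_CFGR and RCC_PLLCFGR values can be read in the debugger and decoded off-target. The following is a minimal host-side sketch, assuming the STM32F4 bit layout as I read it in the reference manual (SWS in CFGR bits 3:2, HPRE in 7:4, PPRE1 in 12:10, PPRE2 in 15:13; PLLM, PLLN, PLLP and PLLSRC in PLLCFGR). The names decode_rcc and clocks_t are made up for illustration and are not part of the HAL.

```c
#include <stdint.h>

/* Hypothetical result type: all frequencies in Hz. */
typedef struct { uint32_t sysclk, hclk, pclk1, pclk2; } clocks_t;

/* AHB prescaler field (HPRE) -> right-shift amount. Note /32 does not exist. */
static uint32_t ahb_shift(uint32_t hpre)
{
    static const uint8_t t[16] = {0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 6, 7, 8, 9};
    return t[hpre & 0xFu];
}

/* APB prescaler field (PPRE1/PPRE2) -> right-shift amount. */
static uint32_t apb_shift(uint32_t ppre)
{
    static const uint8_t t[8] = {0, 0, 0, 0, 1, 2, 3, 4};
    return t[ppre & 0x7u];
}

/* Decode raw register values; hse_hz is the external clock (assumed known). */
clocks_t decode_rcc(uint32_t cfgr, uint32_t pllcfgr, uint32_t hse_hz)
{
    const uint32_t hsi_hz = 16000000u;      /* internal RC oscillator */
    clocks_t c = {0, 0, 0, 0};
    uint32_t sws = (cfgr >> 2) & 0x3u;      /* clock actually in use */

    if (sws == 0u) {
        c.sysclk = hsi_hz;
    } else if (sws == 1u) {
        c.sysclk = hse_hz;
    } else if (sws == 2u) {
        uint32_t m = pllcfgr & 0x3Fu;
        uint32_t n = (pllcfgr >> 6) & 0x1FFu;
        uint32_t p = (((pllcfgr >> 16) & 0x3u) + 1u) * 2u;   /* 2, 4, 6 or 8 */
        uint32_t src = (((pllcfgr >> 22) & 1u) != 0u) ? hse_hz : hsi_hz;
        if (m != 0u) {                      /* m == 0 is not a valid setting */
            c.sysclk = (uint32_t)((uint64_t)src / m * n / p);
        }
    }
    c.hclk  = c.sysclk >> ahb_shift((cfgr >> 4) & 0xFu);
    c.pclk1 = c.hclk >> apb_shift((cfgr >> 10) & 0x7u);
    c.pclk2 = c.hclk >> apb_shift((cfgr >> 13) & 0x7u);
    return c;
}
```

As an illustration only (these register values are constructed, not captured from the board), decode_rcc(0x0000FC00, 0, 8000000) describes HSI selected with both APB prescalers at /16 and gives 16 MHz core with 1 MHz on APB1 and APB2, which is one register state that would produce the slow readings reported later in the thread.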

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

It's custom hardware.

We are repurposing some Chinese-made hardware by writing our own firmware for it.

Those values are quite interesting.

After initial launch I see these values:

Core=16000000, 16 MHz

APB1=1000000

APB2=1000000

If I press the suspend button and then the restart button, I get these values:

Core=72000000, 72 MHz

APB1=18000000

APB2=36000000

Surely these values are being configured by the code generated by the STM32CubeIDE configuration tool.

Why would they be different immediately after debugging is launched, regardless of whether the application code needs to be sent to the MCU by the debugger?

As previously posted, I've not tested using OpenOCD to debug, but I have tried STLink and JLink and they both behave the same way.

I've tried calling

 SystemClock_Config();

twice, to see if that made any difference, but it doesn't.

16 MHz would be the HSI clock the processor starts with; subsequent code would need to bring up the external clocks and the PLL.

Typically if the HAL code fails you end up in the Error_Handler() in a while loop. If the error is ignored it will continue to run at the original speed.

If you're using an external crystal then you shouldn't be using BYPASS mode; that's for XO sources like a TCXO or OCXO, or a clock piped from another device as a digital logic-level clock. The ST-LINK on the DISCO/NUCLEO boards typically exports a clock to the target, saving a component.

0693W00000FCigtQAD.png 

I just used the default settings when I initially created the project

The clock is an 8 MHz crystal connected across the OSC_OUT and OSC_IN pins (12 and 13), using the normal boilerplate design for external crystals on an STM32F4,

i.e. there are various resistors and capacitors as part of the oscillator circuit.

Let's just sort out that the expectations are correct.

If I do this:

  1. I use CubeMX to generate a project for the STM32F407VG (close enough to the F405).
  2. Modify the clock tree to get HCLK = 168MHz
  3. Modify the main() function to look like this:
int main(void)
{
  /* USER CODE BEGIN 1 */
	int coreclock = HAL_RCC_GetHCLKFreq();	/* Expect 16MHz here, since no clock tree configuration has run yet */
  /* USER CODE END 1 */
 
  /* MCU Configuration--------------------------------------------------------*/
 
  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
 
  /* USER CODE BEGIN Init */
 
  /* USER CODE END Init */
 
  /* Configure the system clock */
  SystemClock_Config();						/* Clock tree config done. After this we expect 168MHz */
 
  /* USER CODE BEGIN SysInit */
  coreclock = HAL_RCC_GetHCLKFreq();		/* Will return 168MHz */
  int apb1clock = HAL_RCC_GetPCLK1Freq();
  int apb2clock = HAL_RCC_GetPCLK2Freq();
  /* USER CODE END SysInit */
 
  /* Initialize all configured peripherals */
  /* USER CODE BEGIN 2 */
 
  /* USER CODE END 2 */
 
  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
  while (1)
  {
    /* USER CODE END WHILE */
 
    /* USER CODE BEGIN 3 */
  }
  /* USER CODE END 3 */
}

Then the right expectation is that the core (HCLK) runs at 16MHz until line 17 of the listing, the SystemClock_Config() call, has been executed.

After line 17, the core will be running at 168MHz.

If I suspend and click reset, a system reset is applied. This resets the core and all peripherals (the default behavior), so the clock should go back to 16MHz. The reset button does NOT merely set the PC to the first line of Reset_Handler or main.

Now, next question: There is the concept of "restart configurations". This will allow you to modify how a reset/restart is applied during debug. If you or a colleague has tampered with this configuration, then maybe this explains the issue?

Another question: what if you create a new project for this MCU and configure the clocks as in the real application, but keep everything else minimal? Do you still get the same issue (running slower with the debugger connected than without)? It would be nice to know whether the problem is somewhere in your application code or is reproducible even in a minimal example.

OK.

I'll need to make a new project from scratch.

However, one problem is always making a valid clock configuration.

Setting the external clock frequency and then running the automatic configuration never seems to result in the clocks being set to full speed.

The only way I found to get the automatic configuration to produce full-speed clocks is to set one of the primary dividers so that the core clock is far too high; the automatic clock configurator then seems to try various combinations of dividers and multipliers until a valid set of clock parameters is found.

And even then, this doesn't always seem to work.
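
When the automatic solver does not converge, the PLL factors can also be worked out by hand and typed into the clock tree. The following is a minimal sketch, not a description of what the CubeMX solver does internally: it searches for PLLM/PLLN/PLLP/PLLQ values that give an exact SYSCLK and an exact 48 MHz USB clock from a given crystal. The function name find_pll, the pll_cfg_t struct and the limits used (VCO input 1 to 2 MHz, VCO output 100 to 432 MHz, PLLN 50 to 432, PLLQ 2 to 15) are assumptions based on my reading of the STM32F4 reference manual, so check them against the manual for the exact part.

```c
#include <stdint.h>

/* Hypothetical container for the four main PLL factors. */
typedef struct { uint32_t m, n, p, q; } pll_cfg_t;

/*
 * Search for PLL factors so that:
 *   vco_in  = hse_hz / M        (assumed limit: 1..2 MHz, tried from 2 MHz down)
 *   vco_out = vco_in * N        (assumed limit: 100..432 MHz)
 *   sysclk  = vco_out / P       (P in {2, 4, 6, 8})
 *   usb clk = vco_out / Q       (must be exactly 48 MHz)
 * Returns 1 and fills *out on success, 0 if no exact solution exists.
 */
int find_pll(uint32_t hse_hz, uint32_t sysclk_hz, pll_cfg_t *out)
{
    for (uint32_t m = 2; m <= 63; m++) {
        if (hse_hz % m != 0u) {
            continue;                       /* keep the VCO input an exact value */
        }
        uint32_t vco_in = hse_hz / m;
        if (vco_in < 1000000u || vco_in > 2000000u) {
            continue;
        }
        for (uint32_t p = 2; p <= 8; p += 2) {
            uint64_t vco_out = (uint64_t)sysclk_hz * p;
            if (vco_out < 100000000u || vco_out > 432000000u) {
                continue;
            }
            if (vco_out % vco_in != 0u || vco_out % 48000000u != 0u) {
                continue;                   /* N or Q would not be an integer */
            }
            uint32_t n = (uint32_t)(vco_out / vco_in);
            uint32_t q = (uint32_t)(vco_out / 48000000u);
            if (n < 50u || n > 432u || q < 2u || q > 15u) {
                continue;
            }
            out->m = m;
            out->n = n;
            out->p = p;
            out->q = q;
            return 1;
        }
    }
    return 0;
}
```

For an 8 MHz crystal and a 168 MHz target this returns M=4, N=168, P=2, Q=7. A target such as 100 MHz returns 0 because no VCO frequency in range divides down to both 100 MHz and 48 MHz; the sketch does not check the part's maximum SYSCLK, so that remains the caller's job.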

I looked through all the HAL examples that STM32CubeIDE can download and use as the base for a project, but I could not see an STM32F4 example of any kind which used the .ioc config file.

All the newer examples for the F7 series etc. had .ioc-based projects, but the F4 ones don't seem to have been updated.

If there is an example F4 project I could use, this would rule out all the other unknowns, because if I create a new project from scratch the clock config could still be mis-generated by the config tool.

The STM32F4 is not a new product series, consequently the STM32CubeF4 is in maintenance mode. No plan (to my knowledge) to add any example projects with ioc-files.

You could try different clock sources to see whether you can reproduce the "debug slow-down" with some clock sources/configurations but not with others. I am afraid that this is a tricky issue to help out with, since we are unable to reproduce it...

Could you try another, simpler test instead?

In the debug configuration > Startup > Load Image and Symbols

Disable the Download of the binary. Unplug the board, re-plug it and launch debug without re-flashing the application.

Does it work better now? One hypothesis is that the flash loader shipped with CubeProgrammer puts the application in an unanticipated state. We have seen some other weird debug behaviours which could have this root-cause. So, please try and let us know if this could be the root-cause.

Sorry it's taken so long to reply.

I'm not sure if the passwords on this site expire, but I have been unable to login, and ended up resetting my password etc.

Anyway.

It appears that if I "debug" with Download disabled, the application runs at the correct speed.

So I think your hypothesis is correct.