What would cause an application to crash the debugger and not be able to reset in the debugger, without exiting the debugger?

KiptonM
Lead

I am just about finished with my application I have been working on for a few months.

The processor is an STM32G051K8T. I have upgraded to STM32CubeIDE 1.10.1 and am using the ST-LINK V2 (Blue pill) with my custom board.

All of a sudden it is hanging up in the main program.

At first it hung up in the middle of a sprintf() statement.

The statement was

	  char buf[120];
	  sprintf(buf,"Calibration Factor: %lu\r\n",ADC1->CALFACT); // dies on this statement
	  print_debug_str("Hello World\r\n");

char buf[120] was originally char buf[40] but I made it longer in the unlikely case it was overrunning the buffer. Made no difference.

I changed "ADC1->CALFACT" to "16l" or "(uint32_t) 16". It made no difference; it just hung on that statement.

Since I knew sprintf probably used the heap, I went into STM32G051K8TX.ld to raise the minimum heap from 0x200 to 0x300. It made no difference.

My first thought was it is happening at a certain memory location. So I put 100 __NOP(); statements before the code that had the sprintf() in it. Did not matter, it still hung up on the sprintf()

I commented out the sprintf() statement and changed print_debug_str(buf); to print_debug_str("Hello World\r\n"); and it hung up in the print_debug_str() routine.

void print_debug_str(void * s)
{
	static uint8_t  db[256];
	volatile uint32_t i = 0;
	volatile uint32_t j;
	uint16_t len = strlen(s);
	while (UART1Busy) i++;
	memmove(db,s,len);
	UART1Busy = true;
	for (j=0;j<2500;j++) i++;  // now stopping here
	HAL_UART_Transmit_DMA(&huart1,db,len);
}
 
void HAL_UART_TxCpltCallback(UART_HandleTypeDef *huart)
{
	UART1Busy = false;
	UNUSED(huart);
}
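As a side note on the routine above, unrelated to the hang itself: strlen() can return more than the 256 bytes available in db, so the copy is not bounded. A minimal host-side sketch of a clamped copy follows. The helper name clamp_copy and the DEBUG_BUF_SIZE macro are hypothetical and not part of the posted code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEBUG_BUF_SIZE 256u   /* same size as the static db[256] in the post */

/* Hypothetical helper: copy at most dst_size bytes of src into dst and
 * return the number of bytes copied, so the caller can hand that length
 * to the transmit call instead of the raw strlen() result. */
size_t clamp_copy(uint8_t *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (len > dst_size) {
        len = dst_size;       /* truncate rather than write past dst */
    }
    memcpy(dst, src, len);
    return len;
}
```

In print_debug_str() the returned length would replace len in both the copy and the HAL_UART_Transmit_DMA() call. The UART1Busy flag polled in the while loop also needs to be declared volatile, since it is cleared from the transmit-complete callback.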

That made me wonder if it was not memory based but time based.

So I put the 100 __NOP(); in a for loop;

{
	  uint16_t i;
	  for (i=0;i<30000;i++)
	  {
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 20
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 40
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 60
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 80
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 100
	  }
  }

Now it stops in this code before the sprintf() statement.

I am now guessing it is a time issue. It happens at a certain time.

I have never set up the watchdog timer, and I do use three timers (TIM2, TIM3, TIM14) later in the program. So just in case they are no longer working like they used to, I decided to comment them out, which caused a warning about the routines not being called.

Here is the current version of the code. Until it breaks:

int main(void)
{
  /* USER CODE BEGIN 1 */
 
  /* USER CODE END 1 */
 
  /* MCU Configuration--------------------------------------------------------*/
 
  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
 
  /* USER CODE BEGIN Init */
 
  /* USER CODE END Init */
 
  /* Configure the system clock */
  SystemClock_Config();
 
  /* USER CODE BEGIN SysInit */
 
  /* USER CODE END SysInit */
 
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  MX_DMA_Init();
  MX_ADC1_Init();
  MX_I2C1_Init();
  //MX_TIM2_Init();
  //MX_TIM3_Init();
  //MX_TIM14_Init();
  MX_USART1_UART_Init();
  MX_USART2_UART_Init();
 
  /* Initialize interrupts */
  MX_NVIC_Init();
  /* USER CODE BEGIN 2 */
 
  /*
  ENCODER_POWER_OFF; // Make sure encoder is off so it can start correctly.
  print_debug_str("Encoder Power Off\r\n");
 
 
 
  TASK_PIN_L;
  TP2_L;
  DIO2_L;
  DIO3_L;
  DIO4_L;
 
 if (isDebug(DEBUG_verify_memory_structures))
  {
	  verify_memory_structures();
  }
*/
  __NOP();
  {
 
	  uint16_t i;
	  for (i=0;i<30000;i++)
	  {
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 20
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 40
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 60
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 80
		  __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); __NOP(); // 100
	  }
  }
  //ADC_init();
  __NOP();
 
  if (HAL_ADCEx_Calibration_Start(&hadc1) != HAL_OK)
  {
	  print_debug_str("Did not calibrate right!\r\n\r\n");
  }
  else
  {
	  char buf[120];
	  sprintf(buf,"Calibration Factor: %lu\r\n",ADC1->CALFACT); // dies on this statement
	  //sprintf(buf,"Calibration Factor: %lu\r\n",(uint32_t) 0xA5A5A5A5); // dies on this statement
	  print_debug_str(buf);
  }

Maybe I need to explain what happens.

I run the debugger to the __NOP(); on line 55 of the listing above, where I have a breakpoint. I also have a breakpoint at the __NOP(); on line 69. When I tell it to Resume after the first breakpoint, it never reaches the breakpoint at line 69.

Suspend does not appear to really work. The Suspend button goes off and Resume turns on, but I cannot examine variables. When I try to look at i it says "Error: Multiple errors reported.\ Failed to execute MI command: -var-create - * i Error message from debugger back end: -var-create: unable to create variable object\ Unable to create variable object"

When I try the Reset button, the Resume button goes out and the Suspend button lights up, but it does not go back to the beginning of main() like normal.

As a result, I have to use the Terminate button to leave the debugger, then Run -> Debug to get back into the debugger.

I upgraded to STM32CubeIDE 1.10.1 recently. Is this one of those known problems for ST in this version?

Did version 1.10.1 turn on something like the watchdog timer by default?

Any suggestions how to get past this? This has been working everyday for the past 2-3 months.

I have attached the whole main.c code so you can see how the peripherals were setup before this code was executed.

I am at a loss as to what happened. This was working. Could something have happened to the processor? I do not want to swap processors if I do not need to.

Thanks,

Kip

Accepted Solutions
KiptonM
Lead

With help from KnarfB and Piranha we finally figured out the issue in another question.

Feed Detail (st.com)

To make a long story short, KnarfB noticed in one of my debugging logs that the PC was wrong.

Download verified successfully 

 ------ Switching context ----- 

COM frequency = 4000 kHz

Target connection mode: Under reset

Reading ROM table for AP 0 @0xf0000fd0

Hardware watchpoint supported by the target 

ST-LINK Firmware version : V2J40S7

Device ID: 0x456

PC: 0x1fff1654

That address is in system memory, not flash.

So somehow the fuses (the option bytes) got changed. I had to install STM32CubeProgrammer, which lets you change the fuses. Unfortunately that did not work.

Piranha noticed the fuse descriptions in the Programmer were wrong and sent me to the right place in RM0444 Paragraph 2.5, Table 8. I set the fuses based on that info and it started working, with the PC in the right memory space.

Download verified successfully 

 ------ Switching context ----- 

COM frequency = 4000 kHz

Target connection mode: Under reset

Reading ROM table for AP 0 @0xf0000fd0

Hardware watchpoint supported by the target 

ST-LINK Firmware version : V2J40S7

Device ID: 0x456

PC: 0x8006398
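The difference between the two logs can be checked mechanically by classifying the reported PC against the device memory map. The sketch below is a host-side illustration only. The region bounds are assumptions (64 KB of main flash and 18 KB of SRAM for an STM32G051K8, and a deliberately generous window for the system memory area) and should be checked against the memory map in RM0444. The function and macro names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    REGION_FLASH,    /* main flash: where the application should start */
    REGION_SYSTEM,   /* system memory: ST bootloader area */
    REGION_SRAM,
    REGION_OTHER
} region_t;

/* Assumed ranges for illustration; verify against RM0444 for the exact part. */
#define MAIN_FLASH_START  0x08000000u
#define MAIN_FLASH_SIZE   (64u * 1024u)
#define SYSMEM_START      0x1FFF0000u
#define SYSMEM_END        0x1FFF7FFFu   /* generous upper bound, an assumption */
#define SRAM_START        0x20000000u
#define SRAM_SIZE         (18u * 1024u)

/* Classify the PC value printed by the ST-LINK GDB server log. */
region_t classify_pc(uint32_t pc)
{
    if (pc >= MAIN_FLASH_START && pc < MAIN_FLASH_START + MAIN_FLASH_SIZE) {
        return REGION_FLASH;
    }
    if (pc >= SYSMEM_START && pc <= SYSMEM_END) {
        return REGION_SYSTEM;
    }
    if (pc >= SRAM_START && pc < SRAM_START + SRAM_SIZE) {
        return REGION_SRAM;
    }
    return REGION_OTHER;
}
```

With these bounds, the PC from the failing log (0x1fff1654) falls in the system memory window and the PC from the working log (0x8006398) falls in main flash.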

Thanks all for the help.

14 REPLIES
KiptonM
Lead

That is curious. I commented out everything I was not using.

int main(void)
{
  /* USER CODE BEGIN 1 */
 
  /* USER CODE END 1 */
 
  /* MCU Configuration--------------------------------------------------------*/
 
  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
 
  /* USER CODE BEGIN Init */
 
  /* USER CODE END Init */
 
  /* Configure the system clock */
  SystemClock_Config();
 
  /* USER CODE BEGIN SysInit */
 
  /* USER CODE END SysInit */
 
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  //MX_DMA_Init();
  //MX_ADC1_Init();
  //MX_I2C1_Init();
  //MX_TIM2_Init();
  //MX_TIM3_Init();
  //MX_TIM14_Init();
  MX_USART1_UART_Init();
  //MX_USART2_UART_Init();
 
  /* Initialize interrupts */
  //MX_NVIC_Init();
  /* USER CODE BEGIN 2 */

It still crashes in the __NOP() loop.

sprintf() or snprintf() really don't need to use the heap, you pass them a buffer.

The ST-LINK are not, in my opinion, robust on USB Hubs, or similar arrangement on docking stations. So USB comms issues are a problem. Use of quality cables is also strongly advised.

There are no efficiency prizes for using uint16_t, the 32-bit int will be optimal for loop counts.

Do you have a Hard Fault Handler that reports failure details? The CM0(+) has no tolerance for unaligned accesses.
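As an illustration of what such a handler could report: on exception entry a Cortex-M core pushes eight words (R0, R1, R2, R3, R12, LR, PC, xPSR) onto the active stack, and the stacked PC points at or near the faulting instruction. The sketch below only decodes that frame from a pointer and runs on a host. The type and function names are hypothetical, and on the target a small assembly shim in HardFault_Handler (not shown) would have to pass in the correct stack pointer.

```c
#include <stdint.h>

/* Register values pushed by the core on exception entry, in stacking order. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12;
    uint32_t lr;     /* return address of the interrupted function */
    uint32_t pc;     /* address at or near the faulting instruction */
    uint32_t xpsr;
} fault_frame_t;

/* Hypothetical helper: sp must point at the first stacked word (R0). */
fault_frame_t decode_fault_frame(const uint32_t *sp)
{
    fault_frame_t f;
    f.r0   = sp[0];
    f.r1   = sp[1];
    f.r2   = sp[2];
    f.r3   = sp[3];
    f.r12  = sp[4];
    f.lr   = sp[5];
    f.pc   = sp[6];
    f.xpsr = sp[7];
    return f;
}
```

Printing f.pc and f.lr over the debug UART, or inspecting them at a breakpoint, would show where the fault was taken.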

Would definitely monitor stack depth. The heap is probably not going to be used much anywhere, and things should fail elegantly if malloc() returns NULL. Watch for the heap and stack colliding. Large local/auto variables are frequently a headache. See if problems come and go with the "static" directive on larger locals. See if altering optimization levels alters behaviour; that is often a predictor of latent coding issues/expectations.
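One common way to monitor stack depth is to paint the stack region with a known pattern at startup and later count how much of the pattern survives. The sketch below shows the idea on a plain array so it can run on a host. The names and the fill pattern are hypothetical, and on the target the region bounds would come from linker symbols, which are not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu   /* arbitrary pattern, an assumption */

/* Fill the whole region with the pattern before the stack is used. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count untouched words from the lowest address upward. The Cortex-M
 * stack grows downward, so the low end is the part used last. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

A result that approaches zero means the stack has come close to, or run past, the bottom of its region.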

KiptonM
Lead

Some new information. I double clicked on the .ioc file to bring up the MX plug-in and I have an icon spinning, and it does not bring up the MX program.

Finally it came up with a window:

This is interesting. I did the migration, then went into the MX program to make sure everything looked OK. It did, and then I generated the code and recompiled, with a lot of hope.

Still stops in the __NOP(); loop.

S.Ma
Principal

Put a 500msec delay before init the peripherals.

Try your code with interrupt disable

Create an interrupt routine for every unused interrupt that just sits in a while (1), and put a breakpoint in there.

Check whether a low-power mode cuts SWD activity.

Check whether the code reconfigures the SWD pins for non-debug use.

Check the stack size, increase it.

I don't use the library printf; I make my own. I avoid float, which is memory-costly, especially on Cortex-M0+. Fixed point or Q31 format is old practice. NO FLOAT/DOUBLE allowed within interrupts (Linux and Android phone drivers know this fact). Sometimes I just use int Temp_degC_x1000 to work around float.
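A minimal sketch of the Temp_degC_x1000 idea: keep the value as an integer number of thousandths and split it into whole and fractional parts only when printing. The function name format_milli is hypothetical, and this version still uses snprintf() with integer conversions only, which is an assumption about what is acceptable on the target.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a value held in thousandths (for example 23456 for 23.456 degC)
 * without any floating-point arithmetic. Returns the snprintf() result. */
int format_milli(char *buf, size_t size, int32_t milli)
{
    int64_t v = milli;          /* widen so that negating INT32_MIN is safe */
    const char *sign = "";
    if (v < 0) {
        sign = "-";
        v = -v;
    }
    return snprintf(buf, size, "%s%ld.%03ld",
                    sign, (long)(v / 1000), (long)(v % 1000));
}
```

The sign is handled separately so that values between -1 and 0, such as -500, print as -0.500 rather than losing the minus sign.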

MM..1
Chief III

Show here your it file. and other init used in MX user parts.

From quick read your UART debug is little chaos DMA .

And you test with or without ?

  ENCODER_POWER_OFF; // Make sure encoder is off so it can start correctly.
  print_debug_str("Encoder Power Off\r\n");

Plus i mean UNUSED need be first line in callback.

I do not understand your comments.

Show here your it file.

What is "it" file?

and other init used in MX user parts.

I attached the main.c which has the MX generated initialization routines. What else do you want? I do not understand.

From quick read your UART debug is little chaos DMA .

I do not know what that means. The print_debug_str() function has been working without problems for over a year. I can print at 3047619 bps with it without an issue. I like the fast baud rate because it does not slow other things down waiting for the print.

And you test with or without ?

I am finding the problem before it is called. It appears to be time related. At first I thought it was memory position related, until I looped on the __NOP(); and it quit in there.

UNUSED() must be first line.

I have to look at the UNUSED() routine. I thought all it did was fool the compiler to think a value was used when it was not. So it did not matter where in the routine it was placed as long as the warning was not generated. I guess once the variable is used, the compiler can reuse the register in the later code when it optimizes, but in a routine so small, I doubt if you will gain more than a couple of cycles.

I am working on your suggestions. Thanks for giving me something to try.

Put a 500msec delay before init the peripherals.

I have not needed this before, but I will try.

I put a loop in before the HAL_Init();

{
	uint16_t i, j;
	for (i=0;i<525;i++)
	{
		for (j=0;j<60000;j++) __NOP();
	}
}

This is closer to 11 seconds. But it did not affect it.
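The gap between the intended 500 ms and the observed 11 seconds follows from the loop arithmetic. The sketch below works it through on a host with integer math. It assumes the core is still running from the 16 MHz reset-default clock at that point (the loop sits before HAL_Init() and SystemClock_Config()) and takes the 11 seconds as a rough measurement. The function names are hypothetical.

```c
#include <stdint.h>

/* Total passes through the inner loop body: outer * inner. */
uint32_t loop_iterations(uint32_t outer, uint32_t inner)
{
    return outer * inner;
}

/* Cycles per inner-loop pass implied by a measured duration, scaled by 100
 * to keep two decimal places without using floating point. */
uint32_t implied_cycles_x100(uint32_t iterations, uint32_t seconds, uint32_t core_hz)
{
    uint64_t total_cycles = (uint64_t)seconds * core_hz;
    return (uint32_t)((total_cycles * 100u) / iterations);
}
```

With 525 x 60000 = 31,500,000 passes, 11 seconds at an assumed 16 MHz works out to roughly 5.6 cycles per pass, which is plausible for an unoptimized loop. A 500 ms delay at that rate would need on the order of 1.4 million passes.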

Try your code with interrupt disable

I removed the delay before the initializations.

I placed __disable_irq(); after the MX_ initializations in the code, before the __NOP(); loops.

It did not hang in the __NOP(); loop. So we know now it is interrupt related.

I already tried not initializing the timers. So it must be something else.

I commented out most of the initializations, like so:

/* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
 
  /* USER CODE BEGIN Init */
 
  /* USER CODE END Init */
 
  /* Configure the system clock */
  SystemClock_Config();
 
  /* USER CODE BEGIN SysInit */
 
  /* USER CODE END SysInit */
 
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
 // MX_DMA_Init();
  MX_ADC1_Init();
  MX_I2C1_Init();
//  MX_TIM2_Init();
//  MX_TIM3_Init();
//  MX_TIM14_Init();
//  MX_USART1_UART_Init();
//  MX_USART2_UART_Init();
 
  /* Initialize interrupts */
  MX_NVIC_Init();
  /* USER CODE BEGIN 2 */

It still is happening. I have not started I2C or ADC1 yet so I did not think they could be the issue.

So I commented them out also.

Still happening.

I commented out the MX_NVIC_Init();

Still happens.

Only thing left is MX_GPIO_Init();

Commented it out, still happens.

So now the only thing active is HAL_Init() and SystemClock_Config();

/* MCU Configuration--------------------------------------------------------*/
 
  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();
 
  /* USER CODE BEGIN Init */
 
  /* USER CODE END Init */
 
  /* Configure the system clock */
  SystemClock_Config();
 
  /* USER CODE BEGIN SysInit */
 
  /* USER CODE END SysInit */
 
  /* Initialize all configured peripherals */
//  MX_GPIO_Init();
 // MX_DMA_Init();
//  MX_ADC1_Init();
//  MX_I2C1_Init();
//  MX_TIM2_Init();
//  MX_TIM3_Init();
//  MX_TIM14_Init();
//  MX_USART1_UART_Init();
//  MX_USART2_UART_Init();
 
  /* Initialize interrupts */
//  MX_NVIC_Init();
  /* USER CODE BEGIN 2 */

I commented out SystemClock_Config(); and it still happens.

I commented out HAL_Init(); and now it does not stop in the loop. It looks like some interrupt is firing some time after HAL_Init(); is run.

But which one?

So I am single stepping through HAL_Init();

We know the problem happens when interrupts are enabled and HAL_Init() runs.

If HAL_Init() runs and interrupts are disabled we do not have the issue.

It looks like the following things happen in HAL_Init():

__HAL_FLASH_PREFETCH_BUFFER_ENABLE();

HAL_InitTick(TICK_INT_PRIORITY)

HAL_MspInit()

As I was writing this and the system was paused at the return status for HAL_Init() the debugger stopped working. The reset button would not work.

For some reason I am thinking it is HAL_InitTick().

Because that will generate an interrupt after about 1 ms.

I started stepping through HAL_InitTick():

__weak HAL_StatusTypeDef HAL_InitTick(uint32_t TickPriority)
{
  HAL_StatusTypeDef  status = HAL_OK;
 
  /* Check uwTickFreq for MisraC 2012 (even if uwTickFreq is a enum type that doesn't take the value zero)*/ 
  if ((uint32_t)uwTickFreq != 0U)
  {
    /*Configure the SysTick to have interrupt in 1ms time basis*/
    if (HAL_SYSTICK_Config(SystemCoreClock / (1000U /(uint32_t)uwTickFreq)) == 0U)
    {
      /* Configure the SysTick IRQ priority */
      if (TickPriority < (1UL << __NVIC_PRIO_BITS))
      {
        HAL_NVIC_SetPriority(SysTick_IRQn, TickPriority, 0U);
        uwTickPrio = TickPriority;
      }
      else
      {
        status = HAL_ERROR;
      }
    }
    else
    {
      status = HAL_ERROR;
    }
  }
  else
  {
    status = HAL_ERROR;
  }
 
  /* Return function status */
  return status;
}

And the debugger stopped working at the uwTickPrio = TickPriority; line.
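For reference, the reload arithmetic in the listing above can be checked on a host. The sketch mirrors the expression SystemCoreClock / (1000U / uwTickFreq) and the range check that the CMSIS SysTick_Config() function applies to the 24-bit reload value. The function names are hypothetical stand-ins, not HAL or CMSIS APIs.

```c
#include <stdint.h>

#define SYSTICK_LOAD_MAX 0x00FFFFFFu   /* SysTick reload register is 24 bits wide */

/* Number of core clocks per tick, as computed in HAL_InitTick().
 * tick_freq is the tick period in ms (1 for the default 1 kHz tick).
 * Returns 0 for values that would divide by zero. */
uint32_t systick_ticks(uint32_t core_hz, uint32_t tick_freq)
{
    if (tick_freq == 0u || tick_freq > 1000u) {
        return 0u;
    }
    return core_hz / (1000u / tick_freq);
}

/* Returns 0 and stores the reload value on success, 1 if it does not fit,
 * following the same convention as SysTick_Config(). */
int systick_reload(uint32_t ticks, uint32_t *reload)
{
    if (ticks == 0u || (ticks - 1u) > SYSTICK_LOAD_MAX) {
        return 1;
    }
    *reload = ticks - 1u;
    return 0;
}
```

At a 16 MHz core clock with a 1 ms tick this gives 16000 clocks per tick and a reload value of 15999, so the first SysTick interrupt would be expected about 1 ms after the counter is enabled.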

Looking at the HAL_IncTick(void) routine, it says:

/**
  * @brief This function is called to increment  a global variable "uwTick"
  *        used as application time base.
  * @note In the default implementation, this variable is incremented each 1ms
  *       in SysTick ISR.
  * @note This function is declared as __weak to be overwritten in case of other
  *      implementations in user file.
  * @retval None
  */
__weak void HAL_IncTick(void)
{
  uwTick += (uint32_t)uwTickFreq;
}

I do not know what the interrupt routine name is for the SysTick ISR.

So I did a global search for HAL_IncTick, figuring the SysTick ISR would be calling it.

It looks like SysTick_Handler(void) in stm32g0xx_it.c is the only place that uses HAL_IncTick():

/**
  * @brief This function handles System tick timer.
  */
void SysTick_Handler(void)
{
  /* USER CODE BEGIN SysTick_IRQn 0 */
 
  /* USER CODE END SysTick_IRQn 0 */
  HAL_IncTick();
  /* USER CODE BEGIN SysTick_IRQn 1 */
 
  /* USER CODE END SysTick_IRQn 1 */
}

But I cannot find where the interrupt vector table is set up, so that the processor knows to call this routine when the interrupt fires.
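In a CubeIDE-generated project the vector table is normally not in any C file. It is typically defined in the startup assembly file (a startup_stm32g0*.s file, which is an assumption about this project) as a table of handler addresses that the linker script places at the start of flash, with each handler weakly aliased to a default handler. The sketch below simulates that mechanism on a host with an array of function pointers. All Sim_ and sim_ names are hypothetical. The only architectural facts used are that entry 0 holds the initial stack pointer and that SysTick is exception number 15.

```c
#include <stddef.h>
#include <stdint.h>

#define VECTOR_COUNT       16u
#define HARDFAULT_EXC_NUM   3u
#define SYSTICK_EXC_NUM    15u   /* architectural exception number of SysTick */

typedef void (*handler_t)(void);

volatile uint32_t sim_tick;         /* stands in for uwTick */
volatile uint32_t sim_unexpected;   /* counts hits on the default handler */

void Sim_Default_Handler(void) { sim_unexpected++; }
void Sim_SysTick_Handler(void) { sim_tick++; }

/* Simulated table: entry 0 would hold the initial stack pointer on a real
 * part, and reserved entries stay NULL. */
handler_t sim_vectors[VECTOR_COUNT] = {
    [1]                 = Sim_Default_Handler,   /* Reset */
    [2]                 = Sim_Default_Handler,   /* NMI */
    [HARDFAULT_EXC_NUM] = Sim_Default_Handler,   /* HardFault */
    [11]                = Sim_Default_Handler,   /* SVCall */
    [14]                = Sim_Default_Handler,   /* PendSV */
    [SYSTICK_EXC_NUM]   = Sim_SysTick_Handler,   /* SysTick */
};

/* What the hardware does on an exception: look up the entry and jump to it. */
void sim_raise(uint32_t exc_num)
{
    if (exc_num < VECTOR_COUNT && sim_vectors[exc_num] != NULL) {
        sim_vectors[exc_num]();
    }
}
```

On the real part the lookup happens in hardware, so defining a function named SysTick_Handler in stm32g0xx_it.c is enough to override the weak default; no C code ever "sets" the vector explicitly.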