STM32WL IPC communication failure hangs the application from being able to send LORA messages

ravisha96
Associate II

Setup

  • STM32WL55JC1 embedded on a custom PCB with other peripherals
  • STM32CubeIDE
  • Lora gateway - Multitech MTCDT3AC model with a built-in network server, join server, packet forwarder and gateway.
  • STM32CubeWL f/w version 1.3.0

Application hangs because MBMUXIF_LoraSendCmd() occasionally blocks forever on Sem_MbLoRaRespRcv

I have an application built around the LoRaWAN_End_Node_DualCoreFreeRTOS example provided in the firmware package. My application on the CM4 sends telemetry roughly every 4-5 minutes. It runs well for a few days, and then MBMUXIF_LoraSendCmd() suddenly gets stuck waiting on Sem_MbLoRaRespRcv. From reading about how the dual-core system works, I understand that if a response is not received over the IPCC channels, the semaphore is never released. This is a serious problem for me because my application must keep sending telemetry every 4-5 minutes.

I cannot think of a reason why the CM4 core would fail to receive a response to a telemetry send command.

How to reproduce the bug

At this time, I cannot pinpoint how to reproduce this bug. It appears to happen at random: sometimes the system runs for a few days before the bug occurs, and sometimes it happens right away.

Additional context

I have set up an RTOS queue so that the send API is not bombarded with messages. However, when this issue occurs the queue fills up and no messages are sent.

** Code Snippet **

void MBMUXIF_LoraSendCmd(void)
{
  /* USER CODE BEGIN MBMUXIF_LoraSendCmd_1 */

  /* USER CODE END MBMUXIF_LoraSendCmd_1 */
  if (MBMUX_CommandSnd(FEAT_INFO_LORAWAN_ID) == 0)
  {
    osSemaphoreAcquire(Sem_MbLoRaRespRcv, osWaitForever);
  }
  else
  {
    Error_Handler();
  }
  /* USER CODE BEGIN MBMUXIF_LoraSendCmd_Last */

  /* USER CODE END MBMUXIF_LoraSendCmd_Last */
}
 

** Additional Info/questions **
I think this system waits forever on this semaphore by design. If a response never arrives, could there be a timeout with some kind of retry mechanism, or a communication-error callback?
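For illustration, a bounded wait with retries could look roughly like the sketch below. It is host-compilable C, not the firmware's code: `ipc_send_with_retry`, the two callback types and the `fake_*` helpers are hypothetical names. On target, the send callback would wrap MBMUX_CommandSnd(FEAT_INFO_LORAWAN_ID) and the wait callback would wrap osSemaphoreAcquire() with a finite timeout instead of osWaitForever. Whether it is safe to re-send a command while the other core may still be busy is an assumption that would need checking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback types. On target they would wrap
 * MBMUX_CommandSnd() and osSemaphoreAcquire() with a finite timeout. */
typedef int  (*send_cmd_fn)(void);                 /* returns 0 on success   */
typedef bool (*wait_resp_fn)(uint32_t timeout_ms); /* true if response came  */

typedef enum {
    IPC_SEND_OK = 0,      /* command sent and response received          */
    IPC_SEND_CMD_FAILED,  /* the send itself was rejected                */
    IPC_SEND_TIMEOUT      /* no response within any of the attempts      */
} ipc_send_status_t;

/* Send a command and wait a bounded time for the response, retrying
 * up to max_attempts times instead of blocking forever. */
ipc_send_status_t ipc_send_with_retry(send_cmd_fn send_cmd,
                                      wait_resp_fn wait_resp,
                                      uint32_t timeout_ms,
                                      unsigned max_attempts)
{
    assert(send_cmd != NULL && wait_resp != NULL);

    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        if (send_cmd() != 0) {
            return IPC_SEND_CMD_FAILED;
        }
        if (wait_resp(timeout_ms)) {
            return IPC_SEND_OK;
        }
        /* Timed out: fall through and try again. */
    }
    return IPC_SEND_TIMEOUT;
}

/* Host-side fakes, for demonstration only. */
unsigned fake_responses_to_drop = 0; /* how many waits should time out */
unsigned fake_cmds_sent = 0;         /* how many sends were issued     */

int fake_send_cmd(void)
{
    fake_cmds_sent++;
    return 0;
}

bool fake_wait_resp(uint32_t timeout_ms)
{
    (void)timeout_ms;
    if (fake_responses_to_drop > 0) {
        fake_responses_to_drop--;
        return false; /* simulate a lost response */
    }
    return true;
}
```

With a status code like IPC_SEND_TIMEOUT, the caller could log the failure, drain or keep the telemetry queue, or trigger a recovery path rather than hanging.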

1 ACCEPTED SOLUTION

STTwo-32
ST Employee

Hello @ravisha96 

Please add critical sections to the functions in radio_driver.c as shown below, and test again. They have implemented these changes, and the good news is that they have not encountered any issue so far. Because the application used to run for a few days before hanging suddenly, they will continue testing on several devices over a period of 10 days and will let us know whether adding the critical sections solves the issue.


void SUBGRF_WriteRegister( uint16_t addr, uint8_t data )
{
  CRITICAL_SECTION_BEGIN();
  HAL_SUBGHZ_WriteRegisters( &hsubghz, addr, (uint8_t*)&data, 1 );
  CRITICAL_SECTION_END();
}

uint8_t SUBGRF_ReadRegister( uint16_t addr )
{
  uint8_t data;
  CRITICAL_SECTION_BEGIN();
  HAL_SUBGHZ_ReadRegisters( &hsubghz, addr, &data, 1 );
  CRITICAL_SECTION_END();
  return data;
}


https://github.com/STMicroelectronics/STM32CubeWL/blob/44ecf7c04341aa37a06bcc761415e2a1461872be/Middlewares/Third_Party/SubGHz_Phy/stm32_radio_driver/radio_driver.c#L952

https://github.com/STMicroelectronics/STM32CubeWL/blob/44ecf7c04341aa37a06bcc761415e2a1461872be/Middlewares/Third_Party/SubGHz_Phy/stm32_radio_driver/radio_driver.c#L957
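
For context, the guard pattern behind macros such as CRITICAL_SECTION_BEGIN() and CRITICAL_SECTION_END() is usually: save the current interrupt mask, disable interrupts, perform the access, then restore the saved mask. The sketch below shows that pattern on a host, with a plain variable standing in for the Cortex-M PRIMASK register. The names `sim_primask`, `critical_enter`, `critical_exit` and `guarded_write` are hypothetical, and the exact macro definitions used by this firmware are not shown in this thread, so treat this as an assumption about how they behave.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for PRIMASK: 0 = interrupts enabled, 1 = masked.
 * On target, the CMSIS intrinsics __get_PRIMASK(), __disable_irq() and
 * __set_PRIMASK() would be used instead. */
uint32_t sim_primask = 0;

/* Stand-in for a radio register written through the SUBGHZ HAL. */
uint8_t sim_register = 0;

/* Save the current mask state, then mask interrupts. */
uint32_t critical_enter(void)
{
    uint32_t saved = sim_primask;
    sim_primask = 1u;
    return saved;
}

/* Restore the state captured by critical_enter(). Restoring instead of
 * unconditionally enabling keeps nested critical sections correct. */
void critical_exit(uint32_t saved)
{
    sim_primask = saved;
}

/* Register write that cannot be interrupted part-way through. */
void guarded_write(uint8_t value)
{
    uint32_t saved = critical_enter();
    assert(sim_primask == 1u); /* interrupts are masked during the access */
    sim_register = value;      /* stands in for HAL_SUBGHZ_WriteRegisters() */
    critical_exit(saved);
}
```

Because the previous mask is restored rather than cleared, calling `guarded_write` from code that already masked interrupts does not re-enable them early.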


Best Regards.

STTwo-32

To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.


2 REPLIES
STTwo-32
ST Employee

Hello @ravisha96 

This same case has been reported on GitHub (I think by you). My colleague has escalated it internally for further investigation and support (internal ticket number 176222). We will get back to you, here or on GitHub, as soon as we have any news.

Best Regards.

STTwo-32
