2024-05-29 12:28 PM - last edited on 2024-05-30 12:26 AM by Peter BENSCH
Setup
Application sometimes hangs because MBMUXIF_LoraSendCmd() gets stuck waiting on Sem_MbLoRaRespRcv
I have an application built around the LoRaWAN_End_Node_DualCoreFreeRTOS example provided in the firmware. My application on the CM4 sends telemetry roughly every 4-5 minutes. It runs well for a few days, and then MBMUXIF_LoraSendCmd() suddenly gets stuck waiting on Sem_MbLoRaRespRcv. After reading more about how the dual-core system works, I figured out that if a response is not received through the IPCC channels, the semaphore is never released. This is a serious pitfall for me because my application must send telemetry continuously at that 4-5 minute rate.
I cannot think of a reason why the CM4 core would not receive a response (Resp) to a telemetry send command (Cmd).
How to reproduce the bug
At this time, I cannot pinpoint how to reproduce this bug. As far as I can tell, it happens randomly. Sometimes the system runs for a few days before the bug occurs, and sometimes it happens right away.
Additional context
I have set up an RTOS queue so that I do not bombard the send API with messages. However, when this issue occurs, the queue fills up and no messages are sent.
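For illustration, here is a minimal host-side sketch of a bounded telemetry queue that never blocks the producer: when the queue is full it overwrites the oldest entry and counts the drop, so a stalled sender shows up as a rising counter. All names (`telemetry_queue_t`, `TELEMETRY_QUEUE_DEPTH`, the one-word payload) are hypothetical and not taken from the original application. The sketch is not thread-safe as written; on target, a CMSIS-RTOS2 message queue with a zero timeout on `osMessageQueuePut()` would be the more usual choice.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TELEMETRY_QUEUE_DEPTH 4u   /* hypothetical depth */

typedef struct {
    uint32_t items[TELEMETRY_QUEUE_DEPTH]; /* hypothetical one-word payload */
    size_t   head;     /* index of the oldest stored item */
    size_t   count;    /* number of stored items */
    uint32_t dropped;  /* number of items overwritten while full */
} telemetry_queue_t;

/* Enqueue without blocking; when full, overwrite the oldest item. */
void telemetry_queue_push(telemetry_queue_t *q, uint32_t item)
{
    size_t tail = (q->head + q->count) % TELEMETRY_QUEUE_DEPTH;

    q->items[tail] = item;
    if (q->count == TELEMETRY_QUEUE_DEPTH) {
        /* Full: tail equals head, so the oldest item was just replaced. */
        q->head = (q->head + 1u) % TELEMETRY_QUEUE_DEPTH;
        q->dropped++;
    } else {
        q->count++;
    }
}

/* Dequeue the oldest item; returns false when the queue is empty. */
bool telemetry_queue_pop(telemetry_queue_t *q, uint32_t *out)
{
    if (q->count == 0u) {
        return false;
    }
    *out = q->items[q->head];
    q->head = (q->head + 1u) % TELEMETRY_QUEUE_DEPTH;
    q->count--;
    return true;
}
```

A non-zero `dropped` counter indicates that the consumer (the task calling the send API) has stopped draining the queue, which can be used to trigger a watchdog or recovery path.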
** Code Snippet **
void MBMUXIF_LoraSendCmd(void)
{
  /* USER CODE BEGIN MBMUXIF_LoraSendCmd_1 */

  /* USER CODE END MBMUXIF_LoraSendCmd_1 */
  if (MBMUX_CommandSnd(FEAT_INFO_LORAWAN_ID) == 0)
  {
    osSemaphoreAcquire(Sem_MbLoRaRespRcv, osWaitForever);
  }
  else
  {
    Error_Handler();
  }
  /* USER CODE BEGIN MBMUXIF_LoraSendCmd_Last */

  /* USER CODE END MBMUXIF_LoraSendCmd_Last */
}
** Additional Info/questions **
I think that, by design, this system waits forever on this semaphore. If a response never arrives, could there be a retry mechanism, or could the failure be reported through some kind of communication-error callback?
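To make the question concrete, the following is a rough sketch of a bounded wait with retries. It is not the ST middleware implementation. The function `send_cmd_with_timeout()`, the callback types, and the result codes are hypothetical. On target, the send callback would wrap `MBMUX_CommandSnd(FEAT_INFO_LORAWAN_ID)` and the wait callback would wrap `osSemaphoreAcquire(Sem_MbLoRaRespRcv, timeout)`, which returns `osErrorTimeout` instead of blocking forever when given a finite timeout. The two `fake_` functions are host-side doubles that simulate one lost response.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback types (not part of the ST middleware). */
typedef int  (*send_fn_t)(void);                /* returns 0 on success */
typedef bool (*wait_fn_t)(uint32_t timeout_ms); /* true if a response arrived */

typedef enum {
    CMD_OK = 0,
    CMD_SEND_FAILED,
    CMD_NO_RESPONSE
} cmd_result_t;

/* Send a command and wait a bounded time for the response, retrying
 * up to max_attempts times before reporting a communication error. */
cmd_result_t send_cmd_with_timeout(send_fn_t send, wait_fn_t wait_resp,
                                   uint32_t timeout_ms, unsigned max_attempts)
{
    for (unsigned attempt = 0; attempt < max_attempts; ++attempt) {
        if (send() != 0) {
            return CMD_SEND_FAILED;
        }
        if (wait_resp(timeout_ms)) {
            return CMD_OK;
        }
        /* Timed out: loop around and send again. */
    }
    return CMD_NO_RESPONSE;
}

/* Host-side doubles for illustration only: the first wait times out,
 * the second one succeeds. */
static unsigned fake_sends = 0;

static int fake_send(void)
{
    ++fake_sends;
    return 0;
}

static bool fake_wait(uint32_t timeout_ms)
{
    (void)timeout_ms;
    return fake_sends >= 2u;
}
```

One caveat with this approach: if a response arrives after the timeout has expired, the semaphore is released late and the next wait would return immediately with a stale response, so a real implementation would need to drain or resynchronize the semaphore before resending. Whether resending the same mailbox command is safe for the CM0+ side is also an assumption that would need checking.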
2024-07-16 08:16 AM
Hello @ravisha96
You have to add critical sections to the functions in radio_driver.c as shown below, and test again. They implemented these changes, and the good news is that they have not encountered any issue so far. They will continue to test on several devices over a period of 10 days, since the application used to run for a few days before hanging suddenly. They will let us know whether adding the critical sections solves the issue.
void SUBGRF_WriteRegister( uint16_t addr, uint8_t data )
{
    CRITICAL_SECTION_BEGIN();
    HAL_SUBGHZ_WriteRegisters( &hsubghz, addr, (uint8_t*)&data, 1 );
    CRITICAL_SECTION_END();
}

uint8_t SUBGRF_ReadRegister( uint16_t addr )
{
    uint8_t data;
    CRITICAL_SECTION_BEGIN();
    HAL_SUBGHZ_ReadRegisters( &hsubghz, addr, &data, 1 );
    CRITICAL_SECTION_END();
    return data;
}
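As background on what such a critical section does, here is a small host-side sketch of the usual save/disable/restore pattern. It is an illustration under assumptions, not the actual definition of `CRITICAL_SECTION_BEGIN()` in the firmware. The names `critical_enter()`, `critical_exit()`, `guarded_write()`, and the simulated variables are hypothetical. On a Cortex-M target the mask would be PRIMASK, typically handled with the CMSIS-Core intrinsics `__get_PRIMASK()`, `__disable_irq()`, and `__set_PRIMASK()`.

```c
#include <stdint.h>

/* Simulated interrupt-mask register; 0 means interrupts enabled.
 * On target this role is played by PRIMASK. */
static uint32_t sim_primask = 0u;

/* Stand-in for a radio register touched by the guarded access. */
static uint8_t sim_register = 0u;

/* Save the current mask state, then mask interrupts. */
static uint32_t critical_enter(void)
{
    uint32_t saved = sim_primask;
    sim_primask = 1u;
    return saved;
}

/* Restore the saved state instead of unconditionally re-enabling,
 * so that nested critical sections stay masked until the outermost exit. */
static void critical_exit(uint32_t saved)
{
    sim_primask = saved;
}

/* A register access that cannot be interrupted halfway through. */
void guarded_write(uint8_t value)
{
    uint32_t saved = critical_enter();
    sim_register = value;
    critical_exit(saved);
}
```

Restoring the saved mask rather than always re-enabling interrupts matters when the guarded function is itself called from inside another critical section or from a context where interrupts are already masked.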
Best Regards.
STTwo-32
To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.
2024-06-27 10:29 AM
Hello @ravisha96
This same case has been reported on GitHub (I think it is you). So, my colleague escalated it internally for more investigation and support (under internal ticket number 176222). We will get back to you (here or on GitHub) as soon as we have any news.
Best Regards.
STTwo-32