2025-10-14 5:51 AM - last edited on 2025-10-14 6:48 AM by Andrew Neil
We have a battery-powered device using:
Our communication pattern:
This means we enable and then disable the UART Idle Line interrupt on every transaction (potentially hundreds of times per day).
#define RING_BUFFER_SIZE 2048
#define ISR_BUFFER_SIZE 1024
typedef struct {
    lwrb_t rb;                           // lwrb ring buffer
    volatile bool rx_data_ready_flag;    // Flag set by ISR
    uint8_t rb_buffer[RING_BUFFER_SIZE]; // Ring buffer storage
    uint8_t isr_buffer[ISR_BUFFER_SIZE]; // Intermediate buffer for Idle Line ISR
} uart_rb_t;
uart_rb_t modem_rb;
void HAL_UARTEx_RxEventCallback(UART_HandleTypeDef *huart, uint16_t Size) {
    if (huart == &MODEM_UART) {
        // Save data from ISR buffer to ring buffer
        lwrb_write(&modem_rb.rb, modem_rb.isr_buffer, Size);
        modem_rb.rx_data_ready_flag = true;
        // Re-enable Idle Line interrupt
        int retries = 10;
        do {
            if (HAL_UARTEx_ReceiveToIdle_IT(&MODEM_UART,
                                            modem_rb.isr_buffer,
                                            sizeof(modem_rb.isr_buffer)) == HAL_OK) {
                break;
            }
            retries--;
        } while (retries > 0);
    }
}
After ~60 days of continuous operation, the device completely froze with no recovery.
<<EndParse <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>StartParse >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
"AT+QISTATE=1,1" --> [response received OK]
<<EndParse <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>StartParse >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
"AT+QISTATE=1,1" --> [response received OK]
<<EndParse <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>StartParse >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
"AT+QISTATE=1,1" --> [response received OK]
<<EndParse <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>StartParse >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[SYSTEM FROZE HERE - no further output]
Notice: The command string was not printed in the last call, suggesting the code hung before that point.
uint32_t tick = HAL_GetTick();
do {
    return_code = modem_send_command_wait_parse_result(
        "AT+QISTATE=1,1",
        "+QISTATE:",
        /* parsing params */,
        300 /* timeout ms */
    );
    if (condition_met) break;
    HAL_Delay(100);
} while (HAL_GetTick() - tick < 15000); // 15 second outer timeout
int modem_send_command_wait_parse_result(..., int timeout, ...) {
    // Local buffers
    char formatted_command[1024] = {0};
    char buffer_final[2048] = {0};
    unsigned int total_bytes = 0;
    printf(">>StartParse >>>>\n");
    if (command_to_send != NULL) {
        printf("\"%s\" --> ", command_to_send);
    }
    // Enable UART RX Interrupt with Idle-line detection
    int retries = 10;
    do {
        if (HAL_UARTEx_ReceiveToIdle_IT(&MODEM_UART,
                                        modem_rb.isr_buffer,
                                        sizeof(modem_rb.isr_buffer)) == HAL_OK) {
            break;
        }
        retries--;
    } while (retries > 0);
    if (retries == 0) {
        return_code = -1;
    }
    uint32_t tick = HAL_GetTick();
    // Main receive loop with timeout
    while (return_code > 0 && ((HAL_GetTick() - tick) < timeout)) {
        // Check flag and read from ring buffer
        if (modem_rb.rx_data_ready_flag) {
            modem_rb.rx_data_ready_flag = false;
            int bytes = lwrb_read(&modem_rb.rb,
                                  &buffer_final[total_bytes],
                                  sizeof(buffer_final) - total_bytes);
            total_bytes += bytes;
        }
        // Parse response, check for expected strings, etc.
        // ...
    }
    // Cleanup
    HAL_UART_Abort_IT(&MODEM_UART);
    lwrb_reset(&modem_rb.rb);
    return return_code;
}
// Main loop (non-atomic):
if (modem_rb.rx_data_ready_flag) {       // Read
    modem_rb.rx_data_ready_flag = false; // Write - ISR could interrupt here!
}
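For reference, one way to remove the gap between the read and the clear is to do both in a single atomic exchange. This is only a sketch, not something we have tested on the device: it assumes a toolchain with C11 `<stdatomic.h>` and a core where byte-sized atomics are lock-free, and the names `rx_data_ready`, `rx_flag_set_from_isr` and `rx_flag_take` are hypothetical, not part of HAL or lwrb.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the volatile bool flag (zero-initialised: false). */
static atomic_bool rx_data_ready;

/* Would be called from the RxEvent callback (ISR context). */
static void rx_flag_set_from_isr(void) {
    atomic_store(&rx_data_ready, true);
}

/* Would be called from the main loop: returns the previous value and
   clears the flag in one indivisible step, so a set by the ISR can only
   land entirely before or entirely after the exchange. */
static bool rx_flag_take(void) {
    return atomic_exchange(&rx_data_ready, false);
}
```

In the receive loop this would read `if (rx_flag_take()) { /* lwrb_read(...) */ }`. If atomics are not available on the target, briefly masking the UART interrupt around the read and clear is the usual alternative.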
2025-10-14 6:36 AM
You should be able to attach a debugger without resetting the device to examine its state.
There is nothing inherent to the device which stops working after X days. It has to be a bug in the code somewhere.
1 ms * 2^32 is 49.7 days. It is possible you have an issue with a timeout that uses SysTick, but the HAL functions handle this overflow correctly.
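To illustrate the point about the tick counter wrapping, here is a minimal sketch of a timeout check that survives the 2^32 ms rollover next to one that does not. The helper names (`elapsed_ms`, `timed_out`, `timed_out_naive`) are made up for the example and are not HAL functions; `now` and `start` stand in for values returned by `HAL_GetTick()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Elapsed time in ms. Unsigned subtraction is modulo 2^32, so the result
   stays correct when 'now' has wrapped past zero and 'start' has not. */
static uint32_t elapsed_ms(uint32_t now, uint32_t start) {
    return (uint32_t)(now - start);
}

/* Wrap-safe check: same shape as (HAL_GetTick() - tick) < timeout. */
static bool timed_out(uint32_t now, uint32_t start, uint32_t timeout_ms) {
    return elapsed_ms(now, start) >= timeout_ms;
}

/* Fragile check: the absolute deadline itself can wrap, which makes the
   comparison fire immediately (or much too late) near the rollover. */
static bool timed_out_naive(uint32_t now, uint32_t start, uint32_t timeout_ms) {
    uint32_t deadline = start + timeout_ms; /* may wrap to a small value */
    return now >= deadline;
}
```

Any timeout in the application written in the second style would be worth checking, since it only misbehaves in a short window around day 49.7.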
2025-10-14 6:46 AM
Is this just one isolated instance on one particular device, or are you seeing many such occurrences?
2025-10-14 6:51 AM - edited 2025-10-14 6:53 AM
@GR88_gregni wrote:
- Are we using UART Idle Line correctly for this use case (repeated enable/disable)?
So what, exactly, is the purpose of the Idle Line interrupt in your system?
You seem to be just doing AT commands - that can be (usually is?) done without Idle Line detection...
2025-10-14 6:52 AM
This is the first occurrence we've seen. We have 3 devices in the field for 2 months, and this is the only one that froze after 60 days. However, we're concerned it may be a latent bug that will affect others.
2025-10-14 6:56 AM
Did you resolve your Long Blocking Operations Dilemma?
Having very long blocking delays sounds risky...
2025-10-14 6:56 AM
We use Idle Line interrupt to detect when the BG95 modem has finished transmitting its response, since AT command responses are variable-length and we don't know in advance how many bytes to expect.
2025-10-14 6:57 AM
Yes I have resolved that.
2025-10-14 6:59 AM
Then please feed back in that thread and mark the solution.
2025-10-14 7:00 AM
@GR88_gregni wrote: AT command responses are variable-length and we don't know in advance how many bytes to expect.
But they have well-defined termination criteria.
The usual approach is to look for the termination.
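As a rough sketch of what "look for the termination" can mean: accumulate received bytes and check whether the last complete line is a final result code. The example assumes the modem uses the standard verbose result codes (`OK`, `ERROR`, `+CME ERROR: <n>`), each framed by `\r\n`; `at_status_t` and `at_response_status` are hypothetical names, and unsolicited result codes arriving mid-response are not handled.

```c
#include <stddef.h>
#include <string.h>

typedef enum { AT_PENDING, AT_OK, AT_ERROR } at_status_t;

/* Inspect the bytes received so far (buf[0..len)) and report whether the
   last complete line is a final result code. */
static at_status_t at_response_status(const char *buf, size_t len) {
    /* The last line is only complete once it ends in "\r\n". */
    if (len < 2 || buf[len - 2] != '\r' || buf[len - 1] != '\n') {
        return AT_PENDING;
    }
    size_t end = len - 2;   /* one past the last character of the line */
    size_t start = end;
    while (start > 0 && buf[start - 1] != '\n') {
        start--;            /* walk back to the start of the last line */
    }
    const char *line = buf + start;
    size_t n = end - start;

    if (n == 2 && memcmp(line, "OK", 2) == 0) return AT_OK;
    if (n == 5 && memcmp(line, "ERROR", 5) == 0) return AT_ERROR;
    if (n >= 11 && memcmp(line, "+CME ERROR:", 11) == 0) return AT_ERROR;
    return AT_PENDING;      /* informational line, keep receiving */
}
```

Called after each chunk is appended to the response buffer, this lets the receive loop stop as soon as the response is complete, with the overall timeout kept only as a safety net. It works the same whether bytes arrive one at a time or in idle-delimited bursts.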