2025-10-14 5:51 AM - last edited on 2025-10-14 6:48 AM by Andrew Neil
We have a battery-powered device using:
Our communication pattern:
This means we re-enable and disable the UART Idle Line interrupt on every transaction (potentially hundreds of times per day).
#define RING_BUFFER_SIZE 2048
#define ISR_BUFFER_SIZE  1024
typedef struct {
    lwrb_t rb;                              // lwrb ring buffer
    volatile bool rx_data_ready_flag;       // Flag set by ISR
    uint8_t rb_buffer[RING_BUFFER_SIZE];    // Ring buffer storage
    uint8_t isr_buffer[ISR_BUFFER_SIZE];    // Intermediate buffer for Idle Line ISR
} uart_rb_t;
uart_rb_t modem_rb;
void HAL_UARTEx_RxEventCallback(UART_HandleTypeDef *huart, uint16_t Size) {
    if (huart == &MODEM_UART) {
        // Save data from ISR buffer to ring buffer
        lwrb_write(&modem_rb.rb, modem_rb.isr_buffer, Size);
        modem_rb.rx_data_ready_flag = true;
        
        // Re-enable Idle Line interrupt
        int retries = 10;
        do {
            if (HAL_UARTEx_ReceiveToIdle_IT(&MODEM_UART, 
                                            modem_rb.isr_buffer,
                                            sizeof(modem_rb.isr_buffer)) == HAL_OK) {
                break;
            }
            retries--;
        } while (retries > 0);
    }
}
After ~60 days of continuous operation, the device completely froze with no recovery.
<<EndParse <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>StartParse >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
"AT+QISTATE=1,1" --> [response received OK]
<<EndParse <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>StartParse >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
"AT+QISTATE=1,1" --> [response received OK]
<<EndParse <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>StartParse >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
"AT+QISTATE=1,1" --> [response received OK]
<<EndParse <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
>>StartParse >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[SYSTEM FROZE HERE - no further output]
Notice: The command string was not printed in the last call, suggesting the code hung before that point.
uint32_t tick = HAL_GetTick();
do {
    return_code = modem_send_command_wait_parse_result(
        "AT+QISTATE=1,1", 
        "+QISTATE:", 
        /* parsing params */,
        300 /* timeout ms */
    );
    
    if (condition_met) break;
    HAL_Delay(100);
    
} while (HAL_GetTick() - tick < 15000);  // 15 second outer timeout
int modem_send_command_wait_parse_result(..., int timeout, ...) {
    // Local buffers
    char formatted_command[1024] = {0};
    char buffer_final[2048] = {0};
    unsigned int total_bytes = 0;
    
    printf(">>StartParse >>>>\n");
    if (command_to_send != NULL) {
        printf("\"%s\" --> ", command_to_send);
    }
    
    // Enable UART RX Interrupt with Idle-line detection
    int retries = 10;
    do {
        if (HAL_UARTEx_ReceiveToIdle_IT(&MODEM_UART, 
                                        modem_rb.isr_buffer,
                                        sizeof(modem_rb.isr_buffer)) == HAL_OK) {
            break;
        }
        retries--;
    } while (retries > 0);
    
    if (retries == 0) {
        return_code = -1;
    }
    
    uint32_t tick = HAL_GetTick();
    
    // Main receive loop with timeout
    while (return_code > 0 && ((HAL_GetTick() - tick) < timeout)) {
        // Check flag and read from ring buffer
        if (modem_rb.rx_data_ready_flag) {
            modem_rb.rx_data_ready_flag = false;
            
            int bytes = lwrb_read(&modem_rb.rb, 
                                  &buffer_final[total_bytes],
                                  sizeof(buffer_final) - total_bytes);
            total_bytes += bytes;
        }
        
        // Parse response, check for expected strings, etc.
        // ...
    }
    
    // Cleanup
    HAL_UART_Abort_IT(&MODEM_UART);
    lwrb_reset(&modem_rb.rb);
    
    return return_code;
}
// Main loop (non-atomic):
   if (modem_rb.rx_data_ready_flag) {        // Read
       modem_rb.rx_data_ready_flag = false;  // Write - ISR could interrupt here!
   }
2025-10-14 6:36 AM
You should be able to attach a debugger without resetting the device to examine its state.
There is nothing inherent to the device which stops working after X days. It has to be a bug in the code somewhere.
1 ms * 2^32 is 49.7 days. It's possible you have an issue with a timeout that uses SysTick, but the HAL functions handle this overflow correctly.
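To illustrate the overflow point, here is a minimal sketch of the difference between a wrap-safe timeout check and a fragile one. The helper names are hypothetical (they are not HAL functions); the `now` and `start` values stand in for readings of `HAL_GetTick()`. The loops posted above use the subtraction form, `HAL_GetTick() - tick < timeout`, which is the wrap-safe one, so this is mainly something to check for elsewhere in the code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe: unsigned subtraction is performed modulo 2^32, so the elapsed
 * time comes out right even if the tick counter rolled over after 'start'. */
bool timeout_elapsed(uint32_t now, uint32_t start, uint32_t timeout_ms)
{
    return (uint32_t)(now - start) >= timeout_ms;
}

/* Fragile, shown for contrast: the absolute deadline 'start + timeout_ms'
 * can itself wrap past 2^32, which makes the comparison meaningless. */
bool timeout_elapsed_fragile(uint32_t now, uint32_t start, uint32_t timeout_ms)
{
    return now >= start + timeout_ms;
}
```

With `start = 0xFFFFFF00` and a 300 ms timeout, the fragile version already reports a timeout 16 ms later, because the deadline wrapped to a small number. The opposite failure (a deadline that is never reached) is also possible with other variants of this pattern, and that kind of bug only shows up near the 49.7 day mark.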
2025-10-14 6:46 AM
Is this just one isolated instance on one particular device, or are you seeing many such occurrences?
2025-10-14 6:51 AM - edited 2025-10-14 6:53 AM
@GR88_gregni wrote:
- Are we using UART Idle Line correctly for this use case (repeated enable/disable)?
 
So what, exactly, is the purpose of the Idle Line interrupt in your system?
You seem to be just doing AT commands - that can be (usually is?) done without Idle Line detection...
2025-10-14 6:52 AM
This is the first occurrence we've seen. We have 3 devices in the field for 2 months, and this is the only one that froze after 60 days. However, we're concerned it may be a latent bug that will affect others.
2025-10-14 6:56 AM
Did you resolve your Long Blocking Operations Dilemma?
Having very long blocking delays sounds risky...
2025-10-14 6:56 AM
We use Idle Line interrupt to detect when the BG95 modem has finished transmitting its response, since AT command responses are variable-length and we don't know in advance how many bytes to expect.
2025-10-14 6:57 AM
Yes I have resolved that.
2025-10-14 6:59 AM
Then please feed back in that thread and mark the solution.
2025-10-14 7:00 AM
@GR88_gregni wrote:
AT command responses are variable-length and we don't know in advance how many bytes to expect.
But they have well-defined termination criteria.
The usual approach is to look for the termination.
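As a rough sketch of what "look for the termination" can mean: scan the accumulated response for a complete line that is a final result code. The function name `at_response_complete` is hypothetical and not part of any modem library. The list of codes (`OK`, `ERROR`, `+CME ERROR:`, `+CMS ERROR:`) is the standard set; I am assuming the response is terminated that way, so check the modem's AT command manual for commands that end with a prompt or a command-specific message instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true once 'buf' holds a complete CRLF-terminated line that is a
 * final result code. Hypothetical helper, sketched for illustration only. */
bool at_response_complete(const char *buf, size_t len)
{
    static const char *const final_codes[] = {
        "OK", "ERROR", "+CME ERROR:", "+CMS ERROR:"
    };
    size_t line_start = 0;

    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] != '\r' || buf[i + 1] != '\n') {
            continue;
        }
        size_t line_len = i - line_start;
        for (size_t k = 0; k < sizeof final_codes / sizeof final_codes[0]; k++) {
            size_t code_len = strlen(final_codes[k]);
            /* Codes ending in ':' carry an error number, so match as prefix;
             * "OK" and "ERROR" must be the whole line. */
            bool prefix_only = final_codes[k][code_len - 1] == ':';
            if (line_len >= code_len &&
                memcmp(buf + line_start, final_codes[k], code_len) == 0 &&
                (prefix_only || line_len == code_len)) {
                return true;
            }
        }
        line_start = i + 2;  /* next line begins after the CRLF */
        i++;                 /* skip the '\n' just consumed */
    }
    return false;
}
```

The receive loop could call this on `buffer_final` each time new bytes arrive and stop as soon as it returns true, with the timeout kept only as a safety net. That works with plain byte-by-byte or DMA reception into the ring buffer, so the receiver does not have to be re-armed and aborted around every command.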