STM32WB55: Correct way to clear I2C3 interrupt at the peripheral

d0
Associate

Dear community,

We are currently developing custom firmware for an upcoming device based on STM32WB55. It's written from scratch, including a very simple cooperative scheduler. Recently, we rewrote our I2C driver. Ever since, we occasionally see one of our asserts failing.

The bug sequence is as follows:

  • Handler for I2C3_EV (i2c_handle_event) is called
  • Checks that there is a non-NULL transaction associated with I2C3 (this is a global volatile variable)
  • The handler finds NACKF set in the corresponding I2C_ISR
  • Another function (i2c_handle_error) is called to deal with the NACK. It:
      • Masks the corresponding I2C_ISR to obtain the error flags
      • Acknowledges the interrupts by writing to the I2C_ICR register
      • Stops the current transaction and sets the current-transaction global variable to NULL
      • If another transaction is pending, starts it
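
The masking and acknowledge steps in the list above can be sketched roughly as follows. The actual body of i2c_handle_error was not posted, so this is only an illustration: the register struct is a host-side stand-in for what the CMSIS device header provides on target, and i2c_ack_errors and I2C_ERROR_FLAGS are hypothetical names. It relies on the clear bits in I2C_ICR sitting at the same bit positions as the matching flags in I2C_ISR, which is how the reference manual lays them out for these four flags.

```c
#include <stdint.h>

/* Host-side stand-in (assumption): on target, use I2C3->ISR and I2C3->ICR
 * from the CMSIS device header instead of this struct. */
typedef struct {
    volatile uint32_t ISR;  /* interrupt and status register */
    volatile uint32_t ICR;  /* interrupt clear register, write 1 to clear */
} i2c_regs_t;

/* Error flag positions in I2C_ISR. */
#define I2C_ISR_NACKF (1u << 4)   /* not-acknowledge received */
#define I2C_ISR_BERR  (1u << 8)   /* bus error */
#define I2C_ISR_ARLO  (1u << 9)   /* arbitration lost */
#define I2C_ISR_OVR   (1u << 10)  /* overrun/underrun */

/* Hypothetical helper mask: the error flags this sketch cares about. */
#define I2C_ERROR_FLAGS (I2C_ISR_NACKF | I2C_ISR_BERR | I2C_ISR_ARLO | I2C_ISR_OVR)

/* Mask the status register down to the error flags, acknowledge exactly
 * those flags, and return them so the caller can decide what to do next. */
uint32_t i2c_ack_errors(i2c_regs_t *i2c)
{
    uint32_t errors = i2c->ISR & I2C_ERROR_FLAGS;
    i2c->ICR = errors;  /* clear bits share positions with the flags */
    return errors;
}
```

Writing back only the flags that were actually observed avoids clearing an error that is raised between the read and the write.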

Occasionally, the assert that current transaction is non-NULL fails. I am pretty sure that the code is semantically correct and I have ruled out the usual suspects.

I have observed the following:

  • The behavior is completely deterministic: for a given binary of the firmware, either the assert will crash the system every single time in exactly the same way, or it will never crash the system.
  • Whether the crash happens or not seems to depend exclusively on instruction timing. I can "fix" the issue by adding at least 9 nop instructions to the end of i2c_handle_error.
  • The current version of the code "works", but will break again when I make seemingly unrelated changes in any code that touches I2C, and thus affects the timing of events.
  • I have a working and a broken version of the firmware; the difference is whether the call to i2c_handle_error is tail-call optimized by GCC or not. The difference between these two binaries is exactly three instructions. Otherwise everything is exactly the same, including all memory addresses (and thus alignment). The tail-call optimized version is the one which doesn't work.

I have read ARM AN321 and my current theory is that when I clear the pending interrupt at the peripheral (via I2C_ICR), it takes some time for this to propagate to the NVIC_ISPRx. When we leave the i2c_handle_event ISR too quickly (or perhaps just at an unfortunate moment), the interrupt is still pending, so the ISR is called again (for the same NACK occurrence) and finds the current transaction set to NULL.

Is this possible? The AN recommends (4.9 Disabling interrupts at peripherals) contacting the manufacturer directly for recommendations on how to make sure that the interrupt at the peripheral is cleared before leaving the ISR. So, here I am.

I have two questions:

  • Can the issue happen as described?
  • What is the correct sequence to make sure that prior to leaving the ISR, I2C3_EV is no longer pending?

Thanks!

1 REPLY
TDK
Guru

You should clear the flag as soon as possible. A few clock cycles is typically sufficient to avoid the interrupt being re-entered. You can also poll the NVIC pending interrupt bits, which may work.

if (I2C3->ISR & I2C_ISR_NACKF) {
    I2C3->ICR = I2C_ICR_NACKCF;  // clear NACKF here, first
    // other code
}
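
Building on the snippet above, one commonly used way to give the clear time to take effect is to read back a register of the same peripheral after writing I2C_ICR, then issue a data synchronization barrier before returning. The sketch below assumes this pattern; whether it is sufficient on the STM32WB55 is not confirmed in this thread. The register struct is a host-side stand-in for I2C3 from the CMSIS device header, and i2c_clear_nack_synced and SYNC_BARRIER are hypothetical names (on target, the CMSIS intrinsic __DSB() would take the place of the macro).

```c
#include <stdint.h>

/* Host-side stand-in (assumption): on target, use I2C3 from the CMSIS
 * device header instead of this struct. */
typedef struct {
    volatile uint32_t ISR;  /* interrupt and status register */
    volatile uint32_t ICR;  /* interrupt clear register, write 1 to clear */
} i2c_regs_t;

#define I2C_ISR_NACKF  (1u << 4)
#define I2C_ICR_NACKCF (1u << 4)

/* Hypothetical macro: a real DSB on 32-bit Arm targets, and only a
 * compiler barrier elsewhere so the sketch still builds on a host. */
#if defined(__arm__) && defined(__GNUC__)
#define SYNC_BARRIER() __asm__ volatile ("dsb 0xF" ::: "memory")
#else
#define SYNC_BARRIER() __asm__ volatile ("" ::: "memory")
#endif

/* Clear NACKF, then read the status register back. The read targets the
 * same peripheral, so it cannot complete before the earlier write has
 * reached it. Returns the status value seen after the clear. */
uint32_t i2c_clear_nack_synced(i2c_regs_t *i2c)
{
    i2c->ICR = I2C_ICR_NACKCF;  /* request the clear as early as possible */
    uint32_t isr = i2c->ISR;    /* read-back forces the write out to the peripheral */
    SYNC_BARRIER();             /* complete outstanding accesses before returning */
    return isr;
}
```

If the returned value still shows NACKF set, the flag has not been cleared yet (or a new NACK has arrived), and the caller can handle that before leaving the ISR.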

However, you should also be checking for flags within the interrupt before acting (i.e. before checking for a NULL transaction), which should keep this from being an issue. You will not find any flags set when the interrupt re-enters like this.
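
A minimal sketch of that ordering, under stated assumptions: the handler reads the flags first and returns early when none of the flags it services is set, so a late re-entry never reaches the transaction assert. Only NACKF is handled here for brevity. The register struct is a host-side stand-in for I2C3, and the transaction type, its failed field and the spurious_entries counter are hypothetical, since the original driver was not posted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-in (assumption): on target, use I2C3 from the CMSIS
 * device header instead of this struct. */
typedef struct {
    volatile uint32_t ISR;  /* interrupt and status register */
    volatile uint32_t ICR;  /* interrupt clear register, write 1 to clear */
} i2c_regs_t;

#define I2C_ISR_NACKF  (1u << 4)
#define I2C_ICR_NACKCF (1u << 4)

/* Hypothetical transaction type for illustration only. */
typedef struct {
    int failed;
} i2c_transaction_t;

static i2c_transaction_t *volatile current_transaction;
static unsigned spurious_entries;  /* hypothetical diagnostic counter */

void i2c_handle_event(i2c_regs_t *i2c)
{
    uint32_t isr = i2c->ISR;

    /* Check flags before anything else. A re-entry caused by a clear that
     * had not yet propagated finds nothing set and simply returns. */
    if ((isr & I2C_ISR_NACKF) == 0) {
        spurious_entries++;
        return;
    }

    i2c->ICR = I2C_ICR_NACKCF;  /* clear NACKF first */

    /* Only now is it an error for no transaction to be active. */
    assert(current_transaction != NULL);
    current_transaction->failed = 1;
    current_transaction = NULL;
}
```

Counting the early returns instead of silently ignoring them makes it possible to see how often the re-entry actually happens.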
