STM32L431 I2C master transmitter freezes if SCL is pulled down externally when it starts

waclawek.jan · 2023-12-07

When an I2C master transaction (either Rx or Tx) is started while SCL is pulled to 0 externally, the peripheral goes into the BUSY state (I2C_ISR=0x8001) and remains there indefinitely, without setting any interrupt-generating flag, even after the external SCL->0 is removed.

This is extremely simple to reproduce: just short SCL to ground, write any START-containing "command" to I2C_CR2, and then remove the short while observing I2C_ISR. It's enough to do it in the debugger with the processor stopped; the result is obvious enough.
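
For reference, the 0x8001 reading decodes as BUSY (bit 15) plus TXE (bit 0), with none of the flags that can raise an interrupt. A minimal host-compilable sketch of that decoding follows. The bit positions are taken from the I2C_ISR layout in the reference manual; the macro names and the helper `i2c_isr_is_silent_busy` are illustrative assumptions, not CMSIS or HAL names.

```c
#include <stdbool.h>
#include <stdint.h>

/* I2C_ISR bit positions per the STM32L4 reference manual. */
#define I2C_ISR_TXE_BIT   (1u << 0)   /* transmit data register empty */
#define I2C_ISR_BUSY_BIT  (1u << 15)  /* bus busy */
/* Bits 1..13 (TXIS, RXNE, ADDR, NACKF, STOPF, TC, TCR, BERR, ARLO, OVR,
 * PECERR, TIMEOUT, ALERT) are the flags that can generate an interrupt. */
#define I2C_ISR_IRQ_FLAGS 0x3FFEu

/* Hypothetical helper: true if ISR shows BUSY while no flag is pending
 * that an interrupt handler could ever react to. */
static bool i2c_isr_is_silent_busy(uint32_t isr)
{
    return (isr & I2C_ISR_BUSY_BIT) != 0u &&
           (isr & I2C_ISR_IRQ_FLAGS) == 0u;
}
```

With the value reported above, `i2c_isr_is_silent_busy(0x8001u)` is true: the bus is flagged busy and nothing will ever wake the interrupt-driven state machine.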

I expected that either

1. I2C throws a BERR (I would prefer this, but BERR's description says its scope is limited to a very particular bus error involving only STOP and START - and the BERR-related erratum dismisses its real-world usability entirely), or,

2. upon removal of SCL->0, I2C proceeds with the transaction as normal.

Documentation says, it's 2.:

This issue - which I consider to be a flaw - makes implementation of a sturdy I2C master based entirely on I2C interrupts impossible; an additional timeout must be implemented. Together with the TXIS-when-NACK issue, it makes proper I2C software significantly more complex than it would be if the I2C were implemented correctly according to the documentation.

JW

FBL · 2023-12-14

Hello @waclawek.jan

Thank you for your feedback.

If the SCL is tied to ground while the START is set in the I2C control register, the internal counter that normally brings SCL back high after start condition is not started, leaving the SCL low indefinitely.

The solution is not to set START when the bus is taken (SCL kept low). There are generally two ways to do that: on I2C interfaces with SMBus features, it’s possible to use TIMEOUT with the TIDLE bit set to probe the bus; on all products, it’s possible to simply look at the respective GPIO IDR bit to see if its value is 1.
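
A minimal sketch of the GPIO check, assuming the SCL pin state is readable through GPIOx_IDR while the pin is in alternate-function mode. The register structs are host-compilable stand-ins rather than the CMSIS definitions, and `i2c_start_if_scl_high` and its parameters are hypothetical names; START is bit 13 of I2C_CR2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-ins for the memory-mapped registers so the sketch also
 * compiles on a host; on target, use the CMSIS device header instead. */
typedef struct { volatile uint32_t IDR; } sketch_gpio_t;
typedef struct { volatile uint32_t CR2; } sketch_i2c_t;

#define SKETCH_I2C_CR2_START (1u << 13)

/* Hypothetical helper: writes the transfer command with START set only
 * if SCL currently reads high. cr2_cmd carries the rest of CR2 (slave
 * address, NBYTES, direction, ...). Returns false if SCL is held low. */
static bool i2c_start_if_scl_high(sketch_i2c_t *i2c,
                                  const sketch_gpio_t *scl_port,
                                  unsigned scl_pin,
                                  uint32_t cr2_cmd)
{
    if ((scl_port->IDR & (1u << scl_pin)) == 0u) {
        return false;               /* bus taken: do not set START */
    }
    /* SCL can still go low between the read above and this write. */
    i2c->CR2 = cr2_cmd | SKETCH_I2C_CR2_START;
    return true;
}
```

The caller has to decide what a `false` return means for its state machine (retry later, report an error, and so on); the check narrows the window in which a low SCL can coincide with START but does not close it.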

To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.

waclawek.jan · 2023-12-14

Hi @FBL ,

> If the SCL is tied to ground while the START is set in the I2C control register, the internal counter that normally brings SCL back high after start condition is not started, leaving the SCL low indefinitely.

Yes, that's the expected internal mechanism of this hardware bug; but knowing that won't help.

The problem is that SCL can be pulled down externally during START by any of the slaves, or due to noise.

In I2C mode, the SMBUS timeout is not available, so the user has to write code to generate timeouts in some other way, and then reset the I2C module to release SCL.

The second workaround you've mentioned does not work if the pulldown occurs between reading GPIOx_IDR and writing to the I2Cx_CR2 register.

Also, please note that the observed behaviour contradicts the documentation, as I've highlighted above.

Can you please address these concerns?

Thanks,

Jan Waclawek

FBL · 2023-12-15

Hello @waclawek.jan

I think I haven't fully understood the issue. Could you please correct me if I'm mistaken?

A slave has no business pulling SCL low unless it's clock stretching. If it's noise within the tolerance spec of I2C, it's taken care of by our noise filters.

SCL pulled down during START is a violation of the I2C specification. The timeout is an SMBus feature, but there's no problem using it in I2C mode. The dividing line between I2C and SMBus is not so strict; many customers use SMBus features to enhance I2C usability.
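
For anyone wanting to try the timeout route, here is a sketch of computing an I2C_TIMEOUTR value. It assumes the formulas from the reference manual: tTIMEOUT = (TIMEOUTA + 1) x 2048 x tI2CCLK with TIDLE=0 (SCL-low timeout) and tIDLE = (TIMEOUTA + 1) x 4 x tI2CCLK with TIDLE=1 (bus-idle detection). The function name and its parameters are made up for illustration, and whether the TIMEOUT flag actually fires in the stuck state discussed in this thread has not been verified here.

```c
#include <stdbool.h>
#include <stdint.h>

/* I2C_TIMEOUTR fields per the reference manual: TIMEOUTA[11:0],
 * TIDLE (bit 12), TIMOUTEN (bit 15). */
#define SKETCH_TIMEOUTR_TIDLE    (1u << 12)
#define SKETCH_TIMEOUTR_TIMOUTEN (1u << 15)

/* Hypothetical helper: computes a TIMEOUTR value whose timeout is at
 * least timeout_us, for an I2C kernel clock of i2cclk_hz.
 * Returns false if the result does not fit the 12-bit TIMEOUTA field. */
static bool i2c_timeoutr_value(uint32_t i2cclk_hz, uint32_t timeout_us,
                               bool tidle, uint32_t *out)
{
    const uint64_t cycles_per_step = tidle ? 4u : 2048u;
    /* Kernel clock cycles covering the timeout, rounded up. */
    uint64_t cycles = ((uint64_t)i2cclk_hz * timeout_us + 999999u) / 1000000u;
    /* Number of steps, rounded up; this equals TIMEOUTA + 1. */
    uint64_t steps = (cycles + cycles_per_step - 1u) / cycles_per_step;

    if (steps == 0u || steps > 4096u) {
        return false;
    }
    *out = (uint32_t)(steps - 1u)
         | (tidle ? SKETCH_TIMEOUTR_TIDLE : 0u)
         | SKETCH_TIMEOUTR_TIMOUTEN;
    return true;
}
```

For a 48 MHz kernel clock and a 25 ms SCL-low timeout this gives TIMEOUTA = 0x249, i.e. a register value of 0x8249. Per the reference manual, TIMEOUTA and TIDLE should be written while TIMOUTEN is 0, and the resulting TIMEOUT flag is reported through the error interrupt (ERRIE).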

Regarding the GPIO check, if implemented well, the time window for the SCL to be pulled in between is just few system clock cycles. It lowers the probability of failure by a lot.

About your last point, the reference manuals will be updated.

waclawek.jan · 2023-12-16

Hi @FBL ,

> SCL pulled down during START is a violation of the I2C specification.

Yes, it is. However, the real world is not black and white.

I2C is "special" in that it is multidrop, open-collector, and its impedance depends on its state. Also, there are many deviant devices out there. So, to the question, "from a hardware standpoint, is what happens on an I2C bus as reliable as, say, SPI or UART (when used with the normal push-pull outputs)?", the answer is "no, far from it".

And, if asked, "okay then, things may happen on I2C, so what happens if there is a transient beyond the spec, will we fail gracefully?", the answer "oh, we just hang indefinitely" is not exactly the expected answer.

That out-of-spec things may happen on I2C is even acknowledged directly by those specs: by the requirement for filtering (which is vague enough to leave leeway, but for a reason, outlined above), and also by the suggestion of the 9-SCL-pulse recovery, which indicates that an out-of-sync master and slave are to be expected. One scenario for SCL "hanging" low temporarily is, as you've written yourself, slave clock-stretching, if it coincides with some scenario where master and slave get out of sync (e.g. by resetting one of them). (Another scenario may involve multiple masters starting almost simultaneously and the minute details of how bus-occupancy detection is implemented; proper analysis of that would require access to the exact internal design of both masters involved.)

So, there's little point in arguing that "this may not happen as that's a specs violation" here. Specs get violated; accept it, accommodate it.

-------- start of GPIO check rant

> Regarding the GPIO check, if implemented well, the time window for the SCL to be pulled in between is just few system clock cycles. It lowers the probability of failure by a lot.

You can't base a design on "lowering of probability" like that. For example, what if the source of the problematic SCL interference is circuitry which is related to the functionality of the slave controlled by SCL, and which gets activated in sync with the START of communication to that slave, so that the SCL interference is not stochastic but synchronized? Then your "lowers the probability" goes out of the window on principle.

Not to mention "implemented well", Cube/HAL, cough, cough :)

No, if you want a really sturdy I2C implementation given this hardware bug, there's no way you can avoid designing the timeout, including all the ramifications and possible ripple effects of that (so, what exactly do we do after a timeout, how does that impact our software I2C state machine, what effect does it have at a higher level of design, etc.). Btw. the SCL-GPIO-detect has similar system-wide ramifications (except the test itself is often simpler to do properly, not having to rely on some source of timing). And, at the end of the day, a timeout may be needed anyway (foreshadowing the third installment of this saga ;-) )

---- end of GPIO check rant

> About your last point, the reference manuals will be updated.

Thanks. I am looking forward to that.

Thanks for your support.

Jan

FBL · 2023-12-18

Thank you, Jan, for your feedback,

A question about the GPIO check implementation: indeed, the pulldown could occur between reading GPIOx_IDR and writing to the I2Cx_CR2 register. Could you check using the timeout feature with TIDLE=0, or implement a timeout using, for example, SysTick? If the I2C is detected to be stuck, toggling PE will release it.
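
A rough sketch of the software-timeout variant, assuming a free-running tick counter (e.g. incremented in the SysTick handler) and the PE reset sequence from the reference manual: clear PE, read it back until it reads 0, set it again. The register struct is a host-compilable stand-in, and all names (`sketch_i2c_wd_t`, `i2c_watchdog_arm`, `i2c_watchdog_poll`, `i2c_reset_via_pe`) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the two registers used here; on target, use CMSIS. */
typedef struct {
    volatile uint32_t CR1;
    volatile uint32_t ISR;
} sketch_i2c_regs_t;

#define SKETCH_I2C_CR1_PE   (1u << 0)
#define SKETCH_I2C_ISR_BUSY (1u << 15)

typedef struct {
    uint32_t start_tick;     /* tick count when the transfer was started */
    uint32_t timeout_ticks;  /* allowed duration in ticks */
} sketch_i2c_wd_t;

/* Call right before writing START to I2C_CR2. */
static void i2c_watchdog_arm(sketch_i2c_wd_t *wd, uint32_t now,
                             uint32_t timeout_ticks)
{
    wd->start_tick = now;
    wd->timeout_ticks = timeout_ticks;
}

/* Software reset of the peripheral by toggling PE. */
static void i2c_reset_via_pe(sketch_i2c_regs_t *i2c)
{
    i2c->CR1 &= ~SKETCH_I2C_CR1_PE;
    while ((i2c->CR1 & SKETCH_I2C_CR1_PE) != 0u) {
        /* read back until PE reads 0 */
    }
    i2c->CR1 |= SKETCH_I2C_CR1_PE;
}

/* Call periodically. Returns true if the timeout expired with BUSY still
 * set and the peripheral was reset; the caller must then restart or fail
 * the transfer in its own state machine. */
static bool i2c_watchdog_poll(sketch_i2c_regs_t *i2c,
                              const sketch_i2c_wd_t *wd, uint32_t now)
{
    /* Unsigned subtraction stays correct across tick counter wrap-around. */
    bool expired = (uint32_t)(now - wd->start_tick) >= wd->timeout_ticks;
    bool busy = (i2c->ISR & SKETCH_I2C_ISR_BUSY) != 0u;

    if (expired && busy) {
        i2c_reset_via_pe(i2c);
        return true;
    }
    return false;
}
```

Note that BUSY alone does not distinguish the stuck state from a bus legitimately occupied by another master, so the timeout has to be chosen longer than any transfer the system considers valid.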

One more thing about noise: have you considered using the digital filter to increase the filtering capability?

waclawek.jan · 2023-12-18

Hi @FBL ,

You can filter only so much noise/interference; it's a matter of tradeoff.

I of course know how to implement timeouts in a system. I may explore SMBUS's timeout, but honestly at this point I'm hesitant to open another can.

The point is that this bug prevents a robust I2C implementation based purely on I2C interrupts, which was obviously the intention with this module. Now, bugs happen, that's normal; but IMO it's important to propagate this information throughout the documentation so that others are aware of it and can deal with it in their software.

Thanks again for your continued support.

Jan

FBL · 2023-12-19

Thank you @waclawek.jan

I have shared your feedback with the dedicated team, and I am waiting for the documentation expert to select the best way to provide the information (in the RM and/or ES).
