I2C error handling

ECost · ‎2020-01-15

I am developing with an STM32F030Cx, using CubeMX for the configuration. The application is basically a simple, fully cooperative scheduler (no RTOS, no context switching) that calls several "tasks" in sequence. I am using the blocking functions of the HAL I2C driver.

STM32 I2C will work in master mode only.

I would like some directions on how to handle communication errors (BERR, ARLO, timeout, etc.), or pointers to reference material such as a flow chart or the source code of a library or RTOS.

Maybe handling these errors is simple and only a matter of following the reference manual, but while searching for questions and answers on the subject, I saw some posts about the I2C channel hanging when a BERR occurred, where the fix required workarounds. So I would appreciate some advice on how to guarantee that my application can recover by itself when any of these errors occurs.

Thanks in advance.

S.Ma · ‎2020-01-15

There are a few options. Usually, when running an I2C master in blocking mode, software (bit-banged) I2C on GPIOs does the job well enough.

Here is what I do; even when the I2C bus goes through connectors that get plugged and unplugged, it still works reliably:

Just before wanting to generate the first start bit on the bus, check if the bus is idle.

The bus is idle if both SDA and SCL are at high level (the master pins are not forcing a low level at this time).

If this is not the case, use SDA and SCL as GPIOs to forcibly generate 9 STOP conditions (no clock stretching), then retry the transaction.

This will make your I2C an order of magnitude more rugged than without, and it won't hold back your software.

// This function can be called upon any I2C bus error condition.
static int32_t GenerateStop (I2C_MasterIO_t* pM); // defined below

static int32_t ErrorRecovery (I2C_MasterIO_t* pM) 
{
  // Blindly generate 9 STOP conditions to flush any stuck situation.
  // Non-invasive: a STOP on an idle bus has no side effects.
  for (int i = 0; i < 9; i++) {
    GenerateStop(pM);
  }
  return 0;
}
 
static int32_t GenerateStart (I2C_MasterIO_t* pM, uint8_t SlaveAdr)
{
  IO_PinSetHigh(pM->SDA);//dir_I2C_SDA_IN;	// to check if I2C is idle... or stuck
  WaitHere(pM,1);
  if(IO_PinGet(pM->SDA)==0) {
    ErrorRecovery(pM);
    if(IO_PinGet(pM->SDA)==0) { // to debug with hot plug if glitch could code (try twice?)
      HAL_Delay(10);
      ErrorRecovery(pM); // can't recover (or try again with delay?)
    }
  }
 
  if((SlaveAdr & 0x01) == 0) // if it is a write address, we start a transaction, hence we clear ackfail.
    pM->AckFail = 0;
  
  IO_PinSetHigh(pM->SCL);//bit_I2C_SCL_HIGH;
  WaitHere(pM,1);					
 
  // Fixed violation on Start hold time
  IO_PinSetLow(pM->SDA);//bit_I2C_SDA_LOW;
  WaitHere(pM,1);
 
  IO_PinSetLow(pM->SCL);//bit_I2C_SCL_LOW;
  WaitHere(pM,1);
 
  return TransmitByte (pM,SlaveAdr);				// Send the slave address
}
 
 
static int32_t GenerateStop (I2C_MasterIO_t* pM) {
  
  IO_PinSetLow(pM->SCL);//bit_I2C_SCL_LOW;
  WaitHere(pM,1);
  IO_PinSetLow(pM->SDA);//bit_I2C_SDA_LOW;
  WaitHere(pM,1);							// Extra to make sure delay is ok
  
  IO_PinSetHigh(pM->SCL);//bit_I2C_SCL_HIGH;
  WaitHere(pM,1);
 
  IO_PinSetHigh(pM->SDA);//bit_I2C_SDA_HIGH;
  WaitHere(pM,1);  
  return 0;
}

KnarfB · ‎2020-01-15

Take a look at the NXP UM10204 "I2C-bus specification and user manual" Rev. 6 — 4 April 2014. It describes among others a "bus clear" procedure which is similar, but not identical to the above answer.
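For comparison, the bus clear procedure in UM10204 (section 3.1.16) has the master send up to nine clock pulses on SCL when SDA is stuck low, and expects the device holding SDA to release it within those nine clocks. The following is a minimal host-side sketch of that idea, not a definitive implementation. The `bus_pins_t` interface, the `sim_*` simulated slave, and `sim_run` are hypothetical names introduced only for illustration. Stopping early once SDA is released and finishing with a STOP are common choices assumed here, not requirements taken from the specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pin-access interface. On a real target these callbacks
 * would wrap open-drain GPIO writes/reads and a half-bit delay. */
typedef struct {
    void (*scl_write)(void *ctx, bool level);
    void (*sda_write)(void *ctx, bool level);
    bool (*sda_read)(void *ctx);
    void (*delay_half_bit)(void *ctx);
    void *ctx;
} bus_pins_t;

/* Clock SCL up to nine times while the master leaves SDA released.
 * Returns the number of pulses sent, or -1 if SDA is still low. */
static int i2c_bus_clear(const bus_pins_t *p)
{
    assert(p != NULL);
    int pulses = 0;

    p->sda_write(p->ctx, true);              /* master releases SDA */
    while (!p->sda_read(p->ctx)) {           /* something holds SDA low */
        if (pulses == 9) {
            return -1;                       /* needs reset or power cycle */
        }
        p->scl_write(p->ctx, false);
        p->delay_half_bit(p->ctx);
        p->scl_write(p->ctx, true);
        p->delay_half_bit(p->ctx);
        pulses++;
    }

    /* SDA is free: finish with a STOP (SDA rises while SCL is high). */
    p->scl_write(p->ctx, false);
    p->delay_half_bit(p->ctx);
    p->sda_write(p->ctx, false);
    p->delay_half_bit(p->ctx);
    p->scl_write(p->ctx, true);
    p->delay_half_bit(p->ctx);
    p->sda_write(p->ctx, true);
    p->delay_half_bit(p->ctx);
    return pulses;
}

/* Simulated bus: a slave holds SDA low until it has seen a given
 * number of rising SCL edges (assumption made for this demo only). */
typedef struct {
    bool scl;
    bool sda_master;
    int  clocks_until_release;
} sim_bus_t;

static void sim_scl_write(void *c, bool level)
{
    sim_bus_t *b = c;
    if (level && !b->scl && b->clocks_until_release > 0) {
        b->clocks_until_release--;           /* rising edge seen by slave */
    }
    b->scl = level;
}

static void sim_sda_write(void *c, bool level)
{
    sim_bus_t *b = c;
    b->sda_master = level;
}

static bool sim_sda_read(void *c)
{
    sim_bus_t *b = c;
    /* Wired-AND: SDA is high only if nobody pulls it low. */
    return b->sda_master && b->clocks_until_release == 0;
}

static void sim_delay(void *c) { (void)c; }

/* Run the bus clear against a slave stuck for `stuck_clocks` clocks. */
int sim_run(int stuck_clocks)
{
    sim_bus_t bus = { true, true, stuck_clocks };
    bus_pins_t pins = { sim_scl_write, sim_sda_write, sim_sda_read,
                        sim_delay, &bus };
    return i2c_bus_clear(&pins);
}
```

Calling `sim_run(3)` models a slave that releases SDA after three clocks. A return value of -1 corresponds to the case where the specification falls back to a hardware reset or power cycle of the bus devices.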

hth

KnarfB

ECost · ‎2020-01-17

Thank you both for your answers. I will implement it. Any suggestions on how I can force a BERR in a controlled, repeatable way so I can automate this test?

KnarfB · ‎2020-01-17

Not really. In a testbed, you could implement an ill-behaving bit-banging slave. IMHO, if the bus is electrically sound, one risk is that low-end slaves don't have their own clock. So, when the master does not complete a transaction (watchdog reset, brown-out reset, ...), the slave is not reset and its state machine might not be in the idle state. Then the first master transaction after reset may be misinterpreted by the slave.

ECost · ‎2020-01-17

Hello, KnarfB,

I was thinking more on the line of self generating the errors, i.e., with the microcontroller itself.

Right now I am thinking of short-circuiting a GPIO to force SDA and/or SCL to low level to simulate things that can go wrong. I am using the two I2C channels in my design so I can connect them for testing purposes.

In my case, the PB6 and PB7 pins may be configured as either I2C or outputs of TIM16/17, so I could connect them to the pins of the other I2C channel to inject pulses periodically and see whether the channel can recover by itself in the long term. This is for unit tests, of course. Does that make sense? Is it worth pursuing this path?

Of course, I may simply inject the error manually, but automating it may allow me to leave the test setup running for a long time, collecting statistics on the errors and hopefully detecting some failure that only appears in a stress test. I am also considering using another board (e.g. a Discovery) to do the same error injection on the I2C channel, but the one-board approach would still be preferable to simplify the test setup and make it easily reproducible in the future.

Maybe I am overcomplicating things unnecessarily?

Suggestions? Ideas?

KnarfB · ‎2020-01-17

Yes, you can use two GPIOs on the MCU in open-drain output mode (!) and inject some errors on the I2C bus. I wouldn't use a second I2C for this, but you can use pins which can be programmed as either GPIO or I2C2, and then implement a well-behaving slave using I2C2 and an ill-behaving slave using the GPIOs.
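As a rough illustration of why open-drain mode matters for injection: on an open-drain bus the line level is the wired-AND of all drivers, so an injector that only ever pulls low or releases can disturb the master without fighting it electrically. The host-side sketch below models a periodic injector that pulls the line low for the first few ticks of each period. All names (`fault_injector_t`, `fault_injector_step`, `wired_and`, `count_low_ticks`) are hypothetical, and the tick source (for example a timer interrupt) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical periodic fault injector driven by a fixed-rate tick. */
typedef struct {
    uint32_t period_ticks;  /* length of one injection cycle */
    uint32_t low_ticks;     /* ticks per cycle the line is pulled low */
    uint32_t tick;          /* position inside the current cycle */
} fault_injector_t;

/* Advance one tick. Returns true if the injector pin is released
 * (high impedance), false if it is pulling the line low. */
static bool fault_injector_step(fault_injector_t *f)
{
    assert(f != NULL);
    assert(f->period_ticks > 0 && f->low_ticks <= f->period_ticks);

    bool released = f->tick >= f->low_ticks;
    f->tick = (f->tick + 1u) % f->period_ticks;
    return released;
}

/* Open-drain bus: the line is high only if every driver releases it. */
static bool wired_and(bool master_released, bool injector_released)
{
    return master_released && injector_released;
}

/* Count how many of `total` ticks the line is low while the master
 * itself is idle (released). Useful to sanity-check a duty cycle. */
unsigned count_low_ticks(uint32_t period, uint32_t low, unsigned total)
{
    fault_injector_t f = { period, low, 0 };
    unsigned n = 0;

    for (unsigned i = 0; i < total; i++) {
        if (!wired_and(true, fault_injector_step(&f))) {
            n++;
        }
    }
    return n;
}
```

On a target, the boolean returned by `fault_injector_step` would be written to the open-drain GPIO from the tick handler. With a push-pull output, the "released" state would instead drive the line high and could short against a device pulling it low.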

Usually, when hardening a system, one does a risk analysis. Besides the I2C error flags, there may be other risks in the electrical/mechanical layers (voltage levels, parasitic bus capacitance, environmental conditions, ...) depending on your system.

S.Ma · ‎2020-01-17

The old-fashioned way was to run the I2C bus on 3 wires wrapped around a drill generating nasty noise.

Another way is to have the I2C bus on a connector and hot-plug and unplug it. If your slave is an EEPROM, fill it with zeros; this helps to get into a stuck situation while constantly reading data from it.

ECost · ‎2020-01-17

KnarfB, thank you for your comments. So I am not (that) crazy. 🙂

The I2C is onboard only; only the UART goes outside, and even that is not really exposed to the environment. Still, I want to make the firmware as robust as possible for cases that we do not anticipate and that do not occur even during verification and certification tests, while of course trying to avoid unnecessary complexity.

I did some tests right before leaving the office, just shorting SDA or SCL to GND with tweezers, and I learned that the I2C enters a failure state, returning TIMEOUT every time. So, for starters, the simple tweezers "method" is good enough for me. On Monday I will implement the scheme you and "." suggested to try to recover from this failure and progress from there. I think I will handle some of the other errors returned by the I2C the same way, for simplicity.
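One way to treat all driver errors the same way is a small retry wrapper around the blocking transfer. This is a sketch under stated assumptions, not a definitive implementation: the callback types, `i2c_stats_t`, `i2c_xfer_with_recovery`, and the `fake_*` test doubles are hypothetical. On the target, the transfer callback would presumably wrap a blocking call such as `HAL_I2C_Master_Transmit`, and the recovery callback would de-initialize the peripheral, run the GPIO recovery, and re-initialize it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks: xfer returns true on success; recover
 * attempts to bring the bus back to idle after any failure. */
typedef bool (*i2c_xfer_fn)(void *ctx);
typedef void (*i2c_recover_fn)(void *ctx);

/* Counters for long-running stress tests. */
typedef struct {
    uint32_t attempts;
    uint32_t failures;
    uint32_t recoveries;
} i2c_stats_t;

/* Try the transfer up to max_attempts times, running the recovery
 * after every failed attempt (including the last one, so the bus is
 * left as clean as possible for the next task). */
static bool i2c_xfer_with_recovery(i2c_xfer_fn xfer, i2c_recover_fn recover,
                                   void *ctx, unsigned max_attempts,
                                   i2c_stats_t *stats)
{
    assert(xfer != NULL && recover != NULL && stats != NULL);

    for (unsigned i = 0; i < max_attempts; i++) {
        stats->attempts++;
        if (xfer(ctx)) {
            return true;
        }
        stats->failures++;
        recover(ctx);
        stats->recoveries++;
    }
    return false;
}

/* Test doubles: a bus that fails until it has been recovered once,
 * or that never recovers when `recoverable` is false. */
typedef struct {
    bool bus_stuck;
    bool recoverable;
} fake_bus_t;

static bool fake_xfer(void *c)
{
    fake_bus_t *b = c;
    return !b->bus_stuck;
}

static void fake_recover(void *c)
{
    fake_bus_t *b = c;
    if (b->recoverable) {
        b->bus_stuck = false;
    }
}

/* Demo: run one transfer on a stuck bus and return the statistics. */
i2c_stats_t demo_stuck_bus(bool recoverable, unsigned max_attempts)
{
    fake_bus_t bus = { true, recoverable };
    i2c_stats_t stats = { 0, 0, 0 };

    (void)i2c_xfer_with_recovery(fake_xfer, fake_recover, &bus,
                                 max_attempts, &stats);
    return stats;
}
```

Keeping the counters in one structure makes it easy to log them periodically during a long stress run and compare failures against successful recoveries.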

ECost · ‎2020-01-17

I am working with a Discovery board connected to an EVM of one of the components I am going to use. So, no EEPROM for now, but I can write to a pair of registers and read them back. The idea of using the timers connected to the output pins, configured as open drain, is to simulate the drill in a more repeatable way. 🙂

Also see my answer above; I will adapt the code you provided to my application and proceed from there.