I2C execution speed, with vs without DMA using HAL driver

primozic
Associate II
Posted on December 01, 2014 at 12:56

Hello, I am interfacing with an SC620 LED driver over I2C.

Here is the function that writes to the LED driver.

/**
 * Write data to an SC620 LED driver register.
 *
 * @param[in] reg    Register to write.
 * @param[in] value  Data to write to that register.
 * @return           HAL status (ok, error, busy or timeout).
 */
static HAL_StatusTypeDef hmi_setRegister(SC620_Register reg, uint8_t value)
{
  uint8_t ledBuf[2];

  ledBuf[0] = reg;    // register address
  ledBuf[1] = value;  // data

  // Note: for the DMA variant, ledBuf would have to outlive this call (e.g. be static).
//  return HAL_I2C_Master_Transmit_DMA(&I2cHandle, SC620_ADDRESS, ledBuf, sizeof(ledBuf));  // DMA transfer
  return HAL_I2C_Master_Transmit(&I2cHandle, SC620_ADDRESS, ledBuf, sizeof(ledBuf), I2C_TIMEOUT);  // polling
}

There are two ways to do the write: the first uses DMA, and the second uses polling. Both functions are taken unmodified from the HAL peripheral driver.

The strange thing is that the execution time with the DMA transfer is about 112 microseconds, while with polling it is 106 microseconds.

The execution time is measured by toggling a pin inside the ISR.

void TIM9_IRQHandler(void)
{
  if(TIM9->SR & TIM_SR_UIF) // if UIF flag is set
  {
    TIM9->SR &= ~TIM_SR_UIF; // clear UIF flag
    gpio_setPin(ISR_DEBUG,1); // heartbeat
    hmi_evaluateHMI();
    gpio_setPin(ISR_DEBUG,0);
  }
}

Why does DMA take so long to execute? I would expect to see a drastic decrease in execution time with DMA, i.e. when data is ready to be transferred via I2C, the CPU would enable the DMA transfer and return to do other things, and let DMA do the actual transfer. However, it turns out that DMA is actually slower than polling.

Can the size of the data I am trying to send be the cause? Would DMA be better only if I had to transfer, say, 30 bytes instead of the 2 I am transferring right now?

I have checked what uses the most time inside the HAL_I2C_Master_Transmit_DMA function, and it turns out that out of 112 microseconds, __HAL_I2C_CLEAR_ADDRFLAG(hi2c); takes 58 microseconds, and in fact it is the read of SR2 that takes all this time. Why does reading a register take so much time?

Thanks in advance!

8 REPLIES

stm322399
Senior
Posted on December 01, 2014 at 14:17

I don't know the I2C HAL in detail, but I am pretty sure that DMA does not make an I2C transfer any faster, nor an SPI or UART one. DMA only frees the CPU for something else during the transfer, because it reduces the number of events to serve.

By the way, if the software just spins in a while loop waiting for the end of a DMA transfer, the transfer itself is not faster.

So why did you experience a slower transfer using DMA? I assume there is a lot of code involved after a DMA transfer has terminated: clearing registers, calling a callback, etc. A 6 µs overhead can easily be explained by running non-optimized software.

stm322399
Senior
Posted on December 01, 2014 at 14:30

Well, I fired off my first answer a bit too fast.

It is true that using DMA for two bytes can be counter-productive, because setting up a DMA transfer requires many more register writes than the polling method, which delays the transfer.

Regarding the time to read SR2, I am surprised that it takes so long; I hope there is another explanation. For example, in 'multi-purpose' libraries like the HAL, a function that checks a flag might do several pre-checks (is the flag in the supported range?) and sometimes multiplexes flags from several registers (the function has a lot of if-then-else statements to select the target flag). Running all of those instructions takes time ... a lot of time.

By the way, you assumed that the HAL fires off a DMA transfer and lets the CPU return to other activities. That might not be how it works; make sure that nothing spins in a while loop waiting for the end of the transfer.

primozic
Associate II
Posted on December 01, 2014 at 14:48

Sorry if I expressed myself badly before. DMA does not make execution faster (after all, the I2C bus runs at 400 kHz, and the transfer cannot be faster than that), but as you said, it frees the CPU to do other things. That is what I was hoping to see on the scope: HAL_I2C_Master_Transmit_DMA would initiate the transfer, and once the function returns, DMA does the transfer and the CPU moves on. There is no spinning in a while loop in my case; the above function takes 112 microseconds to execute. Reading I2C status register 2 takes 58 µs and sending the slave address about 35 µs, which adds up to 93 µs; the rest (112 - 93 = 19 µs) is just other code inside the function. So, no spinning, no waiting in a while loop. The biggest mystery is why reading the status register takes so long.

The overhead after the DMA transfer has terminated might be there, but it does not affect me (nor did I measure it; I only measured the time it takes to initiate the DMA transfer and return from that function).

Also, I tried replacing the ST code that clears the ADDR flag with my own code (simply reading SR1 and SR2, as the data sheet describes), but there was no improvement in speed.

I will investigate further and come back if I make any new discoveries.
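
The 400 kHz limit mentioned above can be put into rough numbers. The following is a minimal sketch, not taken from the thread: the helper name `i2c_write_wire_time_us` is hypothetical, it assumes 7-bit addressing, counts START and STOP as one SCL period each, and ignores clock stretching and any software gaps between bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: lower bound on the bus time of one I2C master write,
 * in microseconds (rounded up).
 * Clocks = START + (address byte + payload bytes) * 9 + STOP, where each
 * byte costs 8 data bits plus 1 ACK bit, and START/STOP are approximated
 * as one SCL period each. */
static uint32_t i2c_write_wire_time_us(uint32_t scl_hz, uint32_t payload_bytes)
{
    assert(scl_hz > 0u);
    uint32_t clocks = 1u + (1u + payload_bytes) * 9u + 1u;
    /* 64-bit intermediate avoids overflow; add (scl_hz - 1) to round up */
    return (uint32_t)(((uint64_t)clocks * 1000000u + scl_hz - 1u) / scl_hz);
}
```

Under these assumptions, a 2-byte payload (register address plus value) at 400 kHz needs at least 73 µs on the wire, and a 30-byte payload about 703 µs. Whichever way the bytes reach the data register, the bus itself sets this floor; DMA can only change how much of that time the CPU spends waiting.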

stm322399
Senior
Posted on December 01, 2014 at 14:56

Reading a register should not take that long (unless you have really underclocked the bus or the device itself, which is likely not the case).

Can you tell me more about how you measured the time taken to read the register? Please give me the details so that I can check whether the measurement is biased. Generally I would bet that the read itself is pretty fast, and the code around it eats up the CPU time.

primozic
Associate II
Posted on December 01, 2014 at 15:10

Inside the HAL_I2C_Master_Transmit_DMA function, there is a line that clears the ADDR flag. I set the pin before that line and clear it right after. The signal is connected to the scope.

    gpio_setPin(GPIOA,12,1);
    __HAL_I2C_CLEAR_ADDRFLAG(hi2c);
    gpio_setPin(GPIOA,12,0);

I never changed the bus speed. Also, reading SR1 takes virtually no time compared to SR2. So, of two very similar registers, reading one causes no problem, but reading the other takes far too long.

stm322399
Senior
Posted on December 01, 2014 at 15:38

Very interesting ...

Assuming you are working on an F4, I had a look at the HAL. Surprise (well, not really): HAL_I2C_Master_Transmit_DMA calls I2C_MasterRequestWrite, and the latter has built-in while loops that wait for the I2C state machine to step forward, so forget about 'fire-and-go-do-something-else'. This design will only bring you a benefit for large transfers.

Of course, this does not explain why reading from SR2 takes so long. The situation deserves more investigation; I can only guess what is happening. Reading a register is fast, but peripheral registers are not memory: it is likely that side effects associated with a register read can delay the completion of the read access until some internal state is reached. Even so, 58 µs is very long, and I have no explanation. Could it be that clearing the ADDR flag fires an interrupt, adding extra processing time between your GPIO toggles?

primozic
Associate II
Posted on December 01, 2014 at 16:47

It is an L151, but there should not be much difference. Correct, it calls I2C_MasterRequestWrite, but only to see if the device is there (the 35 microseconds I mentioned before). The DMA request is enabled directly below that.

The idea that clearing ADDR fires an interrupt is a very interesting one. I have now explicitly disabled all I2C interrupts (event, error and buffer), and unfortunately that did not solve the problem. But now that you mention it, I do see flickering on the scope, which tells me something is getting in the way, e.g. an interrupt. I will do more testing tomorrow and get back to you. Thanks for all your help!

primozic
Associate II
Posted on December 02, 2014 at 10:06

Update: the flickering seen on the scope when the pin was toggled around clearing the ADDR flag (and the reason it was taking 58 µs) was caused by the half-transfer interrupt being enabled. Disabling it makes the flickering go away, and reading status registers 1 and 2 takes only a couple of microseconds, as it should.

However, because only 2 bytes are transferred, the DMA transfer is no faster than polling, because of all the overhead of executing the DMA interrupt.
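
For readers wondering what disabling the half-transfer interrupt amounts to at register level, here is a minimal host-side sketch. It is an illustration, not the poster's code: the `MOCK_` bit names and the helper `dma_disable_half_transfer_irq` are hypothetical, and the bit positions assume the DMA channel configuration register layout used on STM32L1-style DMA controllers (EN, TCIE, HTIE, TEIE in bits 0 to 3). In a real project the register would be the DMA channel's CCR, and the HAL offers the `__HAL_DMA_DISABLE_IT` macro with `DMA_IT_HT` for the same purpose; the thread does not say which route was taken.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout of a DMA channel configuration register (DMA_CCRx).
 * Names carry a MOCK_ prefix to avoid clashing with vendor headers. */
#define MOCK_DMA_CCR_EN    (1u << 0)  /* channel enable */
#define MOCK_DMA_CCR_TCIE  (1u << 1)  /* transfer-complete interrupt enable */
#define MOCK_DMA_CCR_HTIE  (1u << 2)  /* half-transfer interrupt enable */
#define MOCK_DMA_CCR_TEIE  (1u << 3)  /* transfer-error interrupt enable */

/* Hypothetical helper: clear only the half-transfer interrupt enable bit,
 * leaving the channel enable and the other interrupt enables untouched. */
static void dma_disable_half_transfer_irq(volatile uint32_t *ccr)
{
    assert(ccr != 0);
    *ccr &= ~MOCK_DMA_CCR_HTIE;
}
```

With a 2-byte transfer the half-transfer event fires after the very first byte, so leaving it enabled roughly doubles the number of DMA interrupts for no benefit, which fits the flickering seen on the scope.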