Checking for finished SPI / DMA transfer - missing data?

Miro Nohaj · ‎2014-02-18

Posted on February 18, 2014 at 09:01

Hello all,

I've gone through a couple of SPI DMA threads here, but I didn't find what I was looking for... On my STM32F103 I'm sending data to a host over SPI:

- my STM32 acts as the SPI slave
- the SPI transfer is done using DMA
- data is transferred in blocks of ~520 bytes (260 words, word = 16 bits), each block being a separate DMA transfer

Everything mostly works, but from time to time I lose the first word of a block, so I guess I'm not waiting correctly for the SPI DMA transfer to finish before starting a new one. I set a flag in the interrupt at the end of each DMA transfer, so I have DMA1_Channel2_IRQHandler and DMA1_Channel3_IRQHandler. When both transfer-complete interrupts have arrived, I assume that the SPI DMA has finished and I start a new SPI DMA transfer. I guess that in the problematic case the DMA has finished but the SPI is still shifting out the last WORD, so I added a check / loop like this after the interrupts are received:

void waitForSPIidle(void)
{
    while((SPI1->SR & 2) == 0);        // wait while TXE flag is 0 (TX is not empty)
    while((SPI1->SR & (1 << 7)) != 0); // wait while BSY flag is 1 (SPI is busy)
}

The lost WORD is missing on the host side (or maybe it arrived in place of the last word of the previous block), so checking the TXE and then the BSY flag should be enough to confirm that nothing more is being transmitted. The thing is that this doesn't help. What am I missing? I don't want to add a mostly unnecessary delay before starting each DMA SPI transfer; I would rather check some flags...

Jookie

#spi-dma
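
A bounded variant of the wait loop above would at least make a stuck flag visible instead of hanging forever. This is only a sketch: `wait_for_spi_idle`, `max_polls` and the macro names are made up for illustration, and the status register is passed in as a pointer (on the target, the address of SPI1->SR, with the pointer type matching the device header) so that the function can also be exercised off-target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of TXE and BSY in SPI_SR, same values as in the loop above. */
#define SR_TXE (1u << 1)
#define SR_BSY (1u << 7)

/* Hypothetical helper: polls the status register at most max_polls times.
 * Returns true as soon as TXE is set and BSY is clear in the same read,
 * false if that never happens within max_polls reads. */
static bool wait_for_spi_idle(volatile const uint32_t *sr, uint32_t max_polls)
{
    while (max_polls-- > 0u) {
        uint32_t status = *sr;  /* one register read per poll */
        if ((status & SR_TXE) != 0u && (status & SR_BSY) == 0u) {
            return true;
        }
    }
    return false;  /* the caller decides how to recover */
}
```

A false return is a hint that the SPI never went idle within the polling budget, which is a different failure from a word being lost while the flags look fine.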

Miro Nohaj · ‎2014-02-19

Posted on February 20, 2014 at 04:56

Hello all,

I just want to let you know that I've sort of found a workaround / solution for the problem: an extra IF between stopping the DMA and enabling it again. It checks whether a WORD is stuck in the SPI receive data register and, if so, reads it:

if((SPI1->SR & SPI_SR_RXNE) != 0) {
    WORD dummy = SPI1->DR;
}

I'm not sure whether emptying that register is what helps, or whether it's just the extra delay this produces between DMA stop and DMA start, giving it time to finish properly (although it shouldn't have anything left to do, as both transfer-complete interrupts have already been received by then). Some thread here mentions that there's a difference between stopping the DMA by clearing the bit and the DMA really being stopped, so this might also be my case. For now it seems to be working better than it was.

waclawek.jan · ‎2014-02-20

Posted on February 20, 2014 at 09:06

You shouldn't need anything like that. The DMA won't throw the "complete" interrupt before all data have been transferred into the transmitter. The data may still be in transmission at that moment; however, that shouldn't corrupt the first data of the next transmission.

Without seeing the relevant portion of your code and more detailed description of the symptoms it's hard to judge what is the cause of your problems.

JW

jpeacock2399 · ‎2014-02-20

Posted on February 20, 2014 at 15:50

Chances are your SPI port isn't clear when you enable DMA, that's why emptying the SPI buffer works. When using SPI and DMA I always use 2 DMA channels since SPI is inherently bi-directional. When sending I use a regular DMA configuration but for the RX I set up a DMA non-incrementing memory address as a dummy to drain the RX buffer loaded after the TX data clocks out. That way the SPI is always clear at the end of the transfer.

If you go with two DMA channels make sure TX DMA is always a higher priority than RX DMA so the transmit clock will be generated before receiving.

Jack Peacock
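
The two-channel arrangement described above could look roughly like the sketch below. It is an illustration, not Jack's actual code: `dma_channel_t` is a stand-in for the device header's DMA channel type so the example also runs on a PC, `start_tx_with_rx_drain` and `rx_sink` are invented names, and the medium/high priority split is just one way to make TX the higher-priority channel. The bit positions are those of DMA_CCRx on the STM32F1; enabling TXDMAEN/RXDMAEN in SPI_CR2 is left out.

```c
#include <stdint.h>

/* Stand-in for one DMA channel's registers (assumption: use the device
 * header's type on the target). The address fields are uintptr_t so the
 * sketch also runs on a 64-bit host; the real registers are 32 bits wide. */
typedef struct {
    volatile uint32_t  CCR;    /* channel configuration */
    volatile uint32_t  CNDTR;  /* number of data items left to transfer */
    volatile uintptr_t CPAR;   /* peripheral address */
    volatile uintptr_t CMAR;   /* memory address */
} dma_channel_t;

#define CCR_EN       (1u << 0)
#define CCR_TCIE     (1u << 1)   /* transfer-complete interrupt enable */
#define CCR_DIR      (1u << 4)   /* 1 = read from memory (TX direction) */
#define CCR_MINC     (1u << 7)   /* memory address increment */
#define CCR_PSIZE_16 (1u << 8)
#define CCR_MSIZE_16 (1u << 10)
#define CCR_PL_MED   (1u << 12)
#define CCR_PL_HIGH  (2u << 12)

static uint16_t rx_sink;  /* every received word is written here and ignored */

static void start_tx_with_rx_drain(dma_channel_t *tx, dma_channel_t *rx,
                                   uintptr_t spi_dr_addr,
                                   const uint16_t *data, uint16_t words)
{
    /* Channels are reprogrammed only while disabled. */
    tx->CCR &= ~CCR_EN;
    rx->CCR &= ~CCR_EN;

    /* RX: CCR_MINC is not set, so all words land in the same dummy location. */
    rx->CPAR  = spi_dr_addr;
    rx->CMAR  = (uintptr_t)&rx_sink;
    rx->CNDTR = words;
    rx->CCR   = CCR_PSIZE_16 | CCR_MSIZE_16 | CCR_PL_MED | CCR_TCIE;

    /* TX: regular incrementing memory-to-peripheral transfer, higher priority. */
    tx->CPAR  = spi_dr_addr;
    tx->CMAR  = (uintptr_t)data;
    tx->CNDTR = words;
    tx->CCR   = CCR_DIR | CCR_MINC | CCR_PSIZE_16 | CCR_MSIZE_16
              | CCR_PL_HIGH | CCR_TCIE;

    rx->CCR |= CCR_EN;
    tx->CCR |= CCR_EN;
}
```

With equal counts on both channels, the RX transfer-complete interrupt arrives only after the last word has actually been clocked, and the SPI receive register is empty at that point.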

waclawek.jan · ‎2014-02-20

Posted on February 20, 2014 at 18:40

A similarly sounding issue appeared in the parallel sub-forum at https://my.st.com/public/STe2ecommunities/mcu/Lists/STM32Discovery/Flat.aspx?RootFolder=/public/STe2ecommunities/mcu/Lists/STM32Discovery/DMA%20FIFO%20Flush%20Issue&currentviews=3

Miro Nohaj · ‎2014-02-20

Posted on February 21, 2014 at 08:40

waclawek.jan wrote:

"Without seeing the relevant portion of your code and more detailed description of the symptoms it's hard to judge what is the cause of your problems."

The whole code is too large to spot the problem there, and it also relies on the SPI master (other ARM), so you couldn't test it anyway. Pasting it here would just fill a couple of pages...

Jack Peacock wrote:

"Chances are your SPI port isn't clear when you enable DMA, that's why emptying the SPI buffer works."

Yeah, that definitely is the case, as I'm using a weird protocol: my STM32F103 is an SPI slave and the SPI works bidirectionally using 2 DMA channels, but the amount of data sent (TX) and data received (RX) is almost every time different (TX tells the host what data should be transferred; RX then might receive more data, or even none, after that). But the issue shows up from time to time on the SPI master side, as data from the TX buffer of my STM32 being lost. That's why it's weird that clearing the RX buffer could help in my case, and that's why I also suspect that the improvement comes from the extra delay this causes, not from what the code does.

waclawek.jan wrote:

"A similarly sounding issue appeared in the parallel sub-forum"

Well, if I would like to cross-post on the forum, I would do it probably under the same name, but I don't like to do that anyway.

jpeacock2399 · ‎2014-02-21

Posted on February 21, 2014 at 15:47

"the amount of data sent (TX) and data received (RX) is almost every time different"

For an SPI data transfer the count should always be exactly the same. Data is clocked in and out on different edges of the same clock pulse. If you don't see equal DMA counts then you have a program bug.

Jack Peacock

waclawek.jan · ‎2014-02-21

Posted on February 21, 2014 at 16:30

> waclawek.jan wrote:
> "Without seeing the relevant portion of your code and more detailed description of the symptoms it's hard to judge what is the cause of your problems."
>
> The whole code is too large to spot the problem there, and it also relies on the SPI master (other ARM),

Then, as I suggested in that other thread, you might want to write a simplified test case that demonstrates the problem, possibly for a DISCOVERY board so that others can test it, using another SPI on the same chip as the master.

> waclawek.jan wrote:
> "A similarly sounding issue appeared in the parallel sub-forum"
>
> Well, if I would like to cross-post on the forum, I would do it probably under the same name, but I don't like to do that anyway.

I wasn't implying that. Rather, I meant, it might be beneficial to know of the other thread, should a solution come up.

JW

Miro Nohaj · ‎2014-02-21

Posted on February 22, 2014 at 00:03

"For an SPI data transfer the count should always be exactly the same. Data is clocked in and out on different edges of the same clock pulse. If you don't see equal DMA counts then you have a program bug."

I wouldn't call it a bug, because this is intentional 😉 One example of why I need to do this: my slave tells the master (using 8 TX words) that it needs another block of data of 260 WORDs, and as I need good throughput, I already have an RX buffer prepared to receive that block. So I set up the DMA with 8 words for TX and 260 + 8 words for RX, enable it, and when it all finishes, I already have the response from the master in the RX buffer. The last of those 8 TX WORDs is set to 0, so when the TX DMA stops and no longer feeds the SPI TX register, the last value there is 0, and the TX to the host will be 8 meaningful WORDs followed by 260 WORDs of zeros. I considered a TX buffer of 268 words filled mostly with zeros to be a waste of RAM, so the underflow of the SPI TX is an expected result.

The same goes the other way around: when I need to TX 268 WORDs but don't have to receive anything, I set up RX with only a single-WORD transfer and I don't mind that the SPI RX overruns, as there won't be any meaningful data I would lose.

So what I needed was a good recovery from these underflow and overrun states. I know that to do it all correctly I could set up TX and RX buffers of the same size, but I have a couple of different ones prepared with some data at the start, so that I don't have to set the content completely at run time, which saves some cycles.
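
A recovery step of the kind described above might look like the following sketch, meant to run between stopping the DMA and starting it again. On the STM32F1, reading SPI_DR clears RXNE, and a read of SPI_DR followed by a read of SPI_SR clears the OVR flag. Everything else here is an assumption for illustration: `spi_regs_t` is a reduced stand-in holding only the two registers used (the real register block has more fields, in a different layout), and `spi_discard_stale_rx` is an invented name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for the SPI register block (assumption: on the target,
 * use the device header's type and SPI1 instead). */
typedef struct {
    volatile uint32_t SR;  /* status register */
    volatile uint32_t DR;  /* data register */
} spi_regs_t;

#define SR_RXNE (1u << 0)  /* receive buffer not empty */
#define SR_OVR  (1u << 6)  /* overrun */

/* Hypothetical helper: discards a leftover received word and clears a
 * pending overrun. Returns true if there was anything to clean up. */
static bool spi_discard_stale_rx(spi_regs_t *spi)
{
    bool dirty = (spi->SR & (SR_RXNE | SR_OVR)) != 0u;

    (void)spi->DR;  /* read DR: drops the stale word, clears RXNE */
    (void)spi->SR;  /* read SR after DR: completes the OVR clear sequence */

    return dirty;
}
```

Counting how often this returns true would also help to tell whether the stale word, rather than the extra delay, is what correlates with the lost first word.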

waclawek.jan · ‎2014-02-24

Posted on February 24, 2014 at 10:06

Okay, and if you try to avoid the underruns by allocating a big enough buffer for both rx and tx, does your problem persist?

JW
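
For the experiment suggested above, a full-length TX frame can be built from the existing 8-word header plus zero padding, so that both DMA channels are programmed with the same count. This is a minimal sketch; `fill_tx_frame` and its parameters are illustrative names, and the 268-word size is taken from the 260 + 8 words mentioned earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies header_words words to the start of the frame
 * and zero-fills the rest, so the TX DMA never runs dry mid-transfer. */
static void fill_tx_frame(uint16_t *frame, size_t frame_words,
                          const uint16_t *header, size_t header_words)
{
    if (header_words > frame_words) {
        header_words = frame_words;  /* never write past the frame */
    }
    memcpy(frame, header, header_words * sizeof frame[0]);
    memset(frame + header_words, 0,
           (frame_words - header_words) * sizeof frame[0]);
}
```

A 268-word frame costs 536 bytes of RAM; if the lost word disappears with equal counts, the underrun recovery is the likely culprit.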