Question on STM32 DMA FIFO

apiccianoit · ‎2014-02-14

Posted on February 14, 2014 at 18:01

Hi All,

While reading the STM32 reference manual, I came up with a question about a peripheral-to-memory transfer using DMA in direct mode (no FIFO).

In section 9.3.12, in the Direct Mode paragraph, I found:

Direct mode

By default, the FIFO operates in direct mode (DMDIS bit in the DMA_SxFCR is reset) and the FIFO threshold level is not used. This mode is useful when the system requires an immediate and single transfer to or from the memory after each DMA request.

When the DMA is configured in direct mode (FIFO disabled), to transfer data in memory-to-peripheral mode, the DMA preloads one data from the memory to the internal FIFO to ensure an immediate data transfer as soon as a DMA request is triggered by a peripheral.

To avoid saturating the FIFO, it is recommended to configure the corresponding stream with a high priority.

My question is: even if the FIFO threshold is not used, is the FIFO still active? In other words: if the peripheral is sending bytes at a very high speed, can I lose a byte, or does the FIFO help by storing the previous bytes?

Thank you very much,

Antonio

#dma-fifo #fifo #direct-mode #dma

Tesla DeLorean · ‎2014-02-14

Posted on February 14, 2014 at 18:41

In the context of your question,

Define STM32

Define Very High Speed

The FIFO is there to mitigate memory contention issues wrt source and target devices.


jpeacock2399 · ‎2014-02-14

Posted on February 15, 2014 at 01:01

If your DMA source and destination sizes are different then the FIFO will be used to pack/unpack data between the memory bus and the peripheral, even in direct mode. This is useful if you know the data aligns right, as it can substantially cut down on bus contention (as much as 4 to 1, for example, if transferring data from memory at 32 bit to USART at 8 bit). How much this affects a program depends on the memory map and how you place DMA buffers in the SRAM1 or SRAM2 banks.

If the FIFO is enabled then it is filled in DMA bus access bursts of one to four words at a time, also useful if you want to limit DMA arbitration.

Jack Peacock

waclawek.jan · ‎2014-02-17

Posted on February 17, 2014 at 10:31

> If your DMA source and destination sizes are different then the FIFO will be used to pack/unpack data between the memory bus and the peripheral, even in direct mode.

No.

RM0090, ver.6, p.311, 10.3.10:

In direct mode (DMDIS = 0 in the DMA_SxFCR register), the packing/unpacking of data is not possible. In this case, it is not allowed to have different source and destination transfer data widths: both are equal and defined by the PSIZE bits in the DMA_SxCR (MSIZE bits are don’t care).

(Although Clive hinted that the DMA in different STM32 models may differ in details, I don't believe they would differ in this particular one. AFAIK, only the 'F2/'F4 DMA has a FIFO.)

Antonio: The documentation is crappy. Forget about FIFO being used in direct mode.

JW

apiccianoit · ‎2014-02-17

Posted on February 17, 2014 at 11:23

The model is STM32F205.

By very high speed I mean a peripheral sending bytes very fast, e.g. UART @921600 (a byte roughly every 10us).
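As a rough check of that figure, here is a minimal sketch in C. It assumes 8N1 framing, i.e. 10 bits on the wire per byte (1 start + 8 data + 1 stop); the function name is made up for illustration.

```c
#include <stdint.h>

/* Time needed to receive one UART frame, in nanoseconds (rounded down).
 * bits_per_frame is an assumption about the framing: 10 for 8N1. */
uint32_t uart_frame_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
    /* 64-bit intermediate so that bits * 1e9 cannot overflow. */
    return (uint32_t)(((uint64_t)bits_per_frame * 1000000000ULL) / baud);
}
```

For 921600 baud and 10 bits per frame this gives 10850 ns, so "a byte every 10us" is slightly pessimistic, which is the safe direction for a latency budget.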

thank you very much

apiccianoit · ‎2014-02-17

Posted on February 17, 2014 at 11:31

Thank you for your answer.

My use case is a byte-level transfer: a byte arrives from the peripheral (UART @921600) and is moved into a memory array.

I don't want to use the FIFO mode in the DMA, but I want to be sure that, if for any reason there is any slow down, the internal FIFO of the DMA can store one or 2 bytes and transfer them as soon as possible.

Thank you

waclawek.jan · ‎2014-02-17

Posted on February 17, 2014 at 11:55

> I don't want to use the FIFO mode in the DMA,

Why? I can't think of any adverse effect of using the FIFO.

> but I want to be sure that, if for any reason there is any slow down, the internal FIFO of the DMA can store one or 2 bytes and transfer them as soon as possible.

If you don't use the FIFO mode, the FIFO can't store anything.

JW

apiccianoit · ‎2014-02-17

Posted on February 17, 2014 at 12:12

The reason is that the number of bytes arriving from the UART is unknown.

Suppose I set the FIFO threshold to 4 bytes, but only 2 or 3 bytes arrive from the UART.

I need to transfer them as soon as possible, and not wait for the threshold to be reached.

So my choice is to use the direct mode, but the problem, in that case, is the speed: is the DMA transfer fast enough to transfer each byte as soon as it arrives, or can I run into some overwriting?
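The concern about bytes being held back below the threshold can be illustrated with a deliberately simplified model in C. This is not the real DMA state machine (it ignores, for example, the FIFO being flushed when the stream is disabled); the function name and the convention that a threshold of 0 means direct mode are assumptions made for this sketch.

```c
#include <stdint.h>

/* Simplified model: how many of `received` bytes have reached memory.
 * threshold_bytes == 0 stands for direct mode in this sketch. */
uint32_t bytes_in_memory(uint32_t received, uint32_t threshold_bytes)
{
    if (threshold_bytes == 0u) {
        /* Direct mode: every DMA request is transferred on its own. */
        return received;
    }
    /* FIFO mode: data is drained only in whole threshold-sized chunks,
     * so a remainder below the threshold stays in the FIFO. */
    return (received / threshold_bytes) * threshold_bytes;
}
```

With a 4-byte threshold and only 3 bytes received, the model reports 0 bytes in memory, while direct mode reports all 3.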

thank you

waclawek.jan · ‎2014-02-17

Posted on February 17, 2014 at 13:14

> The reason is that the number of bytes arriving from the UART is unknown.

> I need to transfer them as soon as possible, and not wait for the threshold to be reached.

Well, this might be seen as a drawback of the FIFO, I admit... :)

OK, no FIFO. But that means, you can't count on it; that's it.

> So my choice is to use the direct mode, but the problem, in that case, is the speed: is the DMA

> transfer fast enough to transfer each byte as soon as it arrives, or can I run into some

> overwriting?

This depends on many factors. First, within the DMA module itself, the DMA stream in question might have to wait for another DMA stream to finish its transfer. This can be mitigated by setting the priority of this DMA stream high and the others' lower; if you set it up that way, this is not a problem.
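A minimal sketch of what such a setup could look like at register level, written as plain value-building functions so it can be checked on a host. The bit positions are as I understand them from the 'F2/'F4 reference manuals and should be verified against your device header; the macro and function names are illustrative, not taken from any ST library.

```c
#include <stdint.h>

/* Assumed bit positions (check against the reference manual / device header). */
#define SXCR_CHSEL_POS 25u          /* DMA_SxCR: channel selection, 3 bits  */
#define SXCR_PL_POS    16u          /* DMA_SxCR: priority level, 2 bits     */
#define SXCR_MINC      (1u << 10)   /* DMA_SxCR: increment memory address   */
#define SXFCR_DMDIS    (1u << 2)    /* DMA_SxFCR: 1 = direct mode disabled  */

/* DMA_SxCR value for a byte-wide peripheral-to-memory stream,
 * very high priority. PSIZE, MSIZE and DIR are left at 0
 * (byte, byte, peripheral-to-memory). */
uint32_t sxcr_rx_byte_direct(uint32_t channel)
{
    return ((channel & 7u) << SXCR_CHSEL_POS)
         | (3u << SXCR_PL_POS)
         | SXCR_MINC;
}

/* DMA_SxFCR value with DMDIS cleared, i.e. direct mode selected. */
uint32_t sxfcr_direct_mode(uint32_t current_fcr)
{
    return current_fcr & ~SXFCR_DMDIS;
}
```

The returned values would be written to the stream's registers while the stream is disabled; which channel number applies depends on the request mapping table for the USART in question.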

Then, the DMA controller might wait on its "peripheral" port to read out the received byte from the UART. This happens through the AHB-APB bridge, which arbitrates between the requests from its two AHB ports, and of course through the APB bus itself. There is no public information on that arbitration that I know of; a sane assumption might be that both AHB ports are given access in round-robin fashion, so if the processor-side AHB port made a request just before the DMA-side port, that request would have to finish first. Now, I am not familiar with the APB bus timing, but let's just assume that for a processor read request there would be one APB cycle to resynchronize with the AHB, one cycle to transfer the address, and one cycle to transfer back the data. The same would apply to the DMA request, so that would result in a worst case of 6 APB cycles for the DMA to perform the read. Maybe some test could be devised to find out the worst case for this read. You may mitigate this by not allowing the processor to access the APB directly during the critical DMA transfer, but whether this is viable depends on your application.
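To put the assumed 6 APB cycles into time units, here is a small sketch in C. The cycle count is the guess made above, not a documented figure, and the 30 MHz APB clock used in the note below is just an example value; the function name is hypothetical.

```c
#include <stdint.h>

/* Convert a number of bus cycles into nanoseconds, rounding up so the
 * result never underestimates the latency. */
uint32_t apb_cycles_to_ns(uint32_t cycles, uint32_t apb_hz)
{
    uint64_t num = (uint64_t)cycles * 1000000000ULL + (uint64_t)apb_hz - 1u;
    return (uint32_t)(num / apb_hz);
}
```

At an example APB clock of 30 MHz, 6 cycles come to 200 ns, which is small compared with the roughly 10 us between bytes at 921600 baud.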

On the "memory" port, the DMA again needs to wait until the arbiter of the AHB bus where the target SRAM sits gives it access. Plus there might be other AHB masters fighting for that bus: again the processor, the other DMA, the ETH DMA, the USB HS DMA. And again, little is known about the arbiter here too. You can mitigate this by choosing different target SRAMs for different DMA transfers.

On the other hand, the double-buffered USART receiver allows for a worst case of almost 2 characters' time from Rx-complete signalling until the received character is fetched, so at 921600 baud you have roughly 20us rather than 10us to complete the operation in the worst case, provided the total throughput is good enough to transfer the characters in 10us each on average.

Generally, unless there is very heavy data traffic going on from multiple peripheral sources, in a tens-of-MHz system even a 1 Mbps USART *transfer* represents only a few % of the total throughput. It's usually the *processing* side where the bottleneck occurs.
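The "few %" estimate can be sanity-checked with a short calculation, sketched here in C. It assumes 6 bus cycles per transferred byte (the worst-case guess from earlier in this post) and an example bus clock of 30 MHz; both numbers and the function name are assumptions, not measured or documented values.

```c
#include <stdint.h>

/* Share of bus cycles consumed by a byte stream, in permille (1/1000). */
uint32_t bus_load_permille(uint32_t bytes_per_s, uint32_t cycles_per_byte,
                           uint32_t bus_hz)
{
    uint64_t used = (uint64_t)bytes_per_s * cycles_per_byte * 1000u;
    return (uint32_t)(used / bus_hz);
}
```

For 92160 bytes/s (921600 baud at 10 bits per byte), 6 cycles per byte and a 30 MHz bus, this comes to 18 permille, i.e. a bit under 2 %.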

JW

apiccianoit · ‎2014-02-17

Posted on February 17, 2014 at 15:09

Thank you for your answer. I think I'll make some tests in Direct mode.