STM32F4 DMA Address Update penalty with double buffering

ribdp · ‎2024-05-21

So the background is essentially this -

I am trying to use one of the STM32F4 DMA2 streams in circular, double-buffer (DBM) mode.
The DMA stream moves data out of the SPI DR register at regular intervals, as soon as data becomes available at SPI RX (the DMA stream and channel respond to the SPI RX DMA requests).
In the DMA transfer-complete callback, I update the M0 base address (M0AR). This is allowed because in DBM mode the DMA has, by that point, switched to the memory region defined by M1AR.

However, I see some SPI data being missed/skipped right after the first NDTR data items.

Things I have tried -

- Reducing the rate at which SPI receives data (brought it down from 1 MSPS to 10 kSPS).
  1. I still see discontinuities right after NDTR data items.
- Modifying the transfer-complete callback to a) return directly, or b) do some dummy operation on a different DMA2 stream and then return.
  1. In both cases I don't see any discontinuities at the NDTR mark.
  2. This leads me to suspect that writing to M0AR incurs a time penalty during which the DMA stream doesn't work?
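
As a rough sanity check on the timing: the window for rewriting M0AR is the time the DMA spends filling the other buffer, which is NDTR items at the SPI sample rate. Below is a minimal sketch of that arithmetic, assuming one DMA item per sample and a hypothetical NDTR of 256 (the actual value is not given here). `update_window_us` is an illustrative helper name, not part of any library.

```c
#include <stdint.h>

/* Time available to update the idle address register, in microseconds.
   Assumes one DMA item is transferred per SPI sample. */
static double update_window_us(uint32_t ndtr, uint32_t sample_rate_hz)
{
    return (double)ndtr * 1e6 / (double)sample_rate_hz;
}

/* Hypothetical buffer length, for illustration only. */
enum { EXAMPLE_NDTR = 256 };
```

With these assumed numbers the window is about 256 us at 1 MSPS and about 25.6 ms at 10 kSPS, so a callback that merely runs too slowly would be expected to stop causing trouble at the lower rate.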

(Figure showing the discontinuity)

My question is essentially whether this is documented behaviour and, if so, whether there are any heuristics available for the time penalty.

Any leads/pointers would be very helpful!

Thanks

Danish1 · ‎2024-05-21

You have to look at the CT bit in DMA_SxCR to know which of DMA_SxM0AR and DMA_SxM1AR you need to update on each transaction-complete callback.

ribdp · ‎2024-05-21

Hi @Danish1 , thanks for the reply!

Sorry, should've mentioned I registered a callback using the HAL API, so I'm positive the transfer-complete callback is invoked only when CT=1.

To be doubly sure, I also stored the CR register value inside the callback, first thing, and inspected it later - CT was indeed 1 as expected.

Danish1 · ‎2024-05-21

You'll have to delve into the HAL source code to see how they try to make it work. CT might be 1 the first time the callback happens. But it should toggle each time.

How it _should_ work (without HAL):

1. Set up DMA_SxM0AR (and maybe DMA_SxM1AR).
2. Start the DMA.
3. If you haven't already done so, set up DMA_SxM1AR. It must happen before step 4.
4. DMA completes. Hardware automagically switches to using DMA_SxM1AR. Transfer Complete interrupt triggered.
5. In response to Transfer Complete interrupt, update DMA_SxM0AR. (You know that's the one to update by looking at the CT bit in DMA_SxCR). This must happen before step 6.
6. DMA completes. Hardware automagically switches to using DMA_SxM0AR. Transfer Complete interrupt triggered.
7. In response to Transfer Complete interrupt, update DMA_SxM1AR. (You know that's the one to update by looking at the CT bit in DMA_SxCR). This must happen before step 8.
8. As step 4.
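
The interrupt-side part of the steps above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not tested on hardware: `dma_stream_t` is a hypothetical stand-in for the stream registers so the logic can be run on a host (on target you would use the CMSIS stream type instead), `dma_update_idle_buffer` is a made-up helper name, and CT is taken to be bit 19 of DMA_SxCR as in the F4 reference manual.

```c
#include <stdint.h>

/* Hypothetical stand-in for the three stream registers used here. */
typedef struct {
    volatile uint32_t CR;    /* DMA_SxCR   */
    volatile uint32_t M0AR;  /* DMA_SxM0AR */
    volatile uint32_t M1AR;  /* DMA_SxM1AR */
} dma_stream_t;

/* CT (current target): 0 = hardware is using M0AR, 1 = hardware is using M1AR. */
#define DMA_CR_CT (1UL << 19)

/* Call from the Transfer Complete interrupt.
   Writes new_addr into whichever address register the hardware is NOT
   currently using. Returns 0 if M0AR was updated, 1 if M1AR was updated. */
static int dma_update_idle_buffer(dma_stream_t *s, uint32_t new_addr)
{
    if (s->CR & DMA_CR_CT) {
        s->M0AR = new_addr;  /* hardware is on M1, so M0 is safe to change */
        return 0;
    }
    s->M1AR = new_addr;      /* hardware is on M0, so M1 is safe to change */
    return 1;
}
```

Reading CT at the point of the write, rather than assuming which buffer just finished, keeps the update correct even if the handler runs late or the first interrupt arrives with a different CT value than expected.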

I don't use HAL. But I do successfully use double-buffered DMA at 921600 baud on a UART.

ribdp · ‎2024-05-21

Thanks @Danish1 !

So, you can register two separate callbacks with HAL - one for M0-complete, and the other for M1-complete.

The callback I was referring to, above, is the M0-complete - here I hope to ultimately be able to update the M0AR.

There is a single IRQ line for DMA interrupts, and the HAL decides which callback to invoke after checking the CT value.
My understanding matches the steps you've enumerated.
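
A simplified host-side model of that dispatch is sketched below. It is not the HAL source: `model_handle_t`, `model_tc_dispatch` and the counter callbacks are hypothetical names for illustration, and the only behaviour modelled is the one described above (a single interrupt entry point that checks CT and calls the M0-complete callback when CT=1, otherwise the M1-complete callback).

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_CR_CT (1UL << 19)  /* CT bit position in DMA_SxCR */

typedef void (*xfer_cb_t)(void *ctx);

/* Hypothetical, cut-down handle: a CR snapshot plus the two callbacks. */
typedef struct {
    uint32_t  cr;
    xfer_cb_t m0_complete;  /* called when buffer 0 has just been filled */
    xfer_cb_t m1_complete;  /* called when buffer 1 has just been filled */
    void     *ctx;
} model_handle_t;

/* Single entry point, as with one IRQ line per stream.
   CT=1 means the hardware has moved on to M1, so buffer 0 just completed. */
static void model_tc_dispatch(const model_handle_t *h)
{
    xfer_cb_t cb = (h->cr & MODEL_CR_CT) ? h->m0_complete : h->m1_complete;
    if (cb != NULL) {
        cb(h->ctx);
    }
}

/* Example callbacks that only count how often each buffer completed. */
typedef struct { int m0_done; int m1_done; } counters_t;
static void on_m0_complete(void *ctx) { ((counters_t *)ctx)->m0_done++; }
static void on_m1_complete(void *ctx) { ((counters_t *)ctx)->m1_done++; }
```

In a model like this, the M0-complete callback is the only place where M0AR would be rewritten, which is the arrangement described above.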

But given that you were able to use double-buffered DMA with UART at that rate, I guess I could double-check my implementation.

But specifically on the CT bit - had I been updating M0AR at the wrong CT value, the RM says the DMA should have aborted and the stream been disabled. That's clearly not what's happening in my case, as I still get proper data after the momentary glitch/discontinuity. Any thoughts on this?