Parallel synchronous transmission using DMA affected by CPU overhead

RCata
Associate III

Hi friends,

I have DMA2 moving data from a look-up table to GPIOB. The DMA is triggered by TIM8, which generates a constant sampling period.

The system works fine, but when the main loop contains some instructions (dummy instructions that don't touch the DMA setup), the DMA starts misbehaving... losing samples.

Is there a solution for this?

Thanks.

15 REPLIES

0690X000008AHYtQAO.bmp

This is an oscilloscope capture of the problem. The upper trace is PB0 and the lower trace is the strobe signal. The strobe is TIM8_CH4, and it fires the DMA on the H->L edge. The look-up table is 0,1,0,1.

The DMA time, measured from the L->H edge of the strobe signal until PB0 is updated, is 70 ns in this case. More code in the main loop increases the DMA time.

I tried placing the look-up table in SRAM2, but the result is the same.

Any ideas? Can I predict the maximum DMA time?

Alex R
Senior

Can you use the FMC module (Flexible Memory Controller) instead? The FMC can be used to implement parallel buses 8, 16, or 32 bits wide, with hardware strobe signals and controllable timing, to interface with a number of memories or devices (LCDs, etc.).

Yes, you got it right! DMA jitter increases when the main loop code is larger and contains instructions that make the CPU access the same bus.

Before testing your idea, I tried enabling the D-cache, moving the look-up table to SDRAM, and enabling the ART accelerator, but none of that helped.

Increasing the strobe time as you suggested makes everything work fine.

That leads me to one last question... What is the maximum DMA jitter I can expect? For now I am leaving 32 clock cycles of margin to avoid the problem, which is 148 ns (the bus clock is 216 MHz).
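The margin arithmetic above can be checked with a small helper. This is only a sketch: the function names `cycles_to_ns` and `ns_to_cycles_ceil` are made up for illustration, and the only numbers taken from this thread are the 216 MHz bus clock, the 32-cycle margin, and the 70 ns measurement.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a number of bus clock cycles to nanoseconds, rounded to nearest.
 * Hypothetical helper, not part of any ST library. */
static uint32_t cycles_to_ns(uint32_t cycles, uint32_t clock_hz)
{
    assert(clock_hz > 0);
    uint64_t num = (uint64_t)cycles * 1000000000ULL;
    return (uint32_t)((num + clock_hz / 2u) / clock_hz);
}

/* Smallest whole number of cycles that covers at least `ns` nanoseconds. */
static uint32_t ns_to_cycles_ceil(uint32_t ns, uint32_t clock_hz)
{
    assert(clock_hz > 0);
    uint64_t num = (uint64_t)ns * clock_hz;
    return (uint32_t)((num + 999999999ULL) / 1000000000ULL);
}
```

With a 216 MHz clock, 32 cycles come out at about 148 ns, and the 70 ns seen on the scope corresponds to roughly 16 cycles.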

Thanks!

Thanks for your suggestion, Alex. Configuring the FMC to drive my external DAC seems like a good idea, but it also seems difficult to use. My external DAC is 10 bits wide. Currently the TIM8+DMA+GPIO setup seems to work thanks to waclawek's suggestion. If I can find out the maximum DMA jitter, the problem is solved.

Also thanks.

> What is the maximum DMA jitter I can expect?

That's a very hard question to answer, and it depends on various things (I assume you've already read AN4031).

The latency adds up (thus jitter of these sources also adds up in worst case) from 1. latency in DMA between trigger and start of transfer, 2. delays due to conflicts on source bus, 3. delays on destination bus.

Latency 1 is given by other active streams in the DMA - even if this particular stream has the highest priority, it has to wait until the currently active stream finishes its job. So worst case delay/jitter is given by the longest lasting other-stream DMA transfer (which again has constituents 2. and 3.).

Item 2 is the easiest - if you put the samples array into, say, RAM2 and no other busmaster accesses RAM2 (i.e. there are no variables accessed by the processor, the stack is not there, no other DMA goes there), latency is low (maybe one cycle) and there's no jitter. Even if there were some conflict, SRAMs are fast, so latency and jitter amount to a few (1-3) cycles.

Item 3 is probably the trickiest, as the GPIOs sit on AHB1, which contains almost all peripherals (most of them indirectly behind AHB/APB bridges, but still they are accessed by the masters through AHB1). If, say, the processor tries to read some peripheral on APB, this request has to cross the AHB/APB bridge, slowed down by the resynchronization to the possibly slower APB bus, wait until the addressed peripheral returns the answer, and wait until that answer propagates back through the AHB/APB bridge. Depending on the AHB/APB divider, this may take even dozens of cycles. And there are even worse cases; maybe the most prominent is the RTC, where in certain cases the wait lasts several RTC clocks - and the RTC is clocked from LSE, i.e. at 32 kHz... But that's probably the most extreme case, and there are probably no other similar ones.
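The way the three latency sources add up in the worst case can be written down as a simple budget. This is a rough sketch only: the struct and function names are hypothetical, and the cycle counts you feed in are your own estimates, not datasheet values.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case estimates, in bus clock cycles, for the three sources above.
 * All field values are assumptions supplied by the user. */
typedef struct {
    uint32_t arbitration_cycles;  /* 1. wait for the currently active stream */
    uint32_t source_cycles;       /* 2. conflicts on the source bus          */
    uint32_t destination_cycles;  /* 3. conflicts on the destination bus     */
} dma_latency_budget_t;

/* Worst case: all three delays hit the same transfer, so they simply add. */
static uint32_t dma_worst_case_cycles(const dma_latency_budget_t *b)
{
    assert(b != 0);
    return b->arbitration_cycles + b->source_cycles + b->destination_cycles;
}

/* Returns 1 if the worst case still fits inside the strobe margin. */
static int dma_budget_fits(const dma_latency_budget_t *b, uint32_t margin_cycles)
{
    return dma_worst_case_cycles(b) <= margin_cycles;
}
```

For example, made-up estimates of 10, 3, and 24 cycles sum to 37 cycles, which would not fit in a 32-cycle margin.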

JW

RCata
Associate III

Hi Jan,

First of all, thank you for your detailed explanation.

Some jitter sources, as you said, can add a few cycles. That doesn't matter much, because my hardware can tolerate delays of a few microseconds, and the round-robin algorithm can ensure a reasonable time, I guess.

Now the look-up table is in SRAM2. No other variables accessed by DMA are in this area.

What worries me most is the latency 1 you mentioned, because I am using DMA2 for memcopies, to feed the CRC peripheral, and for ADC transfers. The ADCs take a sample every 10 us, which doesn't seem to be a problem. But if, as you said, a stream must finish its whole block transfer before another stream takes control of the DMA, this is catastrophic! That means that if a memcopy or a CRC is running, the jitter will be enormous! Is this right?

In that case, what can I do? Split the memcopies and the CRC into small blocks using interrupts?

:\
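If splitting does turn out to be necessary, the idea in the question could be sketched roughly as below. This is a host-side illustration only: `CHUNK_BYTES`, `chunked_copy_t`, and both functions are hypothetical names, and `memcpy` stands in for starting one short DMA transfer. On the target, each step would presumably be launched from the previous chunk's transfer-complete interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_BYTES 64u  /* assumed chunk size; tune to the tolerable delay */

/* State of one copy job that is processed chunk by chunk. */
typedef struct {
    const uint8_t *src;
    uint8_t *dst;
    size_t remaining;
} chunked_copy_t;

static void chunked_copy_begin(chunked_copy_t *job, uint8_t *dst,
                               const uint8_t *src, size_t len)
{
    assert(job != NULL);
    job->src = src;
    job->dst = dst;
    job->remaining = len;
}

/* Transfer the next chunk. Returns 1 while more chunks remain, 0 when done.
 * memcpy is a stand-in for kicking off one short DMA transfer. */
static int chunked_copy_step(chunked_copy_t *job)
{
    assert(job != NULL);
    size_t n = job->remaining < CHUNK_BYTES ? job->remaining : CHUNK_BYTES;
    if (n > 0) {
        memcpy(job->dst, job->src, n);
        job->src += n;
        job->dst += n;
        job->remaining -= n;
    }
    return job->remaining > 0;
}
```

Each step occupies the stream for at most `CHUNK_BYTES`, so a pending higher-priority request would wait for one chunk at most rather than for the whole block.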