STM32 (stm32f103c8t6) DMA performance vs bitbang

kyrreaa
Associate III
Posted on November 10, 2016 at 09:34

I've implemented SDRAM access on an stm32f103c8t6 using DMA.

I notice that performance of the DMA when doing memory to peripheral transfer is quite a lot slower than I expected. The highest reliable clock I can run is 4 MHz while keeping both my command/address and data transfers in sync with the clock.

With both of my DMA transfers running I could reach a maximum transfer rate of 4.5 Msps.

The actual organization is a 16-bit command/address DMA and an 8-bit data DMA, where the direction of the 8-bit data DMA changes depending on whether I am reading or writing.

A single transfer never exceeded 6.54 Msps, which seems to indicate 11 clock cycles per transfer (running at 72 MHz). (The rate is lower with two DMA channels active, since both have to access the bus.) To verify this rate I set up a single 16-bit transfer to GPIOB.ODR with mem2mem turned on, to remove any timing constraints I may have placed upon the transfer.

This gave me a maximum transfer rate of 7.2 Msps, suggesting 10 clock cycles per transfer, and that the timer triggering costs the extra cycle.
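The cycle counts quoted above are simply the 72 MHz clock divided by the measured rate, rounded to the nearest whole cycle. A minimal sketch of that arithmetic follows; the helper name `cycles_per_transfer` is made up for illustration and is not from the original post.

```c
#include <stdint.h>

/* Clock cycles spent per transfer, rounded to the nearest integer.
 * f_clk_hz:  system clock in Hz (72 MHz in the post)
 * rate_sps:  measured transfer rate in transfers per second
 * Hypothetical helper, for illustration only. */
static uint32_t cycles_per_transfer(uint32_t f_clk_hz, uint32_t rate_sps)
{
    return (f_clk_hz + rate_sps / 2u) / rate_sps;
}
```

With a 72 MHz clock, 6.54 Msps works out to 11 cycles and 7.2 Msps to 10 cycles.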

I also tried 8/16/32-bit memory accesses, as well as writing to the 32-bit register PORTB.BSRR, to see if there was any overhead in byte/halfword/word access or in conversion/truncation. There did not appear to be any.

Finally I set up a hardcoded bitbang of the port, trying to write to it at maximum speed and checking the generated assembly to make sure a minimum of instructions was used. My best effort gave 8 Msps, i.e. 9 clock cycles per transfer.
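The bitbang code itself was not posted, so the following is only a sketch of what such a loop might look like, not the code that was measured. The function name `bitbang_write` is hypothetical. The register is passed in as a pointer so the same routine can be exercised on a host with an ordinary variable.

```c
#include <stddef.h>
#include <stdint.h>

/* Write n halfwords from buf to a port output register, one store each.
 * reg is volatile so the compiler must emit every store in order.
 * Hypothetical helper; on the target reg would point at the port's
 * output data register (assumed to be &GPIOB->ODR via the CMSIS header). */
static void bitbang_write(volatile uint32_t *reg, const uint16_t *buf, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        *reg = buf[i];
    }
}
```

How many cycles each iteration takes depends on the compiler, optimization level and loop unrolling, so the generated assembly still needs checking, as described above.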

So, down to my questions:

Is the DMA really slower than bitbanging?

How does this compare to the F2/F4 series with their smarter DMAs (FIFOs, memory segments, dual access, etc.)?

#stm32-sdram-dma-speed-transfer
1 REPLY
troy1818
Senior
Posted on November 10, 2016 at 10:35

The main purpose of DMA is to offload the CPU, not to get higher data transfer performance. I think you can get better transfer performance by dedicating 100% of the CPU to the task than by using DMA. But then again, the CPU cannot do anything else...