Questions on calculating DMA latency on STM32H7 chips

BenDDMC · ‎2025-06-09

I found an application note which goes in depth on DMA transfer speeds and clocking for F7 chips, but haven't been able to find anything similar for H7 chips. I have a few questions:

How is each of the DMA controllers clocked on an H7xx chip? My assumption is that MDMA is on the AXI clock and that DMA1, DMA2, and BDMA are on the AHB clock. Is this correct?
How would I go about calculating the worst-case latency for a memory-to-memory transfer by a given DMA controller from one memory location to another (in the same bank or a different one)? How much latency does each domain link add, and do the H7 memory buses follow the same round-robin allocation scheme described in the F7's application note?

For example, if I were transferring 1 word of data from SRAM1 to AXI SRAM, how long would it take using each of the BDMA, DMA1, and MDMA controllers?

mƎALLEm · ‎2025-06-10

Hello,

If the DMA1/DMA2 transfer stays inside D2, for example from/to an SRAMx in D2, you can follow the same calculation as provided in AN4031 (same as for F4 and F7).

If the transfer goes outside D2, I think you need to add one AHB cycle to cross from one domain to another.

For MDMA I don't have the data.
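
A rough sketch of that arithmetic, assuming the in-domain cycle count has already been worked out with the AN4031 method. The helper names and any numbers passed in are placeholders, not values taken from the application note, and the one-cycle-per-crossing term is only the rule of thumb given above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: in-domain cycles (from the AN4031 method) plus
 * one AHB cycle per domain crossing (rule of thumb, not a datasheet figure). */
static uint32_t estimate_cycles(uint32_t in_domain_cycles, uint32_t domain_crossings)
{
    return in_domain_cycles + domain_crossings;
}

/* Hypothetical helper: convert a cycle count to nanoseconds at a given bus clock. */
static double cycles_to_ns(uint32_t cycles, uint32_t clock_hz)
{
    assert(clock_hz != 0u);  /* a zero clock would divide by zero */
    return (double)cycles * 1e9 / (double)clock_hz;
}
```

For example, `cycles_to_ns(estimate_cycles(10, 1), 240000000u)` gives roughly 45.8 ns for a made-up 10-cycle transfer with one domain crossing on a 240 MHz AHB clock.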

MasterT · ‎2025-06-10

Location on the buses, and consequently the clock sources:

#define MDMA_BASE (D1_AHB1PERIPH_BASE + 0x0000UL)

#define DMA1_BASE (D2_AHB1PERIPH_BASE + 0x0000UL)
#define DMA2_BASE (D2_AHB1PERIPH_BASE + 0x0400UL)
#define DMAMUX1_BASE (D2_AHB1PERIPH_BASE + 0x0800UL)

#define BDMA_BASE (D3_AHB1PERIPH_BASE + 0x5400UL)
#define DMAMUX2_BASE (D3_AHB1PERIPH_BASE + 0x5800UL)

Regarding latency, it is better to test it yourself: declare a volatile array of 1k-10k elements, run the transfer, and check what numbers you get.
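
A minimal sketch of such a test. The counter and the transfer are passed in as function pointers so the same code builds on a PC. On the target you would return `DWT->CYCCNT` from the counter and replace the copy with code that starts your DMA stream and waits for transfer complete. All names here are made up for illustration, and the linker placement of the buffers in the SRAMs under test is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_WORDS 1024u  /* low end of the suggested 1k-10k range */

/* volatile so the compiler cannot optimise the accesses away */
static volatile uint32_t src_buf[N_WORDS];
static volatile uint32_t dst_buf[N_WORDS];

typedef uint32_t (*cycle_counter_fn)(void);
typedef void (*transfer_fn)(volatile uint32_t *dst,
                            const volatile uint32_t *src, size_t n_words);

/* Stand-in transfer: plain CPU copy. Swap in "start DMA, wait for done". */
static void cpu_copy(volatile uint32_t *dst,
                     const volatile uint32_t *src, size_t n_words)
{
    for (size_t i = 0; i < n_words; ++i) {
        dst[i] = src[i];
    }
}

/* Stand-in counter for PC builds: advances 100 ticks per call. */
static uint32_t host_counter(void)
{
    static uint32_t ticks;
    ticks += 100u;
    return ticks;
}

/* Returns elapsed counter ticks for one transfer of n_words words. */
static uint32_t time_transfer(cycle_counter_fn now, transfer_fn xfer, size_t n_words)
{
    assert(now != NULL && xfer != NULL && n_words <= N_WORDS);
    uint32_t start = now();
    xfer(dst_buf, src_buf, n_words);
    return now() - start;  /* unsigned subtraction survives one counter wrap */
}
```

Calling `time_transfer` many times and recording every result, rather than one total, is what lets you see the spread between typical and slowest transfers.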

From my experience testing the DAC and ADC in the analog domain over DMA, an H7 running at 480 MHz could only guarantee on-time data delivery below 25 MHz; above that, the DMA leaves gaps in the stream. This may not be an issue for MEM2MEM, but the point is that many factors are involved, CPU load for example. So there will always be two numbers: an "average" access time spread over the whole array of data, which is the one usually advertised, and a "guaranteed" access time, which may be 2-5 times slower than the first.
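
To make the two numbers concrete, here is a small sketch that reduces a set of per-transfer cycle counts to an average and a worst case. The struct and function names are hypothetical, and the worst case it reports is only the worst one observed in the run, not a proven bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    double   average;  /* the "average" figure, spread over all samples */
    uint32_t worst;    /* the slowest sample seen, i.e. the "guaranteed" figure */
} latency_summary;

/* Reduce n per-transfer cycle counts to an average and a worst case. */
static latency_summary summarize(const uint32_t *samples, size_t n)
{
    assert(samples != NULL || n == 0u);
    latency_summary s = { 0.0, 0u };
    uint64_t total = 0u;  /* 64-bit so many samples cannot overflow the sum */
    for (size_t i = 0; i < n; ++i) {
        total += samples[i];
        if (samples[i] > s.worst) {
            s.worst = samples[i];
        }
    }
    if (n > 0u) {
        s.average = (double)total / (double)n;
    }
    return s;
}
```

Run it with the CPU idle and again under realistic load; the gap between `average` and `worst` is the margin to design for.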