STM32H7 - MDMA inits differently in step-by-step or run mode

tarzan2
Associate III

Hello,

I'm trying to transfer a buffer of 4000 uint16_t values from SRAM4 to DTCM. This works with a basic configuration of the MDMA module:

- Buf size = 32

- Block len = 4000

- No burst mode

- in/out size = half word.

The MDMA needs 135µs to transfer the whole buffer, which seems quite slow. I'm looking into ways to speed up the transfer, using burst mode and/or packed mode.

When implementing burst mode, I see strange behavior in the MDMA initialization.

 
  hmdma_mdma_channel0_sw_0.Instance = MDMA_Channel0;
  hmdma_mdma_channel0_sw_0.Init.Request = MDMA_REQUEST_SW;
  hmdma_mdma_channel0_sw_0.Init.TransferTriggerMode = MDMA_BLOCK_TRANSFER;
  hmdma_mdma_channel0_sw_0.Init.Priority = MDMA_PRIORITY_VERY_HIGH;
  hmdma_mdma_channel0_sw_0.Init.Endianness = MDMA_LITTLE_ENDIANNESS_PRESERVE;
  hmdma_mdma_channel0_sw_0.Init.SourceInc = MDMA_SRC_INC_HALFWORD;
  hmdma_mdma_channel0_sw_0.Init.DestinationInc = MDMA_DEST_INC_HALFWORD;
  hmdma_mdma_channel0_sw_0.Init.SourceDataSize = MDMA_SRC_DATASIZE_HALFWORD;
  hmdma_mdma_channel0_sw_0.Init.DestDataSize = MDMA_DEST_DATASIZE_HALFWORD;
  hmdma_mdma_channel0_sw_0.Init.DataAlignment = MDMA_DATAALIGN_PACKENABLE;
  hmdma_mdma_channel0_sw_0.Init.BufferTransferLength = 32;
  hmdma_mdma_channel0_sw_0.Init.SourceBurst = MDMA_SOURCE_BURST_128BEATS;
  hmdma_mdma_channel0_sw_0.Init.DestBurst = MDMA_DEST_BURST_128BEATS;

When running in debug mode, the live watch of MDMA_C0TCR.SBURST/DBURST shows 0 and the transfer is slow.

If I step through the HAL MDMA init function, MDMA_C0TCR is correctly initialized. Running the program after that works fine and the transfer time is much shorter (85µs).

I have similar init issues when trying to use packed mode, and in all cases with the TRGM bits: the register init is OK when debugging step-by-step, and wrong when running the program.

Is there a real issue, or is it another dirty trick of the complex superscalar Cortex-M7 architecture?

I'm not reading the register immediately after the write, but a long time afterwards, to rule out any propagation delay through the different buses.

Any help appreciated :)

Thanks

TDK
Super User

When the program is running, MDMA_C0TCR may be changed before you see it. Live watch doesn't update immediately.
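
One way to take live watch out of the picture is to copy the register into an ordinary variable right after the init call (on target, something like `uint32_t snap = MDMA_Channel0->CTCR;`) and decode that snapshot afterwards. Below is a minimal sketch of such a decoder. The field positions are assumptions based on my reading of the MDMA_CxTCR layout (SBURST at bits 14:12, DBURST at 17:15, TLEN at 24:18, PKE at bit 25, TRGM at 29:28), and `mdma_tcr_fields_t` and `mdma_decode_tcr` are hypothetical names, so check everything against the reference manual and device header for your part.

```c
#include <stdint.h>

/* Assumed MDMA_CxTCR field positions; verify against the reference manual. */
#define TCR_SBURST_POS 12u  /* 3 bits: source burst length is 2^n beats */
#define TCR_DBURST_POS 15u  /* 3 bits: destination burst length is 2^n beats */
#define TCR_TLEN_POS   18u  /* 7 bits: buffer transfer length in bytes, minus 1 */
#define TCR_PKE_POS    25u  /* 1 bit: pack enable */
#define TCR_TRGM_POS   28u  /* 2 bits: trigger mode */

/* Hypothetical container for the decoded fields. */
typedef struct {
    uint32_t sburst_beats;      /* source burst length in beats */
    uint32_t dburst_beats;      /* destination burst length in beats */
    uint32_t buffer_len_bytes;  /* TLEN + 1 */
    uint32_t pack_enable;       /* 1 if packing is enabled */
    uint32_t trigger_mode;      /* raw TRGM value */
} mdma_tcr_fields_t;

/* Decode a raw TCR snapshot into human-readable values. */
mdma_tcr_fields_t mdma_decode_tcr(uint32_t tcr)
{
    mdma_tcr_fields_t f;
    f.sburst_beats     = 1u << ((tcr >> TCR_SBURST_POS) & 0x7u);
    f.dburst_beats     = 1u << ((tcr >> TCR_DBURST_POS) & 0x7u);
    f.buffer_len_bytes = ((tcr >> TCR_TLEN_POS) & 0x7Fu) + 1u;
    f.pack_enable      = (tcr >> TCR_PKE_POS) & 0x1u;
    f.trigger_mode     = (tcr >> TCR_TRGM_POS) & 0x3u;
    return f;
}
```

If a snapshot taken in free-running mode decodes to 128-beat bursts while live watch shows 0, the init itself was probably fine and the value was either changed later or is being displayed stale.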

> The MDMA needs 135µs

How are you timing this exactly? Be specific.

8 kB in 135 µs is about 475 Mbit/s, which is reasonable. ST doesn't publish exact figures here because it's too complicated.
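
For reference, the arithmetic behind that figure as a small sketch. `mdma_throughput_mbit_s` is a hypothetical helper name; the inputs are the numbers from the post (4000 half-words, 135 µs without burst and 85 µs with burst).

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in Mbit/s: bits transferred divided by elapsed microseconds. */
double mdma_throughput_mbit_s(uint32_t elements, uint32_t bytes_per_element,
                              double elapsed_us)
{
    assert(elapsed_us > 0.0);  /* a zero or negative duration is meaningless */
    double bits = (double)elements * (double)bytes_per_element * 8.0;
    return bits / elapsed_us;  /* bit per microsecond equals Mbit/s */
}
```

`mdma_throughput_mbit_s(4000, 2, 135.0)` gives roughly 474 Mbit/s (about 59 MB/s), and the 85 µs run works out to roughly 753 Mbit/s.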
