
STM32H743 (STM32H7) SDMMC DMA not configurable

caleb
Associate III

Hello,

I'm attempting to use the SDMMC with an SD card, and with DMA. However, when I use the STM32CubeMX IDE, there is no option to enable or use the DMA.

Using: STM32CubeMX version 5.3.0

Here's the configuration for SDMMC1 on the STM32H743

0690X00000AA4V8QAL.png

And here it is for SDMMC1 on the STM32F7

0690X00000AA4VIQA1.png

Notice that there is no DMA configuration option for the H7, but there is for the F7.

Does this mean that ST doesn't have DMA based drivers on the H7? With the polling drivers, my performance is extremely slow -- something like 0.2MB/sec.

Any idea how to get reasonably high write speeds on the SDMMC interface on the H7?

BTW, I'm using a custom board without the transceiver.

Thanks,

-Caleb


>> With the polling drivers, my performance is extremely slow -- something like 0.2MB/sec.

You're doing something very wrong then, as the polled can hit basically the same numbers as DMA, except the DMA has less bus/interrupt loading issues.

POLLED, 16GB SanDisk Ultra PLUS, no transceiver, no DDR

    HCLK=200000000, 200.00 MHz
    APB1=100000000, 100.00 MHz
    APB2=100000000, 100.00 MHz
    PLL1_Q_CK=200000000, 200.00 MHz
    PLL2_R_CK= 48000000, 48.00 MHz
    SDMMC1_CK     2, 50.00 MHz
    CRC32 A534026F Memory Image
    32768000 Bytes, 1715150723 Cycles
     7.64 MBps Write (FatFs)
    4288 ms run time
     7.64 MBps (Sanity Check)
    20008E00 20008E00 2048 COUNTER.001
    ................................
    CRC32 A534026F PKZIP 5ACBFD90 COUNTER.001
    COUNTER.001
    32768000 Bytes, 677958670 Cycles
    19.33 MBps Read (FatFs)
    1695 ms run time
    19.33 MBps (Sanity Check)
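The MBps figures in the log above can be cross-checked from the byte and cycle counts. This is a minimal sketch, assuming the cycle counter runs at a 400 MHz core clock (not stated in this log, but consistent with its millisecond run times) and that MBps means 10^6 bytes per second. `mbps_from_cycles` is a made-up helper name, not something from the benchmark code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: throughput in MB/s (1 MB = 1e6 bytes) from a byte
   count and an elapsed cycle count, given the cycle counter frequency. */
static double mbps_from_cycles(uint32_t bytes, uint32_t cycles, double counter_hz)
{
    assert(counter_hz > 0.0);
    assert(cycles > 0u);
    double seconds = (double)cycles / counter_hz;  /* elapsed time */
    return ((double)bytes / seconds) * 1e-6;       /* bytes/s -> MB/s */
}
```

With `counter_hz = 400e6`, the write line (32768000 bytes, 1715150723 cycles) and the read line (32768000 bytes, 677958670 cycles) should land on the 7.64 and 19.33 MBps printed in the log.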

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

Thanks Clive, good to know! I thought something must be very wrong.

Are you on an H7 or an F7, or something else?

My basic setup is this:

  • STM32H743 on a Nucleo 144 pin, jumpered over to a SD breakout board.
  • Using STM32CubeIDE to generate code
    • SYSCLK = 400
    • APB1 = 100
    • APB2 = 100
    • PLL_Q_CK = 50
    • SDMMC_CLK = 50 (running from PLL_Q)
    • actual clock speed at SD card is 12.5MHz
  • Configure SDMMC2 for 4 bits (no DMA because no DMA option seems to be available)
  • Generate code
  • Change USE_SD_TRANSCEIVER to 0 from 1 in stm32h7xx_hal_conf.h
  • Add the following bits to my main.c:
  FATFS fs;
  FIL file;
  check_err(f_mount(&fs, "", 1));
  check_err(f_open(&file, "headworn.txt", FA_WRITE|FA_CREATE_ALWAYS));
  const char msg[] =
		  "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"
		  "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"
		  "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"
		  "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"
		  "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789"
		  "01234567890\n";
  const unsigned bytes_to_write = strlen(msg);
  unsigned bytes_written;
  int i;
  int MEGABYTE = (1024*1024 / sizeof(msg)); /* number of f_write calls for roughly 1 MB */
  for (i = 0;i  < MEGABYTE * 1; i++) {
	  check_err(f_write(&file, msg, bytes_to_write, &bytes_written));
	  check_err(!(bytes_written == bytes_to_write)); /* check every write, not just the last one */
  }
//  HAL_GPIO_WritePin(RED_GPIO_Port, RED_Pin, 1);
  check_err(f_close(&file));
  check_err(f_mount(0, "", 0));

This all runs okay and actually writes to the SD card, but is mind-bogglingly slow at 0.2 MB/sec.

If I plug the same card into a USB SD reader and write files from my PC, it runs at about 8MB/sec.

Any idea what I'm missing? Any chance you can share your code that gets performance numbers like this?

An interesting note is that it doesn't seem to matter what my SD clock speed is. I can slow it way down, and it still takes the same amount of time to run. This suggests to me that it's waiting somewhere on responses from the SD card itself.

Thanks again Clive for all your incredible help on this forum!

Cheers

-Caleb

You need to write in multiples of the sector/cluster size; writing random, short lengths via f_write will be brutal. I'm writing 32KB blocks. The erase block size on these cards is probably pushing 128KB. You need to employ better buffering.
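One way to get that buffering is to collect the small records in RAM and only hand full 32KB blocks to the file system. The following is a minimal sketch of that idea, not the code used for the numbers in this thread. The names `block_writer`, `bw_write`, `bw_flush` and the sink callback are all made up for illustration; on the target the sink would be a thin wrapper around `f_write`, while `counting_sink` is a host-only stand-in so the logic can be tried on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES (32u * 1024u)  /* 32KB, a multiple of the 512-byte sector */

/* Hypothetical sink: returns 0 on success. On target it would wrap f_write(). */
typedef int (*block_sink_fn)(void *ctx, const uint8_t *data, size_t len);

typedef struct {
    uint8_t       buf[BLOCK_BYTES];
    size_t        used;   /* bytes currently held in buf */
    block_sink_fn sink;
    void         *ctx;
} block_writer;

static void bw_init(block_writer *bw, block_sink_fn sink, void *ctx)
{
    bw->used = 0;
    bw->sink = sink;
    bw->ctx  = ctx;
}

/* Append len bytes; the sink is only ever called with full BLOCK_BYTES chunks. */
static int bw_write(block_writer *bw, const void *data, size_t len)
{
    const uint8_t *p = data;
    while (len > 0) {
        size_t room = BLOCK_BYTES - bw->used;
        size_t n = (len < room) ? len : room;
        memcpy(bw->buf + bw->used, p, n);
        bw->used += n;
        p += n;
        len -= n;
        if (bw->used == BLOCK_BYTES) {
            int rc = bw->sink(bw->ctx, bw->buf, BLOCK_BYTES);
            if (rc != 0) return rc;  /* give up on the first sink error */
            bw->used = 0;
        }
    }
    return 0;
}

/* Push out the final partial block, if any, before closing the file. */
static int bw_flush(block_writer *bw)
{
    int rc = 0;
    if (bw->used > 0) {
        rc = bw->sink(bw->ctx, bw->buf, bw->used);
        bw->used = 0;
    }
    return rc;
}

/* Host-only stand-in sink: just counts calls and bytes. */
typedef struct { unsigned calls; size_t bytes; } sink_stats;

static int counting_sink(void *ctx, const uint8_t *data, size_t len)
{
    sink_stats *st = ctx;
    (void)data;
    st->calls++;
    st->bytes += len;
    return 0;
}
```

The `block_writer` holds a 32KB buffer, so it is better placed in static storage than on a small task stack. Whether 32KB is the best block size for a given card is something to measure rather than assume.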

There is likely other code in the library dropping the clock setting further.

Y-Step STM32H743, 50 MHz wire clock

    Core=400000000, 400 MHz
    CPUID 411FC271 DEVID 450 REVID 1003
    Cortex M7 r1p1
    STM32H7xx
    C0000018 2000BCC8 00000000
    10110221 12000011 00000040
    FPU-D Single-precision and Double-precision

I'm not using CubeMX, custom board, ported my NUCLEO-H7 SDMMC BSP over

Decoder Wheel for clock sources, run after test

    {
     /* CLKDIV is bits [9:0] of CLKCR */
     uint32_t ck = SDMMC1->CLKCR & 0x3FF;
     /* SDMMCSEL, bit 16 of D1CCIPR: 0 = pll1_q_ck, 1 = pll2_r_ck */
     uint32_t sdmmcsel = RCC->D1CCIPR & (1 << 16);
     PLL1_ClocksTypeDef PLL1_Clocks;
     PLL2_ClocksTypeDef PLL2_Clocks;
     uint32_t sdmmc_ker_ck;
     HAL_RCCEx_GetPLL1ClockFreq(&PLL1_Clocks);
     HAL_RCCEx_GetPLL2ClockFreq(&PLL2_Clocks);
     printf("PLL1_Q_CK=%9lu, %6.2lf MHz\n", (unsigned long)PLL1_Clocks.PLL1_Q_Frequency, (double)PLL1_Clocks.PLL1_Q_Frequency*1e-6);
     printf("PLL2_R_CK=%9lu, %6.2lf MHz\n", (unsigned long)PLL2_Clocks.PLL2_R_Frequency, (double)PLL2_Clocks.PLL2_R_Frequency*1e-6);
     sdmmc_ker_ck = (sdmmcsel ? PLL2_Clocks.PLL2_R_Frequency : PLL1_Clocks.PLL1_Q_Frequency);
     /* SDMMC_CK = sdmmc_ker_ck / (2 * CLKDIV); CLKDIV = 0 bypasses the divider */
     printf("SDMMC1_CK %9lu, %6.2lf MHz\n", (unsigned long)ck, ((double)sdmmc_ker_ck * 1e-6) / (ck ? (2.0 * (double)ck) : 1.0));
    } // sourcer32@gmail.com

Danish1
Lead II

My reading of the stm32h7 reference manual suggests that the SDMMC module has its own built-in DMA controller IDMA.

You can't use DMA1 or DMA2 so there's a lot less to configure.

(I think you can use the MDMA for SDMMC DMA).

Now how much of this is brought out to stm32Cube is down to ST. You might have to write your own code to use the SDMMC IDMA.

(My main use of STM32Cube is to explore how I might distribute my required peripherals amongst the ones present on a microcontroller, as there are a lot of shared pins and DMA channels/streams. I tend not to use its code-generation ability.)

Hope this helps,

Danish

Hi. "The SDMMC host interface embeds a dedicated DMA controller allowing high-speed transfers between the interface and the SRAM." - from STM32H743 datasheet.

You can use the HAL_SD_ReadBlocks_DMA and HAL_SD_WriteBlocks_DMA functions (they only start the DMA transfer), and you must then wait for the transfer to complete. Check HAL_SD_TxCpltCallback or HAL_SD_RxCpltCallback.
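The "start, then wait for the callback" pattern can be sketched as a flag that the completion callback sets and a bounded wait loop. This is only an illustration of the waiting logic, written so it builds on a PC. `sd_write_blocking`, `on_sd_tx_complete`, the two function-pointer hooks and the `fake_*` helpers are made-up names; on the target the start hook would call `HAL_SD_WriteBlocks_DMA`, the time hook would call `HAL_GetTick`, and `on_sd_tx_complete` would be the body of `HAL_SD_TxCpltCallback`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Set from the completion callback, cleared before each transfer. */
static volatile bool sd_tx_done;

/* On target this would be the body of HAL_SD_TxCpltCallback(). */
static void on_sd_tx_complete(void)
{
    sd_tx_done = true;
}

/* Hypothetical hooks so the sketch builds off-target:
   start_write_fn would call HAL_SD_WriteBlocks_DMA() and return 0 on HAL_OK,
   now_ms_fn would call HAL_GetTick(). */
typedef int      (*start_write_fn)(const uint8_t *data, uint32_t block, uint32_t count);
typedef uint32_t (*now_ms_fn)(void);

/* Returns 0 on completion, -1 if the start call failed, -2 on timeout. */
static int sd_write_blocking(start_write_fn start, now_ms_fn now_ms,
                             const uint8_t *data, uint32_t block, uint32_t count,
                             uint32_t timeout_ms)
{
    sd_tx_done = false;  /* clear before starting, so a fast completion is not lost */
    if (start(data, block, count) != 0) {
        return -1;
    }
    uint32_t t0 = now_ms();
    while (!sd_tx_done) {
        if ((uint32_t)(now_ms() - t0) > timeout_ms) {
            return -2;
        }
    }
    return 0;
}

/* Host-only stand-ins for trying the wait logic without hardware. */
static uint32_t fake_tick;
static uint32_t fake_done_at;  /* tick at which the simulated interrupt fires */

static int fake_start(const uint8_t *data, uint32_t block, uint32_t count)
{
    (void)data; (void)block; (void)count;
    return 0;
}

static uint32_t fake_now_ms(void)
{
    if (++fake_tick == fake_done_at) {
        on_sd_tx_complete();
    }
    return fake_tick;
}
```

On real hardware the card may still be programming after the DMA completes, so code built on the HAL commonly also polls `HAL_SD_GetCardState` until it reports `HAL_SD_CARD_TRANSFER` before issuing the next command. The buffer passed to the DMA call also has to sit in memory the SDMMC's internal DMA can reach and be kept coherent with the data cache; check the reference manual for your part.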

caleb
Associate III

Ah! Writing much bigger blocks fixes the problem largely. Thanks for the tip! Now I'm getting over 4 megabytes/second write speed at 12.5MHz clocking. My voltage level translator probably can't keep up with 50 MHz clocking, so it doesn't work that fast. But 4 MBPS is good enough for this project.

The clock throttling (getting a 12.5 MHz clock when expecting 50 MHz) is happening because the SDMMC clock divide factor set in CubeMX seems to be completely ignored by the driver code generated by CubeMX/IDE. My SDMMC clock was 50 MHz, and the clock computations all assume a 200 MHz clock.

stm32h7xx_ll_sdmmc.h has the following defines which are the actual divide factors that get used.

    /* SDMMC Initialization Frequency (400KHz max) for Peripheral CLK 200MHz*/
    #define SDMMC_INIT_CLK_DIV ((uint8_t)0xFA)
    /* SDMMC Default Speed Frequency (25Mhz max) for Peripheral CLK 200MHz*/
    #define SDMMC_NSpeed_CLK_DIV ((uint8_t)0x4)
    /* SDMMC High Speed Frequency (50Mhz max) for Peripheral CLK 200MHz*/
    #define SDMMC_HSpeed_CLK_DIV ((uint8_t)0x2)

Thanks again Clive, and everybody else with helpful answers.

Also, critically, I needed to hand-modify USE_SD_TRANSCEIVER in the generated code. This is kind of maddening. Is there any better library than the CubeMX library? For free or for pay?

Thanks,

-Caleb

Caleb

STM32CubeMX 5.3.0 has the option under SDMMC "Parameter Settings"

0690X00000AADhAQAX.png

The library throttles the clock based on the card technology. It assumes a 200-240 MHz PLL source and divides by eight (ClockDiv = 4, times 2) for SDHC cards to get 25 MHz.

I just localize a copy of the SD/MMC library and adjust. All microSD should be rated to 50 MHz, and the library shouldn't make arbitrary decisions about the source PLL and frequency. Hence the decoder to see what cards it actually dealt you. If it is using a 50 MHz Q clock, or 48 MHz, your wire clock is likely to be closer to 6.25 MHz, not 12.5 MHz

200/8 = 25 MHz for ClockDiv = 4

200/4 = 50 MHz for ClockDiv = 2
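The same arithmetic can be written as a small helper, which makes it easy to see what a 50 MHz kernel clock turns into. This is a sketch with a made-up function name. The `2 * ClockDiv` ratio comes from the two lines above, and the "0 means no division" case follows the CLKDIV description in the reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* SDMMC_CK in Hz for a given kernel clock and CLKDIV field value.
   CLKDIV = 0 bypasses the divider; otherwise the divide ratio is 2 * CLKDIV. */
static uint32_t sdmmc_wire_clock_hz(uint32_t ker_ck_hz, uint32_t clkdiv)
{
    assert(clkdiv <= 0x3FFu);  /* CLKDIV is a 10-bit field */
    return (clkdiv == 0u) ? ker_ck_hz : ker_ck_hz / (2u * clkdiv);
}
```

For a 200 MHz kernel clock this gives 25 MHz at ClockDiv = 4 and 50 MHz at ClockDiv = 2. For a 50 MHz kernel clock the same dividers give 6.25 MHz and 12.5 MHz, which would match the 12.5 MHz measured earlier in the thread if the library ended up using ClockDiv = 2 for that card.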


Thought the top-post indicated you didn't have a transceiver?

There is unhelpful code that makes assumptions in:

STM32Cube_FW_H7_V1.5.0\Drivers\STM32H7xx_HAL_Driver\Src\stm32h7xx_hal_sd.c

    /* Check if user Clock div < Normal speed 25Mhz, no change in Clockdiv */
    if(hsd->Init.ClockDiv >= SDMMC_NSpeed_CLK_DIV)
    {
      Init.ClockDiv = hsd->Init.ClockDiv;
    }
    else if (hsd->SdCard.CardSpeed == CARD_ULTRA_HIGH_SPEED)
    {
      /* UltraHigh speed SD card,user Clock div */
      Init.ClockDiv = hsd->Init.ClockDiv;
    }
    else if (hsd->SdCard.CardSpeed == CARD_HIGH_SPEED)
    {
      /* High speed SD card, Max Frequency = 50Mhz */
      Init.ClockDiv = SDMMC_HSpeed_CLK_DIV;
    }
    else
    {
      /* No High speed SD card, Max Frequency = 25Mhz */
      Init.ClockDiv = SDMMC_NSpeed_CLK_DIV;
    }
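To see what that decision tree does with different inputs, it can be modelled as a pure function and run on a PC. The sketch below mirrors the excerpt above with stand-in names (`hal_style_clkdiv`, `card_speed`), then adds a hypothetical alternative, `clkdiv_for_limit`, that derives the divider from the kernel clock actually in use instead of assuming 200 MHz. Neither function is part of the ST library.

```c
#include <assert.h>
#include <stdint.h>

#define NSPEED_CLK_DIV 0x4u  /* mirrors SDMMC_NSpeed_CLK_DIV */
#define HSPEED_CLK_DIV 0x2u  /* mirrors SDMMC_HSpeed_CLK_DIV */

/* Stand-in for hsd->SdCard.CardSpeed. */
typedef enum { SPEED_NORMAL, SPEED_HIGH, SPEED_ULTRA_HIGH } card_speed;

/* Same decision tree as the HAL excerpt, as a pure function. */
static uint32_t hal_style_clkdiv(uint32_t user_div, card_speed speed)
{
    if (user_div >= NSPEED_CLK_DIV) return user_div;       /* user asked for a slow clock */
    if (speed == SPEED_ULTRA_HIGH)  return user_div;       /* user value passed through */
    if (speed == SPEED_HIGH)        return HSPEED_CLK_DIV; /* forced to 2 */
    return NSPEED_CLK_DIV;                                 /* forced to 4 */
}

/* Hypothetical alternative: smallest CLKDIV that keeps SDMMC_CK <= max_hz
   for the real kernel clock. CLKDIV = 0 means the divider is bypassed. */
static uint32_t clkdiv_for_limit(uint32_t ker_ck_hz, uint32_t max_hz)
{
    assert(max_hz > 0u);
    if (ker_ck_hz <= max_hz) {
        return 0u;
    }
    uint32_t two_max = 2u * max_hz;
    return (ker_ck_hz + two_max - 1u) / two_max;  /* ceil(ker_ck / (2 * max)) */
}
```

With a user ClockDiv of 0 or 1, a high speed card always gets divider 2 and any other non-UHS card gets divider 4, whatever the kernel clock is. `clkdiv_for_limit(50000000, 50000000)` returns 0, meaning a 50 MHz kernel clock could drive the wire at 50 MHz directly, provided the board and card can take it.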
