
STM32F4 SDIO Wait_R state time

Robertson.Jamie
Associate II
Posted on December 17, 2013 at 18:58

I have an STM32F4 reading audio from a microSD, and am using the Standard Peripheral Library and the eval SDIO_SD.C code as a starting point. Things are working fine, but read rates are important for my application and I'm noticing significant differences in the cards that I've been testing. Using a scope, I've measured the time between sending the SD_ReadBlock() command (CMD17) and the start of the actual DMA transfer (by looking at D0), which I take to be the card's read access time. Not only does this time seem large, but it's substantially larger on a brand new SanDisk Ultra UHS-1 card (350us) than an old 2GB off-brand card (150us).

I successfully ran the SDIO clock at 48MHz, but the actual transfer, once it starts, takes only 42us (at 24MHz), so cutting this in half doesn't help all that much. I guess I don't understand the hoopla over the higher speed class cards if the access time, which appears to be the major limiting factor, is as large as I'm seeing.

Am I missing something?

#stm32f4 #sdio
4 REPLIES
Posted on December 17, 2013 at 19:27

Latency != Throughput, and I don't think I've even bothered measuring the former. Block devices work a lot more efficiently when larger block transfers are used.

128MB SD Reading @ 5.4 MBps, Writing @ 2.1 MBps

16MB SDHC Reading @ 10.6 MBps, Writing @ 7.4 MBps

If I set my clock optimally, I can get a SanDisk "Extreme" (45MBps?) Reading @ 15.8 MBps, Writing @ 9.7 MBps. I should perhaps rack up one of the "Extreme Plus" (80MBps) cards and see what that can do.
Tips, buy me a coffee, or three.. PayPal Venmo Up vote any posts that you find helpful, it shows what's working..
Robertson.Jamie
Associate II
Posted on December 17, 2013 at 19:54

Latency does affect throughput if it occurs on every 512-byte block. I guess what you're saying is that I should be using READ_MULT_BLOCK instead of READ_SINGLE_BLOCK.
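
To put rough numbers on this, here is a minimal back-of-the-envelope sketch in C. It uses the figures measured earlier in the thread (up to 350 us of access latency, 42 us to move one 512-byte block) and assumes, which the thread does not measure, that a multi-block read pays the access latency only once per command. The function name and parameters are made up for illustration.

```c
#include <stdint.h>

/* Effective throughput in bytes per second when each read command pays a
 * fixed access latency once, followed by back-to-back block transfers.
 * ASSUMPTION: latency is paid once per command, not once per block. */
static double effective_bytes_per_sec(double latency_us,
                                      double block_xfer_us,
                                      uint32_t blocks_per_cmd,
                                      uint32_t block_size)
{
    double total_us = latency_us + block_xfer_us * (double)blocks_per_cmd;
    double bytes    = (double)block_size * (double)blocks_per_cmd;
    return bytes / total_us * 1e6;
}
```

Under that assumption, `effective_bytes_per_sec(350.0, 42.0, 1, 512)` comes out at about 1.3 MB/s for single-block reads, while `effective_bytes_per_sec(350.0, 42.0, 64, 512)` (32 KB per command) comes out at about 10.8 MB/s. The real gain depends on how the card behaves between blocks.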

Posted on December 17, 2013 at 21:33

If you access your device a block at a time, the command overhead will simply kill you; this is true of a large swath of block storage devices. Similarly, if you fread()/fwrite() data 32 bytes at a time, the file system and card will kill your performance. A strategy that gets you to a cluster size will help significantly.

The SD cards nominally have an internal 128 KB erase block, and block management. Devices formatted incorrectly will also be significantly slowed.

The numbers I've quoted are for a 32 MB file transfer performed through FATFS using a block size of 32 KB, which is a reasonable sweet-spot between memory usage and device efficiency.
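
As a rough illustration of the chunked access pattern, here is a host-side sketch using standard C `fread()` with a 32 KB buffer. It is not the test code behind the numbers above. The helper name `read_in_chunks` is hypothetical, and on a desktop the C library's own buffering hides most of the cost. On the target, the same loop shape would presumably be built around FatFs `f_read()`.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define CHUNK_BYTES (32u * 1024u)   /* 32 KB per request, as quoted above */

/* Read a stream to the end in CHUNK_BYTES requests.
 * Returns the total bytes read; *calls receives the number of fread()
 * calls that returned data. */
static size_t read_in_chunks(FILE *fp, unsigned *calls)
{
    static uint8_t buf[CHUNK_BYTES];
    size_t total = 0;
    size_t n;

    *calls = 0;
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        ++*calls;
        /* process buf[0..n) here, e.g. hand it to the audio pipeline */
        total += n;
    }
    return total;
}
```

A 100,000-byte file is consumed in four requests this way (three full 32 KB chunks plus a short tail), versus 3,125 requests at 32 bytes each.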

I tried the "Extreme Plus" and was seeing 15.7 MBps / 9.7 MBps for that. An older 2GB SanDisk (No Class marked SD) saw 13.7 / 1.9 MBps. I have a full-sized Walgreens 2GB special, but I would need to put it on another non-F4 system; I'm pretty sure it will be slower.
Robertson.Jamie
Associate II
Posted on December 17, 2013 at 23:20

I've written my own optimized, read-only FAT file system and therefore bypass fread() entirely. SD block size is transferred directly into an audio buffer, so I'm only getting hit with latency in front of each SD read operation. What you point out makes perfect sense - I need to restructure my code to read in larger blocks. As it is, I'm waiting up to 350 usecs every time I want to transfer 512 bytes, which itself only takes 42 usecs at normal SDIO speeds.
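
One way the restructuring might look, as a sketch only: keep a window of consecutive blocks in RAM and refill it with a single multi-block read (CMD18) whenever the requested block falls outside it. Everything named here is hypothetical: `read_blocks_fn` stands in for whatever multi-block call the driver provides, and `WINDOW_BLOCKS` is an arbitrary 32 KB choice. `mock_read_blocks` is a host-side stand-in for the card, included only so the logic can be exercised off-target.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE    512u
#define WINDOW_BLOCKS 64u   /* 32 KB window; hypothetical, size to available RAM */

/* Hypothetical driver hook: read `count` consecutive blocks starting at
 * `lba` into `dst` with one multi-block command. Returns 0 on success. */
typedef int (*read_blocks_fn)(uint32_t lba, uint32_t count, uint8_t *dst);

typedef struct {
    read_blocks_fn read_blocks;
    uint32_t first_lba;  /* LBA of the first block held in data[] */
    uint32_t valid;      /* number of valid blocks in data[], 0 = empty */
    uint8_t  data[WINDOW_BLOCKS * BLOCK_SIZE];
} read_window;

static void window_init(read_window *w, read_blocks_fn fn)
{
    w->read_blocks = fn;
    w->first_lba = 0;
    w->valid = 0;
}

/* Return a pointer to the 512-byte block at `lba`, refilling the window
 * with one multi-block read if that block is not already buffered.
 * Returns NULL on driver error. Does not clamp at the end of the card. */
static const uint8_t *window_get(read_window *w, uint32_t lba)
{
    if (w->valid == 0 || lba < w->first_lba ||
        lba - w->first_lba >= w->valid) {
        if (w->read_blocks(lba, WINDOW_BLOCKS, w->data) != 0) {
            w->valid = 0;
            return NULL;
        }
        w->first_lba = lba;
        w->valid = WINDOW_BLOCKS;
    }
    return &w->data[(size_t)(lba - w->first_lba) * BLOCK_SIZE];
}

/* Host-side stand-in for the card: counts commands issued and fills each
 * block with the low byte of its LBA. */
static uint32_t mock_commands;

static int mock_read_blocks(uint32_t lba, uint32_t count, uint8_t *dst)
{
    ++mock_commands;
    for (uint32_t i = 0; i < count; ++i)
        memset(dst + (size_t)i * BLOCK_SIZE, (int)((lba + i) & 0xFFu), BLOCK_SIZE);
    return 0;
}
```

With this layout, reading 128 sequential blocks issues two read commands instead of 128, so the access latency is paid twice rather than on every block. For audio streaming, a double-buffered variant that refills one half while the other plays would be the natural next step.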

I'm still surprised that newer cards appear to have longer read wait times than older cards (admittedly based on a very small sampling.)