STM3210B-EVAL Mass storage - SD card example is too slow

brunoferren9
Associate II
Posted on March 13, 2018 at 13:19

I am currently using the STM3210B-EVAL Mass Storage example, which allows a PC to access the SD card through a USB Mass Storage connection. The SD card is accessed through SPI. I'm using a 2 GB microSD card, and I can successfully read and write files on it from Windows. So far so good.

However, I measure very poor write performance, something like 80 kB/s (writing a 10 MB file takes approx. 2 minutes). I measured write speeds of up to 3 MB/s on the same microSD card in a card reader, so the card itself does not seem to be the bottleneck.

Actually, I think I found where most of the time is lost, but I fail to see why, or how to correct it. Here is how it goes:

For each 512-byte buffer to write, a write command (CMD24) is sent, followed by the 512 bytes of data. I modified the code to use DMA here, so no time is wasted. With a 12 MHz SPI clock, this takes approx. 340 µs (512 bytes x 8 bits at 12 MHz). The SD card then sends the 'data ok' ack (0xE5 on the wire, 0x05 once masked), and then sends zeros (the MISO line is low) while it writes the data. And there is the issue: this takes something like 2 to 4 ms! Then the data line goes high again, indicating the end of the write process, and the next buffer is sent.

There is a little time lost elsewhere, but even if there were not, counting just these 2-4ms for each 512 bytes, I can't expect more than approx 150kB/s of writing speed. This is way too low.
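As a rough check of these figures, here is a minimal sketch of the arithmetic. The function names are illustrative (they are not part of the example code), 1 kB is taken as 1000 bytes, and the model assumes back-to-back SPI bytes with nothing else in the loop.

```c
/* Time in microseconds to clock `bytes` over SPI at `spi_hz`
 * (8 bits per byte, no gaps between bytes: the best case with DMA). */
static double spi_transfer_us(unsigned bytes, double spi_hz)
{
    return (bytes * 8.0) / spi_hz * 1e6;
}

/* Upper bound on write throughput in kB/s when every block costs
 * one SPI transfer plus one card busy period of `busy_ms`. */
static double write_ceiling_kBps(unsigned block_bytes, double spi_hz,
                                 double busy_ms)
{
    double per_block_s = spi_transfer_us(block_bytes, spi_hz) / 1e6
                       + busy_ms / 1e3;
    return (block_bytes / per_block_s) / 1000.0;
}
```

Calling `spi_transfer_us(512, 12e6)` gives about 341 µs, and `write_ceiling_kBps(512, 12e6, 3.0)` gives roughly 153 kB/s, which matches the estimate above for a busy time in the middle of the 2 to 4 ms range.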

I noticed the example code is really simple, with only a few commands used (CMD17 and CMD24 for read and write, CMD9, CMD10 and CMD13 for info and status, CMD0 and CMD1 for startup, and that's it).

I'm not looking for maximum performance. I know the SPI bus will limit the top speed, but what I see here does not seem to be linked to the limitations of the SPI bus at all; it comes from the time the SD card takes to write a buffer.

How would you solve this?

9 REPLIES
Posted on March 13, 2018 at 15:09

Single-sector accesses have a very high command-processing overhead; you need to do multi-sector IO.

Performance will be brutal at the f_write() level if you do a lot of small writes. You should manage your buffering to write blocks which are aligned to the sector/cluster boundaries. A buffer of 32KB will be near optimal.
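The buffering advice can be sketched as a small accumulator that hands data on only in full 32 KB chunks. Everything here is hypothetical naming: `wbuf_t`, `wbuf_write()` and the sink callback are assumptions for illustration, and `counting_sink()` is just a stand-in for a real sink that would wrap a call such as `f_write()`. The chunks stay aligned to sector/cluster boundaries only if the file offset is aligned when buffering starts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WBUF_SIZE (32u * 1024u)   /* 64 sectors of 512 bytes */

/* Hypothetical sink: writes `len` bytes to storage, returns 0 on success. */
typedef int (*wbuf_sink_fn)(const uint8_t *data, size_t len, void *ctx);

typedef struct {
    uint8_t      buf[WBUF_SIZE];
    size_t       used;
    wbuf_sink_fn sink;
    void        *ctx;
} wbuf_t;

static void wbuf_init(wbuf_t *wb, wbuf_sink_fn sink, void *ctx)
{
    wb->used = 0;
    wb->sink = sink;
    wb->ctx  = ctx;
}

/* Push out whatever is buffered (call once more before closing the file). */
static int wbuf_flush(wbuf_t *wb)
{
    int rc = 0;
    if (wb->used > 0) {
        rc = wb->sink(wb->buf, wb->used, wb->ctx);
        wb->used = 0;
    }
    return rc;
}

/* Append data; from here the sink only ever sees full WBUF_SIZE chunks. */
static int wbuf_write(wbuf_t *wb, const uint8_t *data, size_t len)
{
    while (len > 0) {
        size_t room = WBUF_SIZE - wb->used;
        size_t n = (len < room) ? len : room;
        memcpy(wb->buf + wb->used, data, n);
        wb->used += n;
        data += n;
        len  -= n;
        if (wb->used == WBUF_SIZE) {
            int rc = wbuf_flush(wb);
            if (rc != 0) return rc;
        }
    }
    return 0;
}

/* Stand-in sink that only counts calls and bytes. */
typedef struct { unsigned calls; size_t bytes; } sink_stats_t;

static int counting_sink(const uint8_t *data, size_t len, void *ctx)
{
    sink_stats_t *st = (sink_stats_t *)ctx;
    (void)data;
    st->calls++;
    st->bytes += len;
    return 0;
}
```

With this in place, 400 writes of 100 bytes each reach the sink as one 32768-byte chunk plus a 7232-byte tail on the final `wbuf_flush()`, instead of 400 small writes.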

Tips, buy me a coffee, or three.. PayPal Venmo Up vote any posts that you find helpful, it shows what's working..
Posted on March 13, 2018 at 15:41

You mean using the CMD25 (WRITE_MULTIPLE_BLOCK) command? That's indeed what I'm currently investigating.

Posted on March 13, 2018 at 16:28

Yes

https://community.st.com/0D50X00009XkYXdSAN

But honestly, why are you using an STM3210B and SPI? The part is like a decade old. If you must use an F103-series part, pick one with SDIO.

USB in your case is going to put a 700-800 KBps ceiling on things, even in the most optimal implementation.
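For reference, a CMD25 (WRITE_MULTIPLE_BLOCK) transfer in SPI mode can be sketched as below. This is a sketch under stated assumptions, not the code from the linked post or from the ST example: `spi_xfer_fn` is a hypothetical one-byte SPI exchange hook, chip-select handling and real timeouts are left out, and `mock_xfer()` is only a toy card model so the sequence can be exercised on a PC. The token values (0xFC to start each block, 0xFD to stop) and the 0x05 "accepted" data response are the standard SPI-mode ones.

```c
#include <stddef.h>
#include <stdint.h>

#define SD_BLOCK        512u
#define CMD25           25u    /* WRITE_MULTIPLE_BLOCK */
#define TOKEN_MULTI     0xFCu  /* start-of-block token for CMD25 */
#define TOKEN_STOP_TRAN 0xFDu  /* ends the multi-block transfer */

/* Hypothetical SPI hook: shift one byte out, return the byte shifted in. */
typedef uint8_t (*spi_xfer_fn)(uint8_t out, void *ctx);

/* Clock 0xFF until the card releases MISO (reads 0xFF). 0 = ready. */
static int sd_wait_ready(spi_xfer_fn xfer, void *ctx)
{
    for (uint32_t i = 0; i < 100000u; i++)
        if (xfer(0xFF, ctx) == 0xFF) return 0;
    return -1;
}

/* Write `count` consecutive blocks starting at `addr` (byte address on
 * standard-capacity cards, block number on SDHC). */
static int sd_write_multi(spi_xfer_fn xfer, void *ctx, uint32_t addr,
                          const uint8_t *data, uint32_t count)
{
    uint8_t r1 = 0xFF;

    xfer(0x40u | CMD25, ctx);
    for (int shift = 24; shift >= 0; shift -= 8)
        xfer((uint8_t)(addr >> shift), ctx);
    xfer(0xFF, ctx);                  /* CRC not checked in SPI mode by default */
    for (int i = 0; i < 8 && (r1 & 0x80u); i++)
        r1 = xfer(0xFF, ctx);         /* R1 arrives within 8 bytes */
    if (r1 != 0x00) return -1;

    for (uint32_t b = 0; b < count; b++) {
        xfer(0xFF, ctx);              /* one byte gap before the token */
        xfer(TOKEN_MULTI, ctx);
        for (uint32_t i = 0; i < SD_BLOCK; i++)
            xfer(data[b * SD_BLOCK + i], ctx);
        xfer(0xFF, ctx);              /* dummy CRC16 */
        xfer(0xFF, ctx);
        if ((xfer(0xFF, ctx) & 0x1Fu) != 0x05u) return -2;  /* not accepted */
        if (sd_wait_ready(xfer, ctx) != 0) return -3;       /* busy timeout */
    }

    xfer(TOKEN_STOP_TRAN, ctx);
    xfer(0xFF, ctx);                  /* skip one byte before polling busy */
    return sd_wait_ready(xfer, ctx);
}

/* Toy card model, only for exercising the sketch off-target. */
typedef struct { int state; uint32_t left; uint32_t blocks; } mock_card_t;

static uint8_t mock_xfer(uint8_t out, void *ctx)
{
    mock_card_t *c = (mock_card_t *)ctx;
    switch (c->state) {
    case 0:                           /* 6-byte command frame */
        if (++c->left == 6) c->state = 1;
        return 0xFF;
    case 1:                           /* R1 = 0x00 */
        c->state = 2;
        return 0x00;
    case 2:                           /* waiting for a token */
        if (out == TOKEN_MULTI) { c->state = 3; c->left = SD_BLOCK + 2; }
        else if (out == TOKEN_STOP_TRAN) { c->state = 5; c->left = 3; }
        return 0xFF;
    case 3:                           /* data bytes + CRC16 */
        if (--c->left == 0) c->state = 4;
        return 0xFF;
    case 4:                           /* data response: accepted */
        c->blocks++; c->state = 5; c->left = 2;
        return 0xE5;
    default:                          /* busy: MISO held low for a few bytes */
        if (c->left > 0) { c->left--; return 0x00; }
        c->state = 2;
        return 0xFF;
    }
}
```

On real hardware the hook would wrap the SPI peripheral (or a DMA transfer for the 512 data bytes), and the busy polling would need a time-based timeout rather than a fixed loop count.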

Posted on March 13, 2018 at 17:13

I'm actually using an STM32F070, with the STM32F0xx_StdPeriph_Driver and STM32_USB-FS-Device_Driver libraries (I migrated the USB library to make it work on an F070). That's because I migrated code that was initially developed on an F103, and at the time, changing the StdPeriph library and modifying the USB one (almost) without changing my user code seemed easier than starting to use Cube and the newer HAL libraries.

So now it's easier for me to start from an example that uses these older libraries, and since I have an old STM3210B eval board lying around, and several years of experience on the F103, using this example code seemed like the simplest way. And it was actually quite easy to make it work on the F070.

But I'm just realizing now that the example code is really not optimized to begin with. I also noticed it doesn't support microSD HC cards, which will eventually be a problem. So now I have to modify it, or start all over with CubeF0. Not an easy choice.

I tested some simple code using CMD25 and it does indeed give better results. Now I'm trying to figure out how to modify the mass storage example to use it properly. If I can get a 500 KBps throughput I'll be happy with it.

Posted on March 13, 2018 at 17:32

Seem to recall there being a descriptor on the USB side indicating the maximum number of sectors supported in a single request.

The newer stm32_adafruit_sd.c code supports the commands used by SDHC/XC cards; testing here goes up to 200 GB cards.
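One part of SDHC/XC support that directly affects the read and write commands is addressing: standard-capacity cards take a byte address as the command argument, while high-capacity cards take a 512-byte block number, as indicated by the CCS bit (bit 30) of the OCR register. A minimal sketch of that distinction follows; the helper names are hypothetical and are not taken from stm32_adafruit_sd.c. Full support also needs a different initialization sequence (CMD8 and ACMD41 rather than only CMD0 and CMD1), which is not shown.

```c
#include <stdint.h>

#define OCR_CCS (1ul << 30)   /* Card Capacity Status bit in the OCR */

/* Non-zero if the card is block-addressed (SDHC/SDXC). */
static int sd_is_block_addressed(uint32_t ocr)
{
    return (ocr & OCR_CCS) != 0;
}

/* Command argument for a read/write at logical block `lba`. */
static uint32_t sd_cmd_arg_for_lba(uint32_t lba, int block_addressed)
{
    return block_addressed ? lba : lba * 512u;
}
```

For example, logical block 8 becomes argument 4096 on a standard-capacity card and stays 8 on an SDHC card.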

Posted on March 13, 2018 at 17:49

OK, I'll have to dig deeper, but I think I'm on the right track. Thanks for the help!

Posted on March 15, 2018 at 15:57

Hi again,

I found the HC support in the adafruit file, this seems usable. However, this same file still uses only the single block write option (CMD24).

Is there a code example out there which implements CMD25?

Posted on March 15, 2018 at 17:24

I have a patched-up version running in the lab. With a 25 MHz SPI clock on an L4 it yields 1.13 MBps, vs 0.16 MBps using the stock code. It currently also sends ACMD23 to pre-erase the blocks being sent.
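ACMD23 (SET_WR_BLK_ERASE_COUNT) is an application command, so it goes out as CMD55 followed by CMD23, with the number of blocks to pre-erase in the low 23 bits of the argument, just before the CMD25 it applies to. The sketch below only builds the two 6-byte command frames, including the CRC7; it is not the patched code mentioned above, and the function names are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC7 over `len` bytes, polynomial x^7 + x^3 + 1, as used in SD commands. */
static uint8_t sd_crc7(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t d = data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc <<= 1;
            if ((d ^ crc) & 0x80u) crc ^= 0x09u;
            d <<= 1;
        }
    }
    return crc & 0x7Fu;
}

/* Fill a 6-byte frame: start bits + index, 32-bit argument (MSB first),
 * then CRC7 shifted left with the end bit set. */
static void sd_build_cmd(uint8_t cmd, uint32_t arg, uint8_t frame[6])
{
    frame[0] = (uint8_t)(0x40u | (cmd & 0x3Fu));
    frame[1] = (uint8_t)(arg >> 24);
    frame[2] = (uint8_t)(arg >> 16);
    frame[3] = (uint8_t)(arg >> 8);
    frame[4] = (uint8_t)arg;
    frame[5] = (uint8_t)((sd_crc7(frame, 5) << 1) | 0x01u);
}

/* ACMD23 = CMD55 (APP_CMD) followed by CMD23 with the block count. */
static void sd_build_acmd23(uint32_t blocks, uint8_t app_cmd[6],
                            uint8_t set_count[6])
{
    sd_build_cmd(55, 0, app_cmd);
    sd_build_cmd(23, blocks & 0x7FFFFFu, set_count);
}
```

As a sanity check of the CRC routine, `sd_build_cmd(0, 0, frame)` produces the familiar CMD0 frame ending in 0x95. Each of the two frames gets its own R1 response from the card, which the caller would need to check before starting the CMD25 transfer.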

Posted on March 16, 2018 at 10:18

I just successfully tested a modified version of my code that uses CMD25 properly (I had to go up into the BOT layer for that), and the speed goes up to 270 kBps. That is still far from optimal, but usable for my application. I still have to test the ACMD23 command, plus hardware optimizations (my SPI clock is only 12 MHz at the moment).

I think I'm good now, thanks for the help!