MPole
Associate
April 23, 2019
Question

External Loader in ST-LINK works very slowly. 66MB in 25 minutes.


Hello. I have written external loaders for two flash memories: W25N01GV (1 Gbit) and GD25Q127C (128 Mbit). They work correctly but are too slow. For example, programming 66 MB of data takes about 25 minutes, which is horribly long. What can I do to make programming faster (or is it possible at all)?

I've tried one thing: I made the Write function in my loader empty, so it doesn't write anything. Even then, the programming process takes just as long as with the normal Write function. So the problem appears to be in the ST-LINK.

int Write (uint32_t Address, uint32_t Size, uint16_t * Buffer)
{
    //sFLASH_WriteBuffer((uint8_t *)Buffer, Address-START_ADDR, Size);
    return 1;
}


7 replies

Andreas Bolsch
Lead III
April 23, 2019

You're probably using an SWD connection? As far as I know, the ST-Link V2 variants support at most a 4 MHz SWD clock. After subtracting some overhead, this means hardly more than 200 kBytes/s transfer rate. With OpenOCD and an ST-Link V2 at 4 MHz I got about 150 kBytes/s read/write rate to external NOR flash, either via QSPI or bitbanging.
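To put the 4 MHz estimate in perspective, here is a rough back-of-the-envelope sketch of the raw SWD ceiling. It assumes about 46 SWCLK cycles per 32-bit transfer (8 request, 3 ACK, 32 data, 1 parity, 2 turnaround); the function name and the constant are illustrative and not part of any ST tool.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cost of one 32-bit SWD transfer in SWCLK cycles:
 * 8 request + 3 ACK + 32 data + 1 parity + 2 turnaround. */
#define SWD_CYCLES_PER_WORD 46u

/* Upper bound on payload rate in bytes/s for a given SWCLK in Hz.
 * Ignores USB latency, address setup, WAIT responses and idle cycles,
 * so real-world rates will be noticeably lower. */
uint32_t swd_peak_bytes_per_sec(uint32_t swclk_hz)
{
    assert(swclk_hz > 0);
    return (uint32_t)(((uint64_t)swclk_hz * 4u) / SWD_CYCLES_PER_WORD);
}
```

Under this assumption a 4 MHz clock gives a ceiling of roughly 348 kBytes/s, so 150 to 200 kBytes/s after host and protocol overhead looks plausible.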

With ST-Link V3 and 24 MHz SWD clock, read up to 600 kBytes/s, write up to 450 kBytes/s.

Could still be improved, e. g. F76x datasheet allows SWD clock up to 80 MHz, but where is a corresponding debug adapter ... For JTAG the upper limit is even lower.

So, there is definitely much room for improvement in your setup. I don't know how the external loader interface in STLink-Utility/STM32CubeProgrammer is implemented in detail (that's closed source, I'm afraid; BTW, which one do you use?), but for reasonable speed some sort of "dual-port buffer" (i.e. the buffer is filled via debug access and drained by the CPU simultaneously) or multiple buffers are necessary. Maybe there is only a "ping-pong" style transfer implemented?
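As a rough illustration of why buffering matters, the sketch below models total programming time with one buffer (transfer and flash programming strictly alternate) versus two buffers (the next chunk is transferred while the previous one is programmed). It is a simplified timing model with hypothetical names and says nothing about how the ST tools actually work.

```c
#include <assert.h>
#include <stdint.h>

/* Single buffer: host fills it, then the target programs it, in turn. */
uint64_t single_buffer_ms(uint32_t chunks, uint32_t xfer_ms, uint32_t prog_ms)
{
    return (uint64_t)chunks * ((uint64_t)xfer_ms + prog_ms);
}

/* Two buffers: while the target programs chunk i, the host fills
 * chunk i+1, so the steady-state period is the slower of the two steps. */
uint64_t double_buffer_ms(uint32_t chunks, uint32_t xfer_ms, uint32_t prog_ms)
{
    assert(chunks > 0);
    uint32_t slower = xfer_ms > prog_ms ? xfer_ms : prog_ms;
    return (uint64_t)xfer_ms + (uint64_t)(chunks - 1u) * slower + prog_ms;
}
```

For example, with 100 chunks, 20 ms per transfer and 30 ms per program step, the single-buffer model gives 5000 ms and the two-buffer model 3020 ms. Overlap can at best hide the faster step; it cannot beat the slower one.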

However, for really big NAND flashes, the approach via SWD is not feasible. Think about USB-HS, Ethernet or removable SD-card.

Tesla DeLorean
Guru
April 23, 2019

SD Cards get chewed up pretty quick. With a code-load dongle one could export SDIO or QSPI to a pin header/pads and make a super-cheap chip-on-a-board dongle to program a system.

Ethernet works well if you have it: stream or broadcast the content from a server. This can be done as part of the final/functional test stage.

Tips, Buy me a coffee, or three.. PayPal Venmo (See Profile) Up vote any posts that you find helpful, it shows what's working..
MPoleAuthor
Associate
April 23, 2019

I'm using the ST-LINK Utility. After your response I installed STM32CubeProgrammer and tried downloading my file to the 1 Gbit memory. It is faster, but programming still takes a long time: 40 MB took 13 minutes in the ST-LINK Utility and now takes 6 minutes in STM32CubeProgrammer, so the speed is more than 2x. Still nothing special.
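For comparison with the SWD rates quoted earlier in the thread, the figures above can be converted into average throughput. This is a minimal sketch with a hypothetical helper, assuming decimal units (1 MB = 1,000,000 bytes, 1 kB = 1,000 bytes) and rounding down.

```c
#include <assert.h>
#include <stdint.h>

/* Average rate in kB/s for a transfer of `megabytes` in `minutes`.
 * Decimal units, integer division (rounds down). */
uint32_t avg_kbytes_per_sec(uint32_t megabytes, uint32_t minutes)
{
    assert(minutes > 0);
    return (uint32_t)(((uint64_t)megabytes * 1000u) /
                      ((uint64_t)minutes * 60u));
}
```

With these assumptions, 66 MB in 25 minutes is about 44 kB/s, 40 MB in 13 minutes is about 51 kB/s, and 40 MB in 6 minutes is about 111 kB/s. That is still below the roughly 150 kBytes/s reported for OpenOCD with an ST-Link V2 at 4 MHz.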

Uwe Bonnes
Chief
April 23, 2019

Perhaps get an STLINK-V3, where the SWD clock can run faster. Or think about another way of programming.

Tesla DeLorean
Guru
April 23, 2019

It looked to be 15 MHz on the on-board V3 I most recently looked at.

The V3 driver needs work; there are a lot of kernel-level driver lockup issues with the VCP.

MPoleAuthor
Associate
April 23, 2019

Yeah. I'm thinking about STM32CubeProgrammer plus an ST-LINK with SWD at 24 MHz. Thanks for the help!

Tesla DeLorean
Guru
April 23, 2019

Do you have any other interfaces on the board you can pull data via? Ethernet, flash drive, QSPI loader dongle?

AVI-crak
Senior
April 23, 2019

The theoretical maximum speed of the SWD interface is half the system bus frequency of the target chip. All STM chips have a lower system bus limit of 8 MHz in cold-start mode.

But the system bus speed can be increased after setting up the PLL.

Will the SWD interface work at frequencies above 50 MHz?

My suggestion: fill the external QSPI memory from an SD card. All other options have either a high price tag, or low speed, or are difficult to implement.

Writing a 256 Mbit QSPI flash from an external SD card takes 15 seconds.

Uwe Bonnes
Chief
April 23, 2019

AVI-crak: Can you point to an original source for "The possible theoretical speed of the SWD interface is half the frequency of the system bus"?

AVI-crak
Senior
April 23, 2019

I was able to find a limit, but no confirmation that it works at high speed.

The SWD/JTAG interface has two clock domains. External communication uses the programmer's clock; internal communication uses the system core bus clock of the ARM processor. The two clock domains are connected through a synchronizing latch, which is clocked from the system bus when writing and from the external programmer when reading. The latch response time is not uniform and may take one whole wait cycle.

In addition, synchronization works at the level of individual bits of information, not whole blocks!

To limit the number of errors, the external frequency should be no more than 1/2 of the internal one.

For learning and experiments, see https://github.com/texane/stlink

Uwe Bonnes
Chief
April 23, 2019

Well, I do not count texane or the rantings in OpenOCD as reference for the Fswd < Fclk/2 myth.

Look at the contributor list of texane ;)

By the way, have a look at https://github.com/UweBonnes/bl*ckm*gic/tree/stlinkv2 for bl*ckm*gic running on the PC with the original STLINK firmware.