
External Loader in ST-LINK works very slowly. 66MB in 25 minutes.

MPole
Associate II

Hello. I have written external loaders for two flash memories: W25N01GV (1 Gb) and GD25Q127C (128 Mb). They work correctly but are too slow. For example, programming 66 MB of data takes about 25 minutes! That's horribly long. What can I do to make programming faster (or is it possible at all)?

I've tried one thing: I made the Write function in my loader empty, so it doesn't write anything. Even then, the programming process takes just as long as with the normal Write function. So the problem is in the ST-Link.

int Write (uint32_t Address, uint32_t Size, uint16_t * Buffer)
{
    /* Actual flash write disabled for the timing test */
    //sFLASH_WriteBuffer((uint8_t *)Buffer, Address-START_ADDR, Size);
    return 1;
}

12 REPLIES
Andreas Bolsch
Lead II

You're probably using an SWD connection? As far as I know, ST-Link V2 variants support at most a 4 MHz SWD clock. After subtracting some overhead, this means hardly more than 200 kByte/s transfer rate. With OpenOCD and ST-Link V2 at 4 MHz I got about 150 kByte/s read/write rate to external NOR flash, either via QSPI or bit-banging.

With ST-Link V3 and a 24 MHz SWD clock: read up to 600 kByte/s, write up to 450 kByte/s.
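To put the "hardly more than 200 kByte/s" estimate in context, here is a rough upper bound on SWD payload throughput. It is only a sketch: the 46 clocks per 32-bit word (8 request bits, 3 ACK bits, 32 data bits, 1 parity bit, 2 turnaround cycles) is my assumption about the framing, and it ignores idle cycles, address setup, USB latency and adapter firmware overhead. The function name is made up for illustration.

```c
#include <stdint.h>

/* Assumed SWD framing for one 32-bit transfer: 8 request bits, 3 ACK bits,
 * 32 data bits, 1 parity bit and 2 turnaround cycles (an assumption, see text). */
enum { SWD_CLOCKS_PER_WORD = 8 + 3 + 32 + 1 + 2 };

/* Upper bound on payload bytes per second for a given SWD clock in Hz.
 * Each frame of SWD_CLOCKS_PER_WORD clocks carries 4 payload bytes. */
uint32_t swd_max_bytes_per_sec(uint32_t swclk_hz)
{
    return (uint32_t)(((uint64_t)swclk_hz * 4u) / SWD_CLOCKS_PER_WORD);
}
```

Under these assumptions, `swd_max_bytes_per_sec(4000000u)` gives roughly 348 kByte/s and `swd_max_bytes_per_sec(24000000u)` roughly 2.1 MByte/s, so the measured 150 and 600 kByte/s figures sit well below the raw wire limit.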

Could still be improved, e. g. F76x datasheet allows SWD clock up to 80 MHz, but where is a corresponding debug adapter ... For JTAG the upper limit is even lower.

So, there is definitely much room for improvement in your setup. I don't know how the external loader interface in STLink-Utility/STM32CubeProgrammer is implemented in detail (that's closed source, I'm afraid; BTW, which one do you use?), but for reasonable speed some sort of "dual-port buffer" (i.e. the buffer is filled/drained via debug access and by the CPU simultaneously) or multiple buffers are necessary. Maybe there is only a "ping-pong" style transfer implemented?
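A simple timing model shows why overlapping the debug transfer with flash programming matters. This is a sketch under stated assumptions, not a description of how the ST tools work: every chunk is assumed to take a fixed `xfer_us` to transfer and `prog_us` to program, and both function names are hypothetical.

```c
#include <stdint.h>

/* Total time when transfer and programming strictly alternate:
 * the host waits while the CPU programs, and vice versa. */
uint64_t serial_time_us(uint32_t n_chunks, uint32_t xfer_us, uint32_t prog_us)
{
    return (uint64_t)n_chunks * ((uint64_t)xfer_us + prog_us);
}

/* Total time when two buffers let the host fill one buffer while the CPU
 * programs the other: after the first transfer, the slower stage sets the pace. */
uint64_t overlapped_time_us(uint32_t n_chunks, uint32_t xfer_us, uint32_t prog_us)
{
    if (n_chunks == 0u) {
        return 0u;
    }
    uint64_t slow = (xfer_us > prog_us) ? xfer_us : prog_us;
    return (uint64_t)xfer_us + (uint64_t)(n_chunks - 1u) * slow + prog_us;
}
```

With `prog_us` set to 0 both functions return the same value, which matches the observation in the first post: if an empty Write function does not shorten the run, the transfer side is the bottleneck and buffering on the target alone will not help.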

However, for really big NAND flashes, the approach via SWD is not feasible. Think about USB-HS, Ethernet or removable SD-card.

MPole
Associate II

I'm using STLINK. After your response I installed STM32CubeProgrammer and tried to download my file to the 1 Gb memory. It goes faster, but programming still takes a long time. In STLINK it was 40 MB in 13 minutes; in STM32CubeProgrammer it's 6 minutes, so the speed-up is more than 2x. Still nothing special.
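For comparison with the kByte/s figures quoted earlier in the thread, the timings can be converted to average throughput. A minimal sketch, assuming decimal units (1 MB = 1000 kB); the helper name is made up for illustration.

```c
#include <stdint.h>

/* Average throughput in kB/s, rounded down.
 * Assumes decimal units: 1 MB = 1000 kB. Returns 0 for a zero duration. */
uint32_t avg_kbytes_per_sec(uint32_t megabytes, uint32_t minutes)
{
    if (minutes == 0u) {
        return 0u;
    }
    return (megabytes * 1000u) / (minutes * 60u);
}
```

This gives about 44 kB/s for 66 MB in 25 minutes, 51 kB/s for 40 MB in 13 minutes and 111 kB/s for 40 MB in 6 minutes, all below the roughly 150 kByte/s reported above for OpenOCD with an ST-Link V2 at 4 MHz.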

Uwe Bonnes
Principal II

Perhaps get an STLINK-V3, where the SWD clock can be faster. Or think about another way of programming.

MPole
Associate II

Yeah. I'm thinking about STM32CubeProgrammer + ST-Link with SWD at 24 MHz. Thanks for the help!

AVI-crak
Senior

The theoretical maximum speed of the SWD interface is half the system bus frequency of the external chip. All STM chips have a lower system bus limit of 8 MHz in cold-start mode.

But the speed of the system bus can be increased after setting the PLL.

Will the SWD interface work at frequencies above 50 MHz?

My suggestion: fill the external QSPI memory from an SD card. All other options either have a high price tag, are slow, or are difficult to implement.

Writing a 256 Mbit QSPI flash from an external SD card takes 15 seconds.
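As a rough check of what that figure means in bytes per second, assuming "256 Mbit" is the usual binary flash capacity (256 × 1024 × 1024 bits) and that the whole device is written; the helper name is hypothetical:

```c
#include <stdint.h>

/* Average write rate in bytes/s for a flash whose capacity is given in Mbit,
 * assuming binary megabits (1 Mbit = 1024 * 1024 bits). Rounded down. */
uint32_t flash_fill_bytes_per_sec(uint32_t capacity_mbit, uint32_t seconds)
{
    if (seconds == 0u) {
        return 0u;
    }
    uint64_t bytes = (uint64_t)capacity_mbit * 1024u * 1024u / 8u;
    return (uint32_t)(bytes / seconds);
}
```

`flash_fill_bytes_per_sec(256u, 15u)` comes out at about 2.2 MByte/s, several times the 450 kByte/s write rate quoted above for ST-Link V3 at 24 MHz.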

Uwe Bonnes
Principal II

AVI-crak: Can you point to an original source for " The possible theoretical speed of the SWD interface is half the frequency of the system bus"?

I was able to find a limit, but no confirmation that high speeds actually work.

The SWD/JTAG interface has two frequency domains. For external communication, the programmer frequency is used; for internal communication, the system core bus frequency of the ARM processor is used. The two frequency domains are connected through a synchronizing latch: on writes it is synchronized from the system bus, on reads from the external programmer. The latch response time is not uniform and may take one whole wait cycle.

In addition, synchronization works at the level of individual bits of information, not whole blocks!

To limit the number of errors, the external frequency should be 1/2 of the internal.

For learning and experiments: https://github.com/texane/stlink

Looked to be 15 MHz on the On-Board V3 I most recently looked at.

The V3 driver needs work; I'm having a lot of kernel-level driver lockup issues with VCP.

Tips, buy me a coffee, or three.. PayPal Venmo Up vote any posts that you find helpful, it shows what's working..

Do you have any other interfaces on the board you can pull data via? Ethernet, flash drive, QSPI loader dongle?
