STM32L4S5: SPI pause between 8-bit frames

heisenbug · ‎2024-02-13

Hi all,

I am transferring some 400-byte blocks over SPI at 16 MHz.

I have NSS disabled, so CS does not go up/down between frames.

From my testing, whatever system clock or SPI clock I set, there is a fixed 4 µs gap between two consecutive frames. This pause is of course very visible and costly at a 20 MHz SPI clock; on the scope it looks something like this:

__||||||||_____________||||||||______________|||||||_____
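
To put numbers on the cost of this gap, here is a minimal sketch of the arithmetic, assuming 8-bit frames, a 20 MHz SPI clock, a fixed 4 µs gap and 400-byte blocks as described above. The function names are hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Time on the wire for one frame, in microseconds. */
static double frame_time_us(uint32_t bits_per_frame, double spi_clock_hz)
{
    return (double)bits_per_frame * 1e6 / spi_clock_hz;
}

/* Total time for a block of n_frames with a fixed idle gap between
 * consecutive frames (n_frames - 1 gaps in total). */
static double block_time_us(uint32_t n_frames, uint32_t bits_per_frame,
                            double spi_clock_hz, double gap_us)
{
    if (n_frames == 0) {
        return 0.0;
    }
    return n_frames * frame_time_us(bits_per_frame, spi_clock_hz)
         + (n_frames - 1) * gap_us;
}
```

With these assumptions, `block_time_us(400, 8, 20e6, 4.0)` gives about 1756 µs, against an ideal 160 µs with no gap, so the bus is busy only about 9% of the time.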

The transfer is done with LL_SPI_TransmitData8.

Any help in reducing this delay between frames would be very welcome; even knowing that it cannot be reduced in any way would be helpful.

Thanks a lot.

TDK · ‎2024-02-13

Use DMA or improve the speed of your code. Show your code if you want tips on how to improve it.

Compiling using Release settings will also speed things up a bit.

If you feel a post has answered your question, please click "Accept as Solution".

heisenbug · ‎2024-02-13

Hi @TDK,

thanks a lot.

Well, the code is actually the one in Zephyr OS. In any case, I changed the MCU clock speed from 80 to 120 MHz and this delay is still exactly the same, 4 µs. So it looks like the code is not affecting it.

Sure, if you confirm I cannot bring it down, my next step is to use DMA.

One more question: could TI mode help here (I can control CS by software if needed)?

TDK · ‎2024-02-13

Code is definitely impacting it. The hardware will only have pauses when the code can't keep up.

I don't see how TI mode would help.

AScha.3 · ‎2024-02-13

Hi,

I am using SPI to drive a 2.2" TFT here:

Drawing the lines/grid took 305 ms with standard HAL_ calls;

optimized access with direct register writes now takes 6 ms!

(Always at 18 Mbit/s, no DMA, the maximum with SPI2 on an F303 CPU.)

heisenbug · ‎2024-02-13

@TDK, sorry, it is not really clear to me how this 4 µs can stay exactly the same if I make the system clock 30% faster; the code should execute faster, so the gap should shrink. Could you explain this in a bit more depth?
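
The expectation can be made concrete with a small sketch. It assumes, hypothetically, that the gap is caused by a fixed number of CPU cycles of driver code, and uses the 80 MHz, 120 MHz and 4 µs figures from this thread; the helper names are made up for illustration.

```c
#include <stdint.h>

/* Number of whole core clock cycles that fit into gap_ns nanoseconds. */
static uint32_t gap_in_cycles(uint32_t gap_ns, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)gap_ns * core_hz) / 1000000000ULL);
}

/* Duration in nanoseconds (rounded down) of a fixed number of core cycles. */
static uint32_t cycles_to_ns(uint32_t cycles, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000ULL) / core_hz);
}
```

Under that assumption, 4 µs at 80 MHz corresponds to 320 cycles, and the same 320 cycles would take about 2.67 µs at 120 MHz. A gap that stays at 4 µs does not fit this simple model, which is what the question is about.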

@AScha.3

thanks,

I am quite new to STM32. I am actually using LL_SPI_TransmitData8; shouldn't that already be the "direct register access"?

AScha.3 · ‎2024-02-13

Sorry, I don't know; I haven't tried LL in a long time, since ST switched to HAL and LL...

But yes, the macro should do it. Still, if I write to the register myself, I know it happens at maximum speed.

Example of a write to SPI:

SPI2->DR = cmd; //HAL_SPI_Transmit(&ILI9341_SPI_PORT, &cmd, sizeof(cmd), HAL_MAX_DELAY);

The HAL call takes about 900 ns, the direct write about 14 ns. (But with no error checking etc., which HAL always does.)

I use the HAL (hoping there are no errors in it) because you get what you want without much fiddling around.

And if something needs to be faster, I play the game of writing directly.

Look: most of the time it does not matter whether switching something on takes 10 ns or 10 µs; as a human you are many times slower than that anyway. Only in some cases, like drawing the background grid here, do you really see the 300 ms it takes to draw the lines, so that is a point where speeding up is useful. The same drawing in 6 ms cannot be seen; the grid appears "instantly". But this is just like a crossword puzzle for me, just for fun.

You can just use the HAL and use the peripherals for their intended purpose: set up DMA -> SPI to transfer a block of data at maximum speed. No need for crossword puzzles; when I am at work, I am not there for fun.

TDK · ‎2024-02-13

Maybe you can show the relevant code you are using to send data over SPI. Going to be more useful to look at the code in question rather than talking in sweeping generalizations.

You are correct that 30% (or whatever) faster clock should result in 30% faster code, in general. I stand by my statement that there is no hardware requirement to have delays between bytes. Therefore, the explanation for why they are there must be due to the code that is driving the peripheral. The peripheral even has a FIFO which should alleviate code speed requirements even more.

I don't see any options in the registers to put delays between bytes. TI mode will put a 1-bit delay, but this is not what you're seeing based on your diagram. Could be missing something, but don't think I am.

gregstm · ‎2024-02-13

If it were me, I would write some dedicated test software to confirm the behaviour, writing directly to the registers to ensure that no other software is causing the effect.

heisenbug · ‎2024-02-14

Hi,

thanks again, everyone.

I did a brief test, writing my data blocks with

while (l--) {
    *(volatile uint8_t *)(0x4000380c) = *p++;
}

so writing to the data register directly, excluding any other OS code.
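
Note that the loop above writes the data register without checking whether the transmitter can accept another byte. A minimal sketch of a polled variant follows, assuming the usual STM32 SPI register order (CR1, CR2, SR, DR) with TXE as bit 1 and BSY as bit 7 of SR. `spi_regs_t`, the `_BIT` macros and `spi_write_block` are hypothetical names for illustration; on target the CMSIS `SPI_TypeDef` would normally be used. Whether this changes the observed gap is not established in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the first four SPI registers
 * (CR1, CR2, SR, DR at offsets 0x00..0x0C). */
typedef struct {
    volatile uint32_t CR1;
    volatile uint32_t CR2;
    volatile uint32_t SR;
    volatile uint32_t DR;
} spi_regs_t;

#define SPI_SR_TXE_BIT (1u << 1)  /* transmitter can accept more data */
#define SPI_SR_BSY_BIT (1u << 7)  /* transfer still in progress */

static void spi_write_block(spi_regs_t *spi, const uint8_t *p, size_t len)
{
    /* 8-bit access to DR so that exactly one byte is queued per write. */
    volatile uint8_t *dr8 = (volatile uint8_t *)&spi->DR;

    while (len--) {
        while ((spi->SR & SPI_SR_TXE_BIT) == 0u) {
            /* wait until there is room for another byte */
        }
        *dr8 = *p++;
    }
    while ((spi->SR & SPI_SR_BSY_BIT) != 0u) {
        /* wait until the last frame has been shifted out */
    }
}
```

On the target this could be called as `spi_write_block((spi_regs_t *)0x40003800, p, l)`, assuming the peripheral base is the data register address used above minus the 0x0C offset.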

The result is always the same.

The gap between frames is of course very visible now that I use a 20 MHz SPI bus clock.

So it seems I cannot improve this behavior in any way for now.