SPI slave MISO delay on STM32F401

PRobe
Associate II

I am attempting to communicate between ESP32 and STM32F401 using SPI Mode 0.

I am achieving approximately 6 MHz reliably and see data corruption at around 8 MHz.

Both MCUs are using DMA.

I can improve performance by dropping the APB1 prescaler from 2 to 1, which raises APB1 from 36 MHz to 72 MHz (naughty, I know).

The working theory is that we are failing at 8MHz through SCK/MISO violations, see:

https://docs.espressif.com/projects/esp-idf/en/latest/api-reference/peripherals/spi_master.html#speed-and-timing-considerations

According to the STM32 datasheet, however, MISO is driven initially from NSS and then from SCK, with tsu(NSS) of 4 APB1 periods and Tv(SO) of 17 ns max.

I have measured NSS to SCK as 180 ns or more.

What does not make sense to me is that:

- NSS setup is > 180 ns, but the issue seems to improve when APB1 is increased.

- Tv(SO) = 17 ns should give me a bus speed of >= 20 MHz.

Also, the regular MISO output delay Tv(SO) is 17 ns, which is shorter than one APB1 period of 28 ns.
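As a rough sanity check of the figures above, the arithmetic can be sketched in C. This assumes Mode 0 (the slave launches MISO on the falling SCK edge and the master samples on the next rising edge, half a period later) and takes tsu(NSS) = 4 APB1 periods and Tv(SO) = 17 ns from the post. The function names and `extra_ns` (an allowance for master-side and board delays) are hypothetical, not datasheet terms.

```c
#include <assert.h>

/* Period of a clock in nanoseconds. */
static double period_ns(double freq_hz)
{
    assert(freq_hz > 0.0);
    return 1e9 / freq_hz;
}

/* Minimum NSS setup time, using the figure quoted in the post:
   tsu(NSS) = 4 * T(APB1). */
static double nss_setup_min_ns(double apb_hz)
{
    return 4.0 * period_ns(apb_hz);
}

/* Highest SCK (Hz) at which MISO, launched on the falling edge, is valid
   by the next rising edge, i.e. within half an SCK period.
   extra_ns is a hypothetical allowance for master-side and board delays. */
static double max_sck_hz(double tv_so_ns, double extra_ns)
{
    assert(tv_so_ns + extra_ns > 0.0);
    return 1e9 / (2.0 * (tv_so_ns + extra_ns));
}
```

With the numbers quoted, `nss_setup_min_ns(36e6)` comes to about 111 ns, which is under the 180 ns measured, and `max_sck_hz(17.0, 0.0)` comes to about 29 MHz, so on paper neither figure alone accounts for failures at 8 MHz.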

I could accept that, once loaded, the STM output might be driven by SCK, but to load the output shift register itself we must cross between the SCK and APB1 clock domains, and hence incur delays?

That is to say, the STM output is fed by DMA (no software delays), but the bytes/words still have to be loaded regularly into the shift register. Where is this delay shown?

Any pointers/corrections gratefully received!

19 REPLIES 19

Do you have an oscilloscope?

JW

PRobe
Associate II

Hi,

Yes & also a Saleae but I am awaiting boards to measure.

(The NSS setup > 180 ns claim is based on a screenshot sent by an overseas colleague.)

I will make the scope (noise) and LA (shifting & timing) measurements once I have the board and post again if I am still scratching my head.

I am hoping to get feedback on my understanding of the relevant timing constraints so I can hit the ground running.

DMA keeps the holding register (buffer) full, so at the point where the shift register needs new value it's already there, so that's not an issue.

Depending on particular timing elements, for simplex communication in the MISO direction, you could consider using the other, "incorrect" CPHA only in the slave. Duplex then can be achieved by using two SPI modules in simplex, sharing clock, and setting them to different CPHA.

If the interconnection is longer, or it's going through a cable, I'd also consider signal integrity, grounding, reflections/termination etc. issues.

JW

PRobe
Associate II

"DMA keeps the holding register (buffer) full, so at the point where the shift register needs new value it's already there, so that's not an issue."

Sure (and I may be navel gazing) but I don't see how DMA can cross domains without at least one APB1 cycle.

Not that losing one APB1 cycle explains the lack of performance, but it takes us closer to understanding.

"Depending on particular timing elements, for simplex communication in the MISO direction, ....."

Neat trick. Unfortunately the ESP only has two available general purpose SPI modules and the other one is in use.

"If the interconnection is longer, or it's going through a cable, I'd also consider signal integrity, grounding, reflections/termination etc. issues."

All PCB, but yes, the edges will be hard regardless of bit rate.

I will of course scope it when able.

S.Ma
Principal

hmm back to the white board.

ESP is slave and STM32 is master or is it reverse?

How do you init your DMA region? By NSS rise/fall edge?

Is this interrupt based? How long does it take in the worst case?

PRobe
Associate II

Hi,

ESP is master.

    #define ESP_SPI_RX_DMA_STREAM               DMA1_Stream3
    #define ESP_SPI_RX_DMA_CHANNEL              DMA_CHANNEL_0
        
    #define ESP_SPI_RX_DMA_IRQn                 DMA1_Stream3_IRQn
    #define ESP_SPI_RX_DMA_IRQHandler           DMA1_Stream3_IRQHandler
        
    #define ESP_SPI_TX_DMA_STREAM               DMA1_Stream4
    #define ESP_SPI_TX_DMA_CHANNEL              DMA_CHANNEL_0
        
    #define ESP_SPI_DMA_CLK_ENABLE              __HAL_RCC_DMA1_CLK_ENABLE
        
        hEspSpiDma_tx.Instance                 = ESP_SPI_TX_DMA_STREAM;
 
        hEspSpiDma_tx.Init.Channel             = ESP_SPI_TX_DMA_CHANNEL;
        hEspSpiDma_tx.Init.Direction           = DMA_MEMORY_TO_PERIPH;
        hEspSpiDma_tx.Init.PeriphInc           = DMA_PINC_DISABLE;
        hEspSpiDma_tx.Init.MemInc              = DMA_MINC_ENABLE;
        hEspSpiDma_tx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
        hEspSpiDma_tx.Init.MemDataAlignment    = DMA_MDATAALIGN_BYTE;
        hEspSpiDma_tx.Init.Mode                = DMA_NORMAL;
        hEspSpiDma_tx.Init.Priority            = DMA_PRIORITY_LOW;
        hEspSpiDma_tx.Init.FIFOMode            = DMA_FIFOMODE_DISABLE;
        hEspSpiDma_tx.Init.FIFOThreshold       = DMA_FIFO_THRESHOLD_FULL;
        hEspSpiDma_tx.Init.MemBurst            = DMA_MBURST_INC4;
        hEspSpiDma_tx.Init.PeriphBurst         = DMA_PBURST_INC4;
 
        HAL_DMA_Init(&hEspSpiDma_tx);
 
        // Associate the initialized DMA handle with the SPI handle
        __HAL_LINKDMA(hspi, hdmatx, hEspSpiDma_tx);
 
 
        // Configure the DMA handler for Reception process
        hEspSpiDma_rx.Instance                 = ESP_SPI_RX_DMA_STREAM;
 
        hEspSpiDma_rx.Init.Channel             = ESP_SPI_RX_DMA_CHANNEL;
        hEspSpiDma_rx.Init.Direction           = DMA_PERIPH_TO_MEMORY;
        hEspSpiDma_rx.Init.PeriphInc           = DMA_PINC_DISABLE;
        hEspSpiDma_rx.Init.MemInc              = DMA_MINC_ENABLE;
        hEspSpiDma_rx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
        hEspSpiDma_rx.Init.MemDataAlignment    = DMA_MDATAALIGN_BYTE;
        hEspSpiDma_rx.Init.Mode                = DMA_NORMAL;
        hEspSpiDma_rx.Init.Priority            = DMA_PRIORITY_HIGH;
        hEspSpiDma_rx.Init.FIFOMode            = DMA_FIFOMODE_DISABLE;
        hEspSpiDma_rx.Init.FIFOThreshold       = DMA_FIFO_THRESHOLD_FULL;
        hEspSpiDma_rx.Init.MemBurst            = DMA_MBURST_INC4;
        hEspSpiDma_rx.Init.PeriphBurst         = DMA_PBURST_INC4;
 
        HAL_DMA_Init(&hEspSpiDma_rx);
 
        // Associate the initialized DMA handle with the SPI handle
        __HAL_LINKDMA(hspi, hdmarx, hEspSpiDma_rx);

The DMA is re-armed on completion of the preceding transaction using:

   if(HAL_SPI_TransmitReceive_DMA(&espSpi, (uint8_t*)txBuffer, (uint8_t*)rxBuffer, ESP_MESSAGE_SIZE+LOG_BUFFER_SIZE) != HAL_OK)

There would be too much latency to launch STM RX/TX on NSS.

NSS rising edge interrupt detects the end.

This works at 4 MHz and fails at 6/7 MHz, but can be made to work at 6/7 MHz by overclocking APB1.

So this seems to be about clocks on the STM side, which is why I am asking whether my understanding is correct.

S.Ma
Principal

SPI Slave works at SYSCLK/4 on most parts.

How much time do you have between NSS rise and fall time?

PRobe
Associate II

'How much time do you have between NSS rise and fall time?'

Lots. One of the first things I did was to add delays on the transmitter: 100 ms or more.

All other STM stuff has been disabled.

'SPI Slave works at SYSCLK/4 on most parts.'

So 18 MHz.

I don't see that in the datasheet though. On page 99, the NSS timing depends on APB1 but the rest are fixed delays, so I think that SCK is driving the logic.

Hence I do not get how data from CPU RAM domain gets into ESP SCK domain without a delay.

Regards

> "DMA keeps the holding register (buffer) full, so at the point where the shift register needs new value it's already there, so that's not an issue."

> Sure (and I may be navel gazing) but I don't see how DMA can cross domains without at least one APB1.

OK, so let me try to formulate it differently: when the previous frame (byte, halfword) has finished transmitting, SPI transfers a new frame from the holding register to the shift register and starts transmitting it. At that moment the TXE flag is set, and that triggers DMA to transfer the next frame from memory into the holding register. In other words, DMA has the whole transmission time of the current frame in which to transfer the next one.
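To put a number on that window, here is a minimal sketch in C that counts how many APB clock cycles elapse while one frame is shifted out. The function name and parameters are hypothetical; the clock and frame values in the note below are the ones mentioned earlier in the thread. How many cycles a DMA request actually needs is not stated here, so this only shows the time available.

```c
#include <assert.h>
#include <stdint.h>

/* APB clock cycles that elapse while one SPI frame of frame_bits bits
   is shifted out at sck_hz. This is the window DMA has to refill the
   holding register after TXE is set. */
static uint32_t apb_cycles_per_frame(uint32_t apb_hz, uint32_t sck_hz,
                                     uint32_t frame_bits)
{
    assert(sck_hz > 0u);
    /* 64-bit intermediate avoids overflow of apb_hz * frame_bits. */
    return (uint32_t)(((uint64_t)apb_hz * frame_bits) / sck_hz);
}
```

For 8-bit frames at 8 MHz SCK with APB1 at 36 MHz, `apb_cycles_per_frame(36000000u, 8000000u, 8u)` gives 36 cycles per frame; with APB1 at 72 MHz the same call gives 72.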

> "Depending on particular timing elements, for simplex communication in the MISO direction, ....."

> Neat trick. Unfortunately the ESP only has two available general purpose SPI modules and the other one is in use.

You'd need two simplex SPI modules on the slave, i.e. the STM32 side; one duplex module is enough on the master side.

JW