SPI pauses between bytes using HAL_SPI_TransmitReceive()

TIvan.1 · ‎2023-07-24

Hello,

I am using an STM32F303RBT on a custom board and am trying to communicate over SPI with an LTC6811-1, using code generated by CubeMX.

Sending and receiving works fine when the data does not have to be sent in one burst. The LTC6811 tolerates pauses between bytes, except when it writes I2C data over SPI, which requires tight clocking.

Can the pauses be eliminated by replacing HAL_SPI_TransmitReceive() with bare-metal code?
Does someone have a simpler (faster) version of this function?

Below are the clock and MOSI signals, with 15 us pauses between bytes.
The code that calls HAL_SPI_TransmitReceive(...) is also attached.

[Image: Pauses between bytes]

___________________________________________________________________________________________

#include "BMSDriver.h"

/*******************************************************************************/

// MICROCONTROLLER SPECIFIC FUNCTIONS FOR CONTROLLING :

// Drive the chip-select line low (start of an SPI frame)
void cs_low(void)
{
    HAL_GPIO_WritePin(BMS_CHIP_SELECT_GPIO_PORT, BMS_CHIP_SELECT_GPIO_PIN, GPIO_PIN_RESET);
}

// Drive the chip-select line high (end of an SPI frame)
void cs_high(void)
{
    HAL_GPIO_WritePin(BMS_CHIP_SELECT_GPIO_PORT, BMS_CHIP_SELECT_GPIO_PIN, GPIO_PIN_SET);
}

void delay_u(uint16_t micro)
{
    delay_us(micro);
}

void delay_m(uint16_t milli)
{
    // Note: milli*1000 exceeds 16 bits for milli > 65; check the parameter type of delay_us()
    delay_us(milli*1000);
}

void spi_write_array(uint8_t len,    // Option: Number of bytes to be written on the SPI port
                     uint8_t data[]  // Array of bytes to be written on the SPI port
                    )
{
    SPI_HandleTypeDef *pspi = &hspi3;
    uint8_t ret_val;
    uint8_t i;

    // One HAL call per byte
    for (i = 0; i < len; i++)
    {
        HAL_SPI_TransmitReceive(pspi, (uint8_t*)&data[i], &ret_val, 1, 5);
    }
}

void spi_write_read(uint8_t *tx_data, // array of data to be written on SPI port
                    uint8_t tx_len,   // length of the tx data array
                    uint8_t *rx_data, // Input: array that will store the data read by the SPI port
                    uint8_t rx_len    // Option: number of bytes to be read from the SPI port
                   )
{
    SPI_HandleTypeDef *pspi = &hspi3;
    uint8_t i;
    uint8_t rxDummy;
    uint8_t txDummy = 0xFF;

    // Transfer data to LTC681x
    for (i = 0; i < tx_len; i++)
    {
        // Transmit byte.
        HAL_SPI_TransmitReceive(pspi, (uint8_t*)&tx_data[i], (uint8_t*)&rxDummy, 1, 5);
    }

    // Receive data from DC2259A board.
    for (i = 0; i < rx_len; i++)
    {
        // Receive byte.
        HAL_SPI_TransmitReceive(pspi, (uint8_t*)&txDummy, (uint8_t*)&rx_data[i], 1, 5);
    }
}

uint8_t spi_read_byte(uint8_t tx_data)
{
    SPI_HandleTypeDef *pspi = &hspi3;
    uint8_t rx_data;

    if (HAL_SPI_TransmitReceive(pspi, (uint8_t*)&tx_data, (uint8_t*)&rx_data, 1, 5) == HAL_OK)
    {
        return(rx_data);
    }

    // Transfer failed
    return(1);
}

AScha.3 · ‎2023-07-24

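So try sending the whole array in one call (rx_buf here stands for a receive buffer of at least len bytes; a single ret_val byte would be overrun):

HAL_SPI_TransmitReceive(pspi, (uint8_t*)data, rx_buf, len, 5);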
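For the write-then-read case, the same idea could look roughly like the sketch below: the command bytes and the 0xFF dummy bytes are packed into one buffer and sent in a single transfer. The names spi_xfer_fn, spi_write_read_burst, SPI_FRAME_MAX and fake_xfer are made up for illustration. On the target, the hook would wrap one HAL_SPI_TransmitReceive() call; fake_xfer only stands in for the hardware so the logic can be tried on a PC. Whether this alone removes the gaps depends on the overhead inside the HAL call itself.

```c
#include <stdint.h>
#include <string.h>

#define SPI_FRAME_MAX 64u   /* hypothetical upper bound on one frame, in bytes */

/* Hypothetical transfer hook, returns 0 on success. On the target it would
 * wrap a single HAL_SPI_TransmitReceive(&hspi3, tx, rx, len, timeout) call. */
typedef int (*spi_xfer_fn)(const uint8_t *tx, uint8_t *rx, uint16_t len);

/* Sends tx_len command bytes followed by rx_len 0xFF dummy bytes in ONE
 * transfer, then copies out the bytes clocked in during the dummy phase. */
int spi_write_read_burst(spi_xfer_fn xfer,
                         const uint8_t *tx_data, uint8_t tx_len,
                         uint8_t *rx_data, uint8_t rx_len)
{
    uint8_t tx[SPI_FRAME_MAX];
    uint8_t rx[SPI_FRAME_MAX];
    uint16_t total = (uint16_t)(tx_len + rx_len);

    if (total == 0u || total > SPI_FRAME_MAX) {
        return -1;                        /* frame does not fit the buffer */
    }
    memcpy(tx, tx_data, tx_len);          /* command bytes first */
    memset(tx + tx_len, 0xFF, rx_len);    /* dummies that clock the reply in */
    if (xfer(tx, rx, total) != 0) {
        return -1;
    }
    memcpy(rx_data, rx + tx_len, rx_len); /* skip bytes received during the command */
    return 0;
}

/* Host-side stand-in for the hardware: answers every byte with its inverse. */
int fake_xfer(const uint8_t *tx, uint8_t *rx, uint16_t len)
{
    for (uint16_t i = 0; i < len; i++) {
        rx[i] = (uint8_t)~tx[i];
    }
    return 0;
}
```

With fake_xfer, a call such as spi_write_read_burst(fake_xfer, cmd, 4, resp, 8) fills resp with the inverted dummy bytes, which is enough to check the buffer handling before moving to the board.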

TIvan.1 · ‎2023-07-24

Thank you for the interest!

I tried your suggestion of writing the whole array in one call, but with the default Cube library there was no performance improvement.

However, I managed to shorten the delay between bytes by writing a much simpler version of the HAL_SPI_TransmitReceive function (code below). I still have delays between bytes, though. I tried both functions, each with a for loop of length-1 transfers and with a single call passing the whole array and its length, and I get pauses between bytes in all four combinations.

Is there a hardware option I need to enable that I'm missing?

Faster SPI_TransmitReceive code:

void HAL_SPI_TransmitReceiveFast(SPI_HandleTypeDef *hspi, uint8_t *pTxData, uint8_t *pRxData, uint16_t Size,
        uint32_t Timeout)
{
	SPI_TypeDef *SPIx= hspi->Instance;
	uint16_t count=Size;
	__HAL_SPI_ENABLE(hspi);
	while (count--)
	{
		while ((SPIx->SR & SPI_FLAG_TXE) == 0 || (SPIx->SR & SPI_FLAG_BSY));
		*(__IO uint8_t *)&SPIx->DR = *pTxData++;
		while ((SPIx->SR & SPI_FLAG_RXNE) == 0 || (SPIx->SR & SPI_FLAG_BSY));
		*pRxData++ = *(__IO uint8_t *)&SPIx->DR;
	}
}

Slow (CubeMX-generated, about 20 us pauses):

[Image: Slow (STM32 HAL)]

Faster version (still with pauses, but only 4 us):

[Image: Fast]

AScha.3 · ‎2023-07-24

If you want the fastest possible transfer, you need to use DMA.

Also: what are the core and bus clocks? Are they at their maximum?

And is the optimizer set to -O2 or -Ofast? Without optimization, forget it.
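To put the clock question in numbers, here is a small sketch that converts a gap on the scope into CPU cycles. The function name gap_cycles is made up, and 72 MHz is only an assumed core clock (the maximum for the STM32F303); the actual clock in this project is not stated.

```c
#include <stdint.h>

/* CPU cycles that elapse during a gap of gap_us microseconds
 * at a core clock of core_hz (64-bit intermediate avoids overflow). */
uint32_t gap_cycles(uint32_t core_hz, uint32_t gap_us)
{
    return (uint32_t)(((uint64_t)core_hz * gap_us) / 1000000u);
}
```

At an assumed 72 MHz, a 20 us gap is 1440 cycles and a 4 us gap is 288 cycles, so the per-byte software overhead is what dominates, and it scales directly with the core clock and the optimization level.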

Piranha · ‎2023-07-24

The TXE flag is set immediately after the Tx byte is handed to the hardware, but your code does not load the next Tx byte until the Rx byte of the previous period has been received. With such an approach one just cannot get the maximum speed. Instead you should do something like this:

void HAL_SPI_TransmitReceiveFast(SPI_HandleTypeDef *hspi, uint8_t *pTxData, uint8_t *pRxData, uint16_t Size,
	uint32_t Timeout)
{
	SPI_TypeDef *SPIx = hspi->Instance;
	uint16_t txLeft = Size;	// bytes still to be written, so no more than Size bytes are sent
	(void)Timeout;		// timeout handling is not implemented
	__HAL_SPI_ENABLE(hspi);
	while (Size) {
		uint32_t rSR = SPIx->SR;
		// Load the next Tx byte as soon as there is room, without waiting for Rx
		if (txLeft && (rSR & SPI_FLAG_TXE)) {
			*(volatile uint8_t *)&SPIx->DR = *pTxData++;
			--txLeft;
		}
		// Size counts the bytes still to be received
		if (rSR & SPI_FLAG_RXNE) {
			*pRxData++ = *(volatile uint8_t *)&SPIx->DR;
			--Size;
		}
	}
	while ((SPIx->SR & SPI_FLAG_BSY));
}

@AScha.3, the DMA doesn't magically make the interface faster, it just frees the CPU from doing data transfers.
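To see why the order of the TX and RX handling matters, here is a toy model that runs on a PC. It is not STM32 code: SimSpi, sim_tick, run_blocking and run_interleaved are made-up names, the FIFO depth of 4 bytes and the 8 "ticks" per byte are assumptions, and the CPU work between polls is treated as free. It only counts the idle ticks that fall between two bytes on a loopback link.

```c
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH  4u  /* assumed FIFO depth in bytes */
#define SHIFT_TICKS 8   /* assumed polling-loop passes per byte on the wire */

typedef struct {
    uint8_t  tx[FIFO_DEPTH], rx[FIFO_DEPTH];
    size_t   tx_n, rx_n;     /* current fill levels */
    uint8_t  on_wire;        /* byte currently being shifted */
    int      ticks_left;     /* > 0 while the shifter is busy */
    int      started;        /* set once the first byte is on the wire */
    unsigned pending_idle;   /* idle ticks since the last byte finished */
    unsigned gap_ticks;      /* idle ticks that fell BETWEEN two bytes */
} SimSpi;

/* Remove and return the oldest byte of a FIFO. */
static uint8_t fifo_pop(uint8_t *f, size_t *n)
{
    uint8_t v = f[0];
    for (size_t i = 1; i < *n; i++) f[i - 1] = f[i];
    (*n)--;
    return v;
}

/* Advance the model by one pass of the CPU's polling loop. */
static void sim_tick(SimSpi *s)
{
    if (s->ticks_left > 0 && --s->ticks_left == 0 && s->rx_n < FIFO_DEPTH)
        s->rx[s->rx_n++] = s->on_wire;          /* loopback: MISO = MOSI */
    if (s->ticks_left == 0) {
        if (s->tx_n > 0) {                      /* next byte starts at once */
            s->on_wire = fifo_pop(s->tx, &s->tx_n);
            s->ticks_left = SHIFT_TICKS;
            s->gap_ticks += s->pending_idle;    /* idle time became a gap */
            s->pending_idle = 0;
            s->started = 1;
        } else if (s->started) {
            s->pending_idle++;                  /* clock line is idle */
        }
    }
}

/* Strategy A: wait for each received byte before sending the next one. */
unsigned run_blocking(const uint8_t *tx, uint8_t *rx, size_t n)
{
    SimSpi s = {0};
    for (size_t i = 0; i < n; i++) {
        s.tx[s.tx_n++] = tx[i];
        while (s.rx_n == 0) sim_tick(&s);
        rx[i] = fifo_pop(s.rx, &s.rx_n);
    }
    return s.gap_ticks;
}

/* Strategy B: service TX and RX independently on every pass, keeping at
 * most FIFO_DEPTH bytes in flight so the RX FIFO cannot overflow. */
unsigned run_interleaved(const uint8_t *tx, uint8_t *rx, size_t n)
{
    SimSpi s = {0};
    size_t txc = 0, rxc = 0;
    while (rxc < n) {
        if (txc < n && s.tx_n < FIFO_DEPTH && txc - rxc < FIFO_DEPTH)
            s.tx[s.tx_n++] = tx[txc++];
        if (s.rx_n > 0)
            rx[rxc++] = fifo_pop(s.rx, &s.rx_n);
        sim_tick(&s);
    }
    return s.gap_ticks;
}
```

In this model the blocking strategy always leaves idle ticks between bytes, because the TX FIFO is empty whenever a byte finishes, while the interleaved strategy keeps the FIFO fed and the count stays at zero. On real hardware the gap is longer than in the model, since the CPU also needs time to notice the flag and write the next byte.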

TIvan.1 · ‎2023-07-27

I managed to get it to work; there was another underlying issue not related to SPI timing. It is currently working with the code above. Also, it is strange that I have yet to see documented code that uses the FIFO or something similar to what Piranha suggested.

Here is my untested idea:
Fill the TX FIFO up to its depth, or with the whole message if it is shorter (the STM32F3 SPI FIFOs are 32 bits, i.e. 4 bytes, not 16). Whenever the RX FIFO receives a byte, read it, and give the TX FIFO the next byte if there is one. If the FIFO works properly, there should be no pauses, as the CPU never waits for a byte to be received without the next byte already being queued.

void HAL_SPI_TransmitReceiveFast(SPI_HandleTypeDef *hspi, uint8_t *pTxData, uint8_t *pRxData, uint16_t Size, uint32_t Timeout)
{
    SPI_TypeDef *SPIx = hspi->Instance;
    const uint16_t fifoDepth = 4;   // STM32F3: TX and RX FIFOs hold 4 bytes each
    uint16_t txCount = 0;           // bytes written to the TX FIFO
    uint16_t rxCount = 0;           // bytes read from the RX FIFO
    (void)Timeout;                  // timeout handling is not implemented
    // Enable the SPI peripheral
    __HAL_SPI_ENABLE(hspi);
    // Transmit and receive until every byte has been read back
    while (rxCount < Size)
    {
        uint32_t sr = SPIx->SR;
        // Queue the next byte while the TX FIFO has room, but never run more
        // than fifoDepth bytes ahead of the reader, so the RX FIFO cannot overrun
        if ((sr & SPI_FLAG_TXE) && txCount < Size && (uint16_t)(txCount - rxCount) < fifoDepth)
        {
            *(__IO uint8_t *)&SPIx->DR = *pTxData++;
            txCount++;
        }
        // Check if RX FIFO has data to be read
        if (sr & SPI_FLAG_RXNE)
        {
            // Read the data
            *pRxData++ = *(__IO uint8_t *)&SPIx->DR;
            rxCount++;
        }
    }
    // Wait until the last data has been sent out before disabling the SPI peripheral
    while (SPIx->SR & SPI_FLAG_BSY);
    // Disable the SPI peripheral
    __HAL_SPI_DISABLE(hspi);
}

_alaBaster · ‎2024-09-30

Hi @TIvan.1

Did you find a way to use SPI_TransmitReceive with no gaps between bytes, meaning no dead intervals between clock bursts? I have a peripheral that needs such continuous framing.

Thanks, bye

_alaBaster · ‎2024-09-30

After some quick trials, I can answer my own question:

One way to get uninterrupted SPI clock cycles during a write is to use DMA, i.e.:

// Transmit the register address and data

HAL_SPI_Transmit_DMA(&hspi2, &txData, 2);

For reads, I intended to use HAL_SPI_TransmitReceive_DMA, but with no success.

I then had to split the two phases (sending the register address and then receiving the value) into two separate calls:

// Transmit the register address (read mode)
HAL_StatusTypeDef statusTX = HAL_SPI_Transmit_DMA(&hspi2, &txData, 1);

// Receive the register data
HAL_StatusTypeDef statusRX = HAL_SPI_Receive_DMA(&hspi2, &rxData, 1);

I should add that I am interfacing to an RTC by Micro-Crystal.

Does anyone know how to use HAL_SPI_TransmitReceive_DMA instead of splitting the transfer into two calls?

Best regards
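One way this is commonly handled, sketched here under assumptions: treat the register read as a single 2-byte full-duplex frame, where the first byte carries the address and the second is a dummy that only generates the clocks for the reply. The helper names rtc_build_read_frame and rtc_extract_value are made up, and RTC_READ_FLAG (read bit in the MSB) is a guess that has to be checked against the RTC's datasheet. HAL_SPI_TransmitReceive_DMA also needs both the TX and the RX DMA channels configured for that SPI.

```c
#include <stdint.h>

#define RTC_READ_FLAG 0x80u  /* assumption: MSB of the first byte marks a read */

/* Fills a 2-byte TX frame: [register address | read flag, dummy]. */
void rtc_build_read_frame(uint8_t reg, uint8_t tx[2])
{
    tx[0] = (uint8_t)(reg | RTC_READ_FLAG);
    tx[1] = 0x00u;   /* dummy byte, only there to generate 8 more clocks */
}

/* The register value is the byte clocked in during the dummy byte;
 * rx[0] was received while the address was still going out. */
uint8_t rtc_extract_value(const uint8_t rx[2])
{
    return rx[1];
}

/* On the target (not compiled here), the frame would go out in one call:
 *   static uint8_t tx[2], rx[2];        // must stay valid during the DMA transfer
 *   rtc_build_read_frame(reg, tx);
 *   HAL_SPI_TransmitReceive_DMA(&hspi2, tx, rx, 2);
 *   // ...then use rtc_extract_value(rx) in HAL_SPI_TxRxCpltCallback()
 */
```

The point of the design is that both buffers have the same length as the whole frame, so the DMA clocks address and data back to back with chip select held low, and the byte received during the address phase is simply discarded.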