Do not use HAL_SPI_Transmit() for high performance (DMA may not be as necessary)

RhSilicon
Lead

Hi,

I'm using an STM32F407VGT6 at 168MHz, with SPI3 clocked at 21MHz (SPI1, which can do up to 42MHz, is in use elsewhere).

I ran some tests with a 2.4" ILI9341 SPI display (320x240 pixels): filling the whole screen using HAL_SPI_Transmit() takes 191ms.

But if you write to the SPI data register directly and do only what is necessary (bare metal?), the time is 58ms.

Is that a 3.3x performance improvement? (191/58 ≈ 3.29, so throughput is about 330% of the HAL version.)
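As a rough cross-check of these numbers, the sketch below computes the speedup ratio and the shortest possible fill time at a given SPI clock. It assumes 16 bits per pixel (RGB565) sent back to back and ignores the address-window command; the function names are made up for illustration.

```c
#include <stdint.h>

/* Speedup as a percentage of the old throughput, e.g. 329 means 3.29x.
 * Integer division truncates the result. */
uint32_t speedup_percent(uint32_t old_ms, uint32_t new_ms) {
    return (old_ms * 100u) / new_ms;
}

/* Shortest possible time, in microseconds, to clock out width*height
 * pixels of 16 bits each at spi_hz, assuming no gaps between frames. */
uint32_t min_fill_time_us(uint32_t width, uint32_t height, uint32_t spi_hz) {
    uint64_t bits = (uint64_t)width * height * 16u;
    return (uint32_t)((bits * 1000000ull) / spi_hz);
}
```

For 320x240 pixels at 21MHz this works out to about 58.5ms, which suggests the measured 58ms is already very close to what the 21MHz clock allows.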

(I tried to switch from 8-bit to 16-bit mode on the fly, but I haven't managed it yet, so I don't know whether the 58ms would go down.)
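For reference, a minimal sketch of how the frame size might be switched on an F4-style SPI peripheral. According to RM0090, the DFF bit in SPI_CR1 should only be written while the SPI is disabled (SPE=0), which may be why changing it in the middle of a transfer does not work. The register struct and the DEMO_ names below are hypothetical stand-ins so the logic can be compiled outside a project; in real code the equivalents would be SPI_TypeDef and the CMSIS bit macros (SPI_CR1_SPE, SPI_CR1_DFF, SPI_SR_TXE, SPI_SR_BSY).

```c
#include <stdint.h>

/* Hypothetical stand-in for the first four SPI registers (same order as on the F4). */
typedef struct {
    volatile uint32_t CR1;
    volatile uint32_t CR2;
    volatile uint32_t SR;
    volatile uint32_t DR;
} demo_spi_t;

#define DEMO_CR1_SPE (1u << 6)   /* SPI enable */
#define DEMO_CR1_DFF (1u << 11)  /* data frame format: 0 = 8-bit, 1 = 16-bit */
#define DEMO_SR_TXE  (1u << 1)   /* transmit buffer empty */
#define DEMO_SR_BSY  (1u << 7)   /* transfer in progress */

/* Switch between 8-bit and 16-bit frames while the bus is idle. */
void demo_spi_set_16bit(demo_spi_t *spi, int enable16) {
    /* Let any pending frame finish first: TXE=1, then BSY=0. */
    while (!(spi->SR & DEMO_SR_TXE)) { }
    while (spi->SR & DEMO_SR_BSY) { }

    spi->CR1 &= ~DEMO_CR1_SPE;       /* disable the SPI before touching DFF */
    if (enable16) {
        spi->CR1 |= DEMO_CR1_DFF;
    } else {
        spi->CR1 &= ~DEMO_CR1_DFF;
    }
    spi->CR1 |= DEMO_CR1_SPE;        /* re-enable with the new frame size */
}
```

With MSB-first 16-bit frames the color could then be written to DR as a single 16-bit value per pixel. Whether that lowers the 58ms depends on how much idle time is left between frames.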

 

 

 

#define I_don_t_want_to_lose_performance_using_HAL 1

void ILI9341_FillRect(uint16_t x, uint16_t y, uint16_t w, uint16_t h,
		uint16_t color) {
	// clipping
	if ((x >= ILI9341_WIDTH) || (y >= ILI9341_HEIGHT))
		return;
	if ((x + w - 1) >= ILI9341_WIDTH)
		w = ILI9341_WIDTH - x;
	if ((y + h - 1) >= ILI9341_HEIGHT)
		h = ILI9341_HEIGHT - y;

	ILI9341_Select();
	ILI9341_SetAddressWindow(x, y, x + w - 1, y + h - 1);

	uint8_t data[] = { color >> 8, color & 0xFF };

	HAL_GPIO_WritePin(ILI9341_DC_GPIO_Port, ILI9341_DC_Pin, GPIO_PIN_SET);

	for (y = h; y > 0; y--) {
		for (x = w; x > 0; x--) {

#if I_don_t_want_to_lose_performance_using_HAL != 1
			// HAL path: one blocking call per pixel (2 bytes)
			HAL_SPI_Transmit(&ILI9341_SPI_PORT, data, sizeof(data),
					HAL_MAX_DELAY);
#else
			// Direct path: write the high byte to DR with an 8-bit access,
			// then wait until the TX buffer is empty again
			*((__IO uint8_t*) &ILI9341_SPI_PORT.Instance->DR) = data[0];

			while(!__HAL_SPI_GET_FLAG(&ILI9341_SPI_PORT, SPI_FLAG_TXE));

			// Same for the low byte
			*((__IO uint8_t*) &ILI9341_SPI_PORT.Instance->DR) = data[1];

			while(!__HAL_SPI_GET_FLAG(&ILI9341_SPI_PORT, SPI_FLAG_TXE));
#endif
		}
	}

	ILI9341_Unselect();
}

 

 

It might be worth reviewing the STM32 libraries that use SPI, such as the display ones, for the same overhead.

 Original full library here

11 REPLIES

I guess you're right. I really thought it could be 32 bits on the F4, but maybe I've been using the H7 too much lately. I guess you could chain the second transfer off the TC trigger without too much trouble or delay.

If you feel a post has answered your question, please click "Accept as Solution".
TDK
Guru

What you wrote will probably work okay if the SPI clock rate is high, but TXE=1 does not mean the transmission is complete. You should wait for BSY=0 after the final transfer before you pull CS high.
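A minimal sketch of that ordering, assuming an F4-style SPI in 8-bit mode: wait for TXE before each write, and after the last byte wait for TXE=1 and then BSY=0 so that the final frame has fully left the shift register before CS goes high. The register struct and DEMO_ names are hypothetical stand-ins so it compiles outside a project; real code would use the SPI instance and the HAL/CMSIS flag macros.

```c
#include <stdint.h>

/* Hypothetical stand-in for the first four SPI registers (same order as on the F4). */
typedef struct {
    volatile uint32_t CR1;
    volatile uint32_t CR2;
    volatile uint32_t SR;
    volatile uint32_t DR;
} demo_spi_t;

#define DEMO_SR_TXE (1u << 1)  /* transmit buffer empty */
#define DEMO_SR_BSY (1u << 7)  /* transfer in progress */

/* Send `count` pixels of the same RGB565 color, high byte first. */
void demo_spi_send_pixels(demo_spi_t *spi, uint16_t color, uint32_t count) {
    const uint8_t hi = (uint8_t)(color >> 8);
    const uint8_t lo = (uint8_t)(color & 0xFFu);

    while (count--) {
        while (!(spi->SR & DEMO_SR_TXE)) { }    /* room in the TX buffer? */
        *(volatile uint8_t *)&spi->DR = hi;     /* 8-bit access, as in the original code */
        while (!(spi->SR & DEMO_SR_TXE)) { }
        *(volatile uint8_t *)&spi->DR = lo;
    }

    /* TXE=1 only means the buffer is empty; the last byte may still be
     * shifting out. Wait for BSY=0 before the caller releases CS. */
    while (!(spi->SR & DEMO_SR_TXE)) { }
    while (spi->SR & DEMO_SR_BSY) { }
}
```

The caller would then raise CS (ILI9341_Unselect() in the code above) only after this function returns.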

 

[Attached image: TDK_0-1688046392829.png]

 
