PHolt.1
Senior
January 11, 2022
Question

32F417 SPI running at half the speed it should

  • January 11, 2022
  • 24 replies
  • 4087 views

Hello,

I am using a chopped-down function from the Cube HAL code to talk to a serial FLASH. The original function has a mass of options which were checked at runtime - the usual thing for HAL code...

It's all working fine. SPI2, 21MHz (the fastest I can go with PCLK1=PCLK2=42MHz).

But when I time it by waggling a pin, a 512-byte read takes 600us. At 21MHz the clocking alone should take only about 200us (512 x 8 bits / 21MHz is roughly 195us). Can anyone see anything obviously wrong with this code? As I say, it works perfectly.
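As a quick sanity check on the 200us figure: the lower bound is simply the number of bits divided by the SCK rate. A minimal sketch, where the function name is made up for illustration and software gaps between bytes are ignored:

```c
#include <stdint.h>

/* Minimum time on the wire, in nanoseconds, for n_bytes of 8-bit SPI frames
 * clocked back to back at sck_hz. Ignores any idle time between bytes.
 * (Hypothetical helper, not part of the HAL.) */
static uint64_t spi_min_transfer_ns(uint32_t n_bytes, uint32_t sck_hz)
{
    return ((uint64_t)n_bytes * 8u * 1000000000u) / sck_hz;
}
```

For 512 bytes at 21MHz this gives roughly 195us, so a measured 600us means about two thirds of the time is spent with the clock idle.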

// Used only for SPI2. Mode is always SPI_MODE_MASTER and 2LINE.
 
//__attribute__((optimize("O0")))
HAL_StatusTypeDef B_HAL_SPI_TransmitReceive(SPI_HandleTypeDef *hspi, uint8_t *pTxData, uint8_t *pRxData, uint16_t Size)
{
 uint16_t initial_TxXferCount;
 uint32_t tmp_mode;
 HAL_SPI_StateTypeDef tmp_state;
 
 /* Variable used to alternate Rx and Tx during transfer */
 uint32_t txallowed = 1U;
 HAL_StatusTypeDef errorcode = HAL_OK;
 
 /* Init temporary variables */
 tmp_state = hspi->State;
 tmp_mode = hspi->Init.Mode;
 initial_TxXferCount = Size;
 
 if (!((tmp_state == HAL_SPI_STATE_READY) || \
 ((tmp_mode == SPI_MODE_MASTER) && (hspi->Init.Direction == SPI_DIRECTION_2LINES) && (tmp_state == HAL_SPI_STATE_BUSY_RX))))
 {
 errorcode = HAL_BUSY;
 goto error;
 }
 
 if ((pTxData == NULL) || (pRxData == NULL) || (Size == 0U))
 {
 errorcode = HAL_ERROR;
 goto error;
 }
 
 /* Don't overwrite in case of HAL_SPI_STATE_BUSY_RX */
 if (hspi->State != HAL_SPI_STATE_BUSY_RX)
 {
 hspi->State = HAL_SPI_STATE_BUSY_TX_RX;
 }
 
 /* Set the transaction information */
 hspi->ErrorCode = HAL_SPI_ERROR_NONE;
 hspi->pRxBuffPtr = (uint8_t *)pRxData;
 hspi->RxXferCount = Size;
 hspi->RxXferSize = Size;
 hspi->pTxBuffPtr = (uint8_t *)pTxData;
 hspi->TxXferCount = Size;
 hspi->TxXferSize = Size;
 
 /*Init field not used in handle to zero */
 hspi->RxISR = NULL;
 hspi->TxISR = NULL;
 
 /* Check if the SPI is already enabled */
 if ((hspi->Instance->CR1 & SPI_CR1_SPE) != SPI_CR1_SPE)
 {
 __HAL_SPI_ENABLE(hspi);
 }
 
 	/* Transmit and Receive data in 8 Bit mode */
 
 // The need for this initial byte is unknown
 if (initial_TxXferCount == 0x01U)
 {
 *((__IO uint8_t *)&hspi->Instance->DR) = (*hspi->pTxBuffPtr);
 hspi->pTxBuffPtr++;
 hspi->TxXferCount--;
 }
 
 while ((hspi->TxXferCount > 0U) || (hspi->RxXferCount > 0U))
 {
 /* Check TXE flag */
 if ((__HAL_SPI_GET_FLAG(hspi, SPI_FLAG_TXE)) && (hspi->TxXferCount > 0U) && (txallowed == 1U))
 {
 *(__IO uint8_t *)&hspi->Instance->DR = (*hspi->pTxBuffPtr);
 hspi->pTxBuffPtr++;
 hspi->TxXferCount--;
 /* Next Data is a reception (Rx). Tx not allowed */
 txallowed = 0U;
 }
 
 /* Check RXNE flag: a received byte is waiting in DR */
 if ((__HAL_SPI_GET_FLAG(hspi, SPI_FLAG_RXNE)) && (hspi->RxXferCount > 0U))
 {
 (*(uint8_t *)hspi->pRxBuffPtr) = hspi->Instance->DR;
 hspi->pRxBuffPtr++;
 hspi->RxXferCount--;
 /* Next Data is a Transmission (Tx). Tx is allowed */
 txallowed = 1U;
 }
 
 }
 
 // Clear overrun flag in 2 Lines communication mode because received is not read
 // For 45DBxx the Init.Direction is always SPI_DIRECTION_2LINES and is set up in b_main.c
 //if (hspi->Init.Direction == SPI_DIRECTION_2LINES)
 //{
 	__HAL_SPI_CLEAR_OVRFLAG(hspi);
 //}
 
error :
 hspi->State = HAL_SPI_STATE_READY;
 return errorcode;
}



waclawek.jan
Super User
January 12, 2022

You can easily rule out occasional lengthy delays caused by interrupts by writing a simple test program which does NOTHING but dump 512 bytes through SPI. It does not need to be meaningful for the sFLASH itself; you just observe the bus/timing.

JW

TDK
January 12, 2022

If you've measured it on a scope and there is no delay between bytes, and the rate is 21 MHz, then explain how you're "seeing 600us" transaction time. Those two facts are at odds and cannot both be true.

If you feel a post has answered your question, please click "Accept as Solution".
PHolt.1
Senior
January 12, 2022

OK here we go. Spot on: the gaps between the clock bursts take up almost exactly 2/3 of the time.

The zoomed X axis is 200ns/div.

The waveform is ringing because I am probing the board without decent scope grounds.

Ignore other traces - they are irrelevant. Just the clock.

So what the hell is it waiting for during those 800ns? 2 whole byte periods.

[Attached image: 0693W00000HrQnTQAV.png]

To recap, this is the code

[Attached image: 0693W00000HrQxmQAF.png]

The title of the thread should say 1/3 not 1/2.
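The scope reading and the pin-toggle measurement are consistent with each other. A rough check, assuming every byte costs one 8-bit burst at 21MHz plus the roughly 800ns gap read off the scope (both helper names are hypothetical):

```c
#include <stdint.h>

/* Duration of one 8-bit burst at sck_hz, in nanoseconds.
 * Integer division truncates, so 21MHz gives 380 rather than 380.95. */
static uint32_t byte_time_ns(uint32_t sck_hz)
{
    return (uint32_t)((8ull * 1000000000ull) / sck_hz);
}

/* Total transfer time when every byte is followed by a fixed idle gap. */
static uint64_t lockstep_total_ns(uint32_t n_bytes, uint32_t sck_hz, uint32_t gap_ns)
{
    return (uint64_t)n_bytes * (byte_time_ns(sck_hz) + gap_ns);
}
```

With 512 bytes, 21MHz and an 800ns gap this comes to about 604us, close to the measured 600us, and the gap is a little over two byte periods.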

TDK
January 12, 2022

As we've been saying all along, the code cannot keep up with the data rate. Use DMA, higher optimization settings, or your own code to overcome the issue.

800ns is not a ton of time in terms of cycles. Did you specify your CPU clock rate anywhere in this post?
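To put a number on that cycle budget, here is a small sketch assuming the 168MHz core clock the author quotes in the thread (the helper name is made up for illustration):

```c
#include <stdint.h>

/* Convert a duration in nanoseconds to whole CPU cycles at cpu_hz.
 * The result is truncated toward zero. (Hypothetical helper.) */
static uint32_t ns_to_cpu_cycles(uint32_t ns, uint32_t cpu_hz)
{
    return (uint32_t)(((uint64_t)ns * cpu_hz) / 1000000000ull);
}
```

At 168MHz, 800ns is about 134 cycles and one byte period at 21MHz (about 381ns) is about 64 cycles, so the polling loop has very little headroom per byte.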
PHolt.1
Senior
January 12, 2022

168MHz.

No, there is something in that code which is waiting exactly two byte periods.

Some timing issue, but I can't see it.

Maybe one of the two SPI flags (TXE, RXNE) is not indicating the status until something else happens.

An example of this, not relevant here, is that you cannot (with most UARTs) get a "TX buffer empty" interrupt unless the TX buffer becomes empty, so you have to load a byte in there first (often that is done by a timer ISR).

waclawek.jan
Super User
January 12, 2022

> No, there is something in that code which is waiting exactly two byte periods.

Or executing the code takes exactly two byte periods.

Have you read out and checked the SPI registers content? Do you have interrupts enabled on that SPI?
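One way to act on this is to dump CR1 and CR2 from a debugger and decode them by hand. The sketch below decodes just two things: the SCK rate implied by the BR field, and whether any SPI interrupt enable bit is set. The bit positions are as I understand them from the STM32F4 reference manual and should be checked against it; the function and macro names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions for SPI_CR1 / SPI_CR2 on STM32F4 (verify in the RM). */
#define CR1_BR_SHIFT  3u           /* BR[2:0] occupies bits 5:3 */
#define CR1_BR_MASK   0x7u
#define CR2_ERRIE     (1u << 5)    /* error interrupt enable */
#define CR2_RXNEIE    (1u << 6)    /* RXNE interrupt enable  */
#define CR2_TXEIE     (1u << 7)    /* TXE interrupt enable   */

/* SCK = PCLK / 2^(BR+1), so BR = 0 means PCLK/2. */
static uint32_t spi_sck_hz(uint32_t cr1, uint32_t pclk_hz)
{
    uint32_t br = (cr1 >> CR1_BR_SHIFT) & CR1_BR_MASK;
    return pclk_hz >> (br + 1u);
}

/* True if any of the three SPI interrupt enables is set in CR2. */
static bool spi_any_irq_enabled(uint32_t cr2)
{
    return (cr2 & (CR2_ERRIE | CR2_RXNEIE | CR2_TXEIE)) != 0u;
}
```

With PCLK1 at 42MHz and BR = 0 this should report 21MHz; any other result would mean the prescaler is not what the source code suggests.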

JW

PHolt.1
Senior
January 12, 2022

Config function:

 
// Stripped down copy of this function
// Must not enable interrupts!
// Used only for SPI2
 
static void B_HAL_SPI_Init(SPI_HandleTypeDef *hspi)
{
 
	/* Init the low level hardware : GPIO, CLOCK, NVIC... */
	// HAL_SPI_MspInit(hspi);
 
	GPIO_InitTypeDef GPIO_InitStruct = {0};
 
	/* Peripheral clock enable */
	__HAL_RCC_SPI2_CLK_ENABLE();
	__HAL_RCC_GPIOC_CLK_ENABLE();
	__HAL_RCC_GPIOB_CLK_ENABLE();
 
	/**SPI2 GPIO Configuration
	PC2 ------> SPI2_MISO
	PC3 ------> SPI2_MOSI
	PB10 ------> SPI2_SCK
	*/
	GPIO_InitStruct.Pin = GPIO_PIN_2|GPIO_PIN_3;
	GPIO_InitStruct.Mode = GPIO_MODE_AF_PP;
	GPIO_InitStruct.Pull = GPIO_NOPULL;
	GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_VERY_HIGH;
	GPIO_InitStruct.Alternate = GPIO_AF5_SPI2;
	B_HAL_GPIO_Init(GPIOC, &GPIO_InitStruct);
 
	GPIO_InitStruct.Pin = GPIO_PIN_10;
	GPIO_InitStruct.Mode = GPIO_MODE_AF_PP;
	GPIO_InitStruct.Pull = GPIO_NOPULL;
	GPIO_InitStruct.Speed = GPIO_SPEED_FREQ_VERY_HIGH;
	GPIO_InitStruct.Alternate = GPIO_AF5_SPI2;
	B_HAL_GPIO_Init(GPIOB, &GPIO_InitStruct);
 
	hspi->State = HAL_SPI_STATE_BUSY;
 
	/* Disable the selected SPI peripheral */
	__HAL_SPI_DISABLE(hspi);
 
	/*----------------------- SPIx CR1 & CR2 Configuration ---------------------*/
	/* Configure : SPI Mode, Communication Mode, Data size, Clock polarity and phase, NSS management,
 	 Communication speed, First bit and CRC calculation state */
	WRITE_REG(hspi->Instance->CR1, (hspi->Init.Mode | hspi->Init.Direction | hspi->Init.DataSize |
 hspi->Init.CLKPolarity | hspi->Init.CLKPhase | (hspi->Init.NSS & SPI_CR1_SSM) |
 hspi->Init.BaudRatePrescaler | hspi->Init.FirstBit | hspi->Init.CRCCalculation));
 
	/* Configure : NSS management, TI Mode */
	WRITE_REG(hspi->Instance->CR2, (((hspi->Init.NSS >> 16U) & SPI_CR2_SSOE) | hspi->Init.TIMode));
 
	#if defined(SPI_I2SCFGR_I2SMOD)
	/* Activate the SPI mode (Make sure that I2SMOD bit in I2SCFGR register is reset) */
	CLEAR_BIT(hspi->Instance->I2SCFGR, SPI_I2SCFGR_I2SMOD);
	#endif /* SPI_I2SCFGR_I2SMOD */
 
	hspi->ErrorCode = HAL_SPI_ERROR_NONE;
	hspi->State = HAL_SPI_STATE_READY;
 
}

and this is the config:

// Config SPI2 for serial FLASH. This SPI channel runs as fast as it can.
// Note that the mode is always SPI_DIRECTION_2LINES even if a tx-only function is used e.g.
// B_HAL_SPI_Transmit.
 
static void KDE_Init_SPI2(void)
{
	hspi2.Instance = SPI2;
	hspi2.Init.Mode = SPI_MODE_MASTER;
	hspi2.Init.Direction = SPI_DIRECTION_2LINES;
	hspi2.Init.DataSize = SPI_DATASIZE_8BIT;
	hspi2.Init.CLKPolarity = SPI_POLARITY_LOW;
	hspi2.Init.CLKPhase = SPI_PHASE_1EDGE;
	hspi2.Init.NSS = SPI_NSS_SOFT;
	hspi2.Init.BaudRatePrescaler = SPI_BAUDRATEPRESCALER_2; // 21MHz; the max possible on SPI2
	hspi2.Init.FirstBit = SPI_FIRSTBIT_MSB;
	hspi2.Init.TIMode = SPI_TIMODE_DISABLE;
	hspi2.Init.CRCCalculation = SPI_CRCCALCULATION_DISABLE;
	hspi2.Init.CRCPolynomial = 10;
	B_HAL_SPI_Init(&hspi2);
}
 

No interrupts used in this code.

waclawek.jan
Super User
January 12, 2022

The MCU does not execute source code; what matters is what actually ends up in its registers.

Read out and check the relevant registers.

Write a simple test program which does NOTHING but dump 512 bytes through SPI. It does not need to be meaningful for the sFLASH itself, you just observe the bus/timing.

Insert the pin toggling into places within the Tx/Rx routine to see progress there.
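A minimal sketch of such pin-toggle probes follows. It assumes a GPIO bit set/reset register in which the low 16 bits set pins and the high 16 bits reset them, as on STM32F4; the `probe_pin_t` type and both functions are hypothetical, and on target the pointer would be the address of the chosen port's BSRR.

```c
#include <stdint.h>

/* Hypothetical probe handle: address of a GPIO bit set/reset register
 * plus the mask of the pin used as a scope marker. */
typedef struct {
    volatile uint32_t *bsrr;
    uint16_t pin_mask;
} probe_pin_t;

/* Drive the probe pin high: a write to the low half sets the pin. */
static inline void probe_high(const probe_pin_t *p)
{
    *p->bsrr = p->pin_mask;
}

/* Drive the probe pin low: a write to the high half resets the pin. */
static inline void probe_low(const probe_pin_t *p)
{
    *p->bsrr = (uint32_t)p->pin_mask << 16;
}
```

Calling `probe_high()` just before the write to DR and `probe_low()` just after the read from DR would show on the scope how much of each 800ns gap is spent between the two.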

Ultimately you want to use DMA. Not that it's without problems.

JW

Tesla DeLorean
Guru
January 12, 2022

Why can't Tx and Rx occur concurrently on symmetrical SPI bus?

Tips, Buy me a coffee, or three.. PayPal Venmo (See Profile) Up vote any posts that you find helpful, it shows what's working..
PHolt.1
Senior
January 12, 2022

It always does happen concurrently. You cannot receive a byte without sending one because it is only sending one (I am using only Master mode) that generates the clocks.

DMA is ****** complicated (I've done it for the waveform generator project and it worked only after somebody spent a fair bit of time on it) and it will certainly not work if a software loop like this cannot be made to work right.

BTW is there a config on this forum which avoids the "read more" at the bottom?

It would amaze me if ints were enabled because the NULL for the ISRs would crash the whole thing. Anyway, I did a tight loop at the start of the startup code, after SPI2 is initialised, and get the same thing.

Tesla DeLorean
Guru
January 12, 2022

But you can push data in the transmit buffer, when it's empty, and it will generate the clocks to pull out the data on the backside.

DMA isn't going to enforce your TX-ALLOWED, it will provide TWO channels, one that will feed when TXE signals, and one that will consume when RXNE signals.
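A polling loop can do the same thing in software: refill the transmit buffer whenever TXE is set rather than waiting for the previous byte to come back. The sketch below is not the HAL routine. It uses a hypothetical two-register struct instead of the real `SPI_TypeDef`, and it only works if the loop services RXNE within one byte period; otherwise the peripheral will flag an overrun, which is presumably what the `txallowed` interlock in the original code guards against.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-register view of an SPI block; NOT the real SPI_TypeDef layout. */
typedef struct {
    volatile uint32_t SR;
    volatile uint32_t DR;
} spi_regs_t;

#define SR_RXNE (1u << 0)   /* receive buffer not empty */
#define SR_TXE  (1u << 1)   /* transmit buffer empty    */

/* Full-duplex 8-bit transfer that keeps up to two bytes outstanding
 * (one in the shift register, one in the TX buffer). */
static void spi_xfer_pipelined(spi_regs_t *spi, const uint8_t *tx, uint8_t *rx, size_t n)
{
    size_t tx_left = n;
    size_t rx_left = n;

    while (rx_left > 0u) {
        /* Bytes written to DR whose reply has not been read back yet. */
        size_t in_flight = rx_left - tx_left;

        if ((tx_left > 0u) && (in_flight < 2u) && ((spi->SR & SR_TXE) != 0u)) {
            spi->DR = *tx++;
            tx_left--;
            in_flight++;
        }
        if ((in_flight > 0u) && ((spi->SR & SR_RXNE) != 0u)) {
            *rx++ = (uint8_t)spi->DR;
            rx_left--;
        }
    }
}
```

On a host, the struct can be filled with both flags set so that DR behaves as a loopback. On real hardware the gaps should shrink only as far as the loop's own execution time allows, which is why DMA keeps coming up in this thread.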

>>BTW is there a config on this forum which avoids the "read more" at the bottom?

No, they have this worthless software for SalesForce that was written by a 3 year old, a dumber one,..
