
NUCLEO-F446ZE SPI Slave (using DMA) erroneous BSY flag. Is this a known issue?

sameerazer
Associate II

Hello there,

I have the following setup:

NUCLEO-F446ZE SPI 1 is a SPI master connected to an external ADC device.

NUCLEO-F446ZE SPI 4 is a SPI slave connected to an external MCU, which I will refer to going forward as SOM.

For debugging purposes, I have two GPIO outputs:

SPI_COMM_1 is toggled at the beginning and end of SPI_CheckFlag_BSY(). This GPIO is labeled as Check BSY in the logic analyzer screenshots below.

SPI_COMM_2 is toggled within HAL_SPI_ErrorCallback(). This GPIO triggers my logic analyzer capture, and it's labeled as SPI Error in the logic analyzer screenshots below.

The behavior I'm observing is that the SPI_CheckFlag_BSY() function will randomly return with an error (timeout) after a transaction on SPI 4 (slave). I'm using the HAL_SPI_TransmitReceive_DMA() function to set up the SPI 4 transfers. All peripherals were set up using CubeMX (including FreeRTOS).

As the screenshots below indicate, neither SPI 1 nor SPI 4 is busy with any transfers, yet SPI_CheckFlag_BSY() continues to busy-wait for roughly 100 milliseconds before returning with a timeout error. Most of the time, it busy-waits for only 1.5 to 41.5 microseconds.

Here's the code I modified to track the behavior of SPI_CheckFlag_BSY():

static HAL_StatusTypeDef SPI_CheckFlag_BSY(SPI_HandleTypeDef *hspi, uint32_t Timeout, uint32_t Tickstart)
{
  /* Control the BSY flag */
  HAL_GPIO_TogglePin(SPI_COMM_1_GPIO_Port, SPI_COMM_1_Pin);
  if(SPI_WaitFlagStateUntilTimeout(hspi, SPI_FLAG_BSY, RESET, Timeout, Tickstart) != HAL_OK)
  {
    SET_BIT(hspi->ErrorCode, HAL_SPI_ERROR_FLAG);
    HAL_GPIO_TogglePin(SPI_COMM_1_GPIO_Port, SPI_COMM_1_Pin);
    return HAL_TIMEOUT;
  }
  HAL_GPIO_TogglePin(SPI_COMM_1_GPIO_Port, SPI_COMM_1_Pin);
  return HAL_OK;
}

Overall capture (noting SOM SPI signals on top, ADC SPI signals on bottom, and debug signals in the middle, with SPI Error acting as the trigger for the logic capture).

0690X000006COQaQAO.png

Zooming in on a typical wait (1.5 microseconds)

0690X000006COQfQAO.png

Zooming in on the error frame (note the 100 millisecond wait as well as the lack of activity on both the SPI 1 and SPI 4 buses)

0690X000006COQkQAO.png

Please advise. Thank you!

6 REPLIES

Yes, it's a known - and very unpleasant - hardware bug in the SPI module:

0690X000006COSbQAO.png

JW

Thank you!

Are there any known workarounds to this problem?

Workarounds are described in the Errata document. It's probably up to you to implement any known workaround in the HAL code.

I see. Yeah, I read the workarounds in the document shown above. But as you predicted, I was hoping for a HAL-specific workaround.

I personally would go for a precise timer-based timeout. In case of CPHA=1 it would be enough to wait until the last RXNE, but it doesn't appear to be your case.
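One possible reading of the timer-based timeout, as a minimal sketch rather than a tested workaround: poll BSY against a short, precisely bounded budget measured on a free-running counter, and when the budget expires treat the bus as idle instead of raising an error. The names bits_to_ticks, spi_wait_idle_bounded and spi_idle_result are hypothetical and not part of the HAL. The sketch assumes the counter runs at a known frequency (for example the Cortex-M4 DWT cycle counter, which has to be enabled first) and that the master's SCK frequency is known.

```c
#include <stdint.h>

/* Hypothetical result type: both outcomes let the caller continue. */
typedef enum {
    SPI_IDLE_CONFIRMED, /* BSY was seen cleared within the budget */
    SPI_IDLE_ASSUMED    /* budget elapsed with BSY still set */
} spi_idle_result;

/* Convert a number of SPI bit periods into counter ticks, rounding up.
 * sck_hz must be non-zero. */
static uint32_t bits_to_ticks(uint32_t bits, uint32_t counter_hz, uint32_t sck_hz)
{
    uint64_t num = (uint64_t)bits * counter_hz + (sck_hz - 1u);
    return (uint32_t)(num / sck_hz);
}

/* Poll the BSY bit for at most budget_ticks counter ticks.
 *
 * counter:    free-running up-counter (assumption: e.g. &DWT->CYCCNT)
 * status_reg: SPI status register     (assumption: e.g. &SPI4->SR)
 * bsy_mask:   mask of the BSY bit     (assumption: e.g. SPI_SR_BSY)
 */
static spi_idle_result spi_wait_idle_bounded(volatile const uint32_t *counter,
                                             volatile const uint32_t *status_reg,
                                             uint32_t bsy_mask,
                                             uint32_t budget_ticks)
{
    uint32_t start = *counter;

    /* Unsigned subtraction stays correct across one counter wrap-around. */
    while ((uint32_t)(*counter - start) < budget_ticks) {
        if ((*status_reg & bsy_mask) == 0u) {
            return SPI_IDLE_CONFIRMED;
        }
    }
    return SPI_IDLE_ASSUMED;
}
```

On the target, a call might look like spi_wait_idle_bounded(&DWT->CYCCNT, &SPI4->SR, SPI_SR_BSY, bits_to_ticks(2, SystemCoreClock, sck_hz)), where sck_hz and the two-bit budget are placeholders to be chosen from the actual master clock and the Errata document. Returning a distinct SPI_IDLE_ASSUMED value, rather than HAL_TIMEOUT, keeps the stuck flag from being reported as a transfer error while still letting the caller log how often it happens.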

It's unlikely there's a pre-chewed solution in Cube/HAL. Mind, any "library" deals only with typical usage modes, and as soon as you want something with a bit of extended functionality you'd either need to jump through hoops if you want to retain your HAL-ness, or simply roll your own.

JW

ST appears to have no interest in incorporating and maintaining errata workarounds in HAL code. Doing so would require coupling workarounds in the code to MCU revisions determined at run time. There's not enough testing and validation of the HAL anyway - adding errata coverage would certainly make heads explode.
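To illustrate what coupling a workaround to the MCU revision at run time could involve, here is a minimal sketch. It assumes the STM32F4 DBGMCU_IDCODE layout (DEV_ID in bits 11:0, REV_ID in bits 31:16). The names mcu_id, decode_idcode, affected_part and needs_workaround are hypothetical, and any table of affected revisions would have to be filled in from the Errata document; none are asserted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of DBGMCU_IDCODE (assumed STM32F4 layout). */
typedef struct {
    uint16_t dev_id; /* bits 11:0  */
    uint16_t rev_id; /* bits 31:16 */
} mcu_id;

/* Split a raw IDCODE value into device and revision identifiers. */
static mcu_id decode_idcode(uint32_t idcode)
{
    mcu_id id;
    id.dev_id = (uint16_t)(idcode & 0x0FFFu);
    id.rev_id = (uint16_t)(idcode >> 16);
    return id;
}

/* Hypothetical table entry: one (device, revision) pair that needs
 * the workaround. The contents must come from the errata sheet. */
typedef struct {
    uint16_t dev_id;
    uint16_t rev_id;
} affected_part;

/* Return 1 if the running part appears in the table, 0 otherwise. */
static int needs_workaround(mcu_id id, const affected_part *table, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (table[i].dev_id == id.dev_id && table[i].rev_id == id.rev_id) {
            return 1;
        }
    }
    return 0;
}
```

On the target, the raw value would come from DBGMCU->IDCODE, or the two fields can be read separately with HAL_GetDEVID() and HAL_GetREVID(). Every table of this kind has to be maintained and validated per errata item and per silicon revision, which is the testing burden described above.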