
STM32H7xx - Serious Bug in SPI Driver HAL Version 1.3.0, 1.7.0 and 1.8.0

Dub Bartolec
Associate III

We were experiencing strange crashes originating from HAL_SPI_TransmitReceive and HAL_SPI_Receive when operating in SPI Slave mode.

The issue is that the STM32 HAL code does not handle 8-bit SPI transfers correctly, which overflows user buffers and corrupts the stack or heap.

Here is a simplified example of the code:

uint8_t txData[4];
uint8_t rxData[4];

HAL_SPI_TransmitReceive(hspi, txData, rxData, 4, HAL_MAX_DELAY);

On some occasions the function increments the rxData pointer past the end of the 4-byte buffer and copies SPI data into memory that was never meant to be written. In the case above, the stack is overwritten.

The bug is in HAL_SPI_TransmitReceive() and HAL_SPI_Receive(), in the block that handles 8-bit transfers:

/* Receive data in 8 Bit mode */
  else
  {
    /* Transfer loop */
    while (hspi->RxXferCount > 0U)
    {
      /* Check the RXWNE/FRLVL flag */
      if (hspi->Instance->SR & (SPI_FLAG_RXWNE|SPI_FLAG_FRLVL))
      {
        if (hspi->Instance->SR & SPI_FLAG_RXWNE)
        {
          *((uint32_t *)hspi->pRxBuffPtr) = *((__IO uint32_t *)&hspi->Instance->RXDR);
          hspi->pRxBuffPtr += sizeof(uint32_t);
          hspi->RxXferCount-=4;
        }
        else if ((hspi->Instance->SR & SPI_FLAG_FRLVL) > SPI_FRLVL_QUARTER_FULL)
        {
          *((uint16_t *)hspi->pRxBuffPtr) = *((__IO uint16_t *)&hspi->Instance->RXDR);
          hspi->pRxBuffPtr += sizeof(uint16_t);
          hspi->RxXferCount-=2;
        }
        else
        {
          *((uint8_t *)hspi->pRxBuffPtr) = *((__IO uint8_t *)&hspi->Instance->RXDR);
          hspi->pRxBuffPtr += sizeof(uint8_t);
          hspi->RxXferCount--;
        }
      }
      else
      {
....
...
}
 

What actually happens is, for example, that we expect 4 bytes but the master sends more than 4, and by the time we poll the SPI FIFO we have:

  1. 1 byte in the FIFO.
  2. The code reads that byte and decrements RxXferCount, so its value is now 3.
  3. Another 4 bytes arrive in the SPI FIFO.
  4. The code now reads 4 bytes into the buffer, because the if (hspi->Instance->SR & SPI_FLAG_RXWNE) condition is true. Corruption has already occurred at this point, as the caller had only 3 bytes left in its request.
  5. The code then subtracts 4 from RxXferCount, which wraps around to 0xFFFF.
  6. The rest is in the hands of the master. If the master keeps sending data and no timeout is set, the function copies up to 0xFFFF bytes into RAM, causing catastrophic failure of the program.
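The sequence above can be modelled on a PC without any hardware. This is only a sketch under stated assumptions, not the HAL code and not an official fix: `remaining` stands in for RxXferCount (assumed to be a 16-bit unsigned counter, which is what the 0xFFFF wrap in step 5 implies), and `buggy_decrement`, `safe_chunk` and `guarded_total` are hypothetical helper names invented for illustration. The idea is to choose the access width from both the FIFO level and the number of bytes the caller still expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: 'remaining' plays the role of RxXferCount,
 * assumed here to be a 16-bit unsigned counter. */

/* Decrement as in the quoted loop: the chunk size is never checked
 * against 'remaining', so 3 - 4 wraps around to 0xFFFF. */
uint16_t buggy_decrement(uint16_t remaining, uint16_t chunk)
{
    return (uint16_t)(remaining - chunk);
}

/* Guarded alternative: pick the widest access (4, 2 or 1 bytes) that
 * the FIFO can supply AND that the caller still has room for. */
uint16_t safe_chunk(uint16_t remaining, uint16_t fifo_bytes)
{
    uint16_t avail = (fifo_bytes < remaining) ? fifo_bytes : remaining;

    if (avail >= 4U) {
        return 4U;
    }
    if (avail >= 2U) {
        return 2U;
    }
    return avail; /* 1 or 0 */
}

/* Simplified polling loop: fifo_levels[i] is the number of bytes the
 * FIFO holds at poll i. Returns the total number of bytes that would
 * be written to the user buffer. */
uint32_t guarded_total(uint16_t size, const uint16_t *fifo_levels, size_t polls)
{
    uint16_t remaining = size;
    uint32_t written = 0U;

    assert(fifo_levels != NULL);

    for (size_t i = 0U; i < polls && remaining > 0U; i++) {
        uint16_t n = safe_chunk(remaining, fifo_levels[i]);
        written += n;
        remaining = (uint16_t)(remaining - n);
    }
    return written;
}
```

With FIFO levels of 1, 4 and 4 bytes on three successive polls and a 4-byte request, the guarded loop writes exactly 4 bytes, whereas the unguarded decrement leaves the counter at 0xFFFF after the second poll.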

This is very easy to reproduce.

Can someone from the STM development team review this code and let us know what the next step should be?

14 REPLIES

Hi @Pavel A.​ 

As I have confirmed that the issue exists in 1.3.0, 1.7.0 and 1.8.0, I've updated the title of this thread.

Another side effect of this bug is that it can crash the MCU when the code tries to write a uint32_t value to an address that is not aligned.

I've also discovered that this bug can occur when HAL_SPI_Receive or HAL_SPI_TransmitReceive is called from an SPI Master.
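To illustrate the alignment point: once the receive pointer has advanced by 1 byte, a `*((uint32_t *)ptr) = ...` store targets an odd address. Whether that faults depends on the core configuration and memory type, so treat the following as a hedged sketch rather than a description of the HAL. `addr_is_word_aligned` and `store_u32` are hypothetical helpers; `memcpy` is the standard C way to store a word without assuming anything about the destination's alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true if 'addr' is a multiple of 4, i.e. a
 * uint32_t store to it would be naturally aligned. */
bool addr_is_word_aligned(uintptr_t addr)
{
    return (addr % sizeof(uint32_t)) == 0U;
}

/* Hypothetical helper: store a 32-bit word at 'dst' without assuming
 * alignment. memcpy copies byte by byte as far as the C language is
 * concerned, so no misaligned word store is required. */
void store_u32(uint8_t *dst, uint32_t word)
{
    assert(dst != NULL);
    memcpy(dst, &word, sizeof word);
}
```

For example, `store_u32(&buf[1], w)` places the 4 bytes of `w` at offset 1 of a byte buffer, which a direct cast to `uint32_t *` could not do safely.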

Something that definitely needs to be fixed.

A *good* bug report can help to fix it.

Hi @Pavel A.​ 

That's exactly how I thought about it...

I've reported the issue to STM support.

Case number is: 00118012

The case has been closed, and STM engineers in Sydney contacted me, so I gave them everything needed to reproduce the problem.

I also pointed them to the actual code and explained how I fixed it temporarily until the HAL is fixed officially.

They got it, and I last heard from them 3 days ago.

I don't know how long it will take them to fix it, but given the way the SPI code in the H7 HAL is written, this has to be fixed, as it will affect SPI Master operation too.

smati2
Associate II

Hi @Dub Bartolec​ ,

I am having basically the same issue. In our setup on an STM32H743, the SPI is in Slave mode. It is bare metal (no RTOS). I am using the HAL_SPI_TransmitReceive function. It all works fine until I need to receive 32 bytes. I get about half of them, the rest are all zero, and the call gets stuck in HAL_SPI_TransmitReceive and never returns. After seeing your post, I saw that ver 1.11.0 was available on Git, so I replaced my stm32h7xx_hal_spi.c and stm32h7xx_hal_spi.h with that version. It now actually receives the entire 32 bytes and the bytes are valid. However, it still does not come out of the HAL_SPI_TransmitReceive routine. The transmitter thinks it needs to send more, but TXP is not set, so it gets stuck in that loop. I am doing 8-byte transfers.

So, what was your temporary fix? Can you please share it so I can try it at my end?

BTW and FYI, if I replace the whole drivers folder under my STM32CubeIDE ver 1.10.1 and compile, I run into multiple errors. They mostly have to do with the M7 core and time stuff, which is why I only updated the SPI driver. I think I may need to get the latest STM32CubeIDE (ver 1.11.2) and install the drivers through it to remedy the compilation issue. However, I don't think that would resolve my SPI issue, which is why I am hoping your temporary fix (if you can share it) will fix my issue for the time being.

Thank you.

S.Ma
Principal

For SPI Slave mode on the STM32L4, which uses the SPI version with 32-bit RX and TX FIFOs, I use 4-wire SPI with DMA channels in cyclic buffer mode. The issue is that when the master raises NSS, the DMA TX channel keeps doing its job of keeping the FIFO full, so the FIFO is already loaded for the next transaction, which is not what the application wants. The remedy was a full SYS/RCC reset of the SPI IP to flush the HW FIFO. That works. It is unfortunate to nuke a fly with an H-bomb, until someone finds a gentler way to do it.