STM32H7xx - Serious Bug in SPI Driver HAL Version 1.3.0, 1.7.0 and 1.8.0
We were experiencing strange crashes originating from HAL_SPI_TransmitReceive and HAL_SPI_Receive when operating in SPI Slave mode.
The issue is that the STM32 HAL code does not handle 8-bit SPI transfers correctly, which overflows user buffers and corrupts the stack or heap.
Here is a simplified example of the code:
uint8_t txData[4];
uint8_t rxData[4];
HAL_SPI_TransmitReceive(hspi, txData, rxData, 4, HAL_MAX_DELAY);
On some occasions the function advances its receive pointer past the end of the 4-byte rxData buffer and copies SPI data into memory that was never meant to be written. In the case above, the stack is overwritten.
The bug is in HAL_SPI_TransmitReceive() and HAL_SPI_Receive(), in the block that handles 8-bit transfers:
/* Receive data in 8 Bit mode */
else
{
  /* Transfer loop */
  while (hspi->RxXferCount > 0U)
  {
    /* Check the RXWNE/FRLVL flag */
    if (hspi->Instance->SR & (SPI_FLAG_RXWNE|SPI_FLAG_FRLVL))
    {
      if (hspi->Instance->SR & SPI_FLAG_RXWNE)
      {
        *((uint32_t *)hspi->pRxBuffPtr) = *((__IO uint32_t *)&hspi->Instance->RXDR);
        hspi->pRxBuffPtr += sizeof(uint32_t);
        hspi->RxXferCount-=4;
      }
      else if ((hspi->Instance->SR & SPI_FLAG_FRLVL) > SPI_FRLVL_QUARTER_FULL)
      {
        *((uint16_t *)hspi->pRxBuffPtr) = *((__IO uint16_t *)&hspi->Instance->RXDR);
        hspi->pRxBuffPtr += sizeof(uint16_t);
        hspi->RxXferCount-=2;
      }
      else
      {
        *((uint8_t *)hspi->pRxBuffPtr) = *((__IO uint8_t *)&hspi->Instance->RXDR);
        hspi->pRxBuffPtr += sizeof(uint8_t);
        hspi->RxXferCount--;
      }
    }
    else
    {
      ....
      ...
    }
What actually happens is this: for example, we expect 4 bytes but the master sends more than 4, and by the time we poll the SPI FIFO the sequence is:
- 1 byte in FIFO
- Code reads byte and decrements RxXferCount so its value is now 3.
- Another 4 bytes arrive to SPI FIFO
- At this point the code reads 4 bytes into the buffer, because the if (hspi->Instance->SR & SPI_FLAG_RXWNE) condition is true. The corruption has already occurred here, as only 3 of the bytes requested by the caller were still outstanding.
- The code then subtracts 4 from RxXferCount, which wraps around, so the count is now 0xFFFF.
- The rest is now in the hands of the master. If the master keeps sending data and no timeout is set, the function will copy up to 0xFFFF more bytes into RAM, causing catastrophic failure of the program.
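The arithmetic in the steps above can be replayed on a PC with a small model. This is only a sketch, not HAL code: rx_model_t, rx_model_step and rx_model_replay are hypothetical names, the FIFO is reduced to a byte count, and the counter is assumed to be 16 bits wide (consistent with the 0xFFFF value we observed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the 8-bit receive loop. None of this is HAL code. */
typedef struct {
    uint16_t rx_count;      /* stands in for hspi->RxXferCount (16-bit assumed) */
    size_t   bytes_written; /* bytes stored through the receive pointer so far */
} rx_model_t;

/* One pass of the loop body with fifo_bytes bytes waiting in the RX FIFO. */
void rx_model_step(rx_model_t *m, unsigned fifo_bytes)
{
    unsigned width;

    assert(m != NULL);
    if (fifo_bytes >= 4U) {
        width = 4U;            /* RXWNE branch: 32-bit read */
    } else if (fifo_bytes >= 2U) {
        width = 2U;            /* FRLVL above quarter full: 16-bit read */
    } else if (fifo_bytes == 1U) {
        width = 1U;            /* single byte read */
    } else {
        return;                /* nothing waiting */
    }
    /* As in the excerpt, the width is never checked against rx_count. */
    m->bytes_written += width;
    m->rx_count = (uint16_t)(m->rx_count - width); /* unsigned wraparound */
}

/* Replays the sequence: 4 bytes requested, 1 byte waiting, then 4 more. */
rx_model_t rx_model_replay(void)
{
    rx_model_t m = { 4U, 0U };
    rx_model_step(&m, 1U);     /* count 4 -> 3, 1 byte written */
    rx_model_step(&m, 4U);     /* count 3 -> 0xFFFF, 5 bytes written */
    return m;
}
```

Calling rx_model_replay() leaves rx_count at 0xFFFF and bytes_written at 5, that is, one byte past the 4-byte buffer before the loop even notices.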
This is very easy to reproduce.
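One possible guard, sketched here under our own assumptions and not taken from any ST release, is to pick the access width from the smaller of "bytes in the FIFO" and "bytes still requested". rx_safe_width and rx_safe_replay are hypothetical helpers; in the real driver the FIFO level would come from the RXWNE/FRLVL flags rather than a byte count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the HAL: returns how many bytes (4, 2,
 * 1 or 0) may be read in one access without exceeding what is still owed. */
unsigned rx_safe_width(uint16_t remaining, unsigned fifo_bytes)
{
    unsigned avail = (fifo_bytes < (unsigned)remaining) ? fifo_bytes
                                                        : (unsigned)remaining;
    if (avail >= 4U) {
        return 4U;
    }
    if (avail >= 2U) {
        return 2U;
    }
    return avail;              /* 1 or 0 */
}

/* Replays the failing sequence with the guard in place. */
uint16_t rx_safe_replay(void)
{
    uint16_t remaining = 4U;
    unsigned fifo = 1U;        /* 1 byte waiting at the first poll */
    unsigned w;

    w = rx_safe_width(remaining, fifo);      /* width 1 */
    assert(w <= remaining);
    remaining = (uint16_t)(remaining - w);   /* 4 -> 3 */
    fifo = fifo - w + 4U;                    /* 4 more bytes arrive */

    while (remaining > 0U && fifo > 0U) {
        w = rx_safe_width(remaining, fifo);  /* 2, then 1 */
        assert(w <= remaining);              /* counter can no longer wrap */
        remaining = (uint16_t)(remaining - w);
        fifo -= w;
    }
    return remaining;          /* extra byte from the master stays in the FIFO */
}
```

With this check the counter reaches exactly 0 and the surplus byte sent by the master is simply left unread, instead of being written past the end of the caller's buffer.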
Can someone from the STM development team review this code and let us know what the next step should be?
