STM32 SPI with DMA: DMA complete but SPI_BSY still set

gwillette · 2015-07-15

Posted on July 15, 2015 at 22:47

Here is my scenario:

SPI communications from an STM32 to an FRAM NV device using SPI3. The STM32 is the SPI master, and SPI3 is set up to use the appropriate DMA channels for both RX and TX. The Cube tool was used to set up the pin configuration on the STM32. There are currently no other users of SPI3 (they have been disabled for debugging purposes).

The problem:

DMA TX and RX are both successful, unless I attempt to perform a write then read of some arbitrary number of bytes (512 bytes or more). I am using the ST-generated callback routines to signal the end of each respective DMA transaction, but I will focus on DMA TX because that is where I am seeing an issue. SPI_DMATransmitCplt is called once the DMA transaction has been completed, and not before (correct me if I'm wrong).

This function waits until the TXE flag is set (ready to be loaded for the next transaction) and then waits until the SPI_BSY flag is reset. If both waits pass, I set a bit in an event group, which allows the original API function (which I wrote) to unblock. If I immediately check the SPI_BSY flag at this point, it is reliably busy for specific-sized transmissions, and remains busy for a time proportional to the transmission size: from 1 'tick' up to 600 'ticks' (for a TX of 65535 bytes). This tells me that the BSY flag is indeed related to the TX which I initiated.

The initial issue, and what is really still a problem, is that if I don't manually insert a check and wait for the SPI_BSY flag after a DMA TX, there is a chance that a subsequent RX will fail with status SPI_BSY, even though SPI_DMATransmitCplt() completed without error.

Can someone help me understand what's happening here? I don't understand how the ST-provided callback can wait until the BSY flag is reset before resuming, and then immediately after, the SPI is BSY.

Thanks,

Gary

#spi-dma #spi-dma-bsy

jpeacock · 2015-07-16

Posted on July 16, 2015 at 16:33

The SPI TX DMA completes when the last byte has been handed to the SPI, not when it has finished shifting out on the wire. SPI TXE is buffered, so it is not an end of transmission. The delay you see is the time for the last byte to shift out.

If you have an RX DMA channel set up to read the data shifted in at the same time, then use the RX DMA completion instead. It finishes when the last byte is shifted in, which guarantees that the last shift out is done. An RX ready is an end of transmission.

Jack Peacock
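
To make the suggestion above concrete, here is a minimal host-side sketch of treating the RX DMA completion, rather than the TX DMA completion, as the end of a transfer. The type and function names (spi_xfer_t, spi_xfer_on_rx_dma_complete, and so on) are hypothetical and are not part of the HAL or any ST library; on real hardware the two "on ... complete" functions would be called from the respective DMA transfer-complete interrupts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transfer tracker; not part of any ST library. */
typedef struct {
    volatile bool tx_dma_done; /* TX DMA has handed its last byte to the SPI */
    volatile bool rx_dma_done; /* RX DMA has stored the last received byte   */
} spi_xfer_t;

/* Reset both flags before arming the TX and RX DMA channels. */
static void spi_xfer_start(spi_xfer_t *x)
{
    assert(x != NULL);
    x->tx_dma_done = false;
    x->rx_dma_done = false;
}

/* TX DMA complete: bookkeeping only, the last byte may still be on the wire. */
static void spi_xfer_on_tx_dma_complete(spi_xfer_t *x)
{
    assert(x != NULL);
    x->tx_dma_done = true;
}

/* RX DMA complete: the last byte has been clocked in, so it was also clocked out. */
static void spi_xfer_on_rx_dma_complete(spi_xfer_t *x)
{
    assert(x != NULL);
    x->rx_dma_done = true;
}

/* The transfer counts as finished only once the RX side has completed. */
static bool spi_xfer_finished(const spi_xfer_t *x)
{
    assert(x != NULL);
    return x->rx_dma_done;
}
```

The point of the design is that spi_xfer_finished() deliberately ignores tx_dma_done, so a task unblocked by it never sees the bus still busy with the tail of the transmission.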

gwillette · 2015-07-16

Posted on July 16, 2015 at 20:07

Here is the TX callback function:

static void SPI_DMATransmitCplt(struct __DMA_HandleTypeDef *hdma)
{
  SPI_HandleTypeDef* hspi = ( SPI_HandleTypeDef* )((DMA_HandleTypeDef* )hdma)->Parent;

  /* DMA Normal Mode */
  if ((hdma->Instance->CCR & DMA_CIRCULAR) == 0)
  {
    /* Wait until TXE flag is set to send data */
    if (SPI_WaitOnFlagUntilTimeout(hspi, SPI_FLAG_TXE, RESET, SPI_TIMEOUT_VALUE) != HAL_OK)
    {
      SET_BIT(hspi->ErrorCode, HAL_SPI_ERROR_FLAG);
    }

    /* Disable Tx DMA Request */
    CLEAR_BIT(hspi->Instance->CR2, SPI_CR2_TXDMAEN);

    /* Wait until Busy flag is reset before disabling SPI */
    if (SPI_WaitOnFlagUntilTimeout(hspi, SPI_FLAG_BSY, SET, SPI_TIMEOUT_VALUE) != HAL_OK)
    {
      SET_BIT(hspi->ErrorCode, HAL_SPI_ERROR_FLAG);
    }

    hspi->TxXferCount = 0;
    hspi->State = HAL_SPI_STATE_READY;
  }

  /* Clear OVERUN flag in 2 Lines communication mode because received is not read */
  if (hspi->Init.Direction == SPI_DIRECTION_2LINES)
  {
    __HAL_SPI_CLEAR_OVRFLAG(hspi);
  }

  /* Check if Errors has been detected during transfer */
  if (hspi->ErrorCode != HAL_SPI_ERROR_NONE)
  {
    HAL_SPI_ErrorCallback(hspi);
  }
  else
  {
    HAL_SPI_TxCpltCallback(hspi);
  }
}

You'll notice that in the callback, we wait for the TXE flag to be set, and then we wait for the SPI_BSY flag to be reset before continuing. (I don't know why the polarity looks inverted for the SPI_WaitOnFlagUntilTimeout() function, but it does. I verified that when the argument is 'RESET' we are waiting for the bit to be set, and vice versa, so the function appears to spin while the flag equals the given state. Again, everything you are seeing was written by ST.)

Shouldn't waiting for BSY to be reset be sufficient? It is according to the reference manual. Also, there should only be one byte of data to wait for between TXE getting set and BSY getting reset, right? Per the reference manual:

"In transmission mode, when the DMA has written all the data to be transmitted (flag TCIF is set in the DMA_ISR register), the BSY flag can be monitored to ensure that the SPI communication is complete. This is required to avoid corrupting the last transmission before disabling the SPI or entering the Stop mode. The software must first wait until TXE=1 and then until BSY=0."
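
For readers following along, here is a small host-side sketch of the "first TXE=1, then BSY=0" sequence from the reference manual quote, using a bounded poll count instead of a time-based timeout. Everything in it is an assumption for illustration: the MOCK_ bit masks follow the usual STM32 SPI_SR layout but should be taken from the device header in real code, and wait_while_flag() is a hypothetical helper that spins while the flag equals the given state, which matches the behaviour observed above for ST's function but is not ST's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock status bits (assumed positions; use the device header in real code). */
#define MOCK_SPI_SR_TXE (1u << 1)
#define MOCK_SPI_SR_BSY (1u << 7)

/* Hypothetical helper: spin WHILE the flag equals 'state', for at most
 * max_polls reads. Returns true once the flag has left that state. */
static bool wait_while_flag(const volatile uint32_t *sr, uint32_t mask,
                            bool state, uint32_t max_polls)
{
    assert(sr != NULL);
    while (max_polls-- > 0u) {
        bool is_set = (*sr & mask) != 0u;
        if (is_set != state) {
            return true;
        }
    }
    return false; /* gave up: the flag never changed */
}

/* Reference-manual order: first wait for TXE = 1, then for BSY = 0. */
static bool spi_wait_tx_idle(const volatile uint32_t *sr, uint32_t max_polls)
{
    return wait_while_flag(sr, MOCK_SPI_SR_TXE, false, max_polls)  /* while TXE is clear */
        && wait_while_flag(sr, MOCK_SPI_SR_BSY, true, max_polls);  /* while BSY is set   */
}
```

Reading the arguments as "wait while the flag is in this state" makes the RESET/SET values in the callback line up with its comments.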

jpeacock · 2015-07-16

Posted on July 16, 2015 at 22:10

A spinlock inside an interrupt? Am I missing something? No wonder the HAL has a bad reputation. The old library never did anything that, uhh, out of the ordinary.

First of all an interrupt (or a callback inside an interrupt) should never do a spinlock (wait loop polling a flag). It defeats the purpose of using an interrupt in the first place. Might as well just poll everything.

I don't use the HAL so I can't explain what ST is doing. You may be seeing a race condition between SPI and DMA. How is the SPI RX data register cleared after every TX DMA transfer? Does the HAL set up an RX DMA to clear the data in register?

I can only reiterate that SPI DMA works fine if you use the RX DMA to keep the incoming SPI buffer cleared and use the RX DMA TC event as end of transmission. I never tried what ST recommends, which in any case is a very poor example of how to handle real time events. If the SPI RX is not being cleared after every TX DMA that's likely the cause of the timing problem due to buffer overflow.

A lot of the problems with SPI are a result of not keeping RX in sync with TX. It's easy to forget SPI is bi-directional and try to use it like a full duplex UART. Make sure the TX and RX are both inactive before starting and always, always, always clear the RX after every TX. Once the RX overflows the flags become unreliable.

Jack Peacock
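
As a rough illustration of the "make sure TX and RX are both inactive before starting" advice, the sketch below checks a snapshot of the status register before a new transfer is armed. The names (spi_precheck, spi_precheck_t) and the MOCK_ bit masks are hypothetical; the bit positions follow the usual STM32 SPI_SR layout but are an assumption here. It only detects a stale state and does not clear it; how RX data and the overrun flag are cleared is device-specific and should follow the reference manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock status bits (assumed positions; use the device header in real code). */
#define MOCK_SPI_SR_RXNE (1u << 0)
#define MOCK_SPI_SR_OVR  (1u << 6)
#define MOCK_SPI_SR_BSY  (1u << 7)

typedef enum {
    SPI_PRECHECK_OK,         /* idle, nothing pending: safe to start       */
    SPI_PRECHECK_BUSY,       /* previous transfer is still shifting        */
    SPI_PRECHECK_OVERRUN,    /* RX overflowed: flags can no longer be trusted */
    SPI_PRECHECK_RX_PENDING  /* an unread byte is sitting in the RX buffer */
} spi_precheck_t;

/* Hypothetical pre-start check on one read of the status register. */
static spi_precheck_t spi_precheck(const volatile uint32_t *sr)
{
    assert(sr != NULL);
    uint32_t snapshot = *sr; /* read once so every test sees the same value */

    if (snapshot & MOCK_SPI_SR_BSY) {
        return SPI_PRECHECK_BUSY;
    }
    if (snapshot & MOCK_SPI_SR_OVR) {
        return SPI_PRECHECK_OVERRUN;
    }
    if (snapshot & MOCK_SPI_SR_RXNE) {
        return SPI_PRECHECK_RX_PENDING;
    }
    return SPI_PRECHECK_OK;
}
```

A caller would start the next DMA transfer only on SPI_PRECHECK_OK and otherwise wait, drain RX, or recover from the overrun first.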

gwillette · 2015-07-17

Posted on July 17, 2015 at 19:43

I was able to get the issue resolved by doing what you suggested. I am using HAL_SPI_TransmitReceive_DMA() (built in from ST) with a non-incrementing 'dummy' u8 on either the read or the write side (depending on the desired transaction). This function uses a similar-looking callback to the one I posted above, except this one pays attention to both the RXNE and TXE flags. More importantly, the callback is not called until both the DMA TX and RX are complete. In testing, I find that the 'spinlocks' never have to wait for the flags, but I agree, the potential to hang on each flag for 10ms is dreadful in an interrupt. Perhaps ST figures that if this ever happens there is another serious issue, which makes waiting 10ms relatively minor.

In the interest of continuing to use the HAL, and not rewriting the entire project, I will continue with this method. The calling task is blocked until the DMA is truly finished this time, and there is no required delay once the task unblocks. The way a DMA transfer should work....

Thanks very much for your help.

Gary
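
For anyone landing here later, this is a host-side model of the dummy-byte approach described in the last post: every transfer is full duplex, and the side you do not care about points at a single byte that the DMA does not increment past. It is only a simulation under stated assumptions. spi_exchange(), spi_wire_fn and mock_slave() are hypothetical names, the tx_inc/rx_inc parameters stand in for the DMA memory-increment setting, and none of this calls the HAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the wire: returns the byte the slave shifts back. */
typedef uint8_t (*spi_wire_fn)(uint8_t mosi);

/* Models one full-duplex transfer of n bytes. tx_inc / rx_inc mimic the DMA
 * memory-increment setting: false keeps that side on one dummy byte. */
static void spi_exchange(spi_wire_fn wire,
                         const uint8_t *tx, bool tx_inc,
                         uint8_t *rx, bool rx_inc,
                         size_t n)
{
    assert(wire != NULL && tx != NULL && rx != NULL);
    for (size_t i = 0; i < n; ++i) {
        uint8_t in = wire(tx[tx_inc ? i : 0]); /* one byte out, one byte in */
        rx[rx_inc ? i : 0] = in;               /* RX is always consumed     */
    }
}

/* Example slave for the model: answers with the complement of what it got. */
static uint8_t mock_slave(uint8_t mosi)
{
    return (uint8_t)~mosi;
}
```

A read then passes a dummy TX byte with tx_inc false and a real RX buffer, and a write passes a real TX buffer and a dummy RX byte with rx_inc false. Because the RX side always consumes one byte per byte sent, its completion marks the true end of the transfer in both cases.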