STM32F7 SAI DMA FIFO error

When using DMA for SAI transmission (double buffer, rearmed in HAL_SAI_TxCpltCallback), I ran into some very strange behavior.

In some configurations, a FIFO error interrupt (FEIF, ref. manual p. 235) happens after a few seconds. The time until this error occurs mainly depends on the audio frequency.

This is my SAI block config:

hsai_BlockA2.Instance = SAI2_Block_A;
hsai_BlockA2.Init.AudioMode = SAI_MODEMASTER_TX;
hsai_BlockA2.Init.Synchro = SAI_ASYNCHRONOUS;
hsai_BlockA2.Init.OutputDrive = SAI_OUTPUTDRIVE_DISABLE;
hsai_BlockA2.Init.NoDivider = SAI_MASTERDIVIDER_ENABLE;
hsai_BlockA2.Init.FIFOThreshold = SAI_FIFOTHRESHOLD_EMPTY;
hsai_BlockA2.Init.AudioFrequency = SAI_AUDIO_FREQUENCY_48K;
hsai_BlockA2.Init.SynchroExt = SAI_SYNCEXT_DISABLE;
hsai_BlockA2.Init.MonoStereoMode = SAI_STEREOMODE;
hsai_BlockA2.Init.CompandingMode = SAI_NOCOMPANDING;
hsai_BlockA2.Init.TriState = SAI_OUTPUT_NOTRELEASED;

This is my corresponding DMA config:

hdma_sai2_a.Instance = DMA2_Stream4;
hdma_sai2_a.Init.Channel = DMA_CHANNEL_3;
hdma_sai2_a.Init.Direction = DMA_MEMORY_TO_PERIPH;
hdma_sai2_a.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_sai2_a.Init.MemInc = DMA_MINC_ENABLE;
hdma_sai2_a.Init.PeriphDataAlignment = DMA_PDATAALIGN_WORD;
hdma_sai2_a.Init.MemDataAlignment = DMA_MDATAALIGN_WORD;
hdma_sai2_a.Init.Mode = DMA_NORMAL;
hdma_sai2_a.Init.Priority = DMA_PRIORITY_HIGH;
hdma_sai2_a.Init.FIFOMode = DMA_FIFOMODE_ENABLE;
hdma_sai2_a.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
hdma_sai2_a.Init.MemBurst = DMA_MBURST_SINGLE;
hdma_sai2_a.Init.PeriphBurst = DMA_PBURST_SINGLE;

With the given audio frequency (48 kHz) I found these factors that influence the error:

  • Disabling the DMA FIFO -> no effect
  • Changing the CPU clock frequency -> no effect
  • Increasing audio frequency (192 kHz) -> interrupt happens very fast
  • Decreasing audio frequency (8 kHz) -> interrupt never happens
  • Placing the DMA buffer in DTCM RAM instead of SRAM -> interrupt never happens (only STM32F746NG)
  • Using the 4 Increment Burst Size for the DMA FIFO -> interrupt never happens
  • Using the DMA Circular Mode (regardless of the FIFO settings) -> interrupt never happens

I was able to reproduce this behavior with a STM32F746NG and a STM32F769VG.

From what I can see in the CubeMX sample project, the preferred way to use the DMA with SAI seems to be in Circular Mode.

However I want to understand this behavior in order to avoid further problems.

Can anyone explain to me why this error interrupt occurs and why the preferred way in the CubeMX samples is Circular Buffer Mode?

7 REPLIES

This error means that the peripheral (SAI) issued a request to the DMA while there was no data in the DMA's FIFO. It indicates a possible underflow (there may have been no actual underflow, as the DMA might have fulfilled the request, which is raised when the holding register is empty, even before the shift register was emptied).

This happens with non-circular DMA because, in your particular software, the latency from transfer complete through invoking the interrupt, handling it, and restarting the DMA to serve the next request is longer than the time it takes to transmit the current frame from the shift register of the SAI and load the next frame (which empties the holding register and raises the request to the DMA).
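As a rough illustration of why the audio frequency matters so much, the window between two SAI data requests can be estimated as follows. This is only a sketch under stated assumptions: one 32-bit word per slot, two slots per stereo frame, and the SAI's own FIFO ignored (it lengthens the real budget). The function names and the 216 MHz core clock in the usage note are illustrative, not taken from the original post.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate time between two SAI data requests in nanoseconds,
 * assuming one request per slot (word) and ignoring the SAI FIFO depth. */
static uint32_t sai_word_period_ns(uint32_t sample_rate_hz, uint32_t slots_per_frame)
{
    assert(sample_rate_hz != 0u && slots_per_frame != 0u);
    return (uint32_t)(1000000000ULL / ((uint64_t)sample_rate_hz * slots_per_frame));
}

/* Number of CPU cycles that fit into a window of period_ns nanoseconds. */
static uint32_t cycles_in_window(uint32_t period_ns, uint32_t cpu_hz)
{
    return (uint32_t)(((uint64_t)period_ns * cpu_hz) / 1000000000ULL);
}
```

Under these assumptions, stereo at 48 kHz leaves roughly 10.4 us per word, 192 kHz leaves about 2.6 us, and 8 kHz leaves 62.5 us. At an assumed 216 MHz core clock, the 48 kHz window is a little over 2000 cycles, and the whole interrupt entry, HAL handling, and rearm path has to fit into it.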

> why the preferred way in the CubeMX samples is Circular Buffer Mode?

This has nothing to do with CubeMX. Generally, peripherals that are relatively fast and input/output a continuous stream of data should be served by a circular DMA, as that acts as their FIFO and shields the peripheral from latencies in software.
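The circular approach can be sketched as a ring of two halves: the DMA is started once over the whole ring, and software only refills the half that has just been sent. The code below is a host-portable sketch, not HAL code. The names tx_ring, next_sample, refill_half, on_half_complete, and on_complete are hypothetical, and the counter stands in for a real audio source. On the target, the two handlers would be the bodies of HAL_SAI_TxHalfCpltCallback() and HAL_SAI_TxCpltCallback(), with the DMA stream configured as DMA_CIRCULAR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HALF_WORDS 256u  /* same block size as the 256-word transfers in this thread */

static uint32_t tx_ring[2u * HALF_WORDS]; /* whole buffer handed to the circular DMA */
static uint32_t next_sample;              /* hypothetical stand-in for the audio source */

/* Refill one half of the ring while the DMA streams out the other half. */
static void refill_half(size_t half_index)
{
    assert(half_index < 2u);
    uint32_t *dst = &tx_ring[half_index * HALF_WORDS];
    for (size_t i = 0; i < HALF_WORDS; ++i) {
        dst[i] = next_sample++;
    }
}

/* Half-transfer event: the first half was sent, so it is free to refill. */
static void on_half_complete(void) { refill_half(0); }

/* Transfer-complete event: the second half was sent; the DMA has already
 * wrapped around to the first half without software intervention. */
static void on_complete(void) { refill_half(1); }
```

Because the DMA never stops, the software deadline becomes the time to send half the ring (256 words here) instead of the time to send a single word.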

Btw. there is no STM32GF69VG.

JW

Thank you for your fast response!

> Btw. there is no STM32GF69VG.

I fixed the typo...

> This happens with non-circular DMA because, in your particular software, the latency from transfer complete through invoking the interrupt, handling it, and restarting the DMA to serve the next request is longer than the time it takes to transmit the current frame from the shift register of the SAI and load the next frame (which empties the holding register and raises the request to the DMA).

Ok, so I found this in the STM32 HAL code:

static void SAI_DMATxCplt(DMA_HandleTypeDef *hdma)
{
  SAI_HandleTypeDef* hsai = (SAI_HandleTypeDef*)((DMA_HandleTypeDef* )hdma)->Parent;
 
  if((hdma->Instance->CR & DMA_SxCR_CIRC) == 0)
  {
    hsai->XferCount = 0;
 
    /* Disable SAI Tx DMA Request */
    hsai->Instance->CR1 &= (uint32_t)(~SAI_xCR1_DMAEN);
 
    /* Stop the interrupts error handling */
    __HAL_SAI_DISABLE_IT(hsai, SAI_InterruptFlag(hsai, SAI_MODE_DMA));
 
    hsai->State= HAL_SAI_STATE_READY;
  }
#if (USE_HAL_SAI_REGISTER_CALLBACKS == 1)
  hsai->TxCpltCallback(hsai);
#else
  HAL_SAI_TxCpltCallback(hsai);
#endif /* USE_HAL_SAI_REGISTER_CALLBACKS */
}

What you are saying is that the code in line 10 (clearing SAI_xCR1_DMAEN) disables the SAI DMA request. If this happens too late, the underrun error occurs.

By adding a small delay before line 10, I was able to enforce the error every time.

With your explanation it makes sense to always use DMA in circular mode. Does this mean however that I could run into this problem with other fast peripherals? I have seen that the HAL UART implementation uses the same mechanism for DMA data transmission.

If I'm transmitting a data block with fixed size, should I use circular DMA for this too?

Edit:

There is one thing that I still don't understand:

I'm always arming the DMA with a fixed block size (256 * uint32_t). Manually clearing the FEIF flag after each transmission complete interrupt also resolves the problem. So the error definitely occurs after a finished transmission and not during the start of a new one. This matches your description.

Why is there an underrun error after the transmission was finished? If all 256 samples were transmitted correctly, I would expect that the successful transmission clears and disables all pending interrupts.

> So the error definitely occurs after a finished transmission and not during the start of a new one.

Certainly not. After the DMA in non-circular mode sets TC, it disables itself and won't check for "FIFO empty and request active" anymore. The error occurs after the DMA is enabled again, if the frame from the holding register of the SAI has been "spent" in the meantime.

IMO you might be confused by some Cube code. I don't use Cube and won't comment on it.

JW

Alright, I understand that.

However I still have a problem understanding why a complete transmission causes an error in the next transmission.

If I do something like this, while deliberately delaying the execution of the TC ISR by a small amount:

HAL_SAI_Transmit_DMA(...);
HAL_Delay(100);
HAL_SAI_Transmit_DMA(...);

The second call will always trigger the FEIF error. Why do the two function calls behave differently?

I added a small delay in the TC ISR before this register is cleared:

 hsai->Instance->CR1 &= (uint32_t)(~SAI_xCR1_DMAEN);

So the problem is not the rearming of the SAI DMA by the application but rather the time it takes for the controller to clear the SAI_xCR1_DMAEN in the ISR?

> I still have a problem understanding why a complete transmission causes an error in the next transmission.

I'm not going to investigate, but maybe there's a latch after the root signal gated by the DMA-enable bit.

JW

SFren.2
Associate

So I had a similar issue.

I2C and DMA in direct mode. Debug mode worked fine, in release the second transmission would receive 16 bytes (full FIFO), the FIFO error flag was set, the FIFO full flag was set, and the transmission never finished.

The problem was hardware level optimization. Basically the stream was being enabled an unnecessary number of times, and for some reason this was switching the DMA stream into a pseudo FIFO mode.

while (!LL_DMA_IsEnabledStream(DMA, STREAM))
{
     LL_DMA_EnableStream(DMA, STREAM);
}

In the above code, for whatever reason, LL_DMA_IsEnabledStream() was returning false, but viewing the register in a debugger, the stream was clearly enabled! So the Cortex-M4 was being too clever for its own good.

The solution: Data Memory Barrier. This ensures that the read/write operations actually happen, rather than the Cortex M deciding these operations have no effect.

while (!LL_DMA_IsEnabledStream(DMA, STREAM))
{
     LL_DMA_EnableStream(DMA, STREAM);
     __DMB();
}

Notes:

  • The HAL implementation, __HAL_DMA_ENABLE() does not check if the stream is enabled, it simply sets the EN bit.
  • I am using Rust, so the example code shown here is similar to, but not the same as, what I actually used
  • Because I am using Rust, LLVM generates the optimized machine code, so the result may also differ from what a C toolchain produces

Which STM32?

> DMA in direct mode

Direct mode means FIFO is not used.

There still may be FIFO errors, as explained in the DMA appnote, but they can be safely ignored.

The DMB does not ensure the register write; here it simply serves as a small but consistent delay. There is no need for that loop either.

JW