stm32f429 USART DMA FIFO error on transmit upon completion

Peeters.Bram
Senior

Hi,

I am using DMA1 stream 3 for USART3 for tx on an stm32f429.

The driver code is based on the CubeMX-generated ST HAL drivers, firmware package version 1.24.

The UART and DMA initialisation parameters are as follows:

huart3.Instance = USART3;
huart3.Init.BaudRate = 921600;
huart3.Init.WordLength = UART_WORDLENGTH_8B;
huart3.Init.StopBits = UART_STOPBITS_1;
huart3.Init.Parity = UART_PARITY_NONE;
huart3.Init.Mode = UART_MODE_TX_RX;
huart3.Init.HwFlowCtl = UART_HWCONTROL_NONE;
huart3.Init.OverSampling = UART_OVERSAMPLING_16;

hdma_usart3_tx.Instance = DMA1_Stream3;
hdma_usart3_tx.Init.Channel = DMA_CHANNEL_4;
hdma_usart3_tx.Init.Direction = DMA_MEMORY_TO_PERIPH;
hdma_usart3_tx.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_usart3_tx.Init.MemInc = DMA_MINC_ENABLE;
hdma_usart3_tx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
hdma_usart3_tx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
hdma_usart3_tx.Init.Mode = DMA_NORMAL;
hdma_usart3_tx.Init.Priority = DMA_PRIORITY_LOW;
hdma_usart3_tx.Init.FIFOMode = DMA_FIFOMODE_DISABLE;

I use HAL_UART_Transmit_DMA(...) to transmit buffers (always from the same static buffer, into which I first copy the data to be transmitted).

There is a semaphore before that to make sure there is only 1 action at a time till it is complete.

Now for some reason, I always get a FIFO error (LISR.FEIF3) at the exact same spot, 100 pct reproducible (and on multiple boards).
It is not the first message, more like the 30th.
And the message itself is transmitted correctly and completely.
This is confirmed by the DMA registers, LISR.TCIF3 is set to 1 and S3NDTR.NDT is 0.
These and other DMA registers at the moment the interrupt occurs are in the IAR screenshot in attachment.
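
For reference, here is a minimal sketch of how those flags can be pulled out of a raw LISR value. The bit layout (stream groups at offsets 0, 6, 16 and 22, with FEIF at bit 0 and TCIF at bit 5 of each group) is my reading of the F4 reference manual; the function and macro names are made up for illustration and are not part of the HAL.

```c
#include <stdint.h>

/* Flag positions relative to the start of one stream's group in LISR/HISR
   (assumed layout; bit 1 of each group is reserved). */
#define SK_FEIF  (1u << 0)  /* FIFO error          */
#define SK_DMEIF (1u << 2)  /* direct mode error   */
#define SK_TEIF  (1u << 3)  /* transfer error      */
#define SK_HTIF  (1u << 4)  /* half transfer       */
#define SK_TCIF  (1u << 5)  /* transfer complete   */

/* Hypothetical helper: return the flag group of stream 0..3 from a raw
   LISR value, with the reserved bit masked out. */
uint32_t dma_stream_flags(uint32_t lisr, unsigned stream)
{
    static const uint8_t shift[4] = {0u, 6u, 16u, 22u};
    return (lisr >> shift[stream & 3u]) & 0x3Du;
}
```

With FEIF3 and TCIF3 both set, dma_stream_flags(lisr, 3) yields SK_FEIF | SK_TCIF, which is the combination I see in the debugger.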

If I look at the datasheet I get as possible reasons for the FIFO error:
• FIFO error: the FIFO error interrupt flag (FEIFx) is set if:
– A FIFO underrun condition is detected
– A FIFO overrun condition is detected (no detection in memory-to-memory mode because requests and transfers are internally managed by the DMA)
– The stream is enabled while the FIFO threshold level is not compatible with the size of the memory burst (refer to Table 48: FIFO threshold configurations)

+

In direct mode, the FIFO error flag can also be set under the following conditions:
• In the peripheral-to-memory mode, the FIFO can be saturated (overrun) if the memory
bus is not granted for several peripheral requests
• In the memory-to-peripheral mode, an underrun condition may occur if the memory bus
has not been granted before a peripheral request occurs

Since I am in direct mode it cannot be because of the threshold.
So if the fault is valid it has to be underrun, but then I would expect NDT to be non-zero, indicating at which point the DMA encountered the underrun.

Am I overlooking something, or am I bumping into some DMA controller bug which causes the occasional spurious FIFO fault? But it is strange then that it is not random but occurs at a fixed point in my flow.
For now I plan to modify the interrupt handler to ignore the error if NDT is 0 to deal with it and cross my fingers it only happens for complete transfers.
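
A rough sketch of that workaround check, assuming stream 3 (FEIF3 at LISR bit 22, TCIF3 at bit 27) and the 16-bit NDT field; fifo_error_is_benign is a hypothetical helper name, not something from the HAL:

```c
#include <stdbool.h>
#include <stdint.h>

#define LISR_FEIF3 (1u << 22)  /* FIFO error flag, stream 3 (assumed position)        */
#define LISR_TCIF3 (1u << 27)  /* transfer complete flag, stream 3 (assumed position) */

/* Returns true when a FIFO error is reported for a transfer that has
   already completed (TCIF3 set and NDT == 0), i.e. the case I intend
   to ignore. Any other FIFO error is still treated as real. */
bool fifo_error_is_benign(uint32_t lisr, uint32_t ndtr)
{
    bool fifo_error = (lisr & LISR_FEIF3) != 0u;
    bool complete   = (lisr & LISR_TCIF3) != 0u;
    return fifo_error && complete && ((ndtr & 0xFFFFu) == 0u);
}
```

The interrupt handler would only skip the error path when this returns true, so an underrun in the middle of a transfer (NDT non-zero) would still be reported.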

But it would be nice to completely understand the problem as I don't want to bury a potential real issue .

Does this ring any bells or any suggestions what else I can check ?

24 REPLIES
Karl Yamashita
Lead II

You need to show your code on how you're transmitting. Are you checking HAL status?

Hi Karl,

I am using the st driver code 1.24 (as mentioned in my post).

I don't understand what you mean by 'checking hal status' as the DMA transfer is working autonomously up to the point where the fifo error interrupt occurs.

Anyway the relevant functions to setup the transfer from the driver package are:

 

/**
  * @brief  Sends an amount of data in DMA mode.
  * @note   When UART parity is not enabled (PCE = 0), and Word Length is configured to 9 bits (M1-M0 = 01),
  *         the sent data is handled as a set of u16. In this case, Size must indicate the number
  *         of u16 provided through pData.
  * @param  huart  Pointer to a UART_HandleTypeDef structure that contains
  *                the configuration information for the specified UART module.
  * @param  pData Pointer to data buffer (u8 or u16 data elements).
  * @param  Size  Amount of data elements (u8 or u16) to be sent
  * @retval HAL status
  */
HAL_StatusTypeDef HAL_UART_Transmit_DMA(UART_HandleTypeDef *huart, uint8_t *pData, uint16_t Size)
{
  uint32_t *tmp;

  /* Check that a Tx process is not already ongoing */
  if (huart->gState == HAL_UART_STATE_READY)
  {
    if ((pData == NULL) || (Size == 0U))
    {
      return HAL_ERROR;
    }

    /* Process Locked */
    __HAL_LOCK(huart);

    huart->pTxBuffPtr = pData;
    huart->TxXferSize = Size;
    huart->TxXferCount = Size;

    huart->ErrorCode = HAL_UART_ERROR_NONE;
    huart->gState = HAL_UART_STATE_BUSY_TX;

    /* Set the UART DMA transfer complete callback */
    huart->hdmatx->XferCpltCallback = UART_DMATransmitCplt;

    /* Set the UART DMA Half transfer complete callback */
    huart->hdmatx->XferHalfCpltCallback = UART_DMATxHalfCplt;

    /* Set the DMA error callback */
    huart->hdmatx->XferErrorCallback = UART_DMAError;

    /* Set the DMA abort callback */
    huart->hdmatx->XferAbortCallback = NULL;

    /* Enable the UART transmit DMA stream */
    tmp = (uint32_t *)&pData;
    HAL_DMA_Start_IT(huart->hdmatx, *(uint32_t *)tmp, (uint32_t)&huart->Instance->DR, Size);

    /* Clear the TC flag in the SR register by writing 0 to it */
    __HAL_UART_CLEAR_FLAG(huart, UART_FLAG_TC);

    /* Process Unlocked */
    __HAL_UNLOCK(huart);

    /* Enable the DMA transfer for transmit request by setting the DMAT bit
       in the UART CR3 register */
    SET_BIT(huart->Instance->CR3, USART_CR3_DMAT);

    return HAL_OK;
  }
  else
  {
    return HAL_BUSY;
  }
}

 

and

 

/**
  * @brief  Start the DMA Transfer with interrupt enabled.
  * @param  hdma       pointer to a DMA_HandleTypeDef structure that contains
  *                     the configuration information for the specified DMA Stream.
  * @param  SrcAddress The source memory Buffer address
  * @param  DstAddress The destination memory Buffer address
  * @param  DataLength The length of data to be transferred from source to destination
  * @retval HAL status
  */
HAL_StatusTypeDef HAL_DMA_Start_IT(DMA_HandleTypeDef *hdma, uint32_t SrcAddress, uint32_t DstAddress, uint32_t DataLength)
{
  HAL_StatusTypeDef status = HAL_OK;

  /* calculate DMA base and stream number */
  DMA_Base_Registers *regs = (DMA_Base_Registers *)hdma->StreamBaseAddress;
  
  /* Check the parameters */
  assert_param(IS_DMA_BUFFER_SIZE(DataLength));
 
  /* Process locked */
  __HAL_LOCK(hdma);
  
  if(HAL_DMA_STATE_READY == hdma->State)
  {
    /* Change DMA peripheral state */
    hdma->State = HAL_DMA_STATE_BUSY;
    
    /* Initialize the error code */
    hdma->ErrorCode = HAL_DMA_ERROR_NONE;
    
    /* Configure the source, destination address and the data length */
    DMA_SetConfig(hdma, SrcAddress, DstAddress, DataLength);
    
    /* Clear all interrupt flags at correct offset within the register */
    regs->IFCR = 0x3FU << hdma->StreamIndex;
    
    /* Enable Common interrupts*/
    hdma->Instance->CR  |= DMA_IT_TC | DMA_IT_TE | DMA_IT_DME;
    
    if(hdma->XferHalfCpltCallback != NULL)
    {
      hdma->Instance->CR  |= DMA_IT_HT;
    }
    
    /* Enable the Peripheral */
    __HAL_DMA_ENABLE(hdma);
  }
  else
  {
    /* Process unlocked */
    __HAL_UNLOCK(hdma);	  
    
    /* Return error status */
    status = HAL_BUSY;
  }
  
  return status;
}

 

 

Something extra I was just looking at now is that the bootloader that is also in my board is actually using an older version of the st drivers (1.22 ?) and that version had

 /* Enable Common interrupts*/
hdma->Instance->CR |= DMA_IT_TC | DMA_IT_TE | DMA_IT_DME;
hdma->Instance->FCR |= DMA_IT_FE;

in HAL_DMA_Start_IT.

So it enabled the DMA_IT_FE ( = FEIE) interrupt (and apparently never disabled the interrupt, except on DMA aborts, for some mysterious and probably incorrect reason), which is how it still remains enabled by the time my main application starts running.

But it seems someone at ST decided to just disable the fifo interrupt in later versions.... which is also incorrect because it can occur, so that is burying problems.

It would be interesting to know what the rationale was behind these changes, maybe ST also figured out there was a problem with it and decided to disable it completely ?

You're showing the HAL drivers. Show YOUR own code that you've written to call the HAL driver.

> the bootloader that is also in my board

> So it enabled the DMA_IT_FE ( = FEIE) interrupt (and apparently never disabled the interrupt, except on DMA aborts, for some mysterious and probably incorrect reason), which is how it still remains enabled by the time my main application starts running.

Cube/HAL - quite expectedly - assumes every peripheral to be in its reset state when it starts to work. So, generally, it's a bad idea not to put everything into that state before moving execution from the bootloader to the application. In other words, your bootloader is buggy by not resetting all peripherals, but you can still hotfix it by doing so before Cube/HAL is started in the application.
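
A minimal sketch of such a hotfix, assuming FEIE sits at bit 7 of DMA_SxFCR as I read the reference manual. The helper name is made up, and the function works on a plain value so it can be checked off-target:

```c
#include <stdint.h>

#define SXFCR_FEIE (1u << 7)  /* FIFO error interrupt enable (assumed bit position) */

/* Hypothetical helper: return the FCR value with the FIFO error
   interrupt masked and all other bits left untouched.
   On target this would be applied before the HAL starts, e.g.
   DMA1_Stream3->FCR = fcr_with_fifo_irq_masked(DMA1_Stream3->FCR);
   (CMSIS register names). */
uint32_t fcr_with_fifo_irq_masked(uint32_t fcr)
{
    return fcr & ~SXFCR_FEIE;
}
```

Resetting the whole DMA controller through the RCC reset register before jumping to the application would be the more thorough variant, since it restores every stream register and not just this one bit.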

> which is also incorrect because it can occur, so that is burying problems.

If you don't use DMA FIFO (a.k.a. "Direct mode"), the FIFO interrupt is harmless except for very extreme resource starvation, which is harmless with master transmitters (as they determine the pace of transmission themselves and a few cycles of delay seldom matters); certainly not the case with UART which is compared to bus clocks very slow; and even with faster peripherals such as slave SPI at its highest baudrate, if such resource/bus starvation occurs, you have a graver problem to solve and should have DMA FIFO enabled anyway.

The harmless FIFO interrupt occurs because of the sequencing of peripheral enable vs. DMA enable. If UART Tx DMA is enabled before the DMA stream, the TXE signal towards the DMA is active sooner than the DMA itself, so at the moment the DMA is enabled it already sees the peripheral request but has not yet buffered the byte to be transmitted from memory. As I've said above, this particular case is *absolutely* harmless even in a resource-starved situation, as it only means that the UART transmitter starts to transmit a few nanoseconds later than if the enabling sequence were reversed, i.e. DMA first, then the UART's Tx DMA.

The change to disable this interrupt was made because, as I've said, it is harmless, and that was much easier than making a quite substantial change to the sequencing of how the various peripherals are enabled/set up in Cube/HAL.

JW

 

Karl, I am not sure what you are looking for at higher levels, but the code surrounding the call is this

	// Tell the scheduler to not go into stop even if no task is scheduled,
	// otherwise it will corrupt the DMA controller uart transfer
	// midway when it kills the cpu
	xTaskSetStopModeBlocker( 1 );

    while( l_nCurLenTxed < length )
    {
        l_nCurLen = length - l_nCurLenTxed;
        if (l_nCurLen > m_anTxBufSize[nUartID] )
        {
            l_nCurLen = m_anTxBufSize[nUartID];
        }

        //take the binary semaphore that protects the DMA transfers
		
		//debug
		//LOGGER_LOG_ERROR(LOG_CLASS, "%s: Take Mutex %d, %x", __FUNCTION__, nUartID, &m_aTxSemaphore[nUartID]);

        if (xSemaphoreTake(m_aTxSemaphore[nUartID], CLOCKING_CONVERT_MS_OSTICKS(MAX_WAIT_TIME_MS) ) == pdTRUE)
        {
            //copy the data to a dedicated buffer
            memcpy(m_aTxBuffer[nUartID], pData + l_nCurLenTxed, l_nCurLen);

            //start the DMA transfer
            if (HAL_UART_Transmit_DMA(m_aphuart[nUartID], (uint8_t*)m_aTxBuffer[nUartID], l_nCurLen) != HAL_OK)
            {
				// Avoid an infinite loop of messages by not adding an extra message if the message we are printing is Seri*
				if ( ( length < 4 ) || (pData[0] != 'S' ) || (pData[1] != 'e' ) || (pData[2] != 'r' ) || (pData[3] != 'i' ))
				{
    	            LOGGER_LOG_ERROR(LOG_CLASS, "%s: HAL_UART_Transmit_DMA failed", __FUNCTION__);
				}

                //release the binary semaphore again to stop the next SerialDbg_Write from blocking indefinitely

				//debug
				//LOGGER_LOG_ERROR(LOG_CLASS, "%s: Give Mutex %d, %x", __FUNCTION__, nUartID, &m_aTxSemaphore[nUartID]);

                xSemaphoreGive(m_aTxSemaphore[nUartID]);

                return e_RETURNVALUE_Failure;
            }
            else
            {
                l_nCurLenTxed += l_nCurLen;
            }
        }
        else
        {
            return e_RETURNVALUE_Failure;
        }

    }

	// Ok to go to stop mode now
	xTaskSetStopModeBlocker( 0 );

 The initialization parameters for uart/dma I already gave in my initial post.

 

 


> In other words, your bootloader is buggy by not resetting all peripherals

True (though to nitpick, you cannot reset everything, eg IWDG 🙂 ).


> which is also incorrect because it can occur, so that is burying problems.

> If you don't use DMA FIFO (a.k.a. "Direct mode"), the FIFO interrupt is harmless except for very extreme resource starvation, which is harmless with master transmitters (as they determine the pace of transmission themselves and a few cycles of delay seldom matters);

Ah right, and I see in the manual that the DMA controller  just continues in case of an overrun/underrun condition.

If the DMEIFx or the FEIFx flag is set due to an overrun or underrun condition, the faulty
stream is not automatically disabled and it is up to the software to disable or not the stream
by resetting the EN bit in the DMA_SxCR register. This is because there is no data loss
when this kind of errors occur

Now, I understand that there is no data loss in an underrun TX scenario, but in an RX overrun scenario there will be data loss imho unless flow control is enabled? So for RX it seems to me FEIE should be enabled (so you at least know you lost data), and since both RX and TX use the same HAL_DMA_Start_IT function, it should always be enabled....

 

> if such resource/bus starvation occurs, you have a graver problem to solve and should have DMA FIFO enabled anyway.


I think we agree on that, I am also arguing problems are potentially buried by not enabling the FEIE in the more recent drivers.


> The harmless FIFO interrupt occurs because of the sequencing of peripheral enable vs. DMA enable (if UART Tx DMA is enabled before DMA, the TXE signal towards DMA is active sooner than DMA, so at the moment DMA is enabled it already sees the peripheral request but it does not have buffered the byte to be transmitted from the memory yet


Except that does not seem to be the case here. In HAL_UART_Transmit_DMA the sequence is done in a correct way: DMA is enabled first, then the DMAT is set.

 

    /* Enable the UART transmit DMA stream */
    tmp = (uint32_t *)&pData;
    HAL_DMA_Start_IT(huart->hdmatx, *(uint32_t *)tmp, (uint32_t)&huart->Instance->DR, Size);

    /* Clear the TC flag in the SR register by writing 0 to it */
    __HAL_UART_CLEAR_FLAG(huart, UART_FLAG_TC);

    /* Process Unlocked */
    __HAL_UNLOCK(huart);

    /* Enable the DMA transfer for transmit request by setting the DMAT bit
       in the UART CR3 register */
    SET_BIT(huart->Instance->CR3, USART_CR3_DMAT);

 

So this should not trigger the FIFO error because of that reason (unless this order is still wrong ??).

 


> The change to disable this interrupt came as - as I've said - it is harmless, and it was much easier than to make a quite substantial change in the sequencing of how the various peripherals are enabled/set up in Cube/HAL.



So in RX scenarios I don't see how it is harmless (except for cases with flow control). And it seems the order is already correct.

So I am wondering if I am actually running into a real underrun situation with my UART at 921600 baud. HCLK is set to 48 MHz.
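
As a back-of-the-envelope check of that hypothesis, here is a small sketch that computes how many HCLK cycles pass between two UART byte requests. It assumes 8N1 framing, i.e. 10 bit times per byte (start + 8 data + stop); the function name is only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: number of whole HCLK cycles available per
   transmitted UART frame. The 64-bit intermediate avoids overflow
   for high clock rates. */
uint32_t hclk_cycles_per_byte(uint32_t hclk_hz, uint32_t baud, uint32_t bits_per_frame)
{
    return (uint32_t)(((uint64_t)hclk_hz * bits_per_frame) / baud);
}
```

For 48 MHz and 921600 baud this gives 520 cycles per byte, so the DMA would have to be kept off the bus for hundreds of cycles before a genuine underrun could happen.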

TDK
Guru

> Now,  I understand that there is no data loss in an underrun TX scenario, but in an RX overrun scenario there will be data loss imho unless flow control is enabled ?

Yes if you overrun in RX, data will be lost. But that is typically due to code bugs. A baud rate of 900k does not stress the bandwidth of the DMA.

FEIF should be ignored if not using the FIFO, and FEIE should be cleared. It's a non-issue.



> FEIF should be ignored if not using the FIFO, and FEIE should be cleared. It's a non-issue.


Fifo error means something even in direct mode as explicitly indicated in the documentation.

And we have an error being generated by the hardware for which there is so far no explanation ( it is not because of a wrong order in the init sequence, and it is not because of bandwidth problems apparently )

For me that is not a 'non-issue'.

I understand for the TX path it will not cause data loss, but if the same thing happens on the RX path it might.

It is not a nuclear reactor control system I am working on (thank god), but unexplained ignored problems have a tendency to bite you in the ass in the long run (and they are also great for getting deeper insights 🙂 ).

I will try lowering the baud rate to see if that changes something (though I don't think either outcome is hard proof for the bandwidth hypothesis).

 

 

Even though you're checking the HAL status for HAL_UART_Transmit_DMA, it's not enough.

I don't see any code for HAL_UART_TxCpltCallback?

You need to set a flag to indicate that the DMA is in the process of sending data. When you get a HAL_UART_TxCpltCallback, clear the flag. Once the flag is cleared, you can then call HAL_UART_Transmit_DMA over and over without it falling into the FIFO Error Interrupt management.

 

So I wrote this code for the STM32F429-Disco board.

Here is an example. I'm calling PollingInit() and PollingRoutine() from this task:

/* USER CODE BEGIN Header_StartDefaultTask */
/**
  * @brief  Function implementing the defaultTask thread.
  * @param  argument: Not used
  * @retval None
  */
/* USER CODE END Header_StartDefaultTask */
void StartDefaultTask(void const * argument)
{
  /* init code for USB_HOST */
  MX_USB_HOST_Init();
  /* USER CODE BEGIN 5 */
  /* Infinite loop */
  PollingInit();
  for(;;)
  {
	  PollingRoutine();
    osDelay(1);
  }
  /* USER CODE END 5 */
}

 

So here I am sending "Hello World" with a counter. If I comment out the if(!txPending) check on line 34, I get the FIFO errors and I see the LED toggle. If I do check this flag, I never get a FIFO error, nor do I see the LED toggle.

/*
 * PollingRoutine.c
 *
 *  Created on: Oct 24, 2023
 *      Author: karl.yamashita
 *
 *
 *      Template for projects.
 *
 */


#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include "main.h"
extern UART_HandleTypeDef huart1;

char txData[128] = {0};
uint32_t counter = 1;
volatile bool txPending = false; // cleared from HAL_UART_TxCpltCallback (interrupt context)


void PollingInit(void)
{

}

void PollingRoutine(void)
{
	sprintf(txData, "Hello World, Counter= %lu\r\n", (unsigned long)counter);
	while(1)
	{
		if(!txPending) // if you comment this out, you're going to get FIFO errors. But if you check if flag is cleared, then you won't get any FIFO errors.
		{
			if(HAL_UART_Transmit_DMA(&huart1, (uint8_t*)txData, strlen(txData)) == HAL_OK)
			{
				txPending = true;
				// note: this rewrites txData while the DMA may still be reading it
				sprintf(txData, "Hello World, Counter= %lu\r\n", (unsigned long)++counter);
			}
			else
			{
				HAL_GPIO_TogglePin(LD3_GPIO_Port, LD3_Pin);
			}
		}
	}
}

void HAL_UART_TxCpltCallback(UART_HandleTypeDef *huart)
{
	txPending = false; // transfer finished, the next HAL_UART_Transmit_DMA() call may start
}

 

If I put a breakpoint at line 4 of the code below, it never breaks as long as I check the txPending flag. But if I don't check the flag, the debugger always breaks after a number of messages have been sent.

/* FIFO Error Interrupt management ******************************************/
  if ((tmpisr & (DMA_FLAG_FEIF0_4 << hdma->StreamIndex)) != RESET)
  {
    if(__HAL_DMA_GET_IT_SOURCE(hdma, DMA_IT_FE) != RESET)
    {
      /* Clear the FIFO error flag */
      regs->IFCR = DMA_FLAG_FEIF0_4 << hdma->StreamIndex;

      /* Update error code */
      hdma->ErrorCode |= HAL_DMA_ERROR_FE;
    }
  }