stm32f429 USART DMA FIFO error on transmit upon completion

Peeters.Bram · ‎2024-02-09

Hi,

I am using DMA1 stream 3 for USART3 for tx on an stm32f429.

The driver code is based on the CubeMX-generated ST HAL drivers, firmware package version 1.24.

The UART and DMA initialisation parameters are as follows:

huart3.Instance = USART3;
huart3.Init.BaudRate = 921600;
huart3.Init.WordLength = UART_WORDLENGTH_8B;
huart3.Init.StopBits = UART_STOPBITS_1;
huart3.Init.Parity = UART_PARITY_NONE;
huart3.Init.Mode = UART_MODE_TX_RX;
huart3.Init.HwFlowCtl = UART_HWCONTROL_NONE;
huart3.Init.OverSampling = UART_OVERSAMPLING_16;

hdma_usart3_tx.Instance = DMA1_Stream3;
hdma_usart3_tx.Init.Channel = DMA_CHANNEL_4;
hdma_usart3_tx.Init.Direction = DMA_MEMORY_TO_PERIPH;
hdma_usart3_tx.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_usart3_tx.Init.MemInc = DMA_MINC_ENABLE;
hdma_usart3_tx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
hdma_usart3_tx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
hdma_usart3_tx.Init.Mode = DMA_NORMAL;
hdma_usart3_tx.Init.Priority = DMA_PRIORITY_LOW;
hdma_usart3_tx.Init.FIFOMode = DMA_FIFOMODE_DISABLE;

I use HAL_UART_Transmit_DMA(...) to transmit buffers (always from the same static buffer, to which I first copy the data to be transmitted).

A semaphore is taken before that call to make sure only one transfer is in progress at a time, until it completes.

Now, for some reason, I always get a FIFO error (LISR.FEIF3) at exactly the same spot, 100% reproducible (and on multiple boards).
It is not on the first message, more like the 30th.
The message itself is transmitted correctly and completely.
This is confirmed by the DMA registers: LISR.TCIF3 is set to 1 and S3NDTR.NDT is 0.
These and the other DMA registers at the moment the interrupt occurs are shown in the attached IAR screenshot.

If I look at the datasheet I get as possible reasons for the FIFO error:
• FIFO error: the FIFO error interrupt flag (FEIFx) is set if:
– A FIFO underrun condition is detected
– A FIFO overrun condition is detected (no detection in memory-to-memory mode
because requests and transfers are internally managed by the DMA)
– The stream is enabled while the FIFO threshold level is not compatible with the
size of the memory burst (refer to Table 48: FIFO threshold configurations)

+

In direct mode, the FIFO error flag can also be set under the following conditions:
• In the peripheral-to-memory mode, the FIFO can be saturated (overrun) if the memory
bus is not granted for several peripheral requests
• In the memory-to-peripheral mode, an underrun condition may occur if the memory bus
has not been granted before a peripheral request occurs

Since I am in direct mode, it cannot be because of the threshold.
So if the fault is valid it has to be an underrun, but then I would expect NDT to be non-zero, indicating the point at which the DMA encountered the underrun.

Am I overlooking something, or am I bumping into some DMA controller bug that causes an occasional spurious FIFO fault? It is strange, though, that it does not happen at random but at a fixed point in my flow.
For now I plan to modify the interrupt handler to ignore the error if NDT is 0, and cross my fingers that it only happens for complete transfers.
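
A minimal sketch of that workaround, written as a pure function so the decision can be checked off-target. The bit positions used here (FEIF3 = bit 22 and TCIF3 = bit 27 of DMA_LISR, NDT in the low 16 bits of SxNDTR) are assumptions that should be verified against the reference manual, and `fifo_error_is_ignorable` is a made-up name, not a HAL function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed DMA_LISR bit positions for stream 3 (verify against the manual). */
#define LISR_FEIF3 (1UL << 22) /* FIFO error flag */
#define LISR_TCIF3 (1UL << 27) /* transfer complete flag */

/* Hypothetical helper: true when a FIFO error is flagged although the
 * transfer has already finished (TCIF3 set and NDT == 0). */
bool fifo_error_is_ignorable(uint32_t lisr, uint32_t ndtr)
{
    bool fifo_error = (lisr & LISR_FEIF3) != 0U;
    bool complete   = (lisr & LISR_TCIF3) != 0U;
    bool none_left  = (ndtr & 0xFFFFU) == 0U; /* NDT assumed to be the low 16 bits */

    return fifo_error && complete && none_left;
}
```

The interrupt handler would pass in the register values it read on entry and skip the error callback only when this returns true, so a FIFO error on an unfinished transfer is still reported.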

But it would be nice to understand the problem completely, as I don't want to bury a potential real issue.

Does this ring any bells, or are there any suggestions for what else I can check?

Peeters.Bram · ‎2024-02-10

Hi Karl,

That is one of the jobs of the semaphore in my code (which is the thread safe way of doing what you are doing with your flag).

My transmit-complete callback is:

e_ReturnValue SerialDbg_TxCpltCallback(  uint8_t nUartID )
{
    if (!l_bInitialized)
    {
        LOGGER_LOG_ERROR(LOG_CLASS, "%s: module is not initialized", __FUNCTION__);
        return e_RETURNVALUE_IllegalState;
    }

    portBASE_TYPE xHigherPriorityTaskWoken = pdFALSE;

    //debug
    //LOGGER_LOG_ERROR(LOG_CLASS, "%s: Give Mutex %d, %x", __FUNCTION__, nUartID, &m_aTxSemaphore[nUartID]);

    xSemaphoreGiveFromISR( m_aTxSemaphore[nUartID], &xHigherPriorityTaskWoken );

#if configUSE_PREEMPTION
    portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
#endif

    return e_RETURNVALUE_Success;
}

(By the way, my logging functions are safe to call from an interrupt, in case you were wondering.)

(Besides this top-level semaphore, there is actually another semaphore that I added in a glue wrapper layer (automatically inserted with some #define magic) around the UART functions (and other HAL drivers) to block any two concurrent UART actions from happening. This is needed because the HAL drivers do not support USE_RTOS = 1 (see e.g. __HAL_LOCK), which is quite sad since they do generate code with FreeRTOS.)

Edit: I do realize now that my function that re-allows going into stop mode is called too soon, since at that point the DMA controller is still busy transmitting the last part of the string. But at the moment the problem occurs I have not triggered another setting that enables stop mode, so it should not be the cause. I will fix this and retest.

TDK · ‎2024-02-10

That's fair. I thought the flag was getting set if fifo was disabled, but now I see that is not the case.

I was unable to replicate this behavior on a Nucleo-F429 board. The issue is probably specific to your code and something going on inside of it. Hard to know how that's manifesting as an error in the peripheral.

If you feel a post has answered your question, please click "Accept as Solution".

Karl Yamashita · ‎2024-02-10

Ok, so more hidden code that you're not showing, sigh. I can't follow what's going on with code that you don't show.

Anyway, take out xSemaphoreGive in your HAL_UART_Transmit_DMA routine. That should work without getting a FIFO error.

if (xSemaphoreTake(m_aTxSemaphore[nUartID], CLOCKING_CONVERT_MS_OSTICKS(MAX_WAIT_TIME_MS) ) == pdTRUE)
        {
            //copy the data to a dedicated buffer
            memcpy(m_aTxBuffer[nUartID], pData + l_nCurLenTxed, l_nCurLen);

            //start the DMA transfer
            if (HAL_UART_Transmit_DMA(m_aphuart[nUartID], (uint8_t*)m_aTxBuffer[nUartID], l_nCurLen) != HAL_OK)
            {
                // Avoid an infinite loop of messages by not adding an extra message if the message we are printing is Seri*
                if ( ( length < 4 ) || (pData[0] != 'S' ) || (pData[1] != 'e' ) || (pData[2] != 'r' ) || (pData[3] != 'i' ))
                {
                    LOGGER_LOG_ERROR(LOG_CLASS, "%s: HAL_UART_Transmit_DMA failed", __FUNCTION__);
                }

                //release the binary semaphore again to stop the next SerialDbg_Write from blocking indefinitely

				//debug
				//LOGGER_LOG_ERROR(LOG_CLASS, "%s: Give Mutex %d, %x", __FUNCTION__, nUartID, &m_aTxSemaphore[nUartID]);

            //    xSemaphoreGive(m_aTxSemaphore[nUartID]); // you don't need!!!

                return e_RETURNVALUE_Failure;
            }
            else
            {
                l_nCurLenTxed += l_nCurLen;
            }
        }
        else
        {
            return e_RETURNVALUE_Failure;
        }

Tips and Tricks with TimerCallback https://www.youtube.com/@eebykarl
If you find my solution useful, please click the Accept as Solution so others see the solution.

Peeters.Bram · ‎2024-02-10

Karl,

That is the error path, which you can only reach if a DMA transfer was not started, so the transfer-complete callback will never happen.

If you do not release the semaphore in the error path, it will never be released and all future transmissions will time out trying to get the semaphore (as mentioned in the comment).
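
The argument can be illustrated with a small host-side model. This is only a sketch: `toy_sem_t`, `toy_take`, `toy_give` and `toy_write` are hypothetical stand-ins for the binary semaphore and the write path, not FreeRTOS or HAL calls.

```c
#include <stdbool.h>

/* Toy binary semaphore: true means a token is available. */
typedef struct { bool available; } toy_sem_t;

bool toy_take(toy_sem_t *s)
{
    if (!s->available) {
        return false;       /* a real RTOS would block and then time out */
    }
    s->available = false;
    return true;
}

void toy_give(toy_sem_t *s) { s->available = true; }

/* Hypothetical write path. start_ok models whether the DMA transfer could
 * be started; give_on_error selects whether the error path hands the token
 * back. Returns true only if a transfer was started. */
bool toy_write(toy_sem_t *s, bool start_ok, bool give_on_error)
{
    if (!toy_take(s)) {
        return false;
    }
    if (!start_ok) {
        if (give_on_error) {
            toy_give(s);    /* no completion callback will ever run */
        }
        return false;
    }
    return true;            /* the completion callback gives the token later */
}
```

With `give_on_error` set to false, one failed start leaves the token taken, and every later call fails at `toy_take` even when the transfer itself could have started.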

BTW, while double-checking the code to make sure that what I am saying is correct, I saw a bug in the ST HAL_UART_Transmit_DMA function: it does not check the result of the HAL_DMA_Start_IT call, so even if HAL_UART_Transmit_DMA returns success, the DMA might still not have started. It is not something I have ever run into (so far), but it is still best to fix it.
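
A sketch of what such a fix could look like, using a host-side model rather than the real HAL sources. All names below (`status_t`, `toy_uart_t`, `toy_dma_start`, `toy_transmit_dma`) are hypothetical; the point is only that the start result is propagated and the busy state is rolled back on failure.

```c
/* Hypothetical status codes, loosely mirroring HAL_OK / HAL_ERROR / HAL_BUSY. */
typedef enum { ST_OK = 0, ST_ERROR = 1, ST_BUSY = 2 } status_t;

/* Toy UART handle: a busy flag plus the result the modelled DMA start returns. */
typedef struct {
    int      tx_busy;
    status_t dma_start_result;
} toy_uart_t;

/* Stand-in for starting the DMA stream. */
status_t toy_dma_start(const toy_uart_t *u) { return u->dma_start_result; }

/* Transmit routine that checks the DMA start result instead of ignoring it. */
status_t toy_transmit_dma(toy_uart_t *u)
{
    if (u->tx_busy) {
        return ST_BUSY;
    }
    u->tx_busy = 1;

    if (toy_dma_start(u) != ST_OK) {
        u->tx_busy = 0;     /* roll back so later calls are not refused */
        return ST_ERROR;
    }
    return ST_OK;           /* completion handling clears tx_busy later */
}
```

With this shape the caller's error path (the one that releases the semaphore) runs whenever the transfer really did not start.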

I do not show all the code, only the parts that seem relevant to this problem, to keep things focused. That of course means I risk being wrong and leaving out something that is in fact important, but I have to make some sort of selection. And you could have been right that concurrent DMA actions were the cause of the problem, but I hope the code and explanation I have added make clear that this is not the case.

Peeters.Bram · ‎2024-02-10

I am now wondering if the problem might be related to sleep states of the processor.

I am sure I am not going into stop mode, but I do have FreeRTOS 'tickless idle' enabled, which also relies on the __WFI instruction.

If I quickly skim through 'https://community.st.com/t5/stm32-mcus-products/problem-using-dma-and-a-timer-while-cpu-is-in-sleep-mode/td-p/141825', it seems it might indeed cause DMA problems (though I do not yet understand everything said there).

I have to try a test that disables not only stop mode but also tickless idle during DMA transfers...
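
One possible way to set up that test is sketched below. It assumes the Cortex-M FreeRTOS port calls configPRE_SLEEP_PROCESSING(x) just before WFI in tickless idle and skips the WFI when the macro sets x to 0; that behaviour should be checked against the port.c in use. The counter and function names are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical count of DMA transfers currently in flight. On target the
 * updates would need to be atomic (e.g. inside a critical section), since
 * the increment runs in a task and the decrement in an ISR. */
static volatile uint32_t g_dma_busy_count;

void dma_transfer_started(void)
{
    g_dma_busy_count++;
}

void dma_transfer_finished(void)
{
    if (g_dma_busy_count > 0U) {
        g_dma_busy_count--;
    }
}

/* Intended to be hooked up in FreeRTOSConfig.h as
 *   #define configPRE_SLEEP_PROCESSING(x) pre_sleep_processing(&(x))
 * Zeroing the expected idle time asks the port to skip WFI (assumption). */
void pre_sleep_processing(uint32_t *expected_idle_ticks)
{
    if (g_dma_busy_count > 0U) {
        *expected_idle_ticks = 0U;
    }
}
```

If the FIFO error disappears with this hook in place, that would point at the sleep entry; if it stays, tickless idle can probably be ruled out.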

Karl Yamashita · ‎2024-02-10

FreeRTOS explicitly says you don't need it when you have xSemaphoreGiveFromISR in an ISR.

So for testing I did what you've done. I can replicate your issue, where I get FIFO errors if I release it from inside the HAL_UART_Transmit_DMA.

If I comment out xSemaphoreGive, I don't get the FIFO errors.


Peeters.Bram · ‎2024-02-10

>FreeRTOS explicitly says you don't need it when you have xSemaphoreGiveFromISR in in an ISR.

I am sorry, I am not following: what exactly is "it" in that sentence?

>So for testing i did what you've done. I can replicate your issue where i get FIFO errors if i release it from inside the HAL_UART_Transmit_DMA

I am confused about what exactly you are doing here.

I am not releasing the m_aTxSemaphore[nUartID] from inside HAL_UART_Transmit_DMA. I am releasing it if HAL_UART_Transmit_DMA fails, and if it does not fail then it will be released in the complete callback.

If you are referring to the comment I made about __HAL_LOCK not working (which I already regret; probably too much information already), that is a different semaphore, one that guards against concurrent operations on the UART (so it is common to all the functions, not a specific TX semaphore like this one). Think of it as an implementation of HAL_LOCK for RTOSes.

If I am misunderstanding and you are doing something else, please elaborate.

Karl Yamashita · ‎2024-02-10

Here is an example from FreeRTOS

/* Repetitive task. */
void vATask( void * pvParameters )
{
    /* We are using the semaphore for synchronisation so we create a binary
    semaphore rather than a mutex.  We must make sure that the interrupt
    does not attempt to use the semaphore before it is created! */
    xSemaphore = xSemaphoreCreateBinary();

    for( ;; )
    {
        /* We want this task to run every 10 ticks of a timer.  The semaphore
        was created before this task was started.

        Block waiting for the semaphore to become available. */
        if( xSemaphoreTake( xSemaphore, LONG_TIME ) == pdTRUE )
        {
            /* It is time to execute. */

            ...

            /* We have finished our task.  Return to the top of the loop where
            we will block on the semaphore until it is time to execute
            again.  Note when using the semaphore for synchronisation with an
            ISR in this manner there is no need to 'give' the semaphore
            back. */
        }
    }
}

So I have this code where, if I leave osSemaphoreRelease in, I get FIFO errors. If I comment it out, I don't get FIFO errors. I'm just following the FreeRTOS notes.

#include "main.h"



extern UART_HandleTypeDef huart1;
extern osSemaphoreId myBinarySem01Handle[];

char txData[128] = {0};
uint32_t counter = 1;
bool txPending = false;
uint32_t uartID = 1;


void PollingInit(void)
{
	sprintf((char*)txData, "Hello World, Counter= %ld\r\n", counter);
}

void PollingRoutine(void)
{
	if(osSemaphoreWait(myBinarySem01Handle[uartID], 0))
	{
		if(HAL_UART_Transmit_DMA(&huart1, (uint8_t*)txData, strlen((char*)txData)) != HAL_OK)
		{
			//osSemaphoreRelease(myBinarySem01Handle[uartID]);
			HAL_GPIO_TogglePin(LD3_GPIO_Port, LD3_Pin);
		}
		else
		{
			sprintf((char*)txData, "Hello World, Counter= %ld\r\n", ++counter);
		}
	}
}

void HAL_UART_TxCpltCallback(UART_HandleTypeDef *huart)
{
	osSemaphoreRelease(myBinarySem01Handle[uartID]);
}


Pavel A. · ‎2024-02-10

@Karl Yamashita In your example there's a race condition, after successful HAL_UART_Transmit_DMA in line 23 the ongoing TX buffer is clobbered by sprintf in line 30. Is this OK you think?
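
One common way around that kind of race is to format the next message into a second buffer, so the buffer handed to the DMA is never written while it is in flight. A minimal host-side sketch, assuming at most one transfer is outstanding at a time; the buffer size and the names `g_bufs`, `g_fill_index` and `prepare_next_message` are hypothetical:

```c
#include <stdint.h>
#include <stdio.h>

#define MSG_BUF_SIZE 64

/* Two buffers: one may be owned by the DMA while the other is being filled. */
static char g_bufs[2][MSG_BUF_SIZE];
static int  g_fill_index;   /* index of the buffer the CPU may write next */

/* Formats the next message into the buffer that is not in flight and
 * returns it. The caller hands the returned buffer to the DMA and must not
 * call this function again until it has picked up the previous result. */
const char *prepare_next_message(uint32_t counter)
{
    char *buf = g_bufs[g_fill_index];

    snprintf(buf, MSG_BUF_SIZE, "Hello World, Counter= %lu\r\n",
             (unsigned long)counter);
    g_fill_index ^= 1;      /* the following call uses the other buffer */
    return buf;
}
```

The alternative is to keep a single buffer and move the formatting into the path that runs only after the transmit-complete callback has fired.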

Peeters.Bram · ‎2024-02-10

The FreeRTOS example you refer to is a situation where you consume the semaphore gives that are generated by a timer task that runs continuously.

There is no launching of an asynchronous event there whose launch might fail, so there is no need for an error path.

In your own example you are also using CMSIS functions, to which I have had a lasting allergic reaction ever since my initial exposure more than a decade ago :)

Apart from Pavel's remark that you are overwriting your buffer while the DMA is working on it, it also seems to completely break the semaphore protection mechanism, if I am not mistaken (it's getting late here)??

The documentation of the semaphore wrapper says:

/// Wait until a Semaphore token becomes available.
/// \param[in] semaphore_id semaphore object referenced with \ref osSemaphoreCreate.
/// \param[in] millisec \ref CMSIS_RTOS_TimeOutValue or 0 in case of no time-out.
/// \return number of available tokens, or -1 in case of incorrect parameters.
/// \note MUST REMAIN UNCHANGED: \b osSemaphoreWait shall be consistent in every CMSIS-RTOS.

What does "no time-out" mean? No waiting or infinite waiting?

After looking at the implementation :

/**
* @brief Wait until a Semaphore token becomes available
* @param  semaphore_id  semaphore object referenced with \ref osSemaphore.
* @param  millisec      timeout value or 0 in case of no time-out.
* @retval  number of available tokens, or -1 in case of incorrect parameters.
* @note   MUST REMAIN UNCHANGED: \b osSemaphoreWait shall be consistent in every CMSIS-RTOS.
*/
int32_t osSemaphoreWait (osSemaphoreId semaphore_id, uint32_t millisec)
{
  TickType_t ticks;
  portBASE_TYPE taskWoken = pdFALSE;  
  
  
  if (semaphore_id == NULL) {
    return osErrorParameter;
  }
  
  ticks = 0;
  if (millisec == osWaitForever) {
    ticks = portMAX_DELAY;
  }
  else if (millisec != 0) {
    ticks = millisec / portTICK_PERIOD_MS;
    if (ticks == 0) {
      ticks = 1;
    }
  }
  
  if (inHandlerMode()) {
    if (xSemaphoreTakeFromISR(semaphore_id, &taskWoken) != pdTRUE) {
      return osErrorOS;
    }
	portEND_SWITCHING_ISR(taskWoken);
  }  
  else if (xSemaphoreTake(semaphore_id, ticks) != pdTRUE) {
    return osErrorOS;
  }
  
  return osOK;
}

And

osOK = 0,

.....

osErrorOS = 0xFF, ///< Unspecified RTOS error: run-time error but no other error message fits.

So it turns out that millisec = 0 means no waiting: the call returns immediately.

But it returns osErrorOS if the semaphore is not available, which is incorrect according to the documentation.

If you did get the semaphore, it returns osOK. Also very wrong.

Am I misreading something here, or is the whole thing terribly wrong??

So if I am right, it returns 0 if you successfully managed to get the semaphore (the "token" in CMSIS speak, apparently) and 0xFF if you did not get it.

So it seems that your code starts the DMA action when you failed to obtain the semaphore, so no wonder you get errors :)
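
The inverted check can be shown with a small host-side model. This is a sketch only: `toy_semaphore_wait` is a hypothetical stand-in for osSemaphoreWait(id, 0) that mimics the return values quoted above (0 on success, 0xFF on failure), and the two `may_transmit_*` helpers are made-up names for the two ways of testing the result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return values as quoted above for the wrapper. */
#define TOY_OS_OK       0
#define TOY_OS_ERROR_OS 0xFF

/* Hypothetical stand-in for osSemaphoreWait(id, 0): takes the token if one
 * is available and reports TOY_OS_OK, otherwise reports TOY_OS_ERROR_OS. */
int32_t toy_semaphore_wait(bool *token_available)
{
    if (!*token_available) {
        return TOY_OS_ERROR_OS;
    }
    *token_available = false;
    return TOY_OS_OK;
}

/* Truthiness test as in the polling example: true means "start the DMA". */
bool may_transmit_as_posted(bool *token)
{
    return toy_semaphore_wait(token) != 0;
}

/* Explicit comparison against the success code. */
bool may_transmit_checked(bool *token)
{
    return toy_semaphore_wait(token) == TOY_OS_OK;
}
```

In this model the truthiness test consumes an available token without transmitting, and from then on reports "transmit" on every call because the wait keeps failing; the explicit comparison transmits only when the token was actually taken.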

I am guessing it only starts doing something because you did not load an initial value into the semaphore (I don't see the creation of the semaphore, and the documentation would probably be wrong anyway :D:D)?

Though, thinking about it, it is interesting that starting concurrent DMA actions leads to a FIFO error; I don't understand why that is.

Also, are you calling the polling routine from a single thread, as in your original example? Because then you should always immediately get the HAL_BUSY return code, without HAL_UART_Transmit_DMA actually doing anything, since huart->gState = HAL_UART_STATE_BUSY_TX is set and stays set until it is cleared in UART_EndTransmit_IT right before the complete callback is called. So the code should not start two simultaneous DMA transfers even if your semaphore protection fails? Only if you are calling it from multiple threads on a preemptive OS do you risk that both manage to squeeze past the huart->gState == HAL_UART_STATE_READY check at the beginning of HAL_UART_Transmit_DMA.