SDIO+DMA FIFO Error (TX Underrun) F4 STM32F417ZGT6

Question asked by reynolds.mike on Jun 4, 2014
Latest reply on May 4, 2015 by craig b
I'm having problems with FIFO errors using the F4's SDIO and DMA. It appears that as soon as I enable a block write to the SD card, the DMA often immediately sets the DMA_FLAG_FEIF3 flag in DMA2->LISR. No such problem when receiving data from the card. After researching the problem, I think I understand what's going on, but I'm not sure of the proper solution. It appears that this problem can happen for several reasons:
(1) - A FIFO underrun condition is detected
(2) - A FIFO overrun condition is detected (no detection in memory-to-memory mode because requests and transfers are internally managed by the DMA)
(3) - The stream is enabled while the FIFO threshold level is not compatible with the size of the memory burst (refer to Table 26: FIFO threshold configurations)

I don't think that number (3) is the problem because I'm using the recommended DMA setup from ST:

    // Before enabling the control bit to '1' the corresponding event flag should be cleared, otherwise an interrupt is generated immediately.
    // "From performing some experiments I discovered, and ST subsequently confirmed, that the DMA stream will NOT start up if its interrupt status bits are set. This is not described in the current documentation. Even if you're not using interrupts, you MUST clear the appropriate bits in the DMA Interrupt Flag Clear register."
    DMA_ClearFlag(DMA_STREAM_SDIO, SD_SDIO_DMA_FLAG_FEIF | SD_SDIO_DMA_FLAG_DMEIF | SD_SDIO_DMA_FLAG_TEIF | SD_SDIO_DMA_FLAG_HTIF | SD_SDIO_DMA_FLAG_TCIF);
    // Configure the DMA and enable it
    DMA_Cmd(DMA_STREAM_SDIO, DISABLE);    // DMA2 Stream3  or Stream6 disable
    while (DMA_STREAM_SDIO->CR & DMA_SxCR_EN); // Wait Until the EN bit is read as 0 before DMA_Init() as per
    DMA_DeInit(DMA_STREAM_SDIO);        // DMA2 Stream3  or Stream6 Config
    DMA_InitStructure.DMA_Channel = DMA_CHANNEL_SDIO;
    DMA_InitStructure.DMA_PeripheralBaseAddr = (uint32_t)SDIO_FIFO_ADDRESS;
    DMA_InitStructure.DMA_Memory0BaseAddr = (uint32_t)BufferSRC;
    DMA_InitStructure.DMA_DIR = DMA_DIR_MemoryToPeripheral;
    // DMA_BufferSize specifies the buffer size, in data units, of the specified Stream. The data unit is equal to the configuration set in DMA_PeripheralDataSize or DMA_MemoryDataSize members depending on the transfer direction.
    // When the peripheral flow controller is used for a given stream, the value written into the DMA_SxNDTR (DMA_BufferSize) has no effect on the DMA transfer. Actually, whatever the value written, it will be forced by hardware to 0xFFFF as soon as the stream is enabled, ...
    // It doesn't matter what you set this to for SDIO as long as DMA is configured to use SDIO peripheral for flow control.
    DMA_InitStructure.DMA_BufferSize = (0);
    DMA_InitStructure.DMA_PeripheralInc = DMA_PeripheralInc_Disable;
    DMA_InitStructure.DMA_MemoryInc = DMA_MemoryInc_Enable;
    DMA_InitStructure.DMA_PeripheralDataSize = DMA_PeripheralDataSize_Word;
    DMA_InitStructure.DMA_MemoryDataSize = DMA_MemoryDataSize_Word;
    DMA_InitStructure.DMA_Mode = DMA_Mode_Normal;
    DMA_InitStructure.DMA_Priority = DMA_Priority_VeryHigh;
    DMA_InitStructure.DMA_FIFOMode = DMA_FIFOMode_Enable;
    DMA_InitStructure.DMA_FIFOThreshold = DMA_FIFOThreshold_Full;
    DMA_InitStructure.DMA_MemoryBurst = DMA_MemoryBurst_INC4;
    DMA_InitStructure.DMA_PeripheralBurst = DMA_PeripheralBurst_INC4;
    DMA_Init(DMA2_Stream3, &DMA_InitStructure);

    DMA_ITConfig(DMA2_Stream3, DMA_IT_TC, ENABLE);

    DMA_FlowControllerConfig(DMA2_Stream3, DMA_FlowCtrl_Peripheral);

    DMA_Cmd(DMA2_Stream3, ENABLE);    // DMA2 Stream3  or Stream6 enable
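
For reference when reading DMA2->LISR, the per-stream flag layout can be decoded with a small helper. This is a host-side sketch: the bit positions follow the DMA_LISR description in the F4 reference manual, while `dma_lisr_stream_flags` and the `SX_*` names are made up for illustration and are not StdPeriph identifiers.

```c
#include <stdint.h>

/* Flag bits relative to one stream's field in DMA_LISR / DMA_HISR. */
enum {
    SX_FEIF  = 1u << 0,  /* FIFO error */
    SX_DMEIF = 1u << 2,  /* direct mode error */
    SX_TEIF  = 1u << 3,  /* transfer error */
    SX_HTIF  = 1u << 4,  /* half transfer */
    SX_TCIF  = 1u << 5   /* transfer complete */
};

/* Bit offset of streams 0..3 inside DMA_LISR
 * (streams 4..7 use the same offsets inside DMA_HISR). */
static const uint8_t sx_shift[4] = { 0, 6, 16, 22 };

/* Extract the five flags of one stream from a raw LISR/HISR value. */
static uint32_t dma_lisr_stream_flags(uint32_t lisr, unsigned stream)
{
    return (lisr >> sx_shift[stream & 3u]) & 0x3Du;
}
```

On target, `dma_lisr_stream_flags(DMA2->LISR, 3) & SX_FEIF` would be non-zero exactly when DMA_FLAG_FEIF3 is set; the same bit positions written to DMA2->LIFCR clear the flags.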

The reference manual says:
"The content pointed by the FIFO threshold must exactly match to an integer number of memory
burst transfers. If this is not in the case, a FIFO error (flag FEIFx of the DMA_HISR or
DMA_LISR register) will be generated when the stream is enabled, then the stream will be
automatically disabled."
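
The rule quoted above boils down to a divisibility check on the memory side. The sketch below assumes the 4-word (16-byte) stream FIFO described in the reference manual; the function name and parameter encoding are hypothetical, and it does not cover the separate peripheral-burst constraints.

```c
#include <stdbool.h>
#include <stdint.h>

/* threshold_quarters: 1..4 for 1/4, 1/2, 3/4, full.
 * msize_bytes: memory data size (1, 2 or 4).
 * burst_beats: memory burst length (4, 8 or 16). */
static bool dma_fifo_burst_ok(unsigned threshold_quarters,
                              unsigned msize_bytes,
                              unsigned burst_beats)
{
    const unsigned fifo_bytes = 16u;  /* 4 words deep */
    unsigned threshold_bytes = fifo_bytes * threshold_quarters / 4u;
    unsigned burst_bytes = msize_bytes * burst_beats;

    if (threshold_quarters < 1u || threshold_quarters > 4u)
        return false;
    if (burst_bytes == 0u || burst_bytes > fifo_bytes)
        return false;                 /* burst larger than the FIFO */
    /* The threshold must hold a whole number of memory bursts. */
    return (threshold_bytes % burst_bytes) == 0u;
}
```

For the configuration in this post (word data size, INC4, full threshold) the call is `dma_fifo_burst_ok(4, 4, 4)`, which passes, consistent with ruling out cause (3).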

Since I'm transmitting data to the card, this leads me to believe the root cause of the FIFO error is a TX underrun, as described in DM00046011.pdf:

"If the user enables the used peripheral before the corresponding DMA stream, a "FEIF" (FIFO Error Interrupt Flag) may be set due to the fact the DMA is not ready to provide the first required data to the peripheral (in case of memory-to-peripheral transfer)."

which was pointed out by Mayla in this post: [DMA SDIO TX underrun error]

One of the things I attempted in order to fix the problem was reordering the setup sequence in all of the read and write data block functions (SD_ReadBlock, SD_ReadMultiBlocks, SD_WriteBlock, SD_WriteMultiBlocks) to conform to what appears to be the proper order of initialization: DMA setup, CPSM setup, DPSM setup. My understanding is that the DMA must be set up, ready, and waiting before triggering the DPSM to go and fetch data from memory. The release notes of a newer StdPeriphLib mention a bug fixed from previous versions: "Transmit and receive functions: swap the order of state machine and DMA configuration, to fix marginal limitation where the card sent data to the SDIO interface while the DMA is not ready to transfer them"

 V1.1.0 / 21-December-2012

I also gleaned the importance of this from this post where Décio mentions some problems with the ST Std Periph library setup of read/write blocks:
I'm fairly confident that "the used peripheral" is disabled before enabling the DMA, since DCTRL is set to zeros before setting up the DMA. Unless the peripheral must be disabled in some other way...? Then the DMA is set up and enabled with peripheral flow control using DMA_FlowControllerConfig(DMA2_Stream3, DMA_FlowCtrl_Peripheral), and the SDIO is enabled for DMA usage using SDIO_DMACmd(ENABLE). Then the DPSM is set up and enabled using SDIO_DataConfig(&SDIO_DataInitStructure).
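
The ordering described above (DMA first, DPSM last) can be written down as a short sketch. The `sd_tx_ops` hook table and `sd_start_block_write` are hypothetical names; on target each hook would wrap the StdPeriph calls named in its comment. Where the write command sits relative to SDIO_DMACmd is an assumption here. The stub hooks at the bottom only record the call order for a host-side dry run.

```c
#include <stdint.h>

/* Hypothetical hardware hooks, one per setup step. */
struct sd_tx_ops {
    void (*dma_arm)(const uint32_t *src);  /* clear flags, DMA_Init, DMA_FlowControllerConfig, DMA_Cmd(ENABLE) */
    void (*sdio_dma_enable)(void);         /* SDIO_DMACmd(ENABLE) */
    int  (*send_write_cmd)(uint32_t addr); /* CPSM: write command, 0 if the response is OK */
    void (*dpsm_start)(uint32_t len);      /* SDIO_DataConfig() with the DPSM enabled */
};

static int sd_start_block_write(const struct sd_tx_ops *ops,
                                const uint32_t *src,
                                uint32_t addr, uint32_t len)
{
    int err;

    ops->dma_arm(src);               /* 1. stream armed and waiting */
    ops->sdio_dma_enable();          /* 2. SDIO may raise DMA requests */
    err = ops->send_write_cmd(addr); /* 3. CPSM */
    if (err != 0)
        return err;                  /* caller disables the stream again */
    ops->dpsm_start(len);            /* 4. DPSM last */
    return 0;
}

/* Host-side dry run: stub hooks that record the call order. */
static char trace[8];
static unsigned trace_len;
static void t_dma(const uint32_t *s) { (void)s; trace[trace_len++] = 'D'; }
static void t_req(void)              { trace[trace_len++] = 'R'; }
static int  t_cmd(uint32_t a)        { (void)a; trace[trace_len++] = 'C'; return 0; }
static void t_dpsm(uint32_t n)       { (void)n; trace[trace_len++] = 'P'; }
static const struct sd_tx_ops trace_ops = { t_dma, t_req, t_cmd, t_dpsm };
```

Keeping the order in one function makes it harder for the four read/write routines to drift apart again.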

I also ran across some other notes of wisdom:
 "Because once the SDIO commences the write to the card, it pretty much immediately requires data to give to the card. If it doesn't get it pronto, you’ll end up with an SDIO error (usually a FIFO error because the transmit FIFO underran)."
 "any delay in getting data in or out during that time then either an under-run or overrun results"
and this:
"Getting data transmit (send data to the card) to startup properly on the STM32F4xx / 2xx can be very tricky. Here's my understanding.

When you enable the SDIO (via the DTEN bit in the SDIO_DCTRL register) the FIFO is empty. So the TXFIFOHE interrupt will trigger immediately, and at the same time the SDIO peripheral will start attempting to write data to the SD card. Hence data must appear in the Tx FIFO extremely quickly, otherwise a Tx FIFO underrun will occur and the SDIO peripheral will shut down.

It is not possible to pre-load the FIFO before enabling the SDIO. I've tried and it doesn't work. I believe the FIFO is hardware-cleared until the SDIO is enabled, or something similar to that.

What this means is that at the moment of SDIO turn-on (when the DTEN bit is set), that TXFIFOHE interrupt must trigger. At that point in time it must be the highest priority interrupt in the system, or be the only interrupt. If it's delayed for any reason, for example because another interrupt occurs at that time, then a Tx FIFO underrun will very quickly follow. Think very carefully about your enabled interrupts at that critical SDIO transmit start-up point. You may want to consider using the NVIC to make the SDIO be the highest priority interrupt, permitted to preempt all other interrupts. Or, come up with some other scheme to ensure that first TXFIFOHE interrupt can execute immediately."

But the DMA is set up, enabled, and ready, as is SDIO_DMACmd(ENABLE)! It seems like it should be able to fetch the data for the SDIO right away. There shouldn't be anything interfering with the DMA's ability to grab the data that I'm aware of. There is only one software interrupt enabled: UART0, but no data is being transferred over it. And all of the other DMA streams and channels are inactive.
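
To put a rough number on "extremely quickly", the drain rate of the Tx FIFO follows from the card clock and bus width. The post states neither, so the 24 MHz / 4-bit figures in the usage note are assumptions, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Nanoseconds the SDIO needs to shift out one 32-bit FIFO word.
 * bus_width_bits is 1, 4 or 8. */
static uint64_t sdio_word_period_ns(uint32_t sdio_ck_hz, unsigned bus_width_bits)
{
    uint64_t clocks_per_word = 32u / bus_width_bits;  /* 8 clocks on a 4-bit bus */
    return clocks_per_word * 1000000000ull / sdio_ck_hz;
}
```

With an assumed 24 MHz SDIO_CK and a 4-bit bus, `sdio_word_period_ns(24000000u, 4)` gives 333 ns per word. This says nothing about start-up latency; it only bounds the sustained rate the DMA has to keep up with once the DPSM is sending.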

It seems that one way to fix this is to use the SDIO's flow control setting: SDIO_HardwareFlowControl = SDIO_HardwareFlowControl_Enable, because it can avoid these FIFO errors:

"26.8 HW flow control. The HW flow control functionality is used to avoid FIFO underrun (TX
mode) and overrun (RX mode) errors.
The behavior is to stop SDIO_CK and freeze SDIO state machines. The data
transfer is stalled while the FIFO is unable to transmit or receive
data. Only state machines clocked by SDIOCLK are frozen, the AHB
interface is still alive. The FIFO can thus be filled or emptied even if
flow control is activated."

BUT, as per the F4 hardware errata, HW flow control cannot be used ("Do not use the HW flow control") because of data corruption and CRC errors. It goes on to say: "Overrun errors (Rx mode) and FIFO underrun (Tx mode) should be managed by the application software".

Others have said this:
"The activation of 'Hardware flow control' is supposed to solve FIFO overrun errors, but when enabling it i get random Data CRC errors.."
"Read the ERRATA, the SDIO flow controller has some issues by silicon limitations."

 The reference manual says:

"When a FIFO overrun or underrun condition occurs, the data are not lost because the
peripheral request is not acknowledged by the stream until the overrun or underrun
condition is cleared. If this acknowledge takes too much time, the peripheral itself may
detect an overrun or underrun condition of its internal buffer and data might be lost."

How can TX underrun errors be managed by application software? When a FIFO error occurs, do you have to repeat the DMA and DPSM setup and retry the data transaction? Does any partial data get sent out the SDIO before the TX underrun halts the data flow? If so, it seems like the command would have to be re-issued so as to restart the whole block write process (in the case of a single block write) or repoint a multiblock write to the address where the incomplete block transfer was aborted midway through. This seems to get complicated fast. I feel like I must be missing something simple, like slightly altering the initialization sequence such that the DMA is more reactive when data is requested. Yet, as far as I can tell, it should be sitting there waiting to go when the SDIO's DPSM goes from WAIT_S to SEND.
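
For the single-block case, the "repeat the whole transaction" option could look like the bounded retry below. This is only a sketch of that one option, not a confirmed answer: it assumes that re-issuing the complete write to the same address is acceptable and that the card is back in a state to accept a new command. `sd_write_fn` and `sd_write_block_retry` are hypothetical names, and `fake_write` is a host-side stand-in for the real write routine.

```c
#include <stdint.h>

/* One complete write attempt: DMA + CPSM + DPSM setup, then wait for the result.
 * Returns 0 on success, non-zero on any error (e.g. TX underrun). */
typedef int (*sd_write_fn)(const uint32_t *src, uint32_t lba, void *ctx);

/* Retry the whole single-block write up to max_tries times.
 * Returns 0 on success, otherwise the last error code. */
static int sd_write_block_retry(sd_write_fn write_once, void *ctx,
                                const uint32_t *src, uint32_t lba,
                                unsigned max_tries)
{
    int err = -1;
    for (unsigned i = 0; i < max_tries; ++i) {
        err = write_once(src, lba, ctx);  /* full setup on every attempt */
        if (err == 0)
            break;
    }
    return err;
}

/* Host-side stand-in: fails until the counter in ctx reaches zero. */
static int fake_write(const uint32_t *src, uint32_t lba, void *ctx)
{
    unsigned *fails_left = ctx;
    (void)src;
    (void)lba;
    if (*fails_left == 0u)
        return 0;
    --*fails_left;
    return 1;
}
```

A multi-block write would need more than this, since the retry has to restart from the block that was interrupted.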

In Frank's blog post he appears to be using interrupts and somehow manages to get the data pushed into the FIFO manually when the TXFIFOHE interrupt goes off. I want to rely exclusively on the DMA, if possible, rather than handling interrupts manually. This way I can initiate the transaction, perform other tasks in the background with minimal CPU intervention, and later poll for the successful completion (or erroneous termination) of the DMA transaction and SDIO operation.
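
The polling side of that plan can be sketched as a pure function over a snapshot of SDIO_STA. The bit positions are taken from the SDIO_STA description in the F4 reference manual; the `STA_*` macros, `enum sd_poll`, and `sd_poll_tx` are hypothetical names, not the library's SDIO_FLAG_* definitions.

```c
#include <stdint.h>

/* SDIO_STA bits relevant to a data transmit. */
#define STA_DCRCFAIL (1u << 1)  /* data block CRC check failed */
#define STA_DTIMEOUT (1u << 3)  /* data timeout */
#define STA_TXUNDERR (1u << 4)  /* transmit FIFO underrun */
#define STA_DATAEND  (1u << 8)  /* data counter reached zero */
#define STA_STBITERR (1u << 9)  /* start bit not detected */

enum sd_poll { SD_POLL_BUSY, SD_POLL_DONE, SD_POLL_ERROR };

/* Classify one snapshot of SDIO_STA during a DMA-driven write. */
static enum sd_poll sd_poll_tx(uint32_t sdio_sta)
{
    if (sdio_sta & (STA_DCRCFAIL | STA_DTIMEOUT | STA_TXUNDERR | STA_STBITERR))
        return SD_POLL_ERROR;  /* errors take precedence over DATAEND */
    if (sdio_sta & STA_DATAEND)
        return SD_POLL_DONE;
    return SD_POLL_BUSY;
}
```

On target the argument would be `SDIO->STA`. A DONE result only means the SDIO has finished shifting the data out; the card may still be busy programming it.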

Also, what exactly is the difference between the flow control setting of the DMA (DMA_FlowControllerConfig(DMA2_Stream3, DMA_FlowCtrl_Peripheral)) vs the SDIO flow control setting (SDIO_InitStructure.SDIO_HardwareFlowControl = SDIO_HardwareFlowControl_Disable;)?
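
As far as the reference manual describes them, the two settings live in different registers and do different jobs: the DMA setting (PFCTRL in DMA_SxCR) lets the peripheral, rather than the NDTR count, decide when the transfer ends, while the SDIO setting (HWFC_EN in SDIO_CLKCR) stalls SDIO_CK when the FIFO cannot keep up. A minimal sketch of the two bits, with hypothetical macro and function names:

```c
#include <stdint.h>

#define DMA_SXCR_PFCTRL_BIT (1u << 5)   /* DMA_SxCR: peripheral is the flow controller */
#define SDIO_CLKCR_HWFC_BIT (1u << 14)  /* SDIO_CLKCR: hardware flow control enable */

/* The combination discussed in this post: DMA peripheral flow control on,
 * SDIO hardware flow control off (as the errata requires). */
static uint32_t dma_cr_with_pfctrl(uint32_t cr)
{
    return cr | DMA_SXCR_PFCTRL_BIT;
}

static uint32_t clkcr_without_hwfc(uint32_t clkcr)
{
    return clkcr & ~SDIO_CLKCR_HWFC_BIT;
}
```

So disabling the SDIO hardware flow control does not conflict with using DMA_FlowCtrl_Peripheral on the stream.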

My starting point software is based off ST's library: STM32F4xx_StdPeriph_Driver v1.0.2 (2012/03/05), and not the new stm32f4xx_hal_sd.c stuff in CubeMX.

I've incorporated various bug fixes, such as having all read and write functions (SD_ReadBlock, etc.) accept a 32-bit block-oriented address rather than ST's hack of a 64-bit byte-oriented address, and using multiplies and shifts instead of divides (divide by 512) to convert between the two orientations.

    if ((g_CardType == SDIO_STD_CAPACITY_SD_CARD_V1_1) || (g_CardType == SDIO_STD_CAPACITY_SD_CARD_V2_0)) {
        // If the card isn't high capacity (SDHC or SDXC), it expects addresses to be byte-oriented instead of block-oriented.
        // Apply the conversion once: either "*= 512" or "<<= 9", not both.
        ReadAddr_Block_LBA <<= 9; // convert from block address to byte-oriented address by multiplying by 512
    } else if (g_CardType == SDIO_HIGH_CAPACITY_SD_CARD) {
        BlockSize = 512; // high-capacity cards always use 512-byte blocks
    } else {
        return (SD_UNSUPPORTED_HW);
    }
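
The same conversion can be isolated in a helper that also catches overflow on byte-addressed cards. This is a sketch with hypothetical names (`sd_card_address`, the `high_capacity` flag); it is not part of ST's library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a 512-byte block number into the command argument the card expects.
 * Returns false if a byte-addressed card cannot reach that block. */
static bool sd_card_address(bool high_capacity, uint32_t lba, uint32_t *arg)
{
    if (high_capacity) {          /* SDHC/SDXC: block addressed */
        *arg = lba;
        return true;
    }
    if (lba > (UINT32_MAX >> 9))  /* byte address would not fit in 32 bits */
        return false;
    *arg = lba << 9;              /* standard capacity: byte addressed, x512 */
    return true;
}
```

For example, block 3 becomes byte address 1536 on a standard-capacity card and stays 3 on a high-capacity card.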

And patching SDSTATUS_Tab[16]; ----> SDSTATUS_Tab[64];

SDIO_InitStructure.SDIO_ClockBypass = SDIO_ClockBypass_Disable; // F4 errata: MUST use SDIO_ClockBypass_Disable on STM32F4, unless not using USB or RNG, because the SDIO clock divider BYPASS mode may not work properly (see the device's errata doc)
And fixed the 32-bit address limitations in the SCSI layer (SCSI_ProcessRead(), SCSI_ProcessWrite(), etc) from suggestions by clive1 (<--THANKS!).


Any assistance or guidance would be greatly appreciated!

Thanks in advance!