I'm having difficulties with STM32F4xx SDIO when using DMA transfers in circular mode (with a single buffer or a double memory buffer alike). The SDIO transfer (multi-block read or write) completes successfully, but the DMA keeps waiting indefinitely for the last 4 words of data.
Everything works nicely if the DMA mode is set to DMA_PFCTRL (peripheral flow control). I can read/write the memory card at any speed, for as many blocks (512 bytes each) as needed. But peripheral flow control does not work with a circular buffer, of course.
Once the DMA is reconfigured to use a circular buffer, it can transfer one or even multiple full buffers successfully (the received data is correct, etc.). But the SDIO controller reports transfer completion (SDIO_IT_DATAEND flag in the STA register) while the DMA is still waiting for 16 more bytes (4 words, or exactly one full DMA FIFO). The DMA’s DMA_SxNDTR register holds 0x04.
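To make the arithmetic explicit, here is a minimal host-side sketch of how I convert the leftover DMA_SxNDTR value into bytes and bursts. The helper names are my own (hypothetical, not part of any HAL), and it assumes NDTR counts peripheral-width items, which is how I read the reference manual for peripheral-to-memory transfers.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes still owed to the DMA stream, given the remaining item count read
 * from DMA_SxNDTR and the peripheral item width in bytes
 * (4 for DMA_PDATAALIGN_WORD). Hypothetical helper, not a HAL function. */
uint32_t dma_residual_bytes(uint16_t ndtr, uint32_t item_bytes)
{
    assert(item_bytes == 1u || item_bytes == 2u || item_bytes == 4u);
    return (uint32_t)ndtr * item_bytes;
}

/* Number of whole peripheral bursts the residue corresponds to
 * (beats_per_burst is 4 for DMA_PBURST_INC4). Hypothetical helper. */
uint32_t dma_residual_bursts(uint16_t ndtr, uint32_t beats_per_burst)
{
    assert(beats_per_burst != 0u);
    return (uint32_t)ndtr / beats_per_burst;
}
```

With NDTR = 0x04 and word-sized items this gives 16 bytes, i.e. exactly one INC4 burst is missing.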
The debugger didn’t reveal any setup issues. All registers and counters are set and updated properly during the transfer. All interrupts trigger as expected. There are no FIFO-related errors.
If the SDIO data counter is increased by 16, all the data that the DMA expects to receive is delivered. Of course, this is no longer a “multiple-of-block-size” amount of data, so it breaks the SDIO CRC check (incomplete block) and leads to a general transfer error. But it does demonstrate that the data is available in the SDIO FIFO; it is just not delivered to the DMA FIFO before SDIO stops. It is as if the SDIO FIFO data-available interrupt (RXDAVLIE) were not triggered at the last moment.
Tested with different HAL libraries: Keil’s version of the CMSIS HAL and the STMicroelectronics HAL generated by STM32CubeMX. Apart from minor bugs, both get stuck on the same DMA issue in circular mode.
Could it be a hardware bug?
Tested with two discovery boards and both reproduce the same issue:
- HSE @8MHz
- CPU clocked @168MHz
- SDIO clocked @48MHz
- Memory Card (SD class 10): 4-bit @24MHz
(also tested at high speed: 4-bit @48MHz, SDIO clock bypass)
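For reference, the card clock figures above follow from the SDIO clock divider. A minimal sketch, assuming the relation SDIO_CK = SDIOCLK / (CLKDIV + 2) from the reference manual, with the divider skipped when the bypass bit is set; the function name is my own:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Card clock (SDIO_CK) derived from the SDIO kernel clock (SDIOCLK).
 * Assumes SDIO_CK = SDIOCLK / (CLKDIV + 2), or SDIOCLK itself when the
 * clock divider is bypassed. Hypothetical helper, not a HAL function. */
uint32_t sdio_ck_hz(uint32_t sdioclk_hz, uint8_t clkdiv, bool bypass)
{
    assert(sdioclk_hz != 0u); /* a zero kernel clock means the 48 MHz source is not set up */
    if (bypass) {
        return sdioclk_hz;
    }
    return sdioclk_hz / ((uint32_t)clkdiv + 2u);
}
```

With SDIOCLK at 48 MHz, CLKDIV = 0 gives the 24 MHz case and bypass gives the 48 MHz case.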
Setup code is generated with STM32CubeMX for Keil uVision 5.
The only peripheral enabled is SDIO. One GPIO pin is used to flash LED.
The SD card interrupt has a higher NVIC priority than the DMA Rx/Tx interrupts (e.g. 8 versus 10 for DMA). Otherwise the HAL library code gets stuck inside the DMA interrupt handler while waiting for the SDIO data transfer to complete.
The 16 kB test buffer is located in SDRAM1 and aligned on a 1 kB boundary.
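The buffer layout can be sketched roughly as follows. This is a host-compilable illustration using C11 `_Alignas`; the names are placeholders of mine, and the real project places the buffer through the Keil toolchain rather than exactly like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SD_BLOCK_SIZE  512u            /* bytes per SD block */
#define TEST_BUF_SIZE  (16u * 1024u)   /* 16 kB test buffer */
#define TEST_BUF_ALIGN 1024u           /* 1 kB alignment */

/* Placeholder for the DMA target buffer (hypothetical name). */
_Alignas(TEST_BUF_ALIGN) uint8_t test_buf[TEST_BUF_SIZE];

/* True if p is aligned to 'align', which must be a power of two. */
bool is_aligned(const void *p, size_t align)
{
    assert(align != 0u && (align & (align - 1u)) == 0u);
    return ((uintptr_t)p & (align - 1u)) == 0u;
}

/* Whole SD blocks that fit into a buffer of buf_bytes. */
size_t blocks_in_buffer(size_t buf_bytes)
{
    return buf_bytes / SD_BLOCK_SIZE;
}
```

The 16 kB buffer holds 32 blocks, or 4096 words, so one pass through the circular buffer stays well inside the 16-bit NDTR range.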
Full DMA configuration:
hdma_sdio_rx.Instance = DMA2_Stream3;
hdma_sdio_rx.Init.Channel = DMA_CHANNEL_4;
hdma_sdio_rx.Init.Direction = DMA_PERIPH_TO_MEMORY;
hdma_sdio_rx.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_sdio_rx.Init.MemInc = DMA_MINC_ENABLE;
hdma_sdio_rx.Init.PeriphDataAlignment = DMA_PDATAALIGN_WORD;
hdma_sdio_rx.Init.MemDataAlignment = DMA_MDATAALIGN_WORD;
hdma_sdio_rx.Init.Mode = DMA_CIRCULAR; /* fails as described; DMA_PFCTRL works */
hdma_sdio_rx.Init.Priority = DMA_PRIORITY_HIGH;
hdma_sdio_rx.Init.FIFOMode = DMA_FIFOMODE_ENABLE;
hdma_sdio_rx.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
hdma_sdio_rx.Init.MemBurst = DMA_MBURST_INC4;
hdma_sdio_rx.Init.PeriphBurst = DMA_PBURST_INC4;
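
To rule out an invalid FIFO/burst combination, I checked the settings above against the FIFO threshold rules as I understand them from the reference manual: a burst must fit into the 16-byte stream FIFO, and the threshold level must be a whole number of bursts. The sketch below is my own host-side restatement of that rule (hypothetical function name), not code from the HAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DMA_FIFO_BYTES 16u /* each DMA stream FIFO is 4 words deep */

/* threshold_quarters: 1 = 1/4, 2 = 1/2, 3 = 3/4, 4 = full.
 * msize_bytes: memory data width (1, 2 or 4).
 * mburst_beats: beats per memory burst (4, 8 or 16).
 * Returns true when the burst fits into the FIFO and the threshold
 * holds a whole number of bursts. */
bool dma_fifo_burst_ok(uint32_t threshold_quarters,
                       uint32_t msize_bytes,
                       uint32_t mburst_beats)
{
    assert(threshold_quarters >= 1u && threshold_quarters <= 4u);
    uint32_t threshold_bytes = DMA_FIFO_BYTES * threshold_quarters / 4u;
    uint32_t burst_bytes = msize_bytes * mburst_beats;
    if (burst_bytes == 0u || burst_bytes > DMA_FIFO_BYTES) {
        return false;
    }
    return (threshold_bytes % burst_bytes) == 0u;
}
```

Word-sized data with INC4 bursts and a full threshold passes this check, so the configuration itself looks legal; it also means one burst is exactly the 16 bytes that never arrive.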