I am having an issue with DMA1 getting "stuck" during operation. While DMA1 is stuck, DMA2 continues to run without any issues. I realize that this is likely a race somewhere, but I am hoping for any insight into how or why the DMA controller becomes unusable.
First some project details:
When both SPI2 & USART2 are running at a high rate (lots of display updates over SPI2, lots of data in over USART2), eventually the DMA1 controller becomes unusable. The only way that this is detected is that I am expecting data over the USART2 Rx, and it never arrives.
The SPI2 DMA1s4 continues to run, is re-configurable, can be started / stopped and even the TxComplete callbacks are firing -- but no data is being sent out over SPI2. I can still successfully send data over SPI2, just not using the DMA controller.
The USART2 DMA1s5 is completely unusable: when I try to disable it by clearing `DMA_S5CR_EN`, the bit never clears and the stream seemingly cannot be disabled. Any attempt to de-init / re-init USART2 and its DMA stream gets stuck trying to disable DMA1s5.
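For context, my understanding of the reference manual is that clearing EN only takes effect once the transfer in flight has finished, so the documented sequence is to clear the bit and then read it back until it is 0. Below is a minimal sketch of that wait with an upper bound, so that a stuck stream is reported instead of hanging the caller. The register is passed in as a plain pointer so the logic can be tried off-target; `dma_stream_disable`, `SXCR_EN` and `max_polls` are names made up for this sketch, not HAL symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SXCR_EN (1u << 0) /* EN is bit 0 of DMA_SxCR */

/*
 * Hypothetical helper: clear EN, then read it back up to max_polls times.
 * Returns true once EN reads 0, false if it is still set after the last poll
 * (or if max_polls is 0, since no read-back was allowed).
 */
static bool dma_stream_disable(volatile uint32_t *sxcr, uint32_t max_polls)
{
    assert(sxcr != NULL);

    *sxcr &= ~SXCR_EN; /* request the disable */

    for (uint32_t i = 0; i < max_polls; ++i) {
        if ((*sxcr & SXCR_EN) == 0u) {
            return true; /* hardware has released the stream */
        }
    }
    return false; /* EN never dropped: treat the stream as stuck */
}
```

On the target the call would look something like `dma_stream_disable(&DMA1_Stream5->CR, 100000u)`. With the behaviour described above, that call would be expected to return `false` for DMA1 stream 5.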
SPI2 Usage
The first line of the display is sent out by an RTOS task, using `SPI_Transmit_DMA()`. Subsequent lines are sent out in the `HAL_SPI_TxCpltCallback()` also using `SPI_Transmit_DMA()`.
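To make the chaining concrete, here is a simplified, host-runnable model of it. It is not the project code: `lcd_tx_t`, `lcd_start_frame`, `lcd_on_tx_complete` and `fake_tx` are hypothetical names, and `fake_tx` merely stands in for the DMA transmit call. The `busy` flag is an assumption of the sketch, added to show where a task-versus-callback overlap would have to be excluded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the DMA transmit call; returns false if the transfer could not start. */
typedef bool (*line_tx_fn)(const uint8_t *line, size_t len);

typedef struct {
    const uint8_t *frame;      /* whole frame, lines stored back to back */
    size_t         line_len;   /* bytes per display line */
    size_t         line_count; /* lines per frame */
    size_t         next_line;  /* index of the next line to send */
    bool           busy;       /* true while a frame is in flight (volatile/locked on target) */
    line_tx_fn     tx;
} lcd_tx_t;

/* Host-side fake: only counts how many line transfers were started. */
static size_t fake_tx_calls;
static bool fake_tx(const uint8_t *line, size_t len)
{
    (void)line;
    (void)len;
    ++fake_tx_calls;
    return true;
}

/* Task context: send line 0. Refuses to start while a frame is still in flight. */
static bool lcd_start_frame(lcd_tx_t *s, const uint8_t *frame)
{
    assert(s != NULL && frame != NULL && s->tx != NULL && s->line_count > 0);
    if (s->busy) {
        return false;
    }
    s->frame = frame;
    s->next_line = 1; /* set before starting, the callback may fire right away */
    s->busy = true;
    if (!s->tx(frame, s->line_len)) {
        s->busy = false;
        return false;
    }
    return true;
}

/* Transfer-complete context: send the next line, or mark the frame as done. */
static void lcd_on_tx_complete(lcd_tx_t *s)
{
    assert(s != NULL);
    if (!s->busy) {
        return;
    }
    if (s->next_line >= s->line_count) {
        s->busy = false; /* last line has gone out */
        return;
    }
    const uint8_t *line = s->frame + s->next_line * s->line_len;
    s->next_line++;
    if (!s->tx(line, s->line_len)) {
        s->busy = false;
    }
}
```

In this model a second `lcd_start_frame()` is rejected until the completion callback for the last line has run, so the task and the callback never both start a transfer on the same stream.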
SPI2 Initialization
SPI_HandleTypeDef lcd_spi;
lcd_spi.Instance = SPI2;
lcd_spi.Init.BaudRatePrescaler = SPI_BAUDRATEPRESCALER_8;
lcd_spi.Init.Direction = SPI_DIRECTION_2LINES;
lcd_spi.Init.CLKPhase = SPI_PHASE_1EDGE;
lcd_spi.Init.CLKPolarity = SPI_POLARITY_LOW;
lcd_spi.Init.DataSize = SPI_DATASIZE_8BIT;
lcd_spi.Init.FirstBit = SPI_FIRSTBIT_MSB;
lcd_spi.Init.TIMode = SPI_TIMODE_DISABLE;
lcd_spi.Init.CRCPolynomial = 7;
lcd_spi.Init.NSS = SPI_NSS_SOFT;
lcd_spi.Init.Mode = SPI_MODE_MASTER;
DMA_HandleTypeDef spi_dma_tx;
spi_dma_tx.Instance = DMA1_Stream4;
spi_dma_tx.Init.Channel = DMA_CHANNEL_0;
spi_dma_tx.Init.FIFOMode = DMA_FIFOMODE_DISABLE;
spi_dma_tx.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
spi_dma_tx.Init.MemBurst = DMA_MBURST_INC4;
spi_dma_tx.Init.PeriphBurst = DMA_PBURST_INC4;
spi_dma_tx.Init.Direction = DMA_MEMORY_TO_PERIPH;
spi_dma_tx.Init.PeriphInc = DMA_PINC_DISABLE;
spi_dma_tx.Init.MemInc = DMA_MINC_ENABLE;
spi_dma_tx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
spi_dma_tx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
spi_dma_tx.Init.Mode = DMA_NORMAL;
spi_dma_tx.Init.Priority = DMA_PRIORITY_MEDIUM;
__HAL_LINKDMA(hspi, hdmatx, spi_dma_tx);
HAL_NVIC_SetPriority(DMA1_Stream4_IRQn, 5, 1);
USART2 Usage
DMA Rx is started in circular mode. Hardware flow control (RTS/CTS) is enabled, and IDLE-line detection is used because we can receive any amount of data.
In the `HAL_UART_RxCpltCallback()` and `HAL_UART_RxHalfCpltCallback()` callbacks, and on the IDLE interrupt, data in the DMA buffer is transferred into a ring buffer that is shared with an RTOS thread. If that ring buffer is full, `HAL_UART_DMAPause()` is called; once the RTOS thread has drained the buffer, `HAL_UART_DMAResume()` is called.
USART2 Initialization
UART_HandleTypeDef huart2;
huart2.Instance = USART2;
huart2.Init.BaudRate = 2534400;
huart2.Init.WordLength = UART_WORDLENGTH_8B;
huart2.Init.StopBits = UART_STOPBITS_1;
huart2.Init.Parity = UART_PARITY_NONE;
huart2.Init.Mode = UART_MODE_TX_RX;
huart2.Init.HwFlowCtl = UART_HWCONTROL_RTS_CTS;
huart2.Init.OverSampling = UART_OVERSAMPLING_16;
huart2.Init.OneBitSampling = UART_ONE_BIT_SAMPLE_DISABLE;
huart2.AdvancedInit.AdvFeatureInit = UART_ADVFEATURE_NO_INIT;
HAL_NVIC_SetPriority(USART2_IRQn, 5, 0);
hdma_usart2_rx.Instance = DMA1_Stream5;
hdma_usart2_rx.Init.Channel = DMA_CHANNEL_4;
hdma_usart2_rx.Init.Direction = DMA_PERIPH_TO_MEMORY;
hdma_usart2_rx.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_usart2_rx.Init.MemInc = DMA_MINC_ENABLE;
hdma_usart2_rx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
hdma_usart2_rx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
hdma_usart2_rx.Init.Mode = DMA_CIRCULAR;
hdma_usart2_rx.Init.Priority = DMA_PRIORITY_MEDIUM;
hdma_usart2_rx.Init.FIFOMode = DMA_FIFOMODE_DISABLE;
hdma_usart2_rx.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
hdma_usart2_rx.Init.MemBurst = DMA_MBURST_SINGLE;
hdma_usart2_rx.Init.PeriphBurst = DMA_PBURST_SINGLE;
__HAL_LINKDMA(&huart2, hdmarx, hdma_usart2_rx);
HAL_NVIC_SetPriority(DMA1_Stream5_IRQn, 5, 0);
/* Start receiving DMA */
SET_BIT(huart2.Instance->CR3, USART_CR3_EIE);
SET_BIT(huart2.Instance->CR1, USART_CR1_PEIE | USART_CR1_IDLEIE);
HAL_UART_Receive_DMA(&huart2, dmabuf, sizeof(dmabuf));
Just in case it is of any use, here is a dump of the relevant registers after DMA1 stops responding. It is the EN flag in DMA1->S5CR that is stuck set, and DMA1->S4 that continues to fire interrupts while no data is being sent.
RCC registers
CR: 0x3F037E83 clock control register
PLLCFGR: 0x2840384A PLL configuration register
CFGR: 0x0000940A clock configuration register
CIR: 0x00000000 clock interrupt register
AHB1RSTR: 0x00000000 AHB1 peripheral reset register
AHB2RSTR: 0x00000000 AHB2 peripheral reset register
AHB3RSTR: 0x00000000 AHB3 peripheral reset register
APB1RSTR: 0x00000000 APB1 peripheral reset register
APB2RSTR: 0x00000000 APB2 peripheral reset register
AHB1ENR: 0x007401FF AHB1 peripheral clock register
AHB2ENR: 0x00000080 AHB2 peripheral clock enable register
AHB3ENR: 0x00000001 AHB3 peripheral clock enable register
APB1ENR: 0x10624203 APB1 peripheral clock enable register
APB2ENR: 0x00C04D00 APB2 peripheral clock enable register
AHB1LPENR: 0x7EF7B7FF AHB1 peripheral clock enable in low power mode register
AHB2LPENR: 0x000000F1 AHB2 peripheral clock enable in low power mode register
AHB3LPENR: 0x00000003 AHB3 peripheral clock enable in low power mode register
APB1LPENR: 0xFFFFCBFF APB1 peripheral clock enable in low power mode register
APB2LPENR: 0x04F77F33 APB2 peripheral clock enabled in low power mode register
BDCR: 0x00008103 Backup domain control register
CSR: 0x00000002 clock control & status register
SSCGR: 0x00000000 spread spectrum clock generation register
PLLI2SCFGR: 0x2A002340 PLLI2S configuration register
PLLSAICFGR: 0x24011E00 PLL configuration register
DKCFGR1: 0x00500001 dedicated clocks configuration register
DKCFGR2: 0x0B000000 dedicated clocks configuration register
RCC AHB1ENR bit fields
OTGHSEN: 0 USB OTG HS clock enable
ETHMACPTPEN: 0 Ethernet PTP clock enable
ETHMACRXEN: 0 Ethernet Reception clock enable
ETHMACTXEN: 0 Ethernet Transmission clock enable
ETHMACEN: 0 Ethernet MAC clock enable
DMA2DEN: 0 DMA2D clock enable
DMA2EN: 1 DMA2 clock enable
DMA1EN: 1 DMA1 clock enable
CCMDATARAMEN: 1 CCM data RAM clock enable
BKPSRAMEN: 1 Backup SRAM interface clock enable
CRCEN: 0 CRC clock enable
GPIOKEN: 0 IO port K clock enable
GPIOJEN: 0 IO port J clock enable
GPIOIEN: 1 IO port I clock enable
GPIOHEN: 1 IO port H clock enable
GPIOGEN: 1 IO port G clock enable
GPIOFEN: 1 IO port F clock enable
GPIOEEN: 1 IO port E clock enable
GPIODEN: 1 IO port D clock enable
GPIOCEN: 1 IO port C clock enable
GPIOBEN: 1 IO port B clock enable
GPIOAEN: 1 IO port A clock enable
SCB registers
CPUID: 0x410FC271 CPUID base register
ICSR: 0x00C36000 Interrupt control and state register
VTOR: 0x08010000 Vector table offset register
AIRCR: 0xFA050000 Application interrupt and reset control register
SCR: 0x00000000 System control register
CCR: 0x00070200 Configuration and control register
SHPR1: 0x00000000 System handler priority registers
SHPR2: 0x40000000 System handler priority registers
SHPR3: 0xF0F00000 System handler priority registers
SHCSR: 0x00010000 System handler control and state register
CFSR_UFSR_BFSR_MMFSR: 0x00000000 Configurable fault status register
HFSR: 0x00000000 Hard fault status register
MMFAR: 0x00000000 Memory management fault address register
BFAR: 0x00000000 Bus fault address register
DMA1 registers
LISR: 0x00000000 low interrupt status register
HISR: 0x00000000 high interrupt status register
LIFCR: 0x00000000 low interrupt flag clear register
HIFCR: 0x00000000 high interrupt flag clear register
S4CR: 0x00010446 stream x configuration register
S4NDTR: 0x00000000 stream x number of data register
S4PAR: 0x4000380C stream x peripheral address register
S4M0AR: 0x20011A40 stream x memory 0 address register
S4M1AR: 0x00000000 stream x memory 1 address register
S4FCR: 0x000000A0 stream x FIFO control register
S5CR: 0x0801051F stream x configuration register
S5NDTR: 0x000000FE stream x number of data register
S5PAR: 0x40004424 stream x peripheral address register
S5M0AR: 0x2004E000 stream x memory 0 address register
S5M1AR: 0x00000000 stream x memory 1 address register
S5FCR: 0x000000A0 stream x FIFO control register
DMA1 S4CR bit fields
CHSEL: 0 Channel selection
MBURST: 0 Memory burst transfer configuration
PBURST: 0 Peripheral burst transfer configuration
ACK: 0
CT: 0 Current target (only in double buffer mode)
DBM: 0 Double buffer mode
PL: 1 Priority level
PINCOS: 0 Peripheral increment offset size
MSIZE: 0 Memory data size
PSIZE: 0 Peripheral data size
MINC: 1 Memory increment mode
PINC: 0 Peripheral increment mode
CIRC: 0 Circular mode
DIR: 1 Data transfer direction
PFCTRL: 0 Peripheral flow controller
TCIE: 0 Transfer complete interrupt enable
HTIE: 0 Half transfer interrupt enable
TEIE: 1 Transfer error interrupt enable
DMEIE: 1 Direct mode error interrupt enable
EN: 0 Stream enable / flag stream ready when read low
DMA1 S5CR bit fields
CHSEL: 4 Channel selection
MBURST: 0 Memory burst transfer configuration
PBURST: 0 Peripheral burst transfer configuration
ACK: 0
CT: 0 Current target (only in double buffer mode)
DBM: 0 Double buffer mode
PL: 1 Priority level
PINCOS: 0 Peripheral increment offset size
MSIZE: 0 Memory data size
PSIZE: 0 Peripheral data size
MINC: 1 Memory increment mode
PINC: 0 Peripheral increment mode
CIRC: 1 Circular mode
DIR: 0 Data transfer direction
PFCTRL: 0 Peripheral flow controller
TCIE: 1 Transfer complete interrupt enable
HTIE: 1 Half transfer interrupt enable
TEIE: 1 Transfer error interrupt enable
DMEIE: 1 Direct mode error interrupt enable
EN: 1 Stream enable / flag stream ready when read low
SPI2 registers
CR1: 0x00000354 control register 1
CR2: 0x00001700 control register 2
SR: 0x00000002 status register
DR: 0x00000000 data register
CRCPR: 0x00000007 CRC polynomial register
RXCRCR: 0x00000000 RX CRC register
TXCRCR: 0x00000000 TX CRC register
I2SCFGR: 0x00000000 I2S configuration register
I2SPR: 0x00000002 I2S prescaler register
USART2 registers
CR1: 0x0000001D Control register 1
CR2: 0x00000000 Control register 2
CR3: 0x00000340 Control register 3
BRR: 0x00000012 Baud rate register
GTPR: 0x00000000 Guard time and prescaler register
RTOR: 0x00000000 Receiver timeout register
RQR: (not readable) Request register
ISR: 0x00621230 Interrupt & status register
ICR: (not readable) Interrupt flag clear register
RDR: 0x000000CA Receive data register
TDR: 0x00000052 Transmit data register
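The S4CR / S5CR bit fields listed in the dump can be recomputed from the raw register values, which is a handy cross-check when reading a dump like this one. The sketch below assumes the usual DMA_SxCR layout (EN at bit 0 up to CHSEL at bits 27:25; some F7 parts widen CHSEL by one bit, which does not change the result for these two values). `sxcr_decode`, `sxcr_fields_t` and `dma_write_pos` are names invented for the sketch. It also converts S5NDTR into a write position within the 256-byte `dmabuf`.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a DMA_SxCR value (field names follow the dump). */
typedef struct {
    unsigned en, dmeie, teie, htie, tcie, pfctrl, dir, circ, pinc, minc;
    unsigned psize, msize, pincos, pl, dbm, ct, pburst, mburst, chsel;
} sxcr_fields_t;

/* Extract `width` bits starting at bit `shift`. */
static unsigned bits(uint32_t v, unsigned shift, unsigned width)
{
    return (unsigned)((v >> shift) & ((1u << width) - 1u));
}

static sxcr_fields_t sxcr_decode(uint32_t cr)
{
    sxcr_fields_t f;
    f.en     = bits(cr, 0, 1);
    f.dmeie  = bits(cr, 1, 1);
    f.teie   = bits(cr, 2, 1);
    f.htie   = bits(cr, 3, 1);
    f.tcie   = bits(cr, 4, 1);
    f.pfctrl = bits(cr, 5, 1);
    f.dir    = bits(cr, 6, 2);
    f.circ   = bits(cr, 8, 1);
    f.pinc   = bits(cr, 9, 1);
    f.minc   = bits(cr, 10, 1);
    f.psize  = bits(cr, 11, 2);
    f.msize  = bits(cr, 13, 2);
    f.pincos = bits(cr, 15, 1);
    f.pl     = bits(cr, 16, 2);
    f.dbm    = bits(cr, 18, 1);
    f.ct     = bits(cr, 19, 1);
    f.pburst = bits(cr, 21, 2);
    f.mburst = bits(cr, 23, 2);
    f.chsel  = bits(cr, 25, 3); /* assumption: 3-bit CHSEL */
    return f;
}

/* Offset of the next byte the stream will write, given the remaining count in NDTR. */
static uint32_t dma_write_pos(uint32_t buf_len, uint32_t ndtr)
{
    assert(buf_len > 0u && ndtr <= buf_len);
    return buf_len - ndtr;
}
```

For the values in the dump, `sxcr_decode(0x0801051F)` gives EN = 1, CIRC = 1 and CHSEL = 4 for stream 5, and `dma_write_pos(256, 0xFE)` gives 2, i.e. the stream is sitting two bytes into `dmabuf`.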
The only way I was able to get DMA stuck was when I (deliberately, as an experiment) mapped two streams to the same trigger...
> HAL_DMAPause()
What's that? Sounds suspicious. I can't find it in CubeF7 https://github.com/STMicroelectronics/STM32CubeF7/search?q=HAL_DMAPause
Hi JW,
Thank you for your response! I mis-typed that one, it is actually "HAL_UART_DMAPause()" https://github.com/STMicroelectronics/STM32CubeF7/blob/79acbf8ec060d3ec751f2eaba6ee050269995357/Drivers/STM32F7xx_HAL_Driver/Src/stm32f7xx_hal_uart.c#L1440 . I have updated the initial question to clarify.
Avoid this function and report back if the problem persists.
Hi JW,
I've replaced "HAL_UART_DMAPause()" and "HAL_UART_DMAResume()" with "CLEAR_BIT(huart2.Instance->CR3, USART_CR3_DMAR)" and "SET_BIT(huart2.Instance->CR3, USART_CR3_DMAR)" respectively. This has not helped the situation.
Unfortunately, I don't think I can remove DMA Pause / Resume entirely, as I rely on them for flow control while the system is under load. I know for a fact that I am successfully pausing / resuming thousands of times before DMA gets stuck. It's my understanding that I should be able to consistently perform UART flow control in this manner.
Here is a snippet of my USART Rx routines, I've put in fairly heavy locking just to be sure.
extern UART_HandleTypeDef huart2;
extern DMA_HandleTypeDef hdma_usart2_rx;
static struct RingBuffer rx_buffer; /* A simple ring buffer, details irrelevant */
/* Circular buffer for use by the dma controller */
static uint8_t dmabuf[256];
static size_t old_pos = 0;
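static uint32_t LOCK(void) {
    uint32_t priMsk = __get_PRIMASK();
    __disable_irq(); /* assumed: this line is missing from the post */
    return priMsk;
}

static void UNLOCK(uint32_t priMask) {
    __set_PRIMASK(priMask); /* assumed: the body is missing from the post */
}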
static bool rxIsPaused(void) {
    return (HAL_IS_BIT_SET(huart2.Instance->CR3, USART_CR3_DMAR) == 0);
}
static size_t getRxBytesAvailable(void) {
    const size_t pos = sizeof(dmabuf) - hdma_usart2_rx.Instance->NDTR;
    size_t num_bytes;
    if (pos == old_pos) {
        num_bytes = 0;
    } else if (pos > old_pos) {
        num_bytes = pos - old_pos;
    } else {
        num_bytes = sizeof(dmabuf) - old_pos + pos;
    }
    return num_bytes;
}
static void consumeRxDataBuffer(size_t num_bytes) {
    const size_t remaining = sizeof(dmabuf) - old_pos;
    const size_t write_len = MIN(num_bytes, remaining);
    ring_buffer_write(&rx_buffer, &dmabuf[old_pos], write_len);
    const size_t bytes_left = num_bytes - write_len;
    if (bytes_left) {
        /* Wrapping to beginning of dmabuf */
        ring_buffer_write(&rx_buffer, dmabuf, bytes_left);
    }
    old_pos = (old_pos + num_bytes) % sizeof(dmabuf);
}
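static void processIncoming(void)
{
    const uint32_t L = LOCK();
    if (rxIsPaused()) {
        /* buffer will be processed prior to resumption */
        goto unlock;
    }
    /* Calculate current position in buffer */
    size_t read_amount = getRxBytesAvailable();
    if (read_amount == 0) {
        goto unlock;
    }
    if (read_amount > ring_buffer_get_space_available(&rx_buffer)) {
        /* Pause UART DMA until we consume & have enough space again */
        CLEAR_BIT(huart2.Instance->CR3, USART_CR3_DMAR); /* assumed from the DMAR replacement described earlier; line missing from the post */
        goto unlock;
    }
    consumeRxDataBuffer(read_amount); /* assumed: line missing from the post */
unlock:
    UNLOCK(L); /* assumed: label and unlock missing from the post */
}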
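/* Remember: This is used from both DMA1 and USART2 ISRs */
void HAL_UART_RxHalfCpltCallback(UART_HandleTypeDef* uart)
{
    processIncoming(); /* assumed: the body is missing from the post */
}
/* Remember: This is used from both DMA1 and USART2 ISRs */
void HAL_UART_RxCpltCallback(UART_HandleTypeDef* uart)
{
    processIncoming(); /* assumed: the body is missing from the post */
}
void USART2_IRQHandler(void)
{
    /* Body missing from the post; per the description it services the IDLE
       interrupt and feeds the same buffer-processing path. */
}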
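/* RTOS consumer calls this after reading from the ring buffer */
void resumeRx(void) {
    const uint32_t L = LOCK();
    if (rxIsPaused() == false) {
        goto unlock;
    }
    size_t read_amount = getRxBytesAvailable();
    if (read_amount > ring_buffer_get_space_available(&rx_buffer)) {
        /* No space, don't resume yet */
        goto unlock;
    }
    consumeRxDataBuffer(read_amount); /* assumed: line missing from the post */
    SET_BIT(huart2.Instance->CR3, USART_CR3_DMAR); /* assumed from the DMAR replacement described earlier; line missing from the post */
unlock:
    UNLOCK(L); /* assumed: label and unlock missing from the post */
}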
To test it out, I greatly increased the size of my task-owned ring buffer (rx_buffer) such that I never need to pause / resume. Transfers in that case seem to be OK and I haven't been able to reproduce my issue.
Unfortunately, I can't afford that memory and it is my understanding that it should be possible to perform flow control in this manner.
Hi Marc,
> my understanding that it should be possible to perform flow control in this manner.
Yes, that was my understanding, too. However, the modules in STM32 are very complex, they are written by people, and people do make errors. You are exploring a seldom-used feature which is probably buggy and is beyond ST's understanding or care.
My experience (post mangled courtesy of ST's mania to pay for inferior forum software) with a similar scheme had a somewhat different outcome: missing transfers rather than a stuck DMA. That can be explained by a different MCU/family (with a potentially different design of the DMA; ST does not care to provide users with this detail) and/or minute details in setup and event sequencing.
I'm not saying this *is* the source of your problem, but it sounds very likely, given also the results of your experiment.
As there's no point expecting that ST will ever acknowledge this as a problem, let alone produce a workaround (of which there very likely is none for the existing chips), the easiest way to avoid the problem is to avoid the request (trigger) disable/enable altogether. I know that means rethinking the logic of your application, and I know the amount of work it can present; I went down that path too.
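One possible shape for such a rework, purely as an illustration and not something tested on this hardware: leave DMAR set all the time and throttle the sender yourself, for example by driving the RTS pin as a plain GPIO from high/low watermarks on the ring buffer fill level. The sketch below shows only the hysteresis decision; `flow_ctl_t`, `flow_ctl_update` and the watermark values are hypothetical, and toggling the actual pin is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t high_mark;   /* hold the sender off at or above this fill level */
    size_t low_mark;    /* release the sender at or below this fill level */
    bool   sender_held; /* true while the sender is being held off */
} flow_ctl_t;

/*
 * Hypothetical helper: call whenever the ring buffer fill level changes.
 * Returns true while the sender should be held off (RTS deasserted).
 * Between the two marks the previous decision is kept, so the pin does
 * not chatter around a single threshold.
 */
static bool flow_ctl_update(flow_ctl_t *fc, size_t bytes_used)
{
    assert(fc != NULL && fc->low_mark < fc->high_mark);

    if (!fc->sender_held && bytes_used >= fc->high_mark) {
        fc->sender_held = true;
    } else if (fc->sender_held && bytes_used <= fc->low_mark) {
        fc->sender_held = false;
    }
    return fc->sender_held;
}
```

The sender may still push out a few bytes after RTS changes, so the high mark has to leave enough headroom in the buffers for whatever is already in flight.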