
Help with DMA Controller becoming unresponsive

MJ.1
Associate II

Hello,

I am having an issue with DMA1 getting "stuck" during operation. While DMA1 is stuck, DMA2 continues to run without any issues. I realize that this is likely a race somewhere; however, I am hoping for any insight into how or why the DMA controller becomes unusable.

First some project details:

  • Running on an STM32F746IG, with a Cortex-M7 r0p1 core
  • We are using Cube HAL
  • DMA1 Setup:
    • Stream 4 is used in Direct Mode for SPI2 tx
    • Stream 5 is used in Circular Mode for USART2 RX, flow control enabled, HalfRXCplt + RxCplt + IDLE detection, and using HAL_UART_DMAPause() when host buffers are full
  • SPI2 is used for output to a display, we perform non-DMA transfers over this as well as larger DMA transfers
  • USART2 is connected to another microcontroller, and we only use DMA for RX

When both SPI2 & USART2 are running at a high rate (lots of display updates over SPI2, lots of data in over USART2), eventually the DMA1 controller becomes unusable. The only way that this is detected is that I am expecting data over the USART2 Rx, and it never arrives.

The SPI2 DMA1s4 continues to run, is re-configurable, can be started / stopped and even the TxComplete callbacks are firing -- but no data is being sent out over SPI2. I can still successfully send data over SPI2, just not using the DMA controller.

The USART2 DMA1s5 is completely unusable: when I try to disable it by clearing `DMA_S5CR_EN`, the bit never clears and the stream seemingly cannot be disabled. Any attempt to de-init / re-init USART2 and its DMA stream gets stuck trying to disable DMA1s5.
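
To make the "never clears" symptom easier to detect from code, here is a minimal sketch of a bounded disable. It relies on the documented behaviour that a stream is only really off once EN reads back as 0 after being cleared. The helper name `dma_stream_disable_bounded` and the poll budget are hypothetical, not part of the HAL; on target, `cr` would point at the stream's CR register (for example `&DMA1_Stream5->CR`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 of DMA_SxCR is EN on STM32F7. */
#define SXCR_EN (1u << 0)

/*
 * Hypothetical helper (not a HAL function): request a stream disable and
 * poll EN with an upper bound instead of spinning forever.
 * Returns true once EN reads back as 0, false if the poll budget runs out.
 */
static bool dma_stream_disable_bounded(volatile uint32_t *cr, uint32_t max_polls)
{
    assert(cr != NULL);

    *cr &= ~SXCR_EN;                  /* request the disable */

    while (max_polls-- > 0u) {
        if ((*cr & SXCR_EN) == 0u) {
            return true;              /* hardware confirmed the stream is off */
        }
    }
    return false;                     /* EN still set: the stream is stuck */
}
```

A `false` return is the condition described above; logging it together with the stream registers at that moment may help narrow down when the controller wedges.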

SPI2 Usage

The first line of the display is sent out by an RTOS task, using `HAL_SPI_Transmit_DMA()`. Subsequent lines are sent out in the `HAL_SPI_TxCpltCallback()`, also using `HAL_SPI_Transmit_DMA()`.

SPI2 Initialization

SPI_HandleTypeDef lcd_spi;
	lcd_spi.Instance               = SPI2;
	lcd_spi.Init.BaudRatePrescaler = SPI_BAUDRATEPRESCALER_8;
	lcd_spi.Init.Direction         = SPI_DIRECTION_2LINES;
	lcd_spi.Init.CLKPhase          = SPI_PHASE_1EDGE;
	lcd_spi.Init.CLKPolarity       = SPI_POLARITY_LOW;
	lcd_spi.Init.DataSize          = SPI_DATASIZE_8BIT;
	lcd_spi.Init.FirstBit          = SPI_FIRSTBIT_MSB;
	lcd_spi.Init.TIMode            = SPI_TIMODE_DISABLE;
	lcd_spi.Init.CRCCalculation    = SPI_CRCCALCULATION_DISABLE;
	lcd_spi.Init.CRCPolynomial     = 7;
	lcd_spi.Init.NSS               = SPI_NSS_SOFT;
	lcd_spi.Init.Mode = SPI_MODE_MASTER;
 
	HAL_SPI_Init(&lcd_spi);
	
	DMA_HandleTypeDef spi_dma_tx;
	spi_dma_tx.Instance                 = DMA1_Stream4;
	spi_dma_tx.Init.Channel             = DMA_CHANNEL_0;
	spi_dma_tx.Init.FIFOMode            = DMA_FIFOMODE_DISABLE;
	spi_dma_tx.Init.FIFOThreshold       = DMA_FIFO_THRESHOLD_FULL;
	spi_dma_tx.Init.MemBurst            = DMA_MBURST_INC4;
	spi_dma_tx.Init.PeriphBurst         = DMA_PBURST_INC4;
	spi_dma_tx.Init.Direction           = DMA_MEMORY_TO_PERIPH;
	spi_dma_tx.Init.PeriphInc           = DMA_PINC_DISABLE;
	spi_dma_tx.Init.MemInc              = DMA_MINC_ENABLE;
	spi_dma_tx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
	spi_dma_tx.Init.MemDataAlignment    = DMA_MDATAALIGN_BYTE;
	spi_dma_tx.Init.Mode                = DMA_NORMAL;
	spi_dma_tx.Init.Priority            = DMA_PRIORITY_MEDIUM;
	
	HAL_DMA_Init(&spi_dma_tx);
	__HAL_LINKDMA(hspi, hdmatx, spi_dma_tx);
	
	HAL_NVIC_SetPriority(DMA1_Stream4_IRQn, 5, 1);
	HAL_NVIC_EnableIRQ(DMA1_Stream4_IRQn);

USART2 Usage

DMA Rx is started in circular mode. It is being run with Flow Control enabled, and IDLE detection since we can receive any amount of data.

In the `HAL_UART_RxCpltCallback()` and `HAL_UART_RxHalfCpltCallback()` callbacks, and on the IDLE interrupt, data in the DMA buffer is transferred into a ring buffer that is shared with an RTOS thread. If that ring buffer is full, `HAL_UART_DMAPause()` is called; once the RTOS thread has drained the buffer, `HAL_UART_DMAResume()` is called.

USART2 Initialization

UART_HandleTypeDef huart2;
huart2.Instance = USART2;
huart2.Init.BaudRate = 2534400;
huart2.Init.WordLength = UART_WORDLENGTH_8B;
huart2.Init.StopBits = UART_STOPBITS_1;
huart2.Init.Parity = UART_PARITY_NONE;
huart2.Init.Mode = UART_MODE_TX_RX;
huart2.Init.HwFlowCtl = UART_HWCONTROL_RTS_CTS;
huart2.Init.OverSampling = UART_OVERSAMPLING_16;
huart2.Init.OneBitSampling = UART_ONE_BIT_SAMPLE_DISABLE;
huart2.AdvancedInit.AdvFeatureInit = UART_ADVFEATURE_NO_INIT;
HAL_UART_Init(&huart2);
 
HAL_NVIC_SetPriority(USART2_IRQn, 5, 0);
HAL_NVIC_EnableIRQ(USART2_IRQn);
 
hdma_usart2_rx.Instance = DMA1_Stream5;
hdma_usart2_rx.Init.Channel = DMA_CHANNEL_4;
hdma_usart2_rx.Init.Direction = DMA_PERIPH_TO_MEMORY;
hdma_usart2_rx.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_usart2_rx.Init.MemInc = DMA_MINC_ENABLE;
hdma_usart2_rx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
hdma_usart2_rx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
hdma_usart2_rx.Init.Mode = DMA_CIRCULAR;
hdma_usart2_rx.Init.Priority = DMA_PRIORITY_MEDIUM;
hdma_usart2_rx.Init.FIFOMode = DMA_FIFOMODE_DISABLE;
hdma_usart2_rx.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
hdma_usart2_rx.Init.MemBurst = DMA_MBURST_SINGLE;
hdma_usart2_rx.Init.PeriphBurst = DMA_PBURST_SINGLE;
HAL_DMA_Init(&hdma_usart2_rx);
 
__HAL_LINKDMA(&huart2, hdmarx, hdma_usart2_rx);
 
HAL_NVIC_SetPriority(DMA1_Stream5_IRQn, 5, 0);
HAL_NVIC_EnableIRQ(DMA1_Stream5_IRQn);
 
 
/* Start receiving DMA */
SET_BIT(huart2.Instance->CR3, USART_CR3_EIE);
__HAL_UART_CLEAR_IDLEFLAG(&huart2);
SET_BIT(huart2.Instance->CR1, USART_CR1_PEIE | USART_CR1_IDLEIE);
HAL_UART_Receive_DMA(&huart2, dmabuf, sizeof(dmabuf));

Thank you

6 REPLIES 6
MJ.1
Associate II

Just in case it is of any use, here is a dump of relevant registers after DMA1 stops responding. It is DMA1->S5CR's EN flag which is stuck set, and DMA1->S4 that continues to fire interrupts, but no data is being sent.

RCC:
        CR:          0x3F037E83  clock control register
        PLLCFGR:     0x2840384A  PLL configuration register
        CFGR:        0x0000940A  clock configuration register
        CIR:         0x00000000  clock interrupt register
        AHB1RSTR:    0x00000000  AHB1 peripheral reset register
        AHB2RSTR:    0x00000000  AHB2 peripheral reset register
        AHB3RSTR:    0x00000000  AHB3 peripheral reset register
        APB1RSTR:    0x00000000  APB1 peripheral reset register
        APB2RSTR:    0x00000000  APB2 peripheral reset register
        AHB1ENR:     0x007401FF  AHB1 peripheral clock register
        AHB2ENR:     0x00000080  AHB2 peripheral clock enable register
        AHB3ENR:     0x00000001  AHB3 peripheral clock enable register
        APB1ENR:     0x10624203  APB1 peripheral clock enable register
        APB2ENR:     0x00C04D00  APB2 peripheral clock enable register
        AHB1LPENR:   0x7EF7B7FF  AHB1 peripheral clock enable in low power mode register
        AHB2LPENR:   0x000000F1  AHB2 peripheral clock enable in low power mode register
        AHB3LPENR:   0x00000003  AHB3 peripheral clock enable in low power mode register
        APB1LPENR:   0xFFFFCBFF  APB1 peripheral clock enable in low power mode register
        APB2LPENR:   0x04F77F33  APB2 peripheral clock enabled in low power mode register
        BDCR:        0x00008103  Backup domain control register
        CSR:         0x00000002  clock control & status register
        SSCGR:       0x00000000  spread spectrum clock generation register
        PLLI2SCFGR:  0x2A002340  PLLI2S configuration register
        PLLSAICFGR:  0x24011E00  PLL configuration register
        DKCFGR1:     0x00500001  dedicated clocks configuration register
        DKCFGR2:     0x0B000000  dedicated clocks configuration register
RCC->AHB1ENR:
        OTGHSULPIEN:   0  USB OTG HSULPI clock enable
        OTGHSEN:       0  USB OTG HS clock enable
        ETHMACPTPEN:   0  Ethernet PTP clock enable
        ETHMACRXEN:    0  Ethernet Reception clock enable
        ETHMACTXEN:    0  Ethernet Transmission clock enable
        ETHMACEN:      0  Ethernet MAC clock enable
        DMA2DEN:       0  DMA2D clock enable
        DMA2EN:        1  DMA2 clock enable
        DMA1EN:        1  DMA1 clock enable
        CCMDATARAMEN:  1  CCM data RAM clock enable
        BKPSRAMEN:     1  Backup SRAM interface clock enable
        CRCEN:         0  CRC clock enable
        GPIOKEN:       0  IO port K clock enable
        GPIOJEN:       0  IO port J clock enable
        GPIOIEN:       1  IO port I clock enable
        GPIOHEN:       1  IO port H clock enable
        GPIOGEN:       1  IO port G clock enable
        GPIOFEN:       1  IO port F clock enable
        GPIOEEN:       1  IO port E clock enable
        GPIODEN:       1  IO port D clock enable
        GPIOCEN:       1  IO port C clock enable
        GPIOBEN:       1  IO port B clock enable
        GPIOAEN:       1  IO port A clock enable
SCB:
        CPUID:                 0x410FC271  CPUID base register
        ICSR:                  0x00C36000  Interrupt control and state register
        VTOR:                  0x08010000  Vector table offset register
        AIRCR:                 0xFA050000  Application interrupt and reset control register
        SCR:                   0x00000000  System control register
        CCR:                   0x00070200  Configuration and control register
        SHPR1:                 0x00000000  System handler priority registers
        SHPR2:                 0x40000000  System handler priority registers
        SHPR3:                 0xF0F00000  System handler priority registers
        SHCRS:                 0x00010000  System handler control and state register
        CFSR_UFSR_BFSR_MMFSR:  0x00000000  Configurable fault status register
        HFSR:                  0x00000000  Hard fault status register
        MMFAR:                 0x00000000  Memory management fault address register
        BFAR:                  0x00000000  Bus fault address register
DMA1:
        LISR:    0x00000000  low interrupt status register
        HISR:    0x00000000  high interrupt status register
        LIFCR:   0x00000000  low interrupt flag clear register
        HIFCR:   0x00000000  high interrupt flag clear register
        S4CR:    0x00010446  stream x configuration register
        S4NDTR:  0x00000000  stream x number of data register
        S4PAR:   0x4000380C  stream x peripheral address register
        S4M0AR:  0x20011A40  stream x memory 0 address register
        S4M1AR:  0x00000000  stream x memory 1 address register
        S4FCR:   0x000000A0  stream x FIFO control register
        S5CR:    0x0801051F  stream x configuration register
        S5NDTR:  0x000000FE  stream x number of data register
        S5PAR:   0x40004424  stream x peripheral address register
        S5M0AR:  0x2004E000  stream x memory 0 address register
        S5M1AR:  0x00000000  stream x memory 1 address register
        S5FCR:   0x000000A0  stream x FIFO control register
DMA1->S4CR:
        CHSEL:   0  Channel selection
        MBURST:  0  Memory burst transfer configuration
        PBURST:  0  Peripheral burst transfer configuration
        ACK:     0
        CT:      0  Current target (only in double buffer mode)
        DBM:     0  Double buffer mode
        PL:      1  Priority level
        PINCOS:  0  Peripheral increment offset size
        MSIZE:   0  Memory data size
        PSIZE:   0  Peripheral data size
        MINC:    1  Memory increment mode
        PINC:    0  Peripheral increment mode
        CIRC:    0  Circular mode
        DIR:     1  Data transfer direction
        PFCTRL:  0  Peripheral flow controller
        TCIE:    0  Transfer complete interrupt enable
        HTIE:    0  Half transfer interrupt enable
        TEIE:    1  Transfer error interrupt enable
        DMEIE:   1  Direct mode error interrupt enable
        EN:      0  Stream enable / flag stream ready when read low
DMA1->S5CR:
        CHSEL:   4  Channel selection
        MBURST:  0  Memory burst transfer configuration
        PBURST:  0  Peripheral burst transfer configuration
        ACK:     0
        CT:      0  Current target (only in double buffer mode)
        DBM:     0  Double buffer mode
        PL:      1  Priority level
        PINCOS:  0  Peripheral increment offset size
        MSIZE:   0  Memory data size
        PSIZE:   0  Peripheral data size
        MINC:    1  Memory increment mode
        PINC:    0  Peripheral increment mode
        CIRC:    1  Circular mode
        DIR:     0  Data transfer direction
        PFCTRL:  0  Peripheral flow controller
        TCIE:    1  Transfer complete interrupt enable
        HTIE:    1  Half transfer interrupt enable
        TEIE:    1  Transfer error interrupt enable
        DMEIE:   1  Direct mode error interrupt enable
        EN:      1  Stream enable / flag stream ready when read low
SPI2:
        CR1:      0x00000354  control register 1
        CR2:      0x00001700  control register 2
        SR:       0x00000002  status register
        DR:       0x00000000  data register
        CRCPR:    0x00000007  CRC polynomial register
        RXCRCR:   0x00000000  RX CRC register
        TXCRCR:   0x00000000  TX CRC register
        I2SCFGR:  0x00000000  I2S configuration register
        I2SPR:    0x00000002  I2S prescaler register
USART2:
        CR1:       0x0000001D  Control register 1
        CR2:       0x00000000  Control register 2
        CR3:       0x00000340  Control register 3
        BRR:       0x00000012  Baud rate register
        GTPR:      0x00000000  Guard time and prescaler register
        RTOR:      0x00000000  Receiver timeout register
        RQR:   (not readable)  Request register
        ISR:       0x00621230  Interrupt & status register
        ICR:   (not readable)  Interrupt flag clear register
        RDR:       0x000000CA  Receive data register
        TDR:       0x00000052  Transmit data register
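
As a cross-check of the dump above, the raw S4CR and S5CR words can be decoded by hand. The sketch below uses the STM32F7 DMA_SxCR bit positions; the struct and function names are made up for illustration, and only a subset of the fields is extracted.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of DMA_SxCR fields, using the STM32F7 bit positions. */
typedef struct {
    unsigned chsel;  /* bits 27:25, channel selection */
    unsigned pl;     /* bits 17:16, priority level */
    unsigned minc;   /* bit 10, memory increment */
    unsigned circ;   /* bit 8, circular mode */
    unsigned dir;    /* bits 7:6, transfer direction */
    unsigned tcie;   /* bit 4, transfer complete interrupt enable */
    unsigned htie;   /* bit 3, half transfer interrupt enable */
    unsigned en;     /* bit 0, stream enable */
} sxcr_fields_t;

/* Hypothetical helper: split a raw SxCR value into the fields above. */
static sxcr_fields_t sxcr_decode(uint32_t cr)
{
    sxcr_fields_t f;
    f.chsel = (cr >> 25) & 0x7u;
    f.pl    = (cr >> 16) & 0x3u;
    f.minc  = (cr >> 10) & 0x1u;
    f.circ  = (cr >> 8)  & 0x1u;
    f.dir   = (cr >> 6)  & 0x3u;
    f.tcie  = (cr >> 4)  & 0x1u;
    f.htie  = (cr >> 3)  & 0x1u;
    f.en    = cr & 0x1u;
    return f;
}
```

Decoding `0x0801051F` (S5CR) gives CHSEL 4, circular mode on and EN still set, while `0x00010446` (S4CR) gives CHSEL 0, DIR 1 (memory to peripheral) and EN clear, which agrees with the per-field listing.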

Wow.

The only way I was able to get DMA stuck was when I (deliberately, as an experiment) mapped two streams to the same trigger...

> HAL_DMAPause()

What's that? Sounds suspicious. I can't find it in CubeF7 https://github.com/STMicroelectronics/STM32CubeF7/search?q=HAL_DMAPause

JW

Hi JW,

Thank you for your response! I mis-typed that one, it is actually "HAL_UART_DMAPause()" https://github.com/STMicroelectronics/STM32CubeF7/blob/79acbf8ec060d3ec751f2eaba6ee050269995357/Drivers/STM32F7xx_HAL_Driver/Src/stm32f7xx_hal_uart.c#L1440 . I have updated the initial question to clarify.

Avoid this function and report back if the problem persists.

JW

Hi JW,

I've replaced "HAL_UART_DMAPause()" and "HAL_UART_DMAResume()" with "CLEAR_BIT(huart2.Instance->CR3, USART_CR3_DMAR)" and "SET_BIT(huart2.Instance->CR3, USART_CR3_DMAR)" respectively. This has not helped the situation.

Unfortunately, I don't think I can remove DMA Pause / Resume entirely, as I rely on them for flow control while the system is under load. I know for a fact that I am successfully pausing / resuming thousands of times before DMA gets stuck. It's my understanding that I should be able to consistently perform UART flow control in this manner.

Here is a snippet of my USART Rx routines, I've put in fairly heavy locking just to be sure.

extern UART_HandleTypeDef huart2;
extern DMA_HandleTypeDef hdma_usart2_rx;
static struct RingBuffer rx_buffer; /* A simple ring buffer, details irrelevant */
 
/* Circular buffer for use by the dma controller */
static uint8_t dmabuf[256];
static size_t old_pos = 0;
 
static uint32_t LOCK(void) {
	uint32_t priMsk = __get_PRIMASK();
	__disable_irq();
	return priMsk;
}
 
static void UNLOCK(uint32_t priMask) {
	__set_PRIMASK(priMask);
}
 
 
static bool rxIsPaused(void) {
	return (HAL_IS_BIT_SET(huart2.Instance->CR3, USART_CR3_DMAR) == 0);
}
 
static size_t getRxBytesAvailable(void) {
	const size_t pos = sizeof(dmabuf) - hdma_usart2_rx.Instance->NDTR;
	size_t num_bytes;
	if (pos == old_pos) {
		num_bytes = 0;
	} else if (pos > old_pos) {
		num_bytes = pos - old_pos;
	} else {
		num_bytes = sizeof(dmabuf) - old_pos + pos;
	}
	return num_bytes;
}
 
static void consumeRxDataBuffer(size_t num_bytes) {
	const size_t remaining = sizeof(dmabuf) - old_pos;
	const size_t write_len = MIN(num_bytes, remaining);
	ring_buffer_write(&rx_buffer, &dmabuf[old_pos], write_len);
 
	const size_t bytes_left = num_bytes - write_len;
	if (bytes_left) {
		/* Wrapping to beginning of dmabuf */
		ring_buffer_write(&rx_buffer, dmabuf, bytes_left);
	}
 
	old_pos = (old_pos + num_bytes) % sizeof(dmabuf);
}
 
static void processIncoming(void)
{
	const uint32_t L = LOCK();
 
	if (rxIsPaused()) {
		/* buffer will be processed prior to resumption */
		goto unlock;
	}
 
	/* Calculate current position in buffer */
	size_t read_amount = getRxBytesAvailable();
	if (read_amount == 0) {
		goto unlock;
	}
 
	if (read_amount > ring_buffer_get_space_available(&rx_buffer)) {
		/* Pause UART DMA until we consume & have enough space again */
		HAL_UART_DMAPause(&huart2);
		goto unlock;
	}
 
	consumeRxDataBuffer(read_amount);
 
unlock:
	UNLOCK(L);
}
 
/* Remember: This is used from both DMA1 and USART2 ISRs */
void HAL_UART_RxHalfCpltCallback(UART_HandleTypeDef* uart)
{
	processIncoming();
}
 
/* Remember: This is used from both DMA1 and USART2 ISRs */
void HAL_UART_RxCpltCallback(UART_HandleTypeDef* uart)
{
	processIncoming();
}
 
void USART2_IRQHandler(void)
{
	if ( __HAL_UART_GET_FLAG(&huart2, UART_FLAG_IDLE) )
	{
		__HAL_UART_CLEAR_IDLEFLAG(&huart2);
		processIncoming();
	}
	else
	{
		HAL_UART_IRQHandler(&huart2);
	}
}
 
 
/* RTOS consumer calls this after reading from the ring buffer */
void resumeRx(void) {
	const uint32_t L = LOCK();
 
	if (rxIsPaused() == false) {
		goto unlock;
	}
 
	size_t read_amount = getRxBytesAvailable();
	if (read_amount > ring_buffer_get_space_available(&rx_buffer)) {
		/* No space, don't resume yet */
		goto unlock;
	}
	consumeRxDataBuffer(read_amount);
	HAL_UART_DMAResume(&huart2);
 
unlock:
	UNLOCK(L);
}
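
The position arithmetic in `getRxBytesAvailable()` can be checked on a host without any hardware. The sketch below is a standalone equivalent that takes the buffer size, the NDTR value and the previous read position as plain arguments; the function name is hypothetical, and it assumes NDTR is in the range 1..size, as expected for a running circular stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes written by a circular DMA stream since `old_pos`.
 * `ndtr` is the number of items remaining before the stream wraps.
 * Single-expression equivalent of the three-branch original.
 */
static size_t dma_bytes_available(size_t buf_size, uint32_t ndtr, size_t old_pos)
{
    assert(buf_size > 0u);
    assert(ndtr >= 1u && ndtr <= buf_size);
    assert(old_pos < buf_size);

    const size_t pos = buf_size - ndtr;           /* current write index */
    return (pos + buf_size - old_pos) % buf_size; /* distance with wrap */
}
```

With the values from the register dump (a 256-byte buffer and S5NDTR = 0xFE) the write index is 2, so a reader that last stopped at index 0 would see 2 bytes pending. Note that a result of 0 cannot distinguish "nothing new" from "exactly one full buffer arrived", which the original code shares.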

To test it out, I greatly increased the size of my task-owned ring buffer (rx_buffer) such that I never need to pause / resume. Transfers in that case seem to be OK and I haven't been able to reproduce my issue.

Unfortunately, I can't afford that memory and it is my understanding that it should be possible to perform flow control in this manner.

Thanks again,

Marc

Hi Marc,

> my understanding that it should be possible to perform flow control in this manner.

Yes, that was my understanding, too. However, the modules in STM32 are very complex, they are written by people, and people do make errors. You are exploring a seldom-used feature which is probably buggy and is beyond ST's understanding or care.

My experience (post mangled courtesy of ST's mania for paying for inferior forum software) with a similar scheme had a somewhat different outcome - missing transfers rather than stuck DMA - but that can be explained by a different MCU/family (with a potentially different design of the DMA - ST does not care to provide users with this detail) and/or minute details in setup and event sequencing.

I don't say this *is* the source of your problem, but it sounds very likely, given also results of your experiment.

As there's no point expecting that ST will ever acknowledge this as a problem, let alone produce a workaround (and there very likely is none for the existing chips), the easiest way to avoid the problem is to avoid disabling/enabling the request (trigger) at all. I know that means rethinking the logic of your application, and I know how much work that can be - I went down that path too.

JW
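
One possible shape for "avoiding the request disable/enable", offered purely as an assumption and not something proposed in the thread: leave DMAR set permanently, let the circular DMA run, and hold the sender off by driving the RTS line as an ordinary GPIO based on ring-buffer fill level. The sketch below shows only the hysteresis decision; the type and function names are hypothetical, and the pin handling (and enough headroom for bytes already in flight) is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical watermark state for software-driven RTS. */
typedef struct {
    size_t high_water;  /* hold the sender off at or above this fill level */
    size_t low_water;   /* release the sender at or below this fill level */
    bool   throttled;   /* true while the sender is being held off */
} rx_throttle_t;

/*
 * Update the throttle state from the current ring-buffer fill level.
 * Returns true when the sender should be held off (RTS deasserted).
 */
static bool rx_throttle_update(rx_throttle_t *t, size_t fill)
{
    assert(t != NULL);
    assert(t->low_water < t->high_water);

    if (!t->throttled && fill >= t->high_water) {
        t->throttled = true;    /* crossed the high watermark */
    } else if (t->throttled && fill <= t->low_water) {
        t->throttled = false;   /* drained down to the low watermark */
    }
    return t->throttled;
}
```

The gap between the two watermarks keeps the line from toggling on every byte; whether this fits depends on how quickly the other microcontroller reacts to RTS, which the thread does not say.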