MJ.1
Associate
October 22, 2020
Question

Help with DMA Controller becoming unresponsive

Hello,

I am having an issue with DMA1 getting "stuck" during operation. While DMA1 is stuck, DMA2 continues to run without any issues. I realize that this is likely a race somewhere; however, I am hoping for any insight into how or why the DMA controller becomes unusable.

First some project details:

  • Running on an STM32F746IG, with a Cortex-M7 r0p1 core
  • We are using Cube HAL
  • DMA1 Setup:
    • Stream 4 is used in Direct Mode for SPI2 tx
    • Stream 5 is used in Circular Mode for USART2 RX, flow control enabled, HalfRXCplt + RxCplt + IDLE detection, and using HAL_UART_DMAPause() when host buffers are full
  • SPI2 is used for output to a display, we perform non-DMA transfers over this as well as larger DMA transfers
  • USART2 is connected to another microcontroller, and we only use DMA for RX

When both SPI2 & USART2 are running at a high rate (lots of display updates over SPI2, lots of data in over USART2), eventually the DMA1 controller becomes unusable. The only way that this is detected is that I am expecting data over the USART2 Rx, and it never arrives.

The SPI2 DMA1s4 continues to run, is re-configurable, can be started / stopped and even the TxComplete callbacks are firing -- but no data is being sent out over SPI2. I can still successfully send data over SPI2, just not using the DMA controller.

The USART2 DMA1s5 is completely unusable: when I try to disable it by clearing `DMA_S5CR_EN`, the bit never clears and the stream seemingly cannot be disabled. Any attempt to de-init / re-init USART2 and its DMA stream gets stuck trying to disable DMA1s5.
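
A bounded wait at least makes this failure visible instead of hanging. The sketch below is only an illustration, not code from this project or from the HAL: `dma_stream_disable_bounded()` and `max_polls` are hypothetical names, and the register is passed in as a pointer so the logic can also be exercised on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of DMA_SxCR is EN (same value as DMA_SxCR_EN in the CMSIS header). */
#define SXCR_EN (1u << 0)

/* Hypothetical helper, not a HAL function: request a stream disable and
 * poll EN for at most max_polls reads.
 * Returns true if EN was observed as 0 within that budget. */
static bool dma_stream_disable_bounded(volatile uint32_t *sxcr, uint32_t max_polls)
{
    assert(sxcr != 0);                 /* caller must pass a valid register address */

    *sxcr &= ~SXCR_EN;                 /* request the disable */
    while (max_polls-- > 0u) {
        if ((*sxcr & SXCR_EN) == 0u) {
            return true;               /* hardware reports the stream is off */
        }
    }
    return false;                      /* EN still reads 1: report it, do not hang */
}
```

On the target this would be called with something like `&DMA1_Stream5->CR`; a `false` return corresponds to the stuck state described above.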

SPI2 Usage

The first line of the display is sent out by an RTOS task, using `SPI_Transmit_DMA()`. Subsequent lines are sent out in the `HAL_SPI_TxCpltCallback()` also using `SPI_Transmit_DMA()`.

SPI2 Initialization

SPI_HandleTypeDef lcd_spi;
	lcd_spi.Instance = SPI2;
	lcd_spi.Init.BaudRatePrescaler = SPI_BAUDRATEPRESCALER_8;
	lcd_spi.Init.Direction = SPI_DIRECTION_2LINES;
	lcd_spi.Init.CLKPhase = SPI_PHASE_1EDGE;
	lcd_spi.Init.CLKPolarity = SPI_POLARITY_LOW;
	lcd_spi.Init.DataSize = SPI_DATASIZE_8BIT;
	lcd_spi.Init.FirstBit = SPI_FIRSTBIT_MSB;
	lcd_spi.Init.TIMode = SPI_TIMODE_DISABLE;
	lcd_spi.Init.CRCCalculation = SPI_CRCCALCULATION_DISABLE;
	lcd_spi.Init.CRCPolynomial = 7;
	lcd_spi.Init.NSS = SPI_NSS_SOFT;
	lcd_spi.Init.Mode = SPI_MODE_MASTER;
 
	HAL_SPI_Init(&lcd_spi);
	
	DMA_HandleTypeDef spi_dma_tx;
	spi_dma_tx.Instance = DMA1_Stream4;
	spi_dma_tx.Init.Channel = DMA_CHANNEL_0;
	spi_dma_tx.Init.FIFOMode = DMA_FIFOMODE_DISABLE;
	spi_dma_tx.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
	spi_dma_tx.Init.MemBurst = DMA_MBURST_INC4;
	spi_dma_tx.Init.PeriphBurst = DMA_PBURST_INC4;
	spi_dma_tx.Init.Direction = DMA_MEMORY_TO_PERIPH;
	spi_dma_tx.Init.PeriphInc = DMA_PINC_DISABLE;
	spi_dma_tx.Init.MemInc = DMA_MINC_ENABLE;
	spi_dma_tx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
	spi_dma_tx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
	spi_dma_tx.Init.Mode = DMA_NORMAL;
	spi_dma_tx.Init.Priority = DMA_PRIORITY_MEDIUM;
	
	HAL_DMA_Init(&spi_dma_tx);
	__HAL_LINKDMA(&lcd_spi, hdmatx, spi_dma_tx);
	
	HAL_NVIC_SetPriority(DMA1_Stream4_IRQn, 5, 1);
	HAL_NVIC_EnableIRQ(DMA1_Stream4_IRQn);

USART2 Usage

DMA Rx is started in circular mode. It is being run with Flow Control enabled, and IDLE detection since we can receive any amount of data.

In the `HAL_UART_RxCpltCallback()` and `HAL_UART_RxHalfCpltCallback()` callbacks, and on the IDLE interrupt, data in the DMA buffer is copied into a ring buffer that is shared with an RTOS thread. If that ring buffer is full, `HAL_UART_DMAPause()` is called; once the RTOS thread has drained the buffer, `HAL_UART_DMAResume()` is called.

USART2 Initialization

UART_HandleTypeDef huart2;
huart2.Instance = USART2;
huart2.Init.BaudRate = 2534400;
huart2.Init.WordLength = UART_WORDLENGTH_8B;
huart2.Init.StopBits = UART_STOPBITS_1;
huart2.Init.Parity = UART_PARITY_NONE;
huart2.Init.Mode = UART_MODE_TX_RX;
huart2.Init.HwFlowCtl = UART_HWCONTROL_RTS_CTS;
huart2.Init.OverSampling = UART_OVERSAMPLING_16;
huart2.Init.OneBitSampling = UART_ONE_BIT_SAMPLE_DISABLE;
huart2.AdvancedInit.AdvFeatureInit = UART_ADVFEATURE_NO_INIT;
HAL_UART_Init(&huart2);
 
HAL_NVIC_SetPriority(USART2_IRQn, 5, 0);
HAL_NVIC_EnableIRQ(USART2_IRQn);
 
hdma_usart2_rx.Instance = DMA1_Stream5;
hdma_usart2_rx.Init.Channel = DMA_CHANNEL_4;
hdma_usart2_rx.Init.Direction = DMA_PERIPH_TO_MEMORY;
hdma_usart2_rx.Init.PeriphInc = DMA_PINC_DISABLE;
hdma_usart2_rx.Init.MemInc = DMA_MINC_ENABLE;
hdma_usart2_rx.Init.PeriphDataAlignment = DMA_PDATAALIGN_BYTE;
hdma_usart2_rx.Init.MemDataAlignment = DMA_MDATAALIGN_BYTE;
hdma_usart2_rx.Init.Mode = DMA_CIRCULAR;
hdma_usart2_rx.Init.Priority = DMA_PRIORITY_MEDIUM;
hdma_usart2_rx.Init.FIFOMode = DMA_FIFOMODE_DISABLE;
hdma_usart2_rx.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
hdma_usart2_rx.Init.MemBurst = DMA_MBURST_SINGLE;
hdma_usart2_rx.Init.PeriphBurst = DMA_PBURST_SINGLE;
HAL_DMA_Init(&hdma_usart2_rx);
 
__HAL_LINKDMA(&huart2, hdmarx, hdma_usart2_rx);
 
HAL_NVIC_SetPriority(DMA1_Stream5_IRQn, 5, 0);
HAL_NVIC_EnableIRQ(DMA1_Stream5_IRQn);
 
 
/* Start receiving DMA */
SET_BIT(huart2.Instance->CR3, USART_CR3_EIE);
__HAL_UART_CLEAR_IDLEFLAG(&huart2);
SET_BIT(huart2.Instance->CR1, USART_CR1_PEIE | USART_CR1_IDLEIE);
HAL_UART_Receive_DMA(&huart2, dmabuf, sizeof(dmabuf));

Thank you

MJ.1 (Author)
Associate
October 22, 2020

Just in case it is of any use, here is a dump of the relevant registers after DMA1 stops responding. It is the EN flag in DMA1->S5CR that is stuck set, and DMA1->S4 that continues to fire interrupts even though no data is being sent.

RCC:
 CR: 0x3F037E83 clock control register
 PLLCFGR: 0x2840384A PLL configuration register
 CFGR: 0x0000940A clock configuration register
 CIR: 0x00000000 clock interrupt register
 AHB1RSTR: 0x00000000 AHB1 peripheral reset register
 AHB2RSTR: 0x00000000 AHB2 peripheral reset register
 AHB3RSTR: 0x00000000 AHB3 peripheral reset register
 APB1RSTR: 0x00000000 APB1 peripheral reset register
 APB2RSTR: 0x00000000 APB2 peripheral reset register
 AHB1ENR: 0x007401FF AHB1 peripheral clock register
 AHB2ENR: 0x00000080 AHB2 peripheral clock enable register
 AHB3ENR: 0x00000001 AHB3 peripheral clock enable register
 APB1ENR: 0x10624203 APB1 peripheral clock enable register
 APB2ENR: 0x00C04D00 APB2 peripheral clock enable register
 AHB1LPENR: 0x7EF7B7FF AHB1 peripheral clock enable in low power mode register
 AHB2LPENR: 0x000000F1 AHB2 peripheral clock enable in low power mode register
 AHB3LPENR: 0x00000003 AHB3 peripheral clock enable in low power mode register
 APB1LPENR: 0xFFFFCBFF APB1 peripheral clock enable in low power mode register
 APB2LPENR: 0x04F77F33 APB2 peripheral clock enabled in low power mode register
 BDCR: 0x00008103 Backup domain control register
 CSR: 0x00000002 clock control & status register
 SSCGR: 0x00000000 spread spectrum clock generation register
 PLLI2SCFGR: 0x2A002340 PLLI2S configuration register
 PLLSAICFGR: 0x24011E00 PLL configuration register
 DKCFGR1: 0x00500001 dedicated clocks configuration register
 DKCFGR2: 0x0B000000 dedicated clocks configuration register
RCC->AHB1ENR:
 OTGHSULPIEN: 0 USB OTG HSULPI clock enable
 OTGHSEN: 0 USB OTG HS clock enable
 ETHMACPTPEN: 0 Ethernet PTP clock enable
 ETHMACRXEN: 0 Ethernet Reception clock enable
 ETHMACTXEN: 0 Ethernet Transmission clock enable
 ETHMACEN: 0 Ethernet MAC clock enable
 DMA2DEN: 0 DMA2D clock enable
 DMA2EN: 1 DMA2 clock enable
 DMA1EN: 1 DMA1 clock enable
 CCMDATARAMEN: 1 CCM data RAM clock enable
 BKPSRAMEN: 1 Backup SRAM interface clock enable
 CRCEN: 0 CRC clock enable
 GPIOKEN: 0 IO port K clock enable
 GPIOJEN: 0 IO port J clock enable
 GPIOIEN: 1 IO port I clock enable
 GPIOHEN: 1 IO port H clock enable
 GPIOGEN: 1 IO port G clock enable
 GPIOFEN: 1 IO port F clock enable
 GPIOEEN: 1 IO port E clock enable
 GPIODEN: 1 IO port D clock enable
 GPIOCEN: 1 IO port C clock enable
 GPIOBEN: 1 IO port B clock enable
 GPIOAEN: 1 IO port A clock enable
SCB:
 CPUID: 0x410FC271 CPUID base register
 ICSR: 0x00C36000 Interrupt control and state register
 VTOR: 0x08010000 Vector table offset register
 AIRCR: 0xFA050000 Application interrupt and reset control register
 SCR: 0x00000000 System control register
 CCR: 0x00070200 Configuration and control register
 SHPR1: 0x00000000 System handler priority registers
 SHPR2: 0x40000000 System handler priority registers
 SHPR3: 0xF0F00000 System handler priority registers
 SHCRS: 0x00010000 System handler control and state register
 CFSR_UFSR_BFSR_MMFSR: 0x00000000 Configurable fault status register
 HFSR: 0x00000000 Hard fault status register
 MMFAR: 0x00000000 Memory management fault address register
 BFAR: 0x00000000 Bus fault address register
DMA1:
 LISR: 0x00000000 low interrupt status register
 HISR: 0x00000000 high interrupt status register
 LIFCR: 0x00000000 low interrupt flag clear register
 HIFCR: 0x00000000 high interrupt flag clear register
 S4CR: 0x00010446 stream x configuration register
 S4NDTR: 0x00000000 stream x number of data register
 S4PAR: 0x4000380C stream x peripheral address register
 S4M0AR: 0x20011A40 stream x memory 0 address register
 S4M1AR: 0x00000000 stream x memory 1 address register
 S4FCR: 0x000000A0 stream x FIFO control register
 S5CR: 0x0801051F stream x configuration register
 S5NDTR: 0x000000FE stream x number of data register
 S5PAR: 0x40004424 stream x peripheral address register
 S5M0AR: 0x2004E000 stream x memory 0 address register
 S5M1AR: 0x00000000 stream x memory 1 address register
 S5FCR: 0x000000A0 stream x FIFO control register
DMA1->S4CR:
 CHSEL: 0 Channel selection
 MBURST: 0 Memory burst transfer configuration
 PBURST: 0 Peripheral burst transfer configuration
 ACK: 0
 CT: 0 Current target (only in double buffer mode)
 DBM: 0 Double buffer mode
 PL: 1 Priority level
 PINCOS: 0 Peripheral increment offset size
 MSIZE: 0 Memory data size
 PSIZE: 0 Peripheral data size
 MINC: 1 Memory increment mode
 PINC: 0 Peripheral increment mode
 CIRC: 0 Circular mode
 DIR: 1 Data transfer direction
 PFCTRL: 0 Peripheral flow controller
 TCIE: 0 Transfer complete interrupt enable
 HTIE: 0 Half transfer interrupt enable
 TEIE: 1 Transfer error interrupt enable
 DMEIE: 1 Direct mode error interrupt enable
 EN: 0 Stream enable / flag stream ready when read low
DMA1->S5CR:
 CHSEL: 4 Channel selection
 MBURST: 0 Memory burst transfer configuration
 PBURST: 0 Peripheral burst transfer configuration
 ACK: 0
 CT: 0 Current target (only in double buffer mode)
 DBM: 0 Double buffer mode
 PL: 1 Priority level
 PINCOS: 0 Peripheral increment offset size
 MSIZE: 0 Memory data size
 PSIZE: 0 Peripheral data size
 MINC: 1 Memory increment mode
 PINC: 0 Peripheral increment mode
 CIRC: 1 Circular mode
 DIR: 0 Data transfer direction
 PFCTRL: 0 Peripheral flow controller
 TCIE: 1 Transfer complete interrupt enable
 HTIE: 1 Half transfer interrupt enable
 TEIE: 1 Transfer error interrupt enable
 DMEIE: 1 Direct mode error interrupt enable
 EN: 1 Stream enable / flag stream ready when read low
SPI2:
 CR1: 0x00000354 control register 1
 CR2: 0x00001700 control register 2
 SR: 0x00000002 status register
 DR: 0x00000000 data register
 CRCPR: 0x00000007 CRC polynomial register
 RXCRCR: 0x00000000 RX CRC register
 TXCRCR: 0x00000000 TX CRC register
 I2SCFGR: 0x00000000 I2S configuration register
 I2SPR: 0x00000002 I2S prescaler register
USART2:
 CR1: 0x0000001D Control register 1
 CR2: 0x00000000 Control register 2
 CR3: 0x00000340 Control register 3
 BRR: 0x00000012 Baud rate register
 GTPR: 0x00000000 Guard time and prescaler register
 RTOR: 0x00000000 Receiver timeout register
 RQR: (not readable) Request register
 ISR: 0x00621230 Interrupt & status register
 ICR: (not readable) Interrupt flag clear register
 RDR: 0x000000CA Receive data register
 TDR: 0x00000052 Transmit data register
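
To cross-check the decoded fields against the raw values, a small decoder can be run on a host. This is a sketch that assumes the STM32F746 DMA_SxCR layout (3-bit CHSEL in bits 27:25) and covers only a subset of the fields; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the DMA_SxCR fields, named as in the dump above. */
typedef struct {
    uint32_t en, dmeie, teie, htie, tcie;
    uint32_t dir, circ, minc, pl, chsel;
} SxcrFields;

/* Extract `width` bits of `reg`, starting at bit `shift`. */
static uint32_t bits(uint32_t reg, unsigned shift, unsigned width)
{
    assert(width >= 1u && width < 32u);
    return (reg >> shift) & ((1u << width) - 1u);
}

/* Split a raw DMA_SxCR value into its fields. */
static SxcrFields sxcr_decode(uint32_t cr)
{
    SxcrFields f;
    f.en    = bits(cr, 0, 1);   /* stream enable */
    f.dmeie = bits(cr, 1, 1);   /* direct mode error interrupt enable */
    f.teie  = bits(cr, 2, 1);   /* transfer error interrupt enable */
    f.htie  = bits(cr, 3, 1);   /* half transfer interrupt enable */
    f.tcie  = bits(cr, 4, 1);   /* transfer complete interrupt enable */
    f.dir   = bits(cr, 6, 2);   /* 0: periph-to-mem, 1: mem-to-periph */
    f.circ  = bits(cr, 8, 1);   /* circular mode */
    f.minc  = bits(cr, 10, 1);  /* memory increment */
    f.pl    = bits(cr, 16, 2);  /* priority level */
    f.chsel = bits(cr, 25, 3);  /* channel selection */
    return f;
}
```

`sxcr_decode(0x0801051F)` gives EN=1, CIRC=1, CHSEL=4 for stream 5, and `sxcr_decode(0x00010446)` gives EN=0, DIR=1, TCIE=0 for stream 4, matching the per-field listing in the dump.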

waclawek.jan
Super User
October 22, 2020

Wow.

The only way I was able to get DMA stuck was when I (deliberately, as an experiment) mapped two streams to the same trigger...

> HAL_DMAPause()

What's that? Sounds suspicious. I can't find it in CubeF7 https://github.com/STMicroelectronics/STM32CubeF7/search?q=HAL_DMAPause

JW

MJ.1 (Author)
Associate
October 22, 2020

Hi JW,

Thank you for your response! I mis-typed that one, it is actually "HAL_UART_DMAPause()" https://github.com/STMicroelectronics/STM32CubeF7/blob/79acbf8ec060d3ec751f2eaba6ee050269995357/Drivers/STM32F7xx_HAL_Driver/Src/stm32f7xx_hal_uart.c#L1440 . I have updated the initial question to clarify.

waclawek.jan
Super User
October 22, 2020

Avoid this function and report back if the problem persists.

JW

MJ.1 (Author)
Associate
October 23, 2020

Hi JW,

I've replaced "HAL_UART_DMAPause()" and "HAL_UART_DMAResume()" with "CLEAR_BIT(huart2.Instance->CR3, USART_CR3_DMAR)" and "SET_BIT(huart2.Instance->CR3, USART_CR3_DMAR)" respectively. This has not helped the situation.

Unfortunately, I don't think I can remove DMA Pause / Resume entirely, as I rely on them for flow control while the system is under load. I know for a fact that I am successfully pausing / resuming thousands of times before DMA gets stuck. It's my understanding that I should be able to consistently perform UART flow control in this manner.

Here is a snippet of my USART Rx routines, I've put in fairly heavy locking just to be sure.

extern UART_HandleTypeDef huart2;
extern DMA_HandleTypeDef hdma_usart2_rx;
static struct RingBuffer rx_buffer; /* A simple ring buffer, details irrelevant */
 
/* Circular buffer for use by the dma controller */
static uint8_t dmabuf[256];
static size_t old_pos = 0;
 
static uint32_t LOCK(void) {
	uint32_t priMsk = __get_PRIMASK();
	__disable_irq();
	return priMsk;
}
 
static void UNLOCK(uint32_t priMask) {
	__set_PRIMASK(priMask);
}
 
 
static bool rxIsPaused(void) {
	return (HAL_IS_BIT_SET(huart2.Instance->CR3, USART_CR3_DMAR) == 0);
}
 
static size_t getRxBytesAvailable(void) {
	const size_t pos = sizeof(dmabuf) - hdma_usart2_rx.Instance->NDTR;
	size_t num_bytes;
	if (pos == old_pos) {
		num_bytes = 0;
	} else if (pos > old_pos) {
		num_bytes = pos - old_pos;
	} else {
		num_bytes = sizeof(dmabuf) - old_pos + pos;
	}
	return num_bytes;
}
 
static void consumeRxDataBuffer(size_t num_bytes) {
	const size_t remaining = sizeof(dmabuf) - old_pos;
	const size_t write_len = MIN(num_bytes, remaining);
	ring_buffer_write(&rx_buffer, &dmabuf[old_pos], write_len);
 
	const size_t bytes_left = num_bytes - write_len;
	if (bytes_left) {
		/* Wrapping to beginning of dmabuf */
		ring_buffer_write(&rx_buffer, dmabuf, bytes_left);
	}
 
	old_pos = (old_pos + num_bytes) % sizeof(dmabuf);
}
 
static void processIncoming(void)
{
	const uint32_t L = LOCK();
 
	if (rxIsPaused()) {
		/* buffer will be processed prior to resumption */
		goto unlock;
	}
 
	/* Calculate current position in buffer */
	size_t read_amount = getRxBytesAvailable();
	if (read_amount == 0) {
		goto unlock;
	}
 
	if (read_amount > ring_buffer_get_space_available(&rx_buffer)) {
		/* Pause UART DMA until we consume & have enough space again */
		HAL_UART_DMAPause(&huart2);
		goto unlock;
	}
 
	consumeRxDataBuffer(read_amount);
 
unlock:
	UNLOCK(L);
}
 
/* Remember: This is used from both DMA1 and USART2 ISRs */
void HAL_UART_RxHalfCpltCallback(UART_HandleTypeDef* uart)
{
	processIncoming();
}
 
/* Remember: This is used from both DMA1 and USART2 ISRs */
void HAL_UART_RxCpltCallback(UART_HandleTypeDef* uart)
{
	processIncoming();
}
 
void USART2_IRQHandler(void)
{
	if ( __HAL_UART_GET_FLAG(&huart2, UART_FLAG_IDLE) )
	{
		__HAL_UART_CLEAR_IDLEFLAG(&huart2);
		processIncoming();
	}
	else
	{
		HAL_UART_IRQHandler(&huart2);
	}
}
 
 
/* RTOS consumer calls this after reading from the ring buffer */
void resumeRx(void) {
	const uint32_t L = LOCK();
 
	if (rxIsPaused() == false) {
		goto unlock;
	}
 
	size_t read_amount = getRxBytesAvailable();
	if (read_amount > ring_buffer_get_space_available(&rx_buffer)) {
		/* No space, don't resume yet */
		goto unlock;
	}
	consumeRxDataBuffer(read_amount);
	HAL_UART_DMAResume(&huart2);
 
unlock:
	UNLOCK(L);
}

To test it out, I greatly increased the size of my task-owned ring buffer (rx_buffer) such that I never need to pause / resume. Transfers in that case seem to be OK and I haven't been able to reproduce my issue.

Unfortunately, I can't afford that memory and it is my understanding that it should be possible to perform flow control in this manner.
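
The wrap-around arithmetic in `getRxBytesAvailable()` can also be checked off-target by passing the sampled NDTR value in as a parameter. The variant below is a hypothetical, host-testable sketch rather than the code running on the device; it assumes that NDTR reads between 1 and 256 while the stream runs in circular mode with a 256-byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMABUF_SIZE 256u  /* matches sizeof(dmabuf) in the snippet above */

/* Bytes written by the DMA since old_pos, given a sampled NDTR value.
 * Hypothetical variant of getRxBytesAvailable() without hardware access. */
static size_t rx_bytes_available(uint32_t ndtr, size_t old_pos)
{
    assert(ndtr >= 1u && ndtr <= DMABUF_SIZE);  /* assumed circular-mode range */
    assert(old_pos < DMABUF_SIZE);

    const size_t pos = DMABUF_SIZE - ndtr;      /* DMA write index, 0..255 */
    return (pos + DMABUF_SIZE - old_pos) % DMABUF_SIZE;
}
```

With the S5NDTR value from the register dump (0xFE) and `old_pos == 0`, this reports 2 bytes pending. The single modulo expression replaces the three-way branch but yields the same result over the assumed NDTR range.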

Thanks again,

Marc

waclawek.jan
Super User
October 24, 2020

Hi Marc,

> my understanding that it should be possible to perform flow control in this manner.

Yes, that was my understanding, too. However, the modules in STM32 are very complex, they are written by people, and people do make errors. You are exploring a seldom-used feature which is probably buggy and is beyond ST's understanding or care.

My experience (post mangled courtesy of ST's mania to pay for inferior forum software) with a similar scheme had a somewhat different outcome: missing transfers rather than a stuck DMA. That can be explained by a different MCU/family (with a potentially different DMA design; ST does not care to provide users with this detail) and/or minute details of setup and event sequencing.

I don't say this *is* the source of your problem, but it sounds very likely, given also results of your experiment.

As there's no point in expecting that ST will ever acknowledge this as a problem, let alone produce a workaround (of which there very likely is none for the existing chips), the easiest way to avoid the problem is to avoid disabling/enabling the request (trigger) at all. I know that means rethinking the logic of your application, and I know the amount of work that can involve; I went down that path too.

JW
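
One possible way to follow the advice above, offered here as an assumption and not something proposed in the thread, is to leave the DMA request enabled permanently and hold off the sender by other means, for example by driving the RTS line from software with high/low watermarks on the ring buffer. The sketch below shows only the hysteresis decision; `FlowGate`, its fields and the thresholds are hypothetical, and the DMA buffer would still have to absorb whatever the sender transmits before it reacts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical watermark state for software-driven flow control. */
typedef struct {
    size_t stop_below;    /* hold off the sender when free space drops below this */
    size_t resume_above;  /* release the sender once free space exceeds this */
    bool   holding;       /* true while the sender is being held off */
} FlowGate;

/* Update the gate with the current free space in the ring buffer.
 * Returns true while the sender should be held off. */
static bool flow_gate_update(FlowGate *g, size_t free_space)
{
    assert(g->stop_below <= g->resume_above);  /* otherwise there is no hysteresis */

    if (!g->holding && free_space < g->stop_below) {
        g->holding = true;
    } else if (g->holding && free_space > g->resume_above) {
        g->holding = false;
    }
    return g->holding;
}
```

The two thresholds keep the line from toggling on every byte: once the gate closes, it stays closed until the consumer has freed a meaningful amount of space.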