HAL_UART_Receive_DMA randomly stops working?

crwper
Senior

I am working with an STM32WB5MMG, but I don't think this issue relates to the radio part of the MCU, so I am posting in the more general community. I hope that's okay.

I am reading data from a GPS receiver over UART using the HAL_UART_Receive_DMA function. Data is coming in at 230400 baud. DMA is configured as a simple (non-circular) buffer of 512 bytes. In the HAL_UART_RxCpltCallback handler, I call HAL_UART_Receive_DMA again to start a new transfer.

I'm doing it this way (as opposed to using circular mode) so that I can count the total number of bytes transferred by counting the number of complete "frames" of 512 bytes and adding (512 - CNDTR). My concern is that if I used a circular buffer, there is some ambiguity about when the frame count is updated relative to the CNDTR update.

Usually, this works fine. But after a couple of minutes of receiving data, it looks like CNDTR stops updating. I've checked that there is still data coming in from the GPS module. What's curious is that when CNDTR stops updating, it is always equal to 511 (i.e., one less than the buffer length).

I stumbled across a way to get a similar result...

  1. Run the firmware in debug mode
  2. Break where I would usually handle incoming data
  3. Remove the breakpoint and continue execution
  4. Set the breakpoint once more

When the debugger breaks the second time, similarly, CNDTR is stuck at 511. It seems likely that this is due to some kind of overflow, which makes me think the same thing might be true for the actual issue I'm having.

However, HAL_UART_ErrorCallback is never called, and when the error occurs, none of the registers seems to indicate an overflow (looking specifically at the DMA2 and DMAMUX1 SFRs).

Roughly speaking, this is the structure of my code:

#define GNSS_RX_BUF_LEN		512		// UART DMA receive buffer length (bytes)
 
#define GNSS_UPDATE_MSEC    10
#define GNSS_UPDATE_RATE    (GNSS_UPDATE_MSEC*1000/CFG_TS_TICK_VAL)
 
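uint8_t  gnssRxData[GNSS_RX_BUF_LEN];	// DMA receive buffer
uint32_t gnssRxIndex = 0;				// read index into gnssRxData
 
static uint8_t timer_id;				// timer server ID filled in by HW_TS_Create
 
// Forward declarations for the functions registered in GNSS_Init
static void GNSS_Timer(void);
static void GNSS_Update(void);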
 
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
	// Begin DMA transfer
	HAL_UART_Receive_DMA(&huart1, gnssRxData, GNSS_RX_BUF_LEN);
}
 
void GNSS_Init(void)
{
	// Begin DMA transfer
	HAL_UART_Receive_DMA(&huart1, gnssRxData, GNSS_RX_BUF_LEN);
 
	// Initialize GNSS task
	UTIL_SEQ_RegTask(1<<CFG_TASK_GNSS_UPDATE_ID, UTIL_SEQ_RFU, GNSS_Update);
 
	// Initialize GNSS update timer
	HW_TS_Create(CFG_TIM_PROC_ID_ISR, &timer_id, hw_ts_Repeated, GNSS_Timer);
	HW_TS_Start(timer_id, GNSS_UPDATE_RATE);
}
 
static void GNSS_Timer(void)
{
	// Call update task
	UTIL_SEQ_SetTask(1<<CFG_TASK_GNSS_UPDATE_ID, CFG_SCH_PRIO_0);
}
 
static void GNSS_Update(void)
{
	uint32_t cndtr = huart1.hdmarx->Instance->CNDTR;
	uint32_t writeIndex = GNSS_RX_BUF_LEN - cndtr;
 
	while (gnssRxIndex != writeIndex)
	{
		// Handle a byte and increment gnssRxIndex
	}
}
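
The "handle a byte and increment gnssRxIndex" step can be sketched as a small drain routine. This is only an illustration, not the code from the original firmware: `rx_drain`, `RX_BUF_LEN` and the output-buffer interface are hypothetical names, and the routine copies bytes out instead of parsing them. The modulo on the write index is there so that a CNDTR reading of 0 maps to index 0 rather than to one past the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_LEN 512u  /* assumed: same length as the DMA buffer above */

/* Copy unread bytes from the DMA buffer into out[], wrapping at the end
 * of the buffer. cndtr is a snapshot of the DMA channel's CNDTR register.
 * Returns the number of bytes copied and advances *read_index.
 * buf is volatile because on target the DMA writes it behind the CPU's back. */
static size_t rx_drain(const volatile uint8_t *buf, uint32_t *read_index,
                       uint32_t cndtr, uint8_t *out, size_t out_cap)
{
    /* CNDTR counts down from RX_BUF_LEN; convert it to a write index. */
    uint32_t write_index = (RX_BUF_LEN - cndtr) % RX_BUF_LEN;
    size_t n = 0;

    while (*read_index != write_index && n < out_cap) {
        out[n++] = buf[*read_index];
        *read_index = (*read_index + 1u) % RX_BUF_LEN;
    }
    return n;
}
```

On target this would be called from GNSS_Update with `huart1.hdmarx->Instance->CNDTR` as the snapshot. It only stays correct if fewer than RX_BUF_LEN bytes arrive between calls.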

Any help would be greatly appreciated. If an overrun were occurring, I'm not completely sure where I would catch it--I thought the SFRs would change, but maybe not? I'm hoping this issue looks familiar to someone out there.

Thanks!

Michael

1 ACCEPTED SOLUTION

Most likely a UART overflow (overrun). CNDTR being consistently one below full means it's a consequence of the DMA restart not occurring fast enough (no wonder, given Cube/HAL bloat). Stick to circular DMA and learn to handle potential NDTR wraparound.
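
One way to handle the wraparound, sketched under assumptions: keep the previous write index and add the modulo difference at each poll, so no separate frame counter is needed. `rx_counter_t`, `rx_write_index` and `rx_counter_poll` are hypothetical names, not HAL functions, and the approach is only valid if fewer than one buffer's worth of bytes arrives between polls.

```c
#include <stdint.h>

#define RX_BUF_LEN 512u  /* assumed: same length as the buffer in the question */

/* Hypothetical bookkeeping state; not part of HAL. */
typedef struct {
    uint32_t last_write_index;  /* write index seen at the previous poll */
    uint64_t total_bytes;       /* running count of bytes received */
} rx_counter_t;

/* Convert the DMA "remaining transfers" count into a write index.
 * The modulo maps a reading of 0 onto index 0, the same position as a
 * freshly reloaded count of RX_BUF_LEN. */
static uint32_t rx_write_index(uint32_t cndtr)
{
    return (RX_BUF_LEN - cndtr) % RX_BUF_LEN;
}

/* Update the running total from a CNDTR snapshot and return the number
 * of bytes that arrived since the previous poll. */
static uint32_t rx_counter_poll(rx_counter_t *c, uint32_t cndtr)
{
    uint32_t write_index = rx_write_index(cndtr);
    /* Adding RX_BUF_LEN before subtracting keeps the result non-negative
     * when the write index has wrapped past the end of the buffer. */
    uint32_t delta = (write_index + RX_BUF_LEN - c->last_write_index) % RX_BUF_LEN;

    c->last_write_index = write_index;
    c->total_bytes += delta;
    return delta;
}
```

With a 10 ms poll at 230400 baud, only a few hundred bytes arrive per poll, which is below the 512-byte limit this relies on.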

> However, HAL_UART_ErrorCallback is never called,

Debug it; Cube is open source. Overflow is not a DMA/DMAMUX "feature" but a UART one, so look there. The RM is your friend. Check whether the underlying UART ISR is called: if yes, the problem is somewhere in Cube's harness; if no, debug it the way interrupts are usually debugged.

JW

4 REPLIES 4

crwper
Senior

A little more information: I thought I would try changing to circular mode, since this is essentially what I'm doing already. I made the change in CubeMX, and made the following change to my callback:

void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
 
}

With these changes, the firmware runs the same as before, but we now spend 1 us in the (empty) callback instead of 12 us. Functionally, I think this is essentially the same thing except that the hardware is restarting the DMA transfer instead of software.

I'm happy it's working, but it's a poor engineer who makes changes until the software works, without understanding why those changes made it work. My thought is that, at 230400 bps, we're looking at roughly 43 us per byte (assuming 8N1 framing, i.e. 10 bits on the wire per byte; counting only the 8 data bits gives about 35 us). The 12 us spent in the callback alone shouldn't cause an issue, but maybe other code in the HAL pushes this out past one byte time, so that a byte is missed while we're stopping and restarting the DMA transfer. This leaves me with one main question:

Is there something I can check in the old firmware to see if this was, in fact, the issue?

Presumably, if DMA is halting because a byte was missed, it will have noted this issue somewhere, but I can't for the life of me find the flag that would indicate this problem. I'd love to build a check into the firmware so that I can correct this issue if it ever (somehow) came up again in the future.
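
To put a number on the per-byte budget discussed above, here is a small sketch of the arithmetic. `uart_frame_time_ns` is a hypothetical helper, and 8N1 framing (1 start bit, 8 data bits, 1 stop bit) is an assumption, since the thread does not state the framing.

```c
#include <stdint.h>

/* Time to receive one UART frame, in whole nanoseconds (rounded down).
 * bits_per_frame counts every bit on the wire: start, data, parity and
 * stop bits, so 8N1 is 10 bits. */
static uint32_t uart_frame_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
    /* Widen to 64 bits before multiplying to avoid overflow. */
    return (uint32_t)(((uint64_t)bits_per_frame * 1000000000u) / baud);
}
```

`uart_frame_time_ns(230400, 10)` comes out at about 43.4 us, while counting only the 8 data bits gives about 34.7 us. The first is the interval between successive bytes of a continuous stream.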

Thanks again for your help!

Michael

Michael,

You should see it as a UART overflow (overrun); see the UART chapter in the RM. It should have raised the respective interrupt if it's enabled. I don't use Cube, but you've probably already named the respective callback.
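
A minimal sketch of such a check, written as a plain function so it can be tried off-target. The names `gnss_check_overrun`, `gnss_overrun_count` and the two masks are hypothetical. The bit positions (ORE as bit 3 of USART_ISR, ORECF as bit 3 of USART_ICR) are my reading of this USART generation and should be verified against the RM for the part in use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; verify against the reference manual. */
#define GNSS_USART_ISR_ORE    (1u << 3)  /* overrun error flag in USART_ISR */
#define GNSS_USART_ICR_ORECF  (1u << 3)  /* overrun clear flag in USART_ICR */

static uint32_t gnss_overrun_count;  /* hypothetical diagnostic counter */

/* Inspect a snapshot of USART_ISR. If it shows an overrun, clear the flag
 * through the ICR register, count the event and return true. */
static bool gnss_check_overrun(uint32_t isr, volatile uint32_t *icr)
{
    if ((isr & GNSS_USART_ISR_ORE) == 0u) {
        return false;
    }
    *icr = GNSS_USART_ICR_ORECF;  /* writing 1 to ORECF clears ORE */
    gnss_overrun_count++;
    return true;
}
```

On target this might be called as `gnss_check_overrun(huart1.Instance->ISR, &huart1.Instance->ICR)` from the periodic update task. HAL also offers `__HAL_UART_GET_FLAG(&huart1, UART_FLAG_ORE)` and `__HAL_UART_CLEAR_OREFLAG(&huart1)` for the same purpose. After an overrun in non-circular mode, the DMA transfer would presumably need to be restarted as well.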

JW

Thanks for all your help, Jan!

I enabled the UART interrupt with the old firmware and looked for the overflow error, and sure enough it appeared. With circular mode, the error does not appear.

Michael