One-time UART DMA data mismatch

robotwhisperer
Visitor

Hello all, 

This is my first post here. If I am in error at any point, please direct me to the relevant posting guidelines.

I am developing a sender-receiver solution involving two STM32H753 microcontrollers, both on ST's NUCLEO-STM32H753 boards. I am using USART3 to transmit and receive one byte using DMA. The sender transmits one byte (for example 'S'); the receiver receives one byte, checks whether it matches 'S', and if it does, toggles the yellow on-board LED. If it does not match, it toggles the green on-board LED. This check is done in the HAL_UART_RxCpltCallback() function.

I am facing an issue: the first time HAL_UART_RxCpltCallback() fires after the receiver is reset, it toggles the green LED once (the received byte did not match the expected data), and from then on it toggles only the yellow LED (the received byte matches the expected data, which is the desired behaviour). When I run the receiver under the debugger, the problem does not appear: only the yellow LED toggles, i.e., the received byte always matches the expected data, 'S'.

So, in the debugger, everything works fine. But when not debugging, the first iteration of data does not match but all subsequent ones do. 

Below I have snippets for the UART Callback functions and main functions for the transmitter and receiver. I also have a screenshot of STM32CubeIDE that shows in the debug view the matching data in rx_buffer on the first iteration. 

// ------------------------------------------------------------------------------
// Receiver Code:
/* USER CODE BEGIN 0 */
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart) {
	if (huart == &huart3) {
		uart_rx_complete = 1;
		callback_count++;
		if (rx_buffer[0] == 'S') {
			HAL_GPIO_TogglePin(LD2_GPIO_Port, LD2_Pin); // Matched!
		} else {
			HAL_GPIO_TogglePin(LD1_GPIO_Port, LD1_Pin); // Did not match!
		}
	}
	SCB_InvalidateDCache_by_Addr((uint32_t*)rx_buffer, 1);
	HAL_UART_Receive_DMA(&huart3, rx_buffer, 1);
}
/* USER CODE END 0 */

/**
 * @brief  The application entry point.
 * @retval int
 */
int main(void) {

	/* USER CODE BEGIN 1 */

	/* USER CODE END 1 */

	/* MPU Configuration--------------------------------------------------------*/
	MPU_Config();

	/* Enable the CPU Cache */

	/* Enable I-Cache---------------------------------------------------------*/
	SCB_EnableICache();

	/* Enable D-Cache---------------------------------------------------------*/
	SCB_EnableDCache();

	/* MCU Configuration--------------------------------------------------------*/

	/* Reset of all peripherals, Initializes the Flash interface and the Systick. */
	HAL_Init();

	/* USER CODE BEGIN Init */

	/* USER CODE END Init */

	/* Configure the system clock */
	SystemClock_Config();

	/* USER CODE BEGIN SysInit */

	/* USER CODE END SysInit */

	/* Initialize all configured peripherals */
	MX_GPIO_Init();
	MX_DMA_Init();
	MX_USART6_UART_Init();
	MX_USART3_UART_Init();
	/* USER CODE BEGIN 2 */
	SCB_InvalidateDCache_by_Addr((uint32_t*)rx_buffer, 1);
	HAL_UART_Receive_DMA(&huart3, rx_buffer, 1);

	/* USER CODE END 2 */

	/* Infinite loop */
	/* USER CODE BEGIN WHILE */
	while (1) {

		/* USER CODE END WHILE */

		/* USER CODE BEGIN 3 */
	}
	/* USER CODE END 3 */
}

// ------------------------------------------------------------------------------
// Transmitter Code 
/* USER CODE BEGIN 0 */
void HAL_UART_TxCpltCallback(UART_HandleTypeDef *huart) {
    if (huart == &huart3) {
        uart_tx_complete = 1;
        callback_count++;
        HAL_GPIO_TogglePin(LD3_GPIO_Port, LD3_Pin);
    }
}
/* USER CODE END 0 */

/**
  * @brief  The application entry point.
  * @retval int
  */
int main(void)
{

  /* USER CODE BEGIN 1 */

  /* USER CODE END 1 */

  /* MPU Configuration--------------------------------------------------------*/
  MPU_Config();

  /* Enable the CPU Cache */

  /* Enable I-Cache---------------------------------------------------------*/
  SCB_EnableICache();

  /* Enable D-Cache---------------------------------------------------------*/
  SCB_EnableDCache();

  /* MCU Configuration--------------------------------------------------------*/

  /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
  HAL_Init();

  /* USER CODE BEGIN Init */

  /* USER CODE END Init */

  /* Configure the system clock */
  SystemClock_Config();

  /* USER CODE BEGIN SysInit */

  /* USER CODE END SysInit */

  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  MX_DMA_Init();
  MX_USART3_UART_Init();
  /* USER CODE BEGIN 2 */
	uint8_t data[1] = {'S'};
  /* USER CODE END 2 */

  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
	while (1) {
		if (uart_tx_complete) {
			SCB_CleanDCache_by_Addr((uint32_t*)data, 1);
			if (HAL_UART_Transmit_DMA(&huart3, data, 1) != HAL_OK) {
				HAL_GPIO_WritePin(LD3_GPIO_Port, LD3_Pin, 1);
				Error_Handler();
			}
		}
		HAL_GPIO_TogglePin(LD1_GPIO_Port, LD1_Pin);
		HAL_Delay(500);
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
	}
  /* USER CODE END 3 */
}

 

STM32CubeIDE Debug session:
Screenshot from 2025-09-05 10-48-10.png
 

4 REPLIES
robotwhisperer
Visitor

I checked again, and it looks like the problem does exist in debug too. If the first place I break is line 73, the if statement `if (rx_buffer[0] == 'S')`, the debugger shows that rx_buffer[0] contains 0x0. This is strange: I would expect the DMA callback to fire only once data has been received, and in that case rx_buffer[0] should hold whatever was received over UART. So on the first iteration, the data is not showing up.

bmckenney
Associate III
	SCB_InvalidateDCache_by_Addr((uint32_t*)rx_buffer, 1);

 I suggest you move this to precede the read of rx_buffer[0] (up at the top of the if() block). When you do the read it's been a "long time" since the Invalidate, and it's not unlikely some neighboring variable caused a re-read of the cache line. 
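To illustrate why the ordering matters, here is a toy host-side model of a single cache line. It is only a sketch: toy_line_t, cpu_read(), dma_write(), invalidate() and receive_once() are made-up names for illustration, and the real Cortex-M7 cache is more involved. The assumption is that some neighbouring variable in the same line gets read between the invalidate and the arrival of the byte.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32u  /* Cortex-M7 D-cache line size in bytes */

/* Toy model (hypothetical): one 32-byte line of RAM plus the CPU's cached copy. */
typedef struct {
    uint8_t ram[LINE_SIZE];
    uint8_t cached[LINE_SIZE];
    int     valid;             /* nonzero while the cached copy is in use */
} toy_line_t;

/* CPU read: refill the whole line from RAM on a miss, then serve the cached copy. */
static uint8_t cpu_read(toy_line_t *l, unsigned off) {
    if (!l->valid) {
        memcpy(l->cached, l->ram, LINE_SIZE);
        l->valid = 1;
    }
    return l->cached[off];
}

/* DMA write: goes straight to RAM and never updates the cached copy. */
static void dma_write(toy_line_t *l, unsigned off, uint8_t v) {
    l->ram[off] = v;
}

/* Stand-in for SCB_InvalidateDCache_by_Addr() applied to this one line. */
static void invalidate(toy_line_t *l) {
    l->valid = 0;
}

/* Returns what the CPU sees at offset 0 (rx_buffer[0]) after one DMA transfer. */
static uint8_t receive_once(int invalidate_before_read) {
    toy_line_t l = {0};
    invalidate(&l);          /* invalidate, then arm the DMA (order used in the post) */
    (void)cpu_read(&l, 4);   /* a neighbour at offset 4 is read: whole line is refilled */
    dma_write(&l, 0, 'S');   /* the byte arrives in RAM only */
    if (invalidate_before_read) {
        invalidate(&l);      /* suggested fix: drop the stale copy first */
    }
    return cpu_read(&l, 0);
}
```

In this model receive_once(0) returns the stale 0 and receive_once(1) returns 'S'. On the target, the equivalent change would be calling SCB_InvalidateDCache_by_Addr() at the top of the if() block, before rx_buffer[0] is read.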

If I can, I set aside one of the alternate SRAM blocks (SRAM1, e.g.) for DMA buffers, and just keep it non-cacheable. A bit wasteful, maybe, but saves a lot of headaches.

TDK
Super User

In addition to invalidating before you read the data, you also need to ensure rx_buffer is aligned to a cache line and that nothing else occupies that cache line. The easiest way to do this is to align it and make it the size of a cache line.
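A minimal sketch of such a declaration, assuming GCC (as used by STM32CubeIDE) and the 32-byte line size of the Cortex-M7 D-cache. DCACHE_LINE_SIZE and rx_buffer_owns_its_line() are names made up for this example; on the target, the CMSIS macro __ALIGNED(32) should be equivalent to the attribute.

```c
#include <assert.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u   /* Cortex-M7 D-cache line size in bytes */

/* Aligned to a line boundary and padded to a full line, so no other
   variable can end up sharing the line with the DMA buffer. */
__attribute__((aligned(32))) static uint8_t rx_buffer[DCACHE_LINE_SIZE];

/* Hypothetical self-check: nonzero when the buffer starts on a line
   boundary and covers a whole number of lines. */
static int rx_buffer_owns_its_line(void) {
    return ((uintptr_t)rx_buffer % DCACHE_LINE_SIZE) == 0
        && (sizeof rx_buffer % DCACHE_LINE_SIZE) == 0;
}
```

The DMA transfer can still be one byte long; only the invalidate would then cover sizeof rx_buffer instead of 1.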

The required alignment is 32 bytes, but that is also the minimum granularity of a cache operation, so any other data sharing the line is subject to collateral damage.
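As a rough illustration of that granularity, the helper below computes which bytes a by-address cache operation would actually touch for a given buffer. It is a sketch only: line_span_t and dcache_line_span() are hypothetical names, and the addresses in the usage note are arbitrary examples.

```c
#include <assert.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u   /* Cortex-M7 D-cache line size in bytes */

/* Start address and byte count of the cache lines covering a buffer. */
typedef struct {
    uintptr_t start;
    uint32_t  size;
} line_span_t;

/* Round the start down and the end up to the nearest line boundary. */
static line_span_t dcache_line_span(uintptr_t addr, uint32_t len) {
    const uintptr_t mask = ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
    uintptr_t start = addr & mask;
    uintptr_t end   = (addr + len + DCACHE_LINE_SIZE - 1u) & mask;
    line_span_t span = { start, (uint32_t)(end - start) };
    return span;
}
```

For example, a 1-byte buffer at 0x24000047 gives a span starting at 0x24000040 with size 32, so the 31 other bytes in that line are invalidated along with it. A 4-byte buffer at 0x2400005E straddles a boundary and gives a 64-byte span.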

DMA for ONE byte seems to introduce a lot of friction for zero benefit.

Check the error-handling paths, i.e., where the UART status reports noise, framing, or parity errors, and check the return values of the HAL_UART_... calls.
