DMA buffer Question

JayDev
Senior II

I am trying to stream large amounts of data into an STM32H753 over UART and want to maximize the baud rate, ideally 921600. Unfortunately, I don't know in advance how many bytes will arrive, which makes it difficult to size the transfers. I currently have it set up to handle one byte at a time, which isn't very efficient.

Below is essentially what I'm using:

#define uartsize	2000
#define dma_buffer_interrupt_size	1
 
char		UART_out_Buffer[100];
uint8_t	UART_in_Buffer[uartsize];
int8_t	buf_len = 0;
uint8_t 	in_byte;
 
uint32_t	head_index = 0;
uint32_t	tail_index = 0;
 
int main(void)
{
  HAL_Init();
  SystemClock_Config();
 
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  MX_ETH_Init();
  MX_USART3_UART_Init();
  MX_DMA_Init();
  MX_USB_OTG_FS_PCD_Init();
 
  buf_len = sprintf(UART_out_Buffer, "Hello World\r\n");
  HAL_UART_Transmit(&huart3, (uint8_t *)UART_out_Buffer, buf_len, 100);
 
  HAL_UART_Receive_DMA(&huart3, &UART_in_Buffer[head_index], dma_buffer_interrupt_size);
 
  while (1)
  {
 
  }
}
 
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
    head_index = (head_index + dma_buffer_interrupt_size) % uartsize;
    HAL_UART_Receive_DMA(&huart3, &UART_in_Buffer[head_index], dma_buffer_interrupt_size);
}

Below is my peripheral setup:

static void MX_USART3_UART_Init(void)
{
  huart3.Instance = USART3;
  huart3.Init.BaudRate = 115200;
  huart3.Init.WordLength = UART_WORDLENGTH_8B;
  huart3.Init.StopBits = UART_STOPBITS_1;
  huart3.Init.Parity = UART_PARITY_NONE;
  huart3.Init.Mode = UART_MODE_TX_RX;
  huart3.Init.HwFlowCtl = UART_HWCONTROL_NONE;
  huart3.Init.OverSampling = UART_OVERSAMPLING_16;
  huart3.Init.OneBitSampling = UART_ONE_BIT_SAMPLE_DISABLE;
  huart3.Init.ClockPrescaler = UART_PRESCALER_DIV1;
  huart3.AdvancedInit.AdvFeatureInit = UART_ADVFEATURE_NO_INIT;
  if (HAL_UART_Init(&huart3) != HAL_OK)
  {
    Error_Handler();
  }
  if (HAL_UARTEx_SetTxFifoThreshold(&huart3, UART_TXFIFO_THRESHOLD_1_8) != HAL_OK)
  {
    Error_Handler();
  }
  if (HAL_UARTEx_SetRxFifoThreshold(&huart3, UART_RXFIFO_THRESHOLD_1_8) != HAL_OK)
  {
    Error_Handler();
  }
  if (HAL_UARTEx_DisableFifoMode(&huart3) != HAL_OK)
  {
    Error_Handler();
  }
}
 
static void MX_DMA_Init(void)
{
 
  /* DMA controller clock enable */
  __HAL_RCC_DMA1_CLK_ENABLE();
 
  /* DMA interrupt init */
  /* DMA1_Stream0_IRQn interrupt configuration */
  HAL_NVIC_SetPriority(DMA1_Stream0_IRQn, 0, 0);
  HAL_NVIC_EnableIRQ(DMA1_Stream0_IRQn);
 
}

This works fine at 115200 but not at 921600. I tried increasing the transfer size so the DMA would shift in, say, 10 bytes per interrupt rather than just one, but then only the first byte of each group of 10 was stored in my UART buffer and the other 9 were missing, and this repeated throughout the transmission (if I sent 1000 bytes, I'd get one byte, then nine 0s, and so on all the way through 1000 bytes of the UART buffer). This makes me lean towards thinking I have the buffer set up incorrectly, since it isn't being filled the way I would expect.

Not sure if anyone can give me some insight on what my issue might be or how I can get around this. I think I'm close but I'm missing something important.

I'm currently running the system clock at 240 MHz and the UART clock is at 75 MHz, if that helps.
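
As a rough sanity check on those numbers, the per-byte time budget can be worked out directly. This is only a sketch: it assumes 8N1 framing (10 bits on the wire per byte, matching the peripheral setup above), and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Bytes per second on the wire, assuming 8N1 framing:
 * 1 start bit + 8 data bits + 1 stop bit = 10 bits per byte. */
static uint32_t uart_bytes_per_second(uint32_t baud)
{
    return baud / 10u;
}

/* Core clock cycles available between two back-to-back received bytes
 * (integer division, so the result is rounded down). */
static uint32_t cycles_per_byte(uint32_t core_hz, uint32_t baud)
{
    return core_hz / uart_bytes_per_second(baud);
}
```

With a 240 MHz core clock this gives about 20833 cycles per byte at 115200 baud but only about 2604 cycles (roughly 10.85 us) at 921600, so anything that restarts a transfer from a callback for every single byte has to complete inside that window.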

5 REPLIES
TDK
Guru

One option is to handle every byte individually using an interrupt, and push those into a buffer as you get them, or handle them some other way. This will work but will probably not be fast enough if you use the HAL interrupt handling.

Another way is to use the DMA in circular mode and periodically poll for new incoming bytes by reading the DMA stream's NDTR register. This will use the least CPU resources.

Another way is to use the HAL_UARTEx_ReceiveToIdle_DMA routine, which returns at the end of the transfer or when the line goes idle. However, it is prone to overrun if you are not quick enough to start the next reception.
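
The circular-mode polling option can be sketched roughly as follows. This is a hardware-independent illustration, not the HAL's own API: ring_reader_t and ring_harvest are hypothetical names, and the "remaining" argument stands in for the DMA stream's NDTR value. It assumes the stream is configured for circular mode with memory increment enabled and that reception was started once over the whole buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state: only the tail index is tracked in software;
 * the head is derived from the DMA's remaining-transfer count. */
typedef struct {
    const volatile uint8_t *buf; /* buffer the DMA writes into */
    size_t size;                 /* total buffer length in bytes */
    size_t tail;                 /* next index the application will read */
} ring_reader_t;

/* Copy any bytes the DMA has written since the last call into out.
 * remaining is the stream's NDTR value; on the target it could be read
 * with __HAL_DMA_GET_COUNTER(huart3.hdmarx) (assumption: HAL in use). */
static size_t ring_harvest(ring_reader_t *r, size_t remaining,
                           uint8_t *out, size_t out_cap)
{
    /* NDTR counts down from size and reloads to size when it wraps,
     * so the write position is size - remaining, modulo size. */
    size_t head = (r->size - remaining) % r->size;
    size_t n = 0;

    while (r->tail != head && n < out_cap) {
        out[n++] = r->buf[r->tail];
        r->tail = (r->tail + 1u) % r->size;
    }
    return n;
}
```

Called from the main loop or a timer tick, this never needs to restart the DMA. The catch is that nothing detects a full lap: with a 2000-byte buffer at 921600 baud (about 92160 bytes per second) the DMA wraps roughly every 21.7 ms, so the poll has to run comfortably more often than that.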


Use DMA to create a circular FIFO which you harvest periodically.

Or use a protocol where the preamble/header defines the remaining data size/expectations.
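
A header that announces the payload size could be parsed along these lines. The frame layout here is purely an assumption for illustration (one sync byte 0xA5, a 16-bit little-endian length, then the payload); nothing in this thread defines the actual protocol, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_SYNC     0xA5u   /* hypothetical preamble byte */
#define FRAME_MAX_LEN  256u    /* hypothetical payload limit */

typedef enum { WAIT_SYNC, WAIT_LEN_LO, WAIT_LEN_HI, WAIT_PAYLOAD } frame_state_t;

typedef struct {
    frame_state_t state;
    uint16_t expected;               /* payload length from the header */
    uint16_t received;               /* payload bytes stored so far */
    uint8_t payload[FRAME_MAX_LEN];
} frame_parser_t;

/* Feed one received byte; returns 1 when a complete frame is in p->payload. */
static int frame_feed(frame_parser_t *p, uint8_t byte)
{
    switch (p->state) {
    case WAIT_SYNC:
        if (byte == FRAME_SYNC) {
            p->state = WAIT_LEN_LO;
        }
        break;
    case WAIT_LEN_LO:
        p->expected = byte;
        p->state = WAIT_LEN_HI;
        break;
    case WAIT_LEN_HI:
        p->expected |= (uint16_t)((uint16_t)byte << 8);
        p->received = 0;
        if (p->expected == 0u || p->expected > FRAME_MAX_LEN) {
            p->state = WAIT_SYNC;    /* reject empty or oversized frames */
        } else {
            p->state = WAIT_PAYLOAD;
        }
        break;
    case WAIT_PAYLOAD:
        p->payload[p->received++] = byte;
        if (p->received == p->expected) {
            p->state = WAIT_SYNC;
            return 1;
        }
        break;
    }
    return 0;
}
```

Once the length is known, the receiver can also request exactly that many bytes in a single DMA transfer instead of guessing a size.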


With a normal buffer, I think that's essentially what I've been doing above with your first option. It seems to work at 115200 but wasn't working at 921600.

The third option would be a great approach but, due to the size of the incoming data (being much larger than the buffer I planned on creating), this approach probably won't work.

#2 looks like the best option but, for the life of me, I can't get the circular buffer working correctly. I've had it working on a few other chips but seem to be having issues with the H7 chips.

I did find one page that seemed to suggest there are issues with the DMA on these chips:

https://community.st.com/s/article/FAQ-DMA-is-not-working-on-STM32H7-devices

I tried their suggestion in point 5 (excerpt below):

5. Solution example 3: Use Cache maintenance functions
 
Receiving data:
 
#define RX_LENGTH  (16)
uint8_t rx_buffer[RX_LENGTH];
 
/* Invalidate D-cache before reception */
/* Make sure the address is 32-byte aligned and add 32-bytes to length, in case it overlaps cacheline */
SCB_InvalidateDCache_by_Addr((uint32_t*)(((uint32_t)rx_buffer) & ~(uint32_t)0x1F), RX_LENGTH+32);
 
/* Start DMA transfer */
HAL_UART_Receive_DMA(&huart1, rx_buffer, RX_LENGTH);
/* No access to rx_buffer should be made before DMA transfer is completed */
 
Please note that in case of reception there can be problem if rx_buffer is not aligned to the size of cache-line (32-bytes), because during the invalidate operation another data sharing the same cache-line(s)  with rx_buffer can be lost.
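
The alignment caveat in that excerpt can be made concrete with a small sketch. The helper names are hypothetical and the attribute syntax assumes GCC or Clang; the idea is to round the invalidated region out to whole 32-byte cache lines and to give the receive buffer an aligned start and a length that is a multiple of 32, so no unrelated variable shares its lines.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u  /* Cortex-M7 D-cache line size in bytes */

/* Round an address down to the start of its cache line. */
static uintptr_t cache_span_start(uintptr_t addr)
{
    return addr & ~(uintptr_t)(CACHE_LINE - 1u);
}

/* Number of bytes covering every cache line touched by [addr, addr + len). */
static size_t cache_span_len(uintptr_t addr, size_t len)
{
    uintptr_t start = cache_span_start(addr);
    uintptr_t end = (addr + len + (CACHE_LINE - 1u)) & ~(uintptr_t)(CACHE_LINE - 1u);
    return (size_t)(end - start);
}

/* A buffer that owns its cache lines: aligned start, length 64 * 32 bytes. */
#define RX_LENGTH 2048u
static uint8_t rx_buffer[RX_LENGTH] __attribute__((aligned(CACHE_LINE)));
```

On the target, the start and length computed this way would be what gets passed to SCB_InvalidateDCache_by_Addr. A 2000-byte buffer is not a multiple of 32, so its last cache line could be shared with whatever the linker places after it.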

Likewise, I updated my code to reflect this but still haven't had much luck.

#define uartsize	2000
#define dma_buffer_interrupt_size	1
 
char		UART_out_Buffer[100];
uint8_t	UART_in_Buffer[uartsize];
int8_t	buf_len = 0;
uint8_t 	in_byte;
 
uint32_t	head_index = 0;
uint32_t	tail_index = 0;
 
int main(void)
{
  HAL_Init();
  SystemClock_Config();
 
  /* Initialize all configured peripherals */
  MX_GPIO_Init();
  MX_ETH_Init();
  MX_USART3_UART_Init();
  MX_DMA_Init();
  MX_USB_OTG_FS_PCD_Init();
 
  buf_len = sprintf(UART_out_Buffer, "Hello World\r\n");
  HAL_UART_Transmit(&huart3, (uint8_t *)UART_out_Buffer, buf_len, 100);
 
  // Added because of DMA issue?
  // https://community.st.com/s/article/FAQ-DMA-is-not-working-on-STM32H7-devices
  SCB_InvalidateDCache_by_Addr((uint32_t*)(((uint32_t)UART_in_Buffer) & ~(uint32_t)0x1F), dma_buffer_interrupt_size+32);
 
  //HAL_UART_Receive_DMA(&huart3, &UART_in_Buffer[head_index], dma_buffer_interrupt_size);
  //HAL_UART_Receive_DMA(&huart3, UART_in_Buffer, dma_buffer_interrupt_size);
  //HAL_UART_Receive_DMA(&huart3, (uint8_t *)&UART_in_Buffer, dma_buffer_interrupt_size);
  HAL_UART_Receive_DMA(&huart3, UART_in_Buffer, dma_buffer_interrupt_size);
  while (1)
  {
 
  }
}
 
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
    //HAL_UART_Receive_DMA(&huart3, &UART_in_Buffer[head_index], dma_buffer_interrupt_size);
    //HAL_UART_Receive_DMA(&huart3, UART_in_Buffer, dma_buffer_interrupt_size);
    //HAL_UART_Receive_DMA(&huart3, (uint8_t *)&UART_in_Buffer, dma_buffer_interrupt_size);

    SCB_InvalidateDCache_by_Addr((uint32_t*)(((uint32_t)UART_in_Buffer) & ~(uint32_t)0x1F), dma_buffer_interrupt_size+32);
    HAL_UART_Receive_DMA(&huart3, UART_in_Buffer, dma_buffer_interrupt_size);

    head_index = (head_index + dma_buffer_interrupt_size) % uartsize;
}

void HAL_UARTEx_RxFifoFullCallback(UART_HandleTypeDef *huart)
{
    //HAL_UART_Receive_DMA(&huart3, (uint8_t *)&UART_in_Buffer, dma_buffer_interrupt_size);

    SCB_InvalidateDCache_by_Addr((uint32_t*)(((uint32_t)UART_in_Buffer) & ~(uint32_t)0x1F), dma_buffer_interrupt_size+32);
    HAL_UART_Receive_DMA(&huart3, UART_in_Buffer, dma_buffer_interrupt_size);

    head_index = (head_index + dma_buffer_interrupt_size) % uartsize;
}

Below are the (updated) peripheral functions:

static void MX_USART3_UART_Init(void)
{
  huart3.Instance = USART3;
  huart3.Init.BaudRate = 115200;
  huart3.Init.WordLength = UART_WORDLENGTH_8B;
  huart3.Init.StopBits = UART_STOPBITS_1;
  huart3.Init.Parity = UART_PARITY_NONE;
  huart3.Init.Mode = UART_MODE_TX_RX;
  huart3.Init.HwFlowCtl = UART_HWCONTROL_NONE;
  huart3.Init.OverSampling = UART_OVERSAMPLING_16;
  huart3.Init.OneBitSampling = UART_ONE_BIT_SAMPLE_DISABLE;
  huart3.Init.ClockPrescaler = UART_PRESCALER_DIV1;
  huart3.AdvancedInit.AdvFeatureInit = UART_ADVFEATURE_NO_INIT;
  if (HAL_UART_Init(&huart3) != HAL_OK)
  {
    Error_Handler();
  }
  if (HAL_UARTEx_SetTxFifoThreshold(&huart3, UART_TXFIFO_THRESHOLD_1_8) != HAL_OK)
  {
    Error_Handler();
  }
  if (HAL_UARTEx_SetRxFifoThreshold(&huart3, UART_RXFIFO_THRESHOLD_1_8) != HAL_OK)
  {
    Error_Handler();
  }
  if (HAL_UARTEx_EnableFifoMode(&huart3) != HAL_OK)
  {
    Error_Handler();
  }
}
 
static void MX_DMA_Init(void)
{
  /* DMA controller clock enable */
  __HAL_RCC_DMA1_CLK_ENABLE();
 
  /* DMA interrupt init */
  /* DMA1_Stream0_IRQn interrupt configuration */
  HAL_NVIC_SetPriority(DMA1_Stream0_IRQn, 0, 0);
  HAL_NVIC_EnableIRQ(DMA1_Stream0_IRQn);
}

It's a bit frustrating because, conceptually, I've worked with ring buffers before and it wasn't a big issue (the concept is fairly straightforward), but I can't get the chip to receive bytes with anything other than a "normal" buffer that takes in one byte at a time. So far, every attempt to get a circular buffer working has resulted in nothing being received apart from the first byte, and no interrupts are triggered. The latter doesn't bother me as much, since I was going to do most of the buffer management from main, but I expected it to at least trigger something when it receives some data.

I'm kind of at a loss to know where to look at this point.

Yeah, that makes sense. I'm having trouble getting the DMA to recognize incoming data though. I'm less worried about managing the ring buffer (I've done that before); I'm just having issues getting the DMA to recognize incoming data with a circular buffer.

I have code/additional updates in the post above. I had found another post that seemed to point towards the issue being with the DMA on this chip series specifically, which isn't great. Not sure if that's the core of the issue or if I'm doing something wrong in the initial setup.

Ok, I did find one bit of info. I'm not sure it solves the DMA issue, but it did give me a clue as to why the head wasn't being incremented in the interrupt.

It looks like calling HAL_UART_Receive_DMA() again returns a busy error rather than restarting the DMA for additional data. I would've assumed this is because you don't need to call this function again once the DMA has begun (and that might still be the case), but new data still isn't being written to the buffer as it comes in (apart from that first byte). Even if I send 3 bytes in a row, it seems to only want to write the first one to position 0.

Edit: To be clear, if I call HAL_UART_Receive_DMA(&huart3, UART_in_Buffer, 3), it will run the interrupt after it receives 3 bytes, but the buffer only shows the third byte. It's essentially writing over the same position each time without incrementing the memory address as I would've expected (which should be the point of the circular buffer, rather than manually managing it as I was with a "normal" buffer). After that it never receives again because the HAL still sees the UART as busy. Likely two separate issues, but I could be wrong.