2025-05-15 12:39 AM
Hi,
Printf debugging is a useful tool for software development, and it is also a simple way of providing logging using a terminal program like TeraTerm. However, using printf with a blocking UART driver is very slow at ordinary baud rates such as 115200. For example, a message of 29 characters including a CR and LF takes about 2.5 ms of UART time, and if any floating point variables are output the sprintf time can be anywhere from 20 to 150 us.
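As a rough cross-check of the wire time, here is a small sketch. It assumes 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits per character); the function name is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Time on the wire for n_chars characters, in microseconds.
   Assumption: 8N1 framing, i.e. 10 bits per character. */
double uart_tx_time_us(unsigned n_chars, unsigned baud)
{
    const double bits_per_char = 10.0;
    assert(baud != 0u);
    return (double)n_chars * bits_per_char * 1.0e6 / (double)baud;
}

/* Convenience wrapper that prints the result for a given message length. */
void print_tx_time(unsigned n_chars, unsigned baud)
{
    printf("%u chars at %u baud: %.0f us\n",
           n_chars, baud, uart_tx_time_us(n_chars, baud));
}
```

Calling print_tx_time(29u, 115200u) gives roughly 2.5 ms for the 29 character message, which is time a blocking driver spends waiting on the UART.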
If you want to debug a loop that runs at 50 Hz with a slack time of 10% (2 ms in each 20 ms cycle), the standard blocking printf to UART function is too slow. To get around this problem I wrote a non-blocking DMA function called printfdma that takes the format string and variables, does the formatting and puts the resulting string on a FreeRTOS queue. A separate lower priority task sends the characters to the UART using DMA.
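The format-then-queue idea can be sketched on a PC as follows. This is not the attached code: the ring buffer is a stand-in for the FreeRTOS queue, and the names printf_queued, print_task_step, MSG_MAX and QUEUE_LEN are hypothetical. On target the queue would presumably be filled with xQueueSend and drained with xQueueReceive, and the consumer would start a DMA transfer instead of copying to a buffer. The stand-in is also not thread safe, which a real RTOS queue takes care of.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MSG_MAX   64   /* hypothetical max formatted length, including NUL */
#define QUEUE_LEN 8    /* hypothetical queue depth */

typedef struct { char text[MSG_MAX]; } msg_t;

/* Stand-in for the FreeRTOS queue: a fixed ring buffer of messages. */
static msg_t    queue[QUEUE_LEN];
static unsigned head, tail, count;

/* Producer side: format into a local buffer, then copy it onto the queue.
   Returns 0 on success, -1 if the queue is full (message dropped, so the
   caller never blocks). */
int printf_queued(const char *fmt, ...)
{
    msg_t m;
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(m.text, sizeof m.text, fmt, ap);  /* truncates if too long */
    va_end(ap);

    if (count == QUEUE_LEN) {
        return -1;
    }
    queue[head] = m;
    head = (head + 1u) % QUEUE_LEN;
    count++;
    return 0;
}

/* Consumer side: one step of the low priority print task. Copies the
   oldest message to out; on target this is where DMA would be started.
   Returns 1 if a message was taken, 0 if the queue was empty. */
int print_task_step(char *out, size_t out_size)
{
    if (count == 0u) {
        return 0;
    }
    snprintf(out, out_size, "%s", queue[tail].text);
    tail = (tail + 1u) % QUEUE_LEN;
    count--;
    return 1;
}
```

The caller only pays for the formatting and one copy; the slow part (moving bytes out of the UART) happens later in the consumer.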
Instead of using the Berkeley stdio.h for the printf and sprintf functions I used the implementation developed by
Eyal Rozenberg <eyalroz1@gmx.com> (2021-2024, Haifa, Palestine/Israel). You can get it on GitHub here:
https://github.com/eyalroz/printf/graphs/contributors
The advantage of this version is that it is thread safe (re-entrant), it doesn't use the heap and it has some options that make it more suitable for
embedded systems. It is not fully compliant with the standard C library, but it works for me.
I did some comparative testing against the Berkeley printf/sprintf functions and found the following:
Three different number formats were used: scientific, float to 6 decimals and float to 3 decimals. I tried the Rozenberg version with double format on and off,
so there are two data sets for this sprintf function. Of course the maximum time to do an sprintf is the limiting factor in terms of speed, and the results show that the Rozenberg
version is considerably faster than stdio for the float with three decimals case. Scientific notation is very slow, with a maximum time of 173 us for stdio and 149 us for Rozenberg.
If you need speed, go for plain floating point format and keep the precision as small as practical.
The printfdma function is self-contained in two files, a code file and a header. These should be added to your project.
They contain all of the functions necessary, including the DMA interrupt config and interrupt request handlers.
Depending on the processor and board you use, you will have to set up the UART you intend to use. In this example I have ported it to
UART 2 on an STM32G431 Nucleo board. You will also have to set the DMA channels, pins and other paraphernalia for other boards.
The specific board I used was the NUCLEO-G431KB.
The system also has a function to retrieve date/time info from the RTC, but there is a lot of overhead with this, so if you need
order of occurrence data for each printfdma call, use a suitably configured counter and output that as an integer.
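A minimal sketch of the counter idea, runnable on a PC. The names seq_counter, next_seq and format_with_seq are hypothetical and not part of the attached files. If several tasks print, the increment would need to be made atomic or wrapped in a critical section.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sequence counter; wraps back to 0 after 2^32 - 1. */
static uint32_t seq_counter;

/* Returns the current sequence number and advances the counter. */
uint32_t next_seq(void)
{
    return seq_counter++;
}

/* Prefixes text with the next sequence number, e.g. "0 start\r\n".
   Returns the snprintf result (length that was or would be written). */
int format_with_seq(char *buf, size_t size, const char *text)
{
    return snprintf(buf, size, "%lu %s\r\n",
                    (unsigned long)next_seq(), text);
}
```

With printfdma the same thing could be done by passing the counter value as an ordinary integer argument in the format string.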
Because the system requires a FreeRTOS queue, you cannot use the printfdma function until the printf task is running under the scheduler.
Usage:
#if(PRINTF_DMA_UART1_ON == 1)
#include "printfUart1_DMA_Driver.h"
#else
#include "usart.h"
#endif
int main(void)
{
    /* Reset of all peripherals, Initializes the Flash interface and the Systick. */
    HAL_Init();
    /* Configure the system clock */
    SystemClock_Config();
    /* Initialize all configured peripherals */
    MX_GPIO_Init();
#if(PRINTF_DMA_UART1_ON == 1)
    MX_USART2_UART_Init();
    /* Initialisation of printfdma system uart, task, mutex etc */
    initPrintf_DMA_Uart2(PRIO_PRINTF_DMA);
#else
    /* Use the ordinary blocking function */
    printf("Initialisation Done Standard Blocking printf\r\n");
    printVersion();
#endif
    /* Init scheduler */
    osKernelInitialize();
    /* Call init function for freertos objects (in cmsis_os2.c) */
    MX_FREERTOS_Init();
    /* Start scheduler */
    osKernelStart();
    /* Infinite loop */
    /* USER CODE BEGIN WHILE */
    while (1)
    {
    }
}
To call the function you simply write:
printfdma("Float Num Value = %01.6f\r\n", pi);
I used my scope to get some timing:
The top two traces are the output of the UART with a decode. The UART is set to 115200 baud.
The bottom trace is the time it takes for the printfdma function to run sprintf and place the
data in the queue.
As you can see, the printfdma function takes a maximum time of about 75 us, with the transmit time
being about 2.5 ms. If you print integers or simple short literal strings the time is much shorter, probably less than 20 us.
The system could be made much faster by offloading the sprintf conversion to the lower priority printing task
by passing a pointer to the print string into the queue. This has a downside in that the print data would have to be
valid for as long as it takes to get printed by the print task. I might have a crack at this one day if I get the time.
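To illustrate the idea and the lifetime issue, here is a PC sketch of deferred formatting. It is only one possible shape, not something in the attached project: the request carries a pointer to the format string plus a single value copied at call time, and all names (print_req_t, printf_deferred, print_task_format, REQ_LEN) are hypothetical. The ring buffer again stands in for the FreeRTOS queue.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One queued request: a format string pointer plus one copied value. */
typedef struct {
    const char *fmt;    /* must stay valid until printed, e.g. a literal */
    double      value;  /* copied, so the caller's variable may change   */
} print_req_t;

#define REQ_LEN 8       /* hypothetical queue depth */

static print_req_t reqs[REQ_LEN];
static unsigned    req_head, req_tail, req_count;

/* Caller side: no formatting here, just store the pointer and the value.
   Returns 0 on success, -1 if the queue is full. */
int printf_deferred(const char *fmt, double value)
{
    if (req_count == REQ_LEN) {
        return -1;
    }
    reqs[req_head].fmt   = fmt;
    reqs[req_head].value = value;
    req_head = (req_head + 1u) % REQ_LEN;
    req_count++;
    return 0;
}

/* Print task side: the expensive conversion happens here, at low priority.
   Returns 1 if a request was formatted into out, 0 if the queue was empty. */
int print_task_format(char *out, size_t out_size)
{
    if (req_count == 0u) {
        return 0;
    }
    snprintf(out, out_size, reqs[req_tail].fmt, reqs[req_tail].value);
    req_tail = (req_tail + 1u) % REQ_LEN;
    req_count--;
    return 1;
}
```

Copying the value sidesteps the lifetime problem for simple numeric arguments, but the format string, and any string argument passed by pointer, would still have to remain valid until the print task gets to it.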
I have used this system on three different processors, two H7s and this one.
The complete project for the Nucleo is attached.
Enjoy