Processor Load calculation and (perhaps) display

Bob Bailey · ‎2020-01-11

The emWin demos show a processor load value on the screen. Is there already a method defined for this in the TouchGFX / FreeRTOS configuration? I need to know how "near the edge" I currently am. It will help me judge whether I have enough processor horsepower to complete this project with this processor (F746). It's hard to tell just by exercising the current project whether I am seeing slowdowns, and I don't know how well optimized the build currently is.

Do the TouchGFX screen transitions use the hardware accelerator? This is where I am noticing some slowdown: a slide transition with 2 of the Gauge widgets on one of the screens (with several enhancements like dual needles, etc.).

2 related questions in one thread, thanks all!!

Bob

M0NKA · ‎2020-01-12

Hey,

I don't know much about TouchGFX, but usually it is just two files, cpu_utils.c and cpu_utils.h, in "\Utilities\CPU" of the Cube release. The routines inside hook easily into FreeRTOS and measure how long the OS spends in the IDLE task.

There is a way to use a higher-resolution timer than the SysTick and call the code from there. That way you will get a more precise reading, which may not be a bad idea for tracking down UI repaint issues.
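For illustration, here is a minimal sketch of the idle-time bookkeeping described above. It is not the actual cpu_utils.c code: the class and method names (IdleMonitor, idleSwitchedIn, idleSwitchedOut, closeWindow) are made up for this example, and it assumes you call them from the scheduler's switch-in/switch-out hooks with a free-running timer value. As a simplification, idle time still in progress when the window closes is only counted once the idle task is switched out.

```cpp
#include <cstdint>

// Hypothetical helper (not ST's cpu_utils API): accumulates the time spent
// in the idle task and reports CPU load once per measurement window.
class IdleMonitor
{
public:
    // Call when the scheduler switches the idle task in.
    void idleSwitchedIn(uint32_t now) { idleStart_ = now; }

    // Call when the scheduler switches the idle task out.
    // Unsigned subtraction stays correct across a single counter wrap.
    void idleSwitchedOut(uint32_t now)
    {
        idleTicks_ += static_cast<uint32_t>(now - idleStart_);
    }

    // Call once per window of windowTicks timer ticks.
    // Returns the load in percent and restarts the accumulation.
    uint32_t closeWindow(uint32_t windowTicks)
    {
        uint32_t idle = (idleTicks_ > windowTicks) ? windowTicks : idleTicks_;
        idleTicks_ = 0;
        if (windowTicks == 0)
        {
            return 0;
        }
        // 64-bit intermediate so idle * 100 cannot overflow.
        uint64_t idlePct = (static_cast<uint64_t>(idle) * 100u) / windowTicks;
        return 100u - static_cast<uint32_t>(idlePct);
    }

private:
    uint32_t idleStart_ = 0;
    uint32_t idleTicks_ = 0;
};
```

The finer the timer that feeds `now`, the finer the resolution of the result, which is the point of using something faster than the SysTick.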

BR

Piranha · ‎2020-01-12

__disable_irq();
t1 = DWT->CYCCNT;
__DSB();
__WFI();
t2 = DWT->CYCCNT;
__enable_irq();

Put this in an idle hook function and you can get the cycles spent in sleep mode. Accumulate these for a second, subtract from the system clock value and you'll get the CPU load including context switch times and interrupts.
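A rough sketch of the accumulate-and-subtract step, in case it helps. The struct and function names are hypothetical, and it assumes the cycle counter (DWT->CYCCNT) has already been enabled and that loadPctPerSecond() is called once per second with the core clock frequency in Hz.

```cpp
#include <cstdint>

// Hypothetical accumulator for the cycle counts measured around __WFI().
struct SleepCycleCounter
{
    uint64_t sleepCycles = 0;

    // t1 and t2 are the DWT->CYCCNT values read before and after __WFI().
    // The 32-bit unsigned difference is still correct if CYCCNT wrapped
    // once between the two reads.
    void addSleep(uint32_t t1, uint32_t t2)
    {
        sleepCycles += static_cast<uint32_t>(t2 - t1);
    }

    // Call once per second. The cycles available in one second equal the
    // core clock in Hz, so busy = clock - slept. Returns the load in
    // percent and restarts the accumulation.
    uint32_t loadPctPerSecond(uint32_t coreClockHz)
    {
        uint64_t slept = (sleepCycles > coreClockHz) ? coreClockHz : sleepCycles;
        sleepCycles = 0;
        if (coreClockHz == 0)
        {
            return 0;
        }
        uint64_t busy = coreClockHz - slept;
        return static_cast<uint32_t>((busy * 100u) / coreClockHz);
    }
};
```

With a 216 MHz core clock and 162,000,000 cycles spent sleeping in one second, this reports a load of 25%.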

Martin KJELDSEN · ‎2020-01-13

>on the TouchGFX screen transitions, do they use the hardware accelerator? This is where I am noticing some slowdown.

Generally, yes, they do (if enabled). But even with hardware acceleration you're potentially moving a lot of pixels.

If you want something that is faster, use the cover-transition. Then only towards the very end will you be moving close to all the pixels. With Slide you're always moving all the pixels in the framebuffer.
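To put rough numbers on that, here is a simplified model of the pixels touched per frame for a horizontal transition. This is only a sketch of the reasoning above, not TouchGFX's actual rendering code; the function names and the idea of dividing the transition into equal steps are assumptions made for the example.

```cpp
#include <cstdint>

// Slide: the old and new screens both shift every frame, so the whole
// framebuffer is redrawn regardless of how far the transition has come.
uint32_t slidePixelsPerFrame(uint32_t width, uint32_t height)
{
    return width * height;
}

// Cover: only the part of the new screen that has entered so far needs
// drawing. step runs from 0 to steps over the course of the transition.
uint32_t coverPixelsPerFrame(uint32_t width, uint32_t height,
                             uint32_t step, uint32_t steps)
{
    if (steps == 0)
    {
        return width * height;
    }
    if (step > steps)
    {
        step = steps;
    }
    uint32_t visibleColumns = (width * step) / steps;
    return visibleColumns * height;
}
```

On a 480x272 display, Slide touches 130,560 pixels on every frame, while Cover at a quarter of the way in (step 5 of 20) touches 32,640 and only reaches 130,560 on the final step.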

Bob Bailey · ‎2020-01-13

Thanks, the cover transition does seem a little faster, but it is still noticeably slower on one screen. The slow screen has 2 gauges, each with 2 needles, along with their associated bitmaps. Another screen has 3 sliders and 3 text areas with 2 wildcards each, and it is much faster. What elements slow the transition the most?

I have been trying to tune the application with compiler optimization and other features, but nothing has improved this. My final product will probably be 800x480, so these tests at 480x272 should not have speed issues or I will be in real trouble.

I am looking into the processor load calcs suggested by other community members above, but I have not gotten them implemented yet.
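As a back-of-the-envelope check on why the jump to 800x480 matters, here is a small sketch of the framebuffer traffic involved. The 2 bytes per pixel (RGB565) and the 60 frames per second in the note below are assumptions for illustration, not values taken from my project settings.

```cpp
#include <cstdint>

// Bytes needed for one full frame at the given resolution and pixel size.
constexpr uint64_t frameBytes(uint32_t width, uint32_t height, uint32_t bytesPerPixel)
{
    return static_cast<uint64_t>(width) * height * bytesPerPixel;
}

// Bytes written per second if every frame redraws the whole screen,
// as a full-screen transition would.
constexpr uint64_t fullRedrawBytesPerSecond(uint32_t width, uint32_t height,
                                            uint32_t bytesPerPixel, uint32_t framesPerSecond)
{
    return frameBytes(width, height, bytesPerPixel) * framesPerSecond;
}
```

At 2 bytes per pixel a 480x272 frame is 261,120 bytes and an 800x480 frame is 768,000 bytes, roughly 2.9 times as much. A full redraw at 60 frames per second on the larger panel would mean about 46 MB/s of writes, before counting any reads of source bitmaps.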

Thanks for your help on this.

bob

Bob Bailey · ‎2020-01-13

I have been looking through some of the setup files, and I clearly have more reading to do. I found this in the HAL layer, but I must be missing a step, as I am getting 0 returned.

In one of my tasks I put this in the loop:

CPU_Load = touchgfx::HAL::getInstance()->getMCULoadPct();

Based on the comments in the hpp file, this should give me some sort of CPU load indication, but there must be another step.

In any case, I will continue to dig around and see what other configuration options exist to improve performance. Additionally, it appears the STM32H7 series has more performance (larger internal memory and more than double the clock).

Most people posting seem to be using the STM32F746-DISCO board, probably because it's a cheap way to test out some ideas (this is the case for me). The H7 has some similar boards that seem worth a try. Any feedback from other community members would be great.

In any case, TouchGFX seems very capable as I learn the ins and outs of creating and managing widgets and screens programmatically. I look forward to trying the new release.

Bob

Bob Bailey · ‎2020-01-13

There seems to be processor load calculation functionality built into the Instrumentation files in the HAL. It seems to have lots of comments from Draupner.

I found this thread, with some input from Martin and others, and it appears my issue is related.

In the init function, the clock is enabled for TIM2, but the init is all done for TIM1.

Here is the thread:

https://community.st.com/s/question/0D50X0000B42yHCSQY/getmculoadpct-function

My init function looks like this:

namespace touchgfx
{
static TIM_HandleTypeDef htim1;
 
void STM32F7Instrumentation::init()
{
   RCC_ClkInitTypeDef clkconfig;
    uint32_t uwTimclock, uwAPB1Prescaler = 0U;
    uint32_t pFLatency;
 
    __TIM2_CLK_ENABLE();
 
  TIM_ClockConfigTypeDef sClockSourceConfig = {0};
  TIM_MasterConfigTypeDef sMasterConfig = {0};
  TIM_OC_InitTypeDef sConfigOC = {0};
  TIM_BreakDeadTimeConfigTypeDef sBreakDeadTimeConfig = {0};
  htim1.Instance = TIM1;
  htim1.Init.Prescaler = 0;
  htim1.Init.CounterMode = TIM_COUNTERMODE_UP;
  htim1.Init.Period = 0;
  htim1.Init.ClockDivision = TIM_CLOCKDIVISION_DIV1;
  htim1.Init.RepetitionCounter = 0;
  htim1.Init.AutoReloadPreload = TIM_AUTORELOAD_PRELOAD_DISABLE;
  if (HAL_TIM_Base_Init(&htim1) != HAL_OK)
  {
    Error_Handler( );
  }
 
  sClockSourceConfig.ClockSource = TIM_CLOCKSOURCE_INTERNAL;
  if (HAL_TIM_ConfigClockSource(&htim1, &sClockSourceConfig) != HAL_OK)
  {
    Error_Handler( );
  }
 
  htim1.Instance = TIM1;
  htim1.Init.Prescaler = 0;
  htim1.Init.CounterMode = TIM_COUNTERMODE_UP;
  htim1.Init.Period = 0;
  htim1.Init.ClockDivision = TIM_CLOCKDIVISION_DIV1;
  htim1.Init.RepetitionCounter = 0;
  htim1.Init.AutoReloadPreload = TIM_AUTORELOAD_PRELOAD_DISABLE;
  if (HAL_TIM_PWM_Init(&htim1) != HAL_OK)
  {
    Error_Handler( );
  }
 
  sMasterConfig.MasterOutputTrigger = TIM_TRGO_RESET;
  sMasterConfig.MasterOutputTrigger2 = TIM_TRGO2_RESET;
  sMasterConfig.MasterSlaveMode = TIM_MASTERSLAVEMODE_DISABLE;
  if (HAL_TIMEx_MasterConfigSynchronization(&htim1, &sMasterConfig) != HAL_OK)
  {
    Error_Handler( );
  }
 
  sConfigOC.OCMode = TIM_OCMODE_PWM1;
  sConfigOC.Pulse = 0;
  sConfigOC.OCPolarity = TIM_OCPOLARITY_HIGH;
  sConfigOC.OCNPolarity = TIM_OCNPOLARITY_HIGH;
  sConfigOC.OCFastMode = TIM_OCFAST_DISABLE;
  sConfigOC.OCIdleState = TIM_OCIDLESTATE_RESET;
  sConfigOC.OCNIdleState = TIM_OCNIDLESTATE_RESET;
  if (HAL_TIM_PWM_ConfigChannel(&htim1, &sConfigOC, TIM_CHANNEL_1) != HAL_OK)
  {
    Error_Handler( );
  }
 
  sBreakDeadTimeConfig.OffStateRunMode = TIM_OSSR_DISABLE;
  sBreakDeadTimeConfig.OffStateIDLEMode = TIM_OSSI_DISABLE;
  sBreakDeadTimeConfig.LockLevel = TIM_LOCKLEVEL_OFF;
  sBreakDeadTimeConfig.DeadTime = 0;
  sBreakDeadTimeConfig.BreakState = TIM_BREAK_DISABLE;
  sBreakDeadTimeConfig.BreakPolarity = TIM_BREAKPOLARITY_HIGH;
  sBreakDeadTimeConfig.BreakFilter = 0;
  sBreakDeadTimeConfig.Break2State = TIM_BREAK2_DISABLE;
  sBreakDeadTimeConfig.Break2Polarity = TIM_BREAK2POLARITY_HIGH;
  sBreakDeadTimeConfig.Break2Filter = 0;
  sBreakDeadTimeConfig.AutomaticOutput = TIM_AUTOMATICOUTPUT_DISABLE;
  if (HAL_TIMEx_ConfigBreakDeadTime(&htim1, &sBreakDeadTimeConfig) != HAL_OK)
  {
    Error_Handler( );
  }
 
    /* Get clock configuration */
    HAL_RCC_GetClockConfig(&clkconfig, &pFLatency);
 
    /* TIM2 is on APB1 bus */
    uwAPB1Prescaler = clkconfig.APB1CLKDivider;
 
    if (uwAPB1Prescaler == RCC_HCLK_DIV1)
        uwTimclock = HAL_RCC_GetPCLK1Freq();
    else
        uwTimclock = 2 * HAL_RCC_GetPCLK1Freq();
 
    m_sysclkRatio = HAL_RCC_GetHCLKFreq() / uwTimclock;
 
    HAL_TIM_Base_Start(&htim1);
}
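For reference, the clock arithmetic at the end of that function can be pulled out into a small standalone sketch. The function names here are hypothetical, and it assumes the default timer prescaler setting (TIMPRE cleared), where a timer on an APB bus runs at PCLK when the APB divider is 1 and at 2 x PCLK otherwise.

```cpp
#include <cstdint>

// Timer input clock for a timer on an APB bus, given that bus's PCLK
// frequency and its divider relative to HCLK (1, 2, 4, 8 or 16).
// Assumes the default TIMPRE setting.
uint32_t timerClockHz(uint32_t pclkHz, uint32_t apbDivider)
{
    return (apbDivider == 1u) ? pclkHz : 2u * pclkHz;
}

// Ratio used to convert timer ticks into core clock cycles, as in
// m_sysclkRatio above. Returns 0 if the timer clock is unknown.
uint32_t sysclkRatio(uint32_t hclkHz, uint32_t timerHz)
{
    return (timerHz != 0u) ? hclkHz / timerHz : 0u;
}
```

As an assumed example, with HCLK at 216 MHz and APB1 divided by 4, PCLK1 is 54 MHz, the TIM2 clock is 108 MHz, and the ratio is 2. Note that this calculation is for APB1 (TIM2), so it only makes sense if the timer that is actually started is TIM2 as well.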

Bob Bailey · ‎2020-01-13

Neither the TIM1 nor the TIM2 CNT register is incrementing. I changed the clock init and this is still the case.

I probably need to dig deeper into the timer initializations.

Bob

Bob Bailey · ‎2020-01-15

Martin posted initialization code for TIM2 in the thread I mention above. Using that and setting the processor load function to use TIM2 made the load calculations work. No other changes were required. Thanks to all that replied.

The second question still remains somewhat: what are some things to check to make sure my setup is as optimal as possible? The H745 seems like it's 2x faster just due to the clock speed, but there are other details in the chip that may help as well (memory stuff). The 7B3 demo board is not available yet, so the H745 might be worth checking out. Any experience with the H parts in comparison to the F746? What about SDRAM parameters and the QSPI?

What are some hints to improve rendering speed? I'm using the Gauge widget with dual needles. Does the container widget have more overhead during the paint? The slide transition seems to slow down right near the end, and I really want to use it since it is one of the nicest looking features of TouchGFX.

This may really need to be its own "Question" or discussion thread.

Thanks

Bob