
Rendering time is abysmal

jchernus-fikst
Associate III

Hi again, (posting twice today)

I am looking at the performance of my STM32H747 system to better understand it. I didn't think there were any problems because I am not visually seeing any artifacts, but I always like to check. As it turns out, my system's rendering times are abysmal.

I have many different screens in my application, and for this test the system cycles from one to another every 3 seconds. I used the Performance Measurement pins to measure VSYNC_FREQ, RENDER_TIME, FRAME_RATE, and MCU_ACTIVE. 

VSYNC looks healthy, measured at 62.352 Hz (set to ~62.4 Hz).

The MCU appears to be barely active: there are frequent blips of 10 ns or less.

Frame rate is a consistent 3 seconds (I don't know if that's normal).

But the rendering time is abysmal! The screens are 480x320 and some take hundreds of milliseconds to render when switching screens.

This is the system as it's switching screens:

jchernusfikst_3-1743021749953.png

This is the system when there are no updates to render:

jchernusfikst_2-1743021473810.png

The simplest screen takes 25ms to render and it contains a 480x320 background image. The most complicated screen takes 218ms to render and it contains a 480x320 background image, 8 text areas (some contain wildcards), 3 flex buttons, and a few icons, (and one hidden background image that's not being used).

I sense that I am doing something very wrong, so I wanted to ask if there are common places to look.

Things that come to mind for me:

  • speed of writing or reading from RAM (I'm presently using a burst length of 1, and I can put in the work to get this up to 8),
  • caching the background images - sometimes the backgrounds don't even change between screens,
  • I have additional images on each screen, but they're hidden using the left-hand panel - these shouldn't be affecting the render time, right?
  • I am using vector fonts (and a custom font at that - Roboto Regular and Roboto SemiBold),

Thanks!

Julia

 

 

6 REPLIES
GaetanGodart
ST Employee

Hello @jchernus-fikst ,

 

 

Here is a typical measurement from a random board with a regular UI:

GaetanGodart_0-1743068968329.png

Here is the example of measurement that you find on our documentation:

GaetanGodart_1-1743069011663.png

 

As you can see, they are very similar to each other, unlike your measurement.

The VSYNC gets high about 60 times per second (you do that right).
The render time starts when the VSYNC gets high (you do that right).
The framerate is the same as the VSYNC (not the case for you).
MCU_ACTIVE starts when VSYNC gets high, i.e. we start rendering as soon as possible (this is not the case for you, perhaps your MCU is busy with something else).

Your framerate should be 60 per second like your VSYNC.
What happens if you try to change screen every second? Are you able to do that or did you set up your board to only be able to render every 3 seconds (I doubt that because there is a render time peak every 16ms)?

Your render time peaks every 16 ms, showing that TouchGFX wants to render something every 16 ms, but each render is extremely fast. Perhaps there is nothing to render, as MCU_ACTIVE does not even peak every time.

Regarding the screenshot during the transition, we can see that the render starts (and lasts 200+ ms), and once it is finished it does not transfer directly; it waits for the next VSYNC to start a render and then transfer (because FrameRate changes). This is normal: we wait for VSYNC before transferring, so frames are skipped if the rendering takes too long.

However, we see that when the rendering finishes, the MCU is not even active, so there is definitely something wrong.
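
To put numbers on the frame skipping, here is a minimal sketch of a simplified model. It assumes double buffering with the new frame shown at the first VSYNC after rendering completes. The function names are made up for illustration and are not part of TouchGFX.

```cpp
#include <cmath>

// Simplified model (an assumption, not TouchGFX code): a finished frame is
// shown at the first VSYNC after rendering completes, so every frame
// occupies a whole number of VSYNC periods.
int vsyncPeriodsPerFrame(double renderTimeMs, double vsyncHz)
{
    const double vsyncPeriodMs = 1000.0 / vsyncHz;
    const int periods = static_cast<int>(std::ceil(renderTimeMs / vsyncPeriodMs));
    return periods < 1 ? 1 : periods; // even an instant render waits for one VSYNC
}

// Rate at which new frames reach the display under the same model.
double effectiveFrameRateHz(double renderTimeMs, double vsyncHz)
{
    return vsyncHz / vsyncPeriodsPerFrame(renderTimeMs, vsyncHz);
}
```

With the 62.4 Hz VSYNC from the first post, a 218 ms render spans 14 VSYNC periods (roughly 4.5 new frames per second), while a 25 ms render spans 2 periods (31.2 frames per second).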

 

You say that it takes 25 ms to render a simple image of 480*320*2 bytes. That is far too long. Perhaps you are right that this is a memory limitation, if this asset is in external memory.

Can you test putting this asset in internal memory to see if it renders faster?
You can also try making a screen with just boxes, circles, lines and other widgets that do not need to be stored in memory (this way asset memory is not used, and only the rendering itself is the limiting factor).

 

It is recommended to disable DCache until you have a stable project.

 

If you have an image that is hidden by some other element(s), the hidden image will not impact the render time as long as the elements on top do not have an alpha value.
So if everything on top is RGB565 it will be fine, but if an element on top is ARGB8888 then we will do blit operations.

 

Vector fonts do increase render time, but your render time is already abnormal for a simple image (25 ms).
Note that the texture mapper and VSG (including vector fonts, though I think it depends on the size) take quite some time to render.

 

Can you check the priority of your tasks? Perhaps there is something that has a way higher priority than TouchGFX and is taking all the RTOS's resources.

 

I hope you can investigate and try to change screen every tick (60 times per second) with a simple UI.

 

 

Regards,

Gaetan Godart
Software engineer at ST (TouchGFX)
jchernus-fikst
Associate III

Thank you so much, @GaetanGodart, for your lengthy reply. I have finally found some time to work on this task. Here is what I've found, along with a few answers to your questions:

  1. TouchGFX Task is the highest priority task (3) on the core that it's running in (M7). There is a priority 4 task and some priority 5 interrupts.
  2. D-Cache and I-Cache are both disabled.
  3. I have turned off vector font rendering, as we don't actually need it.

Experiment 1 - Images in Internal Flash

I reduced the GUI to two screens with one image each (240 x 320, as I can't fit two images of size 480 x 320). I placed these images into internal flash. I have a full-size invisible flex button covering each screen which I use to change screens.

jchernusfikst_3-1743545000732.png

Other very wonky things happen, but the render time is reduced to 2 ms when using double framebuffers. Each of the captures below shows one change of screen. I included two because the FRAME_RATE behaviour toggles, and I wanted to show both.

jchernusfikst_1-1743543512118.png

jchernusfikst_0-1743543452780.png

Wonky things that happen:

  • VSYNC_FREQ experiences a glitch - you'll see a discontinuity in its cycle
  • I would expect every RENDER_TIME pulse to be 2 ms, but only three of them are; the rest are negligible (12 µs)
  • FRAME_RATE is still behaving incorrectly
  • MCU is still never active

Experiment 2 - Image in external vs internal flash

For this one, I have one screen with no images and one invisible flex button (that I use to change screens). And the second screen has an image of a turtle (stored in internal RAM) and the same invisible flex button.

jchernusfikst_5-1743545483001.png

The capture below is for moving from the blank screen to the turtle screen.

jchernusfikst_7-1743545620163.png

The capture below is for moving from the turtle screen to the blank screen.

jchernusfikst_8-1743545647810.png

Experiment 3 - Transitions

I used the same screens as experiment two, but used a block transition every second:

jchernusfikst_9-1743546032724.png

And here with a wipe transition:

jchernusfikst_10-1743546284437.png

Experiment 4 - Shapes instead of images

Per your recommendation, I tried adding shapes instead of images to see how this affects the system.

jchernusfikst_1-1743544912676.png

The capture is seen below, with 12ms of render time on the double framebuffer setup.

jchernusfikst_0-1743544867250.png

I feel that something is drastically wrong with the setup of our project, but I don't know if I should be looking at hardware or software for the culprit. Would love to hear any advice you have! 

Thanks,

Julia

@GaetanGodart any thoughts on the matter?

Hello,

TouchGFX will only render areas of the screen that have actually changed since the last render. This is the reason that your render time is 0 for many frames. If you want to continuously render a screen for testing purposes, you can create an interaction on it as shown in my screenshot.

invalidate.png
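
The idea of rendering only what has changed can be illustrated with a toy model. The sketch below is not the TouchGFX API; `Rect`, `ToyScreen`, `invalidate` and `renderFrame` are hypothetical names that only mimic the behaviour described above, and the model keeps a single dirty region for simplicity.

```cpp
// Toy model of dirty-region rendering. All names here are hypothetical
// stand-ins; this is not the TouchGFX API.
struct Rect
{
    int x;
    int y;
    int width;
    int height;

    int area() const { return width * height; }
};

class ToyScreen
{
public:
    // Mark a region as changed so that the next frame redraws it.
    // Simplification: only the most recent region is remembered.
    void invalidate(const Rect& region)
    {
        dirty_ = region;
        hasDirty_ = true;
    }

    // Called once per VSYNC. Returns the number of pixels drawn, which is
    // zero when nothing was invalidated since the previous frame.
    int renderFrame()
    {
        if (!hasDirty_)
        {
            return 0;
        }
        hasDirty_ = false;
        return dirty_.area();
    }

private:
    Rect dirty_{0, 0, 0, 0};
    bool hasDirty_ = false;
};
```

In this model a frame with no invalidated area draws nothing, which corresponds to the near-zero render times in the captures. Invalidating the full 480x320 area before every frame forces a full redraw each time, which is what the interaction in the screenshot is meant to achieve.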

The frame rate signal is a representation of how often a newly rendered frame is shown on the screen (for double buffering, how often the buffers are swapped). It will toggle when a render starts and toggle again at the VSYNC after the render finishes. For LTDC screens without double buffering you will experience tearing if the render time (half period) is longer than the VSYNC period (full period); with double buffering it will lead to lower frame rates, so choppy animations and so on.

It looks a lot like a memory bottleneck on either your flash or your RAM.

Can you elaborate on your external memory setup?

Where are your frame buffers placed?

Have you enabled DMA2D and/or GPU2D for TouchGFX?

I believe it should be fine to enable the I-Cache.

jchernus-fikst
Associate III

Hi @mathiasmarkussen, thank you for your reply!

Your colleague stated that Frame Rate should match Vsync, and that something seems very wrong about the MCU never being used - this is why I've been trying to investigate what is wrong with our system.

What you are saying makes sense to me: the point of invalidating objects in TouchGFX is so that you don't re-render everything on every frame.

We have sped our system up, so nothing is taking 200+ ms to render, but we are still seeing some take upwards of 20ms.

Can you please confirm: aside from the rendering taking too long, does this look like healthy performance from our system? I am changing the 320 x 480 screen twice during this time - each screen has a full-size background image as well as other image-based assets (buttons, non-vector fonts, images) on top.

jchernusfikst_0-1744388940416.png

Here is a zoomed-in capture of the first screen change, you'll see that Vsync gets stretched:

jchernusfikst_1-1744389180204.png

You'll also notice that there are several long renders, which I didn't expect for one screen change.

If you say this looks healthy, I'll speed our QSPI flash up a little bit more and call it done!

Thanks again for your help,

Julia

PS: Replies to your questions:

Can you elaborate on your external memory setup? QSPI Flash @ 100MHz, FMC (16-bit) SDRAM @ 100MHz and we are not using burst writes (should we be?)

 

jchernusfikst_3-1744389429495.png

Where are your frame buffers placed? SDRAM, like so:

jchernusfikst_2-1744389329744.png

Have you enabled DMA2D and/or GPU2D for TouchGFX? Yes, see above

Marc_LM
Associate III

I might know why the MCU active pin is never toggled.

Check for "MCU_ACTIVE" or "touchgfx::HAL::getInstance()->setMCUActive".
For me, I am using FreeRTOS and it toggles the pin on every IdleTaskHook().

I recommend checking Designer's Demo9 with any of the dev kits you have.
It should contain this macro.
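
A minimal sketch of that pattern, written so it compiles on a desktop: `FakeHal` is a hypothetical stand-in for `touchgfx::HAL` that models only `setMCUActive()`, and the hook's argument convention (non-zero when the idle task is switched out, zero when it is switched in) is an assumption to verify against your own FreeRTOS configuration.

```cpp
// Hypothetical stand-in for touchgfx::HAL so that the sketch builds without
// TouchGFX. On target, the real HAL singleton would be used instead.
class FakeHal
{
public:
    static FakeHal& getInstance()
    {
        static FakeHal instance;
        return instance;
    }

    // Record whether the MCU is busy, counting every change of state.
    void setMCUActive(bool active)
    {
        if (active != active_)
        {
            ++toggles_;
        }
        active_ = active;
    }

    bool isActive() const { return active_; }
    int toggles() const { return toggles_; }

private:
    bool active_ = true;
    int toggles_ = 0;
};

// Assumed convention (check your port): the RTOS calls this hook with a
// non-zero argument when the idle task is switched out, and with zero when
// the idle task is switched in.
int idleTaskHook(void* p)
{
    const bool idleSwitchedOut = (p != nullptr);
    // The MCU is "active" whenever something other than the idle task runs.
    FakeHal::getInstance().setMCUActive(idleSwitchedOut);
    return 1;
}
```

If nothing in the project ever calls the hook, the active state never changes, which would match an MCU_ACTIVE pin that stays flat.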