2022-12-08 11:47 AM
Hi,
For the last couple of weeks I have been trying to debug a rendering problem with eglfs that I believe is a VSYNC/tearing issue. For testing I am using a Bootlin demo Buildroot image with eglfs support.
The image has Qt 5.15, EGLFS and EGL support (a more detailed description of the image specifications is under: st_stm32mp135f_dk_demo). Most of the Qt examples have apparent vsync (tearing) issues like the one in this video. Weirdly enough, the pure OpenGL examples like hellowindow work perfectly fine.
I attempted to gather more information about the issue in a few ways:
I tried setting QT_QPA_EGLFS_ALWAYS_SET_MODE=1 and QT_QPA_EGLFS_KMS_ATOMIC=1, with no results. I also created a KMS config file. What seemed weird to me is that the FORCEVSYNC and SWAPINTERVAL variables do not seem to affect the image, even though I can see in the EGLFS_DEBUG output that the variables have been applied.
I tested 4 different displays with DSI and HDMI bridges. On all displays the issue is the same, but the lower the resolution, the less frequent the issue is. For example, on the 480x800 DK2 display tearing appears once every couple of seconds, while on an HDMI-connected 1080p display the issue is continuous.
I had an old qt-image example from 2020 that used to work with no issues. With that 2020 Linux 5.10 eglfs distribution, all 4 displays work with no issues.
Then I tried the latest qt-example image, and although the issue was not identical, the tearing was still there.
Lastly, I tried the demo Weston image, which seems to work just fine with a few dropped frames here and there.
Going back to Buildroot, I tried 79MHz and 84.85MHz configurations with no luck.
The issue is the same in both Linux 5.10 and 5.15.
The modetest vsync test works with no issues on all displays.
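For anyone wanting to reproduce the setup, the environment variables and KMS config file mentioned above can be prepared roughly as follows. This is only a sketch: the device path /dev/dri/card0, the "preferred" mode and the helper names are assumptions for illustration, and the output name DSI1 is taken from the log further down. Disabling the hardware cursor is a guess based on the "Failed to move cursor" messages, not a confirmed fix.

```python
import json
import os


def write_kms_config(path, device="/dev/dri/card0", output="DSI1"):
    """Write a minimal eglfs KMS config file and return its contents.

    The device path and output name are assumptions; adjust to the board.
    """
    config = {
        "device": device,
        "hwcursor": False,  # guess: avoids the hardware-cursor code path
        "outputs": [{"name": output, "mode": "preferred"}],
    }
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return config


def build_eglfs_env(kms_config_path, swap_interval=1):
    """Return a copy of the environment with the eglfs variables from the post."""
    env = dict(os.environ)
    env.update({
        "QT_QPA_PLATFORM": "eglfs",
        "QT_QPA_EGLFS_KMS_CONFIG": kms_config_path,
        "QT_QPA_EGLFS_KMS_ATOMIC": "1",
        "QT_QPA_EGLFS_ALWAYS_SET_MODE": "1",
        "QT_QPA_EGLFS_FORCEVSYNC": "1",
        "QT_QPA_EGLFS_SWAPINTERVAL": str(swap_interval),
        "QT_QPA_EGLFS_DEBUG": "1",
    })
    return env
```

The returned dictionary can be passed as env= to subprocess.run() when starting one of the Qt examples, which keeps each test run's settings explicit instead of relying on whatever was exported in the shell.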
Attached are some of the EGLFS and modetest logs:
QML debugging is enabled. Only use this in a safe environment.
Failed to move cursor on screen DSI1: -14
Failed to move cursor on screen DSI1: -14
Shmem has been moved to thread
shmemThread has started
Created context for format QSurfaceFormat(version 2.0, options QFlags<QSurfaceFormat::FormatOption>(), depthBufferSize 24, redBufferSize 8, greenBufferSize 8, blueBufferSize 8, alphaBufferSize -1, stencilBufferSize 8, samples -1, swapBehavior QSurfaceFormat::DoubleBuffer, swapInterval 1, colorSpace QSurfaceFormat::DefaultColorSpace, profile QSurfaceFormat::NoProfile) with config:
EGL_BUFFER_SIZE: 24
EGL_ALPHA_SIZE: 0
EGL_BLUE_SIZE: 8
EGL_GREEN_SIZE: 8
EGL_RED_SIZE: 8
EGL_DEPTH_SIZE: 24
EGL_STENCIL_SIZE: 8
EGL_CONFIG_CAVEAT: 12344
EGL_CONFIG_ID: 15
EGL_LEVEL: 0
EGL_MAX_PBUFFER_HEIGHT: 8064
EGL_MAX_PBUFFER_PIXELS: 65028096
EGL_MAX_PBUFFER_WIDTH: 8064
EGL_NATIVE_RENDERABLE: 1
EGL_NATIVE_VISUAL_ID: 875713112
EGL_NATIVE_VISUAL_TYPE: 32
EGL_SAMPLES: 0
EGL_SAMPLE_BUFFERS: 0
EGL_SURFACE_TYPE: 1287
EGL_TRANSPARENT_TYPE: 12344
EGL_TRANSPARENT_BLUE_VALUE: -1
EGL_TRANSPARENT_GREEN_VALUE: -1
EGL_TRANSPARENT_RED_VALUE: -1
EGL_BIND_TO_TEXTURE_RGB: 1
EGL_BIND_TO_TEXTURE_RGBA: 1
EGL_MIN_SWAP_INTERVAL: 1
EGL_MAX_SWAP_INTERVAL: 60
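Two of the raw integers in the EGL config dump above are easier to read once decoded. The sketch below assumes that, as is usual with a GBM/KMS backend, EGL_NATIVE_VISUAL_ID carries a DRM FourCC pixel format code; the helper name is made up for illustration.

```python
import struct

EGL_NONE = 0x3038  # value of EGL_NONE in the Khronos EGL headers


def fourcc_to_str(value):
    """Unpack a 32-bit little-endian FourCC code into its 4-character name."""
    return struct.pack("<I", value).decode("ascii")


# EGL_NATIVE_VISUAL_ID from the log
print(fourcc_to_str(875713112))  # XR24

# EGL_CONFIG_CAVEAT and EGL_TRANSPARENT_TYPE from the log
print(12344 == EGL_NONE)  # True
```

XR24 is the FourCC for DRM_FORMAT_XRGB8888, which is consistent with the 24-bit buffer size and zero alpha size in the dump, and 12344 is simply EGL_NONE, so the config has no caveat and no transparency.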
4 33 connected DSI-1 309x83 1 33
modes:
index name refresh (Hz) hdisp hss hse htot vdisp vss vse vtot
#0 1920x515 48.08 1920 1980 2000 2080 515 525 560 565 56500 flags: nhsync, nvsync; type: preferred, driver
props:
1 EDID:
flags: immutable blob
blobs:
value:
2 DPMS:
flags: enum
enums: On=0 Standby=1 Suspend=2 Off=3
value: 0
5 link-status:
flags: enum
enums: Good=0 Bad=1
value: 0
6 non-desktop:
flags: immutable range
values: 0 1
value: 0
4 TILE:
flags: immutable blob
blobs:
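As a sanity check on the mode line in the modetest output above, the reported refresh rate can be recomputed from the timings. This sketch assumes the trailing 56500 is the pixel clock in kHz, which is what modetest normally prints after vtot; the function names are illustrative.

```python
def refresh_hz(pixel_clock_khz, htotal, vtotal):
    """Refresh rate in Hz: pixel clock divided by pixels per full frame."""
    return pixel_clock_khz * 1000 / (htotal * vtotal)


def frame_period_ms(pixel_clock_khz, htotal, vtotal):
    """Time budget for one frame in milliseconds."""
    return htotal * vtotal / pixel_clock_khz


# Values from mode #0 (1920x515): clock 56500 kHz, htot 2080, vtot 565
print(f"{refresh_hz(56500, 2080, 565):.2f} Hz")       # 48.08 Hz
print(f"{frame_period_ms(56500, 2080, 565):.1f} ms")  # 20.8 ms
```

The result matches the 48.08 Hz that modetest reports, so on this panel each frame has to be rendered and flipped within about 20.8 ms to avoid a missed vblank.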
I am sure that there are still a lot of places to look, but I am starting to run out of ideas. Any help or tips are highly appreciated.
All the best,
Kyr
2022-12-11 08:21 AM
Hello @KChar.1,
Thank you for your very thorough feedback and investigation.
I ran some Qt demos myself and observed the same kind of issue on DV 4.0 as you did.
It is highly likely that the root cause comes from the gc-nano (GPU) delivery in DV 4.0. It looks like a regression, which we are investigating.
That would explain why changing only your kernel does not solve the issue, because gc-nano is built out of tree.
Would it be possible for you to test with an older version of gc-nano?
Kind regards,
Erwan.
PS: Can you confirm that you are working on the stm32mp157f-dk2?
2022-12-11 10:16 AM
Hi @Erwan SZYMANSKI ,
Thank you very much for looking into it! Indeed, it is a 157f-dk2. I came to the same assumption on Friday. I used the dev Bootlin branch for Buildroot, which improved the issue significantly.
Then I checked the differences between the dev and main branches: besides migrating to Linux 5.15-r2, the gcnano package for galcore was 6.4.9. With this combination vsync works as expected, but there are still a few dropped frames.
I compared the following images trying to pinpoint the issue in galcore:
5.15-r1 with Gcnano 6.4.7. Vsync seems completely unresponsive.
5.15-r2 with Gcnano 6.4.9. Vsync seems responsive but with minor dropped frames.
5.10-r2 with Gcnano 6.4.3. Works with no issues.
I am going to take a deeper look tomorrow and try to decide what would be a good combination for the project.
I will keep you posted on any further findings.
Please let me know if you have any other findings regarding the possible regression issue.
All the best,
Kyr
2022-12-12 09:13 AM
Hi @Erwan SZYMANSKI,
I tried a few more configurations today based on your suggestion.
The only configuration in which the issue is completely invisible is:
5.10-r2 with Gcnano 6.4.3.
I attempted to build an image based on 5.15-r2 with Gcnano 6.4.3, but I realised that gcnano versions are not backwards compatible, although they are built out-of-tree.
Let me know if I am missing something here. I was also wondering if there is any documentation regarding which Galcore versions are compatible with which kernels.
Since I cannot use Gcnano 6.4.3 with 5.15-r2, I am planning to proceed with 5.15-r2 and Gcnano 6.4.9, where the issue is less evident, and wait until the issue is resolved.
Thank you very much for your help!
All the best,
Kyr
2022-12-13 06:41 AM
+11
2022-12-13 06:52 AM
+1