
Crash in TouchGFX draw() function (waiting on framebuffer semaphore)

Michael K
Senior III

I have a page with many textboxes in a swipe container, half of which have wildcards. Occasionally the TouchGFX task will freeze (other RTOS tasks continue as normal). Unfortunately I have not yet found a way to reproduce it, but the symptoms are similar each time.

The UI freezes. When I click pause in STM32CubeIDE, I usually end up in a DMA handler. I step through until the DMA handler completes, and then I land in a TouchGFX text area draw function.

The stack trace is always something like this:

touchgfx::TextAreaWithOneWildcard::draw(touchgfx::Rect const&) const at 0x8057353
touchgfx::Screen::JSMOC() at 0x......
touchgfx::Screen::JSMOC() at 0x......
touchgfx::Screen::JSMOC() at 0x......
<a number of these JSMOC lines...>
touchgfx::Screen::JSMOC() at 0x......
touchgfx::Screen::startSMOC() at 
touchgfx::draw()
touchgfx::Application::draw()
touchgfx::Application::cacheDrawOperations()
touchgfx::HAL::tick()
touchgfx::HAL::backPorchExited() at HAL.hpp:541

The topmost function is always a TextArea, TextAreaWithOneWildcard, or TextAreaWithTwoWildcards draw() function.

Edit 2021-04-06: I was able to reproduce the issue where the draw function called was a member of Line.

Regardless of the exact object the draw function was part of, the program counter in the disassembly always stops at one of these two instructions:

08057353:   add.w   r1, r2, r3, lsl #1
08057357:   ldrb.w  r2, [r2, r3, lsl #1]
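
For readers decoding the two instructions above: in Thumb-2, add.w computes a base register plus a shifted index, and ldrb.w loads a single byte from the same kind of address. A minimal C++ sketch of the equivalent arithmetic follows. The function names are hypothetical and only illustrate the address calculation; this is not the TouchGFX source.

```cpp
#include <cassert>
#include <cstdint>

// Equivalent of: add.w r1, r2, r3, lsl #1
// Returns r2 + (r3 << 1), i.e. base plus index times two.
inline uint32_t addShiftedIndex(uint32_t r2, uint32_t r3)
{
    return r2 + (r3 << 1);
}

// Equivalent of: ldrb.w r2, [r2, r3, lsl #1]
// Loads one byte (zero-extended) from base + (index << 1).
inline uint8_t loadByteAtShiftedIndex(const uint8_t* base, uint32_t index)
{
    assert(base != nullptr); // illustrative guard only
    return base[index << 1];
}
```

Both instructions only read registers and memory, so on their own they do not explain a hang; a stack trace ending there most likely reflects where the task happened to be interrupted.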

When I click step, the program counter doesn't advance to the next instruction; instead, the program resumes execution.

Once, the stack trace said the topmost draw function was at address 0x8057356, but the add.w instruction was at 0x8057353 and the ldrb.w instruction was at 0x8057357.

Please see attached for a screenshot of the CubeIDE during one of these crashes. I am running a custom board, however it is based off of the STM32F769-DISCO application template (either v3.0.1 or v3.0.0).

Thanks for any assistance.

9 REPLIES
Michael K
Senior III

Update: I can reproduce the issue by constantly sliding the swipe container. I guess it increases the draw rate, since the page only normally refreshes itself a few times per second. The crash occurs after around 2-3 minutes of sliding. Perhaps there's a race condition somewhere?

Michael K
Senior III

Update 2: I seem to have found a way to stop the crash, however it is seemingly unrelated to any of my TouchGFX code. When running the TouchGFX task in isolation, the issue does not occur. I suspected a stack overflow issue so I increased the stack size of all my tasks, but that didn't solve the problem. More investigation to be done... but the root cause is probably unrelated to TouchGFX.

@Alexandre RENOUX or @Martin KJELDSEN, perhaps you could share some information about what those library draw functions are actually doing around those addresses, to help me narrow down my issue? I'm using TouchGFX 1.16.1 and code optimizations are off. Could it be a failed assertion?

Hi Michael,

Let me get back to you tomorrow. It's getting a bit late here. I can share some information that may help track down the problem.

/Martin

What strikes me is that you say you're stuck in draw(). That usually means that TouchGFX is trying to take the framebuffer semaphore and it can't. Chromart (DMA2D) also uses this semaphore and releases it when it's done with its queue of operations. It could be something related to this. You could try counting the number of times the DMA2D starts an operation and how many times it tells TouchGFX it's done. It sounds like a priority issue of some kind that shows itself when you have other tasks running.
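
The counting suggested above could be sketched roughly as follows. This is only an illustration under assumptions: Dma2dOpCounters and its members are hypothetical names, not part of TouchGFX or the ST HAL, and where exactly onStart() and onComplete() get called (for example where an operation is handed to the DMA2D and in the transfer-complete path) depends on the generated code in your project.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical diagnostic counters; none of these names exist in TouchGFX.
struct Dma2dOpCounters
{
    std::atomic<uint32_t> started{0};   // bump where an operation is handed to the DMA2D
    std::atomic<uint32_t> completed{0}; // bump where completion is reported back

    void onStart()    { started.fetch_add(1, std::memory_order_relaxed); }
    void onComplete() { completed.fetch_add(1, std::memory_order_relaxed); }

    // Operations that were started but never reported as complete.
    uint32_t outstanding() const
    {
        return started.load(std::memory_order_relaxed) -
               completed.load(std::memory_order_relaxed);
    }
};

// Small host-side check of the bookkeeping.
inline void dma2dCountersSelfCheck()
{
    Dma2dOpCounters c;
    c.onStart();
    c.onStart();
    c.onComplete();
    assert(c.outstanding() == 1);
}
```

If outstanding() stays non-zero after the UI has frozen, a started operation never signalled completion, which would be consistent with the semaphore never being released.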

Thanks for your reply. The code that was "causing" the crash was a section in another task dealing with RTOS Event Flags. Strangely this code has been present for months seemingly without issue.

More info that might help...

1) I commented out the add functions in the Base class and took all the elements out of the screen except for the swipe container and a line. The symptoms seemed to change: instead of stopping in a function, the debugger would just disconnect. I've never had that happen in other circumstances. One time, however, it landed me in STM32DMA, where I was stuck in an infinite loop waiting for a DMA2D register bit to clear. I had recently turned on L8 as the default image format, but when I changed the project back to non-L8 images, the freezing still occurred (and the debugger was disconnecting, so I never figured out where it was freezing).

2) I suspected a priority issue as well, but I elevated the DMA2D priority to the highest setting ahead of all my DMA streams etc. Didn't solve the problem.

Michael K
Senior III

Update 3: I thought that loading the texts from the QSPI may have had something to do with it, but moving the Text and Font flash sections to the internal flash yielded the same results. That said, I did get a hang in a Line Draw() method, so that rules out the Texts or Wildcards as being issues specifically. Here is a recent crash. I set the timeout from osWaitForever to 2000, so that I could catch the exact stack trace that was waiting on the framebuffer.
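
The timeout trick described above (waiting a bounded time instead of osWaitForever so the stuck caller can be caught) can be sketched generically. This is a host-runnable illustration under assumptions: takeWithTimeout and TimedTakeResult are hypothetical names, and the acquire callback stands in for whatever RTOS call the project's OSWrappers uses (on CMSIS-RTOS2 that might be a lambda wrapping osSemaphoreAcquire with a finite timeout).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical helper, not part of TouchGFX.
struct TimedTakeResult
{
    bool acquired;     // true if the semaphore was eventually obtained
    uint32_t timeouts; // number of attempts that timed out first
};

// `acquire(timeoutTicks)` must return true if the semaphore was obtained
// within the timeout and false if the wait timed out.
inline TimedTakeResult takeWithTimeout(const std::function<bool(uint32_t)>& acquire,
                                       uint32_t timeoutTicks,
                                       uint32_t maxAttempts)
{
    assert(acquire); // an empty callback would be a programming error
    uint32_t timeouts = 0;
    for (uint32_t attempt = 0; attempt < maxAttempts; ++attempt)
    {
        if (acquire(timeoutTicks))
        {
            return {true, timeouts};
        }
        ++timeouts; // a breakpoint here shows the stack of the blocked caller
    }
    return {false, timeouts};
}
```

Placing a breakpoint on the timeout branch gives a stack trace of exactly the code that was waiting on the framebuffer, rather than wherever the debugger happens to halt.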

0693W000008yozkQAA.png

Okay, the framebuffer semaphore is not available. Are you running a single framebuffer configuration? Which display interface are you using? You may not be triggering TouchGFX to release the semaphore.

/Martin

Single framebuffer, "Custom" interface (in cubemx) with ChromArt. Built on top of the STM32F769-DISCO 3.0.0 template, though I have added other peripherals. I don't know if this information helps, but I replaced while(1) with giveFrameBufferSemaphore(); takeFrameBufferSemaphore(); in the status check but it remained frozen. I could see from instruction stepping it was looping somewhere in the drawGlyph function.

Michael K
Senior III

Update 4: A newly added feature in another task caused this problem to happen again. In this case, changing the TouchGFX task priority to be equal to that of the other tasks in my system seems to have stopped the problem.