June 22, 2026
Question

STM32U5G9ZJT6Q Video Stop

[STM32U5G9] MJPEG video playback hangs intermittently with HAL_JPEG_ERROR_DMA — non-deterministic race condition

Environment

Item | Value
MCU | STM32U5G9ZJT6Q
Custom Board | Yes (not STM32U5G9J-DK2)
External Flash | Macronix MX25L12833F on OCTOSPI (memory-mapped mode)
Display | 800×480 RGB565 LCD via LTDC
TouchGFX | 4.26.0
TouchGFX Generator | 4.26.0
FreeRTOS | V10.6.2 (CMSIS-RTOS2, via X-CUBE-FREERTOS 1.0.1)
STM32CubeIDE | 2.1.1
STM32CubeMX | Latest
STM32U5 HAL Driver | V1.4+
Compiler | arm-none-eabi-gcc 14.3.rel1

Issue Description

MJPEG intro video plays inconsistently. The first frame is partially decoded (top 16 pixel rows visible, rest of framebuffer shows stale main-screen content). The TouchGFX task then blocks indefinitely on SEM_WAIT(semDecodingDone) and the system enters the FreeRTOS idle task (prvCheckTasksWaitingTermination).

Same code, same setup, four different runs produce four different outcomes:

Run 1 (best case)

 

dbg_jpeg_irq          = 3
dbg_gpdma_ch0_irq = 1
dbg_gpdma_ch1_irq = 34 (≈ 1.13 frames worth, 30 per frame expected)
hjpeg.State = HAL_JPEG_STATE_READY
hjpeg.ErrorCode = 0
LCD = First frame partially decoded, then stops

Run 2 (long-running then fails)

 

dbg_jpeg_irq          = 260
dbg_gpdma_ch0_irq = 0 ← Input DMA never raised TC IRQ
dbg_gpdma_ch1_irq = 7680 (= 256 frames decoded)
hjpeg.State = HAL_JPEG_STATE_RESET
hjpeg.ErrorCode = 4 (HAL_JPEG_ERROR_DMA)
LCD = Many frames played, then frozen with broken content

Run 3 (fails immediately)

 

dbg_jpeg_irq          = 0
dbg_gpdma_ch0_irq = 513 ← IRQ flood, only 1 byte consumed
dbg_gpdma_ch1_irq = 1
hjpeg.State = HAL_JPEG_STATE_RESET
hjpeg.ErrorCode = 4 (HAL_JPEG_ERROR_DMA)
hjpeg.JpegInCount = 1
LCD = Main screen, video never started

Run 4 (256 frames + IRQ runaway)

 

dbg_jpeg_irq          = 516
dbg_gpdma_ch0_irq = 16,777,216 (= 2^24, runaway IRQ after error)
dbg_gpdma_ch1_irq = 7680
hjpeg.ErrorCode = 4

The dbg_* counters are plain volatile uint32_t variables, each incremented inside its IRQ handler.
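
The counter values above can be converted to frame counts with a little arithmetic. The sketch below assumes one GPDMA channel 1 interrupt per 16-row MCU line, which is what the "30 per frame expected" figure implies for a 480-row frame. The function and macro names are hypothetical and not part of the project.

```c
#include <stdint.h>

#define FRAME_HEIGHT_PX 480u  /* display height from the Environment table */
#define MCU_ROW_HEIGHT  16u   /* pixel rows per MCU line (assumed from the dump) */

/* Output-DMA interrupts expected per frame: one per MCU line (assumption). */
static uint32_t irqs_per_frame(void)
{
    return FRAME_HEIGHT_PX / MCU_ROW_HEIGHT;
}

/* Whole frames completed for a given dbg_gpdma_ch1_irq value. */
static uint32_t frames_completed(uint32_t ch1_irq_count)
{
    return ch1_irq_count / irqs_per_frame();
}

/* MCU lines of the following frame that were output before decoding stopped. */
static uint32_t mcu_lines_into_next_frame(uint32_t ch1_irq_count)
{
    return ch1_irq_count % irqs_per_frame();
}
```

With these assumptions, 7680 interrupts (Runs 2 and 4) correspond to exactly 256 frames with no remainder, and 34 interrupts (Run 1) correspond to one frame plus 4 MCU lines.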

What I've Verified is NOT the Cause

1. AVI/JPEG data integrity — VERIFIED OK

  • Dumped 5,519,444 bytes from OSPI memory-mapped region (0x904B6B40 to 0x909FA394)
  • Computed SHA-256 hash → identical to source video1_800x480.avi
  • Conclusion: OSPI memory-mapped reads are byte-perfect

2. JPEG decoder output — VERIFIED OK (where it runs)

  • Exported framebuffer (0x20160000 to 0x2021B800, 768,000 bytes RGB565)
  • Compared byte-by-byte to Python-decoded reference of same AVI
  • Result: First 16 pixel rows (Y=0~15, first MCU line): 90~91% byte match (minor JPEG decoder precision differences expected)
  • Result: Y=16~479: Filled with 0x10 0x84 repeating pattern (= RGB565 0x8410 = gray, the main screen background)
  • Conclusion: Decoder produces correct data for exactly the first MCU line, then stops
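
The "first MCU line only" conclusion can also be checked mechanically on the exported framebuffer. The following is a minimal sketch, assuming the dump has been loaded into a uint16_t array on the host; first_stale_row is a hypothetical helper, not something from TouchGFX or the HAL. It scans upward from the bottom and reports the first row from which everything below is still the fill color.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the index of the first row from which every pixel down to the
 * bottom of the framebuffer equals `fill`. Returns `height` if the last
 * row already contains non-fill data, and 0 if the whole buffer is fill.
 */
static uint32_t first_stale_row(const uint16_t *fb, uint32_t width,
                                uint32_t height, uint16_t fill)
{
    uint32_t first = height;

    /* Walk upward from the bottom row while rows are entirely `fill`. */
    for (uint32_t y = height; y-- > 0u; ) {
        for (uint32_t x = 0u; x < width; ++x) {
            if (fb[(size_t)y * width + x] != fill) {
                return first;   /* row y holds decoded (non-fill) data */
            }
        }
        first = y;
    }
    return first;
}
```

For the dump described above, first_stale_row(fb, 800, 480, 0x8410) should return 16 if the description is accurate. A decoded row that happens to be entirely gray would be counted as stale, so this is a quick check rather than proof.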

Settings Already Aligned with STM32U5G9J-DK2 Demo

Compared with the official STM32U5G9J_Demo (obtained via TouchGFX Designer → Demos → Select Board Setup → U5G9 FreeRTOS) and matched every relevant setting:

Setting | Value (both demo and our project)
JPEG_IRQn NVIC priority | 7
GPDMA1_Channel0_IRQn priority | 5
GPDMA1_Channel1_IRQn priority | 5
DMA2D_IRQn priority | 5
LTDC_IRQn priority | 5
GPU2D_IRQn priority | 5
configMAX_SYSCALL_INTERRUPT_PRIORITY | 5
configENABLE_FPU | 1
GPDMA1 Ch0 (JPEG_RX) SrcBurstLength | 8 (changed from default 32)
GPDMA1 Ch0 DestBurstLength | 8
GPDMA1 Ch1 (JPEG_TX) SrcBurstLength | 8
GPDMA1 Ch1 DestBurstLength | 8
HardwareMJPEGDecoder.cpp | byte-identical to demo
Task | Name GUI_Task, Stack 8192, Priority Normal, Code Gen "As external"
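
To make the consequences of these priorities explicit, here is a small sketch of the two Cortex-M rules involved: a numerically lower value is a more urgent priority, and only handlers at or numerically above configMAX_SYSCALL_INTERRUPT_PRIORITY may call FreeRTOS "FromISR" functions. It assumes the value 5 is the unshifted priority as shown in CubeMX and that all priority bits are preemption bits; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* configMAX_SYSCALL_INTERRUPT_PRIORITY from the table (assumed unshifted). */
#define MAX_SYSCALL_PRIO 5u

/* A handler may use FreeRTOS FromISR calls only if it is not more urgent
 * (numerically lower) than the syscall ceiling. */
static bool may_call_freertos_from_isr(uint32_t irq_prio)
{
    return irq_prio >= MAX_SYSCALL_PRIO;
}

/* True if an interrupt at prio_a can preempt a running handler at prio_b.
 * Assumes no sub-priority bits are configured. */
static bool can_preempt(uint32_t prio_a, uint32_t prio_b)
{
    return prio_a < prio_b;
}
```

Under these assumptions every interrupt in the table may call FromISR functions, and the GPDMA, DMA2D, LTDC and GPU2D handlers (priority 5) can preempt the JPEG handler (priority 7), but not the other way round.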

Known Differences from Demo (Hardware Constrained)

Item | Our Board | STM32U5G9J-DK2 Demo
Flash chip | MX25L12833F (Macronix QPI) | MX66UW1G45G (Octo)
Flash interface | OCTOSPI | HSPI
Memory-mapped command | Quad I/O Read 0xEB, 4 dummy cycles | Octo DTR Read
Custom OSPI_EnableMemoryMappedMode() | Required | Not needed (CubeMX handles HSPI)

What I've Tried (and the Outcome)

Attempt | Result
Changed JPEG_IRQn priority from 5 to 7 | Helped: reduced ErrorCode=4 frequency
Changed GPDMA1 Ch0 SrcBurstLength from 32 to 8 | Helped: matches demo behavior more closely
Enabled configENABLE_FPU = 1 | Helped: fewer random hangs
Properly configured TouchGFXTask via CubeMX ("As external") | Required, working
Tried always-Pause/Resume in DataReadyCallback (per memory note) | Failed immediately with ErrorCode=4 on first frame
Tried SEM_POST in DecodeCpltCallback to fix LastJob race | Made worse: caused HAL_JPEG_ERROR_HUFF_TABLE (=1) immediately on boot

Code Snippet — HardwareMJPEGDecoder.cpp (TouchGFX-generated, identical to demo)

HAL_JPEG_DataReadyCallback end-of-frame branch:

 

if (line_count >= JPEG_ConvertorParams.endY)
{
    Jpeg_OUT_BufferTab[JPEG_OUT_Write_BufferIndex].LastJob = true;
    Jpeg_HWDecodingEnd = 1;
    HAL_JPEG_Pause(hjpeg, JPEG_PAUSE_RESUME_OUTPUT);
}

if (!DMA2D_reference->isDMARunning())
{
    SEM_POST(semDecodingDone);
}

HAL_JPEG_DecodeCpltCallback:

 

void HAL_JPEG_DecodeCpltCallback(JPEG_HandleTypeDef* hjpeg)
{
    Jpeg_HWDecodingEnd = 1;
}

Hypothesis

I suspect a timing-sensitive race between the JPEG peripheral's output DMA request and the reconfiguration of GPDMA Channel 1 (output) inside HAL_JPEG_ConfigOutputBuffer. If the JPEG raises its next TX request while CH1 is being reconfigured (HAL_DMA_Abort → HAL_DMA_Start_IT internally), a trigger overrun occurs and the HAL reports HAL_JPEG_ERROR_DMA.

The race window timing depends on:

  • OCTOSPI memory-mapped read latency (input DMA throughput)
  • FreeRTOS critical section duration
  • Interrupt processing latency

When timing is favorable, many frames decode successfully (Run 2 = 256 frames). When it is unfavorable, decoding fails immediately (Run 3).
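
One way to test this hypothesis would be to record the order of events around the CH1 reconfiguration rather than only counting interrupts. The following is a minimal sketch of an event trace buffer; every name in it is hypothetical, and the timestamp source is left to the caller (on target it could be a free-running cycle counter, if one is enabled). It assumes fewer than TRACE_DEPTH events are logged before inspection, so the buffer has not wrapped.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TRACE_DEPTH 64u  /* power of two so the slot index can be masked */

typedef enum {
    EV_JPEG_IRQ,
    EV_DMA_IN_IRQ,
    EV_DMA_OUT_IRQ,
    EV_OUT_RECONFIG_BEGIN,  /* logged just before HAL_JPEG_ConfigOutputBuffer */
    EV_OUT_RECONFIG_END,    /* logged right after it returns */
    EV_JPEG_ERROR
} trace_event_t;

typedef struct {
    uint32_t timestamp;
    uint8_t  event;
} trace_entry_t;

static volatile trace_entry_t trace_buf[TRACE_DEPTH];
static atomic_uint trace_head;

/* Claims a slot atomically so nested interrupts do not share one entry. */
static void trace_log(trace_event_t ev, uint32_t timestamp)
{
    unsigned slot = atomic_fetch_add(&trace_head, 1u) & (TRACE_DEPTH - 1u);
    trace_buf[slot].timestamp = timestamp;
    trace_buf[slot].event = (uint8_t)ev;
}

/* Returns 1 if an output-DMA interrupt or a JPEG error was logged between
 * a RECONFIG_BEGIN and its matching RECONFIG_END in the first `count` entries. */
static int trace_irq_inside_reconfig(uint32_t count)
{
    int inside = 0;
    for (uint32_t i = 0u; i < count && i < TRACE_DEPTH; ++i) {
        uint8_t ev = trace_buf[i].event;
        if (ev == EV_OUT_RECONFIG_BEGIN) {
            inside = 1;
        } else if (ev == EV_OUT_RECONFIG_END) {
            inside = 0;
        } else if (inside && (ev == EV_DMA_OUT_IRQ || ev == EV_JPEG_ERROR)) {
            return 1;
        }
    }
    return 0;
}
```

If the failing runs consistently show an output-DMA or error event inside the reconfiguration window and the good runs do not, that would support the race explanation; if not, the cause is more likely elsewhere.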

Open Questions to the Community

  1. Is this a known issue with the STM32U5 GPDMA + JPEG hardware decoder combination?
  2. Is OCTOSPI memory-mapped mode less suited to feeding the JPEG_RX DMA than HSPI? Are specific OCTOSPI settings (Refresh, DelayBlock) required for a sustained DMA feed?
  3. Should HAL_JPEG_ConfigOutputBuffer in DataReadyCallback be wrapped with an explicit DMA stop/start for robustness?
  4. Are there errata for the STM32U5G9 JPEG peripheral or GPDMA related to this scenario?
  5. Why does the STM32U5G9J-DK2 demo work but seemingly identical code on a custom board with OCTOSPI does not?

Any guidance would be greatly appreciated.