STM32U5G9ZJT6Q Video Stop
- June 22, 2026
[STM32U5G9] MJPEG video playback hangs intermittently with HAL_JPEG_ERROR_DMA — non-deterministic race condition

Environment
| Item | Value |
|---|---|
| MCU | STM32U5G9ZJT6Q |
| Custom Board | Yes (not STM32U5G9J-DK2) |
| External Flash | Macronix MX25L12833F on OCTOSPI (memory-mapped mode) |
| Display | 800×480 RGB565 LCD via LTDC |
| TouchGFX | 4.26.0 |
| TouchGFX Generator | 4.26.0 |
| FreeRTOS | V10.6.2 (CMSIS-RTOS2, via X-CUBE-FREERTOS 1.0.1) |
| STM32CubeIDE | 2.1.1 |
| STM32CubeMX | Latest |
| STM32U5 HAL Driver | V1.4+ |
| Compiler | arm-none-eabi-gcc 14.3.rel1 |
Issue Description
MJPEG intro video plays inconsistently. The first frame is partially decoded (top 16 pixel rows visible, rest of framebuffer shows stale main-screen content). The TouchGFX task then blocks indefinitely on SEM_WAIT(semDecodingDone) and the system enters the FreeRTOS idle task (prvCheckTasksWaitingTermination).
Same code, same setup: four different runs produce four different outcomes:
Run 1 (best case)
dbg_jpeg_irq = 3
dbg_gpdma_ch0_irq = 1
dbg_gpdma_ch1_irq = 34 (≈ 1.13 frames worth, 30 per frame expected)
hjpeg.State = HAL_JPEG_STATE_READY
hjpeg.ErrorCode = 0
LCD = First frame partially decoded, then stops
Run 2 (long-running then fails)
dbg_jpeg_irq = 260
dbg_gpdma_ch0_irq = 0 ← Input DMA never raised TC IRQ
dbg_gpdma_ch1_irq = 7680 (= 256 frames decoded)
hjpeg.State = HAL_JPEG_STATE_RESET
hjpeg.ErrorCode = 4 (HAL_JPEG_ERROR_DMA)
LCD = Many frames played, then frozen with broken content
Run 3 (fails immediately)
dbg_jpeg_irq = 0
dbg_gpdma_ch0_irq = 513 ← IRQ flood, only 1 byte consumed
dbg_gpdma_ch1_irq = 1
hjpeg.State = HAL_JPEG_STATE_RESET
hjpeg.ErrorCode = 4 (HAL_JPEG_ERROR_DMA)
hjpeg.JpegInCount = 1
LCD = Main screen, video never started
Run 4 (256 frames + IRQ runaway)
dbg_jpeg_irq = 516
dbg_gpdma_ch0_irq = 16,777,216 (= 2^24, runaway IRQ after error)
dbg_gpdma_ch1_irq = 7680
hjpeg.ErrorCode = 4
The dbg_* counters are simple volatile uint32_t incremented inside each IRQ handler.
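As a quick sanity check on the counters above, the sketch below converts the output-channel IRQ count into decoded frames. It assumes one GPDMA1 Ch1 transfer-complete interrupt per 16-row MCU line, which gives 480 / 16 = 30 interrupts per 800×480 frame and matches the "30 per frame expected" figure. The names `FrameProgress` and `framesFromOutputIrqs` are hypothetical and not part of the project.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one output-DMA (GPDMA1 Ch1) interrupt per MCU line of 16 pixel rows.
constexpr std::uint32_t kFrameHeight    = 480;
constexpr std::uint32_t kRowsPerMcuLine = 16;
constexpr std::uint32_t kIrqsPerFrame   = kFrameHeight / kRowsPerMcuLine;
static_assert(kIrqsPerFrame == 30, "30 output IRQs per frame, as stated in the post");

// Hypothetical helper type: how far the decoder got, judged only by the counter.
struct FrameProgress {
    std::uint32_t completeFrames;  // whole frames written
    std::uint32_t extraMcuLines;   // MCU lines written into the next frame
};

inline FrameProgress framesFromOutputIrqs(std::uint32_t ch1IrqCount,
                                          std::uint32_t irqsPerFrame = kIrqsPerFrame) {
    assert(irqsPerFrame != 0);
    return FrameProgress{ch1IrqCount / irqsPerFrame, ch1IrqCount % irqsPerFrame};
}
```

With the values reported above, `framesFromOutputIrqs(34)` gives 1 complete frame plus 4 MCU lines (Run 1), and `framesFromOutputIrqs(7680)` gives exactly 256 frames with no remainder (Runs 2 and 4).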
What I've Verified is NOT the Cause
1. AVI/JPEG data integrity — VERIFIED OK
- Dumped 5,519,444 bytes from OSPI memory-mapped region (0x904B6B40 to 0x909FA394)
- Computed SHA-256 hash → identical to source video1_800x480.avi
- Conclusion: OSPI memory-mapped reads are byte-perfect
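The dump size can be cross-checked against the address range at compile time. This is a minimal sketch that assumes the end address 0x909FA394 is exclusive (one past the last byte); `regionSize` is a hypothetical helper name.

```cpp
#include <cstdint>

// Size in bytes of the half-open address range [start, endExclusive).
constexpr std::uint32_t regionSize(std::uint32_t start, std::uint32_t endExclusive) {
    return endExclusive - start;
}

// AVI file in the OSPI memory-mapped region, addresses as quoted in the post.
constexpr std::uint32_t kAviStart = 0x904B6B40u;
constexpr std::uint32_t kAviEnd   = 0x909FA394u;  // assumed exclusive

static_assert(regionSize(kAviStart, kAviEnd) == 5519444u,
              "address range matches the 5,519,444-byte dump");
```

If the range and the dump length ever disagree, the static assertion fails at build time instead of producing a hash mismatch later.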
2. JPEG decoder output — VERIFIED OK (where it runs)
- Exported framebuffer (0x20160000 to 0x2021B800, 768,000 bytes RGB565)
- Compared byte-by-byte to Python-decoded reference of same AVI
- Result: First 16 pixel rows (Y=0~15, first MCU line): **90~91% byte match** (minor JPEG decoder precision differences expected)
- Result: Y=16~479: Filled with 0x10 0x84 repeating pattern (= RGB565 0x8410 = gray, the main screen background)
- Conclusion: Decoder produces correct data for exactly the first MCU line, then stops
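The framebuffer figures above can be reproduced with a few lines of host-side code. The sketch below checks the buffer size, decodes the repeating 0x10 0x84 byte pair as a little-endian RGB565 pixel, and computes a per-row-range byte match ratio. It is not the script used for the comparison in this post; all function names are hypothetical, and the 5/6/5 to 8-bit expansion uses simple integer scaling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kWidth = 800, kHeight = 480, kBytesPerPixel = 2;
static_assert(kWidth * kHeight * kBytesPerPixel == 768000, "RGB565 framebuffer size");
static_assert(0x2021B800u - 0x20160000u == 768000u, "exported address range");

struct Rgb888 { std::uint8_t r, g, b; };

// Two consecutive bytes in memory (little-endian) form one RGB565 pixel.
constexpr std::uint16_t pixelFromBytes(std::uint8_t lo, std::uint8_t hi) {
    return static_cast<std::uint16_t>(lo | (hi << 8));
}

// Expand 5/6/5-bit channels to 8 bits with integer scaling.
constexpr Rgb888 expandRgb565(std::uint16_t p) {
    return Rgb888{
        static_cast<std::uint8_t>(((p >> 11) & 0x1F) * 255 / 31),
        static_cast<std::uint8_t>(((p >> 5) & 0x3F) * 255 / 63),
        static_cast<std::uint8_t>((p & 0x1F) * 255 / 31)};
}

// Fraction of identical bytes in rows [firstRow, lastRow] of two framebuffers.
inline double rowByteMatch(const std::uint8_t* a, const std::uint8_t* b,
                           std::size_t firstRow, std::size_t lastRow) {
    assert(firstRow <= lastRow && lastRow < kHeight);
    const std::size_t stride = kWidth * kBytesPerPixel;
    const std::size_t begin = firstRow * stride;
    const std::size_t end = (lastRow + 1) * stride;
    std::size_t same = 0;
    for (std::size_t i = begin; i < end; ++i) {
        if (a[i] == b[i]) ++same;
    }
    return static_cast<double>(same) / static_cast<double>(end - begin);
}
```

`expandRgb565(pixelFromBytes(0x10, 0x84))` yields (131, 129, 131), a mid gray, consistent with the main-screen background. Calling `rowByteMatch(dump, reference, 0, 15)` covers the first MCU line, and `rowByteMatch(dump, reference, 16, 479)` covers the rest of the frame.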
Settings Already Aligned with STM32U5G9J-DK2 Demo
Compared with the official STM32U5G9J_Demo (obtained via TouchGFX Designer → Demos → Select Board Setup → U5G9 FreeRTOS) and matched every relevant setting:
| Setting | Value (both demo and our project) |
|---|---|
| JPEG_IRQn NVIC priority | 7 |
| GPDMA1_Channel0_IRQn priority | 5 |
| GPDMA1_Channel1_IRQn priority | 5 |
| DMA2D_IRQn priority | 5 |
| LTDC_IRQn priority | 5 |
| GPU2D_IRQn priority | 5 |
| configMAX_SYSCALL_INTERRUPT_PRIORITY | 5 |
| configENABLE_FPU | 1 |
| GPDMA1 Ch0 (JPEG_RX) SrcBurstLength | 8 (changed from default 32) |
| GPDMA1 Ch0 DestBurstLength | 8 |
| GPDMA1 Ch1 (JPEG_TX) SrcBurstLength | 8 |
| GPDMA1 Ch1 DestBurstLength | 8 |
| HardwareMJPEGDecoder.cpp | byte-identical to demo |
| Task name | GUI_Task, Stack 8192, Priority Normal, Code Gen "As external" |
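The priority values in the table can be checked against the usual Cortex-M and FreeRTOS rules. The sketch below assumes the numbers are unshifted NVIC priorities as shown in CubeMX, that all priority bits are preemption bits (no sub-priority), and that the value 5 is the library-level syscall threshold. Under those assumptions a numerically lower value is more urgent, and an ISR may call FreeRTOS `FromISR` functions only if its priority value is greater than or equal to the threshold. The helper names are hypothetical.

```cpp
#include <cstdint>

// Assumption: unshifted priority values, lower number = more urgent.
constexpr std::uint32_t kMaxSyscallPriority = 5;

// An ISR may use FreeRTOS "FromISR" calls only at or below the syscall urgency.
constexpr bool maySafelyCallFromIsrApi(std::uint32_t irqPriority) {
    return irqPriority >= kMaxSyscallPriority;
}

// True if an interrupt at priority `a` can preempt a running ISR at priority `b`.
constexpr bool preempts(std::uint32_t a, std::uint32_t b) {
    return a < b;
}

constexpr std::uint32_t kJpegIrqPriority  = 7;
constexpr std::uint32_t kGpdmaIrqPriority = 5;

static_assert(maySafelyCallFromIsrApi(kJpegIrqPriority), "JPEG_IRQn is within the syscall range");
static_assert(maySafelyCallFromIsrApi(kGpdmaIrqPriority), "GPDMA1 IRQs are within the syscall range");
static_assert(preempts(kGpdmaIrqPriority, kJpegIrqPriority), "GPDMA1 IRQs can preempt the JPEG ISR");
```

One consequence of this configuration is that both GPDMA1 channel interrupts can interrupt the JPEG ISR, but not the other way around.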
Known Differences from Demo (Hardware Constrained)
| Item | Our Board | STM32U5G9J-DK2 Demo |
|---|---|---|
| Flash chip | MX25L12833F (Macronix QPI) | MX66UW1G45G (Octo) |
| Flash interface | OCTOSPI | HSPI |
| Memory-mapped command | Quad I/O Read 0xEB, 4 dummy cycles | Octo DTR Read |
| Custom OSPI_EnableMemoryMappedMode() | Required | Not needed (CubeMX handles HSPI) |
What I've Tried (and the Outcome)
| Attempt | Result |
|---|---|
| Changed JPEG_IRQn priority from 5 to 7 | Helped: reduced ErrorCode=4 frequency |
| Changed GPDMA1 Ch0 SrcBurstLength from 32 to 8 | Helped: behavior is closer to the demo |
| Enabled configENABLE_FPU = 1 | Helped: fewer random hangs |
| Properly configured TouchGFXTask via CubeMX ("As external") | Required, working |
| Tried always-Pause/Resume in DataReadyCallback (per memory note) | Failed immediately with ErrorCode=4 on first frame |
| Tried SEM_POST in DecodeCpltCallback to fix LastJob race | Made worse: caused HAL_JPEG_ERROR_HUFF_TABLE (=1) immediately on boot |
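When logging hjpeg.ErrorCode, it can help to print names instead of raw numbers. The sketch below is a host-side illustration that treats the error code as a bit field and names only the two values quoted in this post (HUFF_TABLE = 1, DMA = 4). Any other bit is reported as OTHER rather than guessed. `describeJpegError` and the constants are hypothetical names, not HAL symbols.

```cpp
#include <cstdint>
#include <string>

// Values as quoted in this post; other HAL error bits are intentionally not named.
constexpr std::uint32_t kErrHuffTable = 0x1;  // HAL_JPEG_ERROR_HUFF_TABLE
constexpr std::uint32_t kErrDma       = 0x4;  // HAL_JPEG_ERROR_DMA

// Assumption: ErrorCode is a bit field, so several flags may be set at once.
inline std::string describeJpegError(std::uint32_t code) {
    if (code == 0) {
        return "none";
    }
    std::string out;
    auto append = [&out](const char* name) {
        if (!out.empty()) out += "|";
        out += name;
    };
    if (code & kErrHuffTable) append("HUFF_TABLE");
    if (code & kErrDma) append("DMA");
    if (code & ~(kErrHuffTable | kErrDma)) append("OTHER");
    return out;
}
```

For the runs above, `describeJpegError(4)` returns "DMA" and `describeJpegError(0)` returns "none"; the SEM_POST experiment would show up as "HUFF_TABLE".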
Code Snippet — HardwareMJPEGDecoder.cpp (TouchGFX-generated, identical to demo)
HAL_JPEG_DataReadyCallback end-of-frame branch:
if (line_count >= JPEG_ConvertorParams.endY)
{
Jpeg_OUT_BufferTab[JPEG_OUT_Write_BufferIndex].LastJob = true;
Jpeg_HWDecodingEnd = 1;
HAL_JPEG_Pause(hjpeg, JPEG_PAUSE_RESUME_OUTPUT);
}
if (!DMA2D_reference->isDMARunning())
{
SEM_POST(semDecodingDone);
}
HAL_JPEG_DecodeCpltCallback:
void HAL_JPEG_DecodeCpltCallback(JPEG_HandleTypeDef* hjpeg)
{
Jpeg_HWDecodingEnd = 1;
}
Hypothesis
A timing-sensitive race condition between the JPEG hardware output trigger and the reconfiguration of GPDMA Channel 1 (output) in HAL_JPEG_ConfigOutputBuffer. If the JPEG peripheral raises the next TX request while CH1 is being reconfigured (HAL_DMA_Abort → HAL_DMA_Start_IT internally), a trigger overrun causes HAL_JPEG_ERROR_DMA.
The race window timing depends on:
- OCTOSPI memory-mapped read latency (input DMA throughput)
- FreeRTOS critical section duration
- Interrupt processing latency
When timing is favorable, many frames decode successfully (Run 2 = 256 frames). When it is unfavorable, decoding fails immediately (Run 3).
Open Questions to the Community
- Is this a known issue with STM32U5 GPDMA + JPEG hardware decoder combination?
- Is the OCTOSPI memory-mapped mode less suited for feeding JPEG_RX DMA compared to HSPI? Are there specific OCTOSPI settings (Refresh, DelayBlock) required for sustained DMA feed?
- Should HAL_JPEG_ConfigOutputBuffer in DataReadyCallback be wrapped with explicit DMA stop/start for robustness?
- Are there errata for STM32U5G9 JPEG peripheral or GPDMA related to this scenario?
- Why does the STM32U5G9J-DK2 demo work but seemingly identical code on a custom board with OCTOSPI does not?
Any guidance would be greatly appreciated.
