STM32H743 JPEG Decoder Performance

Hello,

I'm trying to use the HW JPEG decoder on an STM32H743 MCU. When looking at the AN4996 Application Note, I found the following table:

[Image: JPEG decoder performance table from AN4996]

However, I'm unable to come even close to these values... This is what I measured:

Resolution   JPEG decode   DMA2D YCbCr
320x240      2.5 ms        2.5 ms
640x480      10 ms         12 ms

I checked my code against the Examples from the Firmware Package:

  • JPEG_MJPEG_VideoDecoding
  • JPEG_DecodingFromFLASH_DMA

All my settings look similar. I'm decoding the JPEG image from DTCM RAM to external SDRAM. Compiler optimization has no effect (everything is done by DMA anyway)...

Is there anything else that has to be configured for the JPEG peripheral? Where are the values in the application note coming from?

1 ACCEPTED SOLUTION
KDJEM.1
ST Employee

Hello @d.zipperle.5512730769682947E12,

The performance measurements in the application note were obtained under the conditions shown in the table below:

[Image: table of measurement conditions]

To reproduce these figures, please make sure that your setup matches the same conditions:

-1- Board: STM32H743I-EVAL, which comes with a 32-bit SDRAM

-2- MDMA output channel destination increment and data size set to WORD (hmdmaOut.Init.DestinationInc = MDMA_DEST_INC_WORD; hmdmaOut.Init.DestDataSize = MDMA_DEST_DATASIZE_WORD)

-3- DMA2D format RGB565 (and the same for the LTDC format)

-4- LCD display turned off during the JPEG operations, mainly to reduce contention on the SDRAM between the LTDC and the DMA2D

-5- The image used is a 4:2:0 image
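
To get a feel for why condition -2- matters, the number of destination writes can be estimated with simple arithmetic. The sketch below is not ST code: the function names are made up for illustration, and it assumes that each MDMA destination access moves exactly DestDataSize bytes and that the decoder emits whole 16x16-pixel MCUs of 384 bytes for a 4:2:0 image (four Y blocks plus one Cb and one Cr block, 64 bytes each). It ignores bursts and SDRAM timing, so it only shows the ratio of accesses, not an absolute decode time.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes written by the JPEG decoder for a 4:2:0 image.
 * Each 16x16-pixel MCU holds 4 Y + 1 Cb + 1 Cr blocks of 8x8 bytes,
 * i.e. 384 bytes. Dimensions are rounded up to whole MCUs. */
static uint32_t ycbcr420_output_bytes(uint32_t width, uint32_t height)
{
    const uint32_t mcus_x = (width + 15u) / 16u;
    const uint32_t mcus_y = (height + 15u) / 16u;
    return mcus_x * mcus_y * 384u;
}

/* Number of destination accesses if every access moves bytes_per_access
 * bytes (assumption: 1 for MDMA_DEST_DATASIZE_BYTE,
 * 4 for MDMA_DEST_DATASIZE_WORD). */
static uint32_t dest_accesses(uint32_t total_bytes, uint32_t bytes_per_access)
{
    assert(bytes_per_access == 1u || bytes_per_access == 2u ||
           bytes_per_access == 4u || bytes_per_access == 8u);
    return (total_bytes + bytes_per_access - 1u) / bytes_per_access;
}
```

For a 640x480 frame this gives 460800 output bytes, so 460800 byte-wide accesses versus 115200 word-wide ones under the assumptions stated above.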

 

I hope this helps.

Kaouthar

 

 

To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.

5 REPLIES

Hello Kaouthar,

Thanks for your valuable explanations.

-1- Board: STM32H743I-EVAL, which comes with a 32-bit SDRAM

We also have a 32-bit SDRAM, with the FMC running at 250 MHz (SDCLK = 125 MHz).

-2- MDMA output channel destination increment and data size set to WORD (hmdmaOut.Init.DestinationInc = MDMA_DEST_INC_WORD; hmdmaOut.Init.DestDataSize = MDMA_DEST_DATASIZE_WORD)

Confirmed

-3- DMA2D format RGB565 (and same for LTDC format)

Confirmed

-4- LCD display turned off during the JPEG operations to reduce the contention on the SDRAM between the LTDC and the DMA2D mainly

I'm not exactly sure what this means: are we supposed to write during VSYNC blanking to the SDRAM?

-5- The image used is a 4:2:0 image

Confirmed.

As far as I can see, everything is configured as suggested. In addition, we clock the CPU at 480 MHz and our SDRAM clock is 25% faster at 125 MHz. Nevertheless, we cannot reach the measured performance, and we get display distortion.

There must be something we're overlooking...

Thanks,

Osama

KDJEM.1
ST Employee

Hi @osama2,

-4- LCD display turned off during the JPEG operations to reduce the contention on the SDRAM between the LTDC and the DMA2D mainly

I'm not exactly sure what this means: are we supposed to write during VSYNC blanking to the SDRAM?

Please try to:

- call BSP_LCD_DisplayOff(0); before starting the decode

- call BSP_LCD_DisplayOn(0); after the end of the DMA2D copy to re-enable the LTDC display

Note that performance can be improved further by placing the JPEG YCbCr output buffer in the internal AXI-SRAM at 0x24000000.
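
Before moving the buffer, it is worth checking that it actually fits in the AXI-SRAM. The sketch below is illustrative only: the helper names are hypothetical, the base address is the one quoted above, and the 512 KB size is what I understand the STM32H743 AXI-SRAM to be (please verify it against the reference manual for your exact part). The buffer size assumes 4:2:0 output with even image dimensions, i.e. 1.5 bytes per pixel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AXI_SRAM_BASE 0x24000000u      /* address quoted in the post */
#define AXI_SRAM_SIZE (512u * 1024u)   /* assumed: 512 KB on STM32H743 */

/* YCbCr 4:2:0 size: one Y byte per pixel plus Cb and Cr subsampled 2x2,
 * which adds up to 3/2 bytes per pixel. */
static uint32_t ycbcr420_bytes(uint32_t width, uint32_t height)
{
    assert(width % 2u == 0u && height % 2u == 0u);
    return width * height * 3u / 2u;
}

/* True if [addr, addr + len) lies entirely inside the AXI-SRAM. */
static bool fits_in_axi_sram(uint32_t addr, uint32_t len)
{
    if (addr < AXI_SRAM_BASE) {
        return false;
    }
    const uint32_t offset = addr - AXI_SRAM_BASE;
    return offset <= AXI_SRAM_SIZE && len <= AXI_SRAM_SIZE - offset;
}
```

Under these assumptions a 640x480 YCbCr 4:2:0 buffer needs 460800 bytes, so one such buffer fits when placed at the base address and leaves 62 KB free, while two of them do not fit.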

Could you please test the performance with the same frequency conditions as mentioned in the table below:

[Image: table of frequency conditions]

Could you please share the performance values you obtain?

I hope this helps.

Thank you.

Kaouthar

Hi @KDJEM.1,

Thank you for your feedback.

I was able to identify and fix the problem and my performance now looks like this:

Resolution   JPEG decode   DMA2D YCbCr (LTDC disabled)   DMA2D YCbCr (LTDC enabled)
640x480      4.4 ms        6.8 ms                        10.8 ms

I tested with the same frequency conditions as the table from the application note and now the values look plausible...
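
In case it helps others compare their own numbers, timings like these can be converted into pixel throughput. The helper below is a small host-side sketch (the function names are mine, not from any ST library) that turns a frame size and a duration in microseconds into megapixels per second, and computes how much longer one measurement took than another.

```c
#include <assert.h>
#include <stdint.h>

/* Pixels processed per microsecond, which equals megapixels per second. */
static double megapixels_per_second(uint32_t width, uint32_t height,
                                    uint32_t duration_us)
{
    assert(duration_us > 0u);
    return (double)width * (double)height / (double)duration_us;
}

/* Ratio of two durations: 1.5 means the slow case took 50% longer. */
static double slowdown(uint32_t slow_us, uint32_t fast_us)
{
    assert(fast_us > 0u);
    return (double)slow_us / (double)fast_us;
}
```

With the values I measured, decoding 640x480 in 4.4 ms corresponds to roughly 69.8 megapixels per second, and the DMA2D conversion takes about 1.59 times as long with the LTDC enabled (10.8 ms versus 6.8 ms).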

As mentioned in my first question, I looked at both example projects from the STM32CubeMX Repository (V1.11.2). As you can see, in both examples the MDMA output destination data size and increment are configured as BYTE.

[Image: screenshot of the MDMA output configuration in the example projects]

Tracking down this problem cost a lot of time, and I wasn't able to find any reference (other than the example code) for the correct / optimal MDMA configuration...

I would suggest fixing the example code to clarify this for future use. Or is there a specific reason why the MDMA in the example projects isn't configured for optimal performance?

Regards,

Dominik

KDJEM.1
ST Employee

Hello @d.zipperle.5512730769682947E12,

 

Glad to know that the issue is fixed and thank you for confirming the source of the problem and for sharing the fix.

So, to give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.

Your proposal is tracked internally as a request to enhance performance for this example. I assume that this wasn't the main focus for this example but it is important to consider it.

Internal ticket number: 188510 (this is an internal tracking number and is not accessible or usable by customers).

Thank you for your contribution to STCommunity.

Kaouthar
