2019-11-20 07:49 AM
Hi @Martin KJELDSEN
On one screen I want to move two large images (one of them has transparency). I implemented it as in your bird game example, using the moveTo(x, y) method.
Now I noticed that the animation is very slow and the rendering time is very long (>30ms). I don't know why this takes so long. I measured it with the RENDER_TIME GPIO and also in software, and I see that the TouchGFX library is active for more than 30ms.
Do you have any idea where the issue comes from? The implementation is below.
Thank you
Best regards
Marco
#include <gui/containers/container_moving_waves.hpp>
#include <BitmapDatabase.hpp>
#include <touchgfx/Color.hpp>

container_moving_waves::container_moving_waves(): m_waveForeground(),
    m_waveBackground(),
    m_animationState(AnimationState::AnimationRunning),
    m_tickCounter(0U),
    m_tickinterval(1U) {}

void container_moving_waves::initialize()
{
    container_moving_wavesBase::initialize();
    initializeLayer(m_waveBackground, BITMAP_WAVE_BACKGROUND_ID, 255U, 1U, -1);
    initializeLayer(m_waveForeground, BITMAP_WAVE_FOREGROUND_ID, 204U, 1U, 1);
    m_boxWave.setPosition(0, 217, 800, 33);
    m_boxWave.setColor(touchgfx::Color::getColorFrom24BitRGB(112, 148, 197));
    add(m_boxWave);
}

void container_moving_waves::startAnimation() {
    m_animationState = AnimationState::AnimationRunning;
}

void container_moving_waves::stopAnimation() {
    m_animationState = AnimationState::NoAnimation;
}

void container_moving_waves::handleTickEvent() {
    m_tickCounter++;
    if ((m_tickCounter % m_tickinterval) != 0U) {
        return;
    }
    if (m_animationState == AnimationState::AnimationRunning) {
        moveLayer(m_waveForeground, m_tickCounter);
        moveLayer(m_waveBackground, m_tickCounter);
    }
}

void container_moving_waves::initializeLayer(Layer& layer, const BitmapId bmp, uint8_t alpha, const uint32_t animationUpdateSpeed, const int32_t animationWidth)
{
    layer.image0.setBitmap(Bitmap(bmp));
    layer.image1.setBitmap(Bitmap(bmp));
    layer.image0.setXY(0, 0U);
    if (animationWidth < 0) {
        layer.image1.setXY(layer.image0.getRect().right(), 0U);
    }
    else {
        layer.image1.setXY(layer.image0.getRect().x - layer.image1.getWidth(), 0U);
    }
    layer.image0.setAlpha(alpha);
    layer.image1.setAlpha(alpha);
    add(layer.image0);
    add(layer.image1);
    layer.animationUpdateSpeed = animationUpdateSpeed;
    layer.animationWidth = animationWidth;
}

void container_moving_waves::moveLayer(Layer& layer, const uint32_t tickCount) {
    if ((tickCount % layer.animationUpdateSpeed) == 0U) {
        layer.image0.moveTo(layer.image0.getX() + layer.animationWidth, layer.image0.getY());
        layer.image1.moveTo(layer.image1.getX() + layer.animationWidth, layer.image1.getY());
        if (layer.animationWidth < 0) {
            //when moving left
            if (layer.image0.getRect().right() < 0) {
                layer.image0.moveTo(layer.image1.getRect().right(), layer.image0.getY());
            }
            if (layer.image1.getRect().right() < 0) {
                layer.image1.moveTo(layer.image0.getRect().right(), layer.image1.getY());
            }
        }
        else {
            //when moving right
            if (layer.image0.getRect().x > layer.image0.getWidth()) {
                layer.image0.moveTo(layer.image1.getRect().x - layer.image0.getWidth(), layer.image0.getY());
            }
            if (layer.image1.getRect().x > layer.image1.getWidth()) {
                layer.image1.moveTo(layer.image0.getRect().x - layer.image1.getWidth(), layer.image1.getY());
            }
        }
    }
}
2019-11-21 10:50 PM
Hi @Martin KJELDSEN
I found the reason why it takes so long. Because the images are large, I had stored them as L8_ARGB8888. When I change back to RGB565/ARGB8888, the method setupDataCopy is called as expected. The rendering time drops from 73ms to 42ms, which is still high but much better. The CPU load drops from >90% to 4%.
Is there a workaround to use L8_ARGB8888 and DMA2D together?
As written above, the rendering time is still high, but I think that makes sense. I tried to calculate the theoretical rendering time:
Read from QSPI (108MHz):
Img1/Img2 (800x250px / 4Byte/px) -> 14.8ms each
Write to SDRAM (16bit data bus / 108MHz):
Background (Box / 800x480px) -> 3.6ms
Img1/Img2 -> 1.9ms each
In total it should take about 37ms. Is my calculation correct?
In your experience, is a rendering time of about 40ms to be expected, or should it be faster?
Do you have any advice on how I can optimize the rendering time for large images? Should I cache the images in SDRAM instead of reading them out of flash?
Thank you
Marco
2019-11-25 10:25 AM
Marco.R,
These are good questions. I have been trying to figure out an implementation of L8_ARGB8888 myself. I asked a question about it, but have not had any responses.
I am still learning about it, but for the STM32F746G-Discovery kit (which I am currently using), I found a call to the function Martin mentioned (HAL_DMA2D_BlendingStart_IT()) in the file TouchGFX\target\STM32F7DMA.cpp. The function making this call is setupDataCopy(), which appears to set up the DMA copy based on which BlitOp operation is being used.
However, there doesn't appear to be an operation for indexed color (BlitOp.hpp):
enum BlitOperations
{
    BLIT_OP_COPY = 1 << 0, ///< Copy the source to the destination
    BLIT_OP_FILL = 1 << 1, ///< Fill the destination with color
    BLIT_OP_COPY_WITH_ALPHA = 1 << 2, ///< Copy the source to the destination using the given alpha
    BLIT_OP_FILL_WITH_ALPHA = 1 << 3, ///< Fill the destination with color using the given alpha
    BLIT_OP_COPY_WITH_TRANSPARENT_PIXELS = 1 << 4, ///< Deprecated, ignored. (Copy the source to the destination, but not the transparent pixels)
    BLIT_OP_COPY_ARGB8888 = 1 << 5, ///< Copy the source to the destination, performing per-pixel alpha blending
    BLIT_OP_COPY_ARGB8888_WITH_ALPHA = 1 << 6, ///< Copy the source to the destination, performing per-pixel alpha blending and blending the result with an image-wide alpha
    BLIT_OP_COPY_A4 = 1 << 7, ///< Copy 4-bit source text to destination, performing per-pixel alpha blending
    BLIT_OP_COPY_A8 = 1 << 8 ///< Copy 8-bit source text to destination, performing per-pixel alpha blending
};
BlitOp.hpp also contains the BlitOp struct. That struct does contain a pointer to the CLUT (pClut), so I am not sure why there is no operation for indexed color.
struct BlitOp
{
    uint32_t operation; ///< The operation to perform @see BlitOperations
    const uint16_t* pSrc; ///< Pointer to the source (pixels or indexes)
    const uint8_t* pClut; ///< Pointer to the source CLUT entries
    uint16_t* pDst; ///< Pointer to the destination
    uint16_t nSteps; ///< The number of pixels in a line
    uint16_t nLoops; ///< The number of lines
    uint16_t srcLoopStride; ///< The number of bytes to stride the source after every loop
    uint16_t dstLoopStride; ///< The number of bytes to stride the destination after every loop
    colortype color; ///< Color to fill
    uint8_t alpha; ///< The alpha to use
    uint8_t srcFormat; ///< The source format @see BitmapFormat
    uint8_t dstFormat; ///< The destination format @see BitmapFormat
};
And as far as I know, Chrom-ART (DMA2D) supports a CLUT.
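For context, here is a minimal software sketch of what an indexed-color copy has to do, and why the format is attractive for large images. It is not TouchGFX or HAL code: expandL8, bytesArgb8888 and bytesL8Argb8888 are hypothetical names, the palette is assumed to be a plain array of ARGB8888 values, and any header bytes that a real L8 bitmap may carry are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand an 8-bit indexed (L8) image into ARGB8888 pixels in software.
// Each source byte is an index into `clut`, which holds up to 256 colors.
std::vector<std::uint32_t> expandL8(const std::vector<std::uint8_t>& indices,
                                    const std::vector<std::uint32_t>& clut)
{
    std::vector<std::uint32_t> out;
    out.reserve(indices.size());
    for (std::uint8_t index : indices) {
        // Indices past the end of the table become transparent black.
        // This is a choice made for the sketch, not library behavior.
        out.push_back(index < clut.size() ? clut[index] : 0x00000000u);
    }
    return out;
}

// Storage for a width x height image stored as plain ARGB8888, in bytes.
std::size_t bytesArgb8888(std::size_t width, std::size_t height)
{
    return width * height * 4;
}

// Storage for the same image as L8 with an ARGB8888 palette:
// one byte per pixel plus four bytes per palette entry.
std::size_t bytesL8Argb8888(std::size_t width, std::size_t height,
                            std::size_t clutEntries)
{
    return width * height + clutEntries * 4;
}
```

For an 800x250 image with a full 256-entry palette this gives 201,024 bytes instead of 800,000, so roughly a quarter of the data has to be fetched from flash per frame. Doing the lookup per pixel on the CPU, as in expandL8, is the kind of work that one would want the DMA2D to take over.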
2019-11-25 12:21 PM
Hi guys,
I'll try to get to your L8 questions tomorrow. The standard HAL does not support L8, correct.
/Martin
2019-12-04 09:45 PM
Hi @Martin KJELDSEN
Just for information: I updated the HAL according to your answer in the thread of @scottSD (here) and I get an improved render time. Rendering is now about 20% faster than before (about 30ms), much less memory is used than before, and the CPU load stays below 10%. If there are other ways to improve the render time, I would appreciate any hint. But I assume that's the limit for my configuration (see my last post with the calculation). Is that correct?
Thanks a lot for your help.
Marco
2019-12-04 11:47 PM
That's great to hear, Marco. I _think_ I'd need to know more concretely about your application to help optimize. It may be better to simply measure the different read/write times with an oscilloscope to be more accurate. 60ms is a lot if you're aiming for 60 Hz, but you also need to know the limitations of your platform in terms of achieving acceptable performance.
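To put the render times from this thread next to a 60 Hz target, a rough sketch like the following may help. It assumes that a finished frame can only be shown on a display refresh boundary (rendering synchronized to VSYNC), which may not match every setup. The function names are made up for this illustration.

```cpp
#include <cmath>

// Duration of one display refresh in milliseconds.
double framePeriodMs(double refreshHz)
{
    return 1000.0 / refreshHz;
}

// Number of refresh periods one rendered frame occupies.
// Assumption: a frame is only presented on a refresh boundary,
// so a partially used period counts as a whole one.
int refreshesPerFrame(double renderMs, double refreshHz)
{
    return static_cast<int>(std::ceil(renderMs / framePeriodMs(refreshHz)));
}

// Resulting animation update rate in frames per second.
double effectiveFps(double renderMs, double refreshHz)
{
    return refreshHz / refreshesPerFrame(renderMs, refreshHz);
}
```

At 60 Hz one refresh lasts about 16.7 ms. Under the assumption above, a 30 ms render occupies two refreshes (about 30 updates per second), 42 ms occupies three (about 20), and 73 ms occupies five (about 12). Getting to a full 60 updates per second would require the whole render to fit inside a single 16.7 ms period.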