How to change TouchGFX FrameBuffer to Big Endian?

NRobb
Associate II

Good afternoon all,

I have an STM32 controlling a 320x240 LCD using an ILI9341 controller and communicating over a SPI interface. It all works perfectly using my own interface routines, and almost works perfectly under my initial basic integration of TouchGFX. However, the framebuffer from TouchGFX is stored in RGB565 as Big Endian (i.e. Red is stored as 0xF800), and it appears that I need to send the pixels as Little Endian (0x00F8), as the ILI9341 only allows me to specify endianness if I use it in non-SPI mode.

Is there any support in TouchGFX to change endianness in the framebuffer?

Failing that, any suggestions on the best place to rearrange the bytes? Currently, I transfer entire rows on SPI (I can't do the whole buffer each time as I am catering for partial updates). Falling back to transferring byte-per-byte to flip them seems clunky. But also, having to flip the framebuffer before transmitting also seems unnecessary!

I hope I am missing something fundamental here though 🙂

Any help much appreciated.

Thanks

Nick

9 REPLIES
HP
Senior III

are you sure that what you need isn't a red/blue swap?

NRobb
Associate II

Hi HP,

I am pretty sure (always leave room for error!). I made a test with 3 strips of pure red, green, and blue. The red and blue looked pretty close to complete swaps, but the green was a different colour altogether. As it is RGB565 if the red and blue were simply swapped, the middle 6 bits for green should stay the same, and pure green would work.

My quick check for what I think is happening - Pure red is 11111 000000 00000, with only Red bits set. Flipped is 00000 000111 11000, so mainly blue with a bit of green, which seemed to match what the display was showing.
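The bit arithmetic in that quick check can be reproduced with a minimal C sketch. The helper names below are illustrative only and are not part of TouchGFX or the ST HAL; the 5-6-5 layout with red in the top bits is taken from the post above.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 5-bit red, 6-bit green and 5-bit blue into one RGB565 pixel. */
static uint16_t rgb565_pack(unsigned r, unsigned g, unsigned b)
{
    assert(r < 32u && g < 64u && b < 32u);
    return (uint16_t)((r << 11) | (g << 5) | b);
}

/* Swap the two bytes of one RGB565 pixel. */
static uint16_t rgb565_swap_bytes(uint16_t px)
{
    return (uint16_t)((px << 8) | (px >> 8));
}

/* Channel extractors for the same 5-6-5 layout. */
static unsigned rgb565_red(uint16_t px)   { return (px >> 11) & 0x1Fu; }
static unsigned rgb565_green(uint16_t px) { return (px >> 5) & 0x3Fu; }
static unsigned rgb565_blue(uint16_t px)  { return px & 0x1Fu; }
```

With these helpers, pure red (0xF800) byte-swaps to 0x00F8, which decodes as no red, a little green (7 of 63) and mostly blue (24 of 31). Pure green (0x07E0) byte-swaps to 0xE007, which decodes as strong red, no green and some blue, so a plain red/blue swap would not explain the green strip changing colour.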

I did notice something that I did not understand when experimenting and looking through the memory browser. I made a test uint8_t array, which had repeating elements of 0x00 and 0x1F (representing how I thought the framebuffer memory would be filled if each pixel was set to 0x001F colour), but the memory browser showed that the array seemed to be stored with each 2 bytes flipped - e.g. 0x1F, 0x00, 0x1F, 0x00 etc. If this was an array of 16-bit ints I could understand this might be the STM32 little-endian storage of 0x001F, but this is an array of bytes, and I can't seem to find the explanation for this.

Nick

HP
Senior III

The issue is of course that you are using your own interface. But since you're transferring the data yourself, can't you do the byte-flip in your transmit function?

The DMA2D has red/blue and byte swapping (on select MCUs only, unfortunately). I also had problems with endianness during my debugging; in my case the static image I was testing with was the wrong way around.

If you do a static test image you should be able to easily see whether it is indeed the bytes that need to be swapped or whether it is something else.
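As a point of comparison, doing the byte-flip in software before transmitting could look roughly like the sketch below. This is only an illustration of the idea, not code from the thread: `rgb565_swap_row` is a hypothetical helper, and it assumes a scratch buffer large enough for one row (or an in-place swap, since `src` and `dst` may be the same buffer).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `count` RGB565 pixels from src to dst, swapping the two bytes of
 * each pixel on the way. src and dst may point to the same buffer. */
static void rgb565_swap_row(const uint16_t *src, uint16_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < count; ++i) {
        uint16_t px = src[i];
        dst[i] = (uint16_t)((px << 8) | (px >> 8));
    }
}
```

The swapped row can then be sent in a single SPI transfer rather than byte by byte. The cost is one CPU pass over every updated pixel, which is the overhead that a hardware swap would avoid.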

NRobb
Associate II

Hi HP,

Yes - static image test has everything where it should be, just enjoying a pretty psychedelic colour palette!

I was looking at the best way to do it in my transmit. I was hoping to use DMA on SPI just to send the existing frame buffer out without manipulation, and was just looking to see if SPI DMA supported some kind of endian swapping. I didn't know about DMA2D until your email, and the application note for it mentions it can do a byte swap for just this issue on LCDs, as well as a red/blue swap (and my MCU supports it!), so I think I will look at that next.

Thanks for the pointer 🙂

Nick

HP
Senior III

I spent way too much time looking at the byte-swap only to find out that it exists on only very few MCUs. Just a heads-up if you can't seem to find the right setting 🙂

I hope you get it to work!

NRobb
Associate II

Thanks HP - I am getting closer! I added DMA2D as an intermediate step to a secondary frame buffer that I then send to the SPI. The sequence works, and the new frame buffer is sent to the display just fine. The problem is that it does not appear to actually do the byte swap! I tried the red/blue swap and that works just fine (and coincidentally confirms that is not my issue!), but whatever setting I put for hdma2d.Init.BytesSwap, the byte order does not change...

 hdma2d.Instance = DMA2D;
 hdma2d.Init.Mode = DMA2D_M2M_PFC;
 hdma2d.Init.ColorMode = DMA2D_OUTPUT_RGB565;
 hdma2d.Init.OutputOffset = 0;
 hdma2d.Init.BytesSwap = DMA2D_BYTES_SWAP;
 hdma2d.Init.LineOffsetMode = DMA2D_LOM_PIXELS;
 hdma2d.LayerCfg[1].InputOffset = 0;
 hdma2d.LayerCfg[1].InputColorMode = DMA2D_INPUT_RGB565;
 hdma2d.LayerCfg[1].AlphaMode = DMA2D_NO_MODIF_ALPHA;
 hdma2d.LayerCfg[1].InputAlpha = 0;
 hdma2d.LayerCfg[1].AlphaInverted = DMA2D_REGULAR_ALPHA;
 hdma2d.LayerCfg[1].RedBlueSwap = DMA2D_RB_REGULAR;
 hdma2d.LayerCfg[1].ChromaSubSampling = DMA2D_NO_CSS;

The manual for my board/chip NUCLEO-H743ZI with STM32H7 chipset says it supports byte swapping, and the STM32CubeIDE allows me to configure it and auto-generates the above code just fine. It also compiles and runs with no complaints, so I assume that I am just missing some pre-condition. As I pass just a pointer to the output buffer start, I assume it only looks at the InputColorMode to know it is 16-bit and could be swapped.

Does anyone know of any other tweaks that might be needed for the byteswap to work?

Thanks in advance,

Nick

HP
Senior III

I'm just curious - why is this line the way it is:

hdma2d.LayerCfg[1].RedBlueSwap = DMA2D_RB_REGULAR;

I have no idea how the DMA works, but the naming is ambiguous at best. RB_REGULAR seems to imply that it is working 'normally', and normal in my book would be no swap. What are the other options here? Could it be that simple?

Glad to hear that your MCU supports the swap! My display uses RGB, so my solution would be to just pinswap. Luckily it turned out not to be necessary.

NRobb
Associate II

For the red/blue swap, the options are :

#define DMA2D_RB_REGULAR            0x00000000U  /*!< Select regular mode (RGB or ARGB) */
#define DMA2D_RB_SWAP               0x00000001U  /*!< Select swap mode (BGR or ABGR) */

and they work just as the names imply.

The one I need is BytesSwap, where the options are :

#define DMA2D_BYTES_REGULAR         0x00000000U      /*!< Bytes in regular order in output FIFO */
#define DMA2D_BYTES_SWAP            DMA2D_OPFCCR_SB  /*!< Bytes are swapped two by two in output FIFO */

And it outputs the same whichever option I use....

NRobb
Associate II

Afternoon all,

I got it all working, so thought I would outline it here for anyone who stumbles across this thread in the future....

First off, the ST documentation is confusing when it comes to support for byte-swapping. The STM32H7 datasheet says it supports it in DMA2D, and the application note AN4943 section 5.3.3 talks about byte swapping RGB565, which is exactly what I want. But on deeper reading, the STM32H7 reference manual RM0433 section 17.4.8 says RGB565 "is supported without byte reordering by the DMA2D.". So my conclusion is that I can't do it with DMA2D on my particular board, similar to HP's thoughts.

But, I carried on reading, and found that MDMA can also byte-swap, and it worked perfectly on first set-up 🙂 And just to make me a little happier, you can also use the block offsetting to grab the specific update portion of the frame buffer and make one contiguous block of memory that I can then stream out in minimal SPI DMA transfers.

So, the MDMA is configured like this :

static void MX_MDMA_Init(void) 
{
 
  /* MDMA controller clock enable */
  __HAL_RCC_MDMA_CLK_ENABLE();
  /* Local variables */
 
  /* Configure MDMA channel MDMA_Channel1 */
  /* Configure MDMA request hmdma_mdma_channel40_sw_0 on MDMA_Channel1 */
  hmdma_mdma_channel40_sw_0.Instance = MDMA_Channel1;
  hmdma_mdma_channel40_sw_0.Init.Request = MDMA_REQUEST_SW;
  hmdma_mdma_channel40_sw_0.Init.TransferTriggerMode = MDMA_REPEAT_BLOCK_TRANSFER;
  hmdma_mdma_channel40_sw_0.Init.Priority = MDMA_PRIORITY_MEDIUM;
  hmdma_mdma_channel40_sw_0.Init.Endianness = MDMA_LITTLE_BYTE_ENDIANNESS_EXCHANGE;
  hmdma_mdma_channel40_sw_0.Init.SourceInc = MDMA_SRC_INC_HALFWORD;
  hmdma_mdma_channel40_sw_0.Init.DestinationInc = MDMA_DEST_INC_HALFWORD;
  hmdma_mdma_channel40_sw_0.Init.SourceDataSize = MDMA_SRC_DATASIZE_HALFWORD;
  hmdma_mdma_channel40_sw_0.Init.DestDataSize = MDMA_DEST_DATASIZE_HALFWORD;
  hmdma_mdma_channel40_sw_0.Init.DataAlignment = MDMA_DATAALIGN_PACKENABLE;
  hmdma_mdma_channel40_sw_0.Init.BufferTransferLength = 128;
  hmdma_mdma_channel40_sw_0.Init.SourceBurst = MDMA_SOURCE_BURST_SINGLE;
  hmdma_mdma_channel40_sw_0.Init.DestBurst = MDMA_DEST_BURST_SINGLE;
  hmdma_mdma_channel40_sw_0.Init.SourceBlockAddressOffset = 0;
  hmdma_mdma_channel40_sw_0.Init.DestBlockAddressOffset = 0;
  if (HAL_MDMA_Init(&hmdma_mdma_channel40_sw_0) != HAL_OK)
  {
    Error_Handler();
  }
 
  /* MDMA interrupt initialization */
  /* MDMA_IRQn interrupt configuration */
  HAL_NVIC_SetPriority(MDMA_IRQn, 5, 0);
  HAL_NVIC_EnableIRQ(MDMA_IRQn);
 
}

Then in the TouchGFXHAL.cpp, the flushFrameBuffer is modified to copy only the updated portion to a new buffer with byte swapping using MDMA, modifying the SourceBlockAddressOffset as needed for the rect size.

void TouchGFXHAL::flushFrameBuffer(const touchgfx::Rect& rect)
{
    // Calling parent implementation of flushFrameBuffer(const touchgfx::Rect& rect).
    //
    // To overwrite the generated implementation, omit call to parent function
    // and implemented needed functionality here.
    // Please note, HAL::flushFrameBuffer(const touchgfx::Rect& rect) must
    // be called to notify the touchgfx framework that flush has been performed.
 
 
    uint16_t* fb = HAL::lockFrameBuffer(); // SYNC WITH FRAMEWORK. Get pointer to current framebuffer and lock the framebuffer

    // Use MDMA to flip each 16bpp pixel, changing endianness from Little (STM32) to Big (ILI9341)

    // Set up MDMA to consolidate only the required parts of the frame buffer
    hmdma_mdma_channel40_sw_0.Init.SourceBlockAddressOffset = 2*(320-rect.width); // Between lines we need to jump to the start of the next changed block
    if (HAL_MDMA_Init(&hmdma_mdma_channel40_sw_0) != HAL_OK)
    {
        Error_Handler();
    }

    // Trigger the actual MDMA and then wait for the interrupt-triggered semaphore
    hmdma_hal_status = HAL_MDMA_Start_IT(&hmdma_mdma_channel40_sw_0, (uint32_t)(fb + ((320 * rect.y) + rect.x)), (uint32_t)flippedFrameBufferAddress, rect.width*2, rect.height);
    if (hmdma_hal_status != HAL_OK)
    {
        Error_Handler(); // If the transfer never started, the semaphore below would never be released
    }
    osSemaphoreWait(MDMA_FrameBuffer_Byteswap_completeHandle, osWaitForever);

    // Now we can release the frame buffer
    HAL::unlockFrameBuffer();

    // And trigger the display update
    myDisplay_FlushFrameBuffer(rect.x, rect.y, rect.width, rect.height, (void*)flippedFrameBufferAddress); // Call display update

    HAL::flushFrameBuffer(rect); // And call the base routine to complete
 
    // If the framebuffer is placed in Write Through cached memory (e.g. SRAM) then we need
    // to flush the Dcache to make sure framebuffer is correct in RAM. That's done
    // using SCB_CleanInvalidateDCache().
 
    // SCB_CleanInvalidateDCache();
}
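To make the address arithmetic above easier to check, here is a small software model of what the repeated-block transfer is being asked to do: read `height` blocks of `width` pixels, skip the rest of each framebuffer line, byte-swap every pixel, and write the result contiguously. It is a sketch for running on a PC, not MDMA code, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes to skip between the end of one block and the start of the next,
 * i.e. the value the post writes to SourceBlockAddressOffset for a
 * 16bpp framebuffer that is fb_width pixels wide. */
static size_t source_block_offset_bytes(size_t fb_width, size_t rect_width)
{
    assert(rect_width <= fb_width);
    return 2u * (fb_width - rect_width);
}

/* Copy the rectangle (x, y, w, h) out of a framebuffer that is fb_width
 * pixels wide into a contiguous destination, swapping the two bytes of
 * each pixel. dst must have room for w * h pixels. */
static void gather_rect_swapped(const uint16_t *fb, size_t fb_width,
                                size_t x, size_t y, size_t w, size_t h,
                                uint16_t *dst)
{
    assert(fb != NULL && dst != NULL);
    assert(x + w <= fb_width);

    for (size_t row = 0; row < h; ++row) {
        const uint16_t *line = fb + (y + row) * fb_width + x;
        for (size_t col = 0; col < w; ++col) {
            uint16_t px = line[col];
            dst[row * w + col] = (uint16_t)((px << 8) | (px >> 8));
        }
    }
}
```

For a 100-pixel-wide update on the 320-pixel-wide screen, the offset works out to 2 * (320 - 100) = 440 bytes, and a full-width update gives an offset of 0, so the blocks are read back to back.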

When that completes, it calls my SPI display update, which sets up SPI without DMA, and then sends the whole updated block to the display using DMA with the minimal number of transactions. (SPI DMA has a 65k byte limit for one burst, so the full screen takes 3 transfers to get the whole ~150k bytes sent, but smaller updates are done in one SPI DMA transfer with minimal CPU overhead.)

void myDisplay::FlushTouchGFXFrameBuffer(uint16_t x, uint16_t y, uint16_t width, uint16_t height, void* FrameBufferAddress)
{
    // FrameBufferAddress is the start of a contiguous memory range holding only the updated rectangle to be sent to the screen (i.e. not the full 320x240 screen)
    // So we can set up the transfer and then stream the whole range to the SPI, with only the 65k byte SPI block limitation

    SetWindow(x, x + width - 1, y, y + height - 1);

    HAL_GPIO_WritePin(LCD_DC_PORT, LCD_DC_PIN, GPIO_PIN_SET);
    HAL_GPIO_WritePin(LCD_CS_PORT, LCD_CS_PIN, GPIO_PIN_RESET);

    // Use a byte pointer so the address can be advanced in bytes (arithmetic on void* is not standard C++)
    uint8_t* pData = static_cast<uint8_t*>(FrameBufferAddress);
    uint32_t BytesToSend = height*width*2;

    osSemaphoreWait(FrameBufferFlushToSPI_CompleteHandle, 1);

    while (BytesToSend > SPI_MAX_BURST)
    {
        HAL_SPI_Transmit_DMA(&hspi1, pData, (uint16_t)SPI_MAX_BURST);
        osSemaphoreWait(FrameBufferFlushToSPI_CompleteHandle, osWaitForever);
        pData += SPI_MAX_BURST;
        BytesToSend -= SPI_MAX_BURST;
    }
    HAL_SPI_Transmit_DMA(&hspi1, pData, (uint16_t)BytesToSend);
    osSemaphoreWait(FrameBufferFlushToSPI_CompleteHandle, osWaitForever);

    HAL_GPIO_WritePin(LCD_CS_PORT, LCD_CS_PIN, GPIO_PIN_SET);
}
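The "3 transfers for a full screen" figure can be checked with a short sketch of the same chunking logic. The value of SPI_MAX_BURST is not shown in the thread, so the 65535 used in the example below is an assumption based on the 16-bit size argument of HAL_SPI_Transmit_DMA; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the next SPI DMA transfer when `remaining` bytes are left and a
 * single transfer can carry at most `max_burst` bytes. */
static uint16_t next_chunk(uint32_t remaining, uint32_t max_burst)
{
    assert(max_burst > 0u && max_burst <= 65535u);
    return (uint16_t)(remaining > max_burst ? max_burst : remaining);
}

/* Number of SPI DMA transfers needed to send `total_bytes`. */
static uint32_t count_chunks(uint32_t total_bytes, uint32_t max_burst)
{
    uint32_t chunks = 0;

    while (total_bytes > 0u) {
        total_bytes -= next_chunk(total_bytes, max_burst);
        ++chunks;
    }
    return chunks;
}
```

Under that assumption, a full 320x240 frame is 153600 bytes and splits into transfers of 65535, 65535 and 22530 bytes, while a 100x50 update (10000 bytes) fits in a single transfer.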

Thanks to HP for the pointers that got me here. Hopefully, this may help someone in the future.

Nick