2026-01-17 8:11 PM
Hi
I'd like to do a memory-to-memory transfer, where I take an 8x12 px rectangle and copy it into a framebuffer. My display is 256x64 px and it uses a 4 bpp grayscale format. My framebuffer is 8192 bytes, and the rectangle that I need to copy is 48 bytes. I'm confused about whether I should use L4 or L8, because L8 seems more straightforward for my byte-oriented graphics.
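For reference, the sizes quoted above follow from the 4 bpp packing (two pixels per byte). A minimal sketch of that arithmetic; the macro and function names are hypothetical and not part of the original code, and it assumes the rectangle width is an even number of pixels:

```c
#include <assert.h>
#include <stdint.h>

#define BITS_PER_PIXEL 4u /* 4 bpp grayscale: two pixels share one byte */

/* Bytes needed for a w x h pixel rectangle at 4 bpp.
 * Assumption: w_px is even, so every row ends on a byte boundary. */
static uint32_t rect_bytes_4bpp(uint32_t w_px, uint32_t h_px)
{
    assert((w_px % 2u) == 0u);
    return (w_px * BITS_PER_PIXEL / 8u) * h_px;
}
```

With these definitions, rect_bytes_4bpp(256, 64) gives the 8192-byte framebuffer and rect_bytes_4bpp(8, 12) gives the 48-byte rectangle.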
I set my screen byte width to 128, and I used L8. My output image looks like garbage, as if the alignment were off (a bad stride, maybe). What is suspicious is that the garbage image is 4 lines tall, which could mean that the stride is calculated with 2 bytes per pixel instead of 1. Based on RM0468, I think that in the mode without PFC only the foreground layer is used, and that if I select L8 with hdma2d.LayerCfg[1].InputColorMode = DMA2D_INPUT_L8; (which, by the way, couldn't be set in CubeMX), the address should increment correctly.
In addition, copying a full 256x64 px image to the framebuffer works properly.
void write_framebuffer(const uint8_t *pSrc, uint16_t x, uint16_t y, uint16_t w, uint16_t h) {
    if(x > 255) x = 255;
    if(x < 0) x = 0;
    if(y > 63) y = 63;
    if(y < 0) y = 0;
    uint16_t byte_w = w / 2;
    uint16_t byte_x = x / 2;
    uint32_t stride = SCREEN_WIDTH_BYTES - byte_w;
    uint32_t destAddr = (uint32_t) (&framebuffer) + (y * SCREEN_WIDTH_BYTES) + byte_x;
    DMA2D->OOR = stride;
    DMA2D->NLR = (h << 16) | (byte_w);
    DMA2D->FGMAR = (uint32_t) pSrc;
    DMA2D->OMAR = destAddr;
    DMA2D->CR |= DMA2D_CR_START;
    while (DMA2D->CR & DMA2D_CR_START) {
        // Spin lock (very fast for small icons)
    }
}

static void MX_DMA2D_Init(void) {
    /* USER CODE BEGIN DMA2D_Init 0 */
    /* USER CODE END DMA2D_Init 0 */
    /* USER CODE BEGIN DMA2D_Init 1 */
    /* USER CODE END DMA2D_Init 1 */
    hdma2d.Instance = DMA2D;
    hdma2d.Init.Mode = DMA2D_M2M;
    hdma2d.Init.ColorMode = DMA2D_OUTPUT_RGB565;
    hdma2d.Init.OutputOffset = 0;
    hdma2d.LayerCfg[1].InputOffset = 0;
    hdma2d.LayerCfg[1].InputColorMode = DMA2D_INPUT_L8;
    hdma2d.LayerCfg[1].AlphaMode = DMA2D_NO_MODIF_ALPHA;
    hdma2d.LayerCfg[1].InputAlpha = 0;
    hdma2d.LayerCfg[1].AlphaInverted = DMA2D_REGULAR_ALPHA;
    hdma2d.LayerCfg[1].RedBlueSwap = DMA2D_RB_REGULAR;
    hdma2d.LayerCfg[1].ChromaSubSampling = DMA2D_NO_CSS;
    if (HAL_DMA2D_Init(&hdma2d) != HAL_OK) {
        Error_Handler();
    }
    if (HAL_DMA2D_ConfigLayer(&hdma2d, 1) != HAL_OK) {
        Error_Handler();
    }
    /* USER CODE BEGIN DMA2D_Init 2 */
    /* USER CODE END DMA2D_Init 2 */
}
2026-01-17 10:59 PM
void write_framebuffer(const uint8_t *pSrc, uint16_t x, uint16_t y, uint16_t w, uint16_t h) {
    /* x, y, w and h are given in bytes (L8: one byte per "pixel") */
    /* 1. Wait if the DMA2D is currently busy */
    while (DMA2D->CR & DMA2D_CR_START)
        ;
    /* 2. Configure for "Raw Byte Copy" (L8 Mode) */
    // M2M Mode: Memory to Memory (MODE field of CR = 0)
    //DMA2D->CR &= ~(DMA2D_CR_MODE);
    // Input Color Mode: L8 (1 byte per pixel)
    DMA2D->FGPFCCR = DMA2D_INPUT_L8;
    /* 3. Safety Limits */
    if (x >= SCREEN_WIDTH_BYTES)
        return; // Completely out of bounds
    if (x + w > SCREEN_WIDTH_BYTES)
        w = SCREEN_WIDTH_BYTES - x; // Clip to the right edge
    uint32_t max_h = 8192 / SCREEN_WIDTH_BYTES;
    if (y >= max_h)
        return; // Completely out of bounds
    if (y + h > max_h)
        h = max_h - y; // Clip to the bottom edge
    /* 4. Calculate Offsets (Stride) */
    // Destination Stride: How many bytes to jump after writing a line
    DMA2D->OOR = SCREEN_WIDTH_BYTES - w;
    // Source Stride: 0 (Your source is a contiguous packed array)
    DMA2D->FGOR = 0;
    /* 5. Set Dimensions */
    // Format: [Width (pixels per line), upper half] | [Height (number of lines), lower half]
    DMA2D->NLR = (w << 16) | (h);
    /* 6. Set Addresses */
    DMA2D->FGMAR = (uint32_t) pSrc;
    DMA2D->OMAR = (uint32_t) (&framebuffer[0]) + (y * SCREEN_WIDTH_BYTES) + x;
    /* 7. Start Transfer */
    DMA2D->CR |= DMA2D_CR_START;
}

After reverting to using bytes as coordinates and sizes, I noticed that the width and height fields of the NLR register were swapped, so I've attached the corrected function. It is now working properly.
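To make the NLR fix easier to check off-target, here is a minimal sketch of the two register values as pure functions. The helper names are hypothetical (they are not HAL or CMSIS functions), SCREEN_WIDTH_BYTES is assumed to be 128 as stated in the question, and the field limits assume the layout I understand RM0468 to describe: pixels per line in bits 29:16 and number of lines in bits 15:0.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_WIDTH_BYTES 128u /* assumption: value quoted in the question */

/* Value for DMA2D->NLR: width (pixels per line) goes in the upper half,
 * height (number of lines) in the lower half.
 * Hypothetical helper, not part of the HAL. */
static uint32_t dma2d_nlr_value(uint32_t width, uint32_t height)
{
    assert(width >= 1u && width <= 0x3FFFu);   /* assumed 14-bit PL field */
    assert(height >= 1u && height <= 0xFFFFu); /* assumed 16-bit NL field */
    return (width << 16) | height;
}

/* Value for DMA2D->OOR: pixels to skip at the end of each output line.
 * In L8 mode one pixel is one byte, so this is also a byte count. */
static uint32_t dma2d_oor_value(uint32_t width)
{
    assert(width <= SCREEN_WIDTH_BYTES);
    return SCREEN_WIDTH_BYTES - width;
}
```

For the 8x12 px icon (4 bytes wide, 12 lines), dma2d_nlr_value(4, 12) packs width 4 and height 12. The original (h << 16) | byte_w instead programs 12 pixels per line and 4 lines, which would be consistent with the 4-line-tall garbage described in the question.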