Performance drop due to alignment when using memcpy or memset

Question asked by overgaard.jorgen on Jan 21, 2012
Latest reply on Jan 21, 2012 by Clive One
Hi all! :D

I've been playing around with the STM32F4-Discovery, generating signals to drive a VGA monitor using two timers (vsync and hsync) and pure software for the pixel data (no DMA). I've gotten this far: http://www.youtube.com/watch?v=iZRwqjbeups
It uses double buffering, and the resolution is 320x200 with 256 colors. A 640x200 mode with 16 colors is also available. It works quite well. But when I started looking at how many cycles some routines took, I discovered that when blitting (in this case) a 200x200 pixel image to the framebuffer, every 4th x position was faster, about 10 times faster (or the others 10 times slower, depending on how you look at it).
Moving the image 4 px at a time (the framebuffer is aligned to begin with) keeps it steadily fast.
Does anyone know of a memset and memcpy that are better suited to this kind of thing?
One solution could be to handle the first unaligned bytes "manually" and then call memcpy for the rest, since the destination is then aligned to 4?
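
To make that idea concrete, here is a rough sketch of what I mean. The names copy_align_dst and fill_align_dst are made up for illustration and are not library functions. It assumes 1 byte per pixel (the 256-color mode) and only aligns the destination. If the source has a different misalignment, the bulk copy still reads from an unaligned address, so whether this actually helps depends on the toolchain's memcpy and has to be measured.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes, writing the unaligned head of dst
 * byte by byte so that the bulk memcpy starts on a 4-byte boundary. */
static void *copy_align_dst(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Bytes needed to bring d up to the next 4-byte boundary (0..3). */
    size_t head = (size_t)(-(uintptr_t)d & 3u);
    if (head > n)
        head = n;

    n -= head;
    while (head--)
        *d++ = *s++;

    /* d is now 4-byte aligned (or n is 0); let memcpy do the rest. */
    memcpy(d, s, n);
    return dst;
}

/* Hypothetical helper: same idea for filling a span with one byte value. */
static void *fill_align_dst(void *dst, int value, size_t n)
{
    uint8_t *d = dst;

    size_t head = (size_t)(-(uintptr_t)d & 3u);
    if (head > n)
        head = n;

    n -= head;
    while (head--)
        *d++ = (uint8_t)value;

    memset(d, value, n);
    return dst;
}
```

For a blit, each image row would be copied with something like copy_align_dst(fb + y * pitch + x, img + row * width, width), where fb, pitch, img and width are placeholders for whatever the real framebuffer code uses.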

Am I making any sense with my question/problem?

Best regards
Jörgen