DMA and Concurrency Concepts

craig · ‎2013-08-22

Posted on August 22, 2013 at 18:12

I'm new to ARM and have been doing a lot of work/research on DMA on the STM32F4 Discovery board. I have multiple DMA streams working correctly for SPI to an OLED and some LED controllers but I have a fundamental question I can't seem to locate the answer to anywhere. My background is in x86 development (including multi-threading) but in higher level languages and I can't figure out how much of my experience applies to ARM and DMA.

For example, I have a byte array containing a "video frame" of data for my LED controllers in RAM (think an application like an LED video wall). I start the stream and it begins sending the data to the LED controllers. If I were to immediately start rendering the next frame and manipulate the data in the same array while DMA is still transferring and accessing it, what happens? Do I need to implement my own locking mechanism to prevent modifying the data until the DMA transfer-complete interrupt fires? Should I implement double-buffering, so DMA transfers from one buffer while I manipulate a second? Is it safe as far as the chip is concerned to modify the data if I don't care that it might send data from a previous frame mixed with data from the currently rendering frame? (I think this would effectively result in a visual "tearing" effect in my example case.)

I really feel like I need a really firm grasp of this concept so I can effectively use all of the extra MCU cycles freed up by using DMA without stepping on DMA's toes or wasting RAM. I'll happily read a document on the subject if anyone has such a resource but I've just been unable to locate much information on the topic. Any help is appreciated!

Tesla DeLorean · ‎2013-08-22

Posted on August 22, 2013 at 19:03

Well, it's like chasing the raster as it paints the screen: if you get ahead of yourself, it's going to be visible.

So yes, you can have alternating frames, and ping/pong between them.
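As a rough sketch of the ping/pong idea, assuming a two-buffer layout and a transfer-complete interrupt: everything here is hypothetical (FRAME_BYTES, dma_start(), the function names), and dma_start() only simulates the hardware so the logic can be run on a PC. A real version would program the stream registers there and would need to think about interrupt masking around the shared flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_BYTES 8u  /* hypothetical frame size */

static uint8_t frames[2][FRAME_BYTES];      /* ping and pong buffers */
static volatile uint8_t dma_idx = 0;        /* buffer the DMA stream owns */
static volatile bool dma_busy = false;      /* a transfer is in flight */
static volatile bool frame_ready = false;   /* back buffer is waiting to be sent */
static const uint8_t *last_started = NULL;  /* simulation only: last DMA source */

/* Hypothetical stand-in for loading the stream's memory address and item
 * count and enabling it. Here it only records the request. */
static void dma_start(const uint8_t *src, uint32_t len)
{
    assert(len <= FRAME_BYTES);
    last_started = src;
    dma_busy = true;
}

/* Hand the finished back buffer to DMA and take the other one back. */
static void swap_and_send(void)
{
    dma_idx = (uint8_t)(dma_idx ^ 1u);
    frame_ready = false;
    dma_start(frames[dma_idx], FRAME_BYTES);
}

/* Buffer the CPU may draw into, or NULL while a finished frame is still
 * queued behind a running transfer. */
uint8_t *render_target(void)
{
    return frame_ready ? NULL : frames[dma_idx ^ 1u];
}

/* Main loop calls this when the back buffer holds a complete frame. */
void present_frame(void)
{
    frame_ready = true;
    if (!dma_busy) {
        swap_and_send();
    }
}

/* Would be called from the transfer-complete interrupt handler. */
void on_transfer_complete(void)
{
    dma_busy = false;
    if (frame_ready) {
        swap_and_send();
    }
}
```

The CPU only ever writes to the buffer returned by render_target(), so DMA never reads a half-drawn frame. The cost is the second buffer's worth of RAM.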

You could wait for completion, if that provides you enough time in the blanking delay to do your work.
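A minimal sketch of the wait-for-completion approach with a single buffer, assuming a flag that the transfer-complete interrupt sets. The names (render_and_send(), dma_start(), FRAME_BYTES) are made up for illustration, and dma_start() just counts calls instead of touching hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAME_BYTES 8u  /* hypothetical frame size */

static uint8_t frame[FRAME_BYTES];      /* the only frame buffer */
static volatile bool dma_idle = true;   /* set again by the completion interrupt */
static uint32_t starts = 0;             /* simulation only: transfers started */

/* Hypothetical stand-in for configuring and enabling the stream. */
static void dma_start(const uint8_t *src, uint32_t len)
{
    (void)src;
    assert(len <= FRAME_BYTES);
    starts++;
}

/* Would be called from the transfer-complete interrupt handler. */
void on_transfer_complete(void)
{
    dma_idle = true;
}

/* Renders a trivial solid-colour frame and sends it, but only if the
 * previous transfer has finished. Returns false if the buffer is still
 * owned by DMA, in which case nothing is modified. */
bool render_and_send(uint8_t fill)
{
    if (!dma_idle) {
        return false;
    }
    for (uint32_t i = 0; i < FRAME_BYTES; i++) {
        frame[i] = fill;
    }
    dma_idle = false;
    dma_start(frame, FRAME_BYTES);
    return true;
}
```

This saves the second buffer, but rendering and transfer never overlap, so it only pays off if the gap between transfers is long enough for the rendering work.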

Locking probably won't work. You could determine the current place in the transfer and use that as a "fence" or "boundary" to start or limit operations. You could also sort manipulation tasks so they are done in the order most conducive to the display/painting order.
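The fence idea might look something like the sketch below. On the STM32F4 the stream's NDTR register holds the number of data items still to be transferred, so the current position is the total minus that count. Here remaining_items is a simulated stand-in for that register, and FRAME_BYTES and write_behind_fence() are hypothetical names. It assumes a one-shot (non-circular) transfer of bytes, and a real design might want to keep a safety margin behind the fence.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_BYTES 16u  /* hypothetical frame size */

/* Simulated stand-in for the stream's remaining-item counter. */
static volatile uint32_t remaining_items = FRAME_BYTES;

/* Number of bytes the DMA stream has already consumed from the frame. */
uint32_t dma_position(void)
{
    uint32_t remaining = remaining_items;
    assert(remaining <= FRAME_BYTES);
    return FRAME_BYTES - remaining;
}

/* Copies bytes of the next frame into the live buffer, starting at *cursor,
 * but never at or past the fence (the point DMA has reached). Bytes behind
 * the fence have already been sent, so overwriting them cannot tear the
 * frame currently on the wire. Returns the number of bytes written. */
uint32_t write_behind_fence(uint8_t *frame, const uint8_t *next, uint32_t *cursor)
{
    uint32_t fence = dma_position();
    uint32_t written = 0;

    while (*cursor < fence) {
        frame[*cursor] = next[*cursor];
        (*cursor)++;
        written++;
    }
    return written;
}
```

Calling write_behind_fence() repeatedly from the main loop lets the CPU trail the transfer through the buffer, at the price of having to render in transmission order.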


craig · ‎2013-08-22

Posted on August 22, 2013 at 19:23

Those options all make sense and it helps me understand the behavior much better. Thank you! I might even implement them all just to get really comfortable with DMA. The power is knocking my socks off only having experience in x86 and AVR. ARM sits so perfectly right in the middle and is closer to multi-threaded x86 in terms of flexibility/power than I imagined.

For my future reference, is this type of information about the low-level behavior or common architecture principles/practices available anywhere? Perhaps a book one might recommend?

Tesla DeLorean · ‎2013-08-22

Posted on August 22, 2013 at 19:57

Architecturally you'll need to focus on Data Manuals and Reference Manuals for the STM32 part(s) you choose. These cover things peripheral to the core, like DMA and USARTs, etc. Google your part# and go to the tab at http://www.st.com/web/catalog/mmc/FM141/SC1169/SS1577/LN11/PF252140

They also have Programming Manuals: one deals with Flash memory, the other with the core (Cortex-M0/3/4).

ARM has TRMs (Technical Reference Manuals) for the core.

Joseph Yiu has a number of books on the Cortex parts, which are pitched differently from the TRM, and useful from a foundational perspective. He has one (http://www.amazon.com/Definitive-Cortex%C2%AE-M3-Cortex%C2%AE-M4-Processors-Edition/dp/0124080820/ref=sr_1_3?ie=UTF8&qid=1377193645&sr=8-3&keywords=joseph+yiu) scheduled for mid November.

The DMA controllers are more similar to those used in the x86 PC during the pre-PCI era (i.e. ISA), although with flexibility closer to that of PCI-master devices.


root · ‎2013-08-23

Posted on August 23, 2013 at 09:46

Hello,

I totally second Clive. Even high-end PC video cards use at least double buffering so they can render into one buffer while the other is displayed.

Add Mr Yiu's Definitive Guide on Cortex-Mx (for general knowledge on the Cortex-M philosophy) and technical stuff available from ST (to be able to apply it to STM32). That's what I did and I can now dev a few days without needing Clive's help 😉

Thomas.

PS : if there is a new version of the Definitive Guide to the Cortex-M arriving, I'll probably sell mine to get the updated one.