Linas L
Senior
May 21, 2022
Question

Performing mathematics on data in SRAM while DMA is copying data from DCMI at maximum speed.

Hello, I am making a sensor that uses a camera, and I would like to perform a simple mathematical operation on the image data while it is still being read out.

My algorithm is very linear and does not need all data, so in theory, I should be able to do that.

(In an FPGA it is extraordinarily simple to do in real time.)

My main concern is that I will be loading AHB bus while DMA is copying data from DCMI to SRAM. If I set DMA with highest priority, as far as I understand I could get DMA overrun, since ARM core access priority is larger than DMA?

The original idea is to use the HSYNC interrupt to count lines: once a new line has been copied, I could start doing the mathematical operations on that line in SRAM while the DMA is copying the next line. I also get a bit of horizontal blanking time during which the DMA is idle.
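The line-counting idea could be sketched roughly as follows. This is only a host-runnable illustration under assumptions: the frame size is tiny, `on_line_end_irq` is a hypothetical hook that would be called from whatever line-end interrupt the part provides, and the "mathematical operation" is just a sum of the 10-bit pixel values.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_PIXELS 8u  /* hypothetical, tiny for illustration */
#define FRAME_LINES 4u  /* hypothetical */

uint16_t frame[FRAME_LINES][LINE_PIXELS]; /* DMA destination buffer */
volatile uint32_t lines_done;             /* written only by the ISR */
uint32_t lines_processed;                 /* written only by the main loop */
uint32_t line_sums[FRAME_LINES];          /* one result per line */

/* Hypothetical hook: call this from the line-end (HSYNC) interrupt. */
void on_line_end_irq(void)
{
    lines_done++;
}

/* Process every line the DMA has completely written so far.
 * Returns the number of lines handled in this call. */
uint32_t process_ready_lines(void)
{
    uint32_t ready = lines_done; /* snapshot of the ISR counter */
    uint32_t handled = 0;

    while (lines_processed < ready && lines_processed < FRAME_LINES) {
        const uint16_t *line = frame[lines_processed];
        uint32_t sum = 0;
        for (size_t i = 0; i < LINE_PIXELS; i++) {
            sum += line[i] & 0x3FFu; /* keep the 10 valid bits */
        }
        line_sums[lines_processed] = sum;
        lines_processed++;
        handled++;
    }
    return handled;
}
```

The main loop only ever touches lines below the ISR counter, so it never reads a line the DMA is still writing. Resetting both counters at the start of each frame is left out of the sketch.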

My Cortex-M33 will be running at 160 MHz, and the camera will be running at 60 MHz.

(The theoretical maximum is 64 MHz, since the frequency ratio DCMI_PIXCLK/f_HCLK must not exceed 0.4.) I am also capturing 10-bit data, so each 32-bit DMA transfer carries 2 pixels (a word is 32 bits and a 10-bit pixel occupies a 16-bit half-word, so 2x packing). The effective transfer rate is therefore 32 MHz at the theoretical maximum, or 30 MHz at a 60 MHz pixel clock.
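The arithmetic above can be checked with a small sketch. It assumes one 10-bit pixel per pixel-clock cycle and two pixels per 32-bit word, and it counts only the one SRAM write per word, ignoring the DMA's read from the DCMI data register and any bus-matrix latency. The function names are hypothetical.

```c
#include <stdint.h>

/* 32-bit words per second that the DMA has to write to SRAM. */
uint32_t dcmi_words_per_second(uint32_t pixclk_hz, uint32_t pixels_per_word)
{
    return pixclk_hz / pixels_per_word;
}

/* HCLK cycles available between two DMA word writes, scaled by 100
 * so the result keeps two decimal places without floating point. */
uint32_t hclk_cycles_per_word_x100(uint32_t hclk_hz, uint32_t words_per_s)
{
    return (uint32_t)(((uint64_t)hclk_hz * 100u) / words_per_s);
}
```

With a 60 MHz pixel clock and a 160 MHz HCLK this gives 30 M words per second, i.e. roughly one DMA write to SRAM every 5.3 HCLK cycles. The remaining cycles are what the core can use for its own SRAM accesses, under the simplifications stated above.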

Any advice on how I can make this work? The hardware is still under way, so I have no way of testing how it behaves or whether I get corrupted data.

This topic has been closed for replies.

2 replies

waclawek.jan
Super User
May 21, 2022

Which STM32?

> If I set DMA with highest priority, as far as I understand I could get DMA overrun, since ARM core access priority is larger than DMA?

Where do you have that information from?

There is not much information about the details of arbitration between the busmasters in STM32 bus matrices, but all the information available indicates that in AHB bus matrices it's usually simple round-robin arbitration, i.e. all busmasters stand the same chance to access a single slave bus.

Maybe you want to read AN5593. You can also benchmark on available Nucleo boards, imitating the camera by generating clocks using timers.

I'm not sure you will be able to perform any reasonable computation on a continuous stream of images. As you've said, this is a task for an FPGA.

JW

Tesla DeLorean
Guru
May 21, 2022

I don't think it's a priority issue as much as it's a bandwidth/saturation issue.

Doing things at wire speed would seem to suggest an FPGA/CPLD is still a better solution.

Are you able to drop the frame rate, or use a FIFO memory?

Check if your STM32 has different SRAM banks with independent bus matrix plumbing; perhaps you can ping-pong between different areas using the DMA "double buffer" modes.
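A minimal sketch of the ping-pong bookkeeping, runnable on a host. Everything here is an assumption for illustration: `on_dma_buffer_complete` is a hypothetical hook for the DMA's buffer-complete event, the buffers are tiny, and placing each buffer in a separate SRAM bank (a part-specific linker matter) is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_PIXELS 8u /* hypothetical, tiny for illustration */

/* Two line buffers; ideally each would live in a different SRAM bank. */
uint16_t line_buf[2][LINE_PIXELS];

/* Index of the buffer the DMA is currently filling. */
volatile uint32_t dma_target;

/* Hypothetical hook: call when the DMA has filled one buffer and the
 * hardware has switched to the other one. */
void on_dma_buffer_complete(void)
{
    dma_target ^= 1u;
}

/* The buffer the CPU may safely read: the one the DMA is NOT filling. */
const uint16_t *cpu_buffer(void)
{
    return line_buf[dma_target ^ 1u];
}

/* Example per-line operation: largest 10-bit pixel value in a line. */
uint16_t line_max(const uint16_t *line)
{
    uint16_t best = 0;
    for (size_t i = 0; i < LINE_PIXELS; i++) {
        uint16_t v = (uint16_t)(line[i] & 0x3FFu);
        if (v > best) {
            best = v;
        }
    }
    return best;
}
```

If the two buffers really do sit behind separate bus-matrix slave ports, the CPU reads and the DMA writes would not contend for the same SRAM. Whether that holds depends on the specific part.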

Linas L (Author)
Senior
May 21, 2022

Hello,

Thank you for the reply.

> In case of concurrent accesses from the CPU and the GPDMA, the bus matrix arbitration rules the access to the SRAM1. If the last access is from the CPU, during the next access, the GPDMA wins the bus, and accesses SRAM1. After the CPU can again access SRAM1.

Based on this, it looks like if I read data from memory into registers and perform the operation without accessing SRAM, the GPDMA will get priority and write its data to memory. So while the camera is pumping data into SRAM, I need to insert enough NOPs to let the GPDMA do its job, and after the last byte from the camera I can jump into a high-performance mode without any NOPs.

In this situation, the NOPs need to be tailored to the bus load so that I never get an overrun.
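The "load into registers, then compute without touching SRAM" part could look roughly like the sketch below. It assumes the two 10-bit pixels of each 32-bit word sit in bits 9:0 and 25:16 (one pixel per half-word); that layout is an assumption to verify against the reference manual. The function name is hypothetical and the operation is a plain sum.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum all pixels of a line stored as packed 32-bit words.
 * Each loop iteration does exactly one SRAM read (the word load);
 * unpacking and accumulation then happen in registers only. */
uint32_t sum_packed_line(const uint32_t *words, size_t n_words)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n_words; i++) {
        uint32_t w = words[i];            /* single 32-bit SRAM access */
        uint32_t p0 = w & 0x3FFu;         /* assumed: pixel in bits 9:0 */
        uint32_t p1 = (w >> 16) & 0x3FFu; /* assumed: pixel in bits 25:16 */
        sum += p0 + p1;
    }
    return sum;
}
```

Reading one word per two pixels halves the number of CPU accesses to SRAM compared with half-word loads, which leaves more bus slots free for the GPDMA without hand-tuned NOPs.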

Is this OK, or stupid thinking?