Code in CCM RAM

matic
Associate III
Posted on January 09, 2016 at 20:36

Hi.

I tried putting code in the CCM RAM (STM32F303) as described in AN4296, and I was very impressed. It ran more than 30% faster than the same code executing from flash.

The application note (AN4296) says that placing both code and data together in the CCM is not recommended, because there would be a risk of "collision". Does the same problem apply if, besides the code, I place a constant look-up table in the CCM? At the moment the table is stored in flash.

Thanks
14 REPLIES
matic
Associate III
Posted on January 11, 2016 at 06:48

I followed the steps described in http://www.st.com/st-web-ui/static/active/en/resource/technical/document/application_note/DM00083249.pdf. I use Keil, and the procedure there is very simple: only a few changes in the scatter file, and then __attribute__((section("ccmram"))) before a function definition tells the linker where that function should go.
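As a rough illustration of the attribute mentioned above, here is a minimal sketch in C. It assumes the scatter file (or linker script) already maps an input section named "ccmram" to the CCM region, as AN4296 describes. The macro name CCM_FUNC and the function scale_q15 are hypothetical and only serve the example. On a non-ARM host the macro expands to nothing, so the same source still builds for testing.

```c
#include <stdint.h>

/* Assumption: the scatter file / linker script places the input section
   named "ccmram" into CCM RAM (see AN4296). CCM_FUNC is a made-up helper
   macro, not something provided by ST or Keil. */
#if defined(__arm__)
#define CCM_FUNC __attribute__((section("ccmram")))
#else
#define CCM_FUNC /* host build: no special placement */
#endif

/* Hypothetical time-critical routine: multiply a sample by a Q15 gain
   (32768 represents 1.0). The 64-bit intermediate avoids overflow. */
CCM_FUNC int32_t scale_q15(int32_t sample, int32_t gain_q15)
{
    return (int32_t)(((int64_t)sample * gain_q15) / 32768);
}
```

After building for the target, the map file should show the function at an address inside the CCM range rather than in flash.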

mikael239955_stm1_st
Associate III
Posted on January 11, 2016 at 17:26

It would have been nice if the CCMRAM had some double-buffered addresses for DMA. For really efficient DSP, processed data could then be moved efficiently in and out of the CCMRAM to wherever it is needed.

Posted on January 11, 2016 at 17:38

The point is that CCM is unfettered by DMA contention; why would you need that for executing code? The regular SRAM is the same single-cycle stuff as CCM, so I can't see a significant downside to using that.

The CCM is already small enough, you don't want to clutter it up with large blocks of data streaming in/out.

matic
Associate III
Posted on January 11, 2016 at 19:05

I set a breakpoint on the first function in the while (1) loop and set the breakpoint's "pass count" to 500. Then I stopped the program at the breakpoint and reset the "Cycle Count", which is part of the "Cortex Status" tool in the debugger. That way I can see how much time 500 passes through the loop take. I repeated this for 4 different cases. Below are the results for one pass through the loop:

- code from CCM without optimization: 10.5 us

- code from CCM with optimization: 7.2 us

- code from flash without optimization: 17.3 us

- code from flash with optimization: 11.0 us

I confirmed those times (cycle counts) with ETM trace, where the profiler shows exactly how much time each part takes. For both cases where the code was running from flash, the times were the same. For the CCM I couldn't verify the times with ETM, because ETM works only if the code is in flash.
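To put the four measurements above in relation to each other, the time saved by running from CCM can be worked out with a small helper. This is only a sketch of the arithmetic; the function names are hypothetical. With the posted figures it gives roughly 39% less time per loop pass without optimization (17.3 us vs 10.5 us) and roughly 35% with optimization (11.0 us vs 7.2 us).

```c
/* Percentage of execution time saved when moving code from flash to CCM.
   Both arguments are times per loop pass, in the same unit (here us). */
double time_saved_percent(double flash_us, double ccm_us)
{
    return 100.0 * (flash_us - ccm_us) / flash_us;
}

/* Speed-up factor: how many times faster the CCM build runs. */
double speedup_factor(double flash_us, double ccm_us)
{
    return flash_us / ccm_us;
}
```

For example, speedup_factor(17.3, 10.5) is about 1.65 and speedup_factor(11.0, 7.2) is about 1.53.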

mikael239955_stm1_st
Associate III
Posted on January 12, 2016 at 22:31

I was talking about processing data as fast as possible (DSP), not necessarily executing program code. I'm surprised you see DMA as an obstacle even when it is not used with the CCM, so why limit the CCM to the core alone in the first place? Some devices already have a very limited amount of SRAM.