
Improve code processing time

e d
Associate II
Posted on June 27, 2017 at 19:18

I am working with the STM32L486 and we aim to use as low clock speed as possible to conserve power. As a result, I need to improve the run time inside one high rate interrupt to meet the timing requirement. I was trying to place the routine in RAM using ramfunc but it didn't help much. I used to work with TI DSPs and this usually helped considerably. Any ideas why? Or general comments on how to speed things up, aside from running from RAM and turning on code optimization?

Thanks,

ED

Note: this post was migrated and contained many threaded conversations, some content may be missing.
26 REPLIES
AvaTar
Lead
Posted on June 27, 2017 at 19:51

The Flash interface is, as far as I remember, accessed 128 bit wide, i.e. four words are fetched at once into a prefetch buffer.

The RAM has no such interface.

RAM functions are useful on STM32 (and many other Cortex-M parts) only when the Flash is inaccessible (during erase/program).

... we aim to use as low clock speed as possible to conserve power.

... I need to improve the run time inside one high rate interrupt to meet the timing requirement.

These two requirements contradict each other.

If your high-rate-interrupt is required to run all the time, you can hardly save any power.

If not, go into one of the sleep modes as often as possible.

Rob.Riggs
Senior III
Posted on June 27, 2017 at 19:53

What address in SRAM is your code at? This makes a difference. Code in SRAM2 will run much faster at its 0x10000000 base address than at its remapped 0x200xxxxx address.

At slow clock speeds, where Flash wait states are low, the performance difference is negligible especially if ART is enabled.
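For readers who want to try the SRAM2 placement described above, here is a minimal sketch using the GCC section attribute. The section name `.sram2_text`, the function `scale_sample` and the helper `runs_from_sram2_alias` are hypothetical names chosen for illustration; the linker script and startup copy they rely on are assumptions, not something shown in this thread.

```c
#include <stdint.h>

/* HYPOTHETICAL section name: the linker script must define an output section
 * ".sram2_text" located in the SRAM2 region at 0x10000000, and the startup
 * code must copy its contents from Flash before the function is first called. */
#if defined(__GNUC__) && defined(__arm__)
#define SRAM2_FUNC __attribute__((section(".sram2_text"), noinline))
#else
#define SRAM2_FUNC /* host build: plain function, no special placement */
#endif

/* Example hot routine: Q15 gain applied to one sample
 * (non-negative inputs assumed in this sketch). */
SRAM2_FUNC int32_t scale_sample(int32_t sample, int32_t gain_q15)
{
    return (int32_t)(((int64_t)sample * gain_q15) >> 15);
}

/* Returns 1 if addr lies in the SRAM2 alias that starts at 0x10000000.
 * The SRAM2 size is passed in by the caller; check the datasheet for it. */
static int runs_from_sram2_alias(uintptr_t addr, uint32_t sram2_size)
{
    return addr >= 0x10000000u && (addr - 0x10000000u) < sram2_size;
}
```

On the target, passing `(uintptr_t)&scale_sample` to the helper (or simply reading the map file) shows whether the routine really ended up at the 0x10000000 alias rather than at 0x200xxxxx.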

e d
Associate II
Posted on June 27, 2017 at 20:32

The external clock we will use is 19.2 MHz and the high rate interrupt is 100 kHz, which gives me 10 µs to process things in there. The interrupt runs in 1 second bursts every minute or so. With a clock as slow as 19.2 MHz I would have to be ultra efficient and/or find a way to speed up, hence the question. Going into sleep modes is not an option since we need to run other low speed stuff during 'off burst' time. The hardware guys think we could save battery power by going as slow as possible. I personally want to run at 40 MHz or faster.
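To put the numbers above in terms of core cycles, here is a small sketch of the budget arithmetic. The function names are hypothetical, and the entry/exit overhead is a value the caller supplies as an assumption (exception entry on a Cortex-M4 takes about 12 cycles from zero-wait-state memory; exit is of a similar order).

```c
#include <stdint.h>

/* Core clock cycles available between two consecutive interrupts. */
static uint32_t cycles_per_interrupt(uint32_t core_hz, uint32_t irq_hz)
{
    return core_hz / irq_hz;
}

/* Cycles left for the handler body after a fixed entry/exit overhead.
 * overhead_cycles is an ASSUMPTION supplied by the caller, not a measurement. */
static uint32_t handler_budget(uint32_t core_hz, uint32_t irq_hz,
                               uint32_t overhead_cycles)
{
    uint32_t total = cycles_per_interrupt(core_hz, irq_hz);
    return (total > overhead_cycles) ? (total - overhead_cycles) : 0u;
}
```

At 19.2 MHz and 100 kHz this gives 192 cycles per interrupt period; at 40 MHz it gives 400. Whatever the handler uses also comes out of the time left for the low speed work during a burst.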

Posted on June 27, 2017 at 20:35

With the ART, the prefetch path effectively delivers the data within the current cycle, i.e. in less than the one cycle a RAM access would take. Only the Flash is cached, and the Flash line is wide: 128 bits in the F4 case, as I recall.

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..
Posted on June 27, 2017 at 20:36

It was at the 0x200xxxxx address. I will need to try SRAM2 as you suggested. Do you know roughly how much faster it is, so I can compare? Also, I will look into how to use the ART if RAM shows no improvement. Thanks guys!

Posted on June 27, 2017 at 20:56

On the F4 the ART adds some unpredictability (hit or miss?), but even on a cache miss, linear execution from Flash tends to be better than from RAM; there was a thread where the timing was measured. Again on the F4, the CCMRAM is not designed for execution, whereas on the F3 it is, but the F3 doesn't have the ART.

Posted on June 27, 2017 at 21:13

He's on an L4, not an F4.  The L4's SRAM is executable and SRAM2 is faster than flash, but only barely so.  It *is* faster at lower power because Flash and ART can be turned off to save power.

Posted on June 27, 2017 at 21:26

I just did an experiment where I turned ICEN, DCEN and PRFTEN on and off in real time through the View/Register, and the DCEN seemed to be the most effective. Unfortunately my code had already had the ICEN and DCEN bits turned on by default so the ART speed up idea was already 'built in'. I still need to find somewhere else to squeeze some speed out of. Any other ideas? Thanks a bunch!
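The same experiment can be done from code instead of the debugger's register view. The sketch below toggles the three bits through a pointer to the Flash access control register. The bit positions (PRFTEN = bit 8, ICEN = bit 9, DCEN = bit 10) are an assumption to verify against the reference manual for your part, and the function names are hypothetical; on the target you would pass `&FLASH->ACR` from the CMSIS device header.

```c
#include <stdint.h>

/* ASSUMED FLASH_ACR bit positions; verify against the reference manual. */
#define ACR_PRFTEN (1u << 8)   /* prefetch enable          */
#define ACR_ICEN   (1u << 9)   /* instruction cache enable */
#define ACR_DCEN   (1u << 10)  /* data cache enable        */

/* Enable exactly the accelerator bits in enable_mask, clear the other two,
 * and leave every unrelated bit (e.g. the latency field) untouched. */
static void flash_accel_set(volatile uint32_t *acr, uint32_t enable_mask)
{
    const uint32_t all = ACR_PRFTEN | ACR_ICEN | ACR_DCEN;
    *acr = (*acr & ~all) | (enable_mask & all);
}

/* Returns 1 if every bit in mask is currently set. */
static int flash_accel_is_enabled(const volatile uint32_t *acr, uint32_t mask)
{
    return (*acr & mask) == mask;
}
```

Timing the interrupt handler with each combination gives a repeatable comparison, rather than relying on an impression from live toggling.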

Posted on June 28, 2017 at 02:20

What is it that you are doing in the interrupt? The most common way to save power is to shut down the core and use DMA, deferring any processing until a batch of data is available to work on, then wake up, process the data at high speed, and go back to sleep.

Without a clearer idea about what you are doing, it is really hard to help.
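As a rough illustration of the batching idea in this reply, here is a hardware-independent sketch of a ping-pong buffer: DMA fills one half while the core processes the other. Everything here is hypothetical (the type `batch_buf_t`, the batch length, the summing "processing" step); the DMA setup and the actual sleep/wake calls are left out because the thread does not say what the interrupt does.

```c
#include <stddef.h>
#include <stdint.h>

#define BATCH_LEN 8u  /* hypothetical batch size */

typedef struct {
    uint16_t samples[2][BATCH_LEN]; /* ping-pong halves filled by DMA     */
    volatile int ready_half;        /* -1: nothing pending, else 0 or 1   */
} batch_buf_t;

/* Called from the (hypothetical) DMA half/full-transfer interrupt:
 * only records which half is complete, no per-sample work in the ISR. */
static void batch_on_dma_done(batch_buf_t *b, int half)
{
    b->ready_half = half;
}

/* Called from the main loop after wake-up. Processes the completed half
 * (here: a plain sum as a stand-in) and returns 1, or returns 0 if no
 * batch is pending. */
static int batch_process(batch_buf_t *b, uint32_t *sum_out)
{
    int half = b->ready_half;
    if (half < 0) {
        return 0;
    }
    uint32_t sum = 0;
    for (size_t i = 0; i < BATCH_LEN; i++) {
        sum += b->samples[half][i];
    }
    b->ready_half = -1;
    *sum_out = sum;
    return 1;
}
```

The point of the design is that the 100 kHz event no longer costs an interrupt entry and exit each time; the core is interrupted once per batch instead of once per sample.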