2015-07-24 07:32 AM
I understand the ART accelerator is designed to improve performance at high speeds (when FLASH wait states are nonzero).
But does the ART accelerator (in particular the Data cache) give any benefit if wait states are already zero, for example when the CPU runs at 4 MHz? That is, if the code is straight-line with no looping or branching, does the Data cache help? If it does, is there any document that gives more detailed information on how to make best use of this feature (for optimising assembly language routines)? Thanks.
2015-07-24 08:22 AM
A zero-wait-state access still takes one cycle. At least in the F4 instruction path, the prefetch on a cache hit occurs within the current cycle, instead of being one cycle away in some other memory. At that point the flash is actually faster than even tightly coupled memory.
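One way to check a claim like this on real hardware is to count cycles around the routine with the Cortex-M DWT cycle counter, then repeat the run with the cache enabled, disabled, and with the code placed in RAM. The following is only a rough sketch, not a vendor-provided method. The register addresses are the standard ARMv7-M debug ones, and it assumes the part implements the DWT cycle counter. `BENCH_ON_CORTEX_M`, `example_routine` and `measure_min_cycles` are made-up names for illustration, and the `#else` branch is a host stand-in so the harness compiles off-target. Switching the caches on and off is done through the flash interface registers described in the reference manual and is not shown here.

```c
#include <assert.h>
#include <stdint.h>

#if defined(BENCH_ON_CORTEX_M)  /* hypothetical build flag, define only on target */
/* Standard ARMv7-M debug registers (Cortex-M3/M4). */
#define DEMCR      (*(volatile uint32_t *)0xE000EDFCu)
#define DWT_CTRL   (*(volatile uint32_t *)0xE0001000u)
#define DWT_CYCCNT (*(volatile uint32_t *)0xE0001004u)

static void cycle_counter_init(void)
{
    DEMCR |= (1u << 24);   /* TRCENA: enable the DWT unit */
    DWT_CYCCNT = 0u;
    DWT_CTRL |= 1u;        /* CYCCNTENA: start the cycle counter */
}

static uint32_t cycle_now(void) { return DWT_CYCCNT; }
#else
/* Host stand-in (assumption): every read advances the fake counter by one. */
static uint32_t fake_cycles;
static void cycle_counter_init(void) { fake_cycles = 0u; }
static uint32_t cycle_now(void) { return fake_cycles++; }
#endif

/* Unsigned subtraction gives the right answer across a 32-bit wrap. */
static uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Placeholder for the straight-line routine being timed. */
static volatile uint32_t sink;
static void example_routine(void)
{
    sink = sink + 1u;
    sink = sink + 2u;
    sink = sink + 3u;
}

/* Run the routine several times and keep the smallest cycle count seen. */
static uint32_t measure_min_cycles(void (*routine)(void), unsigned runs)
{
    assert(routine != 0 && runs > 0u);
    cycle_counter_init();

    uint32_t best = UINT32_MAX;
    for (unsigned i = 0u; i < runs; i++) {
        uint32_t t0 = cycle_now();
        routine();
        uint32_t dt = elapsed_cycles(t0, cycle_now());
        if (dt < best) {
            best = dt;
        }
    }
    return best;
}
```

On target you would call something like `measure_min_cycles(example_routine, 16)` once per configuration and compare the numbers. The counter reads add a few cycles of overhead of their own, so the differences between configurations are more meaningful than the absolute values.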
You could look at the flash line width and cache construction, but I'm not sure it's going to materially improve your assembly optimization beyond trying to keep the hit rate high and things aligned within the flash/cache. In all cases you're going to learn more by profiling/benchmarking your code and quantifying whether there is any speed/current advantage here, vs running with the cache, or from RAM, etc.
You could try your local rep for documentation or app notes, but I tend to think this is one of those trade-secret areas where sufficiently exact mechanics aren't divulged for competitive reasons.
2015-07-24 07:45 PM
Thanks. Sounds promising. I need to get my hands on a chip to do some testing (or the STM32L476 Discovery board, which has the built-in ammeter, would be nice).
2015-08-05 04:43 PM
Hmmmm..... Recent posts indicate that some people have been able to get samples of STM32L4 chips (any LQFP chip would be fine). Tried contacting the local sales offices as we are always told to do - samples not possible. Momentum lost.
2015-08-05 05:57 PM
Only really seen any traffic in the last few weeks, and zero general availability. Did they share a release date with you?