Code executing in external flash - what is the execution speed?
‎2023-12-12 10:27 PM
Hi!
At what speed would code in external flash memory execute? At the clock frequency of the external flash, at the main clock frequency of the MCU, or at some combination of the two?
I assume once the code is fetched from the external flash, it would execute at the main clock freq, but then again, it needs to be fetched first at the speed of the flash clock freq, right?
Labels: STM32H7 Series
‎2023-12-13 03:20 AM - edited ‎2023-12-13 03:33 AM
Hello,
Short answer: on F7/H7 products it depends on whether the cache is enabled.
- If it is not, execution is limited by the external flash bandwidth.
- If the cache is enabled, code is fetched into the cache on every cache miss, at a rate limited by the external flash bandwidth. Once the code being executed is resident in the cache (provided it fits), the CPU executes it from the cache at CPU speed for as long as there are no cache misses.
You can also refer to AN4891 "STM32H72x, STM32H73x, and single-core STM32H74x/75x system architecture and performance" for more details about H7 performance.
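As a rough illustration of the two cases above, the average cost of a fetch can be modelled as a weighted mix of cache hits and cache misses. This is only a back-of-the-envelope sketch: the function name is made up for this example, and any cycle counts you feed it are assumptions to be replaced by measurements on your own board.

```c
/* Average cycles per access under a simple hit/miss model.
 * hit_rate is in the range 0.0 .. 1.0. With the cache disabled, every
 * access pays the external flash cost, which corresponds to hit_rate = 0.0.
 * hit_cycles and miss_cycles are caller-supplied assumptions, not H7 data. */
static double avg_fetch_cycles(double hit_rate, double hit_cycles, double miss_cycles)
{
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles;
}
```

With a hit rate close to 1.0 (a loop that fits in the cache) the result approaches hit_cycles, i.e. CPU speed. With a hit rate of 0.0 it equals miss_cycles, i.e. the external flash sets the pace.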
‎2023-12-13 03:39 AM
Caching significantly lessens the impact of the lower bandwidth. A cache line fill is 32 bytes.
I would suggest benchmarking test code with DWT CYCCNT, and also measuring the speed of a large continuous read.
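A minimal sketch of such a benchmark is shown below. The two helper functions are plain arithmetic and can be checked on a PC. The part inside BENCH_ON_TARGET (a hypothetical build flag) uses the CMSIS register names CoreDebug->DEMCR, DWT->CTRL and DWT->CYCCNT, and assumes the device header is "stm32h7xx.h". The function names are invented for this example.

```c
#include <stdint.h>

/* Elapsed cycles between two DWT->CYCCNT readings. Unsigned subtraction
 * stays correct across a single wrap of the 32-bit counter. */
static uint32_t cyccnt_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Throughput in bytes per second for `bytes` read in `cycles` CPU cycles. */
static uint64_t read_bytes_per_sec(uint32_t bytes, uint32_t cycles, uint32_t cpu_hz)
{
    if (cycles == 0u) {
        return 0u;                      /* avoid division by zero */
    }
    return ((uint64_t)bytes * cpu_hz) / cycles;
}

#ifdef BENCH_ON_TARGET                  /* hypothetical build flag */
#include "stm32h7xx.h"                  /* assumed CMSIS device header */

/* Enable the DWT block and start the cycle counter from zero. */
static void cyccnt_init(void)
{
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;
    DWT->CYCCNT = 0u;
    DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;
}

/* Time a large sequential read, e.g. from memory-mapped external flash. */
static uint32_t time_sequential_read(const volatile uint32_t *src, uint32_t words)
{
    uint32_t start = DWT->CYCCNT;
    for (uint32_t i = 0u; i < words; i++) {
        (void)src[i];                   /* volatile read, not optimized away */
    }
    return cyccnt_elapsed(start, DWT->CYCCNT);
}
#endif
```

Running the same read with the caches enabled and disabled, and comparing a first pass against a second pass over a region that fits in the cache, should make the hit/miss difference visible. On some Cortex-M7 parts the DWT may need to be unlocked before CYCCNT counts, so check the reference manual if the counter stays at zero.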
Up vote any posts that you find helpful, it shows what's working..
‎2023-12-13 04:09 AM
Thank you both,
- Is it the ICache or the DCache that needs to be enabled, or perhaps both?
- As a general cache question, are there any reasons or situations where it would not be desirable to enable the cache, such as drawbacks on H7 or other devices?
‎2023-12-13 04:47 AM - edited ‎2023-12-13 04:53 AM
Hello,
It's recommended to enable both. The DCache helps even for code execution, because of literal pool data.
Disabling the cache is not recommended, since it improves performance, but in some cases you need to disable cacheability for certain memory regions using the MPU.
Using the cache is a bit tricky in some cases, mainly because of cache coherency. You need to be careful about how you use it and know in which cases to either disable it, as mentioned above, or handle cache maintenance in software. You can refer to AN4839 "Level 1 cache on STM32F7 Series and STM32H7 Series" for more details.
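As one illustration of software cache maintenance, the sketch below rounds a buffer out to whole 32-byte cache lines before cleaning it. The DMA scenario, the helper names and the USE_CMSIS_CACHE build flag are assumptions made for this example. SCB_CleanDCache_by_Addr is the CMSIS cache maintenance call, and "stm32h7xx.h" is assumed to be the device header.

```c
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u            /* L1 cache line size in bytes */

/* Round an address down to the start of its cache line. */
static uintptr_t dcache_align_down(uintptr_t addr)
{
    return addr & ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
}

/* Size in bytes of the whole cache lines covering [addr, addr + len). */
static uint32_t dcache_span(uintptr_t addr, uint32_t len)
{
    uintptr_t start = dcache_align_down(addr);
    uintptr_t end = (addr + len + DCACHE_LINE_SIZE - 1u)
                  & ~(uintptr_t)(DCACHE_LINE_SIZE - 1u);
    return (uint32_t)(end - start);
}

#ifdef USE_CMSIS_CACHE                  /* hypothetical build flag */
#include "stm32h7xx.h"                  /* assumed CMSIS device header */

/* Push CPU writes out to memory so that another bus master (for example
 * a DMA controller) reads up-to-date data from the buffer. */
static void buffer_clean_for_dma(const void *buf, uint32_t len)
{
    uintptr_t a = (uintptr_t)buf;
    SCB_CleanDCache_by_Addr((uint32_t *)dcache_align_down(a),
                            (int32_t)dcache_span(a, len));
}
#endif
```

Because maintenance works on whole lines, buffers shared with other bus masters are usually easier to handle when they are themselves 32-byte aligned and a multiple of 32 bytes long. The alternative mentioned above is to mark the region non-cacheable with the MPU and skip maintenance altogether.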
‎2023-12-13 08:15 AM
Generally, in areas that aren't "memory" you want to disable caching.
This applies to peripheral mappings where strict in-order operation is expected, i.e. FIFOs and state machines, where access order and width are important.
You don't want data writes to be folded or delayed, and you don't want speculative reads.
ST has an app note and MPU configurations designed to help the processor deal with OSPI / QSPI memory in the most desirable manner.
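To make the difference between the two kinds of region concrete, here is a sketch of how the attribute word for an ARMv7-M MPU region (the RASR register) could be composed for a peripheral-like region and for external flash holding code. This is not ST's recommended configuration from the app note. The function names, the chosen access permissions and any region sizes are assumptions for illustration only.

```c
#include <stdint.h>

/* Bit fields of the ARMv7-M MPU Region Attribute and Size Register. */
#define RASR_ENABLE     (1u << 0)
#define RASR_SIZE(n)    ((uint32_t)(n) << 1)    /* region is 2^(n+1) bytes */
#define RASR_B          (1u << 16)              /* bufferable */
#define RASR_C          (1u << 17)              /* cacheable */
#define RASR_TEX(n)     ((uint32_t)(n) << 19)
#define RASR_AP(n)      ((uint32_t)(n) << 24)
#define RASR_XN         (1u << 28)              /* execute never */

#define AP_FULL_ACCESS  3u                      /* read/write, all modes */
#define AP_READ_ONLY    6u                      /* read-only, all modes */

/* SIZE field for a power-of-two region of `bytes` bytes (minimum 32). */
static uint32_t rasr_size_field(uint32_t bytes)
{
    uint32_t n = 0u;
    while ((1ull << (n + 1u)) < bytes) {
        n++;
    }
    return n;
}

/* Peripheral-like region: Device memory (TEX=0, C=0, B=1), never executed,
 * so accesses are not cached and no speculative reads are issued. */
static uint32_t rasr_device(uint32_t bytes)
{
    return RASR_XN | RASR_AP(AP_FULL_ACCESS) | RASR_TEX(0u) | RASR_B
         | RASR_SIZE(rasr_size_field(bytes)) | RASR_ENABLE;
}

/* External flash holding code: Normal memory, cacheable (TEX=0, C=1, B=0,
 * write-through without write allocate), read-only and executable. */
static uint32_t rasr_xip_flash(uint32_t bytes)
{
    return RASR_AP(AP_READ_ONLY) | RASR_TEX(0u) | RASR_C
         | RASR_SIZE(rasr_size_field(bytes)) | RASR_ENABLE;
}
```

On the target, a value like this would be written to MPU->RASR after selecting the region and base address through MPU->RNR and MPU->RBAR, or the equivalent fields would be filled in through the HAL MPU functions. Check the attribute choices against the app note before using them.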