CPU performance difference between running from flash and running from RAM on the STM32F407
Hello,
I am trying to measure the performance difference between running code from flash and running it from RAM on the STM32F407.
I wrote a test program in which the measured function is placed either in flash or in RAM.
I found that, at a CPU frequency of 168 MHz, execution from RAM is about 20% slower than execution from flash.
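To make a figure like "20%" concrete: one way to express it, assuming it is derived from two measured durations of the same function (for example the width of the GPIO pulse in each build), is the relative increase of the RAM time over the flash time. The helper below is only a sketch of that arithmetic; `slowdown_percent`, `t_flash` and `t_ram` are hypothetical names introduced here and are not part of the test program.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original test code).
 * t_flash and t_ram are the measured durations of the same function,
 * in the same unit (cycles, microseconds, ...). t_flash must be non-zero.
 * Returns how much longer the RAM run took, as a percentage of the flash run. */
static inline double slowdown_percent(double t_flash, double t_ram)
{
    return (t_ram - t_flash) / t_flash * 100.0;
}
```

With this definition, a positive result means the RAM build is slower and a negative result means it is faster.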
The datasheet states that "the performance achieved thanks to the ART accelerator is equivalent to 0 wait state program execution from Flash memory at a CPU frequency up to 168 MHz".
It also states that "RAM memory is accessed (read/write) at CPU clock speed with 0 wait states".
So, according to the datasheet, both RAM and flash are effectively accessed with 0 wait states.
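Why is execution from RAM slower than execution from flash? Is this expected?

Remark:
The compiler is IAR Embedded Workbench for ARM 7.10.1.6735, with optimization set to high.

My source code:

    /* Prototype: the function is placed in RAM only when PLACE_IN_RAM is defined */
    #ifdef PLACE_IN_RAM
    __ramfunc void code_to_be_measured(void);
    #else
    void code_to_be_measured(void);
    #endif

    int main(void)
    {
      /* Initialize Leds mounted on STM32F4-Discovery board */
      STM_EVAL_LEDInit(LED4);
      STM_EVAL_LEDInit(LED3);
      STM_EVAL_LEDInit(LED5);
      STM_EVAL_LEDInit(LED6);

      /* Set the pin before the measured code and reset it afterwards,
         so the pulse width is the execution time */
      GPIO_PORT[1]->BSRRL = GPIO_PIN[1];
      code_to_be_measured();
      GPIO_PORT[1]->BSRRH = GPIO_PIN[1];
    }

    #ifdef PLACE_IN_RAM
    __ramfunc void code_to_be_measured(void)
    #else
    void code_to_be_measured(void)
    #endif
    {
      /* volatile forces every access to go to memory, even at high optimization */
      volatile unsigned long int l_count, l_count_max=1000;
      volatile int a,b,s;
      volatile unsigned int j;

      a = 1000;
      b = 2000;
      for (l_count=0;l_count<l_count_max;++l_count)
      {
        s = a + b + l_count;
      }
    }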