2020-01-14 04:44 AM
I tried creating an OS task 100 times in the FreeRTOS example code generated by STM32CubeMX for both the F429 and the F767, and observed the following:
F429 - 6 Ticks
F767 - 16 Ticks
Difference - 10 Ticks
What is the reason for the delay, and is there any way to speed it up?
2020-01-14 05:34 AM
Different number of wait states? Code in RAM? Show the relevant parts!
2020-01-15 10:24 PM
Below is the code snippet used for testing on both the F429 and F767 boards. The tick difference was measured across the highlighted part.
2020-01-20 02:37 AM
Even for a simple malloc() I observe a tick difference between the F429 and the F767.
For 10000 memory allocations:
F429 - 68 Ticks
F767 - 78 Ticks
Tick difference - 10 Ticks
The code is below.
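void StartDefaultTask(void const * argument)
{
  int *ptr;
  /* USER CODE BEGIN 5 */
  /* Infinite loop */
  for(;;)
  {
    /* Tick count before the allocation loop; TickType_t is cast for printf */
    printf( "Tick_test_1:%lu\n", (unsigned long)xTaskGetTickCount() );
    for(long i=0;i<10000;i++)
    {
      /* Note: ptr is never freed, so the heap is eventually exhausted */
      ptr = (int*) malloc(5*sizeof(int));
    }
    /* Tick count after the allocation loop */
    printf( "Tick_test_2:%lu\n", (unsigned long)xTaskGetTickCount() );
    osDelay(1);
  }
}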
2020-01-20 03:34 AM
Do you understand that the first printf(), and (I guess) the UART transmission underneath it, is included in your measurement? Also, xTaskCreate() and malloc() both use dynamic memory and are not deterministic, in terms of both processing time and whether the call succeeds.
2020-01-23 03:07 AM
Yes. I tried another approach; is this a better method to check the performance?
I incremented a variable for the duration of one tick, and the results are below.
F429 - a=976
F767 - a=691
The F767 does not run through the code as many times as the F429 does within one tick.
The default task is the only task running, and the code base is the default simple example code taken from STM32CubeMX.
2020-01-23 02:26 PM
Disable all interrupts (__disable_irq()/__enable_irq()) and use DWT->CYCCNT for precise measurement.
How are clocks, PLL, buses, flash and cache configured?
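A minimal sketch of such a measurement, assuming the CMSIS device header (stm32f4xx.h or stm32f7xx.h) is included before this code so that DWT, CoreDebug and the __disable_irq()/__enable_irq() intrinsics are available. The helper names cycles_elapsed(), cycles_to_us() and measure_cycles() are hypothetical, not part of HAL or FreeRTOS.

```c
#include <stdint.h>

/* Elapsed cycles between two reads of a free-running 32-bit counter.
   Unsigned subtraction stays correct across a single wraparound. */
static inline uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to microseconds for a given core clock in Hz. */
static inline uint32_t cycles_to_us(uint32_t cycles, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000u) / core_hz);
}

#ifdef DWT /* only built when the CMSIS core header defines the DWT block */
/* Run fn() with interrupts masked and return the cycles it took. */
static uint32_t measure_cycles(void (*fn)(void))
{
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk; /* enable the trace/DWT unit */
    DWT->CYCCNT = 0;                                /* reset the cycle counter */
    DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;            /* start counting */

    __disable_irq();
    uint32_t start = DWT->CYCCNT;
    fn();
    uint32_t end = DWT->CYCCNT;
    __enable_irq();                                 /* assumes IRQs were enabled before */

    return cycles_elapsed(start, end);
}
#endif
```

On the target, something like cycles_to_us(measure_cycles(my_test), SystemCoreClock) would give a time that does not include printf() or tick interrupts; my_test is a placeholder for the code under test. If CYCCNT stays at zero on the F7, the DWT lock access register may need to be unlocked first.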
2020-01-24 03:34 AM
I tried attaching the complete code, but that is not allowed here. I am attaching snapshots of the main function and the system clock config functions.
Code is taken from STM32CubeMX V 4.24
Firmware package versions
F429 - STM32Cube_FW_F4_V1.9.0
F767 - STM32Cube_FW_F7_V1.15.0
Nothing else is changed in that example.
Results for the code below, with the variable a kept in Live Watch:
F429 - a=998
F767 - a=661
2020-01-25 02:57 PM
Compare how HAL_Init() configures FLASH_ACR in both cases.
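To make that comparison easier, here is a rough sketch that splits a raw FLASH_ACR value into its fields. It assumes the bit layout as I recall it from the reference manuals: LATENCY in bits 3:0 and PRFTEN in bit 8 on both parts, bit 9 being ICEN on the F4 and ARTEN on the F7, and bit 10 being DCEN on the F4 only. Please verify against the manuals; the struct and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Decoded view of a FLASH_ACR register value (assumed layout, see above). */
typedef struct {
    uint32_t latency;  /* bits 3:0, number of flash wait states */
    bool     prefetch; /* bit 8, PRFTEN */
    bool     bit9;     /* ICEN on F4, ARTEN on F7 */
    bool     bit10;    /* DCEN on F4, reserved on F7 */
} flash_acr_fields;

static flash_acr_fields decode_flash_acr(uint32_t acr)
{
    flash_acr_fields f;
    f.latency  = acr & 0xFu;
    f.prefetch = ((acr >> 8) & 1u) != 0u;
    f.bit9     = ((acr >> 9) & 1u) != 0u;
    f.bit10    = ((acr >> 10) & 1u) != 0u;
    return f;
}
```

On each board, calling decode_flash_acr(FLASH->ACR) after SystemClock_Config() and comparing the two results shows whether the wait states and accelerator bits really differ.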
2020-01-27 01:45 AM
@Piranha @Uwe Bonnes
The major difference in HAL_Init() is the data cache, instruction cache and prefetch configuration.
I tried the combinations and didn't find much difference.
F429 with cache and prefetch enabled - a=998
F429 with cache and prefetch disabled - a=997
F767 with cache and prefetch enabled - a=661
F767 with cache and prefetch disabled - a=661
Is the F767 slow because there is no data caching?
Hal_init comparison F767-F429
F429_Flash_register_status
F767_Flash_register_status