2021-12-16 10:11 PM
2021-12-16 11:32 PM
DMIPS refers to the Dhrystone benchmark, which originated in the 1980s and is (or was) better suited to comparing the performance of different CPU architectures. It is not the same as MIPS. The best a non-superscalar pipeline can achieve is 1 instruction per clock cycle, and the STM32 CPUs can come close to that in many cases thanks to zero-wait-state flash, the ART accelerator, etc.
hth
KnarfB
2021-12-16 11:46 PM
In our project the system clock is 16 MHz, so at 1 instruction per clock cycle one instruction takes 1/16 MHz = 0.000000625 s, i.e. 625 ns. We measure the duration using the following code. By the above calculation we expect 0.000000625 * 240 = 150 us, but we get 74 us on the scope. Please help me with this.
HAL_GPIO_WritePin(BRE_EN_GPIO_Port, BRE_EN_Pin, GPIO_PIN_SET);
for (i =0; i < 240; i++);
HAL_GPIO_WritePin(BRE_EN_GPIO_Port, BRE_EN_Pin, GPIO_PIN_RESET);
2021-12-17 12:49 AM
2 points:
What you can do is put a bunch of
asm volatile("nop;");
instructions between set/reset. For example, use a macro to group 10 nops and another macro to group 10 groups of 10.
Measure several points and draw an xy-diagram to get an idea about the delay caused by HAL_GPIO_WritePin.
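The grouping macros described above could look roughly like this. It is a minimal sketch assuming a GCC/Clang-style toolchain; the names NOP1, NOP10, NOP100 and the two wrapper functions are made up for illustration and are not part of HAL.

```c
/* One NOP; "volatile" stops the compiler from dropping the statement. */
#define NOP1()   __asm__ volatile ("nop")

/* A group of 10 NOPs. */
#define NOP10()  do { NOP1(); NOP1(); NOP1(); NOP1(); NOP1(); \
                      NOP1(); NOP1(); NOP1(); NOP1(); NOP1(); } while (0)

/* 10 groups of 10 NOPs, i.e. 100 NOPs with no loop overhead in between. */
#define NOP100() do { NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); \
                      NOP10(); NOP10(); NOP10(); NOP10(); NOP10(); } while (0)

/* Hypothetical wrappers: place the macro between the pin set and pin reset. */
static inline void delay_10_nops(void)  { NOP10(); }
static inline void delay_100_nops(void) { NOP100(); }
```

Measuring the pulse with 0, 10, 100 and 200 NOPs between set and reset gives the points for the xy-diagram: the slope is the time per NOP and the intercept is the overhead of the two pin writes.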
hth
KnarfB
2021-12-17 03:03 AM
We tried the method you suggested, but we get 1.1876 us with the NOP operations. Without the NOPs it was 560 ns.
2021-12-17 06:39 AM
The F1 flash is VERY slow
Zero wait state testing would have been done running code from RAM
One should avoid RMW on GPIO->ODR, but rather use GPIO->BSRR
2021-12-17 06:54 AM
> The F1 flash is VERY slow
Shouldn't matter in this case, as
>> in our project 16MHz is system clock
> One should avoid RMW on GPIO->ODR, but rather use GPIO->BSRR
PGare uses Cube/HAL, are you sure that, this week, it RMWs on GPIO->ODR?
JW
2021-12-17 07:12 AM
HAL was changed quite some time ago (years?) to use BSRR instead of ODR. Actually, I'm not sure it ever used ODR for straight set/reset operations. If the compiler fully optimizes the function, HAL_GPIO_WritePin should be just as fast as accessing the registers directly.
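To make the ODR versus BSRR difference concrete, here is a host-side sketch. The gpio_regs_t struct and the pin_* helpers are hypothetical stand-ins, not the vendor header or the HAL source; on a real part BSRR is write-only, with the low 16 bits setting pins and the high 16 bits resetting them.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two registers discussed here.
   On real hardware, use the register definitions from the device header. */
typedef struct {
    volatile uint32_t ODR;   /* output data register */
    volatile uint32_t BSRR;  /* bit set/reset register */
} gpio_regs_t;

/* Single store: bits 0..15 of BSRR set the matching pins. */
static inline void pin_set(gpio_regs_t *port, uint16_t pin_mask)
{
    port->BSRR = pin_mask;
}

/* Single store: bits 16..31 of BSRR reset the matching pins. */
static inline void pin_reset(gpio_regs_t *port, uint16_t pin_mask)
{
    port->BSRR = (uint32_t)pin_mask << 16;
}

/* Read-modify-write on ODR: load, OR, store. It is slower and an
   interrupt that touches the same port in between can be overwritten. */
static inline void pin_set_rmw(gpio_regs_t *port, uint16_t pin_mask)
{
    port->ODR |= pin_mask;
}
```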
2021-12-20 04:29 PM
Do the math once more... ;) The period of a 16 MHz clock is 62.5 ns, so 240 cycles take 15 us. On ARM the fastest possible loop consists of SUBS and BNE instructions and takes at least 2 cycles per iteration. Therefore your loop will take at least 30 us.
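The arithmetic can be checked with a small helper. This is only a sketch; the function names cycles_to_ns and ns_to_cycles are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to nanoseconds at a given core clock (integer math,
   result truncated toward zero). */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t clock_hz)
{
    assert(clock_hz > 0);
    return (cycles * 1000000000ULL) / clock_hz;
}

/* Convert a measured time in nanoseconds back to core clock cycles. */
static uint64_t ns_to_cycles(uint64_t ns, uint32_t clock_hz)
{
    return (ns * clock_hz) / 1000000000ULL;
}
```

With a 16 MHz clock, cycles_to_ns(240, 16000000) gives 15000 ns and cycles_to_ns(480, 16000000) gives 30000 ns. Going the other way, the 74 us seen on the scope is ns_to_cycles(74000, 16000000) = 1184 cycles, or roughly 4.9 cycles per iteration with the two pin writes included.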
You don't need NOPs there, just define the variable i with a volatile type qualifier.
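A minimal sketch of that suggestion follows; busy_wait is a hypothetical name. The volatile qualifier forces the compiler to keep the loop and to load and store the counter on every pass, so each iteration will likely cost more than the 2-cycle minimum. Calibrate the iteration count on the scope rather than from the arithmetic alone.

```c
#include <stdint.h>

/* Busy-wait for a fixed number of loop iterations.
   "volatile" prevents the optimizer from deleting the empty loop. */
static uint32_t busy_wait(uint32_t iterations)
{
    volatile uint32_t i;

    for (i = 0; i < iterations; i++) {
        /* intentionally empty */
    }
    return i;  /* equals iterations once the loop has finished */
}
```

In the original snippet this would replace the plain for loop, e.g. busy_wait(240) between the two HAL_GPIO_WritePin calls.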