How do I calculate instruction execution time on the STM32F103? The datasheet gives a 72 MHz maximum frequency and 1.25 DMIPS/MHz (Dhrystone 2.1) performance at 0 wait state memory access. Please help me with this.

PGare.1 · ‎2021-12-16

KnarfB · ‎2021-12-16

DMIPS refers to the Dhrystone benchmark, which originated in the 1980s and is/was more appropriate for comparing the performance of different CPU architectures. It is not the same as MIPS. The best you can get from a (non-superscalar) pipeline is 1 instruction per clock cycle, and the STM32 CPUs can come close to that in many cases thanks to 0 wait state flash, the ART accelerator, etc.
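To put numbers on the datasheet figure, here is a minimal sketch of the arithmetic. The helper names `dmips_at` and `cycle_time_ns` are made up for this example. The Dhrystone rating is a benchmark score, not a per-instruction time, so the clock period is only a lower bound for a single-cycle instruction.

```c
/* Hypothetical helpers, for illustration only. */

/* Dhrystone rating: the DMIPS/MHz figure times the core clock in MHz.
 * With the datasheet values, 1.25 DMIPS/MHz * 72 MHz = 90 DMIPS. */
static double dmips_at(double dmips_per_mhz, double clock_mhz)
{
    return dmips_per_mhz * clock_mhz;
}

/* Clock period in nanoseconds, i.e. the best-case duration of a
 * single-cycle instruction. At 72 MHz this is about 13.9 ns. */
static double cycle_time_ns(double clock_mhz)
{
    return 1000.0 / clock_mhz;
}
```

Real instruction timing also depends on the instruction mix, branches, and memory and bus wait states, so treat both numbers as bounds rather than predictions.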

hth

KnarfB

PGare.1 · ‎2021-12-16

In our project the system clock is 16 MHz, so at 1 instruction per clock cycle, 1/16 MHz = 0.000000625 s, i.e. 625 ns. We are measuring the duration with the following code. By the above calculation, 0.000000625 * 240 = 150 us, but we measure 74 us on the scope. Please help me with this.

		HAL_GPIO_WritePin(BRE_EN_GPIO_Port, BRE_EN_Pin, GPIO_PIN_SET);
		for (i =0; i < 240; i++);
		HAL_GPIO_WritePin(BRE_EN_GPIO_Port, BRE_EN_Pin, GPIO_PIN_RESET);

KnarfB · ‎2021-12-17

2 points:

Check the assembly code. The compiler might optimize away "useless" code.
GPIO access takes more than 1 cycle.

What you can do is put a bunch of

asm volatile("nop;");

instructions between set/reset. For example, use a macro to group 10 nops and another macro to group 10 groups of 10.
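One possible way to write those grouping macros is sketched below. The names `REP10`, `NOP10` and `NOP100` are hypothetical, and `__asm__` is used instead of `asm` only so that the snippet also compiles in strict ISO C modes.

```c
/* Hypothetical helper macros, for illustration only. */

/* Repeat a statement 10 times, fully unrolled at compile time. */
#define REP10(stmt) \
    do { stmt; stmt; stmt; stmt; stmt; stmt; stmt; stmt; stmt; stmt; } while (0)

/* 10 NOPs, and 10 groups of 10 NOPs. */
#define NOP10()  REP10(__asm__ volatile ("nop"))
#define NOP100() REP10(NOP10())

/* Intended use on the target, between the two pin writes:
 *
 *   HAL_GPIO_WritePin(BRE_EN_GPIO_Port, BRE_EN_Pin, GPIO_PIN_SET);
 *   NOP100();
 *   HAL_GPIO_WritePin(BRE_EN_GPIO_Port, BRE_EN_Pin, GPIO_PIN_RESET);
 */
```

Because the NOPs are unrolled rather than looped, there is no branch overhead mixed into the measurement, at the cost of code size.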

Measure several points and draw an xy-diagram to get an idea about the delay caused by HAL_GPIO_WritePin.
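The xy-diagram idea can also be done numerically. Below is a rough sketch of a least-squares line fit, where x is the number of NOPs and y is the pulse width measured on the scope. The type `line_fit` and the function `fit_line` are names invented for this example. The intercept estimates the fixed overhead of the two HAL_GPIO_WritePin calls, and the slope estimates the time per NOP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: y = slope * x + intercept. */
typedef struct {
    double slope;      /* time per NOP, same unit as y         */
    double intercept;  /* fixed overhead at zero NOPs, unit of y */
} line_fit;

/* Ordinary least-squares fit over n (x, y) measurement points. */
static line_fit fit_line(const double *x, const double *y, size_t n)
{
    assert(n >= 2);                /* need at least two points */

    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t k = 0; k < n; k++) {
        sx  += x[k];
        sy  += y[k];
        sxx += x[k] * x[k];
        sxy += x[k] * y[k];
    }

    double denom = (double)n * sxx - sx * sx;
    assert(denom != 0.0);          /* x values must not all be equal */

    line_fit f;
    f.slope     = ((double)n * sxy - sx * sy) / denom;
    f.intercept = (sy - f.slope * sx) / (double)n;
    return f;
}
```

With three or more NOP counts, a slope close to one clock period suggests the NOPs execute in a single cycle each. A noticeably larger slope would point to flash wait states or other stalls.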

hth

KnarfB

PGare.1 · ‎2021-12-17

We tried the method you suggested, but we measure 1.1876 us with the NOP operations. Without the NOPs it was 560 ns.

Tesla DeLorean · ‎2021-12-17

The F1 flash is VERY slow

Zero wait state testing would have been done running code from RAM

One should avoid RMW on GPIO->ODR, but rather use GPIO->BSRR


waclawek.jan · ‎2021-12-17

> The F1 flash is VERY slow

Shouldn't matter in this case, as

>> in our project 16MHz is system clock

> One should avoid RMW on GPIO->ODR, but rather use GPIO->BSRR

PGare uses Cube/HAL, are you sure that, this week, it RMWs on GPIO->ODR?

JW

TDK · ‎2021-12-17

HAL was changed quite some time ago (years?) to use BSRR instead of ODR. Actually, I'm not sure it ever used ODR for straight set/reset operations. If the compiler fully optimizes the function, HAL_GPIO_WritePin should be just as fast as accessing registers directly.

https://github.com/STMicroelectronics/STM32CubeF1/blob/f5aaa9b45492d70585ade1dac4d1e33d5531c171/Drivers/STM32F1xx_HAL_Driver/Src/stm32f1xx_hal_gpio.c#L465
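For reference, here is a small sketch of the values a BSRR write takes on the F1, where bits 0..15 set a pin and bits 16..31 reset it. The helper names `bsrr_set_word` and `bsrr_reset_word` are invented for this example and are not part of HAL. On the target the result would be stored to `GPIOx->BSRR` in a single write, with no read-modify-write.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Word that sets the pins in pin_mask: BSRR bits 0..15. */
static uint32_t bsrr_set_word(uint16_t pin_mask)
{
    return (uint32_t)pin_mask;
}

/* Word that resets the pins in pin_mask: BSRR bits 16..31. */
static uint32_t bsrr_reset_word(uint16_t pin_mask)
{
    return (uint32_t)pin_mask << 16;
}

/* Intended use on the target (not compiled here):
 *
 *   BRE_EN_GPIO_Port->BSRR = bsrr_set_word(BRE_EN_Pin);
 *   BRE_EN_GPIO_Port->BSRR = bsrr_reset_word(BRE_EN_Pin);
 */
```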


Piranha · ‎2021-12-20

Do the math once more... ;) The period of a 16 MHz clock is 62.5 ns, so 240 cycles take 15 us. On ARM the fastest possible loop consists of SUB and BNE instructions and takes at least 2 cycles per iteration. Therefore your loop will take at least 30 us.
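The same arithmetic as a small sketch, with hypothetical helper names. The last function runs it in reverse: given the 74 us measured on the scope, it estimates how many cycles each of the 240 iterations actually costs. That estimate also includes the GPIO call overhead.

```c
/* Hypothetical helpers, for illustration only. */

/* Period of one clock cycle in nanoseconds (16 MHz gives 62.5 ns). */
static double cycle_ns(double clock_hz)
{
    return 1e9 / clock_hz;
}

/* Expected loop duration in microseconds for a given cost per iteration. */
static double loop_time_us(double clock_hz, unsigned iterations,
                           double cycles_per_iter)
{
    return (double)iterations * cycles_per_iter * cycle_ns(clock_hz) / 1000.0;
}

/* Cycles per iteration implied by a measured duration in microseconds. */
static double cycles_per_iteration(double clock_hz, unsigned iterations,
                                   double measured_us)
{
    return measured_us * 1000.0 / ((double)iterations * cycle_ns(clock_hz));
}
```

For 16 MHz and 240 iterations this gives 15 us at 1 cycle per iteration and 30 us at 2. The measured 74 us works out to roughly 4.9 cycles per iteration, which would fit a loop with a few more instructions than the 2-cycle minimum.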

You don't need NOPs there, just define the variable i with a volatile type qualifier.
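A minimal sketch of that suggestion, assuming a hypothetical helper called `delay_loop`. Because `i` is volatile, the compiler has to perform every read and write of the counter and cannot remove the loop, even at high optimization levels.

```c
#include <stdint.h>

/* Hypothetical busy-wait helper, for illustration only.
 * Returns the final counter value so the result can be checked. */
static uint32_t delay_loop(uint32_t iterations)
{
    volatile uint32_t i;   /* volatile: every access must really happen */

    for (i = 0; i < iterations; i++) {
        /* intentionally empty */
    }
    return i;
}
```

The volatile accesses add a load and a store per iteration, so each pass costs more than the 2-cycle minimum. The actual delay still has to be calibrated on the scope for a given compiler and optimization level.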