Performance of floating point operations of F429

JWilk.1
Associate II

Hello,

I am measuring the execution time of a function that operates on floating-point numbers and noticed that it is longer than I expected. I am using a Nucleo F429ZI with the clock configured to 50 MHz. To measure the execution time I use TIM6, configured as follows:

  TIM6->PSC = 49;
  TIM6->ARR = 0xFFFF;
  TIM6->CNT = 0;

For the test, I measure the time taken to execute the following statement:

float a = 1.3;
float b = 0.0007;
float c = 1.0;
float d = 12.25;
 
TIM6->CNT = 0;
TIM6->CR1 |= 1;
c += (a - b) * d;
TIM6->CR1 &= ~1;

TIM6->CNT shows 28 microseconds. Disassembly is:

100       	c += (a - b) * d;
08000548:   vldr    s14, [r7, #20]
0800054c:   vldr    s15, [r7, #16]
08000550:   vsub.f32        s14, s14, s15
08000554:   vldr    s15, [r7, #8]
08000558:   vmul.f32        s15, s14, s15
0800055c:   vldr    s14, [r7, #12]
08000560:   vadd.f32        s15, s14, s15

According to ARM, the total execution time for these instructions should be 11 cycles (according to CYCCNT it is 15).

What confuses me is why 15 cycles correspond to nearly 30 microseconds. Is the way I am measuring it incorrect? Or are there some waits that I am not aware of?

Best regards.

John

1 ACCEPTED SOLUTION

TIM6->PSC = 49;

TIMx_PSC is preloaded: the value does not become "active" (loaded into the prescaler counter itself) until an Update event occurs, either on overflow or as a "forced" event through writing 1 to TIMx_EGR.UG.

> TIM6->CNT shows 28 microseconds.

Provided I'm correct in my guess, i.e. that the prescaler counter is still at its default 0 (i.e. no prescaling), this would mean 28 cycles, which is nearly a match, given there are some cycles to stop the timer, and then things like FLASH latency, RAM latency etc.

JW
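
To make the preload behaviour concrete, here is a minimal host-side sketch of it. This is a simplified model, not ST's implementation: the struct and function names (timer_model_t, model_write_psc, model_update_event, model_run) are hypothetical, and overflow-triggered update events are deliberately left out. It only illustrates why writing PSC = 49 without an update event leaves the timer counting every input clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of a basic timer's prescaler path (hypothetical names). */
typedef struct {
    uint16_t psc_preload;  /* value last written to PSC (preload register) */
    uint16_t psc_active;   /* value the prescaler is actually dividing by */
    uint16_t psc_count;    /* internal prescaler counter */
    uint16_t cnt;          /* CNT as the program would read it */
} timer_model_t;

/* Writing PSC changes only the preload value, not the active one. */
void model_write_psc(timer_model_t *t, uint16_t value) {
    assert(t != NULL);
    t->psc_preload = value;
}

/* An update event (as forced via EGR.UG) copies the preload value into
 * the active prescaler and reinitializes both counters. */
void model_update_event(timer_model_t *t) {
    assert(t != NULL);
    t->psc_active = t->psc_preload;
    t->psc_count = 0;
    t->cnt = 0;
}

/* Advance the model by `cycles` timer input clocks.
 * Assumption: overflow-triggered update events are not modelled. */
void model_run(timer_model_t *t, uint32_t cycles) {
    assert(t != NULL);
    while (cycles--) {
        if (t->psc_count == t->psc_active) {
            t->psc_count = 0;
            t->cnt++;          /* one CNT tick every (psc_active + 1) clocks */
        } else {
            t->psc_count++;
        }
    }
}
```

In this model, writing 49 to PSC and then running 28 clocks leaves cnt at 28, whereas forcing an update event first leaves cnt at 0 after the same 28 clocks, since a tick now takes 50 clocks.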

Make sure the chip is clocking at the rate expected. For short intervals I can't see the need for a prescaler. A 32-bit TIM like TIM2/TIM5 would give broader coverage, but DWT_CYCCNT is the usual method.

30uS sounds a bit long
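
As a sanity check on the numbers in this thread, a cycle count can be converted to time directly. The sketch below is plain host-side arithmetic with hypothetical helper names (elapsed_cycles, cycles_to_ns); it assumes a free-running 32-bit counter such as DWT_CYCCNT and the 50 MHz clock mentioned in the question.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed cycles between two reads of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single wrap-around. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end) {
    return end - start;
}

/* Convert a cycle count to nanoseconds for a given clock in Hz.
 * The multiplication is done in 64 bits to avoid overflow. */
uint64_t cycles_to_ns(uint32_t cycles, uint32_t clock_hz) {
    assert(clock_hz > 0);
    return ((uint64_t)cycles * 1000000000ull) / clock_hz;
}
```

At 50 MHz, 15 cycles come to 300 ns and 28 cycles to 560 ns, so a reading of 28 is far more plausible as a cycle count than as microseconds.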

gregstm
Senior III

.. an alternative method is to use the BSRR register to set/reset an IO pin and measure with an oscilloscope - crude I know, but removes the uncertainty of whether the timer is behaving as you hoped..
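
For anyone trying the pin-toggle approach, here is a small sketch of the words written to BSRR. On the STM32F4, bits 0..15 of GPIOx_BSRR set the corresponding pin and bits 16..31 reset it. The helper names (bsrr_set_word, bsrr_reset_word) are hypothetical, and the pin number in the usage comment is an arbitrary choice, not something from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Word to write to GPIOx->BSRR to drive `pin` high (set bits are 0..15). */
uint32_t bsrr_set_word(unsigned pin) {
    assert(pin < 16);
    return (uint32_t)1u << pin;
}

/* Word to write to GPIOx->BSRR to drive `pin` low (reset bits are 16..31). */
uint32_t bsrr_reset_word(unsigned pin) {
    assert(pin < 16);
    return (uint32_t)1u << (pin + 16u);
}

/* On the target, usage would look roughly like this (pin 7 is arbitrary,
 * and the pin must already be configured as an output):
 *
 *   GPIOB->BSRR = bsrr_set_word(7);    // rising edge: start of interval
 *   c += (a - b) * d;
 *   GPIOB->BSRR = bsrr_reset_word(7);  // falling edge: end of interval
 */
```

The pulse width seen on the oscilloscope then includes the cost of the register writes themselves, so measuring an empty set/reset pair first gives a baseline to subtract.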

JWilk.1
Associate II

Thanks to all of you for the quick responses.

I missed the fact that the prescaler value is preloaded. So after configuring the prescaler I added:

TIM6->EGR = TIM_EGR_UG;

And the counter was showing 0. After removing the prescaler altogether, I got a CNT of 30 (cycles), which looks much more reasonable.

Thanks a lot one more time.

Best regards,

John

What's crude in this?

Au contraire, I find this to be the finest method of all.

JW

I also agree that this is the finest method. However, due to the current situation (COVID, quarantine) I don't have an oscilloscope near at hand.

yes, "crude" is a poor word choice - "simple" probably better. The BSRR register is one of my favourite features/tool of the chip. I hope you and your oscilloscope are reunited soon.