2015-12-15 03:47 AM
Hi everybody
I am trying to implement some mathematical calculations on an STM32F103. I use the HSI and PLL to provide either a 64 MHz or a 36 MHz clock, taking the Flash wait states into account and enabling the prefetch buffer. I expect the calculation to run faster at 64 MHz than at 36 MHz. When I checked, however, the calculation time was almost the same in both cases (e.g. 130 usec).
For your information, I used a timer counter value (e.g. Timer 1) in debug mode (DBGMCU_TIM1_STOP enabled) to measure the time required for the calculation.
Can anybody help me with this issue?
2015-12-15 04:17 AM
Hi,
Can you please explain how you measure the calculation time?
Enabling DBGMCU_TIM1_STOP will stop the timer from counting when you are stopped at a breakpoint (MCU is halted).
But more to the point, how do you expect to measure a difference between the two settings (36 MHz and 64 MHz) when both are feeding the timer clock? To be clear, you could also run at 8 MHz and I'm quite confident you would find the same result, because the timer clock is derived from sysclk.
A better way to measure the calculation time would be to turn ON an output before the calculation and turn OFF that output right after. You'll then need an oscilloscope to read the pulse width, which corresponds to the calculation time.
2015-12-15 04:35 AM
Hi, thanks for your response.
Your idea for finding the calculation time is also OK, but I used the following method:
I inserted one breakpoint exactly before the first line of the mathematical processing and one breakpoint exactly after it.
When execution stops at the first breakpoint, I record the Timer 1 counter value (e.g. C1 = 1000).
At the second breakpoint, I record the counter value again (e.g. C2 = 8000).
Then I calculate the time as:
(C2-C1)*(1+Tim1Prescaler)/Tim1ClockFrequency = 7000*(1+0)/64MHz = 109 usec.
Each instruction needs some clock cycles to execute. Therefore, I expect that if the clock frequency is increased, the instruction will execute faster. Am I right?
2015-12-15 06:13 AM
Hi,
Thanks for the debug explanation. Unfortunately, it will not give you what you are looking for.
Let me explain. To simplify as much as possible, let's say you want to measure the execution time of a single instruction, a nop. A nop takes 1 cycle to execute, so with a 36 MHz clock it takes roughly 28 ns, and roughly 16 ns at 64 MHz. You say you want to count the ticks with TIM1. What is the clock source of TIM1 in your configuration? Normally it is derived from sysclk (your 36 or 64 MHz). So the counter will increment by the same amount in both cases, which is why you are under the impression that the calculation time is the same. Your measure unit is not microseconds, but cycles.
My point is that you cannot measure the calculation time with a timer fed by the same clock as the core. You can change the prescaler so that the timer frequency is the same for each test, e.g. 1 MHz (i.e. a prescaler dividing by 36 for 36 MHz, and by 64 for 64 MHz).
2015-12-15 06:45 AM
Dear Kraal
Your example (nop) is very good and simple to discuss. Let me explain more:
As you mentioned, a nop requires only one clock cycle to execute. During this one cycle the TIM1 counter value increases by 1, at both 36 MHz and 64 MHz.
But if you check my formula, you will find that I divide the counter value by the timer clock frequency. Therefore:
in the case of a 36 MHz SysClock: 1/36000000 = 28 ns
in the case of a 64 MHz SysClock: 1/64000000 = 16 ns
Note that I don't use only the counter value to find the calculation time; I divide the counter value by the timer clock frequency, i.e.: (C2-C1)*(1+Tim1Prescaler)/Tim1ClockFrequency
Do you agree with me now?
2015-12-15 07:25 AM
Your measure unit is not microseconds, but cycles
Well cursory review of the math looks like he's getting into the time domain, so..
Personally I'd use DWT_CYCCNT as it has better granularity.
You might find, however, that the FLASH remains at a slow/constant speed, and the F1 doesn't provide any caching mechanism to alleviate that.
2015-12-16 02:17 AM
Well cursory review of the math looks like he's getting into the time domain, so..
I completely agree with that. However, I still think that using the same prescaler for both measurements is not the correct way to evaluate the calculation time, which is the OP's intention. Setting the timer frequency to be the same for both measurements should be the correct way to do what the OP wants.
Clive, I've seen other posts where you describe how to use DWT_CYCCNT, but in this case, since only sysclk is going to change (I presume the code stays the same), the number of cycles should be the same for each test, or did I miss something?
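
For reference, the two timer approaches discussed so far can be sketched in plain C. This is only an illustration, not code from any poster: the function names psc_for_tick_rate and elapsed_us are made up here, and timer_clk_hz is assumed to be the actual TIM1 input clock (which equals sysclk only if the APB2 prescaler is 1). The first helper picks a prescaler register value that gives the same tick rate at either clock; the second is the OP's formula, (C2-C1)*(1+Tim1Prescaler)/Tim1ClockFrequency, for a 16-bit counter.

```c
#include <assert.h>
#include <stdint.h>

/* Value to write to the timer prescaler register so the counter ticks at
 * tick_hz. STM32 timers divide their input clock by (PSC + 1).
 * timer_clk_hz is an assumption: the real TIM1 input clock on your board. */
static uint16_t psc_for_tick_rate(uint32_t timer_clk_hz, uint32_t tick_hz)
{
    assert(tick_hz != 0u && timer_clk_hz >= tick_hz);
    assert(timer_clk_hz % tick_hz == 0u);   /* only exact divisions handled */
    uint32_t divider = timer_clk_hz / tick_hz;
    assert(divider <= 65536u);              /* the prescaler is 16 bits wide */
    return (uint16_t)(divider - 1u);
}

/* Elapsed time in microseconds between two readings of a 16-bit up-counter:
 * (c2 - c1) * (1 + psc) / timer_clk_hz.
 * The uint16_t cast gives the right tick count across a single wrap-around. */
static double elapsed_us(uint16_t c1, uint16_t c2, uint16_t psc,
                         uint32_t timer_clk_hz)
{
    assert(timer_clk_hz != 0u);
    uint16_t ticks = (uint16_t)(c2 - c1);
    return (double)ticks * ((double)psc + 1.0) * 1e6 / (double)timer_clk_hz;
}
```

With the OP's numbers, elapsed_us(1000, 8000, 0, 64000000) gives about 109.4 usec. Both approaches end up in the time domain, provided the frequency passed in matches the clock the timer really runs from.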
2015-12-16 05:20 AM
Well at micro-seconds I'm not sure you'd need to use a prescaler at all, and it would keep the granularity fine. Measuring time vs time seems like a valid approach.
Now you might expect the cycles to be the same, if all things were equal between the two frequencies, and the code on the faster processor would run in less time, but the flash is going to insert more wait states. I'd expect the code on the faster processor to take more cycles, but still might be quicker in time. It might be a wash, which is what the OP is complaining about. The access time for the flash remains constant. One might want to try running the code from RAM, which is a single cycle source.
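
A minimal sketch of the DWT_CYCCNT approach mentioned above, assuming a Cortex-M3 part such as the F103 and the standard ARMv7-M debug register addresses. The helper names (cyccnt_enable, cyccnt_read, cycles_between, cycles_to_us) are made up for illustration, and do_math() in the usage comment is a hypothetical stand-in for the code under test. The register accesses only make sense on the target; the two arithmetic helpers are plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Cortex-M3 core debug registers (ARMv7-M architecture addresses). */
#define DEMCR       (*(volatile uint32_t *)(uintptr_t)0xE000EDFCu)
#define DWT_CTRL    (*(volatile uint32_t *)(uintptr_t)0xE0001000u)
#define DWT_CYCCNT  (*(volatile uint32_t *)(uintptr_t)0xE0001004u)

/* Target only: switch on the free-running 32-bit core cycle counter. */
static inline void cyccnt_enable(void)
{
    DEMCR     |= (1u << 24);   /* TRCENA: enable the DWT unit */
    DWT_CYCCNT = 0u;           /* start from zero */
    DWT_CTRL  |= (1u << 0);    /* CYCCNTENA: start counting core cycles */
}

/* Target only: read the current cycle count. */
static inline uint32_t cyccnt_read(void)
{
    return DWT_CYCCNT;
}

/* Cycles elapsed between two readings; unsigned subtraction stays correct
 * across one counter overflow. */
static inline uint32_t cycles_between(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to microseconds for a given core clock. */
static inline double cycles_to_us(uint32_t cycles, uint32_t sysclk_hz)
{
    assert(sysclk_hz != 0u);
    return (double)cycles * 1e6 / (double)sysclk_hz;
}

/* Usage on the target (do_math() is hypothetical):
 *
 *     cyccnt_enable();
 *     uint32_t t0 = cyccnt_read();
 *     do_math();
 *     uint32_t cycles = cycles_between(t0, cyccnt_read());
 *     double us = cycles_to_us(cycles, 64000000u);
 */
```

Comparing the raw cycle counts at 36 MHz and 64 MHz, and not only the converted times, would show directly whether the faster setting is spending extra cycles on flash wait states.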