Floating-point execution times can vary widely between different versions of the same compiler (see the ARM7 GNU results below).
I'm in the same boat as you: I need floating point, and all the integer processors are poor at it (figure 100 times slower as a first estimate). Below are some timings a friend and I compiled. If anyone has more data, please add to the list. Thanks!

MSP430, ImageCraft compiler, 32-bit floats, typical cycles:
    add 158, sub 184, mul 332, div 620

AVR, IAR, full optimization for speed, 32-bit floats, typical cycles:
    add 173, sub 176, mul 175, div 694, sqrt 2586, log 3255

ARM7, GNU 3.3.1 compiler, 32-bit floats, typical cycles:
    add 472, sub 478, mul 439, div 652, sqrt 2387, log 13,523

ARM7, GNU 3.4.3 compiler, 32-bit floats, typical cycles:
    add 73, sub 74, mul 428, div 142

8051, Keil compiler, 32-bit floats, typical cycles:
    add 199, sub 201, mul 219, div 895, sqrt 1117, log 2006
Thanks a lot for the hint! You are right. I have re-checked my results and got new ones:

Cortex-M3, Keil 3.22, option -O3 Time, "Use MicroLIB" checked, exact CPU cycles:
    cos 29097, sin 30516, sqrt 2028, atan 29795, add 87, mul 449, div 249

Cortex-M3, Keil 3.22, option -O3 Time, "Use MicroLIB" NOT checked, exact CPU cycles:
    cos 2667, sin 2710, sqrt 728, atan 2587, add 35, mul 38, div 68

However, without MicroLIB the program code size doubles, from 6K to 12K, which is still quite reasonable.

Cortex-M3, IAR 5.11 compiler, 32-bit floats, exact CPU cycles:
    cos 1864, sin 1946, sqrt 2725, atan 3423, add 40, mul 33, div 53

So now I see that IAR and Keil are quite similar on average ;)

Hi markus,

In my program I simply used the SysTick counter, subtracted the software overhead of the measurement itself, and stored the results in a Results[] table:

    /* Configure HCLK/CPU clock as SysTick clock source */
    SysTick_CLKSourceConfig(SysTick_CLKSource_HCLK);
    SysTick_SetReload(0xFFFFFF);

    /* Baseline: time a plain assignment to measure the overhead */
    SysTick_CounterCmd(SysTick_Counter_Enable);
    var0[0] = var1;
    SysTick_CounterCmd(SysTick_Counter_Disable);
    Dummytiming = 0xFFFFFF - SysTick_GetCounter();

    /* Clear the SysTick Counter */
    SysTick_CounterCmd(SysTick_Counter_Clear);
    SysTick_SetReload(0xFFFFFF);

    /* Time the cos() call and subtract the baseline */
    SysTick_CounterCmd(SysTick_Counter_Enable);
    var0[0] = cos(var1);
    SysTick_CounterCmd(SysTick_Counter_Disable);
    Results[0] = 0xFFFFFF - SysTick_GetCounter() - Dummytiming;

    /* Clear the SysTick Counter */
    SysTick_CounterCmd(SysTick_Counter_Clear);
    .....

Cheers, STOne-32.

[ This message was edited by: STOne-32 on 11-06-2008 20:33 ]
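The arithmetic behind the SysTick measurement above can be sketched as two small helpers. This is only an illustration, not STOne-32's actual code: the names `elapsed_cycles` and `net_cycles` are made up here, and the sketch assumes the counter was reloaded with 0xFFFFFF and counts down, as in the post. It also assumes the timed operation finishes before the 24-bit counter wraps.

```c
#include <stdint.h>

#define SYSTICK_RELOAD 0xFFFFFFu  /* 24-bit reload value used in the post */

/* Cycles counted since a down-counter was reloaded with SYSTICK_RELOAD.
   counter_now stands in for the value returned by SysTick_GetCounter(). */
uint32_t elapsed_cycles(uint32_t counter_now)
{
    return (SYSTICK_RELOAD - counter_now) & 0xFFFFFFu;
}

/* Net cost of an operation: raw measurement minus the baseline
   ("Dummytiming" in the post). Clamped at 0 so noise cannot underflow. */
uint32_t net_cycles(uint32_t raw, uint32_t baseline)
{
    return (raw > baseline) ? (raw - baseline) : 0u;
}
```

For example, with an invented counter reading of 0xFFFFFF - 2679 and an invented baseline of 12 cycles, `net_cycles(elapsed_cycles(reading), 12)` gives 2667.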
We just bought the Keil compiler and I got a bit worried about your results, even if floats are not that important to us. I used toolchain version 3.20 and tried with and without MicroLIB. The difference was huge.

With MicroLIB:
    cos 34443, sin 36140, sqrt 2319, atan 35339, add 120, mult 525, div 327

Without MicroLIB:
    cos 2957, sin 3006, sqrt 821, atan 2800, add 56, mult 58, div 92

It seems MicroLIB is not that good at floats. When we asked about the drawbacks of using MicroLIB, the answer was "None, it's better". Now we know at least one difference :)

/Niklas
Reading from address 0xE0001004 always returns 0:

    #define CORE_SysTick (*((u32*)0xE0001004))
    int Systiming = CORE_SysTick;

Also, if I try to use cos(), the CPU ends up in the HardFaultException() interrupt. The line is:

    var0[0]=cos(var1);

The compiler gives no errors or warnings, and I have all the libs included:

    arm-none-eabi-ld -v -e 0 -LC:/WinARM/CodeSourcery/lib/gcc/arm-none-eabi/4.2.3/thumb -LC:/WinARM/CodeSourcery/arm-none-eabi/lib/thumb -Tprj/STM32F103CB-ROM.ld -Map=main.map out/src/main.o out/src/stm32f10x_it.o out/src/stm32f10x_vector.o -lm -lgcc --output main.elf

What am I doing wrong?
Hi STOne-32,

I measured the execution time by running each of the functions in the debugger and checking the CYCCNT register before and afterwards. In Keil uVision this is shown in the Regs tab under Internal, but it can also be accessed at address 0xE0001004 (according to the Cortex-M3 reference manual).

I read a bit about MicroLIB in Keil's online manual, and it states: "As MicroLib has been optimized to minimize code size, some functions will execute more slowly than the standard C library routines available in the RealView compilation tools." It also states: "Microlib can sensibly be used with either --fpmode=std or --fpmode=fast." I tried --fpmode=fast and did not notice any difference, but perhaps you need to define the variables as double instead of float to notice.

/Niklas
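One thing worth checking if CYCCNT at 0xE0001004 reads as 0 from your own code: on ARMv7-M parts the counter does not run until the trace block is enabled (TRCENA, bit 24 of DEMCR at 0xE000EDFC) and the counter itself is switched on (CYCCNTENA, bit 0 of DWT_CTRL at 0xE0001000). A debugger may do this for you, which could explain why it works in uVision but not standalone. Below is a minimal sketch under those assumptions; the `cycle_counter_t` struct and function names are made up for illustration, and the registers are passed as pointers so the logic can also be tried off-target.

```c
#include <stdint.h>

/* Bit masks for DEMCR.TRCENA and DWT_CTRL.CYCCNTENA (ARMv7-M). */
#define DEMCR_TRCENA       (1u << 24)
#define DWT_CTRL_CYCCNTENA (1u << 0)

/* Hypothetical holder for the three registers involved. */
typedef struct {
    volatile uint32_t *demcr;     /* 0xE000EDFC on Cortex-M3 */
    volatile uint32_t *dwt_ctrl;  /* 0xE0001000 */
    volatile uint32_t *cyccnt;    /* 0xE0001004 */
} cycle_counter_t;

/* Enable the trace block, zero the counter, then start it counting. */
void cycle_counter_start(const cycle_counter_t *cc)
{
    *cc->demcr    |= DEMCR_TRCENA;
    *cc->cyccnt    = 0u;
    *cc->dwt_ctrl |= DWT_CTRL_CYCCNTENA;
}

/* Cycles between two CYCCNT readings. CYCCNT counts up, and unsigned
   subtraction gives the right answer across a single 32-bit wrap. */
uint32_t cycle_counter_diff(uint32_t start, uint32_t end)
{
    return end - start;
}
```

On the target, the struct would be filled with the three addresses cast to `volatile uint32_t *`; then read `*cc.cyccnt` before and after the code under test and pass both readings to `cycle_counter_diff`.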
But I have problems working with floats:

    volatile float Var1, Var2, Var3;
    Var2 = Var3 = 3.0;
    Var1 = Var2 + Var3;

On the last line an exception occurs and the core goes to the HardFaultException() interrupt. In assembler:

    0x0000043a : ldr r0, [sp, #8]
    0x0000043c : ldr r1, [sp, #4]
    0x0000043e : blx 0x8848
    0x00000442 : str r0, [sp, #12]
    0x00000444 : mov.w r5, #0 ; 0x0
    0x00000448 : mov r7, r5
    0x0000044a : mov r6, r5

I have no idea what is wrong. Can you help me, please?
I'm trying to do floating-point operations on the STM32 too. I tried to run the code on my IAR KickStart board via the IAR J-Link, but the results in the var0[] array were unavailable. Could I have your IAR project for this? Thank you.

[ This message was edited by: suyongyao on 13-06-2008 11:17 ]
I have results for Rowley CrossWorks 1.7, using the flash debug option and the hardware cycle counter at address 0xE0001004.

No optimization:
    cos: 2733, sin: 2637, sqrt: 2310, atan: 3590, add: 46, mul: 50, div: 138

Optimization level 2:
    cos: 2690, sin: 2594, sqrt: 2267, atan: 3547
    (add, mul and div were optimized away ;)

And these were the library functions for double! Using the float versions (32 bit) gives these results with no optimization:
    cosf: 1288, sinf: 1190, sqrtf: 819, atanf: 1720

So ... the winner is?

Anyway, we should also look at the documentation for the float libraries. I think that super-fast math doesn't even bother to check for valid data, supported types (single, double), etc. Rowley does all that.

[ This message was edited by: slawcus on 20-06-2008 13:55 ]
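Two of the points above (add/mul/div being optimized away, and double versus float library functions) can be illustrated with a small host-side sketch. This is not the benchmark code anyone in the thread used; the function names are invented for illustration. It relies only on standard C: volatile accesses must actually be performed, `cos` works in double while `cosf` works in float, and an unsuffixed literal such as 0.5 is a double.

```c
#include <math.h>
#include <stddef.h>

/* Volatile operands force the addition to happen at run time,
   so the optimizer cannot fold it into a constant or drop it. */
float timed_add(float a, float b)
{
    volatile float x = a;
    volatile float y = b;
    volatile float sum = x + y;
    return sum;
}

/* Size in bytes of the type each expression is evaluated in.
   sizeof does not evaluate its operand, so nothing is called here. */
size_t width_cos(float a)         { return sizeof(cos(a)); }   /* double routine */
size_t width_cosf(float a)        { return sizeof(cosf(a)); }  /* float routine */
size_t width_mul_literal(float a) { return sizeof(a * 0.5); }  /* 0.5 is a double */
size_t width_mul_suffix(float a)  { return sizeof(a * 0.5f); } /* 0.5f stays float */
```

On a soft-float target, every expression that ends up in double pulls in the slower 64-bit routines, so it is worth stating in any posted result whether `cos` or `cosf` was timed.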