Compiling with float?

- Try calling the linker through the gcc driver; it will probably solve your problem, e.g.

arm-none-eabi-gcc -Wl,-LC:/WinARM/CodeSourcery/lib/gcc/arm-none-eabi/4.2.3/thumb -Tprj/STM32F103CB-ROM.ld -Wl,-Map=main.map out/lib/src/cortexm3_macro.o out/src/main.o out/src/stm32f10x_it.o out/src/stm32f10x_vector.o out/src/analogout.o -o main.elf

Cheers

sjo

[ This message was edited by: sjo on 10-06-2008 15:08 ]

- Hi all,

I've used both Keil version 3.22 (free 32K version) and IAR EWARM v5.11 (free 32K version), but not yet GCC from CodeSourcery, on a small routine that uses cos(), sin(), one addition, one multiplication and one division with only "float" variables, running on the Keil "MCBSTM32" board at 8 MHz. I found that the same function is 7 times faster with IAR than with the Keil floating-point libraries.

I'm very interested to see the result with GCC :-) and I'm wondering if you have experienced similar results? It seems very strange that Keil is not providing optimized floating-point libraries.

Best Regards,

Asterix.

PS: I've used #include "math.h" in the header files.

[ This message was edited by: asterix.magigimix on 10-06-2008 17:47 ]

- Floating point execution times can vary widely between different versions of compilers; see the ARM GNU figures below.

I'm in the same boat as you: I need floating point, and all the integer processors suck at it (figure 100 times slower as a first estimate). Below are some timings a friend and I compiled; if anyone has more data, please add to the list. Thanks!

MSP430, 32 bit floats, ImageCraft compiler, typical cycles

add 158

sub 184

mul 332

div 620

AVR, IAR Full opt for speed, 32 bit floats, typical cycles

add 173

sub 176

mul 175

div 694

sqrt 2586

log 3255

ARM7, GNU 3.3.1 compiler, 32 bit floats, typical cycles

add 472

sub 478

mul 439

div 652

sqrt 2387

log 13,523

ARM7, GNU 3.4.3 compiler, 32 bit floats, typical cycles

add 73

sub 74

mul 428

div 142

8051, Keil compiler, 32 bit floats, typical cycles

add 199

sub 201

mul 219

div 895

sqrt 1117

log 2006

- Hi sjo, Asterix,

I've done some basic measurements with RVCT (Keil 3.22) and EWARM 5.11 from IAR using our Cortex-M3, and here is my function:

#include <math.h>

#define __pi 3.14159265

float var1 = __pi ;

float var2 = __pi/2;

float var0[10] = {0,0,0,0,0,0,0,0,0,0};

int main(void)

{

var0[0]= cos(var1);

var0[1]= sin(var1);

var0[2]= sqrt(var1);

var0[3]= atan(var1);

var0[4]= var2 + var1;

var0[5]= var2 * var1;

var0[6]= var2 / var1;

while(1);

}

Results are:

**Cortex-M3, Keil 3.22 compiler, 32 bit floats, exact CPU cycles**

cos 41359

sin 43389

sqrt 3090

atan 42243

add 127

mul 624

div 445

**Cortex-M3, IAR 5.11 compiler, 32 bit floats, exact CPU cycles**

cos 1864

sin 1946

sqrt 2725

atan 3423

add 40

mul 33

div 53

I'm quite surprised by the results too: Keil is completely off when using the trigonometric library functions. Maybe I'm missing something here? Could you please try the same at your end and tell me your results for GCC from CodeSourcery?

Cheers,

STOne-32.

- Hi Niklas,

Thanks a lot for the hint! You are right: I have re-checked my results and I get new ones.

Results are:

**Cortex-M3, Keil 3.22, option -O3 Time + "Use MicroLib" checked, exact CPU cycles**

cos 29097

sin 30516

sqrt 2028

atan 29795

add 87

mul 449

div 249

**Cortex-M3, Keil 3.22, option -O3 Time + "Use MicroLib" NOT checked, exact CPU cycles**

cos 2667

sin 2710

sqrt 728

atan 2587

add 35

mul 38

div 68

=> However, I see that the program code size has now doubled, from 6K to 12K, which is quite reasonable.

**Cortex-M3, IAR 5.11 compiler, 32 bit floats, exact CPU cycles**

cos 1864

sin 1946

sqrt 2725

atan 3423

add 40

mul 33

div 53

So now I see that IAR and Keil are quite similar on average ;-)

Hi Markus,

In my program I simply used the SysTick counter, removed some software overhead for the function entries, and then stored the results in a Results[] table.

/* Configure HCLK/CPU clock as SysTick clock source */

SysTick_CLKSourceConfig(SysTick_CLKSource_HCLK);

SysTick_SetReload(0xFFFFFF);

/* Enable the SysTick Counter */

SysTick_CounterCmd(SysTick_Counter_Enable);

var0[0] = var1;

SysTick_CounterCmd(SysTick_Counter_Disable);

Dummytiming= 0xFFFFFF - SysTick_GetCounter();

/* Clear the SysTick Counter */

SysTick_CounterCmd(SysTick_Counter_Clear);

SysTick_SetReload(0xFFFFFF);

/* Enable the SysTick Counter */

SysTick_CounterCmd(SysTick_Counter_Enable);

var0[0]=cos(var1);

SysTick_CounterCmd(SysTick_Counter_Disable);

Results[0]= 0xFFFFFF - SysTick_GetCounter() - Dummytiming;

/* Clear the SysTick Counter */

SysTick_CounterCmd(SysTick_Counter_Clear);

.....
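The measurement pattern above (read a down-counter before and after, then subtract a calibration run) can be sketched independently of the ST library calls. This is only a minimal sketch: the helper names `systick_elapsed` and `net_cycles` are hypothetical, and it assumes the reload value is 0xFFFFFF as in the snippet above, so the counter period is exactly 2^24.

```c
#include <stdint.h>

/* SysTick on the Cortex-M3 is a 24-bit down-counter. */
#define SYSTICK_MASK 0x00FFFFFFu

/* Ticks elapsed between two readings of the down-counter.
   Assumption: reload value is 0xFFFFFF and the counter wrapped
   at most once between the two readings. */
static uint32_t systick_elapsed(uint32_t start, uint32_t end)
{
    return (start - end) & SYSTICK_MASK;
}

/* Net cost of the measured statement once the overhead of an
   "empty" measurement (Dummytiming in the post) is removed. */
static uint32_t net_cycles(uint32_t raw, uint32_t overhead)
{
    return (raw > overhead) ? (raw - overhead) : 0u;
}
```

Masking the difference to 24 bits keeps the result correct even if the counter passed through zero once during the measurement, which a plain `0xFFFFFF - counter` does not need only because the counter is reloaded before every run.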

Cheers,

STOne-32.

[ This message was edited by: STOne-32 on 11-06-2008 20:33 ]

- Hi STOne-32

We just bought the Keil compiler and I got a bit worried about your results, even if floats are not that important to us.

I used toolchain version 3.20 and tried with and without the "MicroLIB". The difference was huge.

With MicroLIB:

cos 34443

sin 36140

sqrt 2319

atan 35339

add 120

mult 525

div 327

Without MicroLIB:

cos 2957

sin 3006

sqrt 821

atan 2800

add 56

mult 58

div 92

It seems the MicroLIB is not that good at floats. When asking about the drawbacks of using MicroLIB, the answer has been "None, it's better". Now we know at least one difference :-)

/Niklas

- Hi Markus

Hi STOne-32

I measured the execution time by running each of the functions in the debugger and checking the CYCCNT register before and afterwards. In Keil uVision this is shown in the Regs tab under Internal, but it can also be accessed at address 0xE0001004 (according to the Cortex-M3 reference manual).

I read a bit about the MicroLIB in Keil's online manual, and it states: *As MicroLib has been optimized to minimize code size, some functions will execute more slowly than the standard C library routines available in the RealView compilation tools.*

It also states: *Microlib can sensibly be used with either --fpmode=std or --fpmode=fast.*

I tried using --fpmode=fast but I did not notice any difference, but perhaps you need to define the variables as double instead of float to notice.

/Niklas

- Hello,

A read from address 0xE0001004 always returns 0:

#define CORE_SysTick (*((volatile u32*)0xE0001004))

int Systiming = CORE_SysTick;
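One possible reason for a constant zero reading: the cycle counter at 0xE0001004 (DWT CYCCNT) does not run until trace is enabled (TRCENA, bit 24 of DEMCR at 0xE000EDFC) and the counter itself is switched on (CYCCNTENA, bit 0 of DWT_CTRL at 0xE0001000). A debugger often sets these bits for you, which may explain why the register looks alive in uVision. The sketch below is only an illustration; the function name `cyccnt_enable` is hypothetical, and the registers are passed in as pointers so the same code can be tried on a PC with ordinary variables.

```c
#include <stdint.h>

/* Cortex-M3 core debug / DWT register addresses. */
#define DEMCR_ADDR          0xE000EDFCu  /* Debug Exception and Monitor Control */
#define DWT_CTRL_ADDR       0xE0001000u  /* DWT control register */
#define DWT_CYCCNT_ADDR     0xE0001004u  /* DWT cycle counter */

#define DEMCR_TRCENA        (1u << 24)   /* enables the DWT block */
#define DWT_CTRL_CYCCNTENA  (1u << 0)    /* starts the cycle counter */

/* Hypothetical helper: enable and reset the cycle counter.
   Registers are parameters so the function is testable off-target. */
static void cyccnt_enable(volatile uint32_t *demcr,
                          volatile uint32_t *dwt_ctrl,
                          volatile uint32_t *cyccnt)
{
    *demcr    |= DEMCR_TRCENA;        /* power up the trace/DWT logic */
    *cyccnt    = 0u;                  /* start counting from zero */
    *dwt_ctrl |= DWT_CTRL_CYCCNTENA;  /* let the counter run */
}
```

On the target this would be called once at start-up as `cyccnt_enable((volatile uint32_t *)DEMCR_ADDR, (volatile uint32_t *)DWT_CTRL_ADDR, (volatile uint32_t *)DWT_CYCCNT_ADDR);` before the first read of the counter.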

If I use cos(), the CPU jumps to the HardFaultException() interrupt handler.

Line: var0[0]=cos(var1);

The compiler reports no errors or warnings, and I have all libs included:

arm-none-eabi-ld -v -e 0 -LC:/WinARM/CodeSourcery/lib/gcc/arm-none-eabi/4.2.3/thumb -LC:/WinARM/CodeSourcery/arm-none-eabi/lib/thumb -Tprj/STM32F103CB-ROM.ld -Map=main.map out/src/main.o out/src/stm32f10x_it.o out/src/stm32f10x_vector.o -lm -lgcc --output main.elf

What am I doing wrong?

- Now the SysTick counter is working.

But I have problems working with floats:

volatile float Var1, Var2, Var3;

Var2 = Var3 = 3.0;

Var1 = Var2 + Var3;

The last line raises an exception and the core enters the HardFaultException() handler.

In Assembler:

0x0000043a : ldr r0, [sp, #8]

0x0000043c : ldr r1, [sp, #4]

0x0000043e : blx 0x8848

0x00000442 : str r0, [sp, #12]

0x00000444 : mov.w r5, #0 ; 0x0

0x00000448 : mov r7, r5

0x0000044a : mov r6, r5

I have no idea what is wrong. Can you help me, please?

- Hi Markus,

Thank you for your contribution and the results for GNU CodeSourcery,

so we can say that both:

1) Keil 3.22, option -O3 Time, not using the "MicroLib" option

2) IAR 5.11, option Full Speed

have about the most optimized floating-point libraries running on STM32 on average :-)

Let's see the next builds of both Compilers coming in the next couple of weeks/months ;-)

Cheers, STOne-32.

- I have results for Rowley Crossworks 1.7 with no optimization, flash debug option and HW cycle counter at address 0xE0001004

cos: 2733

sin: 2637

sqrt: 2310

atan: 3590

add: 46

mul: 50

div: 138

for optimization level 2:

cos: 2690

sin: 2594

sqrt: 2267

atan: 3547

add, mul and div were optimized away ;)
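When the operands are ordinary variables with constant initial values, an optimizer is free to compute the sum, product and quotient at compile time, so there is nothing left to time. A common way around this, shown here only as a sketch with hypothetical names (`op_a`, `op_b`, `timed_add` and so on), is to make the operands and the result `volatile` so that every access has to happen at run time.

```c
/* volatile operands must be loaded at run time, so the compiler
   cannot fold the arithmetic into a constant. */
static volatile float op_a = 3.14159265f;   /* pi   */
static volatile float op_b = 1.57079633f;   /* pi/2 */

/* volatile result: the store cannot be dropped as dead code. */
static volatile float result;

static float timed_add(void) { result = op_b + op_a; return result; }
static float timed_mul(void) { result = op_b * op_a; return result; }
static float timed_div(void) { result = op_b / op_a; return result; }
```

The start and stop reads of the cycle counter would go around each call (or around the single statement inside it); the function-call overhead can then be removed with a calibration run, as in the SysTick example earlier in the thread.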

Note that the cos, sin, sqrt and atan figures above are for the double-precision library functions!

Using the float versions (32 bit) gives these results with no optimization:

cosf: 1288

sinf: 1190

sqrtf: 819

atanf: 1720
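The difference between the two sets of figures comes from C's rules rather than from the toolchain: `cos()` from `<math.h>` takes and returns `double`, so a `float` argument is promoted and the double-precision routine runs, while `cosf()` stays in single precision. The sketch below only inspects the result types; the helper names are hypothetical, and `sizeof` does not evaluate its operand, so no math routine is actually called.

```c
#include <math.h>
#include <stddef.h>

/* Size of the value produced by cos() when given a float:
   the argument is promoted and the result is a double. */
static size_t cos_result_size(float x)
{
    return sizeof(cos(x));
}

/* Size of the value produced by cosf(): stays a float. */
static size_t cosf_result_size(float x)
{
    return sizeof(cosf(x));
}
```

In the benchmark posted earlier in the thread, `var0[0] = cos(var1);` therefore measures the double routine plus two conversions; replacing it with `cosf(var1)` is what gives numbers comparable to the float figures above.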

So ... the winner is?

Anyway, we should also look at documentation for float libs. I think that super fast math doesn't even bother to check for valid data, supported types (single, double), etc. Rowley does all that.

[ This message was edited by: slawcus on 20-06-2008 13:55 ]

Quote:

On 19-06-2008 at 19:26, Anonymous wrote:

**With arm-none-eabi-gcc in thumb2 mode (GNU ld (Sourcery G++ Lite 2008q1-126) 2.18.50.20080215), 32 bit floats, exact CPU cycles**

cos 4316

sin 4319

sqrt ???

atan 4440

add 92

mul 48

div 73

This information is very helpful! Thank you! Could you maybe run the same benchmark, but only with 64-bit floating point numbers (double)? I'm busy with a project that requires 64-bit precision and need to know if the Cortex-M3 with GNU GCC will work.

In my project I must use float numbers, but when compiling I get this error:

undefined reference to `__aeabi_ui2f'

undefined reference to `__aeabi_fdiv'

I start the linker with this command:

arm-none-eabi-ld -v -lgcc -LC:/WinARM/CodeSourcery/lib/gcc/arm-none-eabi/4.2.3/thumb -Tprj/STM32F103CB-ROM.ld -Map=main.map out/lib/src/cortexm3_macro.o out/src/main.o out/src/stm32f10x_it.o out/src/stm32f10x_vector.o out/src/analogout.o --output main.elf

The path given to -L exists, and so does the file libgcc.a.

Or have I forgotten a parameter?

Thank you for your help. Regards, Markus.
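For context on the two undefined symbols: on an ARM EABI target without an FPU, the compiler turns float operations into calls to run-time helpers that live in libgcc. `__aeabi_ui2f` converts an unsigned integer to float and `__aeabi_fdiv` divides two floats. A single expression is enough to need both, as in this sketch (the function name `scale_reading` is hypothetical and not taken from the project above).

```c
/* On a soft-float ARM EABI build this one line needs two helpers:
   (float)raw       -> __aeabi_ui2f
   ... / full_scale -> __aeabi_fdiv */
float scale_reading(unsigned int raw, float full_scale)
{
    return (float)raw / full_scale;
}
```

Since GNU ld searches a library only at the point where it appears on the command line, a `-lgcc` placed before the object files is scanned before anything refers to these helpers. Moving `-lgcc` after the objects, or linking through the gcc driver as suggested at the top of the thread, is the usual way to get them resolved.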