2023-05-07 07:54 AM
I have these math functions that I would like to use the FPU for:
sqrtf()
cosf()
sinf()
fabs()
atan2f()
floorf()
ceilf()
truncf()
roundf()
How do I enable the FPU on the STM32H723? I see it is selected in the settings, but I'm not sure whether it is actually enabled.
Do the above-listed functions call the FPU to process them or am I missing something? I need my application to run as fast as possible without using the software math libraries. Any help would be greatly appreciated.
2023-05-07 08:20 AM
There's a command-line option for it; you can check that it is set.
You can generate a listing file and inspect it for FPU instructions.
The compiler can use intrinsic math functions, and can pull in a library for compound functionality.
The FPU isn't advanced: it doesn't support transcendental math functions, and it doesn't carry higher intermediate precision (i.e. it's not an 80x87 or 6888x type FPU).
You can always code your own math routines using the HW FPU.
You'll need to check ST's docs on the CORDIC functions.
For GNU, double precision, with arguments passed in FPU registers:
-mfpu=fpv5-d16 -mfloat-abi=hard
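One way to confirm those options actually reached the compiler is to look at its predefined macros. Below is a minimal sketch, assuming GCC or another ACLE-compliant compiler; the function name fp_build_config is made up for illustration. It only reports how the code was built, not whether the FPU was switched on at run time.

```c
/* Sketch: report the floating-point build configuration from predefined macros.
 * __arm__, __ARM_FP and __ARM_PCS_VFP are ACLE/GCC predefined macros;
 * fp_build_config is a hypothetical helper name. */
const char *fp_build_config(void)
{
#if !defined(__arm__)
    return "not a 32-bit ARM build";
#elif !defined(__ARM_FP)
    /* No hardware floating point available to the compiler (-mfloat-abi=soft). */
    return "software floating point";
#elif !defined(__ARM_PCS_VFP)
    /* FPU instructions are used, but arguments travel in core registers. */
    return "FPU instructions, softfp calling convention";
#elif (__ARM_FP & 0x8)
    /* Bit 3 of __ARM_FP indicates double-precision hardware support. */
    return "hard-float ABI, double-precision FPU";
#else
    return "hard-float ABI, single-precision FPU";
#endif
}
```

With -mfpu=fpv5-d16 -mfloat-abi=hard, this would be expected to take the last-but-one branch.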
2023-05-07 08:28 AM
The FPU instructions tend to require wrapping in library code; here is what Keil/RealView generates:
__hardfp_sqrtf
0x00000000: b510 .. PUSH {r4,lr}
0x00000002: ed2d8b02 -... VPUSH {d8}
0x00000006: eeb18ac0 .... VSQRT.F32 s16,s0
0x0000000a: ee180a10 .... VMOV r0,s16
0x0000000e: f0204000 ..@ BIC r0,r0,#0x80000000
0x00000012: f1c040ff ...@ RSB r0,r0,#0x7f800000
0x00000016: 0fc0 .. LSRS r0,r0,#31
0x00000018: d00a .. BEQ {pc}+0x18 ; 0x30
0x0000001a: ee100a10 .... VMOV r0,s0
0x0000001e: f0204000 ..@ BIC r0,r0,#0x80000000
0x00000022: f1c040ff ...@ RSB r0,r0,#0x7f800000
0x00000026: 0fc0 .. LSRS r0,r0,#31
0x00000028: bf04 .. ITT EQ
0x0000002a: 2001 . MOVEQ r0,#1
0x0000002c: f7fffffe .... BLEQ __set_errno
0x00000030: eeb00a48 ..H. VMOV.F32 s0,s16
0x00000034: ecbd8b02 .... VPOP {d8}
0x00000038: bd10 .. POP {r4,pc}
__hardfp_cosf
0x00000000: 4a44 DJ LDR r2,[pc,#272] ; [0x114] = 0x7e921fb6
0x00000002: ee101a10 .... VMOV r1,s0
0x00000006: b508 .. PUSH {r3,lr}
0x00000008: ebb20f41 ..A. CMP r2,r1,LSL #1
0x0000000c: 4668 hF MOV r0,sp
0x0000000e: d926 &. BLS {pc}+0x50 ; 0x5e
0x00000010: f04f40e6 O..@ MOV r0,#0x73000000
0x00000014: ebb00f41 ..A. CMP r0,r1,LSL #1
0x00000018: bf94 .. ITE LS
0x0000001a: 2000 . MOVLS r0,#0
0x0000001c: f04f30ff O..0 MOVHI r0,#0xffffffff
0x00000020: 9000 .. STR r0,[sp,#0]
0x00000022: 9800 .. LDR r0,[sp,#0]
0x00000024: eddf0a3c ..<. VLDR s1,[pc,#240] ; [0x118] = 0x3f800000
0x00000028: 2800 .( CMP r0,#0
0x0000002a: db5f _. BLT {pc}+0xc2 ; 0xec
0x0000002c: f0100f01 .... TST r0,#1
0x00000030: d045 E. BEQ {pc}+0x8e ; 0xbe
0x00000032: ee600a00 `... VMUL.F32 s1,s0,s0
0x00000036: f0100f02 .... TST r0,#2
0x0000003a: ed9f2a38 ..8* VLDR s4,[pc,#224] ; [0x11c] = 0x394c6d33
0x0000003e: eddf1a38 ..8. VLDR s3,[pc,#224] ; [0x120] = 0x3c0882da
0x00000042: ed9f1a38 ..8. VLDR s2,[pc,#224] ; [0x124] = 0xbe2aaaa0
0x00000046: ee401ac2 @... VMLS.F32 s3,s1,s4
0x0000004a: ee001aa1 .... VMLA.F32 s2,s1,s3
0x0000004e: ee211a20 !. . VMUL.F32 s2,s2,s1
0x00000052: eef00a40 ..@. VMOV.F32 s1,s0
0x00000056: ee400a01 @... VMLA.F32 s1,s0,s2
0x0000005a: d02d -. BEQ {pc}+0x5e ; 0xb8
0x0000005c: e040 @. B {pc}+0x84 ; 0xe0
0x0000005e: 4a32 2J LDR r2,[pc,#200] ; [0x128] = 0x46490e49
0x00000060: f0214300 !..C BIC r3,r1,#0x80000000
0x00000064: 429a .B CMP r2,r3
0x00000066: d93e >. BLS {pc}+0x80 ; 0xe6
0x00000068: f0114f00 ...O TST r1,#0x80000000
0x0000006c: eddf0a2f ../. VLDR s1,[pc,#188] ; [0x12c] = 0x3f22f983
0x00000070: ed9f1a2f ../. VLDR s2,[pc,#188] ; [0x130] = 0x4b000000
0x00000074: ee600a20 `. . VMUL.F32 s1,s0,s1
0x00000078: bf08 .. IT EQ
0x0000007a: ee700a81 p... VADDEQ.F32 s1,s1,s2
0x0000007e: ee700ac1 p... VSUB.F32 s1,s1,s2
0x00000082: bf18 .. IT NE
0x00000084: ee700a81 p... VADDNE.F32 s1,s1,s2
0x00000088: ed9f3a2a ..*: VLDR s6,[pc,#168] ; [0x134] = 0x3fc90000
0x0000008c: eddf2a2a ..** VLDR s5,[pc,#168] ; [0x138] = 0x39fda000
0x00000090: ed9f2a2a ..** VLDR s4,[pc,#168] ; [0x13c] = 0x33a22000
0x00000094: eddf1a2a ..*. VLDR s3,[pc,#168] ; [0x140] = 0x2c34611a
0x00000098: ee000ac3 .... VMLS.F32 s0,s1,s6
0x0000009c: eebd1ae0 .... VCVT.S32.F32 s2,s1
0x000000a0: ee000ae2 .... VMLS.F32 s0,s1,s5
0x000000a4: ee110a10 .... VMOV r0,s2
0x000000a8: f0000003 .... AND r0,r0,#3
0x000000ac: 9000 .. STR r0,[sp,#0]
0x000000ae: ee000ac2 .... VMLS.F32 s0,s1,s4
0x000000b2: ee000ae1 .... VMLS.F32 s0,s1,s3
0x000000b6: e7b4 .. B {pc}-0x94 ; 0x22
0x000000b8: eef10a60 ..`. VNEG.F32 s1,s1
0x000000bc: e010 .. B {pc}+0x24 ; 0xe0
0x000000be: ee200a00 ... VMUL.F32 s0,s0,s0
0x000000c2: ed9f2a20 .. * VLDR s4,[pc,#128] ; [0x144] = 0xbab23ab9
0x000000c6: f0100f02 .... TST r0,#2
0x000000ca: eddf1a1f .... VLDR s3,[pc,#124] ; [0x148] = 0x3d2a9fca
0x000000ce: ed9f1a1f .... VLDR s2,[pc,#124] ; [0x14c] = 0xbeffffdd
0x000000d2: ee401a02 @... VMLA.F32 s3,s0,s4
0x000000d6: ee001a21 ..!. VMLA.F32 s2,s0,s3
0x000000da: ee400a01 @... VMLA.F32 s1,s0,s2
0x000000de: d1eb .. BNE {pc}-0x26 ; 0xb8
0x000000e0: eeb00a60 ..`. VMOV.F32 s0,s1
0x000000e4: bd08 .. POP {r3,pc}
0x000000e6: f7fffffe .... BL __mathlib_rredf2
0x000000ea: e79a .. B {pc}-0xc8 ; 0x22
0x000000ec: ee100a10 .... VMOV r0,s0
0x000000f0: 0040 @. LSLS r0,r0,#1
0x000000f2: f1b04f7f ...O CMP r0,#0xff000000
0x000000f6: d3f3 .. BCC {pc}-0x16 ; 0xe0
0x000000f8: d107 .. BNE {pc}+0x12 ; 0x10a
0x000000fa: f04f0001 O... MOV r0,#1
0x000000fe: f7fffffe .... BL __set_errno
0x00000102: e8bd4008 ...@ POP {r3,lr}
0x00000106: f7ffbffe .... B.W __mathlib_flt_invalid
0x0000010a: e8bd4008 ...@ POP {r3,lr}
0x0000010e: f7ffbffe .... B.W __mathlib_flt_infnan
$d
0x00000112: 0000 .. DCW 0
0x00000114: 7e921fb6 ...~ DCD 2123505590
0x00000118: 3f800000 ...? DCD 1065353216
0x0000011c: 394c6d33 3mL9 DCD 961310003
0x00000120: 3c0882da ...< DCD 1007190746
0x00000124: be2aaaa0 ..*. DCD 3190467232
0x00000128: 46490e49 I.IF DCD 1179192905
0x0000012c: 3f22f983 .."? DCD 1059256707
0x00000130: 4b000000 ...K DCD 1258291200
0x00000134: 3fc90000 ...? DCD 1070137344
0x00000138: 39fda000 ...9 DCD 972922880
0x0000013c: 33a22000 . .3 DCD 866263040
0x00000140: 2c34611a .a4, DCD 741630234
0x00000144: bab23ab9 .:.. DCD 3132242617
0x00000148: 3d2a9fca ..*= DCD 1026203594
0x0000014c: beffffdd .... DCD 3204448221
2023-05-07 08:31 AM - edited 2023-11-20 06:11 AM
I did this:
And this:
Are those correct? All I want to do is to enable the FPU and then I want to call the functions I listed to use the FPU. I don't want to code my own math libraries or anything like that. I just want to be able to process the listed functions with the FPU instead of calling the software math functions which are slower than the FPU.
2023-05-07 09:03 AM
Looks reasonable. I use Keil and GNU/GCC via make
Typically code in SystemInit() enables the co-processor.
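For reference, enabling the co-processor comes down to granting full access to CP10 and CP11 in the CPACR register. A minimal sketch of the bit manipulation follows; the helper names are made up, and the register access in the comment assumes the CMSIS device header is included.

```c
#include <stdint.h>

/* CP10 and CP11 access fields in CPACR are bits 21:20 and 23:22; 0b11 = full access. */
#define CPACR_CP10_CP11_FULL ((uint32_t)((3UL << 20) | (3UL << 22)))

/* Hypothetical helper: CPACR value with the FPU access bits set. */
static inline uint32_t cpacr_with_fpu_enabled(uint32_t cpacr)
{
    return cpacr | CPACR_CP10_CP11_FULL;
}

/* Hypothetical helper: non-zero when both CP10 and CP11 have full access. */
static inline int cpacr_fpu_is_enabled(uint32_t cpacr)
{
    return (cpacr & CPACR_CP10_CP11_FULL) == CPACR_CP10_CP11_FULL;
}

/* On the target (assuming the CMSIS device header provides SCB):
 *   SCB->CPACR = cpacr_with_fpu_enabled(SCB->CPACR);
 *   if (!cpacr_fpu_is_enabled(SCB->CPACR)) { ... handle error ... }
 */
```

Reading SCB->CPACR in the debugger and checking for 0x00F00000 in those bits is a quick way to confirm the FPU is on.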
Look at the listing and map files to confirm to yourself what's built and which libraries it pulls in. The FPU functions are wrapped in library code to make them compliant with C math.h functional expectations.
2023-05-07 09:07 AM
> Are those correct?
OK, so: you set -mfloat-abi=hard, so the compiler will use the FPU. (No need to check.)
2023-05-07 12:07 PM
How and where do I put -mfloat-abi? I am using CubeMX.
2023-05-07 12:09 PM
I know where the function is for the FPU; I just want to make sure that what I did will enable it, as I showed in the first picture. I added FPU_USED and FPU_PRESENT. Is this the correct way to do it in CubeMX?
2023-05-09 12:42 AM - edited 2023-11-20 06:12 AM
Just speaking for myself: I never added anything in Cube for the FPU!
It's just a compiler setting, so set it ->
and it uses the FPU. That's all.