How to use the FPU on the STM32H723

ENort.1 · ‎2023-05-07

I have these math functions that I would like to use the FPU for:

sqrtf()

cosf()

sinf()

fabs()

atan2f()

floorf()

ceilf()

truncf()

roundf()

How do I enable the FPU on the STM32H723? I see it is selected in the settings, but I am not sure whether it is actually enabled.

Do the above-listed functions call the FPU to process them or am I missing something? I need my application to run as fast as possible without using the software math libraries. Any help would be greatly appreciated.

Tesla DeLorean · ‎2023-05-07

There's a compiler command-line option for it; you can check that it is set.

You can generate a listing file and inspect that.

The compiler can use intrinsic math functions, and can pull a library containing compound functionality.

The FPU isn't advanced: it doesn't support transcendental math functions, and it doesn't carry higher intermediate precision (i.e. it is not an 80x87 or 6888x type FPU).

You can always code your own math routines using the HW/FPU
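As a rough illustration of coding a routine yourself on top of the FPU's basic multiply and add operations, here is a minimal sketch of a polynomial sine. The name `poly_sinf` and the choice of a plain Taylor polynomial are assumptions made for illustration, not what any vendor library does. It is only meant for inputs already reduced to roughly [-pi/2, pi/2], and it is less accurate than `sinf()` (the error grows to roughly 2e-4 near the ends of that range).

```c
/* Hypothetical helper: sine approximation for x in roughly [-pi/2, pi/2].
 * Uses a 7th-order Taylor polynomial evaluated with Horner's rule, so it
 * needs only multiplies and adds, which a single-precision FPU can do
 * directly. No range reduction and no NaN/Inf/errno handling is done. */
static float poly_sinf(float x)
{
    const float c3 = -1.0f / 6.0f;    /* -1/3! */
    const float c5 =  1.0f / 120.0f;  /*  1/5! */
    const float c7 = -1.0f / 5040.0f; /* -1/7! */

    float x2 = x * x;
    float p  = c7;
    p = c5 + x2 * p;
    p = c3 + x2 * p;

    /* sin(x) ~= x + x^3 * (c3 + c5*x^2 + c7*x^4) */
    return x + x * x2 * p;
}
```

Skipping the range reduction and the errno/NaN handling is what makes this shorter than a library routine, so it only fits cases where the input range is known in advance.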

You'll need to check the docs on ST's CORDIC functions.

GNU, double precision, arguments passed in FPU registers:

-mfpu=fpv5-d16 -mfloat-abi=hard
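One way to confirm from inside the code that options like these actually reached the compiler is to test the predefined macros. The sketch below assumes a GCC or Clang style toolchain that defines `__arm__`, `__ARM_FP` and `__ARM_PCS_VFP`; the function name `fp_build_config` is hypothetical, and the macro behaviour should be checked against your own toolchain's documentation.

```c
/* Reports how the compiler was configured for floating point, decided
 * entirely at compile time. Assumes __arm__, __ARM_FP and __ARM_PCS_VFP
 * are predefined by the toolchain for 32-bit Arm targets. On any other
 * target the first branch is taken, so the file still builds on a PC. */
static const char *fp_build_config(void)
{
#if !defined(__arm__)
    return "not a 32-bit Arm build";
#elif defined(__ARM_FP) && defined(__ARM_PCS_VFP)
  #if (__ARM_FP & 0x8)
    /* Bit 3 of __ARM_FP indicates 64-bit (double) hardware support. */
    return "hard-float ABI, single and double precision in hardware";
  #else
    return "hard-float ABI, single precision in hardware only";
  #endif
#elif defined(__ARM_FP)
    return "FPU instructions used, arguments passed in core registers";
#else
    return "no FPU instructions, floating point is done in software";
#endif
}
```

Printing the returned string once at startup, over whatever debug output the project already has, gives a quick sanity check that complements inspecting the listing file.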

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

Tesla DeLorean · ‎2023-05-07

The FPU instruction tends to require wrapping; here is the Keil/RealView output:

    __hardfp_sqrtf
        0x00000000:    b510        ..      PUSH     {r4,lr}
        0x00000002:    ed2d8b02    -...    VPUSH    {d8}
        0x00000006:    eeb18ac0    ....    VSQRT.F32 s16,s0
        0x0000000a:    ee180a10    ....    VMOV     r0,s16
        0x0000000e:    f0204000     ..@    BIC      r0,r0,#0x80000000
        0x00000012:    f1c040ff    ...@    RSB      r0,r0,#0x7f800000
        0x00000016:    0fc0        ..      LSRS     r0,r0,#31
        0x00000018:    d00a        ..      BEQ      {pc}+0x18 ; 0x30
        0x0000001a:    ee100a10    ....    VMOV     r0,s0
        0x0000001e:    f0204000     ..@    BIC      r0,r0,#0x80000000
        0x00000022:    f1c040ff    ...@    RSB      r0,r0,#0x7f800000
        0x00000026:    0fc0        ..      LSRS     r0,r0,#31
        0x00000028:    bf04        ..      ITT      EQ
        0x0000002a:    2001        .       MOVEQ    r0,#1
        0x0000002c:    f7fffffe    ....    BLEQ     __set_errno
        0x00000030:    eeb00a48    ..H.    VMOV.F32 s0,s16
        0x00000034:    ecbd8b02    ....    VPOP     {d8}
        0x00000038:    bd10        ..      POP      {r4,pc}
 
    __hardfp_cosf
        0x00000000:    4a44        DJ      LDR      r2,[pc,#272] ; [0x114] = 0x7e921fb6
        0x00000002:    ee101a10    ....    VMOV     r1,s0
        0x00000006:    b508        ..      PUSH     {r3,lr}
        0x00000008:    ebb20f41    ..A.    CMP      r2,r1,LSL #1
        0x0000000c:    4668        hF      MOV      r0,sp
        0x0000000e:    d926        &.      BLS      {pc}+0x50 ; 0x5e
        0x00000010:    f04f40e6    O..@    MOV      r0,#0x73000000
        0x00000014:    ebb00f41    ..A.    CMP      r0,r1,LSL #1
        0x00000018:    bf94        ..      ITE      LS
        0x0000001a:    2000        .       MOVLS    r0,#0
        0x0000001c:    f04f30ff    O..0    MOVHI    r0,#0xffffffff
        0x00000020:    9000        ..      STR      r0,[sp,#0]
        0x00000022:    9800        ..      LDR      r0,[sp,#0]
        0x00000024:    eddf0a3c    ..<.    VLDR     s1,[pc,#240] ; [0x118] = 0x3f800000
        0x00000028:    2800        .(      CMP      r0,#0
        0x0000002a:    db5f        _.      BLT      {pc}+0xc2 ; 0xec
        0x0000002c:    f0100f01    ....    TST      r0,#1
        0x00000030:    d045        E.      BEQ      {pc}+0x8e ; 0xbe
        0x00000032:    ee600a00    `...    VMUL.F32 s1,s0,s0
        0x00000036:    f0100f02    ....    TST      r0,#2
        0x0000003a:    ed9f2a38    ..8*    VLDR     s4,[pc,#224] ; [0x11c] = 0x394c6d33
        0x0000003e:    eddf1a38    ..8.    VLDR     s3,[pc,#224] ; [0x120] = 0x3c0882da
        0x00000042:    ed9f1a38    ..8.    VLDR     s2,[pc,#224] ; [0x124] = 0xbe2aaaa0
        0x00000046:    ee401ac2    @...    VMLS.F32 s3,s1,s4
        0x0000004a:    ee001aa1    ....    VMLA.F32 s2,s1,s3
        0x0000004e:    ee211a20    !. .    VMUL.F32 s2,s2,s1
        0x00000052:    eef00a40    ..@.    VMOV.F32 s1,s0
        0x00000056:    ee400a01    @...    VMLA.F32 s1,s0,s2
        0x0000005a:    d02d        -.      BEQ      {pc}+0x5e ; 0xb8
        0x0000005c:    e040        @.      B        {pc}+0x84 ; 0xe0
        0x0000005e:    4a32        2J      LDR      r2,[pc,#200] ; [0x128] = 0x46490e49
        0x00000060:    f0214300    !..C    BIC      r3,r1,#0x80000000
        0x00000064:    429a        .B      CMP      r2,r3
        0x00000066:    d93e        >.      BLS      {pc}+0x80 ; 0xe6
        0x00000068:    f0114f00    ...O    TST      r1,#0x80000000
        0x0000006c:    eddf0a2f    ../.    VLDR     s1,[pc,#188] ; [0x12c] = 0x3f22f983
        0x00000070:    ed9f1a2f    ../.    VLDR     s2,[pc,#188] ; [0x130] = 0x4b000000
        0x00000074:    ee600a20    `. .    VMUL.F32 s1,s0,s1
        0x00000078:    bf08        ..      IT       EQ
        0x0000007a:    ee700a81    p...    VADDEQ.F32 s1,s1,s2
        0x0000007e:    ee700ac1    p...    VSUB.F32 s1,s1,s2
        0x00000082:    bf18        ..      IT       NE
        0x00000084:    ee700a81    p...    VADDNE.F32 s1,s1,s2
        0x00000088:    ed9f3a2a    ..*:    VLDR     s6,[pc,#168] ; [0x134] = 0x3fc90000
        0x0000008c:    eddf2a2a    ..**    VLDR     s5,[pc,#168] ; [0x138] = 0x39fda000
        0x00000090:    ed9f2a2a    ..**    VLDR     s4,[pc,#168] ; [0x13c] = 0x33a22000
        0x00000094:    eddf1a2a    ..*.    VLDR     s3,[pc,#168] ; [0x140] = 0x2c34611a
        0x00000098:    ee000ac3    ....    VMLS.F32 s0,s1,s6
        0x0000009c:    eebd1ae0    ....    VCVT.S32.F32 s2,s1
        0x000000a0:    ee000ae2    ....    VMLS.F32 s0,s1,s5
        0x000000a4:    ee110a10    ....    VMOV     r0,s2
        0x000000a8:    f0000003    ....    AND      r0,r0,#3
        0x000000ac:    9000        ..      STR      r0,[sp,#0]
        0x000000ae:    ee000ac2    ....    VMLS.F32 s0,s1,s4
        0x000000b2:    ee000ae1    ....    VMLS.F32 s0,s1,s3
        0x000000b6:    e7b4        ..      B        {pc}-0x94 ; 0x22
        0x000000b8:    eef10a60    ..`.    VNEG.F32 s1,s1
        0x000000bc:    e010        ..      B        {pc}+0x24 ; 0xe0
        0x000000be:    ee200a00     ...    VMUL.F32 s0,s0,s0
        0x000000c2:    ed9f2a20    .. *    VLDR     s4,[pc,#128] ; [0x144] = 0xbab23ab9
        0x000000c6:    f0100f02    ....    TST      r0,#2
        0x000000ca:    eddf1a1f    ....    VLDR     s3,[pc,#124] ; [0x148] = 0x3d2a9fca
        0x000000ce:    ed9f1a1f    ....    VLDR     s2,[pc,#124] ; [0x14c] = 0xbeffffdd
        0x000000d2:    ee401a02    @...    VMLA.F32 s3,s0,s4
        0x000000d6:    ee001a21    ..!.    VMLA.F32 s2,s0,s3
        0x000000da:    ee400a01    @...    VMLA.F32 s1,s0,s2
        0x000000de:    d1eb        ..      BNE      {pc}-0x26 ; 0xb8
        0x000000e0:    eeb00a60    ..`.    VMOV.F32 s0,s1
        0x000000e4:    bd08        ..      POP      {r3,pc}
        0x000000e6:    f7fffffe    ....    BL       __mathlib_rredf2
        0x000000ea:    e79a        ..      B        {pc}-0xc8 ; 0x22
        0x000000ec:    ee100a10    ....    VMOV     r0,s0
        0x000000f0:    0040        @.      LSLS     r0,r0,#1
        0x000000f2:    f1b04f7f    ...O    CMP      r0,#0xff000000
        0x000000f6:    d3f3        ..      BCC      {pc}-0x16 ; 0xe0
        0x000000f8:    d107        ..      BNE      {pc}+0x12 ; 0x10a
        0x000000fa:    f04f0001    O...    MOV      r0,#1
        0x000000fe:    f7fffffe    ....    BL       __set_errno
        0x00000102:    e8bd4008    ...@    POP      {r3,lr}
        0x00000106:    f7ffbffe    ....    B.W      __mathlib_flt_invalid
        0x0000010a:    e8bd4008    ...@    POP      {r3,lr}
        0x0000010e:    f7ffbffe    ....    B.W      __mathlib_flt_infnan
    $d
        0x00000112:    0000        ..      DCW    0
        0x00000114:    7e921fb6    ...~    DCD    2123505590
        0x00000118:    3f800000    ...?    DCD    1065353216
        0x0000011c:    394c6d33    3mL9    DCD    961310003
        0x00000120:    3c0882da    ...<    DCD    1007190746
        0x00000124:    be2aaaa0    ..*.    DCD    3190467232
        0x00000128:    46490e49    I.IF    DCD    1179192905
        0x0000012c:    3f22f983    .."?    DCD    1059256707
        0x00000130:    4b000000    ...K    DCD    1258291200
        0x00000134:    3fc90000    ...?    DCD    1070137344
        0x00000138:    39fda000    ...9    DCD    972922880
        0x0000013c:    33a22000    . .3    DCD    866263040
        0x00000140:    2c34611a    .a4,    DCD    741630234
        0x00000144:    bab23ab9    .:..    DCD    3132242617
        0x00000148:    3d2a9fca    ..*=    DCD    1026203594
        0x0000014c:    beffffdd    ....    DCD    3204448221


ENort.1 · ‎2023-05-07

I did this: [screenshot not preserved]

And this: [screenshot not preserved]

Are those correct? All I want to do is to enable the FPU and then I want to call the functions I listed to use the FPU. I don't want to code my own math libraries or anything like that. I just want to be able to process the listed functions with the FPU instead of calling the software math functions which are slower than the FPU.

Tesla DeLorean · ‎2023-05-07

Looks reasonable. I use Keil and GNU/GCC via make

Typically code in SystemInit() enables the co-processor.
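For reference, enabling the co-processor on a Cortex-M core comes down to setting the CP10 and CP11 access fields in the CPACR register. The sketch below only computes the register value, so it can be compiled and checked on a PC; the helper name `cpacr_with_fpu_enabled` is hypothetical, and the on-target write shown in the comment uses CMSIS names and is an assumption about how a project would apply it.

```c
#include <stdint.h>

/* CP10 and CP11 access fields in the Cortex-M CPACR register
 * (bits 21:20 and 23:22). Writing 0b11 to both grants full access. */
#define CPACR_CP10_FULL (3UL << 20)
#define CPACR_CP11_FULL (3UL << 22)

/* Hypothetical pure helper: returns the given CPACR value with full
 * access to the FPU co-processors enabled, leaving other bits alone. */
static uint32_t cpacr_with_fpu_enabled(uint32_t cpacr)
{
    return cpacr | (uint32_t)(CPACR_CP10_FULL | CPACR_CP11_FULL);
}

/* On the target this would be applied roughly as follows (CMSIS names,
 * shown as a comment because it cannot run on a PC):
 *
 *   SCB->CPACR = cpacr_with_fpu_enabled(SCB->CPACR);
 *   __DSB();
 *   __ISB();
 */
```

Startup code generated for these parts usually wraps the equivalent write in a check of `__FPU_PRESENT` and `__FPU_USED`, so it is worth confirming in your own `SystemInit()` that the guarded line is actually compiled in.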

Look at the listing and map files to confirm to yourself what's built and which libraries it pulls in. The FPU functions are wrapped in library code to make them compliant with C math.h functional expectations.


AScha.3 · ‎2023-05-07

>Are those correct?

OK, so you set -mfloat-abi=hard, which means the compiler will use the FPU (no need to check).

If you feel a post has answered your question, please click "Accept as Solution".

ENort.1 · ‎2023-05-07

How and where do I put -mfloat-abi? I am using CubeMX.

ENort.1 · ‎2023-05-07

I know where the function is for the FPU; I just want to make sure that what I did will enable it, as I showed in the first picture. I added FPU_USED and FPU_PRESENT. Is this the correct way to do it in CubeMX?

AScha.3 · ‎2023-05-09

Just speaking for myself: I have never added anything in Cube for the FPU!

It's just a compiler setting, so set it -> [screenshot not preserved]

and it will use the FPU. That's all.
