Why is the Nucleo STM32F429ZI taking so long to execute a user-written 4096-point FFT?

cAnth · ‎2018-12-05

We are using a Nucleo STM32F429ZI (Cortex-M4 MCU) with the following configuration.

Compiler : Keil, TrueSTUDIO

Freq : 168MHz

FPU : Single Hardware

Optimization : O3 in Keil, Os in TrueSTUDIO

We tested a user-written 4096-point FFT routine. A single execution of the function takes 110 ms. (The FFT function is plain C code; no peripherals are used or even initialized. Only a GPIO is initialized, to measure the execution time.)

We also ran the same software routine on a Renesas controller (Cortex-M4 MCU), compiled with e2 studio. There the same code takes around 320 us. The e2 studio project was compiled with the following configuration.

Compiler : e2 studio

Freq : 240MHz

FPU : Single Hardware

What should I do to make the STM32F429 run this code as fast as the Renesas controller (approx. 320 us + 10%)?

I have attached my code for your reference.

Thanks in advance.

waclawek.jan · ‎2018-12-06

Make sure you use the FPU in the 'F4 (there are switches for it in the IDEs; I don't use them, so you will have to look them up yourself).

Some of your code compiles as double. Floating point constants need to be suffixed by 'f' if you want to keep them as float, otherwise the whole expression involving those constants will be compiled as double, thus not executed by the FPU but in software, which is orders of magnitude slower.
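A minimal sketch of the promotion rule described above, in plain C. The function names are hypothetical and chosen only for illustration; this is not the attached FFT code. `sizeof` is used to show which type the compiler selects for each expression, without evaluating it.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helpers: each returns the size of the type the compiler
 * chooses for the expression. sizeof does not evaluate its operand. */

/* 0.5 has type double, so x is promoted and the product is a double. */
size_t product_size_unsuffixed(float x) { return sizeof(x * 0.5); }

/* 0.5f has type float, so the product stays in single precision. */
size_t product_size_suffixed(float x) { return sizeof(x * 0.5f); }

/* sin() takes and returns double; sinf() takes and returns float. */
size_t sin_result_size(float x)  { return sizeof(sin(x)); }
size_t sinf_result_size(float x) { return sizeof(sinf(x)); }

/* A twiddle-factor angle kept entirely in single precision
 * (illustrative only, not the original poster's code). */
float twiddle_angle(unsigned k, unsigned n)
{
    return -2.0f * 3.14159265f * (float)k / (float)n;
}
```

On a typical toolchain the unsuffixed variants report the size of a double and the suffixed ones the size of a float. With a GCC-based toolchain, the `-Wdouble-promotion` warning can help locate implicit float-to-double promotions of this kind.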

JW

AvaTar · ‎2018-12-06

Additionally, check which floating-point ABI is used. With FFT code, -mfloat-abi=hard would probably make sense.

cAnth · ‎2018-12-06

Thanks JW

The FPU is already being used.

As you suggested, the floating-point constants are now suffixed with 'f', and sin and cos are suffixed with 'f' as well.

Now the execution time is reduced to 15 ms. Is it possible to reduce it further?
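To compare the reported timings independently of the different core clocks, they can be converted to approximate cycle counts. This is only a sketch: the helper name is hypothetical, and it assumes the times and clock settings quoted in this thread are accurate.

```c
#include <stdint.h>

/* Converts an execution time in microseconds at a given core clock in MHz
 * to an approximate CPU cycle count (1 MHz = 1 cycle per microsecond). */
uint64_t cycles_from_us(uint64_t time_us, uint64_t clock_mhz)
{
    return time_us * clock_mhz;
}
```

With the figures above, `cycles_from_us(110000, 168)` gives 18,480,000 cycles for the original STM32 run, `cycles_from_us(15000, 168)` gives 2,520,000 cycles after the 'f' suffix fix, and `cycles_from_us(320, 240)` gives 76,800 cycles for the Renesas run. The clock ratio alone (240 MHz vs. 168 MHz, about 1.43x) accounts for only a small part of the remaining gap.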

cAnth · ‎2018-12-06

Thanks AvaTar,

As I understand it, -mfloat-abi=hard means enabling the hardware FPU, which is already enabled.

AvaTar · ‎2018-12-06

No.

It means floating-point arguments are passed in FPU registers, rather than in integer registers or on the stack.

This option must match your libs, i.e. "soft-float" and "hard" functions do not work well together, even if both use the FPU.
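One way to check which calling convention a build actually uses is to test the compiler's predefined macros. The sketch below assumes a GCC-style 32-bit ARM toolchain that defines `__arm__`, `__ARM_PCS_VFP` and `__SOFTFP__`; other compilers may not define them, and the function name is hypothetical.

```c
/* Reports the floating-point calling convention of this build, based on
 * predefined macros from GCC-style ARM toolchains (an assumption; other
 * compilers may use different macros). */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not a 32-bit ARM build";
#elif defined(__ARM_PCS_VFP)
    return "hard";    /* float arguments are passed in FPU registers */
#elif defined(__SOFTFP__)
    return "soft";    /* no FPU instructions, software floating point */
#else
    return "softfp";  /* FPU instructions, soft-float calling convention */
#endif
}
```

Printing or inspecting the returned string in a debugger for both the application and any prebuilt libraries is a quick sanity check that they were built with matching settings.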

cAnth · ‎2018-12-07

When we use -mfloat-abi=hard in Keil 5, it reports an unrecognized option. I have attached images for your reference.

When we select the FPU in TrueSTUDIO, -mfloat-abi=hard is added automatically.

AvaTar · ‎2018-12-07

The option "-mfloat-abi" is GCC style, but the Keil compiler is not based on GCC. It seems this option has a different name there; check the documentation.

The options "soft", "softfp" and "hard" are defined in ARM's ABI, and (since Keil is ARM) "hard" is surely supported in Keil uVision.