2014-03-26 01:00 AM
I am new to the STM32 product family and am currently evaluating an STM32F0Discovery, which uses a Cortex-M0 CPU.
I would like to implement and benchmark a fixed-point FFT on sample data from the A/D converter. Is there a library function provided by ST for fixed-point FFT that I can use for this purpose? #fft-f0 #fft-adc-f0-f1-f2-f3-f4-stm32-l0
2014-03-26 02:38 AM
You can download a DSP/StandardPeripheral library for any of the F3 or F4 parts, like
. The DSP library is provided by ARM, and also contains implementations for M0 and M3 cores, and for q15 and q31 fixed point formats. Some additional work might be required to add the code to your project. However, I hope you noticed that the M0 (and the M3/M4, for that matter) do not natively support fixed point formats, and the M0 has a limited instruction set tailored toward code density, and not performance. Don't expect too much.
2014-03-26 12:17 PM
This is very helpful. Thank you. Should I instead consider using an M4F based processor? My goal is to perform an FFT on 200-500 A/D samples in 500us or faster. Maybe I'm too optimistic.
2014-03-26 01:45 PM
Should I instead consider using an M4F based processor? My goal is to perform an FFT on 200-500 A/D samples in 500us or faster. Maybe I'm too optimistic.
An M4 might be overkill. Since it has an FPU, I do not see any advantage in using fixed point on the M4. I have only evaluated M4 cores, using the FPU. A 2048-point FFT (using float) took about 2.5 ms with maximal optimization. That's quite enough for real-time audio (44.1 kHz). The DSP library uses 2^n data points, BTW, so you possibly want 256 or 512. And an M3 core would be a great improvement over the M0, since the latter lacks an integer division instruction, amongst others. You only need to avoid float on FPU-less cores. With integer (q15/q31) you may achieve your target, perhaps with a tradeoff in array size.
2014-03-27 01:36 PM
Dear Gentleman,
In your application, our fast A/Ds are 12-bit, so a Q15 fixed-point FFT is enough to process your data. You do not need Q31 or a floating-point unit. The FFT you use could be a 512-point FFT in real format, not complex.
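As an illustration of why Q15 has enough headroom for 12-bit samples, here is a minimal sketch of the conversion step. It assumes right-aligned unsigned ADC codes (0..4095) and that mid-scale (2048) corresponds to zero signal; the function name adc12_to_q15 is hypothetical and not part of any ST or ARM library.

```c
#include <stdint.h>
#include <stddef.h>

/* Convert right-aligned, unsigned 12-bit ADC codes (0..4095) to signed Q15.
 * Assumption: mid-scale (2048) represents zero signal. */
static void adc12_to_q15(const uint16_t *adc, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Remove the DC offset: the range becomes -2048..2047. */
        int32_t centered = (int32_t)(adc[i] & 0x0FFFu) - 2048;
        /* Scale by 16 so the 12-bit value fills the 16-bit Q15 range. */
        out[i] = (int16_t)(centered * 16);
    }
}
```

The result spans -32768..32752, so the full 12 bits of resolution are kept in the upper bits of the Q15 word.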
But here I can give you some relative data for 1024-point and 256-point FFTs, although “complex”, not real, using the CMSIS DSP library we provide online. This gives an average overview for a core-only comparison between M0 (STM32F0), M0+ (STM32L0), M3 (STM32F1, F2, L1) and M4 (STM32F4, F3). The units are CPU cycles, using 0 wait-state Flash execution, so you can compute the time at each selected frequency.
1024-FFT (Complex in Q15 Format)
Cortex-M0 : 855 733 cycles
Cortex-M0+ : 664 531 cycles
Cortex-M3 : 204 244 cycles
Cortex-M4 : 89 839 cycles
256-FFT (Complex in Q15 Format)
Cortex-M0 : 175 375 cycles
Cortex-M0+ : 136 296 cycles
Cortex-M3 : 41 430 cycles
Cortex-M4 : 18 480 cycles
Now it is up to you to select the right frequency and a suitable STM32 MCU; it is always a compromise. Be aware that our STM32F1, F0 and L1 ADCs are all able to convert a sample every 1 µs, and some others even every 0.2 µs on our F3 (Cortex-M4) devices, which I recommend you have a look at if you need faster ADCs, up to 5 MSPS.
Cheers,
{STOne-32}
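To turn the cycle counts quoted in this post into execution times, divide by the core clock. The helpers below are a small sketch of that arithmetic; the function names are made up for illustration, and the zero wait-state assumption is carried over from the post, so real timings with Flash wait states may be longer.

```c
#include <stdint.h>

/* Execution time in microseconds for a given cycle count and core clock.
 * Assumes zero wait states, as in the figures quoted in the post. */
static double cycles_to_us(uint32_t cycles, uint32_t core_hz)
{
    return (double)cycles * 1e6 / (double)core_hz;
}

/* Minimum core clock (Hz) needed to finish `cycles` within `budget_us`. */
static double min_clock_hz(uint32_t cycles, double budget_us)
{
    return (double)cycles * 1e6 / budget_us;
}
```

For example, the 256-point figure for the Cortex-M0 (175 375 cycles) at 48 MHz works out to roughly 3.65 ms, while the Cortex-M4 figure (18 480 cycles) at an assumed 168 MHz clock works out to 110 µs. For the 500 µs target mentioned earlier in the thread, the Cortex-M3 figure (41 430 cycles) would need a core clock of about 83 MHz.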
2014-03-29 01:01 PM
In addition to being new to the ST family of Cortex MCUs, I'm also new to the ARM architecture in general, including the software toolchain. Currently I am using IAR which includes the CMSIS and DSP libraries provided by ARM. To enable them, under project options I checked Project Options > General Options > Library Configuration > Use CMSIS (and check use DSP).
With the STM32F051 running at 48 MHz, I was able to benchmark a 128-sample Q15 real FFT (using the arm_rfft_q15() library function) at 1.367 ms, which misses my target by 0.867 ms. Also, it looks like the CMSIS DSP library functions only support sizes of 128, 512, and 2048 for the fixed-point Q15 real FFT. I was unable to test the arm_rfft_q31() or arm_rfft_f32() functions as they wouldn't fit on the chip.
Some questions about the CMSIS FFT library function:
1. I am sampling audio, so using arm_rfft_q15() with a 128-element input array composed of Q15 values outputs a 256-element array with real and imaginary values interleaved, i.e. real0, imag0, real1, imag1... Ultimately I want magnitude over frequency, so how can I efficiently extract the frequency from the imaginary elements of the output array? Is there a library function for that?
2. In the CMSIS documentation for arm_rfft_q15, it says the output format is 7.9 and 6 bits should be upscaled. What is meant by this? Do I have to process the output array to make the contents meaningful?
3. One of the arguments for arm_rfft_q15 is bitReverseFlag, described as ''flag that enables or disables bit reversal of output''. Is this used based on the endianness of the target processor?
As for the A/D, I only need to capture values at 128 kHz, so the fact that it can run at 1 MHz is great. Converting captured A/D conversions to Q31/Q15 only took a few microseconds on the Cortex-M0. I have an M3 discovery board on order, so I'm looking forward to running these same library functions on that more capable processor.
Thanks for the help!
2014-08-14 02:03 AM
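On question 1 of the 2014-03-29 post: the frequency is not stored in the imaginary elements. Each (real, imag) pair is one frequency bin, the bin index k gives the frequency as k * fs / N, and the magnitude is sqrt(real^2 + imag^2). For real input only the first N/2 bins are unique. The sketch below shows this in plain C so it can run without the library; the helper names are hypothetical. CMSIS-DSP also ships arm_cmplx_mag_q15() for the magnitude step, and its documentation should be checked for the output scaling.

```c
#include <stdint.h>
#include <stddef.h>

/* Integer square root (floor) of a 32-bit value, bit-by-bit method. */
static uint16_t isqrt32(uint32_t x)
{
    uint32_t res = 0;
    uint32_t bit = 1u << 30;
    while (bit > x) bit >>= 2;
    while (bit != 0) {
        if (x >= res + bit) {
            x -= res + bit;
            res = (res >> 1) + bit;
        } else {
            res >>= 1;
        }
        bit >>= 2;
    }
    return (uint16_t)res;
}

/* Magnitude of each complex bin in an interleaved buffer
 * (re0, im0, re1, im1, ...). Output keeps the input's scaling. */
static void cmplx_mag_i16(const int16_t *src, uint16_t *mag, size_t bins)
{
    for (size_t k = 0; k < bins; k++) {
        int32_t re = src[2 * k];
        int32_t im = src[2 * k + 1];
        mag[k] = isqrt32((uint32_t)(re * re) + (uint32_t)(im * im));
    }
}

/* Center frequency of bin k for an n_fft-point FFT sampled at fs_hz. */
static float bin_to_hz(size_t k, size_t n_fft, float fs_hz)
{
    return (float)k * fs_hz / (float)n_fft;
}
```

With the 128 kHz sample rate and 128-point size mentioned in the post, the bin spacing is 1 kHz, so bin 5 corresponds to 5 kHz.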
Hi
I am new to the STM32F030 and I am trying to use the rfft on it. I tried rfft_q15 and tested it with an array containing a 100 Hz sine wave of amplitude 4000, but the result from rfft_q15 is 200 Hz with amplitude 2000. I don't know if I made a mistake in the initialization of the FFT. Would you please take a look at my attached code? Thanks a lot.
Attachments: main.c : https://st--c.eu10.content.force.com/sfc/dist/version/download/?oid=00Db0000000YtG6&ids=0680X000006I14q&d=%2Fa%2F0X0000000bk8%2FTRUix4fYvJfMTobLHxaqRkHHB6_5wdsoK89AXpLhoDQ&asPdf=false
2015-11-24 11:13 AM
2015-11-24 11:14 AM
Can you please provide the sample code for the benchmark of the Cortex-M0 1024-FFT?
2024-09-05 11:28 PM
Hello @Nickname12657_O ,
Can you confirm the source of these values?