2009-07-30 05:52 AM
hand optimized FFT/IFFT for Cortex-M3 attached
2011-05-17 03:27 AM
I was using Crossworks for ARM. I did not have time to try compiling it with RIDE7, but it should not be a big problem, as this is a stand-alone assembler file. What errors were generated by RIDE7?
Ivan
2011-05-17 03:27 AM
If you do not use the preprocessor, replace the #define directives with .req based statements. Example:
#define y R0 // short *y -output complex array
#define x R1 // short *x -input complex array
#define N R2
....
replace with:
y .req R0 // short *y -output complex array
x .req R1 // short *x -input complex array
N .req R2
....
Keep in mind that you also have to manually comment out the unused LATENCY2 section in the code.
Ivan
2011-05-17 03:27 AM
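The #define to .req substitution described in the post above can be done by hand, but a small helper could automate it. The following is only a sketch in C: the function name define_to_req and the rule that a register name is "R" followed by a digit are assumptions made for illustration and are not part of the posted FFT source.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Assumption: a register alias target looks like R0..R15 (or r0..r15). */
static int looks_like_register(const char *s)
{
    return (s[0] == 'R' || s[0] == 'r') && isdigit((unsigned char)s[1]);
}

/* Hypothetical helper, not part of the posted FFT source.
 * Rewrites "#define NAME Rn <rest>" as "NAME .req Rn <rest>".
 * Any other line is copied to out unchanged.
 * Returns 1 if the line was rewritten, 0 otherwise. */
int define_to_req(const char *line, char *out, size_t out_size)
{
    char name[32], reg[32];
    int consumed = 0;

    /* %n stores how many characters were read up to the end of the register token. */
    if (sscanf(line, " #define %31s %31s%n", name, reg, &consumed) == 2
        && looks_like_register(reg)) {
        snprintf(out, out_size, "%s .req %s%s", name, reg, line + consumed);
        return 1;
    }
    snprintf(out, out_size, "%s", line);
    return 0;
}
```

Called on "#define y R0 // short *y -output complex array", the helper writes "y .req R0 // short *y -output complex array" to the output buffer. Defines without a register operand, such as an option switch, are left untouched and still need to be handled manually.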
Hi imellen,
RIDE7 (from Raisonance, using the GNU compiler) doesn't support the C preprocessor for assembly code, so none of the "#define"s are supported! I think I will just replace every defined variable directly with its corresponding register? Thanks for your reply and for the good work!
Regards,
Sword
2011-05-17 03:27 AM
Hi STOne-32,
these are the benchmarks for the odd-power-of-2 complex FFT (8, 32, 128, 512, 2048 points).

STM32 FFT benchmarks in CPU cycles, based on real hardware measurements, code executed from flash:

N - FFT size
L - Flash latency
F, R - coefficients in Flash or RAM
* - LATENCY2 option defined

N      L=0 F/R   L=1 F    L=1 R    L=2 F*   L=2 R*
8      289       309      309      335      335
32     1659      1752     1737     2007     1935
128    9027      9536     9409     11227    10650
512    46298     49206    48439    58390    54932
2048   TBD - not enough RAM on test hardware

Code size:

N                  8     32    128   512   2048
Unique part code   170   170   176   186   186   bytes
Shared part code   480 bytes for all

Coefficient size (Flash or RAM):

N      32    128   512    2048
size   48    240   1008   4080   bytes

Example: only fft32 and fft128 used. Code size = 170+176+480 bytes; coeff size = 240 bytes.

I'm rather busy these days, but I'll try to find some time to clean up and comment the code before posting it.
Ivan

- Quote:
>Dear Ivan,
>Is it possible to provide just the FFT benchmark cycles you get with
>Radix-2 (size 8,32,128,512,2048) as you did with Radix-4?
>All forum users thank you a lot for your FFT development for STM32.
>Cheers,
>STOne-32.
2011-05-17 03:27 AM
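The footprint example in the benchmark post above (fft32 plus fft128) can be reproduced with a few lines of C. This is only a sketch built on assumptions: the function names are made up, the coefficient size for N = 8 is set to 0 because the post lists none, and taking the largest coefficient table (rather than the sum) is inferred from the post's own example, where fft32 and fft128 together need 240 bytes. The cycle-to-time conversion takes the clock frequency as a parameter because the post does not state one.

```c
enum { NUM_SIZES = 5, SHARED_BYTES = 480 };

/* Figures copied from the benchmark post; index order is N = 8, 32, 128, 512, 2048. */
static const int unique_bytes[NUM_SIZES] = { 170, 170, 176, 186, 186 };
/* Assumption: no coefficient size is listed for N = 8, so 0 is used here. */
static const int coeff_bytes[NUM_SIZES] = { 0, 48, 240, 1008, 4080 };

/* used[i] != 0 means the FFT of the i-th size is linked in. */
int code_bytes(const int used[NUM_SIZES])
{
    int total = 0, any = 0;
    for (int i = 0; i < NUM_SIZES; i++) {
        if (used[i]) {
            total += unique_bytes[i];
            any = 1;
        }
    }
    /* The shared part is counted once if at least one size is used. */
    return any ? total + SHARED_BYTES : 0;
}

/* Assumption inferred from the post's example: only the largest table is needed. */
int coeff_table_bytes(const int used[NUM_SIZES])
{
    int largest = 0;
    for (int i = 0; i < NUM_SIZES; i++) {
        if (used[i] && coeff_bytes[i] > largest)
            largest = coeff_bytes[i];
    }
    return largest;
}

/* Converts a cycle count to microseconds for a given core clock in Hz. */
double cycles_to_us(unsigned long cycles, double cpu_hz)
{
    return (double)cycles * 1e6 / cpu_hz;
}
```

With only fft32 and fft128 selected, this gives 826 bytes of code and 240 bytes of coefficients, matching the example in the post. At an assumed 72 MHz core clock, the 512-point figure of 54932 cycles (L=2, coefficients in RAM) works out to roughly 763 microseconds.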
Dear Ivan,
I really appreciate your fast answer. I have tried to open your web page at embedded Signals dot com. Can I contact you at your professional e-mail? I would like to ask whether you are planning, or have already implemented, a version for 32-bit data inputs as well. If so, I would be very grateful if you could give me a rough estimate for the Radix-4 and Radix-2 implementations at 512 and 1024 points with coefficients in RAM. Thank you a lot in advance,
Cheers,
STOne-32.
2011-05-17 03:27 AM
Hi STOne-32,
sure, if you want to contact me, my email is in the posted FFTCM3.s file. Regarding a 32-bit FFT version, I do not have one. In principle it is not that difficult to convert the existing 16-bit version: the variables in registers are 32 bits wide already, so only the multiplies, the loads/stores and the coefficients have to be upgraded to 32 bits. So far I have needed only 16-bit precision, so I was not motivated to write a 32-bit version.
Ivan
2011-05-17 03:27 AM
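To illustrate why the multiply is one of the pieces that changes in a 32-bit version, as the post above says, here is a sketch in C of a fixed-point multiply at both widths. It is not taken from FFTCM3.s: the function names and the Q15/Q31 scaling are assumptions chosen for illustration. The point is that a 16x16-bit product fits in a 32-bit register, while a 32x32-bit product needs a 64-bit intermediate result.

```c
#include <stdint.h>

/* Hypothetical Q15 multiply: 16-bit sample times 16-bit coefficient.
 * The full product fits in 32 bits, so one 32-bit register is enough.
 * Note: >> on a negative value is implementation-defined in C; common
 * compilers for ARM and x86 perform an arithmetic shift. */
int32_t mul_q15(int16_t a, int16_t w)
{
    int32_t product = (int32_t)a * (int32_t)w;
    return product >> 15;
}

/* Hypothetical Q31 multiply: 32-bit sample times 32-bit coefficient.
 * The full product needs 64 bits before it is scaled back down.
 * The caller is assumed to avoid multiplying -1.0 by -1.0 (INT32_MIN by
 * INT32_MIN), whose result does not fit in the 32-bit return value. */
int32_t mul_q31(int32_t a, int32_t w)
{
    int64_t product = (int64_t)a * (int64_t)w;
    return (int32_t)(product >> 31);
}
```

For example, 0.5 times 0.5 gives 0.25 in both formats: mul_q15(16384, 16384) returns 8192, and mul_q31 applied to 2^30 twice returns 2^29.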
Source code for complex 16 bit Radix 2 FFT (odd powers of 2) for Cortex-M3; N= 8, 32, 128, 512 and 2048 points
Hi all, I'm getting emails regarding the availability of the Radix 2 FFT, so I did some minor code cleanup and got it ready for posting. Please see the attachment for the source code. Enjoy, and if you notice any problems, let me know.
Ivan
2011-05-17 03:27 AM
Has anybody converted the source code FFTr2CM3.s to Keil AA syntax?
Thanks, Jean-Claude
2011-05-17 03:27 AM
Actually, I'm switching my toolset from Crossworks to Keil MDK, so I will convert the FFT libraries to ARM assembler.
For now, I've converted only the 1024-point real FFT.
Ivan
2011-05-17 03:27 AM
I use Crossworks as well. Just curious: is there a reason you are switching to Keil MDK?