hand optimized FFT/IFFT for Cortex-M3 attached

imellen
Associate II
Posted on July 30, 2009 at 14:52

hand optimized FFT/IFFT for Cortex-M3 attached

19 REPLIES
imellen
Associate II
Posted on May 17, 2011 at 12:27

I was using Crossworks for ARM. I did not have time to try compiling it with RIDE7, but it should not be a big problem, as this is a stand-alone assembler file. What errors did RIDE7 generate?

Ivan

ivan239955_stm1_st
Associate II
Posted on May 17, 2011 at 12:27

If you do not use the preprocessor, replace the #define directives with .req-based statements. Example:

#define y R0 // short *y -output complex array

#define x R1 // short *x -input complex array

#define N R2

....

replace with:

y .req R0 // short *y -output complex array

x .req R1 // short *x -input complex array

N .req R2

....

Keep in mind that you also have to manually comment out the unused LATENCY2 section in the code.

Ivan
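
The #define-to-.req substitution described above is mechanical, so it can be scripted. Below is a minimal sketch in Python, assuming every register alias sits on its own line in the form "#define name Rn" with an optional trailing comment. The names define_to_req and convert are made up for this illustration and are not part of the posted FFT source. Lines that do not match, including any conditional sections such as LATENCY2, pass through unchanged and still have to be handled by hand.

```python
import re

# Matches register-alias lines such as:
#   #define y R0 // short *y -output complex array
# Only aliases of the form R<number> are rewritten (assumption for this sketch).
DEFINE_RE = re.compile(r"^\s*#define\s+(\w+)\s+([Rr]\d{1,2})\b(.*)$")


def define_to_req(line):
    """Return the .req form of a register-alias #define, else the line unchanged."""
    match = DEFINE_RE.match(line)
    if not match:
        return line
    name, register, rest = match.groups()
    # 'rest' holds any trailing comment, including its leading whitespace.
    return f"{name} .req {register}{rest}"


def convert(source):
    """Apply define_to_req to every line of an assembler source string."""
    return "\n".join(define_to_req(line) for line in source.splitlines())


if __name__ == "__main__":
    sample = "#define y R0 // short *y -output complex array\n#define N R2"
    print(convert(sample))
```

Review the converted file before assembling it; the script only rewrites the alias lines and does not check how the names are used later in the code.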

sword_82
Associate II
Posted on May 17, 2011 at 12:27

Hi imellen,

RIDE7 (from Raisonance, using the GNU compiler) doesn't support the C preprocessor for assembly code, so none of the "#define"s are supported! I think I will replace every defined name directly with its corresponding register?!

Thanks for your reply and for the good work!

Regards

Sword

imellen
Associate II
Posted on May 17, 2011 at 12:27

Hi STOne-32,

these are the benchmarks for the odd-power-of-2 complex FFT (8, 32, 128, 512 and 2048 points):

STM32 FFT benchmarks in CPU cycles based on real hardware measurements, code executed from flash:

N - FFT size

L - Flash latency

F,R - coefficients in Flash or RAM

* LATENCY2 option defined

   N    L=0 F/R    L=1 F    L=1 R    L=2 F*   L=2 R*
   8        289      309      309      335      335
  32       1659     1752     1737     2007     1935
 128       9027     9536     9409    11227    10650
 512      46298    49206    48439    58390    54932
2048    TBD (not enough RAM on test hardware)
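
To turn the cycle counts above into execution time, divide by the core clock. Below is a minimal sketch in Python; the cycle counts are copied from the benchmark table, while the 72 MHz clock in the usage lines, the column keys and the function name fft_time_us are assumptions for illustration and are not figures from the measurements.

```python
# Cycle counts copied from the benchmark table in this post.
# Column keys (hypothetical names): L<flash latency>_<F or R for coefficient location>.
CYCLES = {
    8:   {"L0": 289,   "L1_F": 309,   "L1_R": 309,   "L2_F": 335,   "L2_R": 335},
    32:  {"L0": 1659,  "L1_F": 1752,  "L1_R": 1737,  "L2_F": 2007,  "L2_R": 1935},
    128: {"L0": 9027,  "L1_F": 9536,  "L1_R": 9409,  "L2_F": 11227, "L2_R": 10650},
    512: {"L0": 46298, "L1_F": 49206, "L1_R": 48439, "L2_F": 58390, "L2_R": 54932},
}


def fft_time_us(n, column, clock_hz):
    """Convert a measured cycle count into microseconds at the given core clock."""
    return CYCLES[n][column] * 1e6 / clock_hz


if __name__ == "__main__":
    # Assumed clock: 72 MHz (not stated in the post).
    for n in sorted(CYCLES):
        print(n, round(fft_time_us(n, "L2_R", 72_000_000), 1))
```

For example, the 512-point FFT at L=2 with coefficients in RAM takes 54932 cycles, which is about 763 microseconds at the assumed 72 MHz.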

Code size:

N:                   8    32   128   512   2048
Unique part code:  170   170   176   186    186 bytes
Shared part code:  480 bytes for all sizes

Coefficient size (Flash or RAM):

N:      32   128    512   2048
Size:   48   240   1008   4080 bytes

Example: if only fft32 and fft128 are used, code size = 170 + 176 + 480 = 826 bytes; coefficient size = 240 bytes.
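
The same arithmetic can be applied to any combination of FFT sizes. Below is a minimal sketch in Python using the figures from the tables above. It assumes that only the coefficient table of the largest FFT in use has to be stored, which is inferred from the example (240 bytes for fft32 plus fft128) and is not stated explicitly. The function name footprint is hypothetical, and N = 8 is treated as needing no table because none is listed for it.

```python
# Figures copied from the code-size and coefficient-size tables (bytes).
UNIQUE_CODE = {8: 170, 32: 170, 128: 176, 512: 186, 2048: 186}
SHARED_CODE = 480
COEFF_SIZE = {32: 48, 128: 240, 512: 1008, 2048: 4080}


def footprint(sizes):
    """Return (code_bytes, coefficient_bytes) for the given set of FFT sizes."""
    sizes = set(sizes)
    # Shared part is counted once; each FFT size adds its own unique part.
    code = SHARED_CODE + sum(UNIQUE_CODE[n] for n in sizes)
    # Assumption: only the table for the largest FFT in use is stored.
    coeff = max((COEFF_SIZE.get(n, 0) for n in sizes), default=0)
    return code, coeff


if __name__ == "__main__":
    print(footprint([32, 128]))
```

With fft32 and fft128 this reproduces the example: 826 bytes of code and 240 bytes of coefficients.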

I'm rather busy these days, but I'll try to find some time to clean and comment code before posting it.

Ivan

-

Quote:

>Dear Ivan,

>Is it possible to provide just the FFT benchmark cycles you get with Radix-2 (sizes 8, 32, 128, 512, 2048), as you did with Radix-4?

>All forum users thank you a lot for your FFT development for STM32.

>Cheers,

>STOne-32.

16-32micros
Associate III
Posted on May 17, 2011 at 12:27

Dear Ivan,

I really appreciate your fast answer. I have tried to open your web page at embedded Signals dot com. Can I contact you at your professional e-mail address?

I would like to ask whether you are planning, or have already implemented, a version for 32-bit data inputs as well. If so, I would be very grateful for a rough estimate for the Radix-4 and Radix-2 implementations at 512 and 1024 points with the coefficients in RAM.

Thank you a lot in advance,

Cheers,

STOne-32.

imellen
Associate II
Posted on May 17, 2011 at 12:27

Hi STOne-32,

Sure, if you want to contact me, my email is in the posted FFTCM3.s file.

Regarding a 32-bit FFT version, I do not have one. In principle it is not that difficult to convert the existing 16-bit version, as the variables in registers are 32 bits already; only the multiply, load/store and coefficients have to be upgraded to 32 bits. So far I have needed only 16-bit precision, so I was not motivated to write a 32-bit version.

Ivan
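
The remark about the multiply can be made concrete with fixed-point arithmetic. Below is a minimal sketch in Python, assuming the 16-bit data is treated as Q15 fractions and that a 32-bit version would use Q31. That format choice and the helper names are assumptions for illustration; the posted assembler may scale its data differently. The sketch shows that a 16 x 16-bit product always fits in a 32-bit register, while a 32 x 32-bit product needs a 64-bit intermediate (on Cortex-M3, a long multiply such as SMULL produces one).

```python
def q15_mul(a, b):
    """Multiply two Q15 values (signed 16-bit), truncating back to Q15."""
    return (a * b) >> 15


def q31_mul(a, b):
    """Multiply two Q31 values (signed 32-bit), truncating back to Q31."""
    return (a * b) >> 31


def signed_bits(value):
    """Number of two's-complement bits needed to hold value."""
    if value >= 0:
        return value.bit_length() + 1
    return (~value).bit_length() + 1


if __name__ == "__main__":
    q15_min = -(1 << 15)   # most negative Q15 value
    q31_min = -(1 << 31)   # most negative Q31 value
    # Worst-case intermediate products before the shift back to Q15 / Q31.
    print(signed_bits(q15_min * q15_min))
    print(signed_bits(q31_min * q31_min))
```

The wider intermediate is one reason the multiply step has to change, alongside the load/store widths and the coefficient tables mentioned in the post.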

imellen
Associate II
Posted on May 17, 2011 at 12:27

Source code for complex 16 bit Radix 2 FFT (odd powers of 2) for Cortex-M3; N= 8, 32, 128, 512 and 2048 points

Hi all,

I'm getting emails asking about the availability of the Radix 2 FFT, so I did some minor code cleanup and got it ready for posting. Please see the attachment for the source code.

Enjoy. If you notice any problems, let me know.

Ivan

jcrepetto
Associate II
Posted on May 17, 2011 at 12:27

Has anybody converted the source code FFTr2CM3.s to Keil AA syntax?

Thanks,

Jean-Claude

imellen
Associate II
Posted on May 17, 2011 at 12:27

Actually, I'm switching toolsets from Crossworks to Keil MDK, so I will convert the FFT libraries to ARM assembler syntax.

For now, I've converted only the 1024-point real FFT.

Ivan

jeff239955_stm1
Associate II
Posted on May 17, 2011 at 12:27

I use Crossworks as well. Just curious: is there a reason you are switching to Keil MDK?