Showing results for 
Search instead for 
Did you mean: 

Issues using CMSIS Inverse FFT on STM32


Hey everyone. I am working on a project where I am attempting to detect a frequency of a signal using autocorrelation. I am doing this through applying FFTs and IFFTs. Basically, I have been running into some issues trying to get the correct functionality using the CMSIS dsp library and I am really struggling to figure out what is wrong and how to correct it. To provide proof of concept, I first implemented the algorithm in python like so:


def autocorrelation_via_fft(x):
    # Compute the FFT and then (its magnitude)^2
    Fx = np.fft.fft(x)
    Pxx = np.abs(Fx)**2
    # Inverse FFT to get autocorrelation
    autocorrelation = np.fft.ifft(Pxx)
    autocorrelation = np.real(autocorrelation) 
    return autocorrelation[:len(x)//2]


And this works very well for getting an autocorrelation result and combined with some peak detection, I can very accurately determine the frequency being play with harmonics (in my case a guitar). I then proceeded to translate this code into C using the FFTW library like so:


void autocorrelate(double* x, int N, double* autocorrelation) {
    // Allocate input and output arrays for the FFT
    fftw_complex *in, *out;
    in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);

    // Copy the real signal into the complex input array
    for (int i = 0; i < N; i++) {
        in[i][0] = x[i]; // Real part
        in[i][1] = 0.0;  // Imaginary part

    // Create a plan to calculate the FFT of the signal
    fftw_plan plan_forward = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
    // Execute the plan to calculate the FFT
    // Calculate the power spectrum (magnitude squared of the FFT)
    for (int i = 0; i < N; ++i) {
        double real = out[i][0];
        double imag = out[i][1];
        out[i][0] = real * real + imag * imag; // Magnitude squared
        out[i][1] = 0.0; // Zero out the imaginary part since it should not affect the result

    // Create a plan to calculate the inverse FFT of the power spectrum
    fftw_plan plan_backward = fftw_plan_dft_1d(N, out, in, FFTW_BACKWARD, FFTW_ESTIMATE);
    // Execute the plan to calculate the inverse FFT
    // Copy the real part of the inverse FFT result into the autocorrelation array
    for (int i = 0; i < N; ++i) {
        autocorrelation[i] = in[i][0] / N; // Scale result by N for proper normalization

    // Free the memory allocated for the FFTW arrays and plans


And this produces the exact same result for autocorrelation as python using the FFTW dsp functions (although I had to do more conditioning/normalizing of the data compared to python). I should note that for testing I have just been using a pure sinusoid at 82 Hz to verify functionality (python works on real audio data I confirmed).

Now that's all great, but I can not seem to get the same functionality as this C code when working on the STM32 with CMSIS and I feel like I've tried everything, so I am not certian that the ifft/fft functions in CMSIS are exactly the same as FFTW or python. Here is my (not working attempt):


void autocorrelate(float32_t* x, uint32_t N, float32_t* autocorrelation) {
    float32_t* p_fft_input;
    float32_t* p_fft_output;
    arm_rfft_fast_instance_f32 fftInstance;

    // Allocate memory for the input and output arrays
    p_fft_input = (float32_t *)malloc(sizeof(float32_t) * N * 2);  // FFT input is 2x for complex numbers?
    p_fft_output = (float32_t *)malloc(sizeof(float32_t) * N * 2); // FFT output is also 2x for complex numbers?

    // Initialize the FFT instance
    arm_rfft_fast_init_f32(&fftInstance, N);

    // Zero-pad the input array and copy the real signal into the input array
    for (uint32_t i = 0; i < N; ++i) {
        p_fft_input[i * 2] = x[i];     // Real part
        p_fft_input[i * 2 + 1] = 0.0f; // Imaginary part, which is zero

    // Calculate the FFT
    arm_rfft_fast_f32(&fftInstance, p_fft_input, p_fft_output, 0);

    // Calculate the power spectrum (magnitude squared)
    for (uint32_t i = 0; i < N; ++i) {
        arm_cmplx_mag_squared_f32(&p_fft_output[i * 2], &p_fft_output[i], 1);

    // Inverse FFT to get autocorrelation
    arm_rfft_fast_f32(&fftInstance, p_fft_output, autocorrelation, 1);

    // Normalize
    for (uint32_t i = 0; i < N; ++i) {
        autocorrelation[i] /= (float32_t)N;



I tried to implement sort of what FFTW is doing, but I am not sure if that is right. But regardless, what I should get for the autocorrelation is
I instead get very off numbers:


I am very stuck on this, so if anyone has any sort of insight as to why the cmsis library is not producing similar results I would appreciate. I am thinking I might have to do more conditioning to the input to the ifft due to some differences, but what that is I am not sure. I pretty much only have a basic understanding of how these functions work. 



Chief II