Dear Community,

To realize a nearly ideal low-pass filter (without phase shift), I need to perform a symmetric running average on the STM32F4 core. The operation takes 3000 sampled points of raw data and, for each filtered point, averages 100 of those raw data points (+/- 50). Thus at least 290000 additions plus 2900 divisions are required. Currently the complete process takes about 30 ms at a 144 MHz clock speed, and I think it can be significantly accelerated. I would be glad if someone could suggest how to speed up this operation.

Here is a piece of the code:

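short int Even_Moving_average(short int *data_array, short int windowsize, short int position) {
    // note: windowsize is unused; the half-width of 50 is hard-coded below
    int count = 0;
    int average = 0;

    // check: the window must lie completely inside the block
    if (position - 50 < 0) return 0;
    if (position + 50 + 1 > BLOCK_SIZE - 1) return 0;

    // sum the 101 samples from position - 50 to position + 50
    for (count = 0; count < 100 + 1; count++) {
        average = average + data_array[count + position - 50];
    }

    return (short int)(average / 101);
}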

//call
for (count = 50; count < BLOCK_SIZE - 52; count++) {
    result[count] = Even_Moving_average(BASEwaveOUT, 100, count);
}
// takes ~ 30 ms ;-(

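Remember the sum. Except for the first and last 50 points, which you handle separately as lead-in and lead-out, you then don't need to perform 100 additions for each new point; you only subtract the sample that leaves the window and add the one that enters it. For the 101-point window in your code, going from pos - 1 to pos means subtracting x[pos - 51] and adding x[pos + 50].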

Try it with pencil and paper on a short average of 3 or 4.
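
As an illustration with made-up numbers, take x = {2, 4, 6, 8, 10} and a 3-point window (+/- 1):

pos 1: sum = 2 + 4 + 6 = 12, average = 4
pos 2: sum = 12 - 2 + 8 = 18, average = 6
pos 3: sum = 18 - 4 + 10 = 24, average = 8

Only the first sum needs all the additions; every later one costs one subtraction and one addition.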

JW
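
The running-sum idea from the reply could be sketched in C roughly as follows. This is only a minimal sketch, not the original poster's final code: the function name, the parameter names and the choice to write 0 into the lead-in and lead-out positions are assumptions made for illustration. It also assumes the sum fits in 32 bits, which holds for 16-bit samples and a 101-point window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Symmetric moving average using a running sum (illustrative sketch).
 * in, out : buffers of n samples each (hypothetical names)
 * half    : half-width of the window; the window holds 2*half + 1 samples
 * Outputs without a full window (first and last `half`) are set to 0;
 * this edge handling is an assumption, not part of the original post. */
void moving_average_running_sum(const int16_t *in, int16_t *out,
                                size_t n, size_t half)
{
    const size_t width = 2 * half + 1;
    assert(in != NULL && out != NULL);

    /* Too short for even one full window: nothing to average. */
    if (n < width) {
        for (size_t i = 0; i < n; i++) out[i] = 0;
        return;
    }

    /* Lead-in and lead-out: no full window available here. */
    for (size_t i = 0; i < half; i++) {
        out[i] = 0;
        out[n - 1 - i] = 0;
    }

    /* Sum of the first full window, centred on index `half`. */
    int32_t sum = 0;
    for (size_t i = 0; i < width; i++) sum += in[i];
    out[half] = (int16_t)(sum / (int32_t)width);

    /* Slide the window: one subtraction and one addition per output. */
    for (size_t pos = half + 1; pos + half < n; pos++) {
        sum -= in[pos - half - 1];  /* sample leaving the window  */
        sum += in[pos + half];      /* sample entering the window */
        out[pos] = (int16_t)(sum / (int32_t)width);
    }
}
```

With the buffers from the question, a call might look like moving_average_running_sum(BASEwaveOUT, result, BLOCK_SIZE, 50). Each output then costs two additions and one division instead of 101 additions and one division.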