2023-12-29 08:57 AM
Is it normal for the function below (arraySize = 32) to take approximately 125 us to execute? The for loop is the main problem, as it takes almost 120 us.
I am using an STM32G474ret6u MCU clocked at 170 MHz, and the optimization is set to -O3.
Does anyone have an idea which direction I should take to optimize it? Should I examine the assembly code, or try a peripheral such as the FMAC?
const float reComponent[32] = {
    1.000000, 0.923880, 0.707107, 0.382683, 0.000000, -0.382683, -0.707107, -0.923880,
    -1.000000, -0.923880, -0.707107, -0.382683, -0.000000, 0.382683, 0.707107, 0.923880,
    1.000000, 0.923880, 0.707107, 0.382683, 0.000000, -0.382683, -0.707107, -0.923880,
    -1.000000, -0.923880, -0.707107, -0.382683, -0.000000, 0.382683, 0.707107, 0.923880
};
const float imComponent[32] = {
    0.000000, 0.382683, 0.707107, 0.923880, 1.000000, 0.923880, 0.707107, 0.382683,
    0.000000, -0.382683, -0.707107, -0.923880, -1.000000, -0.923880, -0.707107, -0.382683,
    -0.000000, 0.382683, 0.707107, 0.923880, 1.000000, 0.923880, 0.707107, 0.382683,
    0.000000, -0.382683, -0.707107, -0.923880, -1.000000, -0.923880, -0.707107, -0.382683
};
float DFTphase(uint16_t* inputArray, int arraySize)
{
    //local variables
    float fkRe=0;
    float fkIm=0;
    float phase=0;
    //Computing of Fourier series
    for (int n = 0; n < arraySize; n++)
    {
        fkRe = fkRe + (*inputArray - 2048.0) * reComponent[n];
        fkIm = fkIm + (*inputArray - 2048.0) * imComponent[n];
        //Assign address of next element to pointer inputArray
        inputArray++;
    }
    //Evaluation of phase; atan2f function returns angle in the interval [-PI,PI]
    phase= atan2f(fkRe,fkIm);
    return phase;
}
2023-12-29 09:47 AM
Your input is uint16_t, so subtracting 2048 as an integer would be much faster, at the same precision as your 2048.0, which is a double constant. Try it and tell us how much faster it is.
Also, the CORDIC can do the atan in about 140 ns (I tried on an H563 at 250 MHz), if this helps.
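To illustrate why the literal matters: in C, `2048.0` is a double, so `*inputArray - 2048.0` and the multiply that follows are evaluated in double precision. As far as I know, the FPU on the Cortex-M4 core of the G4 handles single precision only, so double arithmetic goes through software routines. A minimal sketch (the function names are made up for this example) showing which type each form of the subtraction produces:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: each returns the size in bytes of the type the
   subtraction evaluates to. sizeof does not evaluate its operand. */
size_t sizeWithDoubleLiteral(uint16_t s) { return sizeof(s - 2048.0);  } /* double */
size_t sizeWithFloatLiteral(uint16_t s)  { return sizeof(s - 2048.0f); } /* float  */
size_t sizeWithIntLiteral(uint16_t s)    { return sizeof(s - 2048);    } /* int    */

/* Center one sample around zero with signed integer math,
   then convert to float once. */
float centerSample(uint16_t s)
{
    return (float)((int32_t)s - 2048);
}
```

Either the integer form or the `2048.0f` form keeps the loop in single precision; which one is faster on the target would need measuring.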
2023-12-29 10:10 AM
Crazy :D . The whole function execution time is now approximately 8us (before it was 126us), I am more than pleased with that :D. I double-checked because I couldn't believe it.
Thank you.
2023-12-29 10:41 AM
Also you can put the const arrays in RAM: fetching from RAM may be faster than flash.
I'd also suggest massaging the code a bit so it doesn't scratch the reviewer's eye:
float DFTphase(uint16_t* inputArray, int arraySize)
{
    assert(arraySize <= 32);
    float fkRe=0;
    float fkIm=0;
    for (int n = 0; n < arraySize; n++)
    {
        //Subtract as signed int: with 2048U the result is unsigned and wraps for samples below 2048
        float v = (float)((int)inputArray[n] - 2048);
        fkRe += v * reComponent[n];
        fkIm += v * imComponent[n];
    }
    //Evaluation of phase; atan2f function returns angle in the interval [-PI,PI]
    return atan2f(fkRe,fkIm);
}
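For the RAM suggestion, one possible sketch, assuming a GCC toolchain and a typical STM32 linker script in which initialized non-const data (.data) is copied from flash to SRAM at startup: dropping `const` from the tables is then usually enough to move them. The names `reRam`, `imRam` and `DFTaccumulate` are made up for this example, and whether this is actually faster depends on the flash wait states and cache settings, so it needs measuring.

```c
#include <stdint.h>

#define DFT_N 32

/* No const: with a typical linker script these land in .data, i.e. in RAM. */
static float reRam[DFT_N] = {
    1.000000f, 0.923880f, 0.707107f, 0.382683f, 0.000000f, -0.382683f, -0.707107f, -0.923880f,
    -1.000000f, -0.923880f, -0.707107f, -0.382683f, -0.000000f, 0.382683f, 0.707107f, 0.923880f,
    1.000000f, 0.923880f, 0.707107f, 0.382683f, 0.000000f, -0.382683f, -0.707107f, -0.923880f,
    -1.000000f, -0.923880f, -0.707107f, -0.382683f, -0.000000f, 0.382683f, 0.707107f, 0.923880f
};
static float imRam[DFT_N] = {
    0.000000f, 0.382683f, 0.707107f, 0.923880f, 1.000000f, 0.923880f, 0.707107f, 0.382683f,
    0.000000f, -0.382683f, -0.707107f, -0.923880f, -1.000000f, -0.923880f, -0.707107f, -0.382683f,
    -0.000000f, 0.382683f, 0.707107f, 0.923880f, 1.000000f, 0.923880f, 0.707107f, 0.382683f,
    0.000000f, -0.382683f, -0.707107f, -0.923880f, -1.000000f, -0.923880f, -0.707107f, -0.382683f
};

/* Accumulate the real and imaginary sums from the RAM tables.
   The atan2f step would follow unchanged. */
void DFTaccumulate(const uint16_t *inputArray, int arraySize, float *fkRe, float *fkIm)
{
    float re = 0.0f;
    float im = 0.0f;
    /* Never read past the end of the tables, even if arraySize is too large. */
    for (int n = 0; n < arraySize && n < DFT_N; n++)
    {
        float v = (float)((int32_t)inputArray[n] - 2048);
        re += v * reRam[n];
        im += v * imRam[n];
    }
    *fkRe = re;
    *fkIm = im;
}
```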
2023-12-29 11:01 AM
massage
:face_with_tears_of_joy: