2017-03-20 12:12 PM
Hi.
I'm reading from 1025-entry float lookup tables, where the extra last item is there so the data can loop. The input to the function is a 32-bit phase value: the top 10 bits are used as the table index and the remaining 22 bits are the fractional part used for interpolation.
This is the function
float readinterpolated(const uint32_t x, const float *datapt)
{
    uint32_t coarse = x >> 22;                    /* top 10 bits: table index */
    float fine = (x % 4194304) / 4194304.0f;      /* low 22 bits scaled to [0,1) */
    return datapt[coarse] + (datapt[coarse + 1] - datapt[coarse]) * fine;
}
and the generated assembly code is
lsrs r3, r0, #22 @ coarse, x,
ubfx r2, r0, #0, #22 @ D.12138, x,,
add r1, r1, r3, lsl #2 @ tmp127, datapt, coarse,
vmov s15, r2 @ int @ D.12138, D.12138
vldr.32 s0, [r1] @ D.12139, *_8
vcvt.f32.s32 s15, s15, #22 @ fine, D.12138,
vldr.32 s14, [r1, #4] @ *_12, *_12
vsub.f32 s14, s14, s0 @ D.12139, *_12, D.12139
vfma.f32 s0, s15, s14 @, fine, D.12139
bx lr @
Is this the fastest way of doing it? I'm using it in a DSP application to generate waveforms, and I need the best performance I can get, since the application is already running near full capacity without all of its features added yet.
2017-03-22 02:22 AM
Looks reasonably optimal. Using asm, inline or a full function, you could tweak the ordering of instructions, which, depending on the particular core (M4/M7), might result in better utilization of the parallelism between the integer and float execution units. You could also use a secondary table with precalculated differences (see the sketch below). The placement of the code and of the tables (FLASH/RAM/TCM, or cached memory on the M7) might make a difference too, and there may be some tweaking available outside of the code you presented.
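A rough, untested sketch of the difference-table idea (the function and table names are just placeholders):

/* Untested sketch: delta[i] holds a precomputed datapt[i+1] - datapt[i],
   so the subtraction disappears from the per-sample path, at the cost of
   another 1024-float table. */
float readinterpolated_delta(const uint32_t x, const float *datapt, const float *delta)
{
    uint32_t coarse = x >> 22;
    float fine = (x % 4194304) / 4194304.0f;
    return datapt[coarse] + delta[coarse] * fine;
}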
Expect hard work and no miracles, though.
JW
2017-03-22 04:55 AM
The dual table might help, or a 2D array with 2x1024 entries. Strikes me that a 64-bit load and interleaving the dependencies might buy a few cycles; something along the lines of the sketch below.
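An untested sketch of that, assuming a 1024-entry table of value/delta pairs built ahead of time:

/* Untested sketch: pair[i][0] = datapt[i], pair[i][1] = datapt[i+1] - datapt[i].
   Keeping the value and its delta adjacent lets the compiler (or hand-written
   asm) fetch both with a single 64-bit load and keeps them in one cache line. */
float readinterpolated_pair(const uint32_t x, const float (*pair)[2])
{
    uint32_t coarse = x >> 22;
    float fine = (x % 4194304) / 4194304.0f;
    return pair[coarse][0] + pair[coarse][1] * fine;
}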