2013-10-29 08:57 AM
I'm unsure how to use the FIR interface. From the provided example:
int a[N]; /*filter output vector*/
short x[M+N-1] = {x0,x1...,xM+N-1}; /*filter input vector*/
short h[M] = {h0,h1...,hM-1}; /*filter coefficients vector*/
fir_coefs.nh = M; /*Number of Coefficients for FIR*/
fir_coefs.h = h; /*Pointer on FIR coefficient vector*/
fir_16by16_stm32(a,x,&fir_coefs,N); /*performs the FIR filtering*/
Suppose I only have 4 samples (the minimum allowed). I put them into the first 4 elements of x and call the FIR function. But if I have a continuous stream, can I just load the next 4 the same way? I'm trying that now, but the output looks wrong. (I find the ARM math library's FIR interface clearer in this respect: it keeps the state and the input separate.)
2013-11-01 08:26 AM
Answering myself:
If you use it in continuous mode, you have to shift the delay line yourself; the lib doesn't do it :\ I also only noticed now that the newest sample has to go first in the input array, and the same holds for the output (the latest is at index 0).
2013-11-08 01:33 AM
If you comment out the f10xxx includes, it also builds and runs on other models.
I'm currently using it on an stm32l15 device.
2013-11-08 02:03 AM
The DSP_lib core is basically vendor and device agnostic. It is provided by ARM and included by ST in its library framework, as other vendors do (NXP, Infineon, EnergyMicro, ...).
These core DSP functions differ only between M0, M3 and M4, by way of conditional builds.
2013-11-14 05:58 AM
I decided to share how I use ST's FIR in my project, compared with the simplest (dumb) FIR implementation. (I still hope to improve my use of the lib, and I'm not sure I'm using it in the best way...)
///Performance (=cycles) of FIR the dumb way, vs STM's DSPLib optimized FIR
/// |---------------------------------------------------|
/// | STM32L152xB, 8 MHz, 40 taps, 4 samples            |
/// | Simple(dumb) FIR:                                 |
/// |   Dbg(low opt)  : cycles = 4321 <-> uS = 540      |
/// |   Rel(full opt) : cycles = 2819 <-> uS = 352      |
/// | STM FIR:                                          |
/// |   Dbg(low opt)  : cycles = 1276 <-> uS = 159      |
/// |   Rel(full opt) : cycles = 1102 <-> uS = 137      |
/// |---------------------------------------------------|
///Compiler: IAR 6.60.2.5449
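For anyone checking the table: the microsecond column is the cycle count divided by the 8 MHz clock, with the fraction dropped. The sketch below redoes that arithmetic and also derives cycles per multiply-accumulate (40 taps x 4 samples = 160 MACs per call). The helper names `cycles_to_us` and `cycles_per_mac` are made up for this illustration and are not part of any library.

```cpp
#include <cstdint>

// Whole microseconds for a cycle count at a core clock given in MHz
// (integer division, so the fractional part is dropped).
constexpr std::uint32_t cycles_to_us(std::uint32_t cycles, std::uint32_t clock_mhz)
{
    return cycles / clock_mhz;
}

// Average cycles per multiply-accumulate for one call of taps * samples MACs.
constexpr double cycles_per_mac(std::uint32_t cycles, std::uint32_t taps,
                                std::uint32_t samples)
{
    return static_cast<double>(cycles) / (taps * samples);
}

// The four rows of the table above, at 8 MHz.
static_assert(cycles_to_us(4321, 8) == 540, "simple FIR, debug");
static_assert(cycles_to_us(2819, 8) == 352, "simple FIR, release");
static_assert(cycles_to_us(1276, 8) == 159, "STM FIR, debug");
static_assert(cycles_to_us(1102, 8) == 137, "STM FIR, release");
```

With these numbers the release builds come out at roughly 17.6 cycles per MAC for the simple loop and roughly 6.9 for the library call, i.e. the library version is about 2.6 times faster in release and about 3.4 times faster in debug.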
// The simplest FIR
// int ii;
// int accsum;
// for (int i = 0; i < NSMPLES; ++i)
// {
//     shift_arr_[0] = smpls_P[i];
//     accsum = 0;
//     //FIR
//     for (ii = 0; ii < N_TAPS_; ++ii)
//         accsum += static_cast<int32_t>(FIR_FILTER_TAPS_[ii]) * shift_arr_[ii];
//     output_arr_[i] = static_cast<int16_t>( (accsum + HALF) >> FBITS );
//     //Shift-delay
//     for (ii = N_TAPS_ - 1 - 1; ii >= 0; --ii)
//         shift_arr_[ii + 1] = shift_arr_[ii];
// }
// result_ = output_arr_[NSMPLES-1];
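The `(accsum + HALF) >> FBITS` step scales the 32-bit accumulator back to a 16-bit sample with rounding. The post does not show how `HALF` and `FBITS` are defined, so the sketch below assumes Q15 coefficients (`FBITS = 15`) and `HALF = 1 << (FBITS - 1)`; both values and the helper name `scale_back` are assumptions made for illustration.

```cpp
#include <cstdint>

constexpr int FBITS = 15;                        // assumed: coefficients are Q15
constexpr std::int32_t HALF = 1 << (FBITS - 1);  // assumed: 0.5 expressed in Q15

// Scale a Q15-weighted accumulator back to a 16-bit sample.
// Adding HALF before the shift rounds to nearest instead of truncating.
// Note: right-shifting a negative value is implementation-defined before
// C++20 (mainstream compilers do an arithmetic shift).
constexpr std::int16_t scale_back(std::int32_t acc)
{
    return static_cast<std::int16_t>((acc + HALF) >> FBITS);
}

// A sample of 1000 times a coefficient of 0.5 (16384 in Q15) gives 500.
static_assert(scale_back(1000 * 16384) == 500, "0.5 * 1000");
// Just under half an LSB rounds down, exactly half rounds up.
static_assert(scale_back(16383) == 0, "rounds down");
static_assert(scale_back(16384) == 1, "rounds up");
```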
// FIR with STM lib
//Copy samples into the shift/input arr. (TODO: rem this copying by shift_arr_ -> smpls_P)
std::memcpy( shift_arr_, smpls_P, NSMPLES * sizeof *smpls_P );
void* vp = &COEFFS_;
fir_16by16_stm32( output_arr_, shift_arr_, static_cast<COEFS*>(vp), NSMPLES );
//LAME: have to shift delay line ourselves! (TODO: can't just modify that ST's asm?)
for ( int ii = (N_TAPS_ + NSMPLES - 1 - 1) - NSMPLES; ii >= 0; --ii )
    shift_arr_[ii + NSMPLES] = shift_arr_[ii];
//Scale back all results.
for ( int i = 0; i < NSMPLES; ++i )
    output_arr_[i] = static_cast<int16_t>( (output_arr_[i] + HALF) >> FBITS );
//Note: most recent is at 0. See also filter init/ctor: reversed sample order.
result_ = output_arr_[0];
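To make the buffer handling easier to follow, here is a minimal, portable sketch of the same streaming pattern. It does not call ST's routine: `fir_block_newest_first` is a plain C++ stand-in written from the behaviour described in this thread (input of nh + n - 1 samples, newest at index 0, newest output at index 0), and `StreamingFir` is a hypothetical wrapper that does the shift between blocks. Treat both as assumptions, not as the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Stand-in for the library call (NOT ST's code): x holds nh + n - 1 samples,
// newest at index 0, and a[k] = sum over j of h[j] * x[k + j],
// so a[0] is the newest output.
void fir_block_newest_first(std::int32_t* a, const std::int16_t* x,
                            const std::int16_t* h, std::size_t nh, std::size_t n)
{
    for (std::size_t k = 0; k < n; ++k) {
        std::int32_t acc = 0;
        for (std::size_t j = 0; j < nh; ++j)
            acc += static_cast<std::int32_t>(h[j]) * x[k + j];
        a[k] = acc;
    }
}

// Hypothetical wrapper: owns the input/delay buffer and performs the shift
// that the caller otherwise has to do between calls.
class StreamingFir {
public:
    StreamingFir(std::vector<std::int16_t> coeffs, std::size_t block)
        : h_(std::move(coeffs)), n_(block), x_(h_.size() + block - 1, 0)
    {
        assert(!h_.empty() && n_ > 0);
    }

    // in:  n_ samples in time order (oldest first).
    // out: n_ unscaled results in time order (oldest first).
    void process(const std::int16_t* in, std::int32_t* out)
    {
        const std::size_t m = h_.size();
        // 1. Age the history: move the m - 1 most recent samples up by n_
        //    slots, starting from the top so nothing is overwritten early.
        for (std::size_t i = m - 1; i-- > 0;)
            x_[i + n_] = x_[i];
        // 2. Load the new block reversed so the newest sample lands at x_[0].
        for (std::size_t i = 0; i < n_; ++i)
            x_[i] = in[n_ - 1 - i];
        // 3. Filter, then reverse the outputs back into time order.
        std::vector<std::int32_t> a(n_);
        fir_block_newest_first(a.data(), x_.data(), h_.data(), m, n_);
        for (std::size_t i = 0; i < n_; ++i)
            out[i] = a[n_ - 1 - i];
    }

private:
    std::vector<std::int16_t> h_;  // coefficients, h_[0] weights the newest sample
    std::size_t n_;                // block size
    std::vector<std::int16_t> x_;  // nh + n - 1 samples, newest first
};
```

Feeding an impulse through a 3-tap filter in blocks of 4 should reproduce the coefficients across block boundaries, which is a quick way to check that the shift and the reversed ordering are right before swapping the stand-in for the real library call.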
I hope to take their asm FIR and modify it so that I don't have to do the extra delay-line shift outside of their function.
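Short of modifying the asm, one common way to avoid the shift in a plain C++ FIR is a circular delay line, where an index moves instead of the data. This is a different technique from what ST's routine does, and it has not been measured against the numbers above; the class name `CircularFir` and its interface are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical circular delay line: the write index steps backwards through
// the buffer, so the stored samples never have to be moved.
class CircularFir {
public:
    explicit CircularFir(std::vector<std::int16_t> coeffs)
        : h_(std::move(coeffs)), d_(h_.size(), 0), head_(0)
    {
        assert(!h_.empty());
    }

    // Push one sample and return the unscaled 32-bit accumulator.
    std::int32_t push(std::int16_t sample)
    {
        const std::size_t m = h_.size();
        head_ = (head_ == 0) ? m - 1 : head_ - 1;  // step back one slot
        d_[head_] = sample;                        // newest overwrites oldest
        std::int32_t acc = 0;
        std::size_t idx = head_;                   // d_[head_] is the newest sample
        for (std::size_t j = 0; j < m; ++j) {
            acc += static_cast<std::int32_t>(h_[j]) * d_[idx];
            if (++idx == m)
                idx = 0;                           // wrap around the buffer end
        }
        return acc;
    }

private:
    std::vector<std::int16_t> h_;  // coefficients, h_[0] weights the newest sample
    std::vector<std::int16_t> d_;  // delay line, same length as h_
    std::size_t head_;             // index of the newest sample
};
```

The trade-off is a wrap check inside the inner loop in exchange for removing the per-block copy, so whether it wins on a given core would have to be measured.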