DFSDM limited dynamic range at high F_OSR?

mhaun
Associate II
Posted on July 07, 2017 at 05:45

For a while now, I've been wanting to learn the DFSDM (cascaded-integrator-comb, or CIC, filter) peripheral on the STM32L4 and use it for some serious DSP work.  This week I finally had time to dig in and get it working.  The application (for now) is an ultrasonic translator.  The ultrasonic sounds are recorded by an external microphone and digitized at ~ 400 kHz.  In software, I mix this signal with a tunable local oscillator (NCO), then decimate/LPF by 16x, and shove the result out the codec to headphones.  (I am using an STM32L476 discovery board with some custom hardware.)
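As a rough illustration of that signal chain, here is a minimal Python sketch, not the firmware itself. It mixes a test tone with a cosine local oscillator and decimates by 16 using a plain block average, which is only a crude stand-in for a real low-pass filter. The function name, the 40 kHz test tone, and the 38 kHz oscillator setting are illustrative assumptions.

```python
import math

FS = 400_000   # input sample rate in Hz (approximate figure from the post)
DECIM = 16     # decimation factor from the post


def heterodyne(samples, lo_hz, fs=FS, decim=DECIM):
    """Mix with a cosine local oscillator, then average-and-decimate.

    The block average is a crude low-pass stand-in, chosen for brevity.
    """
    mixed = [s * math.cos(2 * math.pi * lo_hz * n / fs)
             for n, s in enumerate(samples)]
    return [sum(mixed[i:i + decim]) / decim
            for i in range(0, len(mixed) - decim + 1, decim)]


# Hypothetical test signal: a 40 kHz tone, shifted down to 2 kHz.
tone = [math.cos(2 * math.pi * 40_000 * n / FS) for n in range(4000)]
audio = heterodyne(tone, lo_hz=38_000)
print(len(audio), "samples at", FS // DECIM, "Hz")
```

The difference frequency (2 kHz here) lands in the audio band, while the sum frequency is attenuated by the averaging.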

The 16x decimation was working with a software-only, FFT-based filter, and I've been listening to bats with the thing.  So far so good!  But I wanted to use the hardware CIC (DFSDM) to reduce the processor load.  I also have some other project ideas for the DFSDM, and this was a good testbed for getting started with it.

The DFSDM chapter in the reference manual is rather crudely written, but after some trial and error I got things working.  I am running DFSDM in parallel-input mode with DMA on both sides.  On the input side there is a memory-to-memory DMA from my ADC ping-pong input buffer into the DFSDM, and on the output side there is a peripheral-to-memory DMA into my SAI ping-pong output buffer which is driving the codec.  (I should note that I am NOT using HAL or Cube or whatever the supplied libraries are called nowadays.  In my experience it's much easier to understand what's happening without them.  Hopefully this does not scare anyone away.)

I am seeing some unexpected behavior at high oversampling factors (F_OSR) and/or high filter orders (F_ORD).  For testing purposes I am driving the DFSDM with the full-scale, int16 sinusoid from my digital oscillator, so I should hear a perfect sine wave in the headphones.  (From experience I am fairly adept at telling when the sound is not a pure sinusoid.)  What I find is that for all F_OSR above some critical factor, I get clipping and distortion.  I can remove the distortion by reducing the amplitude of my driving sinusoid, i.e. by reducing its dynamic range to something less than 16 bits.  I made an approximate table showing the distortion threshold for filter orders of 3, 4, and 5, and a range of decimation/oversampling factors:

F_OSR (decim factor)   sinc^3   sinc^4   sinc^5
 8                     16       16       16
12                     16       16       14
16                     16       16       12
24                     16       13.5      9
32                     16       12        7
64                     15        8       ? (very small)

The values in the cells are the dynamic ranges, in bits, before clipping/distortion occurs.  So e.g. 12 would mean that the four MSBs of the [nominal] 16-bit input must be zero to avoid problems.  Obviously, if the number is not 16 then the filter is broken, in the sense that the advertised 16-bit dynamic range is unattainable.

Now, anyone who understands CIC filters is bound to say at this point, hey, you bozo, you forgot to set the output bit-shift and you're clipping on the 24-bit output!  (In the DFSDM this field is called DTRBS in the CHyCFGR2 register.)  I really wish this were the case, but it's not.  I am setting the shift correctly, and in fact have checked by intentionally shifting extra to the right.  The output gets quieter, but you can still hear all the same distortion.  The distortion goes away when the input amplitude drops below the (approximate) numbers in the table above.
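For reference, the right shift needed to keep a full-scale result inside the 24-bit output can be estimated from the filter's DC gain, OSR^order. The following is a sketch under assumptions: the function name is hypothetical, the input is taken as signed 16-bit, and the gain is that of a generic CIC with unit differential delay. It is not taken from the reference manual.

```python
def output_shift(order, osr, input_bits=16, output_bits=24):
    """Smallest right shift that fits the full-scale peak into output_bits."""
    peak = (1 << (input_bits - 1)) * osr ** order   # magnitude at DC
    limit = 1 << (output_bits - 1)                  # 2^23 for 24-bit signed
    shift = 0
    while (peak >> shift) > limit:
        shift += 1
    return shift


for order in (3, 4, 5):
    print(order, [output_shift(order, osr) for osr in (8, 16, 32)])
```

This only addresses the output register; it says nothing about the width of the stages inside the filter.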

I double- and triple-checked everything I can think of, and at this point my only remaining theories are

1)  unpublished hardware errata

and

2)  ???

so I am really hoping that someone with intimate knowledge of the DFSDM can suggest an alternative to hypothesis 1 !!

It would really help if someone reading this is using the DFSDM successfully for

F_ORD==3 and F_OSR > 64, or

F_ORD==4 and F_OSR > 24, or

F_ORD==5 and F_OSR > 16

with ~ full-scale 16-bit input, and could reply to say 'yeah, it's working for me'.  Otherwise I suspect an erratum.  If true, it's a huge letdown, because most of my imagined applications for the DFSDM use higher F_OSR than this.

:(

Regards,

Mark

#dfsdm #stm32l4

Accepted Solutions
Posted on July 07, 2017 at 08:50

You are hitting the limit set by bit growth (https://en.wikipedia.org/wiki/Cascaded_integrator%E2%80%93comb_filter) at each stage (32 bits, AFAIK):

The equivalence of a CIC to a moving average filter allows us to trivially calculate its bit growth as $N \log_2(RM)$.

You may want to review Hogenauer's seminal 1981 paper on the topic.

JW
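To put numbers on this, the sketch below applies the bit-growth formula to the configurations in the question, assuming a 32-bit signed path, a differential delay M of 1, and a 16-bit signed input. The function names are illustrative. The headroom left for the input is 32 minus the growth, capped at 16.

```python
import math


def bit_growth(order, osr, delay=1):
    """Bit growth N*log2(R*M) of an N-stage CIC."""
    return order * math.log2(osr * delay)


def usable_input_bits(order, osr, path_bits=32, input_bits=16):
    """Input width (in bits) that still fits in the path after growth."""
    return min(input_bits, path_bits - bit_growth(order, osr))


print("F_OSR  sinc^3  sinc^4  sinc^5")
for osr in (8, 12, 16, 24, 32, 64):
    row = [usable_input_bits(order, osr) for order in (3, 4, 5)]
    print(f"{osr:5d}  " + "  ".join(f"{bits:6.1f}" for bits in row))
```

Under these assumptions the predicted values land within about a bit of the measured thresholds in the question's table wherever those were quantified, for example 12 bits for sinc^5 at F_OSR = 16 and 8 bits for sinc^4 at F_OSR = 64.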

mhaun
Associate II
Posted on July 07, 2017 at 19:48

Thanks Jan,

You are correct.  I have some familiarity with CIC filters, but I did not remember that the required width of each integrator/comb stage matches the required width at the filter output, prior to the barrel shifter.  And now that I go back over the reference documentation, it *does* say that the filter path is 32 bits wide.
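That width requirement can be checked with a small bit-accurate simulation. The sketch below is a generic CIC decimator with differential delay 1, not a model of the DFSDM itself; the function names and the constant (DC) test input are assumptions chosen to make the result deterministic. Every integrator and comb register wraps to the given width, and the output is compared against unbounded arithmetic.

```python
def to_signed(value, width):
    """Interpret the low `width` bits of value as a two's-complement integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def cic_decimate(samples, order, osr, width=None):
    """N-stage CIC decimator; width=None means unbounded (ideal) arithmetic."""
    wrap = (lambda v: v) if width is None else (lambda v: to_signed(v, width))
    integrators = [0] * order
    comb_delays = [0] * order
    out = []
    for i, x in enumerate(samples):
        # Integrator cascade runs at the input rate.
        for k in range(order):
            integrators[k] = wrap(integrators[k] + x)
            x = integrators[k]
        # Keep one sample in every `osr`, then run the comb cascade.
        if (i + 1) % osr == 0:
            y = x
            for k in range(order):
                y, comb_delays[k] = wrap(y - comb_delays[k]), y
            out.append(y)
    return out


def matches_ideal(amplitude, order, osr, width=32):
    """True if the wrapped filter output equals the unbounded one."""
    samples = [amplitude] * (osr * (order + 3))
    return (cic_decimate(samples, order, osr, width)
            == cic_decimate(samples, order, osr))


print(matches_ideal(32767, 4, 16))  # full-scale int16, sinc^4, 16x
print(matches_ideal(32767, 5, 16))  # full-scale int16, sinc^5, 16x
print(matches_ideal(2047, 5, 16))   # 12-bit amplitude, sinc^5, 16x
```

In this model the wrapped result agrees with the ideal one only while amplitude times OSR^order stays below 2^31, which is consistent with the 12-bit threshold reported in the question for sinc^5 at 16x.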

That being said, IMO this is a pretty significant fail on the part of the IP designers.  On the L496 and L4A6 parts they went to the trouble of adding a direct path from the ADC to the DFSDM, which tells me they are taking parallel-data input seriously; DFSDM is not just for serial streams anymore.  That being the case, why did they spend so much silicon on bells and whistles like the analog watchdog, injected conversions, etc., while leaving the compute path only 32 bits wide?  Surely an increase to 64 bits would have used less silicon area than all of those other features?  At the very least, they could have provided a mux to let you daisy-chain filters in hardware.  (I don't think it's possible to DMA straight from the output of one filter to the input of another, because the rounded 16-bit output lies in the upper 16 bits of the output register, but the next filter input register needs the data to be in the lower 16 bits.  This implies yet another ping-pong buffer in memory, and I'm already running short on DMA channels.)
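The repacking step that forces the extra buffer is small in itself. As a sketch based only on the register layout described above (sample in the upper 16 bits of the output word, expected in the lower 16 bits of the input word), with a hypothetical function name:

```python
def output_word_to_input_word(output_word):
    """Move a 16-bit sample from bits 31:16 of a 32-bit word to bits 15:0."""
    return (output_word >> 16) & 0xFFFF


print(hex(output_word_to_input_word(0x7FFF0000)))  # positive full scale
print(hex(output_word_to_input_word(0x80000000)))  # negative full scale
```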

BTW nowhere in the reference manual does it mention that the oversampling/decimation factor is limited to 32x for sinc^3, 16x for sinc^4, and 8x for sinc^5, with full parallel data.  Instead, what they provide in Section 24.4.8 and Table 156 are the limits for 1-bit inputs:  1024x for sinc^3 (would be 1290x but limited by F_OSR bit field), 215x for sinc^4 and 73x for sinc^5.  Woe to the embedded-systems designer who hasn't spent some quality time in Hogenauer lately!
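These limits follow from requiring that the full-scale input times OSR^order fit in the 32-bit signed path. A sketch of that calculation, assuming unit differential delay and treating a 1-bit stream as input width 1; the function name is illustrative:

```python
def max_osr(order, input_bits, path_bits=32):
    """Largest OSR with 2^(input_bits-1) * OSR^order <= 2^(path_bits-1)."""
    limit = 1 << (path_bits - input_bits)
    osr = round(limit ** (1.0 / order))      # floating-point first guess
    while osr ** order > limit:              # correct any rounding error
        osr -= 1
    while (osr + 1) ** order <= limit:
        osr += 1
    return osr


for bits in (1, 16):
    print(bits, [max_osr(order, bits) for order in (3, 4, 5)])
```

For 1-bit input this reproduces the 1290x, 215x, and 73x figures. For 16-bit input the exact bounds under these assumptions come out as 40x, 16x, and 9x, so the 32x and 8x quoted above are the largest powers of two that still fit.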

So, problem solved, I guess.  This is very disappointing however.

Mark