ADC is slow using HAL

jim_b
Visitor

I am measuring the time it takes to do ADC conversions on an STM32H753ZI Nucleo board, using the function HAL_ADC_Start_DMA().  This converts 6 channels of ADC3, at 16 bits, with DMA.  The one function call handles everything, including the DMA transfer.  I then wait for the DMA to complete using the interrupt callback HAL_ADC_ConvCpltCallback().  I am surprised how long this is taking and would like to ask if anyone has ideas on how to speed it up, or whether this is expected.  I set this up using CubeMX as follows:

ADC clock is 10 MHz. (I think the max is 12 MHz for the 16-bit ADC with the LQFP144 package.)

CPU clock is 100 MHz.

ADC3 channels 0 through 5 all have a sample time of 2.5 ADC clocks.

ADC3 is set for 16 bits; conversion time is 8.5 ADC clocks, I believe.

That is a total of 11 ADC clock cycles per channel, or 1.1 usec.

So for 6 channels the total time should be about 1.1 usec x 6 plus DMA time, which should be 7-8 usec.  But the time I measure from HAL_ADC_Start_DMA() to HAL_ADC_ConvCpltCallback() is about 40 usec.  I am compiling in release mode with default optimization (-Os).

I normally avoid optimization, but I tried -O2 and the measured time decreased to 28 usec.  I also tried a 200 MHz CPU clock (ADC clock still 10 MHz) and it decreased further to 22 usec.  But this is still slow compared with the underlying hardware.  Any thoughts are appreciated, and thanks for the help.
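In case it helps, the flow is essentially this (a simplified sketch, not my exact code; the handle and buffer names are illustrative, and the DWT cycle counter here is just one way to take the timestamps):

#include "stm32h7xx_hal.h"

extern ADC_HandleTypeDef hadc3;              /* from CubeMX */

#define NUM_CHANNELS 6
static uint16_t adc_buf[NUM_CHANNELS];       /* DMA destination buffer */

static volatile uint32_t t_start, t_done;    /* DWT cycle counts */

void measure_adc(void)
{
  /* enable the CPU cycle counter for timing */
  CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;
  DWT->CYCCNT = 0;
  DWT->CTRL  |= DWT_CTRL_CYCCNTENA_Msk;

  t_done  = 0;
  t_start = DWT->CYCCNT;
  HAL_ADC_Start_DMA(&hadc3, (uint32_t *)adc_buf, NUM_CHANNELS);

  while (t_done == 0) { }                    /* wait for the callback */
  /* elapsed time in seconds = (t_done - t_start) / SystemCoreClock */
}

void HAL_ADC_ConvCpltCallback(ADC_HandleTypeDef *hadc)
{
  if (hadc->Instance == ADC3)
    t_done = DWT->CYCCNT;
}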

4 REPLIES
TDK
Super User

Putting code into ITCMRAM will help quite a bit. Put the vector table and the callback routines in there.
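For example, with GCC, something like this (a sketch only; it assumes your linker script provides an output section, here called ".itcm_text", that is located in ITCMRAM and copied from flash at startup, and the IRQ handler / DMA handle names are illustrative):

extern DMA_HandleTypeDef hdma_adc3;          /* from CubeMX */

/* in stm32h7xx_it.c: tag the raw DMA IRQ handler for ITCMRAM */
__attribute__((section(".itcm_text")))
void DMA1_Stream0_IRQHandler(void)
{
  HAL_DMA_IRQHandler(&hdma_adc3);
}

/* and the ADC callback as well */
__attribute__((section(".itcm_text")))
void HAL_ADC_ConvCpltCallback(ADC_HandleTypeDef *hadc)
{
  /* ... */
}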

Using DTCMRAM for the stack will help.

Disabling the half-complete callback, if possible, will help. You should be able to disable it after HAL_ADC_Start_DMA() but before it fires.
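Something like this (a sketch; handle and buffer names are illustrative):

HAL_ADC_Start_DMA(&hadc3, (uint32_t *)adc_buf, 6);
/* mask the DMA half-transfer interrupt so only the transfer-complete
   interrupt (and HAL_ADC_ConvCpltCallback) fires */
__HAL_DMA_DISABLE_IT(hadc3.DMA_Handle, DMA_IT_HT);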

Disabling interrupts entirely and polling for completion would avoid a lot of the slowdown. But now we're deviating from how HAL expects things to be run. There are sacrifices to be made (size, speed) for the niceties of HAL.
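A rough sketch of the polling variant (names are illustrative; since the DMA interrupts are masked, HAL's state machine never sees the completion, so you stop the ADC yourself afterwards):

extern ADC_HandleTypeDef hadc3;              /* from CubeMX */
static uint16_t adc_buf[6];

void read_adc_polled(void)
{
  HAL_ADC_Start_DMA(&hadc3, (uint32_t *)adc_buf, 6);

  /* mask the DMA interrupts so no HAL callback path runs at all */
  __HAL_DMA_DISABLE_IT(hadc3.DMA_Handle, DMA_IT_HT);
  __HAL_DMA_DISABLE_IT(hadc3.DMA_Handle, DMA_IT_TC);

  /* busy-wait for the ADC end-of-sequence flag
     (the last DMA beat may lag EOS by a few bus cycles) */
  while (!__HAL_ADC_GET_FLAG(&hadc3, ADC_FLAG_EOS)) { }
  __HAL_ADC_CLEAR_FLAG(&hadc3, ADC_FLAG_EOS);

  HAL_ADC_Stop_DMA(&hadc3);                  /* adc_buf now holds the 6 results */
}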

 

I'm surprised compiler optimization settings were able to get it from 40 us in default release mode down to 22 us. That's a lot.

Danish1
Lead III

I think you will see a better average sample rate with a higher number of samples; the overhead of setting up the DMA is only incurred once per call. That overhead is vastly improved by optimization, as you have seen.

For me, the real benefit of DMA is that it allows the STM32's Arm processor to do other things while the ADC (or another slow peripheral) works as fast as it can.

Pavel A.
Super User

Option -Os optimizes for minimal size, not for maximum speed.

 

KnarfB
Super User

Using HAL callbacks has an inherent overhead. Put a breakpoint in the raw interrupt handler (in some *_it.c file) and follow the path through the HAL. 

Ironically, you will be better off polling the DMA completion flag. Of course, this will keep the CPU busy while polling.

If you do periodic measurements with circular DMA, the interrupts are less of a problem because measurements and callbacks will overlap. The latency itself still remains.
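A minimal sketch (it assumes the DMA is configured as circular in CubeMX and the ADC conversions are triggered periodically, e.g. by a timer; names, buffer sizes and the process() helper are illustrative):

#include "stm32h7xx_hal.h"

extern ADC_HandleTypeDef hadc3;                    /* from CubeMX */

#define NUM_CHANNELS 6
#define FRAMES       32                            /* frames per half-buffer */
static uint16_t ring[2 * FRAMES * NUM_CHANNELS];

void process(uint16_t *frames, int count);         /* your processing routine */

void start_sampling(void)
{
  /* start once; with circular DMA the transfer never stops */
  HAL_ADC_Start_DMA(&hadc3, (uint32_t *)ring, 2 * FRAMES * NUM_CHANNELS);
}

void HAL_ADC_ConvHalfCpltCallback(ADC_HandleTypeDef *hadc)
{
  if (hadc->Instance == ADC3)
    process(&ring[0], FRAMES);                     /* first half ready */
}

void HAL_ADC_ConvCpltCallback(ADC_HandleTypeDef *hadc)
{
  if (hadc->Instance == ADC3)
    process(&ring[FRAMES * NUM_CHANNELS], FRAMES); /* second half ready */
}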

hth

KnarfB