
STM32H7 faster using floats than using uint32_t

Associate III

I wrote the following two versions of the same routine to process DMA data, one using floats and the second using uint32_t to try to run faster, but to my surprise the first one ran faster:



void process_data_ADC1()
{
	float samples1=(2.0f/(float)ADC_BUF1);  // = 1/samples
	uint32_t media1=0,media2=0;
	for (int i=0;i<ADC_BUF1/16;i++)
		for (int j=0;j<8;j++)
		{
			// ...
		}
	float med1=(float) media1*samples1;
	float med2=(float) media2*samples1;
	avg16[1]=(uint16_t) (16.0f*med1+0.5f);
	avg16[2]=(uint16_t) (16.0f*med2+0.5f);
	float rms1=0.0f,rms2=0.0f,x;
	for (int i=0;i<ADC_BUF1/16;i++)
		for (int j=0;j<8;j++)
		{
			x=(buffer_ADC1[i*16+j]  -med1);rms1+=x*x; // at most 2^12, no overflow for 12 bits
			// ...
		}
	rms16[1]=(uint16_t) (16.0f*sqrt(rms1*samples1)+0.5f);
	rms16[2]=(uint16_t) (16.0f*sqrt(rms2*samples1)+0.5f);
}
//161b Using uint32_t instead of float: IT IS SLOWER!!!!!!!
void process_data_ADC1_fast()
{
	float samples1=(2.0f/(float)ADC_BUF1);  // = 1/samples
	uint32_t media1=0,media2=0,samples00=ADC_BUF1/2;
	for (int i=0;i<ADC_BUF1/16;i++)
		for (int j=0;j<8;j++)
		{
			// ...
		}
	float med1=(float) media1*samples1;
	float med2=(float) media2*samples1;
	avg16[1]=(uint16_t) (16.0f*med1+0.5f);
	avg16[2]=(uint16_t) (16.0f*med2+0.5f);
	uint32_t rms1=0,rms2=0,x;
	for (int i=0;i<ADC_BUF1/16;i++)
		for (int j=0;j<8;j++)
		{
			x=(buffer_ADC1[i*16+j]  -med1);rms1+=x*x; // at most 2^12, no overflow for a 12-bit ADC and 32-bit variables
			// ...
		}
	x=16.0f*sqrt((float) rms1*samples1);rms16[1]= (uint16_t) (x+0.5f);
	x=16.0f*sqrt((float) rms2*samples1);rms16[2]= (uint16_t) (x+0.5f);
}




This is the routine used to measure time:



uint32_t measure_time(void)
{
	uint32_t static start = 0;
	uint32_t time2= SysTick->VAL;
	return (time2);
}



(It surprised me that the SysTick timer counts backward.)


In debug mode it took:

49321 ticks for the float routine

49321 ticks for the uint32_t routine




@JLope.11 wrote:

one using floats and the second using uint32_t to try to run faster, but to my surprise the first one ran faster:

but then


@JLope.11 wrote:

In debug mode it took:

49321 ticks for the float routine

49321 ticks for the uint32_t routine

So they actually take the same time?

On a CPU with a hardware floating-point unit, I don't think that's necessarily surprising?
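Note also that in the posted uint32_t routine, `x` is a uint32_t but `med1` is still a float, so the subtraction inside the loop is done in floating point and converted back to an integer on every sample. A minimal sketch of an accumulation that stays in integers until the very end is below. The names `adc_sums_t`, `accumulate` and `variance` are hypothetical, and it assumes 12-bit samples in a uint16_t buffer of at most 2^19 entries; it is only an illustration, not the original poster's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type, not from the original post. */
typedef struct {
    uint32_t sum;    /* sum of samples: fits 32 bits for up to 2^19 12-bit samples */
    uint64_t sumsq;  /* sum of squared samples: 64 bits so it cannot overflow */
} adc_sums_t;

/* Single pass over the buffer using integer arithmetic only. */
static adc_sums_t accumulate(const uint16_t *buf, size_t n)
{
    adc_sums_t s = {0u, 0u};
    for (size_t i = 0; i < n; i++) {
        uint32_t v = buf[i];
        s.sum   += v;
        s.sumsq += (uint64_t)v * v;
    }
    return s;
}

/* Variance about the mean: (n*sumsq - sum^2) / n^2.
   The only floating-point work is this final division. */
static float variance(adc_sums_t s, size_t n)
{
    uint64_t num = (uint64_t)n * s.sumsq - (uint64_t)s.sum * s.sum;
    return (float)num / ((float)n * (float)n);
}
```

The RMS about the mean would then be the square root of `variance(...)`, taken once per buffer rather than per sample. Whether this is actually faster than the float loop on an STM32H7 would have to be measured.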

Chief III

Your first "time2" is

time2 = 0 - systick. That is negative... OK?

The next is time2 = old systick - new systick. Also negative...


Yes, SysTick counts down, is only 24 bits wide, and is often clocked through a DIV8 prescaler.
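Because the counter runs downward and reloads after reaching zero, the elapsed time is "start minus end", with a correction if a reload happened in between. A minimal sketch follows; `systick_elapsed` is a hypothetical helper name, and it assumes at most one reload between the two reads.

```c
#include <stdint.h>

/* Elapsed ticks on a down-counter that reloads to `reload` after reaching 0.
   Valid only if the counter wrapped at most once between the two reads. */
static uint32_t systick_elapsed(uint32_t start, uint32_t end, uint32_t reload)
{
    if (start >= end)
        return start - end;               /* no wrap: the counter only counted down */
    return start + (reload + 1u) - end;   /* wrapped once through 0 */
}
```

On the target this would be called with two reads of `SysTick->VAL` and with `SysTick->LOAD` as the reload value. If the DIV8 clock source is selected, each tick corresponds to 8 processor cycles.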

However, DWT CYCCNT is 32-bit and counts up in processor cycles.
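A minimal sketch of using the cycle counter is below. The register and mask names are the ones from the CMSIS-Core Cortex-M headers (CMSIS 5 naming); `cyccnt_enable` and `cyccnt_elapsed` are hypothetical helper names, and the `#ifdef DWT` guard assumes the device header has already been included. On some Cortex-M7 parts the DWT may additionally need to be unlocked before the counter can be enabled, so treat this as a starting point rather than a complete recipe.

```c
#include <stdint.h>

/* Elapsed cycles between two CYCCNT reads.
   Unsigned subtraction gives the right result across one 32-bit wrap. */
static uint32_t cyccnt_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

#ifdef DWT  /* defined by the CMSIS core header when it has been included */
static void cyccnt_enable(void)
{
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;  /* enable the DWT/trace block */
    DWT->CYCCNT = 0u;                                /* reset the cycle counter */
    DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;             /* start counting cycles */
}
#endif
```

Typical use would be to call `cyccnt_enable()` once, read `DWT->CYCCNT` before and after the routine under test, and pass both values to `cyccnt_elapsed()`.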
