herbert
Associate III
March 12, 2024
Solved

Help needed with STM32 ASM function

  • March 12, 2024
  • 2 replies
  • 1970 views

Hello all,

I am currently stuck and need your help. The following function counts the number of 1 bits in a 32-bit word on an STM32F407. Why does it fail for values that have bit 31 set? More precisely, it ignores bit 31, so in those cases the result is off by one.

 .global countBits
 .text
 .syntax unified
countBits:
 mov r3, r0
 mov r2, #32
 mov r0, #0
next:
 lsls r3,r3,#1
 it mi
 addmi r0, #1
 subs r2, #1
 bne next
 bx lr

I would be glad if anyone could point me to the error, which I obviously do not see.

Thanks a lot
Herby


2 replies

STM_Thirty2 (Best answer)
ST Employee
March 12, 2024
lsls r3,r3,#1

This instruction shifts the value in R3 left by 1 before the first test. If bit 31 is set in the input, it is shifted out into the carry flag, so the MI condition, which looks at bit 31 of the shifted result, never sees it.
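To make the off-by-one visible on a PC, here is a rough C model of the loop from the question. It is only a sketch: the function name is made up for illustration, and the MI test is modelled as a check of bit 31 after the shift.

```c
#include <stdint.h>

/* Hypothetical C model of the original countBits loop (shift first, test second). */
uint32_t count_bits_shift_first(uint32_t x)
{
    uint32_t count = 0;
    for (int i = 32; i != 0; i--) {   /* mov r2, #32 ... subs r2, #1 / bne next */
        x <<= 1;                      /* lsls r3, r3, #1: old bit 31 leaves the register */
        if (x & 0x80000000u)          /* MI: bit 31 of the shifted result */
            count++;                  /* addmi r0, #1 */
    }
    return count;
}
```

With this model an input of 0x80000000 yields 0 and 0xFFFFFFFF yields 31, which matches the behaviour described in the question.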

If you feel a post has answered your question, please click "Accept as Solution"
Tesla DeLorean
Guru
March 12, 2024

You're testing the sign bit, and the bit you want is already in the high-order position at the first iteration, so test first and shift afterwards.

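 .global countBits
 .text
 .syntax unified
countBits:
 mov r3, r0       @ working copy of the input
 mov r0, #0       @ bit count = 0
 orrs r3, r3      @ set N/Z from the unshifted value
next:
 it mi
 addmi r0, #1     @ count the bit currently in bit 31
 lsls r3,r3,#1    @ bring the next bit up, update N/Z
 bne next         @ stop as soon as no bits remain
 bx lr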
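For checking the idea on a host machine, a C model of this test-then-shift loop could look as follows. This is a sketch under my own assumptions: the function name is invented, and the N and Z flags are modelled as plain tests on the value.

```c
#include <stdint.h>

/* Hypothetical C model of the test-first loop: look at bit 31, then shift. */
uint32_t count_bits_test_first(uint32_t x)
{
    uint32_t count = 0;
    do {
        if (x & 0x80000000u)   /* MI: bit 31 of the current value */
            count++;           /* addmi r0, #1 */
        x <<= 1;               /* lsls r3, r3, #1 */
    } while (x != 0);          /* bne next: exit once nothing is left to count */
    return count;
}
```

Because the loop exits when the remaining value is zero, no separate counter register is needed, and inputs with few low-order bits set finish in fewer iterations.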

There are other faster ways of doing a population count.

popcount32:
	mov.w	r1, #0x55555555
	and.w	r1, r1, r0, lsr #1
	subs	r0, r0, r1		@ 2-bit fields now hold pair counts
	mov.w	r1, #0x33333333
	and.w	r1, r1, r0, lsr #2
	and.w	r0, r0, #0x33333333
	add	r0, r1			@ 4-bit fields hold nibble counts
	add.w	r0, r0, r0, lsr #4
	and.w	r0, r0, #0x0F0F0F0F	@ 8-bit fields hold byte counts
	add.w	r0, r0, r0, lsr #8
	add.w	r0, r0, r0, lsr #16
	and.w	r0, r0, #0x3F		@ total is at most 32
	bx	lr
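The same sideways addition can be written in C, which is handy as a reference when checking the assembly version. The sketch below follows the listing step by step; the function name is my own choice and not from any library.

```c
#include <stdint.h>

/* Hypothetical C counterpart of popcount32: parallel (sideways) addition. */
uint32_t popcount32_c(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                  /* counts per 2-bit field */
    x = ((x >> 2) & 0x33333333u) + (x & 0x33333333u);  /* counts per 4-bit field */
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                  /* counts per byte */
    x = x + (x >> 8);                                  /* fold bytes together */
    x = x + (x >> 16);
    return x & 0x3Fu;                                  /* result fits in 6 bits */
}
```

The run time does not depend on the input, which is the main contrast with the looping versions.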
Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..
herbert (Author)
Associate III
March 13, 2024

Thank you for your input. Yes, I know and normally use the Gillies-Miller method for sideways addition. I was just drafting this other solution to run some timing experiments and show my students how much faster sideways addition is.

Herby

Tesla DeLorean
Guru
March 13, 2024

If you know where the bits are going to lie, you can change the shift direction so the looping method can "finish early". The loop counter wastes time in every case.

I'd probably have shifted each bit into the carry flag and used an ADC r0,#0 type method, avoiding the conditional execution entirely.
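As a rough illustration of the carry idea, here is a C model that shifts right, so values whose set bits sit in the low positions finish early. The shift direction and the function name are my assumptions; the carry flag is modelled as the bit that falls out of the register.

```c
#include <stdint.h>

/* Hypothetical C model of a shift-into-carry loop with an ADC-style add. */
uint32_t count_bits_carry(uint32_t x)
{
    uint32_t count = 0;
    while (x != 0) {              /* stop once the remaining value is zero */
        uint32_t carry = x & 1u;  /* bit that a right shift moves into carry */
        x >>= 1;                  /* lsrs-style shift */
        count += carry;           /* adc r0, r0, #0: add only the carry */
    }
    return count;
}
```

There is no conditional add here: every iteration adds the carry, which is either 0 or 1.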
