Help needed with STM32 ASM function
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content
‎2024-03-12 11:39 AM
Hallo All,
I am currently stuck and need your help. The following function counts the no. of 1 in a 32 bit word on an STM32F407. Why does it fail for values which have their bit 31 set? Precisely, it ignores bit 31 and in these cases is off by one.
.global countBits
.text
.syntax unified
countBits:
mov r3, r0
mov r2, #32
mov r0, #0
next:
lsls r3,r3,#1
it mi
addmi r0, #1
subs r2, #1
bne next
bx lr
I would be glad if anyone could point me to the error which obviously I do not see.
Thanks a lot
Herby
Solved! Go to Solution.
- Labels:
-
STM32Cube MCU Packages
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content
‎2024-03-12 12:15 PM - edited ‎2024-03-12 12:16 PM
lsls r3,r3,#1
This instruction is shifting the value in R3 to the left by 1. If the 31st bit is set on the initial input number then you are shifting it out and it is lost.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content
‎2024-03-12 12:15 PM - edited ‎2024-03-12 12:16 PM
lsls r3,r3,#1
This instruction is shifting the value in R3 to the left by 1. If the 31st bit is set on the initial input number then you are shifting it out and it is lost.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content
‎2024-03-12 12:32 PM
You're using the sign bit, which is already in the high order bit at the first iteration
.global countBits
.text
.syntax unified
countBits:
mov r3, r0
mov r0, #0
orrs r3, r3
next:
it mi
addmi r0, #1
lsls r3,r3,#1
bne next
bx lr
There are other faster ways of doing a population count.
popcount32:
mov.w r1, #$55555555
and.w r1, r1, r0, lsr #1
subs r0, r0, r1
mov.w r1, #$33333333
and.w r1, r1, r0, lsr #2
and.w r0, r0, #$33333333
add r0, r1
add.w r0, r0, r0, lsr #4
and.w r0, r0, #$F0F0F0F
add.w r0, r0, r0, lsr #8
add.w r0, r0, r0, lsr #16
and.w r0, r0, #$3F
bx lr
Up vote any posts that you find helpful, it shows what's working..
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content
‎2024-03-13 10:32 AM
Thanks you for your input. Yes, I know and normally use the Gillies-Miller method for sideways addition. I was just drafting this other solution to make some time measuring experiments to show to my students how much faster sideways addition is.
Herby
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content
‎2024-03-13 1:12 PM
If you know where the bits are going to lay you can change the shift direction so the looping method can "finish early". The loop counting wastes time in all situations.
I'd probably have opted for the shift into carry and use ADC r0,#0 type method and avoided the conditional execution entirely.
Up vote any posts that you find helpful, it shows what's working..
