2025-02-22 08:52 AM
I finally got the following code working after forcing a compiler error to see which registers the compiler was selecting for the inline assembly output and inputs. It turned out that the compiler was using register R3 for both the output (count) and the first input. To work around this, I created an "unused" variable, made it the first item in the input list, and did not reference operand %1 in the code.
Is this a bug or a limitation in how ARM inline assembly is compiled, such that the same register ends up being used for the output and the first input?
uint32_t numOnes(uint32_t value)
{
    uint32_t unused = 0;  // not used
    uint32_t bit = 32;    // number of bits
    uint32_t count = 0;   // number of ones
    asm (
        "loop:\n\t"        // loop
        "TST %2,#1\n\t"    // mask for bit0
        "BEQ continue\n\t" // branch if bit0 is 0
        "ADD %0,#1\n\t"    // bit0 is 1 so increment count
        "continue:\n\t"    // jump to here if bit0 is 0
        "LSR %2,#1\n\t"    // move next bit into bit0 position
        "SUBS %3,#1\n\t"   // decrement bit number
        "BNE loop\n\t"     // next bit
        : "=r"(count) : "r"(unused), "r"(value), "r"(bit)
    );
    return count;
}
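
For comparison, here is a minimal sketch of how the constraints could be written so that the "unused" workaround is not needed. With "=r", GCC treats count as write-only and is free to place it in the same register as an input, so the sharing looks like documented behavior rather than a bug. Marking the operands that the assembly both reads and modifies as "+r" tells the compiler they are read-write. This assumes GCC-style extended asm and a Thumb-2 target (for example Cortex-M3 or later). The name num_ones_rw, the numeric local labels, the "cc" clobber, and the plain C fallback for other targets are additions for illustration and are not part of the original code.

```c
#include <stdint.h>

/* Hypothetical variant of numOnes() using read-write ("+r") operands. */
uint32_t num_ones_rw(uint32_t value)
{
    uint32_t bit = 32;   /* number of bits left to examine */
    uint32_t count = 0;  /* number of ones found so far */

#if defined(__GNUC__) && defined(__thumb2__)
    /* Assumption: Thumb-2 target, unified assembler syntax. */
    __asm__ (
        "1:\n\t"                  /* loop start (local label) */
        "TST  %1, #1\n\t"         /* test bit0 of value */
        "BEQ  2f\n\t"             /* skip the increment if bit0 is 0 */
        "ADD  %0, %0, #1\n\t"     /* bit0 is 1, so increment count */
        "2:\n\t"
        "LSR  %1, %1, #1\n\t"     /* move next bit into bit0 position */
        "SUBS %2, %2, #1\n\t"     /* decrement remaining bit counter */
        "BNE  1b\n\t"             /* repeat until all 32 bits are done */
        : "+r"(count), "+r"(value), "+r"(bit)  /* all three are read and written */
        :                                      /* no input-only operands */
        : "cc"                                 /* TST and SUBS change the flags */
    );
#else
    /* Portable fallback with the same behavior, for non-Thumb-2 builds. */
    while (bit--) {
        count += value & 1u;
        value >>= 1;
    }
#endif
    return count;
}
```

The numeric labels (1:, 2:) with the 1b/2f references also avoid duplicate symbol errors if the compiler inlines the function in more than one place, which named labels such as loop: and continue: can trigger.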