
Inline assembly code - register usage

Brian12412
Associate II

I finally got the following code to work after forcing a compiler error to see which registers the compiler was selecting for the inline assembly output and inputs. I found that the compiler was using register R3 for both the output (count) and the first input. To work around this, I created an "unused" variable, made it the first item in the input list, and did not use argument %1 in the code.

Is this due to a bug or limitation in how ARM inline assembly is compiled, such that the same register ends up being used for the output and the first input?

 

 

 

uint32_t numOnes(uint32_t value)
{
	uint32_t unused = 0; // not used
	uint32_t bit = 32; // number of bits
	uint32_t count = 0; // number of ones
	asm (
			"loop:\n\t" // loop
			"TST %2,#1\n\t" // mask for bit0
			"BEQ continue\n\t" // branch if bit0 is 0
			"ADD %0,#1\n\t" // bit0 is 1 so increment count #
			"continue:\n\t" // jump to here if bit0 is 0
			"LSR %2,#1\n\t" // move next bit into bit0 position
			"SUBS %3,#1\n\t" // decrement bit number
			"BNE loop\n\t" // next bit
			: "=r"(count) : "r"(unused) ,"r"(value) ,"r"(bit)
	);
	return count;
}

 

4 REPLIES

The ARM ABI is going to use R0 for input parameter #1 (value), and for the return value (count).

A simple review follows. It gets more complex with more than 4 parameters, as the extra ones will be on the stack, but you can look that up if it's important.

R0 Parameter#1

R1 Parameter#2

R2 Parameter#3

R3 Parameter#4

Returns

R0 for 32-bit returns, R0/R1 for 64-bit returns
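
To make the register list above concrete, here is a small sketch in C. The function names `sum5` and `widen_mul` are hypothetical and only there for illustration. The comments describe what the 32-bit ARM procedure call standard specifies at a real (non-inlined) call boundary; they say nothing about which registers the compiler picks for "r" operands inside an inline asm statement, since those are chosen by the register allocator.

```c
#include <stdint.h>

/* Hypothetical helper with five word-sized arguments.
 * At a non-inlined call, a..d arrive in R0..R3 and e is passed on the stack.
 * The 32-bit result is returned in R0. */
uint32_t sum5(uint32_t a, uint32_t b, uint32_t c, uint32_t d, uint32_t e)
{
    return a + b + c + d + e;
}

/* Hypothetical helper with a 64-bit result.
 * a arrives in R0, b in R1, and the result is returned in the R0/R1 pair. */
uint64_t widen_mul(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}
```

Compiling a file like this with `-S` and reading the generated assembly is one way to check the mapping on your own toolchain.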

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..

https://learn.microsoft.com/en-us/cpp/build/overview-of-arm-abi-conventions?view=msvc-170#integer-registers

 


For calling an assembly routine from C, what is observed matches the "R0, R1, R2, ..." pattern. In this case, however, the code is not calling an assembly routine. It is executing inline assembly, and that is where the result and the first input parameter are both observed to be in R3, with the second input parameter in R2 and the third in R1. The output parameter is specified as "=r", and the GCC/GNU inline assembly specification (which is arguably complex) seems to suggest that whatever register the compiler chooses for the output should not be reused for an input parameter, yet in this case with STM32CubeIDE it is reused. Perhaps someone from STM will take note of this post and shed some light on why this is happening.
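
For what it's worth, the GCC Extended Asm documentation says that unless an output operand carries the `&` (early-clobber) modifier, GCC may allocate it in the same register as an unrelated input, on the assumption that the asm consumes its inputs before producing its outputs. It also says input-only operands must not be modified, and provides `+` for operands that are both read and written. Read that way, the shared R3 looks like documented behaviour rather than a compiler bug, because `count` is read before it is written and `value` and `bit` are modified inside the asm. Below is a minimal sketch of the same loop under those rules. The `+r` constraints, the numeric local labels, the `"cc"` clobber, and the portable fallback branch are my additions and are not from the original post; the asm path assumes a Thumb-2 target (for example Cortex-M3/M4/M7) and has not been checked on hardware.

```c
#include <stdint.h>

/* Sketch only: counts the set bits in value, one bit per loop iteration. */
uint32_t numOnes(uint32_t value)
{
    uint32_t bit = 32;   /* number of bits left to examine */
    uint32_t count = 0;  /* number of one bits seen so far */
#if defined(__thumb2__)
    /* Assumption: Thumb-2 target. Numeric labels (1:, 2:) stay unique
     * even if the compiler inlines this function more than once. */
    __asm__ (
        "1:\n\t"
        "tst  %1, #1\n\t"       /* test bit 0 of value */
        "beq  2f\n\t"           /* skip the increment if bit 0 is clear */
        "add  %0, %0, #1\n\t"   /* bit 0 is set, so increment count */
        "2:\n\t"
        "lsr  %1, %1, #1\n\t"   /* shift the next bit into bit 0 */
        "subs %2, %2, #1\n\t"   /* decrement the remaining-bit counter */
        "bne  1b\n\t"           /* loop until all 32 bits are done */
        : "+r"(count), "+r"(value), "+r"(bit) /* all read and written */
        :                                     /* no input-only operands */
        : "cc"                                /* TST and SUBS change the flags */
    );
#else
    /* Portable fallback so the sketch also builds on non-Thumb-2 hosts. */
    for (; bit != 0; --bit) {
        count += value & 1u;
        value >>= 1;
    }
#endif
    return count;
}
```

With every operand declared read-write, the compiler has to give each one its own register, so the "unused" placeholder input should no longer be needed.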

ST doesn't write the compilers, though they might have devs who contribute or report bugs.

If you want full control, stop using in-line methods. They are very difficult to integrate into the flow and into how the compiler wants to allocate and use registers. The compiler is trying to follow a set of rules and be as flexible and non-disruptive as possible. They are building compilers, not assemblers.

In-line methods tend to be a source of non-portable code, and can be a drag on compatibility, versions, and regression.

Would suggest you present all the output code so we can see what it generates. You can perhaps dissect that and argue with the compiler devs about how you think this should work in the grand scheme of things.

Generally it's trying to fit within the ABI methods so it can be called, and can call everything/anything else. If you want to use all registers as you wish, you can do that in a .S file where you control all the expectations and contractual details. The C compiler is going to push registers and create stack frames in the prologue code, and recover them in the epilogue.

 
