Hard faults due to (gcc) compiler computing incorrect addresses

lamare · ‎2025-04-29

I've encountered a number of compiler problems in a project that ports freemodbus to an STM32G4 Nucleo board; the project is available on GitHub. There's a demo application specifically for the NUCLEO-G431RB:

https://github.com/alammertink/FreeModbusDemo

This uses the library ported to STM32 using the cmake toolchain:

https://github.com/alammertink/freemodbus

https://github.com/alammertink/freemodbus/blob/master/demo/STM32_CMAKE/README.md

I've tested STM32CubeCLT versions 1.16 to 1.18 on Windows, and all of them show the same problem: hard faults caused by an error in the address calculation for function pointers and even byte arrays. This has nothing to do with alignment issues.

I've worked around these errors using inline assembly; the workarounds are conditionally compiled using the STM32_CMAKE macro. The problems occur in mb.c, where, among other things, a number of callback function pointers are set:

https://github.com/alammertink/freemodbus/blob/master/modbus/mb.c

#ifdef STM32_CMAKE // work around nasty gcc compiler bug
          {
            uint32_t* srcPtr    = NULL;
            uint32_t* destPtr   = NULL;

            __asm__ volatile ("ldr %0, =eMBRTUStart"    : "=r" (pvMBFrameStartCur));
            __asm__ volatile ("ldr %0, =eMBRTUStop"     : "=r" (pvMBFrameStopCur));
            __asm__ volatile ("ldr %0, =eMBRTUSend"     : "=r" (peMBFrameSendCur));
            __asm__ volatile ("ldr %0, =eMBRTUReceive"  : "=r" (peMBFrameReceiveCur));
            
            pvMBFrameCloseCur   = NULL;

            // Assign pxMBFrameCBByteReceived
            __asm__ volatile ("ldr %0, =xMBRTUReceiveFSM"        : "=r" (srcPtr));
            __asm__ volatile ("ldr %0, =pxMBFrameCBByteReceived" : "=r" (destPtr));
            *destPtr = (uint32_t)srcPtr;

            // Assign pxMBFrameCBTransmitterEmpty
            __asm__ volatile ("ldr %0, =xMBRTUTransmitFSM"           : "=r" (srcPtr));
            __asm__ volatile ("ldr %0, =pxMBFrameCBTransmitterEmpty" : "=r" (destPtr));
            *destPtr = (uint32_t)srcPtr;

            // Assign pxMBPortCBTimerExpired
            __asm__ volatile ("ldr %0, =xMBRTUTimerT35Expired"  : "=r" (srcPtr));
            __asm__ volatile ("ldr %0, =pxMBPortCBTimerExpired" : "=r" (destPtr));
            *destPtr = (uint32_t)srcPtr;
          }
#else
            pvMBFrameStartCur = eMBRTUStart;
            pvMBFrameStopCur = eMBRTUStop;
            peMBFrameSendCur = eMBRTUSend;
            peMBFrameReceiveCur = eMBRTUReceive;
            pvMBFrameCloseCur = MB_PORT_HAS_CLOSE ? vMBPortClose : NULL;
            pxMBFrameCBByteReceived = xMBRTUReceiveFSM;
            pxMBFrameCBTransmitterEmpty = xMBRTUTransmitFSM;
            pxMBPortCBTimerExpired = xMBRTUTimerT35Expired;
#endif
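
To isolate the plain-C branch from the rest of the port, the same assignment pattern can be reduced to a standalone sketch like the one below. The callback type, all `demo_` names, and the function bodies are placeholders invented for this example; they are not the FreeModbus definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder callback type; FreeModbus defines its own typedefs. */
typedef int (*demo_cb_t)(void);

/* Placeholder handlers standing in for the RTU state-machine functions. */
static int demo_receive_fsm(void)   { return 1; }
static int demo_transmit_fsm(void)  { return 2; }
static int demo_timer_expired(void) { return 3; }

/* File-scope callback pointers, set once during initialization. */
static demo_cb_t demo_cb_byte_received;
static demo_cb_t demo_cb_transmitter_empty;
static demo_cb_t demo_cb_timer_expired;

/* Plain C assignments, in the same form as the #else branch. */
static void demo_init(void)
{
    demo_cb_byte_received     = demo_receive_fsm;
    demo_cb_transmitter_empty = demo_transmit_fsm;
    demo_cb_timer_expired     = demo_timer_expired;
}

/* Calls every callback once and returns the sum of their results. */
static int demo_call_all(void)
{
    assert(demo_cb_byte_received != NULL);
    assert(demo_cb_transmitter_empty != NULL);
    assert(demo_cb_timer_expired != NULL);

    return demo_cb_byte_received()
         + demo_cb_transmitter_empty()
         + demo_cb_timer_expired();
}
```

If a reduction like this runs correctly when built with the project's exact flags and linker script, the fault is more likely to depend on something outside these few assignments.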

And in mbrtu.c, where even the address of a byte buffer array was not computed correctly:

https://github.com/alammertink/freemodbus/blob/master/modbus/rtu/mbrtu.c

#ifdef STM32_CMAKE // work around nasty gcc compiler bug
            volatile UCHAR* destPtr;
    
            __asm__ volatile ("ldr %0, =ucRTUBuf" : "=r" (destPtr));
    
            destPtr += usRcvBufferPos;
    
            *destPtr = ucByte;
            
            usRcvBufferPos++;
#else
            ucRTUBuf[usRcvBufferPos++] = ucByte;
#endif

There are more problems in these two files, all worked around using inline assembly and conditionally compiled using the STM32_CMAKE macro, set in:

https://github.com/alammertink/freemodbus/blob/master/CMakeLists.txt

The problems can be reproduced with just a Nucleo-G431RB board by un-setting the STM32_CMAKE macro. The function pointers are already set in the initialization, which will be hit without any further action. The errors in mbrtu.c require at least some data to be sent over the (virtual) com port, which can be done using modpoll as described in the readme:

https://github.com/alammertink/freemodbus/blob/master/demo/STM32_CMAKE/README.md

As far as I can tell, this is a gcc compiler bug, since I've seen no warnings or errors, and the code in question has been running without problems on various platforms for years.

Ozone · ‎2025-04-29

> Try to write good, standard C (ChatGPT can definitely help with this when it's not in hallucination mood). Inline assembly makes your code hard to understand and maintain by a human.

I think exactly this is the problem here.

I have dealt with similar libraries/stacks in the past, which were mostly targeting small and cheap 8-bit and 16-bit architectures, including the respective "professional" compilers with their incompatible extensions and optimizations for said architectures. Not to mention the ugliness, consisting mostly of small sections of actual code in between hundreds of #ifdef lines.

Using or porting such code is a real pain in the ... back.
Most often I concluded writing it from scratch would have been faster, cleaner and easier to maintain - but my project managers used to decide for me ...

Nemui Trinomius · ‎2025-04-30

Dear lamare,

I tested that simple, standard C source code on my STM32F411RE Nucleo with GCC 14.2 and got no problem. What is going wrong?

For this check, I used my simplest STM32F401/411 makefile project as a basis.

The F411RE has the same core as the G431RB, so the same compiler options can be used.
(The "-mpure-code" compiler switch was turned OFF for this check.)

https://nemuisan.blog.bai.ne.jp/?eid=192848#STM32F401xx

Here is my assembler listing (only the routine relevant to the code above is extracted).

08000194 <main>:

int main(void)
{
 8000194:	b480      	push	{r7}
 8000196:	b083      	sub	sp, #12
 8000198:	af00      	add	r7, sp, #0

//local
UCHAR           ucByte;

ucRTUBuf[usRcvBufferPos++] = ucByte;
 800019a:	4b09      	ldr	r3, [pc, #36]	@ (80001c0 <main+0x2c>)
 800019c:	881b      	ldrh	r3, [r3, #0]
 800019e:	b29b      	uxth	r3, r3
 80001a0:	1c5a      	adds	r2, r3, #1
 80001a2:	b291      	uxth	r1, r2
 80001a4:	4a06      	ldr	r2, [pc, #24]	@ (80001c0 <main+0x2c>)
 80001a6:	8011      	strh	r1, [r2, #0]
 80001a8:	4619      	mov	r1, r3
 80001aa:	4a06      	ldr	r2, [pc, #24]	@ (80001c4 <main+0x30>)
 80001ac:	79fb      	ldrb	r3, [r7, #7]
 80001ae:	5453      	strb	r3, [r2, r1]
 80001b0:	2300      	movs	r3, #0
}
 80001b2:	4618      	mov	r0, r3
 80001b4:	370c      	adds	r7, #12
 80001b6:	46bd      	mov	sp, r7
 80001b8:	f85d 7b04 	ldr.w	r7, [sp], #4
 80001bc:	4770      	bx	lr
 80001be:	bf00      	nop
 80001c0:	20000100 	.word	0x20000100
 80001c4:	20000000 	.word	0x20000000

To narrow down where the problem is, check which compiler options are actually passed to
arm-none-eabi-gcc.exe during the build (this may be the cause).

Next, confirm that the linker script is suitable for the STM32G431RB's memory layout.
Do not rely fully on the linker script generated by CubeMX.
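
One way to sanity-check the linker placement from inside the firmware is a small range test on the addresses of the objects in question. This is only a sketch: the `demo_` helpers are hypothetical, and the region base and size are parameters that would have to be filled in from the linker script actually in use.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if [addr, addr + len) lies entirely inside [base, base + size). */
static int demo_in_region(uintptr_t addr, uintptr_t len,
                          uintptr_t base, uintptr_t size)
{
    if (addr < base || len > size) {
        return 0;
    }
    /* Written as a subtraction so that addr + len cannot overflow. */
    return (addr - base) <= (size - len);
}

/* Hypothetical wrapper: stops early if an object is outside the region. */
static void demo_assert_in_ram(const volatile void *obj, uintptr_t len,
                               uintptr_t ram_base, uintptr_t ram_size)
{
    assert(demo_in_region((uintptr_t)obj, len, ram_base, ram_size));
}
```

Called during startup with the address of a buffer or a callback pointer variable, a check like this would flag a symbol that the linker script has placed outside the RAM the device really has. The base and size should come from the MEMORY block of the linker script and be compared against the device's reference manual.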

BTW, I also use ChatGPT, but almost entirely to generate illustrations for myself, not for coding.

Best regards,
Nemui.

Pavel A. · ‎2025-04-30

I vaguely recall some LTCG bug warnings in GNU ARM toolchain and advice to disable LTCG until it is fixed. But the STM32 toolchains had no LTCG (or disabled?) so this thing was moot for us. Need to compare gcc & ld switches and .specs files.

To CMake and raw meat enthusiasts: don't drive faster than your guardian angels fly.

Nemui Trinomius · ‎2025-04-30

Modern build systems (CMake, Meson+Ninja, IDEs with automatic code generation, and so on, as far as I know them) often cover up inherent problems.
They "build" seemingly innocuous code, but the result can stop the heart of the MCU (and then we lament, "XXX doesn't work! What do I do?").

However, we must keep learning modern build systems and not just stick to make.

I also struggled with Meson and Ninja to build a "Windows native" Flashrom.exe.
Thanks to that, I came down with "ninja nande".

This is totally off-topic, sorry.

Cheers,
Nemui.

CTapp.1 · ‎2025-05-21

@lamare wrote:
> So ucRTUBuf is a statically allocated array, and not a pointer, meaning it should be accessed directly, not via an indirect pointer dereference like:

All array accesses involve pointers - the array subscript operator [ ] takes a pointer and an integer parameter:

pointer[ integer ]

with the access targeting memory at

pointer + integer * sizeof( array-element-type )

Whilst the address of a statically-allocated array is known at build time and direct accessing is possible, there is nothing wrong with using an indirect access.

BTW, the following compare as equal as it does not matter in which order the operands are supplied:

a[ i ] == i[ a ]

It would be interesting to see if you get the same code for both forms of access.
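
The equivalence described above can be checked with a small host-side sketch. The buffer name mirrors the thread (`ucRTUBuf`); the buffer size, the `words` array, and the helper names are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the receive buffer in mbrtu.c; the size is an assumption. */
#define DEMO_BUF_SIZE 256u
static volatile unsigned char ucRTUBuf[DEMO_BUF_SIZE];

/* Store via the subscript operator, as in the unmodified source. */
static void store_subscript(size_t pos, unsigned char byte)
{
    ucRTUBuf[pos] = byte;
}

/* Store via an explicit pointer, as in the workaround but without asm. */
static void store_pointer(size_t pos, unsigned char byte)
{
    volatile unsigned char *dest = ucRTUBuf; /* array decays to a pointer */
    dest += pos;
    *dest = byte;
}

/* Asserts that every spelling of the access names the same element. */
static void check_equivalence(size_t pos)
{
    static uint16_t words[8];

    assert(pos < DEMO_BUF_SIZE);
    assert(&ucRTUBuf[pos] == ucRTUBuf + pos);
    assert(&ucRTUBuf[pos] == &pos[ucRTUBuf]);

    /* In bytes: base + index * sizeof(element), shown with 2-byte elements. */
    assert((char *)&words[3] == (char *)words + 3u * sizeof words[0]);
}
```

Compiling `store_subscript` and `store_pointer` with the project's real flags and comparing the disassembly (for example with `arm-none-eabi-objdump -d`) would show whether the two forms produce the same code on the target.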