Trevor Jones
Senior
January 16, 2020
Question

sprintf causes hard fault.


I use sprintf(string, "comment"); everywhere.

Now, for some reason, I am getting a hard fault in an innocuous print:

length += sprintf(String + length, "Received Packet from ");

length is only 15.

String is declared as

char String[256];

I guess it is using malloc.

How do we check that the allocations are being freed?

I never use malloc anywhere, so it's not me...


    12 replies

    Trevor Jones
    Senior
    January 16, 2020

    Added to the heap and stack, but that did not fix the problem:

    in STM32H743ZI_flash.lds

    	.heap (NOLOAD) :
    	{
    		. = ALIGN(4);
    		PROVIDE(__heap_start__ = .);
    		KEEP(*(.heap))
    		. = ALIGN(4);
    		. = . + 0x2000;
    		PROVIDE(__heap_end__ = .);
    	} > SRAM
     
    	.reserved_for_stack (NOLOAD) :
    	{
    		. = ALIGN(4);
    		PROVIDE(__reserved_for_stack_start__ = .);
    		KEEP(*(.reserved_for_stack))
    		. = ALIGN(4);
    		. = . + 0x2000;
    		PROVIDE(__reserved_for_stack_end__ = .);
    	} > SRAM

    Tesla DeLorean
    Guru
    January 16, 2020

    Does length get initialized or corrupt?

    Not sure where the allocator is beyond _sbrk, but on other systems you could try walking the heap list.

    I guess look specifically at where it is faulting, via a listing. See what it is actually using, and whether you can unwind the allocator code and the call tree that gets you here.

    With strings always make sure everything has a NUL termination, especially if there are some memcpy() or "%s" being used.
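
    A minimal sketch of that advice applied to the "String + length" pattern from the question. The helper name append_fmt is hypothetical (it is not in the original code); it only shows how vsnprintf can cap every append at the buffer size so the result always stays NUL-terminated.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: append formatted text to buf at offset len without
 * ever writing past cap bytes. Returns the new length, which is cap - 1
 * when the output had to be truncated. */
size_t append_fmt(char *buf, size_t cap, size_t len, const char *fmt, ...)
{
    assert(buf != NULL && cap > 0 && len < cap);

    va_list ap;
    va_start(ap, fmt);
    /* vsnprintf writes at most (cap - len) bytes, terminating NUL included. */
    int n = vsnprintf(buf + len, cap - len, fmt, ap);
    va_end(ap);

    if (n < 0) {
        buf[len] = '\0';         /* output error: keep the old string intact */
        return len;
    }
    if ((size_t)n >= cap - len)
        return cap - 1;          /* truncated, but still NUL-terminated */
    return len + (size_t)n;
}
```

    Usage would look like length = append_fmt(String, sizeof String, length, "LCD unit "); in place of length += sprintf(String + length, ...). This does not explain the fault by itself, but it rules out a buffer overrun as the cause.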

    Trevor Jones
    Senior
    January 16, 2020

    Thanks for the good advice, @Community member. However, it is very difficult to do any debugging.

    length is definitely only 0x15, while the locally allocated String is 256 bytes.

    Now, how do I unwind the allocator code?

    "specifically at where it is faulting, and via a listing"

    It reports "FRAME OUT OF BOUNDS" in Visual Studio, so I can't see anything.

    Strangely, all the other sprintf lines around this one seem to work OK; only this one is a problem.

    Maybe a code boundary? Stack nesting? I removed a 12-level nested if/else, which did not fix it.

    How can I walk the heap list?

    sprintf always inserts the NUL, and using "String + length" overwrites the previous one anyhow.

    I don't use the %s specifier at all, nor strcpy or memcpy.

    (I use the NUL to copy strings, i.e. while (*ptr) // is not NUL)

    Looking to generate the map file now.

    Trevor Jones
    Senior
    January 16, 2020

    the map file: no boundaries here

    faulty function is inside: ProcessReceivedFrame { }

     .text.ProcessReceivedFrame
     0x0800ba28 0x188 VisualGDB/Debug/nCan.o
     0x0800ba28 ProcessReceivedFrame
     .text.HAL_FDCAN_RxBufferNewMessageCallback
     0x0800bbb0 0x1c VisualGDB/Debug/nCan.o
     0x0800bbb0 HAL_FDCAN_RxBufferNewMessageCallback

    sprintf in map file:

    thumb/fpu/cortex_m7\libc_nano.a(lib_a-sprintf.o)
     0x08001d70 _sprintf_r
     0x08001d70 _siprintf_r
     0x08001dac siprintf
     0x08001dac sprintf
     *fill* 0x08001df0 0x10 

    the heap in Map file, and surrounds

     0x2400fcd4 retSD
     0x2400fcd8 SDPath
     0x2400fcdc SDFile
     0x2400ff0c SDFatFS
     COMMON 0x24010140 0x4 c:/sysgcc/arm-eabi/bin/../lib/gcc/arm-eabi/7.2.0/../../../../arm-eabi/lib/thumb/fpu/cortex_m7\libc_nano.a(lib_a-reent.o)
     0x24010140 errno
     0x24010144 . = ALIGN (0x4)
     0x24010144 _ebss = .
     0x24010144 PROVIDE (__bss_end__, _ebss)
     0x24010144 PROVIDE (end, .)
     
    .heap 0x24010144 0x2000 load address 0x080337d8
     0x24010144 . = ALIGN (0x4)
     [!provide] PROVIDE (__heap_start__, .)
     *(.heap)
     0x24010144 . = ALIGN (0x4)
     0x24012144 . = (. + 0x2000)
     *fill* 0x24010144 0x2000 
     [!provide] PROVIDE (__heap_end__, .)
     
    .reserved_for_stack
     0x24012144 0x2000 load address 0x080337d8
     0x24012144 . = ALIGN (0x4)
     [!provide] PROVIDE (__reserved_for_stack_start__, .)
     *(.reserved_for_stack)
     0x24012144 . = ALIGN (0x4)
     0x24014144 . = (. + 0x2000)
     *fill* 0x24012144 0x2000 
     [!provide] PROVIDE (__reserved_for_stack_end__, .)
     
    .image_storage 0x30000000 0x48000
     0x30000000 . = ALIGN (0x4)
     [!provide] PROVIDE (__Image_Store__, .)
     .image_storage
     0x30000000 0x48000 VisualGDB/Debug/bios.o
     0x30000000 BigBuffer
     
    .aligned_storage
     0x38000000 0x7800
     0x38000000 . = ALIGN (0x4)
     [!provide] PROVIDE (__Aligned_Store__, .)
     .aligned_storage
     0x38000000 0x7800 VisualGDB/Debug/bios.o
     0x38000000 Usart1RxDMABuffer
     0x38000400 Usart1TxDMABuffer
     0x38002400 Usart2RxDMABuffer
     0x38002800 Usart2TxDMABuffer
     0x38003800 string
     0x38004800 flashBuffer
     0x38005800 RamDirectory
    OUTPUT(VisualGDB/Debug/H743_LCD_2 elf32-littlearm)

    Tesla DeLorean
    Guru
    January 16, 2020

    If it corrupts the stack the processor can vector off to locations it has no code to show you. You've seen my fault routines, these allow for an inside-the-box view of the failure if the debugger is unhelpful.
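
    A rough sketch of what such a fault routine has to deal with, assuming the standard Cortex-M exception frame (r0, r1, r2, r3, r12, lr, pc, xPSR pushed in that order). The names fault_frame_t, ram_range_t and read_fault_frame are made up for illustration. The point is the range check: if the stack pointer itself is garbage, reading the frame through it will fault again inside the handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Registers the Cortex-M core pushes on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} fault_frame_t;

/* Hypothetical bounds of the RAM that may legally hold a stack; on a real
 * target these would come from the linker script. */
typedef struct {
    uintptr_t start;
    uintptr_t end;   /* one past the last valid byte */
} ram_range_t;

/* Copy the stacked frame out of sp, but only if sp is word-aligned and all
 * eight words lie inside the given range. Returns false when sp looks
 * corrupt, in which case dereferencing it would fault a second time. */
bool read_fault_frame(const uint32_t *sp, ram_range_t ram, fault_frame_t *out)
{
    assert(out != NULL);

    uintptr_t a = (uintptr_t)sp;
    if ((a & 3u) != 0 || a < ram.start || a > ram.end ||
        ram.end - a < 8 * sizeof(uint32_t))
        return false;

    out->r0  = sp[0];
    out->r1  = sp[1];
    out->r2  = sp[2];
    out->r3  = sp[3];
    out->r12 = sp[4];
    out->lr  = sp[5];
    out->pc  = sp[6];   /* address of the instruction that faulted */
    out->psr = sp[7];
    return true;
}
```

    When the check fails, the useful information is the bad stack pointer value itself rather than the frame contents.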

    Typically the way a heap works is that you have a linked list of allocated and unallocated space; it should fold contiguous unallocated space together during free().

    To get to the list you typically subtract space for a structure in front of the pointer returned by malloc(). For example, after foo = malloc(100); the linked-list structure might be found at foo-32. Look at how free() is implemented, and what it does with the pointer you pass in. There is likely a back/forward pointer and a size.
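
    To make the idea concrete, here is a toy allocator, not newlib's. The two-word header layout, the toy_* names and the bump-style allocation are all assumptions chosen for brevity; the real header in libc_nano looks different. It shows the header sitting just in front of the returned pointer, and a walk that notices when a header has been overwritten.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HEAP_WORDS 64u
#define HDR_WORDS      2u   /* [0] payload size in words, [1] 1 = used, 0 = free */

static uint32_t toy_heap[TOY_HEAP_WORDS];
static uint32_t toy_top;    /* words handed out so far, headers included */

/* Bump allocator: header first, payload right behind it. */
void *toy_alloc(size_t bytes)
{
    if (bytes > (TOY_HEAP_WORDS - HDR_WORDS) * sizeof(uint32_t))
        return NULL;
    uint32_t words = (uint32_t)((bytes + 3u) / 4u);
    if (words + HDR_WORDS > TOY_HEAP_WORDS - toy_top)
        return NULL;

    uint32_t *hdr = &toy_heap[toy_top];
    hdr[0] = words;
    hdr[1] = 1u;
    toy_top += HDR_WORDS + words;
    return hdr + HDR_WORDS;
}

/* free() finds the header by stepping back from the user pointer. */
void toy_free(void *p)
{
    if (p != NULL) {
        uint32_t *payload = p;
        payload[-1] = 0u;       /* the used flag is the word just before the payload */
    }
}

/* Walk every block from the start of the heap. Returns the block count,
 * or -1 as soon as a header is inconsistent (a sign of corruption). */
int toy_walk(uint32_t *used_words)
{
    uint32_t off = 0, used = 0;
    int blocks = 0;

    while (off < toy_top) {
        if (toy_top - off < HDR_WORDS)
            return -1;
        uint32_t size = toy_heap[off];
        uint32_t flag = toy_heap[off + 1];
        if (flag > 1u || size > toy_top - off - HDR_WORDS)
            return -1;
        if (flag)
            used += size;
        off += HDR_WORDS + size;
        blocks++;
    }
    if (used_words != NULL)
        *used_words = used;
    return blocks;
}
```

    Writing a few bytes past the end of one payload lands in the next block's header, which is exactly what the walk detects.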

    Not saying it is the heap; the stack tends to be the biggest issue. In ST/GCC they usually have the heap and stack together, allowing one to crash into the other.

    Trevor Jones
    Senior
    January 16, 2020

    I have set the heap and stack to 8 Kbytes each. I wouldn't think that is the issue: after doubling them, it still fails in the same sprintf.

    regarding your hardfault handler:

    this is declared

    /* These types MUST be 32-bit */
    typedef long		LONG;
    typedef unsigned long	DWORD;
     
    /* This type MUST be 64-bit (Remove this for ANSI C (C89) compatibility) */
    typedef unsigned long long QWORD;

    Looking inside your hard fault handler, it just fails:

    void hard_fault_handler_c(unsigned int * hardfault_args, unsigned int r4, unsigned int r5, unsigned int r6)
    {
    	DWORD reg1 = hardfault_args[1]; // <- crashes here; single-stepping fails at this line
    	DWORD reg2 = hardfault_args[2];
    	DWORD reg3 = hardfault_args[0];
    	DWORD reg4 = r4;

    I have the data cache off; will try with the instruction cache off as well.

    Trevor Jones
    Senior
    January 16, 2020

    still fails with ICache OFF,

    here is the errant code:

    	 ProcessReceivedFrame(dataChannel, FifoMsgLength); // check if this packet contains a valid address
     
    	 { 
    		 if (showRxCanFrames) {
    			 char String[256];
    			 int length = 0;
    			 length += sprintf(String + length, "Received Packet from "); // this line works
    	 
    			 if (have_LCD)
    				 length += sprintf(String + length, "LCD unit ");
    			 //else
    			 if(have_Facex)	
    					length += sprintf(String + length, "Facex unit ");	 
    			 //else		
    			 if(have_FA4)
    					length += sprintf(String + length, "FA4 unit ");
    			 //else		
    			 if(have_FA1)
    					length += sprintf(String + length, "FA1 unit ");
    			 //else
    			 if(have_FA2)
    					length += sprintf(String + length, "FA2 unit ");	 
    			 //else
    			 if(have_AimTA)
    					length += sprintf(String + length, "AIM_TA unit ");
    			 //else
    			 if(have_AimTC) 
    					length += sprintf(String + length, "AIM_TC unit "); // <- fails here
    			 // else
    			 if(have_AimHTC)

    Tesla DeLorean
    Guru
    January 16, 2020

    Can you change the behaviour with a "static char String[256]"?

    Doesn't look bad..

    Does length look problematic? Clearly you don't have 256 chars in the preceding sprintf calls.

    Trevor Jones
    Senior
    January 16, 2020

    nope, same result,

    attached.... with code stack shown

    indicator of failed point...

    Trevor Jones
    Senior
    January 16, 2020

    will bypass that code.

    berendi
    Principal
    January 16, 2020

    This smells very much like a corrupted heap.

    Show your _sbrk() function, and set a breakpoint on it to see whether all of its variables make sense, and the heap is where you think it is.

    Make sure you don't ever call (s)printf in an interrupt handler.

    Check the stack pointer register, does it make sense? If the stack is in DTCM (0x20000000 - 0x2001FFFF), malloc() gets badly confused.

    There is a FatFs buffer before the stack. Check FatFs configuration, and make sure the disk i/o function doesn't overflow.

    Does the problem go away if you don't use FatFs?

    Set errno (the variable sitting between the fatfs buffer and the heap) to a guard value, e.g. 0xDEADBEEF early in main(). See if it ever gets changed when it shouldn't, i.e. returning from a failing library function.

    Trevor Jones
    Senior
    January 16, 2020

    I have newlib with floating-point support in printf running.

    Yes, I do use FatFs, but I can't disable it easily and don't want to break it :(

    I don't use it currently, but it is installed.

    I have set the stack and heap to 8K each.

    I can't find the _sbrk function.

    I can't find the errno variable.

    I don't ever call sprintf from an interrupt.

    The stack/heap was at 0x24000000; it is now moved to the RAM_D3 area successfully.

    Trying to clear the new stack/heap area.

    I can't seem to declare a variable in the .lds file (shown below the reset routine):

    void __attribute__((naked, noreturn)) Reset_Handler()
    {
    	//Normally the CPU should set up the stack pointer based on the value from the first entry in the vector table.
    	//If you encounter problems with accessing stack variables during initialization, ensure the line below is enabled.
    	#ifdef sram_layout
    	asm ("ldr sp, =_estack");
    	#endif
     
    	void **pSource, **pDest;
    	for (pSource = &_sidata, pDest = &_sdata; pDest != &_edata; pSource++, pDest++)
    		*pDest = *pSource;
     
    	for (pDest = &_sbss; pDest != &_ebss; pDest++)
    		*pDest = 0;
    	for (pDest = &_heapStart; pDest != &_stackEnd; pDest++) // my new variables won't compile
    			*pDest = 0;
     
    	SystemInit();
    	__libc_init_array();
    	(void)main();
    	for (;;) ;
    }
     
    	.heap (NOLOAD) :
    	{
    		. = ALIGN(4);
    		_heapStart =.;
    		PROVIDE(__heap_start__ = _heapStart);
    		KEEP(*(.heap))
    		. = ALIGN(4);
    		. = . + 0x2000;
    		PROVIDE(__heap_end__ = .);
    	} > RAM_D3
     
    	.reserved_for_stack (NOLOAD) :
    	{
    		. = ALIGN(4);
    		PROVIDE(__reserved_for_stack_start__ = .);
    		KEEP(*(.reserved_for_stack))
    		. = ALIGN(4);
    		. = . + 0x2000;
    		_stackEnd =.;
    		PROVIDE(__reserved_for_stack_end__ = _stackEnd);
    	} > RAM_D3

    currently the Memory area for the stack/heap is full of uninitialized data
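
    A sketch of one common way to make that area useful for debugging: paint it with a known pattern instead of zero, then count how much of the pattern survives. The function names and the 0xA5A5A5A5 pattern are arbitrary choices, and the code runs on a plain array here so it can be tried on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAINT_WORD 0xA5A5A5A5u

/* Fill [start, end) with a known pattern. On the target, start and end would
 * be the addresses of linker symbols such as _heapStart and _stackEnd, which
 * C only sees after a declaration like: extern uint32_t _heapStart, _stackEnd; */
void paint_region(uint32_t *start, uint32_t *end)
{
    assert(start != NULL && start <= end);
    for (uint32_t *p = start; p != end; p++)
        *p = PAINT_WORD;
}

/* A Cortex-M stack grows downward from end toward start, so the words that
 * were never touched are at the low end. Counting them gives the headroom. */
size_t unused_words(const uint32_t *start, const uint32_t *end)
{
    size_t n = 0;
    while (start != end && *start == PAINT_WORD) {
        start++;
        n++;
    }
    return n;
}
```

    If this is done from the reset routine, the loop presumably has to stop below the current stack pointer, otherwise it paints over whatever is already on the stack.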

    Trevor Jones
    Senior
    January 16, 2020

    I am using Visual Studio.

    Will add that include in the morning; at home now.

    newlib is precompiled, so I guess I won't be looking in there.

    The reset handler is installed from Cube, so it is HAL-based.

    Please understand this error occurs only in this place; sprintf works everywhere else without issue.

    berendi
    Principal
    January 16, 2020

    _sbrk is not part of newlib, it must be provided by the user application. See https://sourceware.org/newlib/libc.html#Syscalls
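
    For orientation, a sketch of what such a function usually does, written against a static array so it runs on a PC. The names demo_arena and demo_sbrk are stand-ins; a real _sbrk would take its bounds from linker symbols (for instance the __heap_start__ and __heap_end__ from the script earlier in the thread) and may use a different signature depending on the toolchain's headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in arena; on the target these bounds come from the linker script. */
static unsigned char demo_arena[1024];
static unsigned char *demo_brk = demo_arena;   /* current end of the heap */

/* Usual sbrk contract: move the break by incr bytes and return the previous
 * break, or (void *)-1 with errno set to ENOMEM if the move does not fit. */
void *demo_sbrk(ptrdiff_t incr)
{
    ptrdiff_t used = demo_brk - demo_arena;
    ptrdiff_t room = (ptrdiff_t)sizeof demo_arena - used;

    if (incr > room || incr < -used) {
        errno = ENOMEM;
        return (void *)-1;
    }

    unsigned char *prev = demo_brk;
    demo_brk += incr;
    return prev;
}
```

    A breakpoint on the real function shows whether the break starts where the map file says the heap is, and whether the limit check is there at all.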

    HAL has Reset_Handler() written in assembly. Yours is written in C, so it doesn't come from Cube.

    Your Reset_Handler has the naked attribute, which shouldn't be used for C functions. See https://gcc.gnu.org/onlinedocs/gcc/ARM-Function-Attributes.html

    Sorry, I can't help you any further, this stuff is messed up beyond all recognition. Get a sane toolchain.