2018-02-12 02:52 AM
Hello,
I need help: we are getting hard faults on the STM32F427 MCU.
The same software runs perfectly on some MCUs but crashes on others, and we are not able to understand the reason.
Screenshots of the debugging sessions are attached.
Any ideas?
Many Thanks,
Diego
#hard-fault #undefinstr #nocp #stm32f427
2018-02-12 03:18 AM
Looks like you use an OS and tasks, which makes it a bit more difficult.
The same software runs perfectly on some MCUs but crashes on others, and we are not able to understand the reason.
I understand this as 'crashes on some boards, runs fine on others'.
This might be a clue, but not necessarily. I expect all boards (MCUs?) to be identical otherwise.
Seemingly 'random' crashes are often caused by resource overflow/corruption (mostly stack) due to external, asynchronous events (interrupts).
The path name of your images implies CoDeSys, which is ... not quite small.
Besides providing more detail about your hardware/software environment, I would first work backwards and decode the hardfault reason from the SCB registers.
Try more than one hardfault event. Is the place and context the same, or does it change?
The second step would presumably be to instrument the routines possibly causing it.
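To illustrate what "decode the hardfault reason from the SCB registers" can look like, here is a minimal sketch. It assumes you have already read the 32-bit value of SCB->CFSR (address 0xE000ED28 on a Cortex-M4), either in the debugger or in the fault handler. The names `cfsr_flag_t` and `cfsr_decode` are made up for this example; only the bit positions come from the ARM documentation.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One named bit of the Cortex-M4 Configurable Fault Status Register. */
typedef struct {
    uint32_t    mask;
    const char *name;
} cfsr_flag_t;

static const cfsr_flag_t cfsr_flags[] = {
    /* MemManage fault status (bits 0..7) */
    { 1UL << 0,  "IACCVIOL"    }, { 1UL << 1,  "DACCVIOL" },
    { 1UL << 3,  "MUNSTKERR"   }, { 1UL << 4,  "MSTKERR"  },
    { 1UL << 5,  "MLSPERR"     }, { 1UL << 7,  "MMARVALID" },
    /* Bus fault status (bits 8..15) */
    { 1UL << 8,  "IBUSERR"     }, { 1UL << 9,  "PRECISERR" },
    { 1UL << 10, "IMPRECISERR" }, { 1UL << 11, "UNSTKERR" },
    { 1UL << 12, "STKERR"      }, { 1UL << 13, "LSPERR"   },
    { 1UL << 15, "BFARVALID"   },
    /* Usage fault status (bits 16..31) */
    { 1UL << 16, "UNDEFINSTR"  }, { 1UL << 17, "INVSTATE" },
    { 1UL << 18, "INVPC"       }, { 1UL << 19, "NOCP"     },
    { 1UL << 24, "UNALIGNED"   }, { 1UL << 25, "DIVBYZERO" },
};

/* Writes the names of all set flags into 'out', separated by spaces.
 * Returns the number of flags written; stops early if 'out' is full. */
int cfsr_decode(uint32_t cfsr, char *out, size_t out_len)
{
    int    count = 0;
    size_t used  = 0;

    if (out_len == 0) {
        return 0;
    }
    out[0] = '\0';

    for (size_t i = 0; i < sizeof cfsr_flags / sizeof cfsr_flags[0]; i++) {
        if (cfsr & cfsr_flags[i].mask) {
            int n = snprintf(out + used, out_len - used, "%s%s",
                             count ? " " : "", cfsr_flags[i].name);
            if (n < 0 || (size_t)n >= out_len - used) {
                out[used] = '\0';   /* drop the partially written name */
                break;
            }
            used += (size_t)n;
            count++;
        }
    }
    return count;
}
```

For example, a CFSR value of 0x00090000 decodes to "UNDEFINSTR NOCP". If PRECISERR is reported together with BFARVALID, SCB->BFAR (0xE000ED38) holds the address of the faulting access.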
2018-02-12 03:47 AM
Look at the code that's actually faulting.
For chip/board specific issues and instability, look at voltages and capacitors on VCAP pins.
2018-02-12 04:43 AM
Yes, the very strange thing is that it happens on some boards and not on others.
I tried replacing the MCU on one board, and that board now works properly, with none of the earlier problems.
Somehow I have the feeling that something is wrong with the MCU itself, but honestly I'm not sure.
The place is not always the same; there seem to be 5 or 6 different program counter values that repeat, but...
The errors are the usage faults NOCP, UNDEFINSTR and INVSTATE; a few times I also get the bus fault PRECISERR.
Another strange thing: if I set the PC to the value reported by the hard fault handler and single-step, sometimes the instruction executes and sometimes it does not; the program seems to get stuck there with no way to go further.
Where can I see the SCB registers?
About interrupts, I was thinking about that too, because one way to make the problem more frequent is to increase the number of Tx/Rx interrupts from the CAN bus interface. But again, it is very strange that it happens quite easily on some boards and never on others.
Many thanks; waiting for further info.
2018-02-12 04:53 AM
We can't find the code that is faulting, because the PC is always different and it points to places where the code normally works.
The hardware could be the point. On the one board where we tried it, changing the MCU appears to have solved the problem. Maybe it is not the microcontroller but the soldering of a VCAP pin, as you suggest.
I will have a look.
Thanks
2018-02-12 05:00 AM
The place is not always the same; there seem to be 5 or 6 different program counter values that repeat, but...
The errors are the usage faults NOCP, UNDEFINSTR and INVSTATE; a few times I also get the bus fault PRECISERR.
This is a typical symptom of a stack overflow, when 'odd' variables are interpreted as return addresses.
I would use a stack-check feature. Some toolchains have this option, and FreeRTOS too.
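If your toolchain or OS has no built-in stack check, a common do-it-yourself variant is to "paint" the stack with a known pattern at startup and later count how much of the pattern is still intact. The following is only a sketch under assumptions: `stack_paint`, `stack_unused_words` and the pattern value are names chosen for this example, and you would pass in the base address and size of your own stack region (from the linker script or the task creation call).

```c
#include <stddef.h>
#include <stdint.h>

/* Arbitrary fill value; any pattern unlikely to occur in real data works. */
#define STACK_FILL_PATTERN 0xA5A5A5A5UL

/* Fill a stack region with the pattern. Call this before the region is
 * used as a stack, e.g. early at startup or before a task is created. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL_PATTERN;
    }
}

/* Count the words that still hold the pattern, starting at the lowest
 * address. The Cortex-M stack grows downward, so this is the headroom
 * that was never touched. A result of 0 means the stack has reached
 * (or gone past) its lower limit at some point. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;

    while (n < words && base[n] == STACK_FILL_PATTERN) {
        n++;
    }
    return n;
}
```

Checking the result periodically (or in the fault handler) for every task stack and for the main/interrupt stack shows which one is running low. FreeRTOS provides the same idea ready-made through `configCHECK_FOR_STACK_OVERFLOW` and `uxTaskGetStackHighWaterMark()`.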
About interrupts, I was thinking about that too, because one way to make the problem more frequent is to increase the number of Tx/Rx interrupts from the CAN bus interface.
This is another overflow symptom, because interrupts pound on the stack, too.
Though they (interrupts) are not the cause, they tend to add a 'random' note and complicate debugging.
Code instrumentation is good for synchronous problems (i.e. systematic bugs in the code), but not for stack overflows.
Where to see SCB register?
2018-02-12 06:10 AM
But why on some MCUs and not on others?
If we have a stack overflow, we should see the same 'random' problem on every board under the same conditions, shouldn't we?
I have difficulty understanding how a software problem can cause trouble on one piece of hardware and not on another...
2018-02-12 06:39 AM
2018-02-12 07:09 AM
A correlation of hardfaults and individual MCUs is not very likely.
One exception that comes to my mind: your Flash interface settings (clock rates, wait states) are at the limit.
Your firmware might contain a race condition, which could depend on hardware differences (like delays), for example nested interrupts or specific error conditions in interrupts.
You can try to implement stub code for the other exceptions that escalate to hardfaults if unhandled.
Do you use the FPU, and the long stack frame (with FPU regs)?
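Whether a given exception used the long (FPU) stack frame can be read from the EXC_RETURN value that is in LR on handler entry. Here is a small sketch of that decoding for a Cortex-M4 with FPU; the struct and function names are invented for this example, only the bit meanings are from the ARM documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* What the EXC_RETURN value (LR on exception entry) says about the
 * interrupted context. */
typedef struct {
    bool     used_psp;          /* bit 2: 1 = frame is on PSP, 0 = on MSP   */
    bool     from_thread_mode;  /* bit 3: 1 = Thread mode was interrupted   */
    bool     has_fpu_frame;     /* bit 4: 0 = extended frame with FPU regs  */
    uint32_t frame_words;       /* size of the stacked frame in 32-bit words */
} exc_return_info_t;

exc_return_info_t exc_return_decode(uint32_t exc_return)
{
    exc_return_info_t info;

    info.used_psp         = (exc_return & (1UL << 2)) != 0;
    info.from_thread_mode = (exc_return & (1UL << 3)) != 0;
    info.has_fpu_frame    = (exc_return & (1UL << 4)) == 0;

    /* Basic frame: R0-R3, R12, LR, PC, xPSR = 8 words.
     * Extended frame adds S0-S15, FPSCR and one reserved word = 26 words.
     * One extra padding word may follow if the stack was not 8-byte
     * aligned at entry (flagged by bit 9 of the stacked xPSR). */
    info.frame_words = info.has_fpu_frame ? 26u : 8u;

    return info;
}
```

So an interrupt that preempts code using the FPU can cost 104 bytes of stack instead of 32, per nesting level. That is worth including when you estimate the stack sizes.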
2018-02-12 07:31 AM
I've already tried slowing down from 180 MHz to 120 MHz, but the behaviour is the same.
Here is my system init routine:
void SystemInit(void)
{
  /* FPU settings ------------------------------------------------------------*/
#if (__FPU_PRESENT == 1) && (__FPU_USED == 1)
  SCB->CPACR |= ((3UL << 10*2)|(3UL << 11*2)); /* set CP10 and CP11 Full Access */
#endif

  /* Reset the RCC clock configuration to the default reset state ------------*/
  /* Set HSION bit */
  RCC->CR |= (uint32_t)0x00000001;

  /* Reset CFGR register */
  RCC->CFGR = 0x00000000;

  /* Reset HSEON, CSSON and PLLON bits */
  RCC->CR &= (uint32_t)0xFEF6FFFF;

  /* Reset PLLCFGR register */
  RCC->PLLCFGR = 0x24003010;

  /* Reset HSEBYP bit */
  RCC->CR &= (uint32_t)0xFFFBFFFF;

  /* Disable all interrupts */
  RCC->CIR = 0x00000000;

#if defined (DATA_IN_ExtSRAM) || defined (DATA_IN_ExtSDRAM) || defined (PREMAIN_FSMC_SETUP) /* Keil */
  SystemInit_ExtMemCtl();
#endif /* DATA_IN_ExtSRAM || DATA_IN_ExtSDRAM || defined (PREMAIN_FSMC_SETUP) */ /* Keil */

  /* Configure the System clock source, PLL Multiplier and Divider factors,
     AHB/APBx prescalers and Flash settings ----------------------------------*/
  SetSysClock();

  /* Configure the Vector Table location add offset address ------------------*/
#ifdef VECT_TAB_SRAM
  SCB->VTOR = SRAM_BASE | VECT_TAB_OFFSET; /* Vector Table Relocation in Internal SRAM */
#else
  SCB->VTOR = FLASH_BASE | VECT_TAB_OFFSET; /* Vector Table Relocation in Internal FLASH */
#endif
}
where...
#define __FPU_PRESENT 1 /*!< FPU present */
so I suppose it is in use.
Do I have to do for the other faults what I do below for the hard fault handler?
HardFault_Handler\
                PROC
                EXPORT  HardFault_Handler         [WEAK]
                TST     LR, #4
                ITE     EQ
                MRSEQ   R0, MSP
                MRSNE   R0, PSP
                B       hard_fault_handler_c
                ENDP
MemManage_Handler\
                PROC
                EXPORT  MemManage_Handler         [WEAK]
                B       .
                ENDP
BusFault_Handler\
                PROC
                EXPORT  BusFault_Handler          [WEAK]
                B       .
                ENDP
UsageFault_Handler\
                PROC
                EXPORT  UsageFault_Handler        [WEAK]
                B       .
                ENDP
Can I find a sample of the other fault handling in C language?
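As a rough idea of the C side, here is a minimal sketch of what the function called from the assembly stub could do with the stack pointer passed in R0. The names `fault_frame_t`, `last_fault` and `fault_capture` are invented for this example and are not part of any ST or ARM library. The stacking order of the eight words is the one defined by the architecture for the basic exception frame.

```c
#include <stdint.h>

/* Registers pushed automatically by the core on exception entry,
 * in the order they appear on the stack (lowest address first). */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* return address of the interrupted function   */
    uint32_t pc;    /* address of the instruction that faulted      */
    uint32_t xpsr;  /* bit 24 (T) clear here goes with INVSTATE     */
} fault_frame_t;

/* Kept in a global so it can be inspected in the debugger or logged
 * after a reset. */
volatile fault_frame_t last_fault;

/* 'stacked' is the value of MSP or PSP selected by the assembly stub. */
void fault_capture(const uint32_t *stacked)
{
    last_fault.r0   = stacked[0];
    last_fault.r1   = stacked[1];
    last_fault.r2   = stacked[2];
    last_fault.r3   = stacked[3];
    last_fault.r12  = stacked[4];
    last_fault.lr   = stacked[5];
    last_fault.pc   = stacked[6];
    last_fault.xpsr = stacked[7];
}

/* On the target, the C entry point named in the stub could then be:
 *
 *   void hard_fault_handler_c(uint32_t *stacked)
 *   {
 *       fault_capture(stacked);
 *       for (;;) { }    // halt here with the data preserved
 *   }
 */
```

The same stub (TST LR, #4 and so on) can be put in front of MemManage_Handler, BusFault_Handler and UsageFault_Handler, each branching to its own C function. Note that those three handlers only run if they are enabled through the MEMFAULTENA, BUSFAULTENA and USGFAULTENA bits in SCB->SHCSR; otherwise the faults escalate to the hard fault handler, as you see now.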