
Strange STM32L152CBT6 Hard Fault issues

dadman
Associate II
Posted on January 21, 2014 at 22:43

Hi all, 

I've spent three strange nights trying to find a solution to the following issue.

No success yet.

I'd be very glad for any advice on how to solve it.

I've designed a board with an STM32L152CBT6 (48 pins)

Low-power MCU, 128 KB Flash, 16 KB RAM

Running on 32 MHz HSI

gcc 4.7.2 CodeSourcery, C/C++ with stdlib.

Stack size 0x400

The board contains just the MCU, SWD pins for a Segger J-Link debug connection, a CC11001 868 MHz RF module from Alibaba attached via a 2 mm pitch header, and an LED.

The power source is a 1 A TracoPower DC/DC converter at 3.3 V.

The MCU has proper decoupling capacitors: 100 nF (plus 1 uF for VDDA).

The MCU uses SPI2 (PB12 CS, PB13 CLK, PB14 MISO, PB15 MOSI) and PB5 for IRQ handler EXTI_Line5.

I'm running my RF packet library for the Radio as a test case.

First I should mention that I run the same library and test without any issue on the STM32F4 Discovery kit, on a self-designed board with an STM32F103VC, and on the STM32L152 Discovery board.

The STM32L152RBT6 on that last board is almost the same MCU with the same memory configuration, just with 64 pins.

So what is the issue:

The issue is Hard Faults during execution. They occur irregularly, but typically when a packet is sent via RF. Sometimes it happens on the first packet; sometimes the board runs stably, but the Hard Fault occurs when I touch the board. Sending a packet draws up to 30 mA.

After three days of testing I have found a workaround.

It is enough to put this code into the board initialization:

GPIO_InitStructure.GPIO_Mode = GPIO_Mode_OUT;

GPIO_InitStructure.GPIO_OType = GPIO_OType_PP;

GPIO_InitStructure.GPIO_PuPd  = GPIO_PuPd_NOPULL;

GPIO_InitStructure.GPIO_Speed = GPIO_Speed_40MHz;

GPIO_InitStructure.GPIO_Pin =  GPIO_Pin_6;

GPIO_Init(GPIOB, &GPIO_InitStructure);

The code only initializes pin PB6 as an output, with the GPIOB clock already running.

That's enough. The pin does not need to be set or cleared, and the board immediately runs perfectly stably, like the other boards I mentioned before.

PB6 is connected to ground via an SMD LED and a 750R resistor.

The voltage on the pin is about 5 mV.

When I removed the LED from the board, the Hard Fault returned immediately.

I built another copy of the board to rule out a problem with the MCU, the PCB, the soldering, etc.

And the result? The second board has the same strange problem.

Thank you very much for your advice

Regards

Jakub

#stm-forum-software-pisses-me-off #stm32-hard-fault
31 REPLIES 31
Posted on January 21, 2014 at 23:50

So what can you tell me about the assembler instructions that are faulting, and the register content when that occurs? The subroutines that code is associated with?

Have you checked the state and maximal depth of the stack?
Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..
os_kopernika
Associate II
Posted on January 21, 2014 at 23:57

You didn't provide much information.

What was the cause of the HardFault?

A uC does not raise this IRQ just because of a bad mood.

And of course you did analyze the call stack during the crash?

P.S. Have I already written that this forum software pisses me off? Not today, I think.

dadman
Associate II
Posted on January 22, 2014 at 00:16

Additional information:

Hard Fault handler routine (see attachment for ASM code) output:

[Hard fault handler - all numbers in hex]

R0 = 20004000

R1 = 800c405 - ResetHandler

R2 = 800c459 - NMIHandler

R3 = 80001d8 - HardFault_handler

R12 = 800c581 - MemManage_handler

LR [R14] = 800c589 subroutine call return address - BusFaultHandler

PC [R15] = 800c591 program counter - UsageFault Handler

PSR = 0

BFAR = e000edf8

CFSR = 30000

HFSR = 40000000

DFSR = 3

AFSR = 0

SCB_SHCSR= 0
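The CFSR and HFSR values in the dump above can be broken into named bits. This is a minimal sketch, assuming the values were really read from the SCB registers and assuming the ARMv7-M (Cortex-M3) bit layout; the macro and function names are made up for illustration and are not from any ST library.

```c
#include <stdint.h>
#include <stdio.h>

/* HFSR bits (ARMv7-M layout). */
#define HFSR_VECTTBL    (1u << 1)   /* fault while reading the vector table */
#define HFSR_FORCED     (1u << 30)  /* a configurable fault was escalated   */

/* UFSR occupies the upper 16 bits of CFSR. */
#define UFSR_UNDEFINSTR (1u << 0)   /* undefined instruction                */
#define UFSR_INVSTATE   (1u << 1)   /* invalid state, e.g. Thumb bit clear  */
#define UFSR_INVPC      (1u << 2)   /* invalid EXC_RETURN / PC load         */
#define UFSR_NOCP       (1u << 3)   /* coprocessor access                   */
#define UFSR_UNALIGNED  (1u << 8)   /* unaligned access (if trapping is on) */
#define UFSR_DIVBYZERO  (1u << 9)   /* divide by zero (if trapping is on)   */

/* Hypothetical helper: extract the UsageFault status half of CFSR. */
uint32_t ufsr_from_cfsr(uint32_t cfsr)
{
    return (cfsr >> 16) & 0xFFFFu;
}

/* Hypothetical helper: print the set bits in human-readable form. */
void print_fault_status(uint32_t cfsr, uint32_t hfsr)
{
    uint32_t ufsr = ufsr_from_cfsr(cfsr);

    if (hfsr & HFSR_FORCED)     printf("HFSR: FORCED (escalated fault)\n");
    if (hfsr & HFSR_VECTTBL)    printf("HFSR: VECTTBL\n");
    if (ufsr & UFSR_UNDEFINSTR) printf("UFSR: UNDEFINSTR\n");
    if (ufsr & UFSR_INVSTATE)   printf("UFSR: INVSTATE\n");
    if (ufsr & UFSR_INVPC)      printf("UFSR: INVPC\n");
    if (ufsr & UFSR_NOCP)       printf("UFSR: NOCP\n");
    if (ufsr & UFSR_UNALIGNED)  printf("UFSR: UNALIGNED\n");
    if (ufsr & UFSR_DIVBYZERO)  printf("UFSR: DIVBYZERO\n");
}
```

Calling print_fault_status(0x30000, 0x40000000) with the values listed above would report FORCED together with UNDEFINSTR and INVSTATE, which points at a UsageFault escalated to a Hard Fault rather than at a bus error.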

Registers when Hard Fault occurred

0690X0000060550QAA.png

Eclipse stack trace

0690X0000060555QAA.png

And the code where the Hard Fault occurred: a wait loop, just decrementing a uint32_t and waiting for a volatile integer to be modified from the IRQ handler.

0690X00000604xXQAQ.png

Thank you very much for your help.

________________

Attachments :

registers.png : https://st--c.eu10.content.force.com/sfc/dist/version/download/?oid=00Db0000000YtG6&ids=0680X000006I0fB&d=%2Fa%2F0X0000000bc5%2FDLcCUXAPfbHEm568l9XGavMLhdXK_OUUe86e7s01RTo&asPdf=false

stack_eclipse.png : https://st--c.eu10.content.force.com/sfc/dist/version/download/?oid=00Db0000000YtG6&ids=0680X000006I0f6&d=%2Fa%2F0X0000000bc6%2FR0godK7uoy_O8KuElt.k3SrLLP_kE5zduQU0pqITLRc&asPdf=false
os_kopernika
Associate II
Posted on January 22, 2014 at 02:12

STM Forum "software" pissed me off again two hours after it had pissed me off for the 255th time yesterday.

Ok, I am out.

You need someone more patient, like clive.

I wonder if the post crashes again........

Posted on January 22, 2014 at 02:42

Your Hard Fault handler hex output suggests it's dumping the vector table (at zero), and not register or stack content.

PSP = 0 ??

It would be helpful if you display the other register list in hex, you look at the faulting instructions in assembler, and you hex dump the stack.

If it really goes into process mode, as perhaps a result of a corrupted stack return from interrupt, and the stack is pointing at zero, you can surely expect it to fault.

So figure out the scope of your stack (0x20004000 .. 0x20003F58), see how deep it really gets, look at what you have as local/auto allocations within normal code, and under interrupt. Fill the stack with a known marker so you can observe the low water mark.
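The marker technique described above can be sketched as follows. This is a host-side model operating on a plain array, assuming a full-descending stack (as on Cortex-M) so that untouched words sit at the low end; the marker value and the function names are arbitrary choices for illustration. On the target, only the region below the current stack pointer may be painted, and the bounds would come from the linker script.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_MARKER 0xDEADBEEFu  /* arbitrary, unlikely-to-occur pattern */

/* Fill the given stack region with the marker before the code under test runs. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_MARKER;
    }
}

/*
 * Count the words still holding the marker, scanning up from the lowest
 * address. Because the stack grows downwards, this is the headroom that
 * was never used (the low water mark).
 */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t unused = 0;
    while (unused < words && base[unused] == STACK_MARKER) {
        unused++;
    }
    return unused;
}
```

If stack_unused_words() ever returns 0, or a very small number, the stack has reached or nearly reached its lower bound and an overflow into adjacent RAM becomes a plausible cause of the fault.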
dadman
Associate II
Posted on January 22, 2014 at 10:14

I did several tests and found another workaround for the HardFault issue.

I attached an oscilloscope probe to the SPI MISO pin and to the IRQ pin from the RF module, and the problem disappeared; the application runs perfectly stably. (Initializing the PB6 pin has the same effect.)

When the probe is disconnected, it Hard Faults immediately.

I also tried connecting the IRQ pin and SPI MISO to ground via 10k resistors, but that doesn't help.

I also tried connecting 100 pF from the IRQ pin and SPI MISO to ground, and that helped a little. The system is now stable, but when I touched the bottom of the PCB the Hard Fault occurred again.

I'm more and more convinced that this is a hardware-related problem around the GPIO or the SPI bus.

Thanks for any advice.

Jakub

 
dadman
Associate II
Posted on January 22, 2014 at 10:14

Registers when Hard fault occurred:

R0 -      0x7a

R1 -      0x200007bb

R2 -      0

R3 -      0x20004014

R4 -      0x800c439

R5-       0x848d

R6-       0x80001d8

R7-       0x85b5

R8-       0

R9-       0

R10-     0

R11-     0

R12-     0

SP-      0x20003f40

LR-       0xfffffff9

PC-      0x80001d8

XPSR-  0x41000003

MSP-   0x20003f40

PSP-          0

PRIMASK-  0

BASEPRI-   0

FAULTMASK 0

CONTROL   0

PSP is 0 during the whole program execution.
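The LR and xPSR values in the register list above can be decoded by hand. Here is a small sketch of that decoding, assuming the ARMv7-M encoding for a core without FPU (as on Cortex-M3); the function names are hypothetical helpers, not part of CMSIS.

```c
#include <stdint.h>

/* Nonzero if the value has the EXC_RETURN signature in its upper bits. */
int is_exc_return(uint32_t lr)
{
    return (lr & 0xFFFFFFF0u) == 0xFFFFFFF0u;
}

/* EXC_RETURN bit 3: 1 = return to Thread mode, 0 = return to Handler mode. */
int exc_return_to_thread(uint32_t lr)
{
    return (int)((lr >> 3) & 1u);
}

/* EXC_RETURN bit 2: 1 = frame was stacked on PSP, 0 = on MSP. */
int exc_return_uses_psp(uint32_t lr)
{
    return (int)((lr >> 2) & 1u);
}

/* xPSR bits 8:0 hold the active exception number (3 = HardFault). */
uint32_t xpsr_exception_number(uint32_t xpsr)
{
    return xpsr & 0x1FFu;
}

/* xPSR bit 24 is the Thumb state bit; it must be 1 on Cortex-M. */
int xpsr_thumb_bit(uint32_t xpsr)
{
    return (int)((xpsr >> 24) & 1u);
}
```

For the values listed above, LR = 0xfffffff9 decodes as a return to Thread mode with the frame on MSP, and XPSR = 0x41000003 decodes as exception number 3 (HardFault) with the Thumb bit set, which is consistent with PSP never being used.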

The IRQ handler is very simple.

0690X00000604hOQAQ.png

The problem occurs in routine for sending packets.

0690X000006052RQAQ.png

I've traced all the instructions in the debugger up to the place where the Hard Fault occurred.

No suspicious local variables (only int result codes etc). 

No memory operations like memcpy on stack variables.

Stack dump after Hard fault

0690X00000604v2QAA.png

Another run with stack markers

0690X00000604hiQAA.png

Posted on January 22, 2014 at 16:58

I can't particularly understand why the Hard Fault routine was passed 0; here, with an LR entry of 0xFFFFFFF9, it pulls the stack from MSP.

Looking at the stack frame as is, we have

0x20 r0

0x20 r1

1 r2

0x20003F50 r3

0x20003F78 r12

0xFFFFFFF9 lr

0x00020000 pc

So it looks to be coming from some other interrupt or fault (LR), the PC is invalid (EVEN)

Your code itself looks innocuous, even without the disassembly, or knowing if the variables are volatile. The issue is elsewhere.

You should examine all your interrupt vector entries, looking for the unfilled ones, or ones containing 0x20000
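A vector table check along those lines could look like the sketch below. It assumes the table has been copied or dumped into an array of 32-bit words and that the flash range is passed in by the caller (for a 128 KB part mapped at 0x08000000 that would be 0x08000000 up to, but not including, 0x08020000). The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if a handler address looks wrong: zero (unfilled or reserved
 * slot), Thumb bit clear (even address), or outside [flash_start, flash_end).
 */
int vector_is_suspect(uint32_t v, uint32_t flash_start, uint32_t flash_end)
{
    if (v == 0u) {
        return 1;
    }
    if ((v & 1u) == 0u) {
        return 1;
    }
    if (v < flash_start || v >= flash_end) {
        return 1;
    }
    return 0;
}

/* Counts suspect entries; entry 0 is the initial stack pointer and is skipped. */
size_t count_suspect_vectors(const uint32_t *table, size_t entries,
                             uint32_t flash_start, uint32_t flash_end)
{
    size_t suspects = 0;
    for (size_t i = 1; i < entries; i++) {
        suspects += (size_t)vector_is_suspect(table[i], flash_start, flash_end);
    }
    return suspects;
}
```

Reserved slots in the Cortex-M table are zero by design, so a nonzero count is only a prompt for manual review, not proof of a bad table.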

You should consider what other interrupts or DMA you have enabled, or could be occurring.

You want to double check your clock settings, flash wait state and prefetch configuration.

You want to see if there is some voltage spike, or short, or GPIO connected with the RF sending/completing.
dadman
Associate II
Posted on January 22, 2014 at 17:18

Hi Clive, 

first of all, thanks for your help.

>So it looks to be coming from some other interrupt or fault (LR), the PC is invalid (EVEN)

>Your code itself looks innocuous, even without the disassembly, or knowing if the variables are volatile. The issue is elsewhere.

Yes, the variables are static volatile.

>You should examine all your interrupt vector entries, looking for the unfilled ones, or ones containing 0x20000

>You should consider what other interrupts or DMA you have enabled, or could be occurring.

There is no other IRQ handler enabled, only SysTick_Handler is running.

static volatile uint32_t TimingDelay;

void SysTick_Handler(void)

{

    if (TimingDelay != 0x00) {

        TimingDelay--;

    }

}

>You want to double check your clock settings, flash wait state and prefetch configuration.

The clock is 32 MHz for SYSCLK, HCLK, PCLK1 and PCLK2.

The system startup is generated by the ST clock configuration Excel tool. I have been using it for a long time without any problem.

Part of the system startup sequence:

/* Enable 64-bit access */

    FLASH->ACR |= FLASH_ACR_ACC64;

    

    /* Enable Prefetch Buffer */

    FLASH->ACR |= FLASH_ACR_PRFTEN;

    /* Flash 1 wait state */

    FLASH->ACR |= FLASH_ACR_LATENCY;

    

    /* Power enable */

    RCC->APB1ENR |= RCC_APB1ENR_PWREN;

  

    /* Select the Voltage Range 1 (1.8 V) */

    PWR->CR = PWR_CR_VOS_0;

>You want to see if there is some voltage spike, or short, or GPIO connected with the RF sending/completing.

I had all the lines and VDD on the oscilloscope and saw no suspicious events.

I built two pieces of the board. Both boards have the same issue.