
Hard fault when drawing bit map using STemWin on STM32F4

rbs3rd
Associate II
Posted on May 19, 2014 at 19:07

When I try to display a bit map on my LCD, the STM32F4 jumps to the hard fault handler. Because the library is in binary form, I can't really see what's causing the fault. If I comment out the line that draws the bit map, the hard fault doesn't happen. I am using other STemWin functions (GUI_SetColor, GUI_FillRect, GUI_SetFont, GUI_DispStringAt), and the LCD behaves exactly as I expect it to. I know that writing to a protected bank of SDRAM will cause an AHB error and a hard fault, so I've verified that the write protect bits are cleared. I'm looking for a recommendation on how to assess the root cause of the hard fault.

#stemwin-stm32f429
14 REPLIES
Posted on May 19, 2014 at 19:15

You should still be able to see the assembly code and registers at the fault point, and perhaps the position of that code with respect to public symbols exported by the library.

With a bit map, I'd check that the format is correct for the routine being used, that the image is not excessively large (or being interpreted as such), and whether the fault affects only specific images.

Segger does have a forum, and you could provide a clean/clear test case there.

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..
rbs3rd
Associate II
Posted on May 19, 2014 at 20:33

Thank you for the prompt reply. The bit map I'm using is one I developed for the Disco board, and it displays fine there. It's just when I move the firmware over from the demo board to my prototype board that things fall apart. Since I posted earlier, I commented out all instances where I display bit maps to try to move forward with testing. I notice the hard fault now happens intermittently when the touch screen is used.

On what could be a related note, a PCB error incorrectly tied BYPASS_REG to the power plane, disabling the internal regulator, so we now drive the VCAP_X pins with 1.2 volts. If the externally supplied 1.2 V is not exactly at the value the data sheet specifies, could that induce the hard fault condition I'm seeing? We are re-spinning the board to correct that, and other issues. If this fault is a byproduct of using external power, I'll stop trying to figure this out. Otherwise, I will continue digging to understand what the root cause of this exception is.

Posted on May 19, 2014 at 20:58

STM32F429I-DISCO?

I would focus on the where/what/why of the Hard Fault. These tend to occur for very specific reasons, like stack overflow, or out-of-bound memory access. Look at Joseph Yiu's Hard Fault handler code, it can provide useful stats on the failure, and confirm if it occurs at a repeatable point with specific registers, or if it is more random/spurious in nature.
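For reference, that style of handler boils down to recovering the eight words the core pushes on exception entry, so the faulting PC and LR can be read back. A minimal sketch, assuming GCC-style inline assembly for the wrapper (IAR and Keil need their own syntax for that part). The names `fault_frame_t`, `g_fault_frame` and `fault_capture` are made up for illustration, and this is not Joseph Yiu's code verbatim:

```c
#include <stdint.h>

/* Registers a Cortex-M core stacks automatically on exception entry,
   in ascending address order from the active stack pointer. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} fault_frame_t;

/* Volatile so the copy is not optimised away and can be inspected
   from the debugger. Hypothetical name. */
volatile fault_frame_t g_fault_frame;

/* Copy the eight stacked words into g_fault_frame. */
void fault_capture(const uint32_t *stacked)
{
    g_fault_frame.r0  = stacked[0];
    g_fault_frame.r1  = stacked[1];
    g_fault_frame.r2  = stacked[2];
    g_fault_frame.r3  = stacked[3];
    g_fault_frame.r12 = stacked[4];
    g_fault_frame.lr  = stacked[5];  /* return address of the faulting function */
    g_fault_frame.pc  = stacked[6];  /* instruction that was executing at the fault */
    g_fault_frame.psr = stacked[7];
}

#if defined(__GNUC__) && defined(__arm__) && defined(__thumb2__)
/* Target-only wrapper (GCC syntax assumed): bit 2 of EXC_RETURN in LR
   says whether MSP or PSP was in use; pass that pointer on, then park. */
__attribute__((naked)) void HardFault_Handler(void)
{
    __asm volatile(
        "tst   lr, #4          \n"
        "ite   eq              \n"
        "mrseq r0, msp         \n"
        "mrsne r0, psp         \n"
        "bl    fault_capture   \n"
        "b     .               \n"
    );
}
#endif
```

Put a breakpoint on the final branch and look at `g_fault_frame.pc`. If it is the same address on every fault, the cause is likely deterministic; if it wanders, suspect something more spurious.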
rbs3rd
Associate II
Posted on May 19, 2014 at 22:02

The code was running on the STM32F429I-Disco board. It was a huge help, allowing me to develop a lot of code while waiting for my prototype. Kudos to the ST team who developed the board. Regarding the HardFault, I'm guessing it is a stack issue. Looking at the disassembly, I see a JLINKMEM_SendChar function/macro. That function/macro is implemented using a UXTB instruction and a BL instruction. Those instructions are followed by a stack pop that I'm guessing should provide the destination address for the branch instruction. Immediately after that pop, the ARM core jumps to the HardFault handler. (Screenshot of the disassembly attached: 0690X00000602r7QAA.jpg)

Would you agree that this looks like a stack problem? If so, is it safe to assume that what caused the problem happened well before this screen shot?

Where do I find Joseph Yiu's Hard Fault handler?

Posted on May 20, 2014 at 02:09

Where do I find Joseph Yiu's Hard Fault handler?

There's always Google.

http://blog.frankvh.com/2011/12/07/cortex-m3-m4-hard-fault-handler/

Now if it faults on the pop, it's more indicative that the stack frame has been corrupted. You could inspect the memory pointed to by SP/R13, and the value of R13 itself. If R13 is outside the bounds of the current stack, that is a problem. In Keil the stack size is specified in startup_stm32fxxx.s and is usually small. The stack can get corrupted if heap access ploughs into the bottom of it, or it can corrupt statically defined variables if it drops below its limit.
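The two checks described here (is SP still inside the stack, and how deep did the stack ever get) can be sketched in a few lines. This is only an illustration: the function names and the fill value are made up, and on the target you would pass in whatever stack start/end symbols your toolchain's startup file or linker script exports.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* arbitrary marker value, an assumption */

/* Non-zero when sp lies inside [stack_low, stack_high].
   The stack grows downward on Cortex-M, so stack_high is the initial SP. */
int sp_in_range(uintptr_t sp, uintptr_t stack_low, uintptr_t stack_high)
{
    return sp >= stack_low && sp <= stack_high;
}

/* Fill the stack region with the marker. On a target this has to run
   early, and only over the part of the stack not currently in use. */
void stack_paint(uint32_t *low, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        low[i] = STACK_FILL;
    }
}

/* Count the words at the low end that still hold the marker, i.e. the
   headroom that has never been touched since painting. */
size_t stack_unused_words(const uint32_t *low, size_t words)
{
    size_t n = 0;
    while (n < words && low[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

If `stack_unused_words` comes back as zero after exercising the GUI, the stack has at some point used every word available to it and should be enlarged.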

os_kopernika
Associate II
Posted on May 20, 2014 at 11:20

"Regarding the HardFault, I'm guessing it is a stack issue."

I am guessing you are guessing how exception handling works.

rbs3rd
Associate II
Posted on May 20, 2014 at 15:15

Thanks for the link. I did not know if there was something ST specific, like STemWin, or if I should just use the public HardFault handler.

Two follow-up questions. First, given that this code worked when I was connecting to the Disco board using ST-Link, is it unusual that this behavior would present itself now that I'm using my prototype and J-Link? I'm trying to assess whether this is just due to a change in the development environment, or whether I'm looking at a bigger hardware design issue. Second, do you know where IAR specifies the stack size?

Posted on May 20, 2014 at 15:57

For IAR it's probably a GUI option or something in the .ICF file.

I'd probably start by validating the SDRAM really thoroughly.
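As a rough idea of what a thorough check might cover, here is a sketch that exercises addressing, data lines, and byte-wide writes (which depend on the byte-mask signals). The function name and return codes are made up for illustration, and it assumes a little-endian core, which the Cortex-M4 is by default. Point it at the start of the SDRAM bank on your board.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 on pass, or the number of the stage that failed.
   Hypothetical helper, not part of any ST or Segger library. */
int sdram_test(volatile uint32_t *base, size_t words)
{
    size_t i;

    /* Stage 1: every word holds a value derived from its own index,
       so shorted or swapped address lines show up as mismatches. */
    for (i = 0; i < words; i++) {
        base[i] = (uint32_t)i ^ 0xA5A5A5A5u;
    }
    for (i = 0; i < words; i++) {
        if (base[i] != ((uint32_t)i ^ 0xA5A5A5A5u)) {
            return 1;
        }
    }

    /* Stage 2: walking one across the first word checks each data line. */
    for (i = 0; i < 32 && words > 0; i++) {
        base[0] = 1u << i;
        if (base[0] != (1u << i)) {
            return 2;
        }
    }

    /* Stage 3: a byte-wide write must change exactly one byte lane and
       leave the other three alone. Assumes little-endian byte order. */
    volatile uint8_t *bytes = (volatile uint8_t *)base;
    for (i = 0; i < 4 && words > 0; i++) {
        base[0] = 0;
        bytes[i] = 0xFF;
        if (base[0] != (0xFFu << (8 * i))) {
            return 3;
        }
    }

    return 0;
}
```

A memory that passes 32-bit accesses but fails stage 3 would point at the byte-mask wiring rather than the address or data bus.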

rbs3rd
Associate II
Posted on May 20, 2014 at 16:52

Very insightful on the SDRAM. There is a schematic error there! The LDQM and UDQM signals are swapped. This results in data bits 0:15 showing up on bits 16:31 and vice versa. But there are no errors on the address bus, bank select bits, or other controls. My thinking is that since the bits are swapped on the writes and then swapped back on the reads, the error probably cancels itself out. I do get some funky background colors on my text on the LCD, which I was attributing to this, but I did not think that error could cause the M4 to take a bus fault/hard fault.

I'm adding in the fault handler. Seems like a good idea to have it in there.
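Once the handler is in, the Configurable Fault Status Register (SCB->CFSR, at 0xE000ED28 on Cortex-M3/M4) is worth reading too, since it says whether the escalated fault was a bus, memory management, or usage fault. A small decoding sketch; the helper names are made up for illustration, while the bit positions follow the ARMv7-M register layout:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t    mask;
    const char *name;
} cfsr_bit_t;

/* Flag bits of CFSR, lowest bit first. */
static const cfsr_bit_t cfsr_bits[] = {
    { 1u << 0,  "IACCVIOL: instruction access violation" },
    { 1u << 1,  "DACCVIOL: data access violation" },
    { 1u << 3,  "MUNSTKERR: MemManage fault on exception return unstacking" },
    { 1u << 4,  "MSTKERR: MemManage fault on exception entry stacking" },
    { 1u << 8,  "IBUSERR: instruction bus error" },
    { 1u << 9,  "PRECISERR: precise data bus error" },
    { 1u << 10, "IMPRECISERR: imprecise data bus error" },
    { 1u << 11, "UNSTKERR: bus fault on exception return unstacking" },
    { 1u << 12, "STKERR: bus fault on exception entry stacking" },
    { 1u << 16, "UNDEFINSTR: undefined instruction" },
    { 1u << 17, "INVSTATE: invalid EPSR state" },
    { 1u << 18, "INVPC: invalid PC load" },
    { 1u << 19, "NOCP: no coprocessor" },
    { 1u << 24, "UNALIGNED: unaligned access" },
    { 1u << 25, "DIVBYZERO: divide by zero" },
};

/* Name of the lowest-numbered flag set in cfsr, or NULL if none is set. */
const char *cfsr_first_flag(uint32_t cfsr)
{
    for (size_t i = 0; i < sizeof cfsr_bits / sizeof cfsr_bits[0]; i++) {
        if (cfsr & cfsr_bits[i].mask) {
            return cfsr_bits[i].name;
        }
    }
    return NULL;
}

/* Non-zero when any bit in the BusFault status byte (bits 8..15) is set. */
int cfsr_is_bus_fault(uint32_t cfsr)
{
    return (cfsr & 0x0000FF00u) != 0;
}

/* Non-zero when BFAR holds the address that caused a bus fault. */
int cfsr_bfar_valid(uint32_t cfsr)
{
    return (int)((cfsr >> 15) & 1u);
}
```

On the target, read the register inside the fault handler, before anything clears it, and pass the value to these helpers.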