2013-06-04 12:50 AM
Hello,
I have a rather strange issue. When I really stress my system (it receives serial packets, extracts the right data, and sends it out over serial; I send thousands of messages without waiting for a reply), I unpredictably get either an invalid PC load usage fault or an invalid state usage fault. With the invalid PC load fault, the program counter is something like 0x0 or 0x5, or sometimes it contains a RAM address, but I don't have any code in RAM. Looking at the stack trace, I suspect a stack pointer corruption somewhere, because some of the registers hold flash addresses of branch code (and LR sometimes holds weird values, obviously neither a flash code address nor a RAM address). Here are my stack traces:
****************************
HARD FAULT !
Stack = 0x20000660
Invalid PC load usage fault at Program counter = 0x200082B0
Stack frame :
R0 = 0x400264B8 R1 = 0x20008318 R2 = 0x3C R3 = 0x200082B0 R12 = 0x0 LR = 0x8002417 PC = 0x200082B0 PSR = 0x20008318
****************************
Or :
****************************
HARD FAULT !
Stack = 0x20000688
Invalid state usage fault at Program counter = 0x20008270
Stack frame :
R0 = 0x20008288 R1 = 0x20008EA8 R2 = 0x3C R3 = 0x200082B0 R12 = 0x0 LR = 0x20008270 PC = 0x20008270 PSR = 0x20000200
****************************
Or again:
****************************
HARD FAULT !
Stack = 0x20000670
Invalid PC load usage fault at Program counter = 0x1
Stack frame :
R0 = 0x0 R1 = 0x80023D7 R2 = 0x8003B26 R3 = 0x21000200 R12 = 0x0 LR = 0x8003279 PC = 0x1 PSR = 0x200082B0
****************************
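One way to test the "corrupted stack" suspicion directly from dumps like these is to check whether each stacked frame is even plausible. The sketch below is only an illustration, not code from this project: `fault_frame_t`, `check_frame` and the flag names are made up for the example, and the flash range assumes a 256 KB part mapped at 0x08000000 (adjust for the actual device). It flags a stacked PC outside flash, a stacked LR that is neither a Thumb flash address nor an EXC_RETURN value, and a stacked PSR with the Thumb bit (bit 24) clear.

```c
#include <stdint.h>

/* Exception stack frame in the order a Cortex-M3 pushes it. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} fault_frame_t;

/* ASSUMPTION: flash mapped at 0x08000000, 256 KB. Adjust for your part. */
#define FLASH_BASE_ADDR 0x08000000u
#define FLASH_SIZE      0x00040000u
#define XPSR_T_BIT      (1u << 24)   /* Thumb state bit in xPSR */

/* Hypothetical result flags, OR-ed together by check_frame(). */
enum { FRAME_BAD_PC = 1, FRAME_BAD_LR = 2, FRAME_BAD_PSR = 4 };

static int in_flash(uint32_t addr)
{
    return addr >= FLASH_BASE_ADDR && addr < FLASH_BASE_ADDR + FLASH_SIZE;
}

/* A stacked LR is plausible if it is an EXC_RETURN value (0xFFFFFFFx)
 * or a flash address with the Thumb bit (bit 0) set. */
static int lr_plausible(uint32_t lr)
{
    if ((lr & 0xFFFFFFF0u) == 0xFFFFFFF0u)
        return 1;
    return (lr & 1u) && in_flash(lr & ~1u);
}

/* Returns 0 if the frame looks sane, otherwise a mask of FRAME_BAD_* flags. */
unsigned check_frame(const fault_frame_t *f)
{
    unsigned flags = 0;
    if (!in_flash(f->pc))
        flags |= FRAME_BAD_PC;
    if (!lr_plausible(f->lr))
        flags |= FRAME_BAD_LR;
    if (!(f->psr & XPSR_T_BIT))
        flags |= FRAME_BAD_PSR;
    return flags;
}
```

Fed with the three dumps above, every frame fails both the PC and the PSR check (the second one also fails the LR check), so none of them looks like a frame the core would have pushed for running flash code. In the third dump, the values printed as R1, R2 and R3 would pass as LR, PC and PSR; that might point to a frame being read at the wrong offset, but it is only a guess.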
The problem seems to happen (I tried to track it down, but it's very hard) on the service call interrupt exit after a malloc call (but there are something like 10000 malloc calls without a problem first). My process stacks are far from full (half empty at minimum), and I have 8k of system stack. The hard fault happens on the user stack, but again, it seems to trigger when popping registers at service call exit. I've spent about 10 hours trying to fix this, with no luck so far. Do you guys have any advice for me?
Thomas.
#dma2
2013-06-04 03:41 AM
Hello,
I'm using TASKING for ARM. I don't have any active breakpoints, and I use the default views (they refresh only when the process stops). I have now processed >2M packets at full speed, and it is still working well (the J-Link is physically connected but not attached).
Thomas.
2013-06-04 04:25 AM
Hello,
It has now been running for more than half an hour, has processed millions of packets, and is still running. What do I do now? Just classifying it as a Schroedinbug and moving on doesn't seem like a clever idea...
Thomas.
2013-06-04 07:14 AM
I've had similar problems with DMA2. In my case it manifests as spurious interrupts from the NVIC. I found that turning on the FIFO mode in DMA fixes the problem, same as you. There are some known errata on DMA2, perhaps there are some additional problems with the FIFO unit not yet documented.
Jack Peacock
2013-06-04 07:40 AM
Hello,
I turned on FIFO, but I still get the hard faults.
Thomas.
2013-07-03 12:45 AM
Hello Thomas,
Could you please let me know which product you are using?
Thanks,
-Mayla-
2013-08-14 09:22 AM
Hello,
Using STM32F205RCT6. I'm already using the same part in other projects, and I'm trying to reuse this code in a new project (for now it is a simple serial-to-serial pass-through, for testing). These hard faults are VERY hard to fix because they seem to happen anywhere in the code, and the optimization level seems to change the occurrence rate (on some code it runs quite well at optimization level 0, pretty much the same at level 1, doesn't work at all at level 2, and works for a few iterations at level 3 before it crashes). The fault also seems to happen a LOT less when the debugger is not connected. All this leads me to think that there is a timing problem somewhere (the debugger probably affects the speed a bit too), but where? I now seem to almost always get a wrong PC of 0x01, and it still looks like a faulty stack (everything looks shifted by one position, e.g. the stacked PSR contains a memory address when the hard fault triggers). I really need some help, or at least some advice on how to track it down!
Thomas.
2013-08-14 10:25 AM
Can you fill the stack or add some guards to make sure it's not going wrong there?
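A rough sketch of what filling the stack and checking a guard region could look like, assuming a full-descending stack that occupies a plain array of words. The function names and the fill pattern are made up for the example and are not from any vendor library. On the target, the currently active stack can only be painted below the current stack pointer, so the paint is best done at startup or before a task first runs.

```c
#include <stddef.h>
#include <stdint.h>

/* Arbitrary fill pattern (hypothetical choice); pick one unlikely to occur as data. */
#define STACK_FILL_WORD 0xDEADBEEFu

/* Paint the whole stack region with the pattern before the stack is used. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++)
        base[i] = STACK_FILL_WORD;
}

/* The stack grows down from base[words - 1] towards base[0], so the number
 * of still-painted words counted from base[0] is the remaining headroom. */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL_WORD)
        n++;
    return n;
}

/* Guard check: the lowest guard_words words must still hold the pattern.
 * Returns 1 if intact, 0 if something wrote into the guard region. */
int stack_guard_intact(const uint32_t *base, size_t guard_words)
{
    for (size_t i = 0; i < guard_words; i++)
        if (base[i] != STACK_FILL_WORD)
            return 0;
    return 1;
}
```

Calling `stack_guard_intact()` for every stack from a periodic tick, or at service call entry and exit, would show whether an overflow or a stray write ever reaches the bottom of a stack before the hard fault fires.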
I remember a number of people tickling a prefetch problem with GNU/GCC compilers. The ART seems to have a critical-path erratum, normally tickled with supplies <2.1V. I'll see if I can pin down a cite for the code generated; a PC-relative LDR, as I recall. Does attaching a debugger mess with the supply voltage?
http://www.st.com/st-web-ui/static/active/en/resource/technical/document/errata_sheet/DM00027213.pdf
2013-08-14 10:27 AM
2013-08-16 01:46 AM
Hello,
Power supply is 3.3V coming out of an M5239 LDO (powered by USB or a lab power supply). That's the same power stage I use on pretty much all the projects I design, and I've never had a problem.
The compiler is the one from TASKING, which is supposedly not derived from GCC/GNU.
I will do some tests with some of the optimizations disabled (ART, prefetch, etc.)...
Thomas.
[EDIT] I'm running it at 120MHz with "only" 3 flash wait states; I'll also try with 4 wait states. But I already tested it at 40MHz with 1 wait state, which means lower flash speed, with the same result.
2013-08-16 02:03 AM
Well, I just tried with 5 flash wait states and all optimizations disabled (no cache, no prefetch), with the exact same result (hard fault, invalid state usage at PC = 0x0).
So it's not related to flash speed.
Thomas.
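For anyone repeating this test, here is a small sketch of how the flash settings described above map onto a FLASH_ACR value. The bit positions are my understanding of the STM32F2 register layout (LATENCY in bits 2:0, PRFTEN bit 8, ICEN bit 9, DCEN bit 10) and should be checked against the reference manual; the helper name and macros are made up for the example.

```c
#include <stdint.h>

/* ASSUMED FLASH_ACR bit layout for STM32F2; verify against the reference manual. */
#define ACR_LATENCY_MASK 0x7u        /* wait states, bits 2:0 */
#define ACR_PRFTEN       (1u << 8)   /* prefetch enable */
#define ACR_ICEN         (1u << 9)   /* instruction cache enable */
#define ACR_DCEN         (1u << 10)  /* data cache enable */

/* Build a FLASH_ACR value from the individual settings (hypothetical helper). */
uint32_t flash_acr_value(unsigned wait_states, int prefetch, int icache, int dcache)
{
    uint32_t v = (uint32_t)wait_states & ACR_LATENCY_MASK;
    if (prefetch)
        v |= ACR_PRFTEN;
    if (icache)
        v |= ACR_ICEN;
    if (dcache)
        v |= ACR_DCEN;
    return v;
}
```

With these assumptions, the "5 wait states, no cache, no prefetch" test corresponds to `flash_acr_value(5, 0, 0, 0)`, and the usual 120MHz setup with everything enabled to `flash_acr_value(3, 1, 1, 1)`. Reading the register back after writing it is a cheap way to confirm the setting actually took effect.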