How to configure the stack so that a stack overflow triggers a fault?

HMont.12 · ‎2019-05-06

Hello,

First of all, I am using a STM32F446 with Keil IDE. I used CubeMX to generate the Keil project.

I recently encountered a problem that seems to be some kind of stack overflow: static variables located just before the stack in the RAM address space were modified without being accessed in the code. This seemed to happen randomly, generally after a couple of minutes.

Increasing the size of the stack in the startup file seems to have solved the problem. However, diagnosing the issue took me a while, especially because there was no way to trigger it, and the behavior of the code with these random values could have damaged the machine I am controlling. So I am looking for a way to configure the stack that would make sure a hard fault is triggered when the stack is full.

Moreover, I am really confused about the direction the stack fills up...

I already started a thread and posted the details on the Keil discussion forum: http://www.keil.com/forum/64363/

Thank you

Ozone · ‎2019-05-06

Stack overflows tend to produce random results, as they are usually interference between the stack and (global) variables.

> Increasing the size of the stack in the startup file seems to have solved the problem.

This is the only way, basically. Some tools provide an estimate based on static code analysis, with a high uncertainty factor.

And solid toolchains provide a runtime stack check or a post-mortem analysis tool.

> This seemed to happen randomly, generally after a couple of minutes.

Usually triggered by nested interrupts, or interrupts hitting at times of highest stack usage.

Or by out-of-bounds indices evaluated at runtime.

> So I am looking for a way to configure the stack that would make sure a hard fault is triggered when the stack is full.

You can try to use MPU regions for that. That implies quite rigid alignment and size constraints, though.
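
The alignment and size constraints can be made concrete with a small sketch. It assumes the ARMv7-M MPU found on Cortex-M4 parts such as the STM32F446, where a region must be a power of two of at least 32 bytes and its base address must be aligned to its size. The hypothetical helper `encode_guard_region` below computes the RBAR and RASR values for a no-access guard region and rejects inputs that break those rules. The helper, struct, and macro names are illustrative only and are not part of CMSIS or CubeMX; the code just encodes register values and does not touch hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ARMv7-M MPU field positions (Cortex-M3/M4/M7). */
#define RBAR_VALID        ((uint32_t)1 << 4)   /* take region number from RBAR.REGION */
#define RASR_ENABLE       ((uint32_t)1 << 0)   /* region enable */
#define RASR_SIZE_SHIFT   1                    /* region size = 2^(SIZE+1) bytes */
#define RASR_AP_NO_ACCESS ((uint32_t)0 << 24)  /* AP = 000: no access at any privilege */
#define RASR_XN           ((uint32_t)1 << 28)  /* execute never */
#define MPU_REGION_COUNT  8U                   /* assumption: 8 regions, as on Cortex-M4 */

typedef struct {
    uint32_t rbar;  /* value for the region base address register */
    uint32_t rasr;  /* value for the region attribute and size register */
} guard_region_t;

/* Hypothetical helper: encode a no-access guard region.
 * Returns false if base or size violate the MPU constraints. */
static bool encode_guard_region(uint32_t base, uint32_t size_bytes,
                                uint32_t region, guard_region_t *out)
{
    assert(out != NULL);

    if (region >= MPU_REGION_COUNT) {
        return false;                               /* no such region */
    }
    if (size_bytes < 32U) {
        return false;                               /* minimum region size */
    }
    if ((size_bytes & (size_bytes - 1U)) != 0U) {
        return false;                               /* must be a power of two */
    }
    if ((base & (size_bytes - 1U)) != 0U) {
        return false;                               /* base must be size-aligned */
    }

    /* Integer log2 of the (power-of-two) size. */
    uint32_t log2size = 0U;
    while (((uint32_t)1 << log2size) < size_bytes) {
        log2size++;
    }

    out->rbar = base | RBAR_VALID | region;
    out->rasr = RASR_XN | RASR_AP_NO_ACCESS
              | ((log2size - 1U) << RASR_SIZE_SHIFT)
              | RASR_ENABLE;
    return true;
}
```

On the target, the two values would be written to `MPU->RBAR` and `MPU->RASR` (CMSIS register names) for a region placed at the lowest addresses of the stack area, and the MPU would then be enabled through `MPU->CTRL`. If the MemManage fault is not enabled in `SCB->SHCSR`, the violation escalates to a hard fault. Note that the guard region itself occupies RAM that the stack can no longer use.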

Danish1 · ‎2019-05-06

The stack starts at the highest address of its allocation and grows to lower memory.

I don't think that there's anything in the arm processor core that is dedicated to trapping a stack that has grown too big.

But there are ways to see how much of your allocated stack space is being used.

My development environment, Rowley Crossworks, likes to put guard words 0xFACEFEED at either end of the stack. You can periodically check if those words are intact, and if not you know the stack has grown too long (or you have wrongly popped too much off the stack - very unlikely as the primary cause of failure if you're using a high-level language such as C). But it isn't foolproof because:

- You only know after the event that the guard word has been clobbered.
- It is possible, if your code allocates space on the stack but then does not use all of it, for the stack to grow too big but not clobber the guard word.
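
A periodic guard-word check along these lines could look like the sketch below. It is not how Crossworks implements it; the function names are made up for illustration, and the stack area is passed in as a plain array so the logic can be exercised off-target. On a real device the pointer and length would come from the linker symbols or startup file that define the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_GUARD_WORD 0xFACEFEEDu

/* Hypothetical helper: write a guard word at each end of the stack area.
 * region[0] is the lowest address (the side an overflow reaches first,
 * because the stack grows downward); region[words - 1] is the highest. */
static void stack_guard_init(uint32_t *region, size_t words)
{
    assert(region != NULL && words >= 2);
    region[0] = STACK_GUARD_WORD;
    region[words - 1] = STACK_GUARD_WORD;
}

/* Hypothetical helper: returns true while both guard words are intact.
 * Call it periodically, e.g. from the main loop or a timer tick. */
static bool stack_guard_intact(const volatile uint32_t *region, size_t words)
{
    return region[0] == STACK_GUARD_WORD
        && region[words - 1] == STACK_GUARD_WORD;
}
```

As the two caveats above say, a failed check only reports the damage after the fact, and an overflow that skips over the guard word goes unnoticed.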

But I like to pre-load the space between those 0xFACEFEEDs with a known value, say 0xFE. Then I can look from the end of the stack and see the first byte that isn't the starting value, and I know I have used stack up to that point. Again this doesn't prevent the stack from growing too large.
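
The pre-loading idea can be sketched as follows. This is a minimal illustration, not Crossworks code: the function names are hypothetical, and the stack area is modelled as a byte array so it can run off-target. On a real device the fill has to happen before the stack is in use (for example in the startup code), or only below the current stack pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xFEu

/* Hypothetical helper: fill the whole stack area with the known value.
 * 'lowest' points at the lowest address of the area. */
static void stack_paint(uint8_t *lowest, size_t size)
{
    assert(lowest != NULL);
    memset(lowest, STACK_FILL_BYTE, size);
}

/* Hypothetical helper: scan upward from the lowest address until the first
 * byte that no longer holds the fill value. Everything from there to the top
 * has been used at some point, since the stack grows toward lower addresses.
 * Returns the high-water mark in bytes. */
static size_t stack_bytes_used(const uint8_t *lowest, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && lowest[untouched] == STACK_FILL_BYTE) {
        untouched++;
    }
    return size - untouched;
}
```

The result is a lower bound: if the deepest bytes written happen to equal the fill value, the scan counts them as unused.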

What could you do?

I suppose you could put a hardware debug data write breakpoint on memory just beyond the stack. That should force a break at the point the stack guard-word is overwritten.

Maybe you could do the same with the memory-protection unit. I don't know if that will trigger a double-fault, so it might be difficult to recover the state when the fault occurred.

Or the check could be done automagically in software. I think the Keil environment has a way to check at every context-switch according to:

http://www.keil.com/support/man/docs/rlarm/rlarm_ar_cfgstchk.htm

(google search for "keil arm stack overflow"). But then that's for multi-threaded code using their RTX. The way you've written the question, you might be only using one thread.

Hope this helps,

Danish

waclawek.jan · ‎2019-05-06

> I don't think that there's anything in the arm processor core that is dedicated to trapping a stack that has grown too big.

Why, the MPU.

JW

turboscrew · ‎2019-05-06

Two things come to mind. Configure the MPU so that there are privileged-access-only regions around the stack area. If your stack has grown too big or gone "negative", the MPU catches that. Maybe even watch-areas are usable for that. That, of course, requires that you have a device with an MPU or DWT.