Timing-dependent hard fault error during SPI communication

SMüll.2
Associate

Hello, I am facing a strange hard fault during an SPI transfer loop to the display. It only occurs under certain circumstances, probably related to changes in timing. Minor code changes in unrelated places alter WHEN it happens.

This is my first electronics project and also my first project in plain C, so it is quite possible that I am missing a basic fact or aspect. I am also running out of ideas, so it would be great if you could help me.

So here comes the detailed description:

My project is based on the STM32F411CEU, and under some circumstances I get a hard fault, seemingly during SPI communication with the display. The project uses a few peripherals (CDC, SPI, USART2); one of the attached devices is an ILI9341 display connected via SPI (SPI2). I use FreeRTOS with several tasks, one of which takes care of the interaction with the display.

Very small changes in the code can "stabilize" the system so that the error no longer occurs, but other changes can make the problem reappear. Currently I have a situation where the problem is relatively stable, i.e. reproducible.

I tried to draw a rectangle on my display, which worked a million times before. The first image shows the loop that is used to draw it on the screen. I added the if statement and the spicount variable so that I can jump directly to the point where it happens while debugging. I use the ST-Link v2 for debugging.

When I try to step into the HAL_SPI_Transmit function for the 71725th time, the hard fault occurs deep down in HAL_SPI_Transmit at line 904 (image 2). When I try to step into that function (F5), execution immediately ends up in the HardFault handler. However, when I turn on "Instruction Stepping Mode", the program continues as if there were no problem. I can then switch back to line stepping mode, and the problem occurs later (but always just before stepping into a function). I looked at the CFSR register (image 3): it reports a precise error, but the BFAR content (image 4) looks strange to me (the memory address points into the memory-mapped peripherals, and that memory is completely empty [images 5 & 6]). I also added the FreeRTOS task view, the Fault Analyzer output and the stack trace (which is inaccurate, by the way: this is not the function that caused the crash). I am really starting to run out of ideas on how to continue. Thanks for reading :)

EDIT: Sorry, the images are in reverse order; what I called the first image is the last one.
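
For anyone following along, here is a minimal sketch of how the bus-fault bits of CFSR can be decoded in code instead of reading them off the register view. The bit positions are those of the Cortex-M4 BusFault status byte (CFSR bits 15:8); the type and function names are made up for illustration, and the example value is hypothetical, not the one from the screenshots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bus-fault bits inside the 32-bit CFSR (Cortex-M3/M4). */
#define CFSR_PRECISERR   (1u << 9)   /* precise data bus error            */
#define CFSR_IMPRECISERR (1u << 10)  /* imprecise data bus error          */
#define CFSR_UNSTKERR    (1u << 11)  /* fault while unstacking (return)   */
#define CFSR_STKERR      (1u << 12)  /* fault while stacking (entry)      */
#define CFSR_BFARVALID   (1u << 15)  /* BFAR holds the faulting address   */

/* Hypothetical helper type, not part of CMSIS or the HAL. */
typedef struct {
    bool precise;
    bool imprecise;
    bool stacking;    /* STKERR or UNSTKERR: points towards a stack problem */
    bool bfar_valid;  /* only trust BFAR when this is set */
} busfault_info_t;

/* Split a raw CFSR value into the flags that matter for a bus fault. */
void decode_busfault(uint32_t cfsr, busfault_info_t *out)
{
    assert(out != NULL);
    out->precise    = (cfsr & CFSR_PRECISERR) != 0u;
    out->imprecise  = (cfsr & CFSR_IMPRECISERR) != 0u;
    out->stacking   = (cfsr & (CFSR_STKERR | CFSR_UNSTKERR)) != 0u;
    out->bfar_valid = (cfsr & CFSR_BFARVALID) != 0u;
}
```

On the target this would be called from the HardFault handler as `decode_busfault(SCB->CFSR, &info)`, and `SCB->BFAR` would only be read when `info.bfar_valid` is set. If BFARVALID is clear, the address shown in BFAR is stale and says nothing about the fault.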

[10 attached screenshots, not reproduced here]

4 REPLIES

BFAR points to an illegal address, so the handle or instance data is getting corrupted.

Start sanity checking those values, and identify when they first get corrupted, before you hard fault on them.
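
A rough sketch of what such a sanity check could look like: remember the handle address and its Instance pointer once, right after initialization, and compare them before every transfer. The `fake_*` types are stand-ins so the snippet compiles on its own; on the target you would use `SPI_HandleTypeDef` from the HAL instead. All `spi_guard_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the HAL types (assumption: only Instance matters here). */
typedef struct { volatile uint32_t CR1; } fake_spi_regs_t;
typedef struct { fake_spi_regs_t *Instance; } fake_spi_handle_t;

/* Snapshot of the known-good values, taken once after init. */
typedef struct {
    const fake_spi_handle_t *handle;
    const fake_spi_regs_t   *instance;
    uint32_t                 checks_passed; /* tells you how far you got */
} spi_guard_t;

void spi_guard_init(spi_guard_t *g, const fake_spi_handle_t *h)
{
    assert(g != NULL && h != NULL);
    g->handle = h;
    g->instance = h->Instance;
    g->checks_passed = 0u;
}

/* Returns false as soon as the handle pointer or its Instance differs
 * from the snapshot, i.e. before the driver dereferences a bad pointer. */
bool spi_guard_check(spi_guard_t *g, const fake_spi_handle_t *h)
{
    assert(g != NULL);
    if (h == NULL || h != g->handle || h->Instance != g->instance) {
        return false;
    }
    g->checks_passed++;
    return true;
}
```

Calling the check right before each HAL_SPI_Transmit and stopping on a `false` result (a breakpoint on that branch, for example) narrows down the window in which the corruption happens, and `checks_passed` tells you on which iteration it first showed up.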

SMüll.2
Associate

Hi, thanks for the answer. Do you mean the lcdSPIHandle or the uint8_t data[] array in TFT9341_FillRect? I checked them while debugging; the values are still the same as in the 71724 passes through the loop before. Would that explain the "not stepping into function" behaviour? What could be the reason for the corruption? I already tried to find possible buffer overflows and invalid free calls.

> and the stack trace (which is inaccurate btw., this is not the function that caused the crash)

Why do you think so?

It appears to be part of the RTOS (task switcher?), and it may quite well crash there, e.g. due to stack corruption (read: try increasing the task stack size, or whatever it is called in the RTOS; I don't use an RTOS).

JW

lcdSPIHandle and hspi->Instance, for starters.

With heavy interrupt nesting, one might also look at worst-case stack size expectations, and perhaps monitor the depth the stacks actually reach.
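
One common way to monitor stack depth is "stack painting": fill the stack region with a known pattern at startup and later count how many words are still untouched. The sketch below shows the idea on a plain array; the function names and the fill value are illustrative. FreeRTOS does something similar for task stacks and exposes the result through uxTaskGetStackHighWaterMark() (when INCLUDE_uxTaskGetStackHighWaterMark is set to 1), and configCHECK_FOR_STACK_OVERFLOW together with vApplicationStackOverflowHook() can catch overflows at context switches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u  /* arbitrary, easy-to-spot pattern */

/* Paint the whole region before it is used as a stack. */
void stack_paint(uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_FILL;
    }
}

/* The stack grows downward on Cortex-M, so the lowest addresses are
 * used last. Counting intact words from the low end gives the minimum
 * amount of stack that has been free so far (the high-water mark). */
size_t stack_unused_words(const uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    size_t n = 0;
    while (n < words && stack[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

A result near zero means the stack has been (almost) exhausted at some point. Note that with the FreeRTOS Cortex-M ports, interrupts run on the main stack rather than on the task stacks, so for nested interrupts it is the main stack region that would need to be painted and checked.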

As for what causes the corruption: perhaps an errant pointer, or missing bounds checks on arrays, strings, etc., especially those on the stack, i.e. auto/local variables. Watch the depth of the call trees and the auto/local usage.
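
To hunt for out-of-bounds writes on a suspicious buffer, one option is to wrap it in guard words ("canaries") and check them at convenient points. This is only a sketch: the struct, the 16-byte size and the canary value are assumptions for illustration, not anything taken from the code in the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CANARY 0xDEADBEEFu  /* arbitrary marker value */

/* A buffer sandwiched between two guard words. */
typedef struct {
    uint32_t head;
    uint8_t  data[16];
    uint32_t tail;
} guarded_buf_t;

void guarded_init(guarded_buf_t *b)
{
    assert(b != NULL);
    b->head = CANARY;
    memset(b->data, 0, sizeof b->data);
    b->tail = CANARY;
}

/* False means something wrote just before or just past data[]. */
bool guarded_ok(const guarded_buf_t *b)
{
    assert(b != NULL);
    return b->head == CANARY && b->tail == CANARY;
}
```

Checking `guarded_ok()` after each function that touches the buffer shows which call damaged it. A data watchpoint on one of the guard words in the debugger does the same job without polling, and stops exactly at the offending write.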
