HardFault in STM32F103 because of array class member in FreeRTOS C++ wrapper

IProg.1759
Associate II

For three weeks I have been struggling with a mysterious problem.

I will begin with a general description and then dive into the details. The MCU is an STM32F103RFT6; HAL and a C++ wrapper for FreeRTOS 10.2 are in use.

So, I have only one task running (not counting the idle task). A continuous, infinite stream of bytes arrives at the UART: 72 bytes every 10 ms. Inside the task I repeatedly arm the DMA to receive these 72 bytes and block the task twice using ulTaskNotifyTake(). There are RxHalfComplete and RxFullComplete callback functions, which the DMA IRQ handler calls when reception is half or fully finished. These callbacks contain nothing but vTaskNotifyGiveFromISR and portYIELD_FROM_ISR calls.

When arming the DMA, a buffer must be provided. The buffer can be declared as a stack variable inside the task's body or as a private member variable of a class. What perplexes me is that when the buffer is the stack variable, everything is OK. But if it is a member of the class, a HardFault occurs within several seconds.

Now, the necessary pieces of code.

As I said, the native FreeRTOS calls are wrapped into C++ classes. Here is what the wrapper for task creation looks like:

class Thread
{
public:
        Thread(const char* const threadName, uint16_t stackDepth, UBaseType_t threadPriority)
        {
                BaseType_t result = xTaskCreate(
                                        TaskAdapter, 
                                        threadName, 
                                        stackDepth,
                                        this, 
                                        threadPriority,
                                        &handle);
        }
       
       TaskHandle_t getTaskHandle(void)
       {
           return handle;
       }
protected:
        virtual void run(void) = 0;
private:
        TaskHandle_t handle;
 
        static void TaskAdapter(void *pvParameters)
        {
            Thread *task = static_cast<Thread *>(pvParameters);         
            task->run();
            #if (INCLUDE_vTaskDelete == 1)
                vTaskDelete(task->handle);
            #endif
        }
};

It is inherited by another class which implements desired behaviour:

class DataGrabberReceiver : public Thread
{
    private:
        uint8_t classMemberByteBuffer[72];
 
    public:
        DataGrabberReceiver(const char* const threadName, uint16_t stackDepth, UBaseType_t threadPriority) : Thread(threadName, stackDepth, threadPriority) { /* no-op here */ }
 
        void run(void) override
        {
                uint8_t stackAllocatedByteBuffer[72];
                QueueHandle_t queue = xQueueCreate(72, sizeof(uint8_t));
 
                while(true)
                {
                        //with this if-statement everything works
                        if(HAL_OK == HAL_UART_Receive_DMA(&huart_DataGrabber, stackAllocatedByteBuffer, 72))
                        //and with this one HardFault comes in seconds after start
                        //if(HAL_OK == HAL_UART_Receive_DMA(&huart_DataGrabber, classMemberByteBuffer, 72))
                         //only one above mentioned if-statement can be uncommented
                         {
                                 //waiting for the first half being received into buffer
                                 ulTaskNotifyTake(pdTRUE, portMAX_DELAY);  
                                 //and copying it inside the queue
                                 for(int i = 0; i < 36; ++i)
                                 {
                                        xQueueSendToBack(queue, &stackAllocatedByteBuffer[i], 1);
                                        //xQueueSendToBack(queue, &classMemberByteBuffer[i], 1);
                                }
                                //waiting for the second half being received into buffer
                                 ulTaskNotifyTake(pdTRUE, portMAX_DELAY);
                                 //and copying further
                                 for(int i = 36; i < 72; ++i)
                                 {
                                        xQueueSendToBack(queue, &stackAllocatedByteBuffer[i], 1);
                                        //xQueueSendToBack(queue, &classMemberByteBuffer[i], 1);
                                }
                         }
                }
        }
};

The overridden Rx callbacks:

void HAL_UART_RxHalfCpltCallback(UART_HandleTypeDef *huart)
{
    if(huart == &huart_DataGrabber) 
    {   
        BaseType_t px = pdFALSE;
        vTaskNotifyGiveFromISR(dataGrabber_handle, &px);
        portYIELD_FROM_ISR(px);
    }
}
 
void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
    if(huart == &huart_DataGrabber) 
    {   
        BaseType_t px = pdFALSE;
        vTaskNotifyGiveFromISR(dataGrabber_handle, &px);
        portYIELD_FROM_ISR(px);
    }
}

The code in main.cpp:

#include "stm32f1xx_hal.h"
#include "DataGrabberReceiver.h"
 
UART_HandleTypeDef huart_DataGrabber;
TaskHandle_t dataGrabber_handle;
 
int main(void)
{
    HAL_Init();
    SystemClock_Config();
    MX_GPIO_Init();
    MX_DMA_Init();
    MX_USART2_UART_Init();
 
    auto receiverThread = DataGrabberReceiver("DataGrabberReceiver", 128, 3);
    dataGrabber_handle = receiverThread.getTaskHandle();
 
    vTaskStartScheduler();
 
    /* We should never get here */
    while(1) {  }
}

Everything is quite obvious there, and what's more, it is generated by ST CubeMX. So the task is created, the scheduler is started and... everything works fine if I work with stackAllocatedByteBuffer[] inside the while loop of DataGrabberReceiver::run. And I get a HardFault if I work with classMemberByteBuffer[].

Let me answer some questions:

0. The task's stack was set to different sizes: 128 words, 1024 words... all the same.

1. Yes, the NVIC interrupt priorities of the DMA and UART are logically lower than configMAX_SYSCALL_INTERRUPT_PRIORITY.

2. If one makes classMemberByteBuffer static or global, everything works fine.

3. The heap size in the FreeRTOS config file is set to 30 KB.

4. I am using heap_4.c and have tried to move the heap to another location by manually defining the ucHeap array and asking the linker to put it at a certain RAM position. It did not help.

5. If one drops the wrapper, everything seems to work. But I need runtime polymorphism for interfaces, and encapsulation. C function pointers are not an option.

6. When the HardFault occurs, the stack trace (I am using a Segger J-Link via SWD) looks like this:

Thread #1 57005 (Suspended : Signal : SIGTRAP:Trace/breakpoint trap)    
    HardFault_Handler() at stm32f1xx_it.c:78 0x8003a20  
    <signal handler called>() at 0xfffffff1 
    uxListRemove() at list.c:218 0x8004748  
    xTaskIncrementTick() at tasks.c:2,571 0x800565e 
    xPortSysTickHandler() at port.c:445 0x8004534   
    osSystickHandler() at cmsis_os.c:1,415 0x8004624    
    SysTick_Handler() at stm32f1xx_it.c:166 0x8003a4a   
    <signal handler called>() at 0xfffffffd 
    prvPortStartFirstTask() at port.c:270 0x8004360 
    xPortStartScheduler() at port.c:350 0x8004402

The faulting location varies, which makes one think that something is wrong with the kernel, which I believe is impossible.

7. Class member variables are stored somewhere in the RAM heap, while the task's stack variables are stored in FreeRTOS's heap. They are separated in RAM. But I don't think that someone's (not FreeRTOS's) stack overlaps the RAM heap and corrupts memory.

Can anyone explain this, or has anyone faced such "magic"? I am almost sure that memory is corrupted somehow, but there is no reason for race conditions because the byte buffer is guarded. But HOW?? Help me please!

10 REPLIES
Ozone
Lead

> Who ... have faced with such "magic"?

I had something similar to that some while ago, only different MCU (Fujitsu proprietary) and different OS (OSEK).

One task passed a stack-based buffer to an IO-related function, which got served in the context of another task.

Due to some unfortunate conditions, the second (IO-related) task wrote on the supplied address while the calling task was terminated and the respective stack space re-used for another task.

Depending on the state of said task, either a data value or a return address was hit - resulting in strange errors or crashes.

FYI: OSEK, or our implementation, uses sequential tasks and task chains. Meaning, they are supposed to terminate in a given time limit, and are periodically reinvoked.

Only the idle/background task is implemented as endless loop.

> 2. If one makes classMemberByteBuffer static or global - everything works fine.

I would settle for that solution.

IProg.1759
Associate II
javadog42
Associate II

Hi,

The sourceforge link doesn't work anymore. Do you have an issue ID for this problem or a short explanation of how this happens?

Thanks !!

How does this happen? Simple. People who don't know enough C try to use the much more complex C++ and obviously get all sorts of fun. Three weeks of wasted time, how cool is that? Learn from others' mistakes, but don't copy them.

-- pa

/* a hint: in the first post, line 15 in listing of main.cpp: what does it really do?? */

Hello. Indeed, the link no longer exists. Anyway, the core of the answer was the following. When one declares and initializes a variable inside the body of main(), its memory is allocated on the stack. Everything is okay until the FreeRTOS scheduler starts. Once vTaskStartScheduler() is called, the main stack pointer (MSP) is reset to its initial value, the top of the stack, which is typically the very end of RAM. From then on, the scheduler's interrupt handlers (and every other ISR) run on that stack and cheerfully overwrite the local variables declared in main(), even though they are still in use.

Thus the core of the problem is that the stack pointer is reset to the end of RAM when the FreeRTOS scheduler is launched. One can clearly see this in the assembly code by following this chain of function calls: vTaskStartScheduler() -> xPortStartScheduler() -> prvPortStartFirstTask(). There you find this tricky instruction:

" msr msp, r0         \n" /* Set the msp back to the start of the stack. */

In conclusion, I would like to say that this was not the only problem I faced while trying to combine modern C++, HAL and FreeRTOS. I gave up on the idea and moved to pure C. In C one can also structure a program using OOP. It is not as convenient as C++, but it is stable and predictable.
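For anyone who wants to stay with C++: regarding the stack pointer reset described above, one way around it is to give the thread object static storage duration, which matches the observation in the first post that a static or global buffer works. Below is a minimal sketch of that idea, not the original wrapper. Receiver, fakeTaskCreate and receiverThread are hypothetical names, and fakeTaskCreate is only a host-side stand-in for xTaskCreate() so that the snippet compiles without FreeRTOS.

```cpp
#include <cassert>
#include <cstdint>

// Host-side stand-ins (assumptions) so the sketch compiles without FreeRTOS.
using TaskHandle_t = void*;
static void* lastTaskArgument = nullptr;

// Hypothetical replacement for xTaskCreate(): it only records the argument
// pointer that the real call would later hand to the task function.
static bool fakeTaskCreate(void (*entry)(void*), void* argument, TaskHandle_t* handle)
{
    assert(entry != nullptr);
    lastTaskArgument = argument;
    *handle = argument;
    return true;
}

class Receiver
{
public:
    Receiver() { fakeTaskCreate(&Receiver::adapter, this, &handle_); }
    TaskHandle_t handle() const { return handle_; }
    std::uint8_t* buffer() { return buffer_; }

private:
    static void adapter(void* self) { (void)self; }
    TaskHandle_t handle_ = nullptr;
    std::uint8_t buffer_[72] = {};
};

// The object has static storage duration, so it does not live on the stack
// of main(). Resetting MSP when the scheduler starts cannot hand its memory
// over to interrupt handlers.
Receiver& receiverThread()
{
    static Receiver instance;
    return instance;
}
```

In main() one would then call receiverThread() once before vTaskStartScheduler() and take the handle from it. Allocating the object with new, or declaring it at namespace scope, should have the same effect, since neither places it on the stack of main().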

It creates the DataGrabberReceiver task. In theory it should be equivalent to an xTaskCreate() call. The left-hand side auto receiverThread and the following line 16 are useless in this reduced example, but I used auto receiverThread to put it as a pointer into a map std::map<ObjectKey, std::shared_ptr<void>> storage; where ObjectKey is a hand-written enum class used as the key. The idea was to get any object out of the map (a thread handle to notify, a queue, a semaphore and so on) by its known key at any desired and allowed point of the program, avoiding dependency injection, global variables and specific includes. Only a universal "Storage.h" with specific methods. But this idea also crashed against my misunderstanding of how modern C++ and FreeRTOS interact. If you're interested, I can provide the full code of the described map storage.
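To illustrate the map storage idea, here is a rough sketch of what such a keyed registry could look like. This is not the original "Storage.h": the Storage class, its put/get methods and the ObjectKey values are assumptions made up for the example; only the std::map<ObjectKey, std::shared_ptr<void>> type comes from the description above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>

// Hypothetical keys; the real enum class was not posted.
enum class ObjectKey { ReceiverThread, ReceiverQueue };

class Storage
{
public:
    // Stores an object under a key; the type is erased to shared_ptr<void>.
    template <typename T>
    void put(ObjectKey key, std::shared_ptr<T> object)
    {
        assert(object != nullptr);
        objects_[key] = std::move(object);
    }

    // Returns the object stored under the key, or nullptr if there is none.
    // The caller must ask for the same type that was stored; nothing checks it.
    template <typename T>
    std::shared_ptr<T> get(ObjectKey key) const
    {
        auto it = objects_.find(key);
        if (it == objects_.end())
        {
            return nullptr;
        }
        return std::static_pointer_cast<T>(it->second);
    }

private:
    std::map<ObjectKey, std::shared_ptr<void>> objects_;
};
```

Note that an object created with std::make_shared lives on the C++ heap, not on the stack of main(), so a thread object owned by such a registry would not be affected by the stack pointer reset. A shared_ptr that merely points at a local variable of main() would not help.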

javadog42
Associate II

Thanks a lot for your detailed answer. In the meantime I have found a lot of things about the FreeRTOS port on STM32, but my problem is a bit different, because the memory access fault happens much later and very seldom compared to the number of cycles in the code, and it's pure C. The problem lies somewhere in the combination of DMA, USART with 921800 bps and packet receive/transmit. Without transmit the fault doesn't happen. I'll post my findings later... hopefully!

  1. Does it work at lower speed?
  2. Do you use HAL?
  3. Can you post the shortest possible code that shows the problem? I mean, can you throw away everything but the transmitting thread and see whether the fault still reproduces?
javadog42
Associate II

Conclusion of my problem: a very simple stack overflow in a very simple FreeRTOS task that does a bit of GPIO and simply calls vTaskDelay(...).

Conclusion#2: Only trust your own code.

#define configMINIMAL_STACK_SIZE                 ((uint16_t)128)

This is not enough. In my task, which just calls vTaskDelay, the stack needs to be around 160 words at least.
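A number like 160 can be measured rather than guessed. FreeRTOS can fill a new task stack with a known byte (0xA5) and offers uxTaskGetStackHighWaterMark() to report how much of it was never touched. The sketch below only illustrates the principle on a plain byte array so it can run on a PC; unusedStackBytes is a hypothetical helper, not a FreeRTOS function, and it assumes a stack that grows downwards, as on Cortex-M.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same value as tskSTACK_FILL_BYTE in FreeRTOS's tasks.c.
constexpr std::uint8_t kStackFillByte = 0xA5;

// Counts the bytes at the low end of a stack area that still hold the fill
// pattern. With a downward-growing stack these are the bytes never used.
std::size_t unusedStackBytes(const std::uint8_t* lowestAddress, std::size_t sizeBytes)
{
    assert(lowestAddress != nullptr);
    std::size_t unused = 0;
    while (unused < sizeBytes && lowestAddress[unused] == kStackFillByte)
    {
        ++unused;
    }
    return unused;
}
```

On the target, calling uxTaskGetStackHighWaterMark(NULL) from inside the task after it has run for a while gives the same kind of figure, in words (this needs INCLUDE_uxTaskGetStackHighWaterMark set to 1). A value close to zero means the stack is too small.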

Then you might think, that in a code full of assertions and tests for wrong parameter values or null pointers the following will help:

#define configCHECK_FOR_STACK_OVERFLOW	1

But in this case you have to implement (override) at least this weak function yourself:

/**
  Dummy implementation of the callback function vApplicationStackOverflowHook().
*/
#if (configCHECK_FOR_STACK_OVERFLOW > 0)
__WEAK void vApplicationStackOverflowHook (TaskHandle_t xTask, signed char *pcTaskName) {
  (void)xTask;
  (void)pcTaskName;
}
#endif

A debug break would make the developer happier than the two lines of code to make the compiler happy 😉
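Something like the following is what I have in mind: a sketch of a hook that records the task name and stops in the debugger. The variables overflowSeen and overflowTaskName are made up for the example, and the TaskHandle_t alias is only a stand-in so the snippet compiles on a PC; on the target you would include FreeRTOS.h and task.h instead. The breakpoint is only compiled for ARMv7-M (Cortex-M3) builds, assuming a GCC-style compiler that defines __ARM_ARCH_7M__.

```cpp
#include <cassert>
#include <cstring>

// Host stand-in (assumption); on the target this type comes from task.h.
using TaskHandle_t = void*;

// Hypothetical variables to inspect in the debugger after the overflow.
static volatile bool overflowSeen = false;
static char overflowTaskName[16] = {};

// A non-weak definition like this replaces the weak dummy implementation.
extern "C" void vApplicationStackOverflowHook(TaskHandle_t xTask, signed char* pcTaskName)
{
    (void)xTask;
    assert(pcTaskName != nullptr);
    // Keep a copy of the name; the last byte stays zero as the terminator.
    std::strncpy(overflowTaskName,
                 reinterpret_cast<const char*>(pcTaskName),
                 sizeof(overflowTaskName) - 1);
    overflowSeen = true;
#if defined(__ARM_ARCH_7M__)
    __asm volatile("bkpt #0");  // stop here when a debugger is attached
    for (;;) { }                // never return to the corrupted task
#endif
}
```

Without a debugger attached, the bkpt instruction escalates to a HardFault on Cortex-M3, so it is best kept for debug builds.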