
Azure RTOS threadX stack checking stuck in loop

GreenGuy
Lead

I have an application with TX_ENABLE_STACK_CHECKING enabled. The application gets stuck in a do loop in tx_thread_stack_analyze.c, between lines 114 and 135 (H7.3.1.0). It is clear in the debugger that the loop will never complete. After checking that the pointers are not NULL, the loop binary-searches the stack for the fill pattern 0xEFEFEFEF. Here is the code:

 

                    /* We need to binary search the remaining stack for missing 0xEFEFEFEF 32-bit data pattern.
                       This is a best effort algorithm to find the highest stack usage. */
                    do
                    {

                        /* Calculate the size again. */
                        size =  (ULONG) (TX_ULONG_POINTER_DIF(stack_highest, stack_lowest))/((ULONG) 2);
                        stack_ptr =  TX_ULONG_POINTER_ADD(stack_lowest, size);

                        /* Determine if the pattern is still there.  */
                        if (*stack_ptr != TX_STACK_FILL)
                        {

                            /* Update the stack highest, since we need to look in the upper half now.  */
                            stack_highest =  stack_ptr;
                        }
                        else
                        {

                            /* Update the stack lowest, since we need to look in the lower half now.  */
                            stack_lowest =  stack_ptr;
                        }

                    } while(size > ((ULONG) 1));

 

At the start of the application, everything works fine. After all the threads have been instantiated and the application is idle, waiting for commands from either the web server or the debug port, a command is sent over the debug port to report the status of some variables. This kicks off the debug thread, which causes a stack check to occur. When the stack check starts (code listed above), the end stack pointer is less than the start. This means the code above never converges and loops endlessly.

The ThreadX List window shows the Debug thread stack as starting at 0x240285fc and ending at 0x240289f7, but when I set a breakpoint on the stack check to see the entry values, stack_lowest is 0x240285fc as expected while stack_highest is 0x2402859c. Should that not be the end address indicated in the Thread List?
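
The arithmetic behind the first pass of the loop can be worked through with the addresses above. This is a minimal sketch rather than ThreadX code: it assumes a 32-bit target where ULONG is 4 bytes and where TX_ULONG_POINTER_DIF yields the pointer difference in words cast to ULONG. Addresses are modeled as uint32_t values, and the helper names `words_between`, `first_size` and `first_probe` are hypothetical.

```c
#include <stdint.h>

/* Models the word difference between two addresses on a 32-bit target:
   the signed difference in 4-byte words, reinterpreted as unsigned. */
static uint32_t words_between(uint32_t a, uint32_t b)
{
    int64_t words = ((int64_t)a - (int64_t)b) / 4;
    return (uint32_t)words;   /* a negative count wraps to a huge value */
}

/* Size computed by the first loop pass: half the word distance. */
static uint32_t first_size(uint32_t highest, uint32_t lowest)
{
    return words_between(highest, lowest) / 2u;
}

/* Address probed on the first pass: lowest + size words, wrapping at 32 bits. */
static uint32_t first_probe(uint32_t highest, uint32_t lowest)
{
    return lowest + first_size(highest, lowest) * 4u;
}
```

With stack_highest = 0x2402859c and stack_lowest = 0x240285fc, the word difference is -24, which becomes 0xFFFFFFE8 once treated as unsigned. Under these assumptions size starts at 0x7FFFFFF4 instead of a small positive count, and the first probe wraps around to 0x240285cc, below the start of the stack.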

Is this a bug?

 

7 REPLIES
nouirakh
ST Employee

Hello @GreenGuy 


Based on your description, there is a discrepancy between the expected end of the stack (0x240289f7) shown in the ThreadX List window and the value you see for stack_highest (0x2402859c) when the stack check starts. This is unusual: stack_highest should point to the end of the stack, and it should not be less than stack_lowest, as you mentioned. There might be an issue with the initial values of the stack pointers or with how the debug thread uses its stack. It is not necessarily a bug in the RTOS; it could be an issue with the application's configuration or its use of the stack checking feature.

To reproduce this possible issue, could you please specify the STM32 product you are using and, if possible, give more details about your project settings and configuration?

Hi @nouirakh ,

 

I am experiencing the same issue while debugging a ThreadX application using USB and BLE.  (STM32WB55 with FW package: v.1.19.0, CubeIDE Version: 1.15.1). The lowest stack address is taken from tx_thread_stack_start, but the highest is somehow converted, apparently with a bug.

stack_highest: 0x20008ff4

stack_lowest:  0x20008ff8

 

 

 

 

VOID  _tx_thread_stack_analyze(TX_THREAD *thread_ptr)
{

TX_INTERRUPT_SAVE_AREA

ULONG       *stack_ptr;
ULONG       *stack_lowest;
ULONG       *stack_highest;
ULONG       size;


    /* Disable interrupts.  */
    TX_DISABLE

    /* Determine if the thread pointer is NULL.  */
    if (thread_ptr != TX_NULL)
    {

        /* Determine if the thread ID is invalid.  */
        if (thread_ptr -> tx_thread_id == TX_THREAD_ID)
        {

            /* Pickup the current stack variables.  */
            stack_lowest =   TX_VOID_TO_ULONG_POINTER_CONVERT(thread_ptr -> tx_thread_stack_start);

            /* Determine if the pointer is null.  */
            if (stack_lowest != TX_NULL)
            {

                /* Pickup the highest stack pointer.  */
                stack_highest =  TX_VOID_TO_ULONG_POINTER_CONVERT(thread_ptr -> tx_thread_stack_highest_ptr);

                /* Determine if the pointer is null.  */
                if (stack_highest != TX_NULL)
                {

                    /* Restore interrupts.  */
                    TX_RESTORE

                    /* We need to binary search the remaining stack for missing 0xEFEFEFEF 32-bit data pattern.
                       This is a best effort algorithm to find the highest stack usage. */
                    do
                    {

                        /* Calculate the size again. */
                        size =  (ULONG) (TX_ULONG_POINTER_DIF(stack_highest, stack_lowest))/((ULONG) 2);
                        stack_ptr =  TX_ULONG_POINTER_ADD(stack_lowest, size);

                        /* Determine if the pattern is still there.  */
                        if (*stack_ptr != TX_STACK_FILL)
                        {

                            /* Update the stack highest, since we need to look in the upper half now.  */
                            stack_highest =  stack_ptr;
                        }
                        else
                        {

                            /* Update the stack lowest, since we need to look in the lower half now.  */
                            stack_lowest =  stack_ptr;
                        }

                    } while(size > ((ULONG) 1));

                    /* Position to first used word - at this point we are within a few words.  */
                    while (*stack_ptr == TX_STACK_FILL)
                    {

                        /* Position to next word in stack.  */
                        stack_ptr =  TX_ULONG_POINTER_ADD(stack_ptr, 1);
                    }

                    /* Optional processing extension.  */
                    TX_THREAD_STACK_ANALYZE_EXTENSION

                    /* Disable interrupts.  */
                    TX_DISABLE

                    /* Check to see if the thread is still created.  */
                    if (thread_ptr -> tx_thread_id == TX_THREAD_ID)
                    {

                        /* Yes, thread is still created.  */

                        /* Now check the new highest stack pointer is past the stack start.  */
                        if (stack_ptr > (TX_VOID_TO_ULONG_POINTER_CONVERT(thread_ptr -> tx_thread_stack_start)))
                        {

                            /* Yes, now check that the new highest stack pointer is less than the previous highest stack pointer.  */
                            if (stack_ptr < (TX_VOID_TO_ULONG_POINTER_CONVERT(thread_ptr -> tx_thread_stack_highest_ptr)))
                            {

                                /* Yes, is the current highest stack pointer pointing at used memory?  */
                                if (*stack_ptr != TX_STACK_FILL)
                                {

                                    /* Yes, setup the highest stack usage.  */
                                    thread_ptr -> tx_thread_stack_highest_ptr =  stack_ptr;
                                }
                            }
                        }
                    }
                }
            }
        }
    }

    /* Restore interrupts.  */
    TX_RESTORE
}

 

 

 

STM32H743

FW_1.11.2

STM32CubeIDE Version: 1.14.1

STM32CubeMX - STM32 Device Configuration Tool Version: 6.10.0-RC9

AZURE RTOS package STM32H7 3.2.0

 

 

xixi
Associate II

I've run into the same issue. It was probably caused by a stack overflow. Interestingly, the thread where the stack error occurs is not the thread that is running, but the thread that is about to resume. So you may need to increase the stack size of the ready thread.

cvanbeek
Associate III

Same problem:

  • STM32U585VITxQ
  • FW_U5 V1.7.0
  • STM32CubeMX version 6.13.0 (with ThreadX version 6.4.0)
  • TrustZone project
skeeter
Associate II

Same issue here with TX_ENABLE_STACK_CHECKING.

Calling tx_event_flags_set() from inside an ISR causes ThreadX to resume a thread, which calls the stack checker. The stack checker then loops forever with size=0x7fffffff.
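
The value 0x7fffffff is what the search appears to settle on once stack_highest sits one word below stack_lowest and the probed word does not hold the fill pattern. A minimal sketch of that behavior, assuming a 32-bit ULONG, with addresses modeled as uint32_t and every probed word treated as not matching TX_STACK_FILL. The function name `simulate_search` and the iteration cap are hypothetical, added only so the sketch terminates.

```c
#include <stdint.h>

/* Replays the binary-search loop on plain integers.
   Returns the size value from the pass on which the loop would exit
   (size <= 1), or the last size seen if max_iters passes are used up. */
static uint32_t simulate_search(uint32_t highest, uint32_t lowest,
                                unsigned max_iters)
{
    uint32_t size = 0;

    for (unsigned i = 0; i < max_iters; i++) {
        /* Word difference, wrapped to unsigned as on a 32-bit target. */
        int64_t words = ((int64_t)highest - (int64_t)lowest) / 4;
        size = (uint32_t)words / 2u;

        /* Probe address: lowest + size words, wrapping at 32 bits. */
        uint32_t probe = lowest + size * 4u;

        /* Assumption: the pattern is missing, so the upper bound moves. */
        highest = probe;

        if (size <= 1u) {
            return size;   /* the real do/while would exit here */
        }
    }
    return size;           /* cap reached: the loop did not converge */
}
```

Starting from stack_highest = 0x20008ff4 and stack_lowest = 0x20008ff8 (the values reported earlier in this thread), each pass recomputes size = 0x7FFFFFFF and probes the same address again, so the bounds never change. With the bounds in the normal order the same function winds down to size 1 within a few passes. In the real code a probe that does hit the fill pattern would move stack_lowest instead and let the loop exit, which this sketch deliberately leaves out.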

  • STM32CubeMX 6.14
  • STM32F4 MCU
  • ThreadX 6.1.10
skydiver
Associate II

I also ran into this, and it looks like when you have a real stack overflow the stack checking code gets stuck looping on a massive `size` value that it calculates with `TX_ULONG_POINTER_DIF()`.  In my case the stack_highest was lower than the stack_lowest on the thread that had an overflow.

Once I fixed my overflow it worked correctly, so I would check all of your threads for an overflow if you run into this.  You can modify the code to set a volatile flag variable if `stack_highest < stack_lowest`, and put a breakpoint on the flag assignment.  Then you can use the `thread_ptr` to figure out which thread is the problem.
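
A minimal sketch of that flag check, written as a standalone helper so it compiles without the ThreadX headers. The `ULONG` typedef is a stand-in for the ThreadX type, and the names `stack_bounds_inverted` and `stack_bounds_ok` are hypothetical, not part of ThreadX.

```c
#include <stdint.h>

typedef unsigned long ULONG;   /* stand-in for the ThreadX ULONG type */

/* Hypothetical debug flag; put a breakpoint on the assignment below. */
static volatile int stack_bounds_inverted = 0;

/* Returns 1 when the bounds are in the expected order, 0 when
   stack_highest is below stack_lowest (likely a stack overflow). */
static int stack_bounds_ok(const ULONG *stack_lowest,
                           const ULONG *stack_highest)
{
    /* Compare as integers so unrelated pointers can be ordered safely. */
    if ((uintptr_t)stack_highest < (uintptr_t)stack_lowest) {
        stack_bounds_inverted = 1;   /* breakpoint here */
        return 0;
    }
    return 1;
}
```

One way to use it would be to call the helper just before the do loop in a local copy of `_tx_thread_stack_analyze()` and skip the search when it returns 0. When the breakpoint hits, `thread_ptr` in the calling frame identifies the thread with the inverted bounds.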

It would probably be a good idea to make this code at least sit in a well-labeled `while (1)` loop or something that indicates the problem is a stack overflow and not some other issue.  My application was "working fine" as far as I knew before I enabled stack checking, so my first thought was not "maybe I'm having a stack overflow" when my code suddenly got stuck in the new checking code.

I would imagine a lot of people enable this when they see suspicious behavior that might be from a stack overflow, so having it *** out when a stack overflow occurs doesn't make much sense.  Also, I had to manually calculate my max sizes, because the RTOS-aware feature in the IDE doesn't seem to notice it should populate that column.
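
For the manual calculation, a minimal sketch of deriving peak stack usage from the thread control block fields. It assumes a descending stack, that tx_thread_stack_end points at the last byte of the stack area, and that tx_thread_stack_highest_ptr holds the lowest address found in use. The struct is a hypothetical stand-in for the relevant TX_THREAD fields so the example compiles without tx_api.h, and the function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three TX_THREAD fields used here. */
struct thread_stack_info {
    void *tx_thread_stack_start;        /* lowest address of the stack  */
    void *tx_thread_stack_end;          /* last byte of the stack       */
    void *tx_thread_stack_highest_ptr;  /* lowest address seen in use   */
};

/* Total stack size in bytes. */
static size_t stack_bytes_total(const struct thread_stack_info *t)
{
    uintptr_t start = (uintptr_t)t->tx_thread_stack_start;
    uintptr_t end   = (uintptr_t)t->tx_thread_stack_end;
    return (size_t)(end - start) + 1u;
}

/* Peak usage in bytes; reports the whole stack as used when the
   high-water mark lies outside the stack, which suggests an overflow. */
static size_t stack_bytes_used(const struct thread_stack_info *t)
{
    uintptr_t start = (uintptr_t)t->tx_thread_stack_start;
    uintptr_t end   = (uintptr_t)t->tx_thread_stack_end;
    uintptr_t high  = (uintptr_t)t->tx_thread_stack_highest_ptr;

    if (high < start || high > end) {
        return stack_bytes_total(t);
    }
    return (size_t)(end - high) + 1u;
}
```

Reading the three fields from each thread in the debugger and applying the same subtraction gives the value the IDE column would otherwise show. A result equal to the total size is a hint to check that thread for an overflow.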