
lwIP deadlock with a twist

XD
Associate II

Setup:
MCU: STM32F429
PHY: DP83848 (MII)
Advertised Mode: 10BASE-T Half/Full-Duplex, 100BASE-TX Half/Full-Duplex
OS: CMSIS-RTOSv2
CubeMX: Yes

 

Just wanted to start off by mentioning that I've read and tried the fixes provided by @Piranha. I've tried the "Semaphore fix", but it has not helped: I still get stuck on the following line in the function xQueueSemaphoreTake (queue.c). uxItemSize does not equal zero, so the CPU locks up at that line. uxItemSize always equals 2779096485 when it gets stuck, regardless of whether it's after 1 ping or after 5000 pings.

configASSERT( pxQueue->uxItemSize == 0 );

 

The interesting thing is that I'm able to ping the processor anywhere from a few dozen to a few thousand times before it locks up.

The processor gets stuck instantly if I scan my network with Advanced IP Scanner.

However, the processor does not get stuck when scanned if I continuously poll USART3 at 1 Hz. It seems to get stuck eventually, but I'm able to scan it with Advanced IP Scanner several times, something I'm not able to do even once if USART3 isn't used.
USART6 is constantly being used, but the deadlock still occurs.

Both USARTs use the below functions for Rx/Tx.

HAL_UART_Transmit_DMA(&huart3, buffer, size);
HAL_UARTEx_ReceiveToIdle_DMA(&huart3, buffer, size);

The only noticeable difference I can see is that USART3 waits on an event flag that is set in HAL_UARTEx_RxEventCallback.
USART6 uses an osDelay to check the data.

osEventFlagsWait(UART_FlagHandle, 1, osFlagsWaitAny, osWaitForever);

 

I have no idea what is going on. I assume it has something to do with the OS scheduling, but I can't tell.
I'm new to Ethernet and networking, so I'm not really sure where to start. Any help is greatly appreciated!

4 REPLIES
Bob S
Principal

Which versions of CubeMX and the CubeF4 library are you using? (Be specific; please don't say "the latest".)

2779096485 = 0xA5A5A5A5, which looks suspiciously like a pattern used to fill memory (stack???) so you can see whether it has been accessed. That means pxQueue has somehow been corrupted. Perhaps a stack overflow somewhere? Turn on FreeRTOS stack overflow checking (CHECK_FOR_STACK_OVERFLOW, under "Config Parameters" -> "Hook function related definitions", at least in older CubeMX versions).
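To make the fill-pattern idea concrete, here is a small host-side sketch. It is not FreeRTOS code: `fill_stack`, `unused_bytes`, and `word_at` are hypothetical helpers, and the buffer only simulates a task stack that grows downward (as on Cortex-M). It shows that four fill bytes read as one 32-bit word give exactly the decimal value from the question, and how counting the untouched fill bytes reveals how much of a stack was never used.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILL_BYTE 0xA5u  /* byte value FreeRTOS uses to pre-fill task stacks */

/* Pre-fill a (simulated) stack region with the marker byte. */
static void fill_stack(uint8_t *stack, size_t size)
{
    memset(stack, FILL_BYTE, size);
}

/* Count the untouched bytes at the low end of a descending stack.
   The count stops at the first byte that no longer holds the marker. */
static size_t unused_bytes(const uint8_t *stack, size_t size)
{
    size_t n = 0;
    while (n < size && stack[n] == FILL_BYTE) {
        n++;
    }
    return n;
}

/* Read four consecutive bytes as one 32-bit word (alignment-safe). */
static uint32_t word_at(const uint8_t *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}
```

A field that reads back as 0xA5A5A5A5 is therefore a hint that the code is looking at never-written stack memory rather than at a real queue structure.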

When you halt in the debugger while the code is stuck, what does the stack trace show?

XD
Associate II

Hi, 

Software versions:
CubeMX: v6.9.1
HAL firmware: v1.27.1
I've checked "use latest available version", but it hasn't upgraded to the newer v1.28. It seems ST only added a license file to lwIP?

I enabled stack overflow detection with both option 1 and option 2. It does not seem to have triggered. Should it call vApplicationStackOverflowHook if an overflow occurs?

Below is the stack trace from when I halted the debugger with the processor stuck.

StackTrace.png

 

Bob S
Principal

Yes, the FreeRTOS scheduler should call vApplicationStackOverflowHook() when it detects an overflow.
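An overflow hook is easy to miss if it does nothing visible, so it can help to have it record which task overflowed and then stop. The following is only a sketch under assumptions: the `TaskHandle_t` typedef is a host stand-in so the snippet compiles without FreeRTOS, `g_overflow_seen`, `g_overflow_task`, and the `ON_TARGET` build flag are hypothetical names, and the exact hook prototype should be checked against the task.h in your firmware package.

```c
#include <string.h>

/* Host stand-in so the sketch compiles without FreeRTOS. On target,
   include "FreeRTOS.h" and "task.h" instead and delete this typedef. */
typedef void *TaskHandle_t;

/* Hypothetical diagnostics, readable in the debugger after a halt. */
static volatile int g_overflow_seen = 0;
static char g_overflow_task[16];

void vApplicationStackOverflowHook(TaskHandle_t xTask, char *pcTaskName)
{
    (void)xTask;

    /* Keep a bounded, NUL-terminated copy of the offending task's name. */
    strncpy(g_overflow_task, pcTaskName, sizeof g_overflow_task - 1);
    g_overflow_task[sizeof g_overflow_task - 1] = '\0';
    g_overflow_seen = 1;

#ifdef ON_TARGET            /* hypothetical flag: defined only in the MCU build */
    taskDISABLE_INTERRUPTS();
    for (;;) {
        /* Park here; set a breakpoint on this loop and inspect g_overflow_task. */
    }
#endif
}
```

With a breakpoint inside the hook, an overflow that happens early (for example during initialization) is hard to overlook.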

This is definitely corrupted memory somewhere. 0xA5A5A5A5 is the value that FreeRTOS uses to fill the stack areas in order to check for overflow. So something is corrupting the "pxQueue" pointer before it gets passed to xQueueSemaphoreTake().

I'm not running the newer Ethernet/LwIP in CubeF4 ver 1.27, so I can't tell you WHICH semaphore or mutex low_level_output() is trying to acquire (the older versions do not use a semaphore/mutex there).  Start your search there.  See if you can find out who or what is clobbering that mutex/semaphore structure.

XD
Associate II

I don't know how I missed it, but vApplicationStackOverflowHook() is called when calling MX_LWIP_Init(). An overflow occurs in the EthIf task. I increased its stack size by 4x for testing purposes and it seems to work. I've pinged and scanned the device continuously for about 5 hours now without a single lockup.

 

  /* create the task that handles the ETH_MAC */
/* USER CODE BEGIN OS_THREAD_NEW_CMSIS_RTOS_V2 */
  memset(&attributes, 0x0, sizeof(osThreadAttr_t));
  attributes.name = "EthIf";
  attributes.stack_size = (INTERFACE_THREAD_STACK_SIZE * 4);
  attributes.priority = osPriorityRealtime;
  osThreadNew(ethernetif_input, netif, &attributes);
/* USER CODE END OS_THREAD_NEW_CMSIS_RTOS_V2 */
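The 4x factor was picked for testing. One way to settle on a final value is to measure how much of the stack is actually used and add a margin. The helpers below are a hypothetical sketch of that arithmetic only; the names and the 50 % margin are assumptions, not anything from CubeMX or lwIP. On target, the minimum free space could come from osThreadGetStackSpace(), and note that `stack_size` in osThreadAttr_t is given in bytes.

```c
#include <stdint.h>

/* Peak stack usage in bytes: allocated size minus the smallest amount
   of free space ever observed. Returns 0 for inconsistent input. */
static uint32_t peak_stack_used(uint32_t allocated_bytes, uint32_t min_free_bytes)
{
    return (min_free_bytes > allocated_bytes) ? 0u
                                              : allocated_bytes - min_free_bytes;
}

/* Suggested stack size: peak usage plus a safety margin in percent,
   rounded up to a multiple of 8 bytes. */
static uint32_t suggested_stack_bytes(uint32_t peak_used, uint32_t margin_percent)
{
    uint32_t with_margin = peak_used + (peak_used * margin_percent) / 100u;
    return (with_margin + 7u) & ~7u;
}
```

For example, with made-up numbers, a 1400-byte stack that never had less than 300 bytes free peaked at 1100 bytes; a 50 % margin then suggests 1656 bytes. Measure under the worst traffic you expect (such as the network scan that triggered the lockup) before trusting the result.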