
LwIP / RTOS netconn thread hang

greg239955_stm1
Associate II
Posted on May 26, 2015 at 13:01

I'm hoping someone can help me out here.

I have:
- App running LwIP/FreeRTOS (generated from CubeMX)
- tcpip thread running at real-time priority
- application thread using netconn to send/receive, at a priority above normal

Intermittently the application thread gets stuck. After much debugging, I've found the problem is the tcpip_thread interrupting the application thread and screwing up the semaphores that signal when an operation is done.
- The application thread ends up calling do_writemore(), which, once all the data is sent, calls sys_sem_signal(&conn->op_completed)
- The tcpip_thread calls do_recv(), which at the end does a TCPIP_APIMSG_ACK(msg), which is also a sys_sem_signal(&m->conn->op_completed)

Normally the sequence is:
- app thread: send start
- app thread: send op_completed comes back
- tcp thread: receive start
- tcp thread: receive op_completed comes back

When it hangs, the order is:
- app thread: send start
- tcp thread: receive start
- tcp thread: receive op_completed; sys_mutex_unlock() fails here, xSemaphoreGive() returns a queue full error
- send op_completed never arrives, application thread hangs forever

If I change my application thread to the same priority as the tcpip_thread, it seems to fix the issue. Debugging shows this in api_msg.c:

    if (write_finished) {
      /* everything was written: set back connection state
         and back to application task */
      conn->current_msg->err = err;
      conn->current_msg = NULL;
      conn->state = NETCONN_NONE;
      /* ** context switch occurs here ** */
    #if LWIP_TCPIP_CORE_LOCKING
      if ((conn->flags & NETCONN_FLAG_WRITE_DELAYED) != 0)
    #endif
      {
        sys_sem_signal(&conn->op_completed);
      }
    }

It switches context to the tcpip_thread() after setting the NETCONN_NONE state, but before signaling that the operation is complete.

Am I missing something here? Thanks.

#lwip #stm32 #freertos
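The "queue full" symptom described above is what a binary semaphore reports when it is signalled twice before anyone has waited on it. The following is only a plain-C model of that behaviour, not FreeRTOS or lwIP code: `bin_sem`, `sem_give`, `sem_try_take`, `run_ordered` and `run_interleaved` are hypothetical names invented for this sketch, and the exact interleaving inside lwIP may differ from the one shown here.

```c
#include <stdbool.h>

/* Plain-C model of a binary semaphore: it can hold at most one token.
   Illustration only; this is NOT the FreeRTOS implementation. */
typedef struct { bool token; } bin_sem;

/* Returns false when a token is already pending, comparable to
   xSemaphoreGive() reporting that the underlying queue is full. */
static bool sem_give(bin_sem *s)
{
  if (s->token) return false;
  s->token = true;
  return true;
}

/* Non-blocking take: false means a real task would block here. */
static bool sem_try_take(bin_sem *s)
{
  if (!s->token) return false;
  s->token = false;
  return true;
}

/* Each completion is signalled and consumed before the next one starts.
   Returns how many waiters were woken. */
static int run_ordered(void)
{
  bin_sem op_completed = { false };
  int woken = 0;
  sem_give(&op_completed);               /* first operation completes */
  woken += sem_try_take(&op_completed);  /* its waiter wakes up */
  sem_give(&op_completed);               /* second operation completes */
  woken += sem_try_take(&op_completed);  /* its waiter wakes up */
  return woken;
}

/* Both completions are signalled back to back on the same semaphore.
   The second give is dropped, so only one waiter can ever wake up. */
static int run_interleaved(void)
{
  bin_sem op_completed = { false };
  int woken = 0;
  sem_give(&op_completed);               /* first completion: token stored */
  sem_give(&op_completed);               /* second completion: rejected, token lost */
  woken += sem_try_take(&op_completed);  /* one waiter wakes */
  woken += sem_try_take(&op_completed);  /* the other would block forever */
  return woken;
}
```

In this model `run_ordered()` wakes both waiters, while `run_interleaved()` wakes only one, which would match a thread that waits on op_completed and never returns.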
2 REPLIES
greg239955_stm1
Associate II
Posted on May 27, 2015 at 01:52

I figured out that the Netconn API is not thread safe. Full duplex has to be done using LWIP_SO_RCVTIMEO and setting recv_timeout on the socket, and doing TX after RX has finished or timed out.
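The "RX with a timeout, then TX" pattern can be sketched roughly as below. This is a self-contained model, not lwIP code: `io_result`, `duplex_io`, `duplex_step`, `fake_link`, `fake_recv` and `fake_send` are all hypothetical names made up for the illustration. In a real lwIP build the `recv` callback would presumably wrap netconn_recv() on a connection whose timeout was set with netconn_set_recvtimeout() (available when LWIP_SO_RCVTIMEO is enabled), and the `send` callback would wrap netconn_write().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch (lwIP itself uses err_t values). */
typedef enum { IO_OK, IO_TIMEOUT, IO_CLOSED } io_result;

/* Hypothetical transport interface; recv must return after a bounded wait. */
typedef struct {
  io_result (*recv)(void *ctx, char *buf, size_t cap, size_t *len);
  io_result (*send)(void *ctx, const char *data, size_t len);
  void *ctx;
} duplex_io;

/* One pass of a single-threaded duplex loop: receive first (data or timeout),
   and only then transmit, so the two operations never overlap on the connection. */
static io_result duplex_step(const duplex_io *io, const char *tx, size_t tx_len,
                             char *rx, size_t rx_cap, size_t *rx_len)
{
  io_result r;
  *rx_len = 0;
  r = io->recv(io->ctx, rx, rx_cap, rx_len);
  if (r == IO_CLOSED) return IO_CLOSED;
  /* RX has finished or timed out: it is now safe to send pending data. */
  if (tx_len > 0 && io->send(io->ctx, tx, tx_len) != IO_OK) return IO_CLOSED;
  return r;
}

/* In-memory stand-in for a connection, used only to exercise the loop. */
typedef struct {
  const char *rx_data;  /* NULL means "nothing arrived before the timeout" */
  char sent[64];
  size_t sent_len;
} fake_link;

static io_result fake_recv(void *ctx, char *buf, size_t cap, size_t *len)
{
  fake_link *l = (fake_link *)ctx;
  size_t n;
  if (l->rx_data == NULL) return IO_TIMEOUT;
  n = strlen(l->rx_data);
  if (n > cap) n = cap;
  memcpy(buf, l->rx_data, n);
  *len = n;
  l->rx_data = NULL;  /* data consumed */
  return IO_OK;
}

static io_result fake_send(void *ctx, const char *data, size_t len)
{
  fake_link *l = (fake_link *)ctx;
  if (len > sizeof l->sent - l->sent_len) return IO_CLOSED;
  memcpy(l->sent + l->sent_len, data, len);
  l->sent_len += len;
  return IO_OK;
}
```

The application task would call `duplex_step()` in a loop. The trade-off is that outgoing data can be delayed by up to one receive timeout, so the timeout value bounds the TX latency.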

Would be nice if the ST LwIP application note made this clearer.

StefanoBettega1
Associate II
Posted on August 05, 2015 at 16:57

Hi Greg,

I'm experiencing a similar issue. Sometimes my application crashes in tcpip_thread, around API message handling:

    ...
    case TCPIP_MSG_API:
      LWIP_DEBUGF(TCPIP_DEBUG, ("tcpip_thread: API message %p\n", (void *)msg));
      msg->msg.apimsg->function(&(msg->msg.apimsg->msg)); //<<< crash around here
      break;
    ...

I have one thread using a connection to read and write on a socket (offering a telnet-like connection for a console) and a second thread that opens up to four connections to different addresses, each one having its own read and write operations. All of my tasks use the netconn API, and no socket is shared between tasks. The crash happens randomly and I'm not able to find a suitable explanation; I think I am experiencing a similar problem, with a semaphore being screwed up by another task for some reason. I hope to find a way out of this situation...