2026-02-19 7:07 AM - edited 2026-02-19 7:43 AM
Hello! I recently discovered that my program (an STM32 TCP client) gets stuck in netconn_connect() for about 5 minutes if the server it is trying to connect to is unavailable when the function is called. Only after that time does the program enter the
if ((tcp_rexmit_rto_prepare(pcb) == ERR_OK) || ((pcb->unacked == NULL) && (pcb->unsent != NULL))) {
statement in tcp_slowtmr(). Once that happens, the debugger shows:
[Image: TCP PCB shown in debugger]
It is the tcp_rexmit_rto_prepare(pcb) == ERR_OK condition that lets the program enter the if statement. The debugger also shows that nrtx is 0, which means there were no retransmissions during those 5 minutes (rtime=575 slow-timer ticks, which at the default 500 ms per tick is about 287 seconds).
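For anyone checking the arithmetic: as far as I know, lwIP by default calls tcp_tmr() every 250 ms (TCP_TMR_INTERVAL) and runs tcp_slowtmr() on every second call, so one rtime tick is 500 ms. Below is a minimal sketch of the conversion. The macro names with the _MS suffix and the helper slow_ticks_to_ms() are my own and are not part of lwIP. It also assumes the default timer interval has not been overridden in lwipopts.h.

```c
#include <stdint.h>

/* Assumed lwIP defaults: tcp_tmr() runs every 250 ms and the slow
 * timer runs on every second call, i.e. every 500 ms.
 * These names are illustrative only, not lwIP macros. */
#define TCP_TMR_INTERVAL_MS   250u
#define TCP_SLOW_INTERVAL_MS  (2u * TCP_TMR_INTERVAL_MS)

/* Hypothetical helper: convert a slow-timer tick count (such as the
 * rtime value read in the debugger) to milliseconds. */
static uint32_t slow_ticks_to_ms(uint32_t ticks)
{
    return ticks * TCP_SLOW_INTERVAL_MS;
}
```

With rtime=575 this gives 287500 ms, which matches the roughly 5 minute hang I observed.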
Interestingly, it is possible to "unfreeze" the program much sooner by pinging the client device or by replugging the Ethernet cable. netconn_connect() then finally returns and the device can try to connect to the server again. By the way, the netconn I used was of the blocking type. A non-blocking one leaves netconn_connect() immediately and returns ERR_ISCONN, but that doesn't help with reestablishing the connection to the server if netconn_connect() was called while the server was unavailable.
Possible solution
I tried many things to fix the problem, including disabling the MPU (which I had configured according to this guide) and changing all LwIP semaphore timeouts from infinite to a few hundred milliseconds. What finally helped was changing the ethernetif_input() function in ethernetif.c:
/**
 * @brief This function should be called when a packet is ready to be read
 * from the interface. It uses the function low_level_input() that
 * should handle the actual reception of bytes from the network
 * interface. Then the type of the received packet is determined and
 * the appropriate input function is called.
 *
 * @param netif the lwip network interface structure for this ethernetif
 */
void ethernetif_input(void* argument)
{
    struct pbuf *p = NULL;
    struct netif *netif = (struct netif *) argument;

    /* OLD FOR LOOP */
    // for( ;; )
    // {
    //     if (osSemaphoreAcquire(RxPktSemaphore, TIME_WAITING_FOR_INPUT) == osOK)
    //     {
    //         do
    //         {
    //             p = low_level_input( netif );
    //             if (p != NULL)
    //             {
    //                 if (netif->input( p, netif) != ERR_OK )
    //                 {
    //                     pbuf_free(p);
    //                 }
    //             }
    //         } while(p!=NULL);
    //     }
    // }

    /* NEW FOR LOOP */
    for( ;; )
    {
        /* Return value is deliberately ignored: the loop body also runs on timeout. */
        osSemaphoreAcquire(RxPktSemaphore, TIME_WAITING_FOR_INPUT/*100*/); //both timeouts seem to work fine

        LOCK_TCPIP_CORE();
        HAL_ETH_ReleaseTxPacket(&heth); //release earlier transmitted packets
        UNLOCK_TCPIP_CORE();

        do
        {
            p = low_level_input(netif);
            if (p != NULL)
            {
                if (netif->input(p, netif) != ERR_OK)
                {
                    pbuf_free(p);
                }
            }
        } while(p != NULL);
    }
}
Calling HAL_ETH_ReleaseTxPacket() before the do... while loop finally allowed the program to retransmit the SYN packet properly and not get stuck in netconn_connect(). From what I understand, the old SYN packet, sent while the server was down, was still held by the DMA and kept a reference to the output segment. That made tcp_rexmit_rto_prepare() return ERR_VAL, so tcp_slowtmr() never entered the if statement mentioned at the beginning of this post. HAL_ETH_ReleaseTxPacket() makes the program acknowledge that the SYN packet has been sent, which makes retransmitting it possible. This also explains the ping behaviour: HAL_ETH_ReleaseTxPacket() is called whenever the device transmits (through low_level_output()), so pinging the STM32 makes it release old Tx packets while it responds to the ping.
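To illustrate the mechanism as I understand it, here is a simplified, self-contained model, not the actual lwIP source. All model_ names are hypothetical stand-ins. It assumes that lwIP treats a segment as busy while its pbuf reference count is not exactly 1, and that the driver holds an extra reference on the pbuf until the Tx descriptor is released.

```c
#include <stddef.h>

/* Simplified stand-ins for lwIP's pbuf and tcp_seg; only the fields
 * needed for this illustration are modelled. */
struct model_pbuf { unsigned ref; };
struct model_seg  { struct model_seg *next; struct model_pbuf *p; };

/* Values chosen to mirror lwIP's ERR_OK and ERR_VAL. */
enum { MODEL_ERR_OK = 0, MODEL_ERR_VAL = -6 };

/* A segment counts as busy while someone other than the TCP layer
 * (here: the Ethernet driver/DMA) still holds a reference to it. */
static int model_segment_busy(const struct model_seg *seg)
{
    return seg->p->ref != 1;
}

/* Retransmission can only be prepared if there is something unacked
 * and none of the unacked segments is still busy. */
static int model_rexmit_rto_prepare(const struct model_seg *unacked)
{
    const struct model_seg *seg;

    if (unacked == NULL) {
        return MODEL_ERR_VAL;
    }
    for (seg = unacked; seg != NULL; seg = seg->next) {
        if (model_segment_busy(seg)) {
            return MODEL_ERR_VAL; /* the stuck-SYN case */
        }
    }
    return MODEL_ERR_OK;
}
```

In this model, a SYN segment whose pbuf has ref == 2 (TCP plus the driver) makes model_rexmit_rto_prepare() return MODEL_ERR_VAL on every slow-timer tick. Once the driver drops its reference (ref back to 1), the function returns MODEL_ERR_OK and the retransmission can go ahead, which is what releasing the Tx packets appears to achieve.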
EDIT 1: A short, finite timeout in the new for loop of ethernetif_input() seems to be more reliable and "unfreezes" the program faster than a long or infinite timeout.