STM32F7 lwIP, TCP, dupacks and delayed retransmit - w/ source code

LCE
Principal II

Hello,

I'm working with an STM32F767 on a custom PCB and on a Nucleo-144. Both show the same behavior, so it's probably not hardware / board related.

Source code:

  • Bare metal, no OS
  • SAI to ETH, via DMA
  • ETH: zero copy Tx & Rx
  • http server (TCP) works perfect
  • PTP via UDP works perfect
  • Debugging with UART, Wireshark, Jperf.

Problem:

  • streaming / sending TCP data from STM32 to Jperf server, 6.4 Mbit/s
  • every now and then there's a dupack, then fast retransmission, okay...
  • but usually after a few seconds there are lots of dupacks (>3), and after too many a retransmission occurs with a delay of > 400 ms, killing my application

I played with the lwIP timers and the TCP retransmission timers; that reduced the delay to these 400 ms. Before that it was sometimes 1.5 s.

I turned off everything in the source code that might block.

I changed all ETH transfers to zero copy, TX with interrupt for non-blocking PTPd.

Any ideas?

Edit: I uploaded the most important source code parts - the complete zero copy (mostly non-Cube/HAL) ethernetif.c / .h

Edit 2: I uploaded the updated source, having removed some blunders...

For now TCP is running smoothly with 25.6 Mbps

1 ACCEPTED SOLUTION

LCE
Principal II

At least I found the cause of the late collision error:

It's the HAL / Cube setting up of the LAN8742, rather the reading of the PHY settings.

I used the "old" STM32F7 Cube HAL stuff for setting this up - the last part remaining where I use HAL / Cube for ethernet...

It's reading back the wrong PHY register, and then sets its own duplex mode wrongly in MACCR.

Aaaaaarrrghhh....!

I found that because a web search told me that the most probable cause of a late collision error is one side having the wrong duplex mode set.

Then I found that someone in another forum complained about the wrong register reading in HAL / Cube for LAN8742.

Here are the wrong and the corrected versions from stm32f7xx_hal_eth.c:

I hope that and the code above might help other people.

#define PHY_REGNEW_SSR					((uint16_t)31)			/* PHY Special Status Register */
#define PHY_REGNEW_SSR_SPD_100M			((uint32_t)1 <<  3)		/* Speed Indication, bits 4:2: bit 3 set = 100M */
#define PHY_REGNEW_SSR_DUPL_FULL		((uint32_t)1 <<  4)		/* Speed Indication, bits 4:2: bit 4 set = full duplex */
 
#if( 0 )
/* 2022-10-01
 * HAL reading back wrong register for status of LAN8742!
 */
	/* WRONG */
		/* Read the result of the auto-negotiation */
		if((HAL_ETH_ReadPHYRegister(heth, PHY_SR, &phyreg)) != HAL_OK)
		{
			/* In case of write timeout */
			err = ETH_ERROR;
 
			/* Config MAC and DMA */
			ETH_MACDMAConfig(heth, err);
 
			/* Set the ETH peripheral state to READY */
			heth->State = HAL_ETH_STATE_READY;
 
			/* Return HAL_ERROR */
			return HAL_ERROR;
		}
 
		/* Configure the MAC with the Duplex Mode fixed by the auto-negotiation process */
		if((phyreg & PHY_DUPLEX_STATUS) != (uint32_t)RESET)
		{
			/* Set Ethernet duplex mode to Full-duplex following the auto-negotiation */
			(heth->Init).DuplexMode = ETH_MODE_FULLDUPLEX;
		}
		else
		{
			/* Set Ethernet duplex mode to Half-duplex following the auto-negotiation */
			(heth->Init).DuplexMode = ETH_MODE_HALFDUPLEX;
		}
		/* Configure the MAC with the speed fixed by the auto-negotiation process */
		if((phyreg & PHY_SPEED_STATUS) == PHY_SPEED_STATUS)
		{
			/* Set Ethernet speed to 10M following the auto-negotiation */
			(heth->Init).Speed = ETH_SPEED_10M;
		}
		else
		{
			/* Set Ethernet speed to 100M following the auto-negotiation */
			(heth->Init).Speed = ETH_SPEED_100M;
		}
#else
	/* CORRECT */
		/* Read the result of the auto-negotiation */
		if( HAL_ETH_ReadPHYRegister(heth, PHY_REGNEW_SSR, &phyreg) != HAL_OK )
		{
			/* In case of read timeout */
			err = ETH_ERROR;
 
			/* Config MAC and DMA */
			ETH_MACDMAConfig(heth, err);
 
			/* Set the ETH peripheral state to READY */
			heth->State = HAL_ETH_STATE_READY;
 
			/* Return HAL_ERROR */
			return HAL_ERROR;
		}
 
		/* Configure the MAC with the Duplex Mode fixed by the auto-negotiation process */
		if( phyreg & PHY_REGNEW_SSR_DUPL_FULL )
		{
			/* Set Ethernet duplex mode to Full-duplex following the auto-negotiation */
			(heth->Init).DuplexMode = ETH_MODE_FULLDUPLEX;
		}
		else
		{
			/* Set Ethernet duplex mode to Half-duplex following the auto-negotiation */
			(heth->Init).DuplexMode = ETH_MODE_HALFDUPLEX;
		}
		/* Configure the MAC with the speed fixed by the auto-negotiation process */
		if( phyreg & PHY_REGNEW_SSR_SPD_100M )
		{
			/* Set Ethernet speed to 100M following the auto-negotiation */
			(heth->Init).Speed = ETH_SPEED_100M;
		}
		else
		{
			/* Set Ethernet speed to 10M following the auto-negotiation */
			(heth->Init).Speed = ETH_SPEED_10M;
		}
#endif
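
To make the bit logic of the corrected branch easier to check on its own, here is a minimal host-side sketch that decodes the Speed Indication field of register 31. It assumes the encoding as I read it in the LAN8742 datasheet (bits 4:2: 001 = 10M half, 101 = 10M full, 010 = 100M half, 110 = 100M full); please verify against the datasheet for your PHY. The names `link_mode_t` and `decode_link_mode` are made up for this example and are not part of the HAL.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit masks inside PHY register 31 (names local to this sketch). */
#define SSR_SPEED_10M    (1u << 2)   /* bit 2 of the Speed Indication field */
#define SSR_SPEED_100M   (1u << 3)   /* bit 3 of the Speed Indication field */
#define SSR_FULL_DUPLEX  (1u << 4)   /* bit 4 of the Speed Indication field */

typedef struct {
    bool     valid;        /* false if the speed bits are not a known pattern */
    bool     full_duplex;
    uint32_t speed_mbps;   /* 10 or 100 when valid */
} link_mode_t;

/* Decode the raw value read from register 31 into speed and duplex. */
static link_mode_t decode_link_mode(uint32_t ssr)
{
    link_mode_t m = { false, false, 0u };
    uint32_t speed_bits = ssr & (SSR_SPEED_10M | SSR_SPEED_100M);

    if (speed_bits == SSR_SPEED_10M) {
        m.speed_mbps = 10u;
    } else if (speed_bits == SSR_SPEED_100M) {
        m.speed_mbps = 100u;
    } else {
        return m;          /* neither or both speed bits set: not decodable */
    }

    m.full_duplex = (ssr & SSR_FULL_DUPLEX) != 0u;
    m.valid = true;
    return m;
}
```

Rejecting unknown patterns instead of defaulting to 10M half duplex means a bad read shows up as an error rather than as a silent duplex mismatch.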


8 REPLIES 8
LCE
Principal II

Just added the most important source code parts ...

With "killing my application" I just mean that data transfer from SAI to ETH is stopped, because RAM is not big enough for such a time gap.
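
To put a number on that gap: the RAM needed to ride out a stall is just the data rate times the stall time. A minimal sketch, using the rates and delays mentioned in this thread (the function name `stall_buffer_bytes` is made up for this example):

```c
#include <stdint.h>

/* Bytes that accumulate while the TCP stream is stalled.
 * bitrate_bps: payload data rate in bit/s
 * stall_ms:    duration of the stall in milliseconds */
static uint64_t stall_buffer_bytes(uint64_t bitrate_bps, uint32_t stall_ms)
{
    /* bits/s * ms / (8 bits per byte * 1000 ms per s) */
    return (bitrate_bps * stall_ms) / 8000u;
}
```

At 6.4 Mbit/s a 400 ms stall already needs 320,000 bytes of buffer, and a 1.5 s stall needs 1,200,000 bytes, which is more than on-chip RAM can reasonably provide.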

When I stop SAI DMA and restart ETH TX, it works again for a few seconds.

Further observations:

  • It stops much quicker with a higher data rate
  • I checked every function with the CPU cycle counter; nothing blocks long enough to explain the delayed retransmission
  • I see that the TCP PCB's snd_buf gets full, unsent and unacked queue too
  • after the delayed retransmission, these queues are perfectly "emptied" as they should
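
For anyone who wants to repeat the cycle counter measurement, this is roughly the arithmetic involved. It is only a sketch: it assumes a free-running 32-bit counter (on a Cortex-M7 that would typically be `DWT->CYCCNT`) and a core clock that you pass in yourself; 216 MHz in the usage note below is an assumption, not something stated above. The helper names are made up.

```c
#include <stdint.h>

/* Elapsed cycles between two reads of a free-running 32-bit counter.
 * Unsigned subtraction gives the right result across one wraparound. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to microseconds for a given core clock in Hz.
 * The 64-bit intermediate avoids overflow in the multiplication. */
static uint32_t cycles_to_us(uint32_t cycles, uint32_t cpu_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000u) / cpu_hz);
}
```

Usage would be: read the counter before and after the function under test, then call `cycles_to_us(cycles_elapsed(t0, t1), 216000000u)`. A 32-bit counter at that clock wraps after roughly 20 seconds, so a single measurement must stay shorter than that.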
LCE
Principal II

So, tcp_output() somehow doesn't work on the unsent queues.

I put some error messages in tcp_output() / tcp_output_segment(), nothing.

I checked the lwIP mem stats, all below maximum.

I read back the PHY registers, 100M full duplex.

The problem right now is that I have the feeling that I have looked everywhere...

LCE
Principal II

Not everywhere:

I just started checking TX descriptor word 0 (the status word) for errors:

packets get lost / TX aborted because of late collision error.

Okay, at least I know why the packet's gone...

But that still doesn't explain this long retransmission delay.
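
A check like the one described above could look roughly like this. It is a sketch, not the code from the uploaded ethernetif.c: the bit positions (OWN = bit 31, ES = bit 15, LCO = bit 9, EC = bit 8) are what I recall from the STM32F7 reference manual and the HAL's `ETH_DMATXDESC_*` defines, so please verify them there. The `tx_stats_t` struct and the function name are invented for this example.

```c
#include <stdint.h>

/* TDES0 status bits (assumed positions, check the reference manual). */
#define TDES0_OWN  (1u << 31)  /* descriptor still owned by the DMA */
#define TDES0_ES   (1u << 15)  /* error summary */
#define TDES0_LCO  (1u << 9)   /* late collision */
#define TDES0_EC   (1u << 8)   /* excessive collisions */

typedef struct {
    uint32_t frames_ok;
    uint32_t late_collisions;
    uint32_t excessive_collisions;
    uint32_t other_errors;
} tx_stats_t;

/* Account for one completed TX descriptor.
 * Returns 0 if the DMA still owns it (status not valid yet), 1 otherwise. */
static int tx_desc_account(uint32_t tdes0, tx_stats_t *stats)
{
    if (tdes0 & TDES0_OWN) {
        return 0;
    }
    if ((tdes0 & TDES0_ES) == 0u) {
        stats->frames_ok++;
    } else if (tdes0 & TDES0_LCO) {
        stats->late_collisions++;
    } else if (tdes0 & TDES0_EC) {
        stats->excessive_collisions++;
    } else {
        stats->other_errors++;
    }
    return 1;
}
```

Calling this from the TX complete handler on the last descriptor of each frame and printing the counters over UART now and then makes it obvious whether frames are being dropped at the MAC before lwIP ever gets an ACK for them.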

LCE
Principal II

The problem with lwIP's super late retransmission remains...

LCE
Principal II

The problem with lwIP's super late retransmission still remains...

But it's not that urgent anymore, because I haven't seen a retransmission for some days now.

I have uploaded some updated source code for those who are interested in zero-copy ethernet IO for STM32F7 without OS, and mostly without HAL.

I stole it myself from everywhere and mixed it up... :D

Hi LCE,

I'm trying to use lwIP with zero-copy Tx on an STM32H7. Did you succeed in doing it? Do you have a code example?

For now, when I use zero-copy Tx, I have an issue in my implementation of low_level_output: I don't have enough DMA descriptors available.

Hi,

I reduced the time for retransmission in tcp_priv.h:

#ifndef TCP_TMR_INTERVAL
#define TCP_TMR_INTERVAL 25 /* The TCP timer interval in milliseconds. User-modified: before it was 250. */
#endif /* TCP_TMR_INTERVAL */
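
Why this helps, as far as I understand lwIP: the retransmission timeout is counted in ticks of the slow TCP timer, and tcp_priv.h defines that slow interval as twice TCP_TMR_INTERVAL. With the default of 250 ms, a retransmission timeout can therefore only fire on a 500 ms grid, which is in the same range as the delays reported at the top of this thread. The sketch below only reproduces that arithmetic; the helper names are made up and it is not lwIP code.

```c
#include <stdint.h>

/* lwIP runs its slow TCP timer every second TCP tick
 * (TCP_SLOW_INTERVAL = 2 * TCP_TMR_INTERVAL in tcp_priv.h). */
static uint32_t tcp_slow_interval_ms(uint32_t tcp_tmr_interval_ms)
{
    return 2u * tcp_tmr_interval_ms;
}

/* Wall-clock time of a retransmission timeout given in slow-timer ticks. */
static uint32_t rto_ticks_to_ms(uint32_t rto_ticks, uint32_t tcp_tmr_interval_ms)
{
    return rto_ticks * tcp_slow_interval_ms(tcp_tmr_interval_ms);
}
```

With TCP_TMR_INTERVAL at 25 ms the grid shrinks to 50 ms. Since the define is wrapped in #ifndef, it can presumably be set in lwipopts.h instead of editing tcp_priv.h, and the timer handling (for example sys_check_timeouts() in a bare-metal main loop) has to be called often enough for the shorter interval to have any effect.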

I use an STM32H725IGK6 with an ADIN1100 PHY. The code is generated by CubeMX: no RTOS, with ETH and lwIP.

Another timing improvement was to disable Nagle's algorithm (to force short messages to be sent immediately):

#define tcp_nagle_disable(pcb) tcp_set_flags(pcb, TF_NODELAY)