Nx_TCP_Echo_Client Example Hard Fault if cable is unplugged while test running (STM32H573I-DK)

RomThi
Associate III

Hi,

I am developing embedded software based on the TCP client example from CubeMX. During tests with a large number of packets I got a hard fault. It seems to happen on a re-transmit. The hard fault can easily be reproduced with the original code: set the correct server IP address and increase the packet count so that there is enough time to unplug the LAN cable while the test is running.

 

The project was set up like this:

2024_09_06_11_36_35_Start_Project_from_Example.png   

My only modification:

#define TCP_SERVER_ADDRESS IP_ADDRESS(192, 168, 0, 213)

#define MAX_PACKET_COUNT 10000

 

Here is a screenshot:

Untitled.png

It seems that a pointer gets corrupted, as the address 0xF338000 makes no sense.

 

Best regards,

Roman Thiel

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
RomThi
Associate III

Hello,

as I said already:

1) I am using your unmodified example code that contains the code you posted!

2) On a re-transmit your code goes into a hard fault!

3) A re-transmit happens (e.g. if the CRC is bad), no matter what you think. Forget the *** cable!

 

Here is the solution:

Add NX_PACKET_HEADER_PAD and NX_PACKET_HEADER_PAD_SIZE=4 to the preprocessor definitions.

 

Here is the explanation:

The NX_PACKET_STRUCT already contains optional padding at the end. See here:

typedef  struct NX_PACKET_STRUCT
{
...

#ifdef NX_PACKET_HEADER_PAD

    /* Define a pad word for 16-byte alignment, if necessary.  */
    ULONG       nx_packet_packet_pad[NX_PACKET_HEADER_PAD_SIZE];
#endif
} NX_PACKET;

With the preprocessor settings described above, this padding is added to the structure and can be overwritten before a packet is sent without damaging the data in front of it.
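As a rough illustration of what the two definitions change, the sketch below uses a hypothetical, heavily simplified stand-in for NX_PACKET. The type names DEMO_PACKET_NO_PAD and DEMO_PACKET_PADDED and the function demo_pad_growth() are made up for this example, and the real structure has many more fields. It assumes a 32-bit ULONG, as on the Cortex-M33 of the STM32H573. It only shows that a pad of 4 words appends 16 bytes to the end of the header; it does not prove why the fault disappears.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t ULONG;   /* assumption: ULONG is 32 bits wide on the target */

#define DEMO_PAD_SIZE 4   /* mirrors NX_PACKET_HEADER_PAD_SIZE=4 from the post */

/* Hypothetical, simplified packet header WITHOUT the pad. */
typedef struct DEMO_PACKET_NO_PAD_STRUCT
{
    struct DEMO_PACKET_NO_PAD_STRUCT *next;   /* link to the next packet   */
    unsigned char *prepend_ptr;               /* start of valid data       */
    unsigned char *append_ptr;                /* end of valid data         */
    ULONG          length;                    /* total packet length       */
} DEMO_PACKET_NO_PAD;

/* Same header WITH the pad, as if NX_PACKET_HEADER_PAD were defined. */
typedef struct DEMO_PACKET_PADDED_STRUCT
{
    struct DEMO_PACKET_PADDED_STRUCT *next;
    unsigned char *prepend_ptr;
    unsigned char *append_ptr;
    ULONG          length;
    ULONG          pad[DEMO_PAD_SIZE];        /* extra words at the very end */
} DEMO_PACKET_PADDED;

/* Number of bytes the pad adds to the header. */
size_t demo_pad_growth(void)
{
    return sizeof(DEMO_PACKET_PADDED) - sizeof(DEMO_PACKET_NO_PAD);
}
```

In this model the pad words are the last members of the header, so anything written just past the regular fields lands in the pad first.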

 

Here is the test:

1) Generate the TCP_Client code with CubeMX

2) Set MAX_PACKET_COUNT to 100000 in "app_netxduo.h"

3) Set your server IP in "app_netxduo.h"

4) During the transfer, unplug the cable for one second

5) Plug the cable back in -> without my solution the transfer gets stuck, but with it the client keeps sending packets

 

Regards,

Roman


10 REPLIES
STea
ST Employee

Hello @RomThi ,

The cable connection has to be checked after initialization, waiting until the Ethernet cable is connected. Please find an example below:

do
{
  /* Send request to check if the Ethernet cable is connected. */
  ret = nx_ip_interface_status_check(&NetXDuoEthIpInstance, 0, NX_IP_LINK_ENABLED, &actual_status, 10);
} while (ret != NX_SUCCESS);
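The loop above spins forever if the link never comes up. A bounded variant can be sketched as follows. To keep it runnable on a host, a hypothetical callback type (demo_link_check_fn) stands in for nx_ip_interface_status_check(), and DEMO_SUCCESS / DEMO_NOT_READY stand in for the NetX status codes. None of these names exist in NetX Duo.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int UINT;

#define DEMO_SUCCESS   0u   /* stands in for NX_SUCCESS            */
#define DEMO_NOT_READY 1u   /* hypothetical "link not up" status   */

/* Hypothetical callback standing in for the real link status check. */
typedef UINT (*demo_link_check_fn)(void *ctx);

/* Poll the link until it is up or max_attempts is exhausted.
   Returns the attempt number that succeeded, or 0 if the link never came up. */
UINT demo_wait_for_link(demo_link_check_fn check, void *ctx, UINT max_attempts)
{
    UINT attempt;

    assert(check != NULL);

    for (attempt = 1; attempt <= max_attempts; attempt++)
    {
        if (check(ctx) == DEMO_SUCCESS)
        {
            return attempt;
        }
    }
    return 0;
}

/* Host-side stub: reports "not ready" until the counter in ctx reaches zero. */
UINT demo_link_up_after(void *ctx)
{
    UINT *remaining = (UINT *)ctx;

    if (*remaining == 0)
    {
        return DEMO_SUCCESS;
    }
    (*remaining)--;
    return DEMO_NOT_READY;
}
```

On the target, the callback would wrap the real status check, and the caller would decide what to do when 0 is returned (for example, report the error instead of blocking the thread).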

Regards 

In order to give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.
RomThi
Associate III

Hello,

I put your code in the main thread entry. 

static VOID App_Main_Thread_Entry (ULONG thread_input)
{
  /* USER CODE BEGIN Nx_App_Thread_Entry 0 */
	ULONG actual_status;
  /* USER CODE END Nx_App_Thread_Entry 0 */

  UINT ret = NX_SUCCESS;

  /* USER CODE BEGIN Nx_App_Thread_Entry 1 */
  do {
	  /* Send request to check if the Ethernet cable is connected. */
	  ret = nx_ip_interface_status_check(&NetXDuoEthIpInstance, 0, NX_IP_LINK_ENABLED,&actual_status, 10);
  } while(ret != NX_SUCCESS);
  /* USER CODE END Nx_App_Thread_Entry 1 */

 

As expected, there is no change because it is not an init problem. The cable is already checked in the original code.

 

Here is the terminal output:

Nx_TCP_Echo_Client application started..
The network cable is not connected.
The network cable is connected.
STM32 IpAddress: 192.168.0.164

[192.168.0.213:20504] -> 'TCP Client on STM32H573-DK'

...


Here is the problem again:

I use the original demo code. The cable is plugged in, I press reset on the STM32H573I-DK, an IP address is assigned, and the test starts. While data is going back and forth, I pull the cable. Then the firmware ends up in the hard fault handler.

 

Regards,

Roman

 

 

 

 

STea
ST Employee

Hello @RomThi ,

I get your issue. I was not talking about checking the link only after init: it should be checked periodically, and you need to implement proper handling to recover when the link is down.

Can you try this with DHCP enabled and tell us if it is still the case?
Regards 

RomThi
Associate III

Hello,

As I already wrote, the problem is the handling of a retransmission. A retransmission also happens sometimes when the cable is not removed, so polling the connection will not fix the problem. And I think that polling the connection is much slower than the retransmit handling.

Removing the LAN cable is only a quick way to force a retransmission and get the Hard Fault error. It is not the root cause!

I'm sorry, but this problem is too complex for me. Could you please ask someone to fix this bug internally?

 

Regards,

Roman

     

 

Hello,

I have taken a closer look at the problem. I think something is wrong with the re-transmit handling.

Here is the problem:

1) A packet of type NX_PACKET is sent using nx_tcp_socket_send()

2) After a few calls, the function _nx_driver_packet_send() is called. Inside that function the packet structure is overwritten to build the message header. See here:
2024_09_24_10_51_23_TCPClient_Debugging_Microsoft_Visual_Studio.png

4) After a timeout, the pointer to the now damaged NX_PACKET structure is used for a re-transmit in the function _nx_tcp_socket_retransmit()

5) In the function _nx_ip_driver_packet_send() the pointer access fails. See here:
2024_09_24_11_03_28_TCPClient_Debugging_Microsoft_Visual_Studio.png
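One way to picture the failure described above is a toy memory model. The sketch below is purely illustrative and is not NetX Duo code: it assumes a packet laid out as a control block, an optional pad, and then the payload, all contiguous in memory, and it simulates a driver writing a header a few bytes below the start of the payload. All names (DEMO_CTRL_BYTES, demo_clobbered_bytes(), and so on) and all sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum
{
    DEMO_CTRL_BYTES    = 32,   /* hypothetical size of the control fields */
    DEMO_MAX_PAD_BYTES = 16,   /* 4 words of 4 bytes                      */
    DEMO_PAYLOAD_BYTES = 64    /* hypothetical payload area               */
};

/* Simulate a header write of `overrun` bytes that ends exactly at the
   start of the payload, with `pad_bytes` of padding between the control
   block and the payload. Returns how many control-block bytes were
   overwritten by that write. */
size_t demo_clobbered_bytes(size_t pad_bytes, size_t overrun)
{
    uint8_t mem[DEMO_CTRL_BYTES + DEMO_MAX_PAD_BYTES + DEMO_PAYLOAD_BYTES];
    size_t  payload_start;
    size_t  clobbered = 0;
    size_t  i;

    assert(pad_bytes <= DEMO_MAX_PAD_BYTES);
    payload_start = DEMO_CTRL_BYTES + pad_bytes;
    assert(overrun <= payload_start);

    memset(mem, 0xAA, sizeof mem);                         /* 0xAA = untouched    */
    memset(mem + payload_start - overrun, 0x55, overrun);  /* the header write    */

    for (i = 0; i < DEMO_CTRL_BYTES; i++)
    {
        if (mem[i] != 0xAA)
        {
            clobbered++;                                   /* control byte damaged */
        }
    }
    return clobbered;
}
```

In this model, an 8-byte write below the payload damages 8 control bytes when there is no pad, and none when a 16-byte pad sits in between. Whether this matches what actually happens inside the STM32 driver is not confirmed in this thread.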

This seems to be a major issue in the re-transmit process of NetX. I'm surprised that no one has ever noticed it. Doesn't anyone ever pull the plug during data transmission? Happy path programming is only a small part of the job.

 

Regards,

Roman

Hello @RomThi ,

If proper link management is done in NetX, you should not retransmit or use the send packet API when the link is down or the cable is unplugged, as the link status should be checked periodically.
Check the way this is implemented in the NX examples for the H5 series as a reference.

static VOID App_Link_Thread_Entry(ULONG thread_input)
{
  ULONG actual_status;
  UINT linkdown = 0, status;

  while(1)
  {
    /* Get Physical Link status. */
    status = nx_ip_interface_status_check(&NetXDuoEthIpInstance, 0, NX_IP_LINK_ENABLED,
                                      &actual_status, 10);

    if(status == NX_SUCCESS)
    {
      if(linkdown == 1)
      {
        linkdown = 0;
        status = nx_ip_interface_status_check(&NetXDuoEthIpInstance, 0, NX_IP_ADDRESS_RESOLVED,
                                      &actual_status, 10);
        if(status == NX_SUCCESS)
        {
          /* The network cable is connected again. */
          printf("The network cable is connected again.\n");
          /* Print UDP Echo Server is available again. */
          printf("UDP Echo Server is available again.\n");
        }
        else
        {
          /* The network cable is connected. */
          printf("The network cable is connected.\n");
          /* Send command to Enable Nx driver. */
          nx_ip_driver_direct_command(&NetXDuoEthIpInstance, NX_LINK_ENABLE,
                                      &actual_status);
          /* Restart DHCP Client. */
          nx_dhcp_stop(&DHCPClient);
          nx_dhcp_start(&DHCPClient);
        }
      }
    }
    else
    {
      if(0 == linkdown)
      {
        linkdown = 1;
        /* The network cable is not connected. */
        printf("The network cable is not connected.\n");
      }
    }

    tx_thread_sleep(NX_ETH_CABLE_CONNECTION_CHECK_PERIOD);
  }
}

The period for checking network cable reconnection can be changed via the NX_ETH_CABLE_CONNECTION_CHECK_PERIOD definition.
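The core of the thread function above is edge detection on the link state: it reacts only when the link changes from up to down or back, not on every poll. Stripped of the NetX and DHCP calls, that logic can be sketched as a small host-testable function. The names demo_link_state, demo_link_event and demo_link_poll() are hypothetical and not part of the ST example.

```c
#include <assert.h>
#include <stddef.h>

/* Events reported by one poll of the link. */
typedef enum
{
    DEMO_EVT_NONE,           /* no change since the last poll  */
    DEMO_EVT_LINK_LOST,      /* link went from up to down      */
    DEMO_EVT_LINK_RESTORED   /* link went from down to up      */
} demo_link_event;

/* Remembered state between polls (plays the role of `linkdown`). */
typedef struct
{
    int link_down;
} demo_link_state;

/* Feed the result of one link check (non-zero = link is up) and get
   back the event that the caller should handle, if any. */
demo_link_event demo_link_poll(demo_link_state *st, int link_ok)
{
    assert(st != NULL);

    if (link_ok)
    {
        if (st->link_down)
        {
            st->link_down = 0;
            return DEMO_EVT_LINK_RESTORED;   /* e.g. re-enable driver, restart DHCP */
        }
    }
    else if (!st->link_down)
    {
        st->link_down = 1;
        return DEMO_EVT_LINK_LOST;           /* e.g. print "cable is not connected" */
    }
    return DEMO_EVT_NONE;
}
```

Separating the state transition from the side effects makes it easy to check on a PC that each unplug or replug produces exactly one event, regardless of the polling period.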
Regards  


Hello @RomThi ,

I totally get what you meant now. I understand that this issue could also occur when handling CRC errors, as you mentioned.

I raised your issue internally (ticket number 191936, for internal reference only) for further investigation.
Thank you for pointing this out.
Regards


Hello,

Is there any progress in the investigation? I am experiencing exactly the same problem as @RomThi. I am actually running tests on my complete application, but the Ethernet code is based on the "Nx_TCP_Echo_Client" example. Whether DHCP is on or off, in client or server mode, the attempt to re-transmit the packet ends in a hard fault, as @RomThi described.

I tried @RomThi's solution. Unfortunately, in my case the Ethernet code becomes inoperable, even at ping level (despite successful initialization). I'm still working on it, and I'm counting on ST support too.

Best regards

Adam