Ethernet not working on STM32H7x3

ST Community
ST Employee

The Ethernet peripheral on STM32H7x3 does not send or receive data correctly, or the IP stack cannot establish a connection to other devices.

What could be the problem?

In most cases, the problem is related to memory layout and Memory Protection Unit (MPU) configuration.


Solution

It is strongly recommended to follow the same configuration as in the examples provided in the STM32CubeH7 package:

e.g. Projects\STM32H743ZI-Nucleo\Applications\LwIP\LwIP_HTTP_Server_Netconn_RTOS.

 

The following conditions should be met:

  • All data passed to the Ethernet peripheral (ETH) must NOT be stored in DTCM RAM (starting at 0x20000000), because the ETH DMA cannot access DTCM RAM.
  • Projects generated with STM32CubeMX can place variables in DTCM RAM by default. The solution is to place them in D1 SRAM (starting at 0x24000000).
  • TX and RX DMA descriptors (defined in the ethernetif.c file) are recommended to be in D2 SRAM and to be configured by the MPU as Device or Strongly-ordered memory type.
    • CubeMX should set this by default for the IAR and Keil IDEs. For GCC, you need to modify the linker script (please see the examples).
    • The MPU configuration can be found in the main.c file in the example.
  • The RX buffer (defined in the ethernetif.c file) is recommended to be located in D2 SRAM and must not be configured as Device memory type (because IP stacks like LwIP can access the received data unaligned).
    • This is also set by default by CubeMX for the Keil and IAR IDEs. Again, for GCC, you need to modify the linker script (please see the examples).
    • It should not be placed in the MPU region of the TX and RX DMA descriptors.
    • It should be configured as non-cacheable or write-through memory.
    • This is required for HAL library version 1.2.0 and might not be implemented in the library examples.
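
The first condition above can be checked in code. The following is a minimal sketch, not part of the ST examples: the helper name `eth_dma_can_reach` is hypothetical, and the 128 KB DTCM size is an assumption that should be verified against the reference manual of the exact device.

```c
#include <stdbool.h>
#include <stdint.h>

/* DTCM base address as stated in the article. The size is an assumption
 * (128 KB) and must be checked for the device actually used. */
#define DTCM_BASE 0x20000000u
#define DTCM_SIZE (128u * 1024u)

/* Hypothetical helper: returns true when no byte of the buffer
 * [addr, addr + len) lies in DTCM RAM, which the ETH DMA cannot access. */
bool eth_dma_can_reach(uint32_t addr, uint32_t len)
{
    uint64_t start     = addr;
    uint64_t end       = start + len;               /* one past the last byte */
    uint64_t dtcm_end  = (uint64_t)DTCM_BASE + DTCM_SIZE;
    bool overlaps_dtcm = (start < dtcm_end) && (end > DTCM_BASE);

    return !overlaps_dtcm;
}
```

A driver could call such a helper in a debug assertion before handing a buffer to the ETH DMA, so that a variable silently placed in DTCM RAM by the linker is caught early.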

An example MPU configuration can look like this (please see the attached file for the MPU configuration code):

  • 256 bytes at 0x30040000 configured as Shared Device, MPU region 2 (required for overlapping)
    • This is for RX and TX DMA descriptors
  • 16 KB at 0x30044000 configured as write-through, MPU region 1
    • This is for TX buffers allocated by LwIP
  • 16 KB at 0x30040000 configured as non-cacheable, MPU region 0 (required for overlapping)
    • This is for RX buffer used by the Ethernet driver
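
The region numbers matter because on the Cortex-M7 MPU, when regions overlap, the attributes of the highest-numbered matching region apply. The sketch below models this rule on a host for the three regions listed above; it does not program the MPU, and the type and function names (`mem_attr_t`, `mpu_effective_attr`) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    MEM_DEFAULT,        /* no region matches: default memory map applies */
    MEM_NON_CACHEABLE,
    MEM_WRITE_THROUGH,
    MEM_SHARED_DEVICE
} mem_attr_t;

typedef struct {
    uint32_t   base;
    uint32_t   size;    /* in bytes */
    mem_attr_t attr;
} mpu_region_t;

/* Array index == MPU region number, values taken from the list above. */
static const mpu_region_t example_regions[] = {
    { 0x30040000u, 16u * 1024u, MEM_NON_CACHEABLE },  /* region 0: RX buffer   */
    { 0x30044000u, 16u * 1024u, MEM_WRITE_THROUGH },  /* region 1: LwIP TX     */
    { 0x30040000u, 256u,        MEM_SHARED_DEVICE },  /* region 2: descriptors */
};

/* Walk from the highest region number down, so that the first match
 * is the region that takes priority. */
mem_attr_t mpu_effective_attr(uint32_t addr)
{
    size_t n = sizeof example_regions / sizeof example_regions[0];

    while (n-- > 0) {
        const mpu_region_t *r = &example_regions[n];
        if (addr >= r->base && (addr - r->base) < r->size) {
            return r->attr;
        }
    }
    return MEM_DEFAULT;
}
```

With this layout, the first 256 bytes at 0x30040000 resolve to Shared Device even though region 0 also covers them, which is why the descriptors need the higher region number.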

This is how it is implemented in STM32CubeH7 examples, and it works with the current implementation of the library (v1.2.0). Other configurations (e.g. MPU) might also work, but might also fail with specific compiler options (especially with optimization flags).


Explanation

The data are managed by the dedicated DMA in the Ethernet peripheral. This means that:

  • Receive and transmit buffers must be accessible by DMA
  • All data (including DMA descriptors) must be effectively written to the SRAM before triggering/enabling the DMA

The Cortex-M7 can perform accesses to Normal memory out of order. So for a sequence of a write to SRAM (Normal type by default) followed by a write to an Ethernet register (Device type by default), the CPU can swap the two operations. This actually happens in some cases.

So the TX and RX DMA descriptors should be configured to Device or Strongly-ordered type, since:

  • accesses to Device type must be performed in-order compared to other Device type accesses
  • accesses to Strongly-ordered type must be executed in-order to all other types (Strongly-ordered access acts as memory barrier).
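
The ordering requirement can be sketched as "write the descriptor, then a barrier, then the register write that starts the DMA". The code below is a host-compilable illustration only: the descriptor layout, `DESC_OWN`, `dma_barrier` and `tx_submit` are hypothetical names, and a C11 fence stands in for the CMSIS `__DSB()` intrinsic that would be used on the target.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DESC_OWN (1u << 31)   /* assumed "owned by DMA" flag in word 3 */

/* Simplified four-word descriptor, for illustration only. */
typedef struct {
    volatile uint32_t desc0;
    volatile uint32_t desc1;
    volatile uint32_t desc2;
    volatile uint32_t desc3;
} dma_desc_t;

/* Stand-in barrier so the sketch builds on a host.
 * On the Cortex-M7 target this would be __DSB(). */
void dma_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Fill the descriptor, then make sure the writes are complete
 * before the tail-pointer write that triggers the DMA. */
void tx_submit(dma_desc_t *d, uint32_t buf_addr, uint32_t len,
               volatile uint32_t *tail_reg, uint32_t next_desc_addr)
{
    d->desc0 = buf_addr;
    d->desc2 = len;
    d->desc3 = DESC_OWN | len;   /* hand the descriptor to the DMA */

    dma_barrier();               /* descriptor writes before the trigger */

    *tail_reg = next_desc_addr;  /* start / poll the DMA */
}
```

Placing the descriptors in Device or Strongly-ordered memory, as recommended above, achieves the ordering through the memory type instead of an explicit barrier.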

The RX buffer should be non-cacheable or write-through, because in some cases the RX buffer can be reused as a TX buffer inside the LwIP stack, e.g. when pinging the device. So sometimes the valid data would be in the cache (TX) and sometimes in SRAM (RX). The write-through configuration can solve this issue, but requires cache invalidation during data reception.
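
Cache invalidation on the Cortex-M7 works on whole 32-byte cache lines, so the address range of a received frame has to be widened to line boundaries first. The helper below is a minimal sketch of that arithmetic; the names `cache_range_t` and `cache_align_range` are hypothetical. On the target, the resulting range would be passed to the CMSIS function `SCB_InvalidateDCache_by_Addr`.

```c
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u   /* Cortex-M7 data cache line size in bytes */

typedef struct {
    uint32_t addr;   /* start address, aligned down to a cache line   */
    uint32_t size;   /* length in bytes, a multiple of the line size  */
} cache_range_t;

/* Widen [addr, addr + len) so that it starts and ends on
 * cache-line boundaries. */
cache_range_t cache_align_range(uint32_t addr, uint32_t len)
{
    uint32_t mask  = DCACHE_LINE_SIZE - 1u;
    uint32_t start = addr & ~mask;
    uint32_t end   = (addr + len + mask) & ~mask;
    cache_range_t r = { start, end - start };

    return r;
}
```

Because invalidation discards whole lines, RX buffers are best aligned to and sized in multiples of 32 bytes, so that widening the range never touches unrelated data sharing a line.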
 

Comments
jacksong
Associate II

Hi, Kersten HEINS. Thanks for your document about this issue, which saved me a lot of time developing the Ethernet part.

In my project, I configured the Ethernet as you suggested, but I did not enable the cache (I also commented out all the cache operations in the Ethernet API). I found that some git branches of my project worked well, while on other branches the Ethernet did not, e.g. many ping packets were lost and ping replies were incorrect. Looking into this issue, I found that each incorrect ping packet always contained several earlier ping replies. After debugging the Ethernet driver and checking the corresponding registers, I concluded that, in my case, a memory barrier may be needed in the Ethernet transmit function. I patched this function in my project as follows:

HAL_StatusTypeDef HAL_ETH_Transmit_patched(ETH_HandleTypeDef *heth, ETH_TxPacketConfig *pTxConfig, uint32_t Timeout)
{
  uint32_t tickstart;
  const ETH_DMADescTypeDef *dmatxdesc;
 
  if(pTxConfig == NULL)
  {
    heth->ErrorCode |= HAL_ETH_ERROR_PARAM;
    return HAL_ERROR;
  }
 
  if(heth->gState == HAL_ETH_STATE_READY)
  {
    /* Config DMA Tx descriptor by Tx Packet info */
    if (ETH_Prepare_Tx_Descriptors(heth, pTxConfig, 0) != HAL_ETH_ERROR_NONE)
    {
      /* Set the ETH error code */
      heth->ErrorCode |= HAL_ETH_ERROR_BUSY;
      return HAL_ERROR;
    }
 
    /* All ring register updates must be executed in strong order */
    __DMB();
    __DSB();
 
    dmatxdesc = (ETH_DMADescTypeDef *)(&heth->TxDescList)->TxDesc[heth->TxDescList.CurTxDesc];
 
    /* Incr current tx desc index */
    INCR_TX_DESC_INDEX(heth->TxDescList.CurTxDesc, 1);
 
    /* Start transmission */
    /* issue a poll command to Tx DMA by writing address of next immediate free descriptor */
    WRITE_REG(heth->Instance->DMACTDTPR, (uint32_t)(heth->TxDescList.TxDesc[heth->TxDescList.CurTxDesc]));
 
    tickstart = HAL_GetTick();
 
    /* Wait for data to be transmitted or timeout occurred */
    while((dmatxdesc->DESC3 & ETH_DMATXNDESCWBF_OWN) != (uint32_t)RESET)
    {
      if((heth->Instance->DMACSR & ETH_DMACSR_FBE) != (uint32_t)RESET)
      {
        heth->ErrorCode |= HAL_ETH_ERROR_DMA;
        heth->DMAErrorCode = heth->Instance->DMACSR;
        /* Set ETH HAL State to Error */
        heth->gState = HAL_ETH_STATE_ERROR;
        /* Return function status */
        return HAL_ERROR;
      }
 
      /* Check for the Timeout */
      if(Timeout != HAL_MAX_DELAY)
      {
        if(((HAL_GetTick() - tickstart ) > Timeout) || (Timeout == 0U))
        {
          heth->ErrorCode |= HAL_ETH_ERROR_TIMEOUT;
          heth->gState = HAL_ETH_STATE_READY;
          return HAL_ERROR;
        }
      }
    }
 
    /* Set ETH HAL State to Ready */
    heth->gState = HAL_ETH_STATE_READY;
 
    /* Return function status */
    return HAL_OK;
  }
  else
  {
    return HAL_ERROR;
  }
}

I tested my project on the STM32H753 Eval board using STM32Cube_v_1.3, and every branch of my project can ping well now. I just want to share my solution for this special case. I hope it can help others who may have the same issue.

FLACO.2
Visitor II

Hello,

[Edit - start]

It was lacking some details, so here they are:

[Edit - end]

I would like a MEM_SIZE for LwIP of about twice the one used here (let's say 32 KB).

But increasing the size does not work and ends in a hard fault, because 16 KB is actually the "end" of D2 SRAM.

There seems to be no more space left.

What would be the correct configuration to get more space, then?

Thank you,

Fxois

SKana.3
Associate

HAL_ETH_IRQHandler calls HAL_ETH_TxCpltCallback when the last descriptor is transmitted (only the last descriptor has IOC set).

ETI is described as Early Transmit Interrupt. Why don't we use it?

When all descriptors have been transferred to the I/O buffer, they can be returned immediately, and the next transfer operation can prepare descriptors while the current packet is physically being sent on the line.

  /* Packet transmitted */
  if (__HAL_ETH_DMA_GET_IT(heth, ETH_DMACSR_TI))
  {
    if (__HAL_ETH_DMA_GET_IT_SOURCE(heth, ETH_DMACIER_TIE))
    {
      /* Clear the Eth DMA Tx IT pending bits */
      __HAL_ETH_DMA_CLEAR_IT(heth, ETH_DMACSR_TI | ETH_DMACSR_NIS);

#if (USE_HAL_ETH_REGISTER_CALLBACKS == 1)
      /* Call registered Transmit complete callback */
      heth->TxCpltCallback(heth);
#else
      /* Transfer complete callback */
      HAL_ETH_TxCpltCallback(heth);
#endif /* USE_HAL_ETH_REGISTER_CALLBACKS */
    }
  }

Of course, AIS also needs to be cleared if no bits other than ETI are set.

Why is ETI in the AIS (Error) group, but ERI in NIS (Normal)?

I read RM0468 but didn't find a full explanation.

Why are we waiting for the end of the physical transfer of the current packet?

lech_s
Visitor II

 

Hello,

Could you please update the links? The following don't seem to work:
    please see attached file for MPU configuration code
    lwip_eth_mpu_configuration.txt (1)

Thanks in advance.

 




 



 

Laurids_PETERSEN
Community manager

Hi @lech_s 

The file has now been updated and can be found in the attachment section of this article. 
Best regards,
Laurids

Piranha
Chief II

RX buffer ... Should be configured as non-cacheable memory, or write-through

The RX buffer should be non-cacheable or write-through, ... The write-through configuration can solve this issue, but requires cache invalidation during the data reception.

If there is no D-cache invalidation, the Rx buffer memory cannot be configured as write-through. But if there is D-cache invalidation, then there is no need to waste performance, and the Rx buffer memory should be configured as write-back, which is also the default for SRAM. I will also remind you that ST's code still doesn't implement correct D-cache maintenance.

This is required for HAL library version 1.2.0 and might not be implemented in the library examples

So you just say that ST ships broken examples and that makes it fine?

And why is the attachment file named ".txt" not like a normal ".c" file so that the code is formatted accordingly?

cicek
Associate

I applied the suggested solution, but I am still getting the same error.

STM32H730VB, HAL Library v1.11.2, LWIP v2.1.2_Cube

  • I used D1_SRAM (0x24000000) for the stack. I don't use DTCMRAM.
  • I split D2_SRAM (0x30000000) between ETH and other DMA operations: 1 KB (starting at 0x30000000) for the ETH DMA descriptors and 16 KB (starting at 0x30004000) for other peripherals' DMA buffers. The Rx pool memory does not fit in 16 KB, so I decreased its size.

 

linker script:

MEMORY
{
  DTCMRAM    (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
  RAM        (xrw) : ORIGIN = 0x24000000, LENGTH = 128K
  RAM_D2     (xrw) : ORIGIN = 0x30000000, LENGTH = 32K
  ETH_DMA    (xrw) : ORIGIN = 0x30000000, LENGTH = 1K
  DMA_BUFFER (xrw) : ORIGIN = 0x30004000, LENGTH = 16K
  RAM_D3     (xrw) : ORIGIN = 0x38000000, LENGTH = 16K
  ITCMRAM    (xrw) : ORIGIN = 0x00000000, LENGTH = 64K
  FLASH      (rx)  : ORIGIN = 0x8000000, LENGTH = 128K
}

SECTIONS
{
  .RxDecripSection :
  {
    *(.RxDecripSection)
  } >ETH_DMA

  .TxDecripSection :
  {
    *(.TxDecripSection)
  } >ETH_DMA

  .Rx_PoolSection :
  {
    *(.Rx_PoolSection)
  } >DMA_BUFFER

  .dma_buffer : /* Space before ':' is critical */
  {
    *(.dma_buffer)
  } >DMA_BUFFER
}

 

MPU Configuration

void MPU_Config(void)
{
  MPU_Region_InitTypeDef MPU_InitStruct;

  HAL_MPU_Disable();

  MPU_InitStruct.Enable = MPU_REGION_ENABLE;
  MPU_InitStruct.BaseAddress = 0x30000000;
  MPU_InitStruct.Size = MPU_REGION_SIZE_1KB;
  MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
  MPU_InitStruct.IsBufferable = MPU_ACCESS_BUFFERABLE;
  MPU_InitStruct.IsCacheable = MPU_ACCESS_NOT_CACHEABLE;
  MPU_InitStruct.IsShareable = MPU_ACCESS_SHAREABLE;
  MPU_InitStruct.Number = MPU_REGION_NUMBER2;
  MPU_InitStruct.TypeExtField = MPU_TEX_LEVEL0;
  MPU_InitStruct.SubRegionDisable = 0x00;
  MPU_InitStruct.DisableExec = MPU_INSTRUCTION_ACCESS_ENABLE;

  HAL_MPU_ConfigRegion(&MPU_InitStruct);

  MPU_InitStruct.Enable = MPU_REGION_ENABLE;
  MPU_InitStruct.BaseAddress = 0x30004000;
  MPU_InitStruct.Size = MPU_REGION_SIZE_16KB;
  MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
  MPU_InitStruct.IsBufferable = MPU_ACCESS_NOT_BUFFERABLE;
  MPU_InitStruct.IsCacheable = MPU_ACCESS_CACHEABLE;
  MPU_InitStruct.IsShareable = MPU_ACCESS_NOT_SHAREABLE;
  MPU_InitStruct.Number = MPU_REGION_NUMBER1;
  MPU_InitStruct.TypeExtField = MPU_TEX_LEVEL0;
  MPU_InitStruct.SubRegionDisable = 0x00;
  MPU_InitStruct.DisableExec = MPU_INSTRUCTION_ACCESS_ENABLE;

  HAL_MPU_ConfigRegion(&MPU_InitStruct);

  MPU_InitStruct.Enable = MPU_REGION_ENABLE;
  MPU_InitStruct.BaseAddress = 0x30000000;
  MPU_InitStruct.Size = MPU_REGION_SIZE_32KB;
  MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
  MPU_InitStruct.IsBufferable = MPU_ACCESS_NOT_BUFFERABLE;
  MPU_InitStruct.IsCacheable = MPU_ACCESS_NOT_CACHEABLE;
  MPU_InitStruct.IsShareable = MPU_ACCESS_NOT_SHAREABLE;
  MPU_InitStruct.Number = MPU_REGION_NUMBER0;
  MPU_InitStruct.TypeExtField = MPU_TEX_LEVEL1;
  MPU_InitStruct.SubRegionDisable = 0x00;
  MPU_InitStruct.DisableExec = MPU_INSTRUCTION_ACCESS_ENABLE;

  HAL_MPU_ConfigRegion(&MPU_InitStruct);

  HAL_MPU_Enable(MPU_PRIVILEGED_DEFAULT);
}

 

Version history
Last update:
‎2024-06-04 04:49 AM
Updated by: