STM32F407VET6 Custom Board Ethernet not working

rromano001 · ‎2021-07-19

Hello. I have an F407VET (TQFP100) on a custom board with a LAN8742A Ethernet PHY, using STM32CubeIDE 1.6.1 on Linux (Mint).

The PHY and the MCU are both clocked by a 25 MHz oscillator, and the 50 MHz reference clock is OK. The CPU runs at 168 MHz.

FreeRTOS runs 3 tasks:

Default task: initializes LwIP, starts the HTTPD thread, then feeds the TCP/IP echo server

Task 1: feeds the UDP echo server

Task 2: manages the RGB LED

Task 2 is OK, but no signal comes from Ethernet, so the LED remains in its idle color fading.

The default task initializes LwIP, starts the HTTPD task, releases the init flag and enters its loop.

Task 1 (Ethernet echo) enters its loop after the flag is released.

I checked the MDIO bus activity with a Saleae Logic Pro 8; it shows that the PHY doesn't complete initialization:

Involved register in communication:

control register address 0

status register address 1

interrupt source flag register address 0x1d

special control/status register address 0x1f

Decoded status from the register reads:

Register 1 (status): two states were found:

0x7809 decodes to -> all 10/100TX modes available, auto-negotiation not complete, link down.

0x782d decodes to -> all 10/100TX modes available, auto-negotiation complete, link is UP.
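For anyone who wants to check the two status values by hand, here is a minimal sketch of the decode. It assumes only the standard IEEE 802.3 clause 22 bit layout of the basic status register (register 1); the macro and function names are mine, not taken from the HAL or the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Basic Status Register (register 1) bits, IEEE 802.3 clause 22 layout. */
#define BSR_100TX_FD      (1u << 14)
#define BSR_100TX_HD      (1u << 13)
#define BSR_10T_FD        (1u << 12)
#define BSR_10T_HD        (1u << 11)
#define BSR_AUTONEG_DONE  (1u << 5)
#define BSR_LINK_UP       (1u << 2)

#define BSR_ALL_MODES (BSR_100TX_FD | BSR_100TX_HD | BSR_10T_FD | BSR_10T_HD)

/* Hypothetical helper type, for illustration only. */
typedef struct {
    bool all_modes;     /* PHY reports all four 10/100 abilities */
    bool autoneg_done;  /* auto-negotiation process completed    */
    bool link_up;       /* link status bit                       */
} bsr_info_t;

static bsr_info_t bsr_decode(uint16_t bsr)
{
    bsr_info_t info;
    info.all_modes    = (bsr & BSR_ALL_MODES) == BSR_ALL_MODES;
    info.autoneg_done = (bsr & BSR_AUTONEG_DONE) != 0u;
    info.link_up      = (bsr & BSR_LINK_UP) != 0u;
    return info;
}
```

With the captured values, `bsr_decode(0x7809)` gives negotiation not done and link down, while `bsr_decode(0x782d)` gives both set, which matches the decode above.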

Register 0x10: 0x41 decodes to -> 0x40 is a reserved flag, 0x01 is crossover time extension enabled.

Register 0x1d: 0xca decodes to -> ENERGYON, auto-negotiation complete, auto-negotiation LP acknowledge, auto-negotiation page received.

Bit 0 is reserved, so why does the code write 1 to it when it defaults to 0???

Simplified MDIO communication

write register 0 phyadd 0 0x8000 (Software reset)

repeatedly read register 0x1f Phyadd 0x1f (??) result 0xffff (Open bus)

read register 1 Phyadd 0 result for a while 0x7809

after many polls the status changes to 0x782d

...

write register 0 phyadd 0 0x1000 (Auto Negotiation)

read register 1 Phyadd 0 result 0x782d

read register 0x10 Phyadd 0 result 0x0041

read register 0x1d Phyadd 0 result 0x00ca

write register 0x1d Phyadd 0 value 0x00cb

read register 0x1d Phyadd 0 result 0x00ca

read register 1 Phyadd 0 result 0x782d

no more activity appears on the MDIO bus
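To cross-check what the logic analyzer decodes, the transactions listed above can be rebuilt bit by bit. This is only a sketch assuming the standard IEEE 802.3 clause 22 frame layout (after the 32-bit preamble: ST, OP, PHYAD, REGAD, TA, DATA); the function names are hypothetical.

```c
#include <stdint.h>

/* Clause 22 management frame fields (preamble not included):
 * ST(2) OP(2) PHYAD(5) REGAD(5) TA(2) DATA(16) */
#define MDIO_ST        0x1u  /* start of frame, binary 01            */
#define MDIO_OP_WRITE  0x1u  /* binary 01                            */
#define MDIO_OP_READ   0x2u  /* binary 10                            */
#define MDIO_TA_WRITE  0x2u  /* turnaround driven on writes, binary 10 */

/* Full 32-bit write frame as it appears on the wire, MSB first. */
static uint32_t mdio_write_frame(uint8_t phy, uint8_t reg, uint16_t data)
{
    return (MDIO_ST << 30) | (MDIO_OP_WRITE << 28) |
           ((uint32_t)(phy & 0x1Fu) << 23) |
           ((uint32_t)(reg & 0x1Fu) << 18) |
           (MDIO_TA_WRITE << 16) | data;
}

/* First 14 bits of a read frame (ST, OP, PHYAD, REGAD); the turnaround
 * and the 16 data bits that follow are driven by the PHY, not the MAC. */
static uint16_t mdio_read_header(uint8_t phy, uint8_t reg)
{
    return (uint16_t)((MDIO_ST << 12) | (MDIO_OP_READ << 10) |
                      ((uint32_t)(phy & 0x1Fu) << 5) | (reg & 0x1Fu));
}
```

For example, the software reset in the log is `mdio_write_frame(0, 0, 0x8000)`, and the odd read at PHY address 0x1f, register 0x1f is `mdio_read_header(0x1f, 0x1f)`. If no PHY answers at that address the data bits stay pulled up, which is consistent with the 0xffff seen on the bus.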

No visible activity on the PHY RX/TX lines.

Link LED is solid on.

Activity LED pulses on ping packets.

The PCB is 4-layer and needs some fixes because it was planned for the F429. The component shortage forced a change to the F407; SPI4 is not present on it, which requires a fix, but that has nothing to do with the LAN interface.

Does anyone have an idea of what can be wrong? I read Piranha's post, but it leaves us at ground level: it says what is wrong (I feel it could be, too) but gives no idea where the (VERY) sparse files of the Ethernet module are located.

Previous tests were done on a Nucleo board with an Adafruit LCD. I extended them to FreeRTOS and LAN, but none of the old projects compile on the new IDE. My cat, the router assistant, has declared hands off too ;)

Thanks in advance

rromano001 · ‎2021-07-19

Update:

I connected all the RX/TX wires to get more stable probing.

RX gate and data are present; TX is missing. I tested the TX pins by redeclaring them as outputs and driving them in a sequence. The pins are OK.

Work in progress; still no idea where the trouble comes from.

Edit1: Found a hint on Tilen Majerle's site about pin A8; I tried using it as MCO but nothing changed.

Edit2: The PHY configuration is OK; it can also work untouched, with the initial resistor-strap configuration. The issue comes from the MAC part: no transmit. Are the ping/ARP frames received? Inspection still in progress.

Edit 3: I went back to the Nucleo F429ZI, generated a simple Ethernet project and compiled it. The result negotiates DHCP and responds to ping. I loaded HTTPD and the UDP and TCP echo servers and everything works, including access to the HTTP page and status.

rromano001 · ‎2021-07-20

Added a trigger on TXEN.

Nothing happens for a long time.

I pressed and released reset: one pulse is transmitted just after release. At that time the PHY is not ready...

I read the errata about network issues on the STM32F4x7: nothing applicable and no mention of an unresponsive network..

Actual setup

Edit1: I traced both the F407 and the F429 code; the Ethernet register access was not found. Does someone know where the MAC registers get accessed from?

Thank you

rromano001 · ‎2021-07-21

Warning: this post may change as soon as I find the issue and the flow.

The init function uses HAL. I don't like HAL because I have found it broken and unusable.. It reminds me of Arduino: "we can move Earth to Mars without power", versus the professional approach: we handle the power and leave the planet where it is.

Call tree, as far as I have found:

1 Main.c

user threads are initialized, LwIP untouched

Init is launched from the default task

/* init code for LWIP */

MX_LWIP_Init();

2 LWIP.C

sets the IPv4 data, then calls tcpip_init

tcpip_init( NULL, NULL );

3 tcpip.c

void

tcpip_init(tcpip_init_done_fn initfunc, void *arg)

{

lwip_init();

4 init.c

x ethernetif.c

static void low_level_init(struct netif *netif)

{

uint32_t regvalue = 0;

reads and sets PHY registers

x1 stm32f4xx_hal_eth.c

void HAL_ETH_MspInit(ETH_HandleTypeDef* ethHandle)

{

communicates with the PHY and sets up some of the Ethernet peripheral.

Control is spread across 3 tasks

stm32f4xx_hal_eth

On lines 415-479, PHY register 16 is interpreted as the auto-negotiation result. Register 16 (0x10) is listed in the datasheet as the EDPD NLP/CROSSOVER TIME REGISTER,

which has nothing to do with the negotiation with the remote party.

The register carrying this information is 31 (0x1f), the PHY SPECIAL CONTROL/STATUS REGISTER,

but its format differs from what the code expects. Anyway:

bit 4 is duplex, so PHY_DUPLEX_STATUS must be 0x10, not 4

1 Full

0 Half

Bits 3,2: PHY_SPEED_STATUS is assigned value ~~2 odd but ok~~; the value in bits 3,2 must be 4 or 8, so it is better to test for the right value.

01 10Base-T

10 100Base-TX
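The bit layout listed above can be turned into a small decoder. This is a sketch under the assumption that register 0x1f of the LAN8742A carries the speed in bits 3:2 and the duplex in bit 4, as read from the datasheet; the names are mine and are not the HAL's PHY_SR / PHY_SPEED_STATUS / PHY_DUPLEX_STATUS macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* LAN8742A PHY Special Control/Status Register (register 31, 0x1F). */
#define PSCSR_ADDR         0x1Fu
#define PSCSR_SPEED_MASK   0x000Cu  /* bits 3:2                 */
#define PSCSR_SPEED_10     0x0004u  /* 01 -> 10Base-T           */
#define PSCSR_SPEED_100    0x0008u  /* 10 -> 100Base-TX         */
#define PSCSR_FULL_DUPLEX  0x0010u  /* bit 4: 1 = full, 0 = half */

typedef enum {
    LINK_SPEED_UNKNOWN = 0,   /* negotiation not finished or invalid code */
    LINK_SPEED_10M     = 10,
    LINK_SPEED_100M    = 100
} link_speed_t;

/* Test both valid codes explicitly instead of treating "not 10M" as 100M. */
static link_speed_t pscsr_speed(uint16_t pscsr)
{
    switch (pscsr & PSCSR_SPEED_MASK) {
    case PSCSR_SPEED_10:  return LINK_SPEED_10M;
    case PSCSR_SPEED_100: return LINK_SPEED_100M;
    default:              return LINK_SPEED_UNKNOWN;
    }
}

static bool pscsr_full_duplex(uint16_t pscsr)
{
    return (pscsr & PSCSR_FULL_DUPLEX) != 0u;
}
```

The explicit `LINK_SPEED_UNKNOWN` case is the design point: with the wrong register or mask, the original if/else silently falls into one branch, while this version lets the caller report an error.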

The link task also carries a HORRIFIC goto... Is this professional or not?

/* Read the result of the auto-negotiation */
    if((HAL_ETH_ReadPHYRegister(heth, PHY_SR, &phyreg)) != HAL_OK) // <<--- PHY_SR 0x10 is not the status register, it must be 0x1f
    {
      /* In case of write timeout */
      err = ETH_ERROR;
      
      /* Config MAC and DMA */
      ETH_MACDMAConfig(heth, err);
      
      /* Set the ETH peripheral state to READY */
      heth->State = HAL_ETH_STATE_READY;
      
      /* Return HAL_ERROR */
      return HAL_ERROR;   
    }
    
    /* Configure the MAC with the Duplex Mode fixed by the auto-negotiation process */
    if((phyreg & PHY_DUPLEX_STATUS) != (uint32_t)RESET) // <<-- the mask must be 0x10, not 4. What does RESET mean???
    {
      /* Set Ethernet duplex mode to Full-duplex following the auto-negotiation */
      (heth->Init).DuplexMode = ETH_MODE_FULLDUPLEX;  
    }
    else
    {
      /* Set Ethernet duplex mode to Half-duplex following the auto-negotiation */
      (heth->Init).DuplexMode = ETH_MODE_HALFDUPLEX;           
    }
    /* Configure the MAC with the speed fixed by the auto-negotiation process */
    if((phyreg & PHY_SPEED_STATUS) == PHY_SPEED_STATUS) // <<-- this is worse: the mask must be 0x0c and the tested value 4 (10Base-T) or 8 (100Base-TX)
    {  
      /* Set Ethernet speed to 10M following the auto-negotiation */
      (heth->Init).Speed = ETH_SPEED_10M; 
    }
    else
    {   
      /* Set Ethernet speed to 100M following the auto-negotiation */ 
      (heth->Init).Speed = ETH_SPEED_100M;
    }
  }
  else /* AutoNegotiation Disable */
  {
    /* Check parameters */
    assert_param(IS_ETH_SPEED(heth->Init.Speed));
    assert_param(IS_ETH_DUPLEX_MODE(heth->Init.DuplexMode));
    
    /* Set MAC Speed and Duplex Mode */
    if(HAL_ETH_WritePHYRegister(heth, PHY_BCR, ((uint16_t)((heth->Init).DuplexMode >> 3U) |
                                                (uint16_t)((heth->Init).Speed >> 1U))) != HAL_OK)
    {
      /* In case of write timeout */
      err = ETH_ERROR;
      
      /* Config MAC and DMA */
      ETH_MACDMAConfig(heth, err);
      
      /* Set the ETH peripheral state to READY */
      heth->State = HAL_ETH_STATE_READY;
      
      /* Return HAL_ERROR */
      return HAL_ERROR;
    }  
    
    /* Delay to assure PHY configuration */
    HAL_Delay(PHY_CONFIG_DELAY);
  }
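The shifts in the forced-mode branch above are easier to follow with numbers. A small sketch, assuming the HAL speed/duplex constants are the MAC configuration register bits (0x4000 for 100M, 0x0800 for full duplex, as far as I can see in stm32f4xx_hal_eth.h; check your copy) and that the PHY basic control register uses the standard clause 22 bits 13 and 8. The macro and function names here are local stand-ins, not HAL names.

```c
#include <stdint.h>

/* Local copies of what I believe the HAL values are (MAC register bits). */
#define MAC_SPEED_10M        0x00000000u
#define MAC_SPEED_100M       0x00004000u  /* bit 14 */
#define MAC_MODE_HALFDUPLEX  0x00000000u
#define MAC_MODE_FULLDUPLEX  0x00000800u  /* bit 11 */

/* PHY Basic Control Register (register 0) bits, clause 22. */
#define BCR_SPEED_100        0x2000u      /* bit 13: 1 = 100 Mbps   */
#define BCR_FULL_DUPLEX      0x0100u      /* bit 8:  1 = full duplex */

/* Same arithmetic as the HAL excerpt: move MAC bit 14 to PHY bit 13
 * and MAC bit 11 to PHY bit 8. */
static uint16_t bcr_from_mac_config(uint32_t duplex_mode, uint32_t speed)
{
    return (uint16_t)((duplex_mode >> 3) | (speed >> 1));
}
```

So forcing 100M full duplex writes 0x2100 to register 0, and forcing 10M half duplex writes 0x0000, which also leaves auto-negotiation (bit 12) disabled.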

This same code is present in the working F429ZI project.

Added July 22:

Code in the ethernetif.c module, starting at line 664:

void ethernetif_update_config(struct netif *netif)
{
  __IO uint32_t tickstart = 0;
  uint32_t regvalue = 0;

I remember that a long, long time ago I was told NOT TO USE goto, and also to AVOID entering or exiting a control structure with this worst of constructs......

Lines 680 to 730 jump across more than one level of control structure, entering the else clause.. Why do we teach students not to use goto??

/* Wait until the auto-negotiation will be completed */
      do
      {
        HAL_ETH_ReadPHYRegister(&heth, PHY_BSR, &regvalue);
 
        /* Check for the Timeout ( 1s ) */
        if((HAL_GetTick() - tickstart ) > 1000)
        {
          /* In case of timeout */
          goto error; // <<<<---- HORROR!!!
        }
      } while (((regvalue & PHY_AUTONEGO_COMPLETE) != PHY_AUTONEGO_COMPLETE));
 
      /* Read the result of the auto-negotiation */
      HAL_ETH_ReadPHYRegister(&heth, PHY_SR, &regvalue); // <<-- the register address must be 0x1f, not 0x10 !!!
 
      /* Configure the MAC with the Duplex Mode fixed by the auto-negotiation process */
      if((regvalue & PHY_DUPLEX_STATUS) != (uint32_t)RESET) // <<-- the value must be 0x10, not 4
      {
        /* Set Ethernet duplex mode to Full-duplex following the auto-negotiation */
        heth.Init.DuplexMode = ETH_MODE_FULLDUPLEX;
      }
      else
      {
        /* Set Ethernet duplex mode to Half-duplex following the auto-negotiation */
        heth.Init.DuplexMode = ETH_MODE_HALFDUPLEX;
      }
      /* Configure the MAC with the speed fixed by the auto-negotiation process */
      if(regvalue & PHY_SPEED_STATUS) // <<-- the mask must be 0x0c and the test values 0x04 / 0x08; a simple test here can be for 0x04, good code tests 0x04 and 0x08 and sets an error otherwise (without goto, possibly...)
      {
        /* Set Ethernet speed to 10M following the auto-negotiation */
        heth.Init.Speed = ETH_SPEED_10M;
      }
      else
      {
        /* Set Ethernet speed to 100M following the auto-negotiation */
        heth.Init.Speed = ETH_SPEED_100M;
      }
    }
    else /* AutoNegotiation Disable */
    {
    error : // <<<<---- HORROR!!! crosses two control structures
      /* Check parameters */
      assert_param(IS_ETH_SPEED(heth.Init.Speed));
      assert_param(IS_ETH_DUPLEX_MODE(heth.Init.DuplexMode));
 
      /* Set MAC Speed and Duplex Mode to PHY */
      HAL_ETH_WritePHYRegister(&heth, PHY_BCR, ((uint16_t)(heth.Init.DuplexMode >> 3) |
                                                     (uint16_t)(heth.Init.Speed >> 1)));
    }
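One way to get the same behaviour as the excerpt above without the jump is to move the wait loop into a function that returns success or timeout, and let the caller pick the branch. This is a sketch, not the generated code: the function-pointer types and the `fake_*` helpers are hypothetical stand-ins so it can run on a PC; on the target they would wrap HAL_ETH_ReadPHYRegister() and HAL_GetTick().

```c
#include <stdbool.h>
#include <stdint.h>

#define PHY_REG_BSR       1u
#define BSR_AUTONEG_DONE  (1u << 5)

typedef uint16_t (*phy_read_fn)(uint8_t reg);  /* returns the register value  */
typedef uint32_t (*tick_fn)(void);             /* returns time in milliseconds */

/* Poll the status register until auto-negotiation completes or the
 * timeout expires. Returns true on completion, false on timeout. */
static bool wait_autoneg_done(phy_read_fn read_reg, tick_fn now_ms,
                              uint32_t timeout_ms)
{
    const uint32_t start = now_ms();
    do {
        if ((read_reg(PHY_REG_BSR) & BSR_AUTONEG_DONE) != 0u) {
            return true;
        }
    } while ((now_ms() - start) <= timeout_ms);
    return false;
}

/* Simulated PHY and tick source, only so the sketch runs off-target. */
static uint32_t fake_polls_left;  /* reads that still report "not done" */
static uint32_t fake_now;         /* advances 1 ms per tick query       */

static uint16_t fake_phy_read(uint8_t reg)
{
    (void)reg;
    if (fake_polls_left > 0u) {
        fake_polls_left--;
        return 0x7809u;           /* negotiation not complete, link down */
    }
    return 0x782Du;               /* negotiation complete, link up       */
}

static uint32_t fake_tick(void)
{
    return fake_now++;
}
```

The caller then reads as a plain condition: if auto-negotiation is enabled and `wait_autoneg_done(...)` returns true, use the negotiated speed and duplex; otherwise fall through to the forced-mode configuration. No label inside the else clause is needed.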

HAL_Delay and OS delay calls are intermixed.. The code is called from the default task, and HAL_Delay is not thread safe; the deadlocks and other issues I found in the past discourage its use...

HAL lock/unlock....

Critical section?? At least a monitor, a semaphore? Is a MUTEX required?

#if (USE_RTOS == 1U)
  /* Reserved for future use */
  #error "USE_RTOS should be 0 in the current HAL release"
#else
  #define __HAL_LOCK(__HANDLE__)                                           \
                                do{                                        \
                                    if((__HANDLE__)->Lock == HAL_LOCKED)   \
                                    {                                      \
                                       return HAL_BUSY;                    \
                                    }                                      \
                                    else                                   \
                                    {                                      \
                                       (__HANDLE__)->Lock = HAL_LOCKED;    \
                                    }                                      \
                                  }while (0U)
 
  #define __HAL_UNLOCK(__HANDLE__)                                          \
                                  do{                                       \
                                      (__HANDLE__)->Lock = HAL_UNLOCKED;    \
                                    }while (0U)
#endif /* USE_RTOS */

rromano001 · ‎2021-07-25

Results so far:

R72, the MDIO pull-up, is listed as 10K on the schematic; on the board it was reworked to 1K5, as I found on the reference schematic and in scattered places across the LAN8742A datasheet.

Suspecting that the F407 could be defective, I mounted an STM32F417VET6 bought from Mouser. I own just one more 417; I found some 407s available from secondary channels (eBay, AliExpress).

With the F417 mounted it still doesn't work: everything freezes after executing the Ethernet interrupt, line 1180 (stm32f4xx_hal_eth.c).

When I stop the CPU, the code is seldom in HardFault; often it shows port.c lines 754 770 (frozen)

configASSERT( ucCurrentPriority >= ucMaxSysCallPriority );
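For context, that assert sits in the interrupt priority check of the FreeRTOS Cortex-M port: it fires when an ISR that calls FreeRTOS API functions runs at a priority numerically lower (that is, more urgent) than the configured maximum syscall priority. Below is a minimal sketch of the comparison, assuming 4 priority bits as on the STM32F4 and a maximum syscall priority of 5, which I believe is the CubeMX default for configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY; the function names are mine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: 4 implemented priority bits and a syscall limit of 5.
 * Check FreeRTOSConfig.h in your own project for the real values. */
#define PRIO_BITS                 4u
#define LIB_MAX_SYSCALL_PRIORITY  5u

/* Shift a 0..15 "library" priority into the top bits of the 8-bit field,
 * which is the form the port compares. */
static uint8_t to_hw_priority(uint8_t lib_prio)
{
    return (uint8_t)(lib_prio << (8u - PRIO_BITS));
}

/* On Cortex-M a larger number means a less urgent interrupt, so an ISR
 * may use "FromISR" API calls only if its value is >= the limit. */
static bool isr_may_call_rtos(uint8_t lib_prio)
{
    return to_hw_priority(lib_prio) >= to_hw_priority(LIB_MAX_SYSCALL_PRIORITY);
}
```

Under these assumptions an Ethernet interrupt left at priority 0 fails the check, while 5 or higher passes, so the NVIC priority of the Ethernet interrupt is worth a look when the debugger stops on this line.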

I added an independent monochromatic blink LED in the hard fault loop; I never saw it blink or go solid.

VCC = 3.23 V, VCAP1 = VCAP2 = 1.25 V

With the network init code commented out, the remaining tasks work fine.

I feel afraid and extremely scared about this issue because, from what I read across the forum, it has plagued all Ethernet parts for a long time.

rromano001 · ‎2021-07-26

F417, generating a new project:

patching the PHY register address,

commenting out all network tasks other than LWIP_Init,

I got DHCP negotiation:

the IP address after negotiation remains zero (??)

I got the address from the router table; it responds to ping.

Enabled HTTPD: the browser works on the demo page, loading quite fast, but it slows down a lot during file access.

Enabling FreeRTOS config parameters:

With USE_STATS_Formatting and USE_TRACE_FACILITY it stays up and responds to the dynamic page (after patching the stack space).

Enabling GENERATE_RUN_TIME_STATS freezes everything.

The default task's 128Bytes stack is not enough; raised to 256 it seems more stable.

One thing is now ruled out: hardware. At least the 417 one is OK. The system is too unstable and not usable in an industrial environment.

Generated another base project targeted at the 407VE; as is, it froze. After applying the stack increase to the default task it responds to ping.

Another strange thing: why are the 407 and 417 different? One works as is, the other is immediately unstable. (Not the hardware, but from the perspective of CubeIDE.)

The cable MUST be connected at reset, otherwise it doesn't work.

When the network was connected ~~one time~~ at reset, ping resumes after repeatedly disconnecting and reconnecting the network cable.

(Edit August 21) The error comes from weird code that disables the interface instead of managing link up and down.

The 407 board may be damaged, it doesn't work; I'm just waiting for some new 407s I ordered. When they arrive I'll try reworking both the MCU and the PHY.

rromano001 · ‎2021-08-18

Updated CubeIDE to the new version 1.7.0:

many RTOS settings reverted to default; all the parameters listed need to be checked again.

A fresh new project now responds to ping. The PHY address is still set to 1 when all boards use 0. The status register address is still 0x10.

The old project responds to ping after restoring the settings, but the HTTP task no longer works. (Work in progress)

Sebastiaan · ‎2021-08-18

I think I've seen issues with hard faults in the past when the configured stack was too low. It's good to enable the stack overflow monitoring in freertos configuration. I would anyway increase the stack to e.g. 768 to have a bit of margin...

I have a working project with quite some IP functionality on cubemx 5.6.1. Once ping is working fine, it doesn't take too long to send a HTTP request and walk through the ethernet layers to see where the packet is discarded.

rromano001 · ‎2021-08-19

Hello Sebastiaan, first of all, thanks for the support.

As written in the text, the hard fault is not fired; the system freezes and loses control. Yes, it resembles a stack failure.

The stack overflow monitor was enabled, but it is of no help when the MCU is frozen with no idea where it froze, so I caught it by single-stepping the code or by dynamically modifying a bit of an execution flag.

About the working project: the same code was working on the F417 and not on the F407, and at this time it is HARD to get new parts to test. It works fine hosted on the F429ZI Nucleo board.

The project dates from February; the prototypes never reached a usable level. It has now been reworked to use an F750 or, better, the H750 I use on another project. Until 2022, when we all hope silicon will be available again, we have little choice but to use what we have.

The project is FPGA-centric: a retrofit of the old control of a huge and expensive machine. The purpose of this small board is to send and receive data from the master FPGA, simulating an ancient USART that is no longer available. Nowadays PCs no longer carry an RS232 port, so this adds alternative and broader support via network: USART to UDP/TCP, Telnet, and remote video and keyboard. But it is still not finished due to blocking issues.

About your project, could you kindly run some tests?

Try forcing Ethernet to half/full duplex or, worse, 10 Mbps and tell me if it works. Industrial networks MUST support 10BaseT.

Start with the cable unplugged and check whether the connection comes up when the cable is plugged in.

Do some sort of Ethernet overload test; a ping flood is a good start. Is it still running, or does it crash?

Best Regards

Roberto

Sebastiaan · ‎2021-08-19

It will take at least a few days before I can verify this. Anyway, it means that F417 was fine, and I would assume full HW equivalence.

With respect to "pluggability", yes we have tested this before. See also my post on https://community.st.com/s/question/0D50X0000C7cinu/stm32f207vc-ethernet-with-lwip-works-after-flashing-but-not-any-more-after-power-supply-is-set-off-and-on-again . It's recommended to call "netif_set_up(&gnetif);" directly after MX_LWIP_Init() (or in the same function, if you want to edit that code). The NETIF will need to be set up such that the ethernetif_set_link() function can eventually set the LINK up (a next step) when a cable is connected.
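The link handling described here boils down to detecting edges on the PHY link bit and reacting to each edge once. A minimal sketch of just that logic, assuming the link bit is bit 2 of the basic status register; the enum and function names are hypothetical, and on the target the two events would map to lwIP's netif_set_link_up(&gnetif) and netif_set_link_down(&gnetif).

```c
#include <stdbool.h>
#include <stdint.h>

#define BSR_LINK_UP (1u << 2)  /* link status bit in register 1 */

typedef enum {
    LINK_NO_CHANGE,
    LINK_WENT_UP,     /* target: call netif_set_link_up()   */
    LINK_WENT_DOWN    /* target: call netif_set_link_down() */
} link_event_t;

/* Compare the current link bit with the remembered state and report an
 * edge. The interface itself stays "up"; only the link state changes. */
static link_event_t link_poll(bool *was_up, uint16_t bsr)
{
    const bool is_up = (bsr & BSR_LINK_UP) != 0u;
    link_event_t ev = LINK_NO_CHANGE;

    if (is_up && !*was_up) {
        ev = LINK_WENT_UP;
    } else if (!is_up && *was_up) {
        ev = LINK_WENT_DOWN;
    }
    *was_up = is_up;
    return ev;
}
```

Called periodically from a task with the value read from register 1, this reports one event per plug or unplug. Note that the link bit latches low in clause 22 PHYs, so reading the register twice gives the current state rather than a past drop.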

Also take into account that the reception of packets is interrupt driven, maybe your interrupt pin is not configured properly?

So first make sure that link_up is properly detected, then verify if you can ping it (with correct IP settings and IRQ pin), and the higher layers are mostly SW (apart from e.g. MAC address filtering settings which I also needed to change to allow intake of multicast packets).