‎2018-10-17 09:19 AM
I have attached a trace file showing the error.
When ethernetif_input() receives multiple frames in its for() loop, the 2nd frame is assigned the 1st frame's buffer, so the 1st frame is processed twice. Every following frame in the receive loop then uses the wrong receive buffer: the previous frame's buffer. When I sent 10 UDP frames that were received within the for() loop, the 1st frame was correct, the 2nd was a duplicate of the 1st, the remaining frames were assigned the wrong buffer, and the last frame was never received. This is very repeatable. It happens whenever 2 or more frames are received by ethernetif_input(). Just being connected to a busy Ethernet network is sufficient to receive multiple frames.
The attached file is too long to post inline.
One more issue to consider in the ethernetif_input() for() loop:
void ethernetif_input( void const * argument )
{
  struct pbuf *p;
  struct netif *netif = (struct netif *) argument;

  for( ;; )
  {
    if( osSemaphoreWait( RxPktSemaphore, TIME_WAITING_FOR_INPUT ) == osOK )
    {
      do
      {
        p = low_level_input( netif );
        if (p != NULL)
        {
          /* Hand the frame to the stack; free it only if the stack refuses it */
          if (netif->input( p, netif) != ERR_OK )
          {
            pbuf_free(p);
          }
        }
        /* Build Rx descriptor to be ready for next data reception */
        HAL_ETH_BuildRxDescriptors(&heth);
      } while(p != NULL);
    }
  }
}
The call to netif->input( p, netif) only places p into the tcpip_input() mbox. At that point the Rx_Buff behind the payload has not been processed yet. When HAL_ETH_BuildRxDescriptors(&heth) is called, the Rx_Buff is handed back to the DMA (OWN set), so the DMA can reuse the Rx_Buff before or while the IP stack is processing its contents, which can lead to corruption.
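One way to avoid that race is to copy the frame out of the DMA buffer before the descriptor is handed back. The following is a minimal host-side sketch under stated assumptions, not the HAL's code: `rx_slot_t`, `rx_copy_and_release()`, `RX_BUF_SIZE` and the `dma_owned` flag are hypothetical stand-ins for an Rx descriptor, its Rx_Buff and the OWN bit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RX_BUF_SIZE 1524u  /* assumed buffer size, for illustration only */

/* Hypothetical model of one Rx descriptor plus its buffer. */
typedef struct {
    uint8_t buf[RX_BUF_SIZE]; /* stands in for Rx_Buff */
    size_t  len;              /* length of the received frame, 0 if empty */
    bool    dma_owned;        /* stands in for the descriptor's OWN bit */
} rx_slot_t;

/*
 * Copy the received frame into memory owned by the stack, and only then
 * give the slot back to the DMA. Returns the number of bytes copied, or 0
 * if the slot is still owned by the DMA, holds no valid frame, or the
 * destination is too small (in which case the slot is not released).
 */
size_t rx_copy_and_release(rx_slot_t *slot, uint8_t *dst, size_t dst_size)
{
    if (slot->dma_owned || slot->len == 0 ||
        slot->len > RX_BUF_SIZE || slot->len > dst_size) {
        return 0;
    }
    size_t n = slot->len;
    memcpy(dst, slot->buf, n);
    slot->len = 0;
    slot->dma_owned = true;   /* safe now: the stack no longer reads slot->buf */
    return n;
}
```

The cost is one memcpy per frame. A zero-copy design would instead have to keep the descriptor away from the DMA until the stack has freed the pbuf.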
‎2020-01-15 02:53 PM
H7 issues listed here:
H7 Issues listed by Piranha at https://community.st.com/s/question/0D50X0000BOtfhnSQB:
Further comments:
‎2020-01-15 03:09 PM
@alister​ - Thanks for the exhaustive reply!
I also noticed the init changes, which to me at first glance look incorrect:
‎2020-02-02 03:37 AM
In reply to @Dave Nadler and @alister regarding my issue list. Alister's code probably fixes points 3-6, if the implementation is correct. Points 1, 2 and 7 are still broken. Check out my updated topic, as I've added a more detailed description of points 1 and 2, and updated information on one of the most serious bugs in the non-H7 series, which was recently fixed by ST. As for point 7, here is an example from alister's code:
void ethernet_link_thread(void const * argument)
{
  struct netif *netif = (struct netif *) argument;

  for(;;)
  {
    HAL_ETH_Start_IT(&heth);
    netif_set_up(netif);
    netif_set_link_up(netif);
    osDelay(100);
  }
}
It's not allowed to call RAW API functions from other threads without proper protection. This is stated clearly in the lwIP documentation, to which I've provided links in my topic. The use of netif_set_up() is also wrong: it's not for link status. On top of that, the code doesn't check the actual link state and calls all of these functions on every pass through the loop. This code is completely broken!
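A minimal sketch of the pattern described above: poll the PHY, act only when the link state changes, and hold the core lock around the netif calls. It is written to compile on a host, so everything prefixed `sim_` plus `phy_link_is_up()`, `core_lock()`, `core_unlock()`, `report_link_up()` and `report_link_down()` are hypothetical stand-ins. In a real lwIP port they would presumably map to a PHY status read, `LOCK_TCPIP_CORE()` / `UNLOCK_TCPIP_CORE()` (assuming `LWIP_TCPIP_CORE_LOCKING` is enabled), and `netif_set_link_up()` / `netif_set_link_down()`.

```c
#include <stdbool.h>

/* --- Hypothetical stand-ins so the sketch runs without lwIP or a PHY --- */
static bool sim_phy_up;          /* simulated PHY link status */
static int  sim_lock_depth;      /* > 0 while the "core lock" is held */
static int  sim_up_calls;        /* times link-up was reported to the stack */
static int  sim_down_calls;      /* times link-down was reported to the stack */
static int  sim_unlocked_calls;  /* reports made without the lock; must stay 0 */

static bool phy_link_is_up(void) { return sim_phy_up; }
static void core_lock(void)      { sim_lock_depth++; }  /* real port: LOCK_TCPIP_CORE() */
static void core_unlock(void)    { sim_lock_depth--; }  /* real port: UNLOCK_TCPIP_CORE() */

static void report_link_up(void)    /* real port: netif_set_link_up(netif) */
{
    if (sim_lock_depth <= 0) sim_unlocked_calls++;
    sim_up_calls++;
}

static void report_link_down(void)  /* real port: netif_set_link_down(netif) */
{
    if (sim_lock_depth <= 0) sim_unlocked_calls++;
    sim_down_calls++;
}

/*
 * Call periodically from the link thread. *last_up holds the state last
 * reported to the stack. Returns true if a change was reported.
 */
bool link_poll(bool *last_up)
{
    bool now_up = phy_link_is_up();
    if (now_up == *last_up) {
        return false;               /* no change: do not touch the stack */
    }
    core_lock();
    if (now_up) {
        report_link_up();
    } else {
        report_link_down();
    }
    core_unlock();
    *last_up = now_up;
    return true;
}
```

netif_set_up() is left out on purpose: it marks the interface administratively up, so it would normally be called once at init rather than on every link change.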
‎2020-02-02 02:16 PM
@Piranha​, thanks for the feedback.
>Alister's code probably fixes points 3-6, if the implementation is correct. Points 1, 2 and 7 are still broken.
I'd inspected ST's Cube H7_FW V1.5.0 source code and fixed every problem I'd found. Cube H7_FW V1.6.0 is the same as V1.5.0.
Point 1, Missing compiler and CPU memory barriers.
I've not seen this fail. I'm loath to add its cycles if it's not really warranted.
Could you provide more information please?
Point 2, MMC counter interrupts not masked.
ST's Cube H7_FW V1.5.0 does not use the MMC and at this time I'm not adding it.
So this is safe.
Point 7, Calling lwIP RAW API without protection.
Yes this slipped through. I replace ethernet_link_thread with EthPhy for my custom phy. I should have commented ethernet_link_thread out.
I haven't shared EthPhy as it's not the topic.
Another bug in ethernet_link_thread is that it sets the link up before it knows the link is up, so if the link is actually down, the "up" is spurious.
‎2020-02-02 02:32 PM
This weekend I added links to detailed descriptions for points 1 and 2 in my issue list topic. Check it out! :)
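For readers following point 1, here is a rough sketch of where such a barrier would sit when a descriptor is handed to the DMA. It is not taken from the linked topic or from ST's driver: `rx_desc_t`, `DESC_OWN` and both functions are hypothetical, and the C11 fence is only a host-compilable stand-in for a barrier such as `__DMB()` on the target (strictly speaking, C11 fences are specified relative to atomic operations, so on real hardware the CMSIS barrier intrinsics would be the thing to use).

```c
#include <stdatomic.h>
#include <stdint.h>

#define DESC_OWN (1u << 31)   /* assumed position of the OWN bit, for illustration */

/* Hypothetical, simplified Rx descriptor: one buffer address, one control word. */
typedef struct {
    volatile uint32_t buf_addr;
    volatile uint32_t ctrl;
} rx_desc_t;

/* Hand a descriptor to the DMA: fill it in first, set OWN last. */
void rx_desc_give_to_dma(rx_desc_t *d, uint32_t buf_addr)
{
    d->buf_addr = buf_addr;
    /* Keep the write above from being reordered after the OWN write.
       On a Cortex-M7 target this would typically be __DMB(). */
    atomic_thread_fence(memory_order_release);
    d->ctrl = DESC_OWN;
}

/* Returns nonzero if the CPU currently owns the descriptor. */
int rx_desc_cpu_owns(const rx_desc_t *d)
{
    uint32_t ctrl = d->ctrl;
    /* Keep later reads of the buffer from being moved before the OWN check. */
    atomic_thread_fence(memory_order_acquire);
    return (ctrl & DESC_OWN) == 0;
}
```

Without the barrier, the compiler or the CPU may make the OWN write visible before the buffer address, and the DMA could then fetch a half-written descriptor.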
‎2021-02-22 02:01 AM
now a year later... the bug still happens :(
We are working on an STM32H7 product, and are thinking about changing to another vendor that will listen. I fear we are not the only ones thinking about this, and I would hate it if we had to stop using the STM32 :(
‎2021-02-22 03:28 PM
>now a year later... the bug still happens :(
Please list the H7 FW version that's still not working.
Have you tried https://community.st.com/s/question/0D50X0000C6eNNSSQ2/bug-fixes-stm32h7-ethernet?
‎2021-02-23 01:30 AM
Hello All,
Our team are working to resolve Ethernet issues.
The Ethernet driver will be fully reworked for multiple fixes by 21Q2.
Thank you for your patience while we work on this.
Imen
‎2021-07-13 09:51 AM
21Q2 is over. Is it fixed already? And if not, then what is the latest ETA?
‎2021-07-14 05:30 AM
IIRC there was intent to provide some fix or change for ThreadX integration.
I really hope that ST would split the ETH driver to a chip-specific and multiple "middleware" parts, as demonstrated here.
ST will own the chip-specific ("HAL") module.
Each network stack (LwIP, FreeRTOS, ThreadX ...) will get its own "middleware" part which deals with memory and DMA descriptors.
90% of bugs and user complaints refer to this "middleware" part so outsourcing it will benefit both ST and users.
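To make the proposed split a bit more concrete, here is a small sketch of what the boundary could look like. All names (`eth_stack_ops_t`, `eth_hal_on_rx_complete()`, the `toy_` middleware) are hypothetical and only illustrate the idea; this is not ST's API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface the chip-specific driver calls into.
   Each network stack would supply its own implementation. */
typedef struct {
    /* Provide an empty buffer for the DMA to receive into; NULL if none left. */
    uint8_t *(*rx_alloc)(void *ctx, size_t size);
    /* Deliver a completed frame to the stack, which now owns the buffer. */
    void (*rx_deliver)(void *ctx, uint8_t *buf, size_t len);
    void *ctx;
} eth_stack_ops_t;

/* Chip-specific side: on receive-complete, pass the filled buffer up and
   fetch a replacement for the descriptor. Returns the replacement or NULL. */
uint8_t *eth_hal_on_rx_complete(const eth_stack_ops_t *ops,
                                uint8_t *filled, size_t len, size_t buf_size)
{
    ops->rx_deliver(ops->ctx, filled, len);
    return ops->rx_alloc(ops->ctx, buf_size);
}

/* Toy "middleware" used only to exercise the sketch: a two-buffer pool. */
typedef struct {
    uint8_t bufs[2][64];
    int     in_use[2];
    size_t  last_len;   /* length of the most recently delivered frame */
    int     delivered;  /* number of frames delivered so far */
} toy_stack_t;

uint8_t *toy_alloc(void *ctx, size_t size)
{
    toy_stack_t *s = ctx;
    if (size > sizeof s->bufs[0]) {
        return NULL;
    }
    for (int i = 0; i < 2; i++) {
        if (!s->in_use[i]) {
            s->in_use[i] = 1;
            return s->bufs[i];
        }
    }
    return NULL;        /* pool exhausted */
}

void toy_deliver(void *ctx, uint8_t *buf, size_t len)
{
    toy_stack_t *s = ctx;
    (void)buf;          /* a real stack would queue the buffer for processing */
    s->last_len = len;
    s->delivered++;
}
```

With a boundary like this, buffer lifetime is decided entirely by the stack-specific part, and the chip-specific part only moves descriptors.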
-- pa