
STM32H723 Ethernet UDP Transmit Fails After Several Hours (pbuf_alloc Failure)

ajmw_
Associate III

I am testing Ethernet on STM32H723ZGT6 by transmitting UDP packets (1024 bytes) every 1 ms.

Setup / Implementation:

  • No CubeMX / HAL used

  • Fully bare-metal Ethernet driver (lwIP integrated manually)

  • Ethernet interrupts enabled and serviced

  • RX handled inside while(1) loop

  • Non-blocking transmit path

  • MPU enabled

Memory configuration:

  • RX pool located in AXI RAM

  • lwIP heap: 32232 bytes in D2_SRAM

Observed Behavior:

  • System runs normally for many hours (sometimes >12 hours)
  • Eventually UDP transmit stops working

  • Ping continues to work without issues

Debugging shows failure occurs at:

 pbuf_alloc(PBUF_TRANSPORT, len, PBUF_RAM);

After failure:

1) pbuf_alloc() returns NULL, so udp_server_send() returns ERR_MEM

2) ETH->DMACSR reads 0

Main loop:

while (1)
{
    sys_check_timeouts();                  // service lwIP timers

    if (eth_rx_rdy)                        // RX-ready flag
    {
        eth_rx_rdy = 0;
        ethernetif_poll(&gnetif);          // pass received frames to lwIP
        ETH->DMACIER |= ETH_DMACIER_RIE;   // re-enable the receive interrupt
    }

    if (sys_now() - udp_tx_timer >= 1)     // 1 ms transmit interval
    {
        ret = udp_server_send(udp_tx, sizeof(udp_tx));
        if (ret != ERR_OK)
        {
            printf("err: %lu\r\n", (unsigned long)ETH->DMACSR);
        }
        udp_tx_timer = sys_now();
    }
}

UDP send function:

err_t udp_server_send(const void *out, u16_t len)
{
    if (!upcb)
        return ERR_VAL;

    // Allocate a pbuf from the lwIP heap, with room for the protocol headers
    struct pbuf *p = pbuf_alloc(PBUF_TRANSPORT, len, PBUF_RAM);
    if (!p)
        return ERR_MEM;                    // the failure seen after several hours

    err_t err = pbuf_take(p, out, len);    // copy the payload into the pbuf
    if (err == ERR_OK)
    {
        err = udp_send(upcb, p);
        if (err == ERR_OK)
            tx_sent++;
    }
    pbuf_free(p);                          // drop this function's reference
    return err;
}

Questions:

  1. What could cause pbuf allocation failure after a long run time while ping still works?

  2. Could this indicate lwIP heap exhaustion / memory leak / fragmentation?

  3. Is there anything STM32H7-specific (DMA / cache / MPU / memory region placement) that might trigger this?

  4. Does DMACSR = 0 provide any diagnostic meaning in this scenario?

  5. For UDP transmission, would it be better to use PBUF_ROM or PBUF_REF instead of PBUF_RAM, given that PBUF_POOL is generally not recommended for TX?

Any suggestions or debugging directions?

~AJ

1 ACCEPTED SOLUTION

Accepted Solutions
LCE
Principal II

You could try using PBUF_ROM, which - I think - would also save you the copying time.

But then check how to set up your UDP send functions and buffer + pbuf handling.


5 REPLIES
LCE
Principal II

Check where pbuf_free() is used for UDP sending.

I mostly stream TCP packets at up to ~50 Mbit/s, also for hours at a time.
In LWIP's tcp_out.c: pbuf_alloc(PBUF_TRANSPORT, u16SegLen, PBUF_ROM), then the payload pointer is set to the custom data streaming buffers placed in OCTOSPI / HyperRam.

 

Each descriptor also has a pointer to the used (first) pbuf, and a custom flag field (mostly used for TCP ACK status, and PTP using UDP).
In the regularly called function EthTxReleasePackets() these descriptor flags are checked, and if there's still a pbuf assigned to the descriptor, it is freed with pbuf_free().
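
The deferred-release scheme described above can be modelled on a host PC. This is a minimal sketch, not LCE's actual driver: tx_buf_t, tx_desc_t, tx_queue(), sim_dma_complete_one() and tx_release_completed() are hypothetical names, and a plain reference counter stands in for lwIP's pbuf_ref() / pbuf_free(). It illustrates why a driver that keeps a buffer reference per descriptor leaks memory if the release function stops running or skips a descriptor.

```c
#include <stddef.h>
#include <stdint.h>

#define TX_DESC_COUNT 4u

/* Hypothetical stand-in for a reference-counted packet buffer. */
typedef struct {
    uint8_t ref;
} tx_buf_t;

/* Hypothetical software view of one TX descriptor. */
typedef struct {
    volatile uint8_t owned_by_dma; /* 1 while the (simulated) DMA owns it */
    tx_buf_t *buf;                 /* buffer to release after transmission */
} tx_desc_t;

static tx_desc_t tx_ring[TX_DESC_COUNT];
static uint32_t tx_head; /* next descriptor to fill */
static uint32_t tx_tail; /* oldest descriptor not yet released */
static uint32_t hw_idx;  /* next descriptor the simulated DMA completes */

void buf_ref(tx_buf_t *b)  { b->ref++; }
void buf_free(tx_buf_t *b) { if (b->ref > 0u) { b->ref--; } }

/* Queue a buffer for transmission. Returns 0 on success, -1 if the
 * slot still holds a buffer that has not been released. */
int tx_queue(tx_buf_t *b)
{
    tx_desc_t *d = &tx_ring[tx_head % TX_DESC_COUNT];
    if (d->buf != NULL) {
        return -1;
    }
    buf_ref(b);            /* the descriptor now holds its own reference */
    d->buf = b;
    d->owned_by_dma = 1u;
    tx_head++;
    return 0;
}

/* Simulate the DMA finishing the oldest pending descriptor. */
void sim_dma_complete_one(void)
{
    tx_desc_t *d = &tx_ring[hw_idx % TX_DESC_COUNT];
    if (d->buf != NULL && d->owned_by_dma) {
        d->owned_by_dma = 0u;
        hw_idx++;
    }
}

/* Release the buffers of all completed descriptors, oldest first.
 * Returns the number of buffers released. */
uint32_t tx_release_completed(void)
{
    uint32_t released = 0u;
    while (tx_tail != tx_head) {
        tx_desc_t *d = &tx_ring[tx_tail % TX_DESC_COUNT];
        if (d->owned_by_dma) {
            break;         /* still in flight, stop here */
        }
        buf_free(d->buf);  /* drop the descriptor's reference */
        d->buf = NULL;
        tx_tail++;
        released++;
    }
    return released;
}
```

In this model the sender's own free only drops the count from 2 to 1. The buffer is returned only when tx_release_completed() runs after the DMA has finished, so a release path that is never called keeps every queued buffer alive.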

 

The PTPd (heavily modified) UDP is using pbuf_alloc(PBUF_TRANSPORT, i16Length, PBUF_RAM).
But this is used only a few times per second.
I just noticed that the PTP UDP send function frees the pbuf immediately after calling lwIP's udp_sendto().
And other functions (e.g. SNTP) do the same.

 

Don't know if that helps... good luck!

Andrew Neil
Super User

@ajmw_ wrote:
  1. What could cause pbuf allocation failure after long run time while ping still works?

  2. Could this indicate lwIP heap exhaustion / memory leak / fragmentation?


Yes, it certainly sounds like it!
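
A rough budget makes the heap-exhaustion theory concrete. The sketch below assumes typical lwIP 2.x sizes on a 32-bit target (IPv4 only, ETH_PAD_SIZE of 0, 54 bytes of header room for PBUF_TRANSPORT, a 16-byte struct pbuf, 8 bytes of heap bookkeeping per block). None of these sizes comes from the thread, so check them against your own lwipopts.h. Only the 32232-byte heap, the 1024-byte payload and the 1 ms interval are taken from the original post.

```c
#include <stdint.h>

/* Figures from the original post */
#define HEAP_BYTES         32232u /* lwIP heap size */
#define PAYLOAD_BYTES      1024u  /* UDP payload per packet */
#define PACKETS_PER_SECOND 1000u  /* one packet every 1 ms */

/* Assumed sizes (not from the thread), see the text above */
#define HDR_RESERVE_BYTES  54u    /* assumed PBUF_TRANSPORT header room */
#define PBUF_STRUCT_BYTES  16u    /* assumed sizeof(struct pbuf) */
#define HEAP_HDR_BYTES     8u     /* assumed per-block heap bookkeeping */

/* Round up to a multiple of 4 (assumed MEM_ALIGNMENT of 4). */
static uint32_t align4(uint32_t n)
{
    return (n + 3u) & ~3u;
}

/* Approximate heap bytes consumed by one PBUF_RAM transmit pbuf. */
uint32_t tx_pbuf_footprint(void)
{
    return HEAP_HDR_BYTES
         + align4(PBUF_STRUCT_BYTES)
         + align4(HDR_RESERVE_BYTES)
         + align4(PAYLOAD_BYTES);
}

/* Number of unreleased transmit pbufs that fit before the heap is full. */
uint32_t leaked_pbufs_to_exhaust_heap(void)
{
    return HEAP_BYTES / tx_pbuf_footprint();
}

/* Packets sent over a given number of hours at the stated rate. */
uint32_t packets_sent_in_hours(uint32_t hours)
{
    return hours * 3600u * PACKETS_PER_SECOND;
}
```

Under these assumptions each transmit pbuf costs about 1104 bytes, so roughly 29 unreleased pbufs fill the heap, out of about 43.2 million packets sent in 12 hours. A leak on every packet would show up within tens of milliseconds, so a failure after many hours points to a rare event, for example an occasional missed release, rather than a systematic one. That would also be consistent with ping still working, since an echo reply can normally reuse the received pbuf instead of allocating from the heap.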

 


@ajmw_ wrote:

Any suggestions or debugging directions?


Instrument your memory allocation & release.

Inspect your heap when the allocation failure occurs.
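
As an illustration of what such instrumentation could look like, here is a minimal host-side sketch. The names tracked_alloc(), tracked_free() and tracked_stats() are hypothetical, and malloc()/free() stand in for the real allocator. In an lwIP build, enabling LWIP_STATS and MEM_STATS should give comparable counters in lwip_stats.mem (used, max, err) without custom wrappers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Counters updated on every allocation and release. */
typedef struct {
    uint32_t allocs;          /* successful allocations */
    uint32_t frees;           /* releases */
    uint32_t failures;        /* allocations that returned NULL */
    uint32_t outstanding;     /* blocks currently allocated */
    uint32_t max_outstanding; /* high-water mark of outstanding */
} alloc_stats_t;

static alloc_stats_t g_stats;

/* Allocate n bytes and record the outcome. */
void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL) {
        g_stats.failures++;
        return NULL;
    }
    g_stats.allocs++;
    g_stats.outstanding++;
    if (g_stats.outstanding > g_stats.max_outstanding) {
        g_stats.max_outstanding = g_stats.outstanding;
    }
    return p;
}

/* Release a block obtained from tracked_alloc(). NULL is ignored. */
void tracked_free(void *p)
{
    if (p == NULL) {
        return;
    }
    free(p);
    g_stats.frees++;
    g_stats.outstanding--;
}

/* Read-only access to the counters, e.g. for periodic logging. */
const alloc_stats_t *tracked_stats(void)
{
    return &g_stats;
}
```

Logging these counters periodically shows whether the outstanding count creeps upward over hours (a leak) or stays flat until a sudden failure (more likely a stalled release path or fragmentation).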

 

Some tips on LwIP debugging here.

A complex system that works is invariably found to have evolved from a simple system that worked.
A complex system designed from scratch never works and cannot be patched up to make it work.
ajmw_
Associate III

Thanks for the explanation.

In my case, I'm not using PTP or any time-sync protocol. My application only performs high-rate UDP data streaming. 

The UDP transmit path is:

pbuf_alloc(PBUF_TRANSPORT, len, PBUF_RAM);
pbuf_take(...);
udp_send(...);
pbuf_free(...);

 

~AJ

LCE
Principal II

You could try using PBUF_ROM, which - I think - would also save you the copying time.

But then check how to set up your UDP send functions and buffer + pbuf handling.

ajmw_
Associate III

Hi @LCE ,

After changing the pbuf type from PBUF_RAM to PBUF_ROM, the UDP streaming has been running continuously for the last four days without any interruptions.

That said, I noticed that the TCP client in the same application disconnected during this time. I am currently analyzing the cause.

Thank you. 

~ AJ