Poor H7 Ethernet Rx Descriptor DMA handler implementation.

Took me quite a while to work through this lot and identify what was actually going on, so here's a heads up for anyone else. The symptoms I was seeing were TCP failures in high traffic conditions (~0.2 ms per rx). In particular, accessing port 80 / 0x0050 (HTTP) caused LwIP to complain about a not-found socket connection for port 20480 / 0x5000, which is the same value byte-swapped. The cause was that a receive buffer was being reused while LwIP was still processing it. The endian conversions are what exposed it: after LwIP had already transformed the original packet's data from network-endian to host-endian, the buffer was overwritten with a new packet's network-endian data, which LwIP then treated as if it were host-endian.
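To make the 80 vs 20480 clue concrete, here is a minimal sketch of the arithmetic in plain C. The helper names `swap16` and `read_be16` are made up for illustration and are not LwIP or HAL functions; `swap16` simply does what a network-to-host conversion does on a little-endian core such as the H7.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. On a little-endian host this is
 * what a network-to-host (or host-to-network) conversion amounts to. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Read a 16-bit big-endian (network-order) field from raw wire bytes.
 * This works the same way regardless of host endianness. */
uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}
```

On the wire, port 80 is the byte pair `00 50`, so `read_be16` on those bytes gives 80. If code that believes the field is already host-endian instead sees freshly written network-endian bytes, a little-endian host reads `swap16(80)`, which is 0x5000 = 20480. Seeing exactly that byte-swapped port in the LwIP complaint is a strong hint that the buffer contents changed underneath the stack.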

Turned out the following were contributory and/or noticeably defective in general.

  1. Buffers marked as available for reuse by the DMAC after being *queued* by LwIP, not after being completely processed by LwIP or the app.
  2. Too much work being done in the receive ISR: instead of notifying a worker thread that there was stuff to be done, the ISR was trying to do some of it.
  3. Calling the same (mutating) routine from both the ISR and the service thread, with a bodged-together multi-use flag to prevent them overlapping.
  4. The routine above, named ETH_HAL_IsRxDataAvailable(), also mutates state, contrary to what its name implies.
  5. Excessively and needlessly complex code for maintaining the app-side data structure for managing the descriptor list.
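As an illustration of the alternative to points 1, 2 and 4 (not the code I ended up with, and not the HAL's API), here is a minimal host-side sketch of the ownership rules. All names (`rx_ring_t`, `rx_isr_notify`, `rx_claim`, `rx_release` and so on) are hypothetical. It assumes a single ISR producer and a single worker-thread consumer on one core, and it models the descriptor ownership in software rather than touching real descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_COUNT 4u   /* power of two, so counter wrap-around stays consistent */

typedef enum { BUF_DMA, BUF_STACK } rx_owner_t;

typedef struct {
    rx_owner_t owner[RX_BUF_COUNT]; /* who may touch each buffer right now */
    volatile uint32_t produced;     /* written only by the ISR */
    volatile uint32_t consumed;     /* written only by the worker thread */
} rx_ring_t;

void rx_ring_init(rx_ring_t *r)
{
    for (size_t i = 0; i < RX_BUF_COUNT; i++) {
        r->owner[i] = BUF_DMA;
    }
    r->produced = 0;
    r->consumed = 0;
}

/* ISR side: record that a frame completed and nothing else. A real ISR
 * would also wake the worker (e.g. via an RTOS semaphore). */
void rx_isr_notify(rx_ring_t *r)
{
    r->produced++;
}

/* Pure query: reports state and changes nothing. */
bool rx_frame_pending(const rx_ring_t *r)
{
    return r->produced != r->consumed;
}

/* Pure query: may the DMA write into buffer idx? */
bool rx_dma_may_write(const rx_ring_t *r, size_t idx)
{
    return r->owner[idx] == BUF_DMA;
}

/* Worker side (mutating): take the next completed buffer for the stack.
 * Returns the buffer index, or -1 if nothing is pending. */
int rx_claim(rx_ring_t *r)
{
    if (r->produced == r->consumed) {
        return -1;
    }
    size_t idx = r->consumed % RX_BUF_COUNT;
    r->owner[idx] = BUF_STACK;
    r->consumed++;
    return (int)idx;
}

/* Worker side (mutating): call only when the stack/app has completely
 * finished with the data, not merely when it has been queued. */
void rx_release(rx_ring_t *r, int idx)
{
    assert(idx >= 0 && (size_t)idx < RX_BUF_COUNT);
    assert(r->owner[idx] == BUF_STACK);
    r->owner[idx] = BUF_DMA;
}
```

The points of the sketch: the ISR does one trivial thing, the query functions have no side effects, and a buffer only becomes writable by the DMA again through `rx_release()`. In a real port that release would most likely be driven from the pbuf free path, which is an assumption about how you wire it up rather than something the HAL does for you.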

I slightly rewrote the code, mostly just simplifying it while maintaining the implied functionality, and separating mutating functions from queries. (The design and naming of functions in the HAL is horrendous.) I also doubled the number of receive buffers in use to reduce the risk of overruns in future.

Can't share the code, but hopefully documenting the cause is some help.