2019-05-03 02:58 AM
Took me quite a while to work through this lot and identify what was actually going on, but here's a heads-up for anyone else. The symptoms I was seeing were TCP failures under high traffic (~0.2 ms per rx). In particular, accessing port 80 / 0x0050 (HTTP) caused LwIP to complain about a not-found socket connection for port 20480 / 0x5000, i.e. the same port number with its two bytes swapped. The cause was that a receive buffer was being reused while still being processed by LwIP: the network-endian/host-endian conversions were being defeated by the packet data being overwritten with new network-endian data after the original packet had already had its data converted to host-endian.
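To show where 20480 comes from, here is a minimal sketch of the failure mode. It is not the HAL or LwIP code; every function name is made up for illustration, and it assumes a little-endian host and a stack that converts the port field in place.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 16-bit field stored in network (big-endian) byte order. */
uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a 16-bit field the way a little-endian host would (assumption). */
uint16_t load_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Store a 16-bit value the way a little-endian host would (assumption). */
void store_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}

/* Hypothetical in-place network-to-host conversion of a header field. */
void port_to_host_inplace(uint8_t *field)
{
    store_le16(field, load_be16(field));
}

/* Simulates the race and returns the port the stack ends up looking for. */
uint16_t port_seen_after_overwrite(void)
{
    uint8_t field[2] = {0x00, 0x50};   /* port 80 as it arrives on the wire */

    port_to_host_inplace(field);       /* stack converts the field in place */
    assert(load_le16(field) == 80);    /* so far the host reads the right port */

    /* The buffer is reused: a new frame lands on top, again carrying
     * port 80 in network order, while the stack still holds the buffer. */
    field[0] = 0x00;
    field[1] = 0x50;

    /* The stack believes the field is already host-endian and reads it as is. */
    return load_le16(field);
}
```

Calling `port_seen_after_overwrite()` gives 0x5000, i.e. 20480, which matches the port LwIP was complaining about.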
Turned out the following were contributory and/or noticeably defective in general.
I slightly rewrote the code, mostly just simplifying it while maintaining the implied functionality, and separating mutating functions from queries. (The design and naming of functions in the HAL is horrendous.) Also doubled the buffers in use to reduce the risk of overrunning in future.
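As I can't post the real thing, here is a rough sketch of the general shape only: a receive buffer stays marked as in use until the stack hands it back, queries don't modify anything, and mutators are kept separate. All names, the buffer count and the buffer size are hypothetical and are not taken from the HAL or from my actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_COUNT 8u     /* hypothetical: double an assumed original of 4 */
#define RX_BUF_SIZE  1536u  /* hypothetical: room for one full Ethernet frame */

typedef struct {
    uint8_t data[RX_BUF_SIZE];
    bool    held_by_stack;  /* true from hand-off until the stack releases it */
} rx_buf_t;

typedef struct {
    rx_buf_t bufs[RX_BUF_COUNT];
} rx_pool_t;

/* Query: how many buffers may be given to the receiver. Modifies nothing. */
size_t rx_pool_free_count(const rx_pool_t *pool)
{
    size_t n = 0;
    for (size_t i = 0; i < RX_BUF_COUNT; i++) {
        if (!pool->bufs[i].held_by_stack) {
            n++;
        }
    }
    return n;
}

/* Mutator: claim a free buffer for a received frame, or NULL if none is free.
 * A NULL return means the frame must be dropped rather than overwriting a
 * buffer the stack is still reading. */
rx_buf_t *rx_pool_acquire(rx_pool_t *pool)
{
    for (size_t i = 0; i < RX_BUF_COUNT; i++) {
        if (!pool->bufs[i].held_by_stack) {
            pool->bufs[i].held_by_stack = true;
            return &pool->bufs[i];
        }
    }
    return NULL;
}

/* Mutator: called once the stack has finished with the frame. */
void rx_pool_release(rx_buf_t *buf)
{
    assert(buf != NULL && buf->held_by_stack);
    buf->held_by_stack = false;
}
```

The point of the split is that nothing which merely asks about the state can accidentally recycle a buffer; only `rx_pool_release()` makes one available again.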
Can't share the code, but hopefully documenting the cause is some help.