STM32H563 NetXDuo Packet Pool Deadlock

stst9180 · 2024-09-16

Hi Community,
I'm currently facing an issue where NetXDuo on the STM32H563 locks itself up if the packet pool runs out of packets. In the current scenario the problem occurs when using the FTP server add-on and retrieving a file. As the chunks are posted to the TCP stack very fast, the socket_send routine returns success, but the packets are not released (the stack has not sent them yet, which is normal behaviour).
If this happens too fast, the packet pool runs empty (which may also be acceptable behaviour here). But then the network driver can't refill its RX descriptors anymore and locks up completely. The packets are never released, and a reboot of the device is necessary.
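To illustrate the lockup described above, here is a minimal host-side sketch in plain C. It is only a model under stated assumptions, not NetXDuo or driver code: `pool_t`, `pool_alloc`, `app_queue_tx`, `driver_refill_rx` and `shared_pool_deadlocks` are hypothetical names, and the model simplifies the real sequence by letting the sender drain the pool before the driver arms any RX descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a fixed-size packet pool (not the NetXDuo type). */
typedef struct {
    size_t total;
    size_t free_count;
} pool_t;

/* Non-blocking allocation: fails immediately when the pool is empty. */
static bool pool_alloc(pool_t *p)
{
    assert(p != NULL);
    if (p->free_count == 0) {
        return false;
    }
    p->free_count--;
    return true;
}

/* The application queues up to `chunks` packets for transmission.
 * In this model they stay allocated until the peer acknowledges them. */
static size_t app_queue_tx(pool_t *p, size_t chunks)
{
    size_t queued = 0;
    while (queued < chunks && pool_alloc(p)) {
        queued++;
    }
    return queued;
}

/* The driver tries to give a buffer to `wanted` RX descriptors. */
static size_t driver_refill_rx(pool_t *p, size_t wanted)
{
    size_t armed = 0;
    while (armed < wanted && pool_alloc(p)) {
        armed++;
    }
    return armed;
}

/* True if the model ends up stuck: TX holds packets, but no RX descriptor
 * has a buffer, so no ACK can be received to release anything. */
bool shared_pool_deadlocks(size_t pool_size, size_t chunks, size_t rx_desc)
{
    pool_t pool = { pool_size, pool_size };
    size_t in_flight = app_queue_tx(&pool, chunks);
    size_t armed = driver_refill_rx(&pool, rx_desc);
    return in_flight > 0 && armed == 0;
}
```

In this model, `shared_pool_deadlocks(8, 20, 4)` reports the stuck state, while `shared_pool_deadlocks(8, 4, 4)` does not, because the burst leaves enough packets for the receive side.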

Is this a known issue somewhere? Any ideas how or where to debug further?

stst9180 · 2024-09-16

I found a solution myself by splitting the packet pool as described here:

https://en.na4.teamsupport.com/knowledgeBase/18188850
Nevertheless, in my opinion the networking should recover from an empty internal packet pool on the driver side.
I can't find any documentation on how big the internal packet pool needs to be to make sure a deadlock is not possible.
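The effect of splitting the pool can be sketched with a small host-side model in plain C. This is an illustration under assumptions, not the configuration from the linked article: `pool_t`, `pool_alloc` and `rx_armed_after_tx_burst` are hypothetical names. In a real project the two pools would presumably each be created with nx_packet_pool_create; how the driver is pointed at its own pool depends on the driver and is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a fixed-size packet pool (not the NetXDuo type). */
typedef struct {
    size_t total;
    size_t free_count;
} pool_t;

/* Non-blocking allocation: fails immediately when the pool is empty. */
static bool pool_alloc(pool_t *p)
{
    assert(p != NULL);
    if (p->free_count == 0) {
        return false;
    }
    p->free_count--;
    return true;
}

/* The application bursts `chunks` packets out of its own TX pool, then the
 * driver arms up to `rx_desc` descriptors from a separate RX pool.
 * Returns how many RX descriptors received a buffer. */
size_t rx_armed_after_tx_burst(size_t tx_size, size_t rx_size,
                               size_t chunks, size_t rx_desc)
{
    pool_t tx_pool = { tx_size, tx_size };
    pool_t rx_pool = { rx_size, rx_size };

    /* The sender can exhaust only the TX pool. */
    size_t queued = 0;
    while (queued < chunks && pool_alloc(&tx_pool)) {
        queued++;
    }

    /* The driver allocates from the RX pool, untouched by the sender. */
    size_t armed = 0;
    while (armed < rx_desc && pool_alloc(&rx_pool)) {
        armed++;
    }
    return armed;
}
```

With an RX pool at least as large as the descriptor count, the sender's burst no longer affects the receive side in this model: `rx_armed_after_tx_burst(8, 4, 20, 4)` arms all four descriptors. The model says nothing about received packets that are still held by application threads, so it does not answer the sizing question.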

Regards, Pascal

stst9180 · 2025-01-27

Hi guys, unfortunately this workaround does not really solve the problem; it only makes it a rare occurrence. On a busy system the internal packet pool can still be exhausted. (We're currently experiencing this problem with the Telnet add-on, as there the internal packets are handled in the Telnet thread.) If that thread is too slow, the internal packet pool is exhausted and never recovers. In my opinion NetXDuo should be changed so that this situation is really recoverable, maybe even by killing some sockets or something similar. Unfortunately this is currently not done, so the network stays unusable and the device has to be rebooted manually.
So far I could not find out why the system doesn't recover: if the Telnet thread frees the packets later, networking should start running again, but there seems to be an issue in the driver, which does not recover properly.
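One possible explanation for the missing recovery can be sketched as a host-side model in plain C. It rests on an unverified assumption: that the driver only retries the RX descriptor refill from its receive path, which never runs again once no descriptor owns a buffer. All names (`rx_model_t`, `rx_refill`, `rx_frame`, `app_release`, `recovers_after_stall`) are hypothetical, and none of this is taken from the actual driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the receive side (not driver code). */
typedef struct {
    size_t pool_free;   /* free packets in the RX pool            */
    size_t armed;       /* RX descriptors that currently own a buffer */
    size_t desc_count;  /* total number of RX descriptors         */
} rx_model_t;

/* Give buffers to empty descriptors while the pool has packets. */
static void rx_refill(rx_model_t *m)
{
    while (m->armed < m->desc_count && m->pool_free > 0) {
        m->pool_free--;
        m->armed++;
    }
}

/* A frame arrives. It is lost if no descriptor is armed; otherwise the
 * packet is handed to the application and the receive path refills. */
static bool rx_frame(rx_model_t *m, size_t *held)
{
    if (m->armed == 0) {
        return false;
    }
    m->armed--;
    (*held)++;
    rx_refill(m);
    return true;
}

/* The application releases one packet. The refill is retried here only
 * if `refill_on_release` is set (the assumed missing recovery step). */
static void app_release(rx_model_t *m, size_t *held, bool refill_on_release)
{
    assert(*held > 0);
    (*held)--;
    m->pool_free++;
    if (refill_on_release) {
        rx_refill(m);
    }
}

/* Receive until stalled while a slow consumer releases nothing, then let
 * the consumer release everything and check whether a frame gets through. */
bool recovers_after_stall(size_t pool_size, size_t desc_count,
                          bool refill_on_release)
{
    rx_model_t m = { pool_size, 0, desc_count };
    size_t held = 0;

    rx_refill(&m);
    while (rx_frame(&m, &held)) {
        /* slow consumer: packets pile up in `held` */
    }
    while (held > 0) {
        app_release(&m, &held, refill_on_release);
    }
    return rx_frame(&m, &held);
}
```

In this model `recovers_after_stall(4, 2, false)` stays stuck even though every packet is back in the pool, whereas `recovers_after_stall(4, 2, true)` recovers. Whether the real driver behaves like the first case would have to be confirmed in its source.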

Could someone from ST have a look at this issue too?

Azelio · 2025-05-01

Have you found a solution? I have the same problem. For the moment I got my own TCP send thread working by calling tx_thread_relinquish() on every loop iteration, but in the web server code I am not supposed to alter anything... I will try to find a point where I can insert a tx_thread_relinquish(), even though it will be wiped out at the next code generation.

Azelio · 2025-05-01

It worked (sort of): it took 3 minutes to load a 15k HTML page, but there was no packet pool damage. I used tx_thread_sleep(100) (the relinquish was not "strong" enough) in the send-data loop of the web server. Of course this is not the way to go, but in my opinion it confirms that the problem is data being dumped into the packet pool too fast.
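As an alternative to a fixed sleep, the sender could be throttled by the pool fill level. The following is a minimal host-side sketch in plain C, not web server or NetXDuo code: `pool_t`, `sender_may_allocate` and `send_with_reserve` are hypothetical names, and the "tick" with a fixed number of acknowledged packets is a modelling assumption. On the target, the free-packet count could probably be read with nx_packet_pool_info_get, but that is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a fixed-size packet pool (not the NetXDuo type). */
typedef struct {
    size_t total;
    size_t free_count;
} pool_t;

/* Policy: the sender may take a packet only while more than `reserve`
 * packets remain free, so the receive side always finds buffers. */
static bool sender_may_allocate(const pool_t *p, size_t reserve)
{
    return p->free_count > reserve;
}

/* Send `chunks` chunks. In every tick the sender queues as much as the
 * policy allows, then up to `acked_per_tick` in-flight packets are
 * acknowledged and released. Returns the number of ticks needed and
 * reports the lowest free level seen in `*min_free`. */
size_t send_with_reserve(pool_t *p, size_t chunks, size_t reserve,
                         size_t acked_per_tick, size_t *min_free)
{
    /* Without these conditions the loop below could never finish. */
    assert(acked_per_tick > 0 && reserve < p->total);

    size_t in_flight = 0;
    size_t sent = 0;
    size_t ticks = 0;
    *min_free = p->free_count;

    while (sent < chunks) {
        while (sent < chunks && sender_may_allocate(p, reserve)) {
            p->free_count--;
            in_flight++;
            sent++;
            if (p->free_count < *min_free) {
                *min_free = p->free_count;
            }
        }
        /* Wait for acknowledgements instead of sleeping a fixed time. */
        size_t done = in_flight < acked_per_tick ? in_flight : acked_per_tick;
        in_flight -= done;
        p->free_count += done;
        ticks++;
    }
    return ticks;
}
```

With a pool of 8 packets, a reserve of 4 and 2 acknowledgements per tick, sending 20 chunks never drops the free level below the reserve in this model. The sender then runs as fast as acknowledgements allow rather than at the pace of a hard-coded delay.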