STM32F407 Ethernet stops working after running for a while (Discovery+Expansion Boards)

sebastian2399
Associate
Posted on March 15, 2016 at 12:49

Hello everyone,

I'm using the STM32F407 Discovery Kit with the Expansion Board Discover-MO:) (DM-STF4BB) with Ethernet. The port and clock initialization is based on the code generated with Cube MX. The system runs with the HAL drivers, FreeRTOS, LwIP and PTPv2 (taken from the STM32F107 example). I have modified the file ethernetif.c to support PTP time stamping and the LAN8720 PHY, and to generate a "receive buffer unavailable" interrupt that resumes the receive process.

Everything works fine for a while: I can reach the web server with the example web page showing the task list, the board responds to a continuous ping, and PTP works fine.

After some time (sometimes 5 minutes, sometimes 5 hours) the Ethernet connection stops working. First I see that the Pulse Per Second (PPS) signal is no longer synchronized by PTP. I also see that the web page gets slower, although the board still responds to ping. A short time later both the ping and the web server become unreachable.

I've checked for Malloc Fail and Stack Overflow as well as Hard Faults, but that doesn't seem to be the problem. If I suspend execution, I see that FreeRTOS is still running. The problem is that pbuf_alloc() fails and sometimes the Ethernet interrupt stops being generated. As a result the receive buffers are no longer freed and it is impossible to receive new packets.

Does anyone have the same or a similar problem? Does anybody have an idea how to solve it? Is there a method to determine the right size of the buffers, memory pools and number of control blocks for LwIP?

The problem doesn't appear if no data is sent to the board. In that case I tested it after a week and it was still working fine.

Thanks in advance for helping.

Best regards
3 REPLIES
AvaTar
Lead
Posted on March 15, 2016 at 14:33

> Does anyone have the same or a similar problem? Does anybody have an idea how to solve it?

 

Not really, but I recall there are very similar threads, e.g. on the http://e2e.ti.com/support/microcontrollers/tiva_arm/ forum page. With LwIP and FreeRTOS involved, I would not limit my search to this forum.

There is a webserver example on the Element14 website for this hardware combination, based on the "old" SPL. I successfully built and ran it about two years ago, but aside from a ping and a short access test, I did not dwell on it ...

Posted on March 15, 2016 at 17:56

I've done similar things to AvaTar: modded a server example to use SD cards on both F2 and F4 platforms.

I would make sure there isn't some resource leak associated with pbuf_alloc(): check all error and exit paths. Add some basic checks so you can see when things start going pear-shaped.
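As a rough sketch of the kind of basic check meant here: count every allocation and every free, and watch the difference. Everything below is hypothetical bookkeeping, not lwIP API; `tracked_alloc()` and `tracked_free()` stand in for wrappers around pbuf_alloc() and pbuf_free(), and malloc() is used only so the sketch builds on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical counters, not part of lwIP. */
typedef struct {
    unsigned long allocs;    /* successful allocations */
    unsigned long frees;     /* buffers handed back */
    unsigned long failures;  /* allocation attempts that returned NULL */
} buf_stats_t;

static buf_stats_t g_buf_stats;

/* Stand-in for a wrapper around pbuf_alloc(). */
void *tracked_alloc(size_t len)
{
    void *p = malloc(len);
    if (p != NULL) {
        g_buf_stats.allocs++;
    } else {
        g_buf_stats.failures++;
    }
    return p;
}

/* Stand-in for a wrapper around pbuf_free(). */
void tracked_free(void *p)
{
    if (p == NULL) {
        return;
    }
    free(p);
    g_buf_stats.frees++;
}

/* Buffers currently held; a value that only ever grows suggests a leak. */
unsigned long buffers_outstanding(void)
{
    assert(g_buf_stats.allocs >= g_buf_stats.frees);
    return g_buf_stats.allocs - g_buf_stats.frees;
}

/* Example handler with an early-exit path. Returns 0 on success, -1 on error.
 * With free_on_error == 0 the error path forgets the buffer (the bug). */
int handle_frame(size_t len, int simulate_error, int free_on_error)
{
    void *p = tracked_alloc(len);
    if (p == NULL) {
        return -1;
    }
    if (simulate_error) {
        if (free_on_error) {
            tracked_free(p);
        }
        return -1;
    }
    tracked_free(p);
    return 0;
}
```

On the target, the counters would need the same protection as the allocator itself if several tasks touch them, and printing `buffers_outstanding()` periodically should show whether the number keeps climbing under traffic.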

I definitely recall people complaining about similar problems over the years. I think you'll need to dig into the code to understand what is going on. Most "examples" are just that: quick demonstrations. With these, and with open source, you are expected to do some work to make them more robust and industrial.

sebastian2399
Associate
Posted on March 17, 2016 at 16:01

Hello Clive1 and AvaTar,

thanks for the replies. I've activated LWIP_STATS and LWIP_STATS_DISPLAY to tune the LwIP memory pools and heap, and I have seen that the heap was too big and almost unused, while PBUF_POOL_SIZE was too small. The problem is that every additional PBUF needs a lot of memory, and if PBUF_POOL_SIZE is set too high you may run out of memory. It is now set to 32 and I'm using around 20 PBUFs.
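To put a number on "a lot of memory", here is a small sketch of the arithmetic. As far as I understand, lwIP sizes each PBUF_POOL element as the aligned pbuf header plus the aligned PBUF_POOL_BUFSIZE. The values in the usage note (1524 bytes per buffer, a 16-byte header, 4-byte alignment) are assumptions for illustration, not values from my configuration, and `pbuf_pool_bytes()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align. */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Approximate RAM taken by the pbuf pool:
 * pool_size elements, each an aligned header plus an aligned payload area. */
size_t pbuf_pool_bytes(size_t pool_size, size_t bufsize,
                       size_t hdr_size, size_t align)
{
    return pool_size * (align_up(hdr_size, align) + align_up(bufsize, align));
}
```

With the assumed numbers, `pbuf_pool_bytes(32, 1524, 16, 4)` gives 49280 bytes for a pool of 32, and `pbuf_pool_bytes(20, 1524, 16, 4)` gives 30800 bytes for the roughly 20 buffers actually in use.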

Another thing to note is that SYS_LIGHTWEIGHT_PROT must be 1 in order to have inter-task protection for malloc() and free() calls. If it is not activated, a memory allocation or deallocation call may produce a Hard Fault.
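For reference, the options mentioned above would look roughly like this in lwipopts.h. This is only a fragment showing the settings named in this post; the heap size (MEM_SIZE) is left out because the right value depends on what the statistics show for your application.

```c
/* lwipopts.h (fragment) */

/* Protect pool and heap allocation against concurrent access from tasks. */
#define SYS_LIGHTWEIGHT_PROT    1

/* Collect usage statistics and allow printing them with stats_display(). */
#define LWIP_STATS              1
#define LWIP_STATS_DISPLAY      1

/* Number of buffers in the pbuf pool; about 20 were in use in my tests. */
#define PBUF_POOL_SIZE          32
```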

In the Ethernet driver, if pbuf_alloc() fails, I have added a delay to let lower-priority tasks run and free buffers back to the PBUF pool. I don't know if this is really necessary.
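The idea can be sketched as a bounded retry loop. This is a simplified PC-buildable illustration, not my actual driver code: `alloc_with_backoff()`, the callback types and the fake pool are all hypothetical. In firmware, the alloc callback would wrap something like pbuf_alloc(PBUF_RAW, len, PBUF_POOL) and the yield callback would call vTaskDelay(1) or osDelay(1).

```c
#include <assert.h>
#include <stddef.h>

typedef void *(*alloc_fn)(void *ctx);  /* returns NULL when the pool is empty */
typedef void (*yield_fn)(void *ctx);   /* lets other tasks run for a moment */

/* Try to allocate up to max_tries times, yielding after each failed attempt.
 * Returns NULL if every attempt failed, so the caller can drop the frame. */
void *alloc_with_backoff(alloc_fn alloc, yield_fn yield,
                         void *ctx, unsigned max_tries)
{
    assert(alloc != NULL && yield != NULL);
    for (unsigned i = 0; i < max_tries; i++) {
        void *p = alloc(ctx);
        if (p != NULL) {
            return p;
        }
        yield(ctx);
    }
    return NULL;
}

/* Fake pool for trying the loop on a PC: one buffer becomes free
 * after yields_until_free calls to fake_yield(). */
typedef struct {
    int free_bufs;
    int yields;
    int yields_until_free;
    char storage[4];
} fake_pool_t;

void *fake_alloc(void *ctx)
{
    fake_pool_t *pool = ctx;
    if (pool->free_bufs <= 0) {
        return NULL;
    }
    pool->free_bufs--;
    return pool->storage;
}

void fake_yield(void *ctx)
{
    fake_pool_t *pool = ctx;
    pool->yields++;
    if (pool->yields == pool->yields_until_free) {
        pool->free_bufs++;
    }
}
```

Bounding the number of tries matters: if the loop still returns NULL, the receive path should presumably drop the frame and hand the DMA descriptors back rather than wait forever.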

Now everything seems to be stable and runs for longer periods (tested for 24 hours) without problems.

Thanks again