willcfj
Associate III
December 3, 2019
Question

Has anyone used the LwIP Raw API with RTOS?


I'm starting to get comfortable with CMSIS-RTOS (and the STM32 implementation's "uniqueness" :) ), and the next phase is adding LwIP support to my project. My IP needs are fairly simple, so I was originally planning on using the Raw API. I see in CubeMX that if FreeRTOS is enabled, it forces the RTOS-based LwIP implementation.

I'm only using the RTOS for some high-level stuff, so its timer tick is 10 ms instead of 1 ms. I was trying to keep it mostly out of the way. Is trying to ignore the RTOS mode a bad idea given that the RTOS is there already? Should I just use RTOS mode and deal with the overhead? I am concerned about flash space: I'm on an H750 with only 128KB of flash and it's almost 50% used in early stages of the project.
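
One practical consequence of a 10 ms tick, as a hedged aside: lwIP takes its time base from sys_now(), which must return milliseconds, so the port has to scale RTOS ticks rather than return the raw tick count. Below is a minimal sketch of that scaling. ticks_to_ms() and its parameters are names made up for illustration (they are not part of lwIP, FreeRTOS, or the CubeMX-generated code), and the tick rate is assumed to divide 1000 evenly.

```c
#include <stdint.h>

/* Hypothetical helper: convert an RTOS tick count to milliseconds.
 * Assumes tick_rate_hz divides 1000 evenly (e.g. 100 Hz -> 10 ms per tick).
 * The multiplication is deliberately done in 32 bits so that the result
 * wraps together with the 32-bit tick counter, which keeps differences
 * between two timestamps valid across the wrap. */
uint32_t ticks_to_ms(uint32_t ticks, uint32_t tick_rate_hz)
{
    uint32_t ms_per_tick = 1000u / tick_rate_hz;
    return ticks * ms_per_tick;
}
```

With a 100 Hz tick, ticks_to_ms(250, 100) gives 2500. The lwIP timers then run with 10 ms resolution, which is coarser than with a 1 ms tick.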

Any insights appreciated, thanks in advance.

will


5 replies

Piranha
Principal III
December 4, 2019

The raw (callback-style) API is the only interface I use, both with and without an RTOS, and that is how it is implemented in my demo firmware: https://community.st.com/s/question/0D50X0000AhNBoWSQW/actually-working-stm32-ethernet-and-lwip-demonstration-firmware

The most coherent way would be to use lwIP in RTOS mode and wrap the raw API calls in the LOCK_TCPIP_CORE() / UNLOCK_TCPIP_CORE() macros. For more details read this:

https://www.nongnu.org/lwip/2_1_x/multithreading.html

You can also use lwIP in NOSYS mode, but then you can use its API only from the single thread that also calls the input frame and timer processing, which basically means that all application network code will have to be in connection callbacks.

https://www.nongnu.org/lwip/2_1_x/group__lwip__nosys.html
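
The locking discipline described above can be sketched as follows. This is an off-target model, not lwIP code. In a real project LOCK_TCPIP_CORE() and UNLOCK_TCPIP_CORE() come from lwip/tcpip.h (with LWIP_TCPIP_CORE_LOCKING enabled). Here they are replaced by stand-ins that only track a flag, and raw_send_stub() and app_send() are hypothetical names standing in for a raw API call such as tcp_write() and for the application code that wraps it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins so the sketch builds on a PC. On target, remove these and
 * include lwip/tcpip.h instead; the real macros take the core mutex. */
static bool core_locked = false;
#define LOCK_TCPIP_CORE()   do { assert(!core_locked); core_locked = true;  } while (0)
#define UNLOCK_TCPIP_CORE() do { assert(core_locked);  core_locked = false; } while (0)

/* Hypothetical stand-in for a raw API call (e.g. tcp_write()).
 * It checks the rule being illustrated: the core lock must be held. */
static int raw_send_stub(const void *data, size_t len)
{
    (void)data;
    assert(core_locked);
    return (int)len;
}

/* Application-thread code: every raw API call is bracketed by the lock. */
int app_send(const void *data, size_t len)
{
    int sent;

    LOCK_TCPIP_CORE();
    sent = raw_send_stub(data, len);
    UNLOCK_TCPIP_CORE();

    return sent;
}
```

Callbacks that lwIP itself invokes (receive, sent, error) already run in the stack's context, so the bracket is only needed around calls made from other threads.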

willcfj (Author)
Associate III
December 4, 2019

Thank you Piranha!

Thank you for the links, I'll start looking at those. That gives me a lot of hope that I'm heading down a good path. I initially started tinkering by copying the echo demo source into my app, started with the Raw API, and saw a modest code size increase. I also tried some of the Netconn calls out of curiosity and got to 97% code space usage (60+K added!). Definitely heading back to the Raw API, which is perfectly fine for my intended use: a single TCP connection using a dedicated thread. I was worried the code for the RTOS-based solutions would be included regardless of the APIs I called, but at least anecdotally it seems not to be, so I'm feeling better about that too.

will

Piranha
Principal III
December 7, 2019

Netconn API takes a few KB, but nowhere near 60 KB! Something else must have been pulled in through some dependency in your case.

willcfj (Author)
Associate III
December 7, 2019

Piranha:

I believe the 60K was the entire LwIP library, Netconn API, and socket API. The linker seems to do a good job of omitting the entire LwIP stack until you make at least one call that needs it. When I had code with just the raw API calls, it was +40K-ish over no LwIP at all.

will

willcfj (Author)
Associate III
December 7, 2019

Piranha:

It does look like I get into a mess pretty quickly trying to make the raw calls with the CubeMX code. One of the core routines in raw mode is sys_check_timeouts(), and that is only created if NO_SYS=1. Manually overriding it to 1 breaks a bunch of other stuff, so it's not as simple as that. This is going to take some more sleuthing. Thanks for the hints. I know I'm heading down the right path at least, it is just going to be a longer path.

will

Pavel A.
Super User
December 7, 2019

>  I'm on an H750 with only 128KB of flash and it's almost 50% used in early stages of the project.

Here's a useful tip for you: as already noted by others here, H750 may have more than one flash page.

Test pages 1-7 in bank 1 (erase and write). You'll find at least one more usable page.

Of course these extra pages come without any warranty from ST...

-- pa

Piranha
Principal III
December 7, 2019

I should note that my previous answer was about lwIP itself, not about how to do it with CubeMX... Also take a good look at this:

https://community.st.com/s/question/0D50X0000BOtfhnSQB/how-to-make-ethernet-and-lwip-working-on-stm32

To really save flash space, drop the CubeMX, HAL and CMSIS wrapper bloatware. Additionally, you'll get rid of a ton of bugs and limitations like that stupid HAL timer design. It will also make it possible to use NO_SYS=1 with an RTOS, if you really want it. Though I would not concentrate on it, at least initially.

willcfj (Author)
Associate III
December 7, 2019

Piranha:

Thanks for the link above. I've certainly seen the bloat in other HAL code, where I know enough to be able to do my own optimizations. Alas, I'm not at that level yet with FreeRTOS or LwIP. Looks like I am going to have to get there sooner than I hoped though. I am hoping to at least get something working, even if inefficient, and then, with a working model, start optimizing.

Again, thanks for all the feedback. Knowing I can get to where I want to go is good for keeping up the effort to get there.

will

Piranha
Principal III
December 7, 2019

When you fix those reported issues, even ST's bloatware should at least be working. Then you can optimize/rewrite later. :)

For encouragement, to see where you can get, look at my other topic:

https://community.st.com/s/question/0D50X0000AhNBoWSQW/actually-working-stm32-ethernet-and-lwip-demonstration-firmware

willcfj (Author)
Associate III
December 15, 2019

More for others than Piranha:

This is just concluding this thread with the "final issue" to get Ethernet working on the STM32H750. I really should create an independent post, and hope to do that a bit later. There was a lot to figure out, and posting it could save others time.

The last issue was indeed memory management. The program only worked if I disabled data caching entirely. With some sleuthing it turned out to be the main ram_heap (defined in mem.c) used by LwIP. It is just placed in "regular" RAM without any special handling. When data is ready to be sent to the Ethernet controller in low_level_output() in ethernetif.c, there is no cache flush before the call to HAL_ETH_Transmit(). There is cache invalidation for Ethernet reads, just no flush for writes, which is weird. Anyhow, in low_level_output(), the best place to flush the cache seems to be in the for loop near the top. The code snippet below shows it; the only change is the cache flush line with the //WMB comment. This isn't a great solution, as this code will get erased every time CubeMX regenerates the project, so I need to come up with a better long-term solution.

 for(q = p; q != NULL; q = q->next)
 {
   if(i >= ETH_TX_DESC_CNT)
     return ERR_IF;

   Txbuffer[i].buffer = q->payload;
   Txbuffer[i].len = q->len;
   SCB_CleanDCache_by_Addr((uint32_t *)Txbuffer[i].buffer, Txbuffer[i].len); //WMB
   framelen += q->len;
 

Given the above solution, I'm curious how the example code works. I can't run it as it's for a different MCU (STM32H743) on a different platform (STM32H743I-EVAL), but seems as though it should have the same problem.
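
A related detail on the cache clean in low_level_output(), assuming the usual Cortex-M7 data cache line of 32 bytes: cache maintenance works on whole lines, and as far as I know older CMSIS versions of SCB_CleanDCache_by_Addr() do not round the address and size for you, so an unaligned pbuf payload is safer to clean as an aligned range. Below is a minimal sketch of that rounding. cache_align_range() and cache_range_t are hypothetical names, not CMSIS or HAL APIs.

```c
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u  /* assumed Cortex-M7 D-cache line size, bytes */

/* Hypothetical result type: a buffer expanded to whole cache lines. */
typedef struct {
    uintptr_t start;  /* address rounded down to a line boundary */
    size_t    len;    /* length rounded up to a multiple of the line size */
} cache_range_t;

/* Hypothetical helper: expand [addr, addr + len) to cover every cache
 * line the buffer touches. A zero-length buffer touches no lines. */
cache_range_t cache_align_range(uintptr_t addr, size_t len)
{
    const uintptr_t mask = (uintptr_t)DCACHE_LINE_SIZE - 1u;
    cache_range_t r;

    r.start = addr & ~mask;
    if (len == 0) {
        r.len = 0;
    } else {
        uintptr_t end = (addr + len + mask) & ~mask;  /* round end up */
        r.len = (size_t)(end - r.start);
    }
    return r;
}
```

On target, the result would be passed on as SCB_CleanDCache_by_Addr((uint32_t *)r.start, (int32_t)r.len). Rounding outward is harmless for a clean, since it only writes dirty lines back to RAM. The same is not true for an invalidate, which can discard neighbouring data that shares a line.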

Thanks again for the guidance, Pavel and Piranha. Nice to have this working finally. Now I need to decide whether to return to the original question and go back to the Raw API, or just stick with Netconn.

will