Optimization Settings for HAL Driver (STM32F7)

fulcrumEFX
Associate II

Hello,

I have a project with FreeRTOS and LwIP (plus some other stuff). I ran into a problem: when I enable optimization for my project (-O3 in my case), I get weird communication glitches. For example, I only get a reply to every 3rd or 4th ping, ping replies take up to 2 seconds, and even the UART isn't working properly.

I finally figured out that the problem occurs when STM32F7xx_HAL_Driver gets optimized, so I have to set -O0 for that folder manually.

How can this even happen? If the driver doesn't work, shouldn't it at least be guarded from optimization changes with something like this?

#pragma GCC push_options
#pragma GCC optimize ("O0")
 
//code
 
#pragma GCC pop_options

8 REPLIES
KnarfB
Principal III

Disabling compiler optimization is an easy symptomatic cure for such issues, but eventually you will want to find the root cause. Your vague description does not make it clear that HAL is to blame.

fulcrumEFX
Associate II

The base project was provided for evaluation and contains a proprietary runtime, which is why I can't disclose too much. So you are saying the problem is something else? I found a forum thread where similar problems are described: https://sysprogs.com/w/forums/topic/stm32-not-behaving-after-optimization/

KnarfB
Principal III

No, I was saying that the root cause is not clear, and that disabling optimization at a global level is a steamroller tactic.

Most likely an issue with something needing to be volatile, or a caching / buffering one.
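To illustrate the volatile case, here is a minimal host-compilable sketch, not taken from the project in question. The names `rx_done`, `demo_rx_isr` and `demo_wait_rx` are hypothetical stand-ins for a flag that an interrupt handler sets and a task polls. Without `volatile`, an optimizing compiler is allowed to read the flag once and keep the value in a register, so a loop like this can behave differently at -O0 and -O3.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag shared between an interrupt handler and a polling task.
 * 'volatile' forces a real memory read on every access, so the compiler
 * cannot hoist the load out of the polling loop. */
static volatile bool rx_done = false;

/* Stand-in for a real ISR or receive-complete callback (assumption). */
void demo_rx_isr(void)
{
    rx_done = true;
}

/* Polls the flag at most max_spins times.
 * Returns true and clears the flag if it was set, false on timeout. */
bool demo_wait_rx(uint32_t max_spins)
{
    while (max_spins-- > 0U) {
        if (rx_done) {
            rx_done = false;
            return true;
        }
    }
    return false;
}
```

Note that `volatile` only addresses compiler-level caching of the variable. It does nothing for the hardware data cache or for DMA buffers, which is the separate issue mentioned next.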

Check memory address expectations for buffers and the MPU configuration, and whether you need to manage cache coherency more aggressively.

Also watch the 32-byte granularity of the cache and what happens to other variables that share those cache lines.
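As a rough sketch of the 32-byte granularity point: cache maintenance works on whole lines, so the address and size handed to a clean or invalidate call have to be widened to line boundaries, and a DMA buffer should be aligned so it does not share a line with unrelated variables. The helper names below (`cache_range_for`, `is_line_aligned`, `dma_rx_buf`) are made up for illustration. On real hardware the resulting range would typically be passed to the CMSIS functions `SCB_CleanDCache_by_Addr` or `SCB_InvalidateDCache_by_Addr`, which are not called here so that the sketch compiles on a host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32U /* Cortex-M7 data cache line size in bytes */

typedef struct {
    uintptr_t start;  /* address rounded down to a cache line boundary */
    size_t    length; /* byte count covering whole cache lines */
} cache_range_t;

/* Widens [addr, addr + len) to the smallest range made of whole cache
 * lines. An empty input yields an empty range. */
cache_range_t cache_range_for(uintptr_t addr, size_t len)
{
    const uintptr_t mask = (uintptr_t)DCACHE_LINE_SIZE - 1U;
    cache_range_t r;

    r.start = addr & ~mask;
    if (len == 0U) {
        r.length = 0U;
        return r;
    }
    /* Round the end address up to the next line boundary. */
    r.length = (size_t)(((addr + len + mask) & ~mask) - r.start);
    return r;
}

/* True if p sits exactly on a cache line boundary. */
bool is_line_aligned(const void *p)
{
    return ((uintptr_t)p & ((uintptr_t)DCACHE_LINE_SIZE - 1U)) == 0U;
}

/* Hypothetical DMA receive buffer: line-aligned and sized as a multiple of
 * the line size, so no other variable shares its cache lines. */
uint8_t dma_rx_buf[64] __attribute__((aligned(32)));
```

For example, a 20-byte buffer at 0x20000010 straddles two lines, so the range to maintain is 64 bytes starting at 0x20000000. Invalidating that range would also discard any other variable living in those two lines, which is the hazard described above.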


And should tell any developer with a little experience something about the overall quality.

Piranha
Chief II

Saying that the cause of this is not known is kind of a cognitive dissonance... 😉

https://community.st.com/s/question/0D50X0000BOtfhnSQB/how-to-make-ethernet-and-lwip-working-on-stm32

Of course the ultimate root cause of all this is the absolute incompetence of ST's HAL/Cube developers, which can clearly be seen throughout all of the HAL, Cube and example code. Even the HAL API is flawed by design and unusable. They cannot get even the basic UART working...

PMath.4
Senior III

My experience is that O3 is pretty much unusable and almost always slower than O2, even when it works. O2 seems to be the best compromise. Os can be useful if space is an issue, but it seems to run about 30% slower for typical applications. Ofast can also be useful and sometimes outperforms O2. I have yet to find anything that O3 improves, and I'm not talking about HAL, just simple code with no RTOS or anything else complex.

I also did some experimenting and compared different optimizations to O2 without LTO on this demo.

  • O2+LTO: speed +11%, size -5%.
  • O3: speed +8%, size +22%.
  • O3+LTO: speed +17%, size +22%.
  • Ofast: speed +9%, size +22%.
  • Ofast+LTO: speed +20%, size +22%.

Therefore O2+LTO seems to be the best compromise, while Ofast+LTO is the king of speed.

> even when it works

Correct code must work with any optimization.