
UART not working on STM32F4 when GCC compiler is in O0 (None) optimization

Vagni
Associate II

In the last week I had to debug my MODBUS RTU slave application on my custom board for a communication issue on my RS485 network.

My board is based on the STM32F411VET6 mcu and I developed my application with STM32CubeMX v5.0.0 and STM32F4 Firmware Package v1.22.0, starting first using Atollic TrueSTUDIO up to v9.3.0 and then STM32CubeIDE up to v1.1.0.

The communication issue started to occur randomly at application startup once I switched to STM32CubeIDE.

The issue was a sudden block of the RS485 network: one of my boards left the external RS485 driver with transmission enabled.

With the debugger I was able to find the cause: the HAL_UART_Transmit_IT() function did not always return HAL_OK, so my transmission finite-state machine could wait indefinitely for an end-of-transmission event (needed to disable transmission on the external RS485 driver).

During the debugging sessions I also tried to upgrade my project to STM32F4 Firmware Package v1.24.1 with STM32CubeMX v5.4.0 and tried to disable the compiler optimizations in order to fix the issue.

I found another unexpected issue: reception suddenly and randomly stopped after application startup.

After many investigations, I found the UART RX interrupt was permanently disabled and this new issue seems to occur only without compiler optimization (-O0).

Due to the MODBUS protocol, I need to receive characters one by one, so I start reception by calling HAL_UART_Receive_IT() for a single byte, then in HAL_UART_RxCpltCallback() I always re-arm reception by calling HAL_UART_Receive_IT() for a single byte again.
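A minimal sketch of that one-byte re-arm pattern, assuming a handle `huart2`, a buffer `rx_byte`, and a hypothetical protocol hook `modbus_on_char()` (none of these names are from the original post):

```c
#include "stm32f4xx_hal.h"

extern UART_HandleTypeDef huart2;   /* assumed UART handle */
static uint8_t rx_byte;             /* one-byte reception buffer */

void start_reception(void)
{
    /* Arm reception for exactly one byte, interrupt mode. */
    HAL_UART_Receive_IT(&huart2, &rx_byte, 1);
}

void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
    if (huart->Instance == USART2) {
        modbus_on_char(rx_byte);                  /* hypothetical protocol hook */
        HAL_UART_Receive_IT(huart, &rx_byte, 1);  /* re-arm for the next byte */
    }
}
```

If the re-arm call is delayed (e.g. by a preempting interrupt), the next start bit can be missed, which is exactly the race discussed in the replies below.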

I confirm HAL_UART_Receive_IT() never returns anything other than HAL_OK.

But the UART RX interrupt is suddenly disabled only if I build my app without optimizations. With any other optimization level the issue never occurs.

I also built the original release with STM32F4 Firmware Package v1.22.0 at -O0: the reception issue occurs just the same.

So, it is not a Firmware Package matter; it is a compiler optimization matter.

Can anyone explain this behavior?

Until now, I thought compiler optimizations could cause issues, not the other way around.

How can I be sure of the code I build?

Must the STM32F4 Firmware Package code always be built with some level of compiler optimization?

Should I use HAL_UART_Receive_IT() in a different way? If so, how?

9 REPLIES
Bob S
Principal

My guess is that with -O0 the code takes JUST long enough that your callback's call to HAL_UART_Receive_IT() misses the falling edge of the start bit. Are you checking/handling errors (framing, etc.)? This is an obvious race condition with -O0, and may be a lurking race condition with optimization. For example, are your interrupt priorities such that your UART interrupt can itself be interrupted? If so, you cannot ever rely on the callback function to restart RX.

You would do better to use DMA RX in circular mode, so that the UART is always receiving data. There have been several posts about this on this forum, some may even have included buffer management code. Basically you need an "extract data" index into the buffer and use the DMA NDTR register (the remaining-transfer count) as the "input data" index. You don't get an interrupt for each character, and you don't need to use the DMA half full and full interrupts. You can poll for data, or have some periodic task check for new data.
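A hedged sketch of that buffer management, assuming a HAL project where `huart2` has its RX DMA stream configured in circular mode (the names `rx_buf`, `extract_idx`, and `uart_poll` are illustrative, not from the post):

```c
#include <stdbool.h>
#include "stm32f4xx_hal.h"

#define RX_BUF_LEN 256

extern UART_HandleTypeDef huart2;       /* assumed UART handle, RX DMA circular */
static uint8_t rx_buf[RX_BUF_LEN];
static volatile uint16_t extract_idx;   /* the "extract data" index */

void start_dma_reception(void)
{
    /* DMA keeps filling rx_buf forever; no per-character interrupt needed. */
    HAL_UART_Receive_DMA(&huart2, rx_buf, RX_BUF_LEN);
}

/* Poll from a periodic task: returns true and one byte if new data arrived. */
bool uart_poll(uint8_t *out)
{
    /* "input data" index derived from the DMA remaining-transfer count */
    uint16_t input_idx = RX_BUF_LEN - __HAL_DMA_GET_COUNTER(huart2.hdmarx);

    if (extract_idx == input_idx)
        return false;                    /* nothing new */
    *out = rx_buf[extract_idx];
    extract_idx = (extract_idx + 1) % RX_BUF_LEN;
    return true;
}
```

The design choice here is that losing a start-bit race becomes impossible: the DMA hardware is always armed, and software merely chases the write pointer at its leisure.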

Changes in behaviour with optimizations on/off usually suggest underlying latent issues in the code.

With it off, I might guess it is speed related, or higher stack usage leaving variables there vulnerable to getting trashed.

Don't assume auto/local variables will clear themselves.

The alternative to HAL_UART_Receive_IT() is just to service the IRQ handler work more directly.

Tips, Buy me a coffee, or three.. PayPal Venmo
Up vote any posts that you find helpful, it shows what's working..
Vagni
Associate II

Thank you all.

Unfortunately, the MODBUS RTU communication protocol requires checking the received inter-character time (t1.5) and the received inter-frame time (t3.5), so I need a UART RX interrupt on every received character (calling the HAL_UART_Receive_IT() function) and DMA is useless in this case (at least for reception).
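One common way to obtain those timings alongside per-character interrupts is to restart a hardware timer on every received byte; a sketch, assuming a one-pulse timer handle `htim_t35` preconfigured for the 3.5-character period, plus hypothetical `buffer_char()` and `modbus_frame_done()` helpers (all assumptions, not from the post):

```c
#include "stm32f4xx_hal.h"

extern TIM_HandleTypeDef htim_t35;   /* assumed timer set to the t3.5 period */
static uint8_t rx_byte;              /* one-byte reception buffer */

void HAL_UART_RxCpltCallback(UART_HandleTypeDef *huart)
{
    __HAL_TIM_SET_COUNTER(&htim_t35, 0);      /* restart the t3.5 window */
    HAL_TIM_Base_Start_IT(&htim_t35);
    buffer_char(rx_byte);                     /* hypothetical: store the char */
    HAL_UART_Receive_IT(huart, &rx_byte, 1);  /* re-arm one-byte reception */
}

void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim)
{
    if (htim == &htim_t35) {
        HAL_TIM_Base_Stop_IT(htim);
        modbus_frame_done();                  /* t3.5 expired: end of frame */
    }
}
```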

The point is:

  • The same application code, both with Firmware Package v1.22.0 and v1.24.1, built with optimization, works properly in reception.
  • The same application code, both with Firmware Package v1.22.0 and v1.24.1, built without optimization, works properly in reception only for a few seconds, then the UART RX interrupt is no longer armed.

It seems that STM32CubeMX generates projects with two default build configurations (Debug and Release), each with a different optimization level (-Og for Debug, -Os for Release). Until now, all of my STM32 applications were developed and released with those optimization levels.

Should an HAL-based application be built with those optimization levels only?

It should be better to know it...

The HAL library code should work at all optimization levels.

Some of the examples/demos, perhaps not so much.

Surrounding code may induce race conditions.

Try with maximum warning levels.

Turn on asserts. Turn on stack checking.

Run your code through LINT or equivalent static analysis tools.

Build the code with professional tools, i.e. IAR or Keil.

>>It should be better to know it...

Would be better to analyze the generated code in working vs non-working case to understand why it isn't robust.

Software that exhibits failure in the lab needs to have proper QA done, optimization on/off is far too broad a brush.


> The same application code, both with Firmware Package v1.22.0 and v1.24.1, built without optimization, works properly in reception only for a few seconds, then the UART RX interrupt is no longer armed.

This suggests a runtime problem.

Turning off optimization usually increases execution times, so preempting interrupts are more likely to cause overruns.

Do you check and handle/clear the receive error registers ?

Cube is famous for its reckless code in this regard.

> Do you check and handle/clear the receive error registers ?

Yes, my handle is the following:

/**
 * @brief  UART error callback.
 * @param  huart: pointer to a UART_HandleTypeDef structure that contains
 *         the configuration information for the specified UART module.
 * @retval None
 */
void HAL_UART_ErrorCallback(UART_HandleTypeDef *huart)
{
  // try to clear all the UART status flags
  while (huart->Instance->SR &
         (UART_FLAG_CTS | UART_FLAG_LBD | UART_FLAG_RXNE |
          UART_FLAG_IDLE | UART_FLAG_ORE | UART_FLAG_NE |
          UART_FLAG_FE | UART_FLAG_PE))
  {
    __HAL_UART_CLEAR_PEFLAG(huart);
    huart->Instance->SR &= ~(UART_FLAG_CTS | UART_FLAG_LBD | UART_FLAG_RXNE);
  }
}

but it is never called.

I use the UART HAL driver in interrupt-mode IO operation, so all the UART interrupt sources should be handled by the UART HAL driver.

MODBUS is a half-duplex communication protocol: every transaction is composed of one request from the master unit to the slave unit and one reply from the slave unit to the master unit.

When my app receives a request, it aborts reception with HAL_UART_AbortReceive_IT() and sends its reply with HAL_UART_Transmit_IT().

At the end of the reply transmission, HAL_UART_AbortTransmit_IT() is called and one character reception is armed again with HAL_UART_Receive_IT().
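A minimal sketch of that sequence, assuming a handle `huart2`, a one-byte buffer `rx_byte`, and a hypothetical `rs485_set_driver_enable()` helper for the RS485 DE pin (these names are illustrative, not from the original post):

```c
#include "stm32f4xx_hal.h"

extern UART_HandleTypeDef huart2;   /* assumed UART handle */
static uint8_t rx_byte;             /* one-byte reception buffer */

/* Request received: stop listening and send the reply, as described above. */
void send_reply(uint8_t *reply, uint16_t len)
{
    HAL_UART_AbortReceive_IT(&huart2);
    rs485_set_driver_enable(true);          /* hypothetical DE-pin helper */
    HAL_UART_Transmit_IT(&huart2, reply, len);
}

/* Reply fully sent: release the bus and re-arm one-byte reception. */
void HAL_UART_TxCpltCallback(UART_HandleTypeDef *huart)
{
    HAL_UART_AbortTransmit_IT(huart);
    rs485_set_driver_enable(false);
    HAL_UART_Receive_IT(huart, &rx_byte, 1);
}
```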

I did not define my HAL_UART_AbortCpltCallback(), HAL_UART_AbortTransmitCpltCallback() and HAL_UART_AbortReceiveCpltCallback(). I only defined my HAL_UART_RxCpltCallback(), HAL_UART_TxCpltCallback() and the HAL_UART_ErrorCallback() above.

Also, in my app I use FreeRTOS. All my tasks have the same osPriorityNormal priority. All the enabled interrupt sources have priority 6, except 15 (lowest) for PendSV_IRQn and SysTick (used by FreeRTOS) and 5 (highest) for DMA2_Stream0_IRQn (used by ADC).

In these conditions, how can the UART RX interrupt suddenly no longer be armed (I see RXNEIE = 0) after running for a few seconds?

I know Modbus quite well, both RTU and ASCII.

Only not Cube/HAL, a thing I don't want to get involved in.

> (UART_FLAG_CTS | UART_FLAG_LBD | UART_FLAG_RXNE |

> UART_FLAG_IDLE | UART_FLAG_ORE | UART_FLAG_NE | UART_FLAG_FE | UART_FLAG_PE)

I don't see a check for overflow.

> In this condition how can UART RX interrupt be suddenly no more armed (I see the flag RXNEIE = 0) after running for a few seconds?

If another interrupt preempts, or the handler itself takes so long that characters get lost (one gets overwritten in the RX register), the overrun flag gets set.

Reception then stops (i.e. no more RXNE interrupts) until you clear the flag.

What do you mean by overflow? Maybe overrun?

The STM32F411 MCU's USART does not have any OV flag. The overrun flag is ORE and yes, when an overrun occurs, no more data is received until the ORE flag is cleared by software.
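For reference, on the F4 the ORE flag is cleared by a status-register read followed by a data-register read, and the HAL wraps that sequence in a macro. A minimal sketch (the handle `huart2` is an assumption):

```c
#include "stm32f4xx_hal.h"

extern UART_HandleTypeDef huart2;   /* assumed UART handle */

void clear_overrun_if_set(void)
{
    if (__HAL_UART_GET_FLAG(&huart2, UART_FLAG_ORE)) {
        /* On the F4 this macro performs the SR read then DR read. */
        __HAL_UART_CLEAR_OREFLAG(&huart2);
    }
}
```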

But my problem is not overrun: it is the RXNEIE bit that is suddenly cleared and never armed again if I build my app without optimization. And this issue makes me puzzled about how I should build an HAL-based application.

> What do you mean with overflow? Maybe overrun?

I guess we mean the same.

> But my problem is not overrun: it is the RXNEIE bit that is suddenly cleared and never armed again if I build my app without optimization. And this issue makes me puzzled about how I should build an HAL-based application.

I would try a data watchpoint on writes to the UART config register. Not sure if that works.

I don't really trust Cube/HAL.