STM32F415RG missing bytes on UART RX

TSchm.1
Associate II

Hi,

I have a board containing an STM32F415RG and another board containing an MCU from another manufacturer. The two boards are supposed to communicate with each other through RS-232 transceivers. The communication runs at 9600 baud and works well for a few minutes, after which the STM32 seems to miss received bytes.

I work in IAR, but I have configured the STM32 using CubeMX with Firmware package F4 V1.25.0. My application is running on CMSISv2/freeRTOS. I use the HAL API for all module accesses.

Communication is always initiated by the STM32, which sends 64 bytes of data to the slave MCU. The STM32 always expects 64 bytes in return. It does not matter whether I use the _IT or _DMA implementations of the StartRX/TX methods; I have checked both. Every data packet ends with a 16-bit checksum.
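For readers who want to check a received packet in software, here is a minimal sketch of a trailing-checksum test. The post does not say how the checksum is computed, so the additive 16-bit sum, the big-endian byte order, and the names `packet_sum16` and `packet_is_valid` are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PACKET_LEN 64u  /* packet size stated in the post */

/* ASSUMPTION: additive 16-bit checksum over the first 62 bytes,
 * stored big-endian in the last two bytes. The real protocol may differ. */
static uint16_t packet_sum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}

/* Returns 1 if the trailing checksum matches the payload, 0 otherwise. */
static int packet_is_valid(const uint8_t packet[PACKET_LEN])
{
    assert(packet != NULL);
    uint16_t expected = packet_sum16(packet, PACKET_LEN - 2u);
    uint16_t received = (uint16_t)((packet[PACKET_LEN - 2u] << 8) |
                                   packet[PACKET_LEN - 1u]);
    return expected == received;
}
```

A packet with a byte missing in the middle fails this test because the checksum bytes no longer sit in the last two positions, which matches the symptom described in the post.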

I implemented an RX timeout using a semaphore, which works fine; the timeout is long enough to accept some delay between the request and the response. When a timeout occurs, I abort the transfer (either by calling AbortReceive_IT() or DmaStop(); I have tried both). I put a breakpoint here, and I can see in my register view that the corresponding DMA data length register (NDTR) is non-zero and the RX buffer array is not full. The missing bytes are somewhere in the middle, because the checksum bytes are visible in the buffer, but not at its end where they should be.

So, I connected an oscilloscope to the RS-232 signals and also to the UART signals right before the STM32. Also, I connected a logic analyzer to the same signals. In the logic analyzer output, I see that all bytes are transmitted correctly. I also cannot observe any signal distortion on the analog side. But the STM32 keeps missing bytes, even if the transfer seems OK for the analyzer.

When a byte loss occurs, the following transmissions are most likely to be corrupted as well. This keeps going for a while (minutes), after which the transmissions work fine again. I have read that the HAL API is prone to race conditions, but since I have only one task accessing the UART peripheral, I don't think this is a problem here.

Has anyone encountered this problem before, or do you have a suggestion?

1 ACCEPTED SOLUTION

Accepted Solutions
TSchm.1
Associate II

After a whole day of investigation, it seems to me that the RS-232 interface driver is actually the problem. It appears that the receiver threshold is drifting over time, which leads to "noise bits" after some time and to complete communication failure after a somewhat longer time. The problems got worse during the day, so I will replace the interface driver tomorrow and see what happens next. Thank you for your hints; I will still consider them when making the communication more robust.

6 REPLIES

Do you have any UART error handling in place?

JW

Having it wait for 64 characters and expecting it to recover/remain in sync is a bit hopeful.

Usually you want to get a bulletproof 1-byte Rx implementation working, and dig out any bugs there first.

Identify what's happening. Check whether the receiver is flagging overrun, parity, framing, noise or other errors; these need to be cleared immediately.

If you expect a burst of bytes, and you don't receive as many as expected, or paced properly, flag to a GPIO, and use that to trigger a logic analyzer for inspection.

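Don't observe the peripheral registers in a debug view; this is invasive, because a debugger read of the USART data register has the same side effects as a CPU read and can clear status flags. If you want telemetry or diagnostic output, add it to your routines so you can observe the flow and the patterns leading to failure.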
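As an illustration of the error flagging and in-code telemetry suggested above, here is a minimal sketch that counts error types from a snapshot of the USART status register. The bit positions are those of USART_SR on the STM32F4; the structure `uart_err_stats_t` and the function `uart_err_record` are hypothetical names. On the target, the snapshot would presumably be taken in the USART interrupt path (with HAL, the error code passed along to HAL_UART_ErrorCallback() carries similar information); here it is just a function argument so the sketch can be compiled and tried on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Error bits of USART_SR on the STM32F4 */
#define SR_PE  (1u << 0)  /* parity error   */
#define SR_FE  (1u << 1)  /* framing error  */
#define SR_NF  (1u << 2)  /* noise detected */
#define SR_ORE (1u << 3)  /* overrun error  */

/* Hypothetical telemetry structure: one counter per error type. */
typedef struct {
    uint32_t parity;
    uint32_t framing;
    uint32_t noise;
    uint32_t overrun;
} uart_err_stats_t;

/* Update the counters from a status register snapshot.
 * Returns nonzero if any error bit was set in the snapshot. */
static int uart_err_record(uart_err_stats_t *stats, uint32_t sr_snapshot)
{
    assert(stats != NULL);
    if (sr_snapshot & SR_PE)  { stats->parity++;  }
    if (sr_snapshot & SR_FE)  { stats->framing++; }
    if (sr_snapshot & SR_NF)  { stats->noise++;   }
    if (sr_snapshot & SR_ORE) { stats->overrun++; }
    return (sr_snapshot & (SR_PE | SR_FE | SR_NF | SR_ORE)) != 0u;
}
```

The counters can then be printed over a spare channel or inspected after the fact, which shows whether failures start with noise, framing or overrun errors without halting the core in the middle of a transfer.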

9600 baud is a pretty low rate, the F4 should have no problem with this. Check the bit-timings, this type of serial operation can have relatively broad timing tolerance. You could try setting baud rate 5% high/low and see if the problem gets better/worse.
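To put a rough number on that timing tolerance, here is a back-of-the-envelope sketch. It assumes 8N1 framing (10 bits per frame) and an idealized receiver that resynchronizes on the start-bit edge and samples each bit once at its center; real receivers oversample and tolerate somewhat less. The function names are made up for this example.

```c
#include <assert.h>

#define FRAME_BITS 10.0  /* ASSUMPTION: 8N1 = start + 8 data + stop */

/* Offset of the receiver's stop-bit sample from the center of the
 * transmitted stop bit, in transmitter bit times (0 = perfectly centered). */
static double stop_bit_sample_offset(double tx_baud, double rx_baud)
{
    assert(tx_baud > 0.0 && rx_baud > 0.0);
    /* The receiver samples the stop bit 9.5 of its own bit times
     * after the start edge. */
    double rx_sample_time = (FRAME_BITS - 0.5) / rx_baud;
    /* Convert to transmitter bit times and subtract the ideal position. */
    return rx_sample_time * tx_baud - (FRAME_BITS - 0.5);
}

/* The stop bit is still sampled inside its own bit cell as long as
 * the sample has drifted by less than half a bit. */
static int frame_is_sampled_safely(double tx_baud, double rx_baud)
{
    double off = stop_bit_sample_offset(tx_baud, rx_baud);
    if (off < 0.0) {
        off = -off;
    }
    return off < 0.5;
}
```

In this idealized model the combined mismatch of both ends may reach roughly 0.5 / 9.5, or about 5 %, before the stop bit is missed, so deliberately detuning by 5 % sits right at the edge and quickly reveals whether there was any margin left.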

For RX DMA I use a continuous circular buffer as a FIFO that I sweep periodically with size/rate determined by bit/byte rate expectations.
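The circular-buffer approach might look roughly like the sketch below. It is not the poster's code: the buffer size, the type `rx_fifo_t` and the function `rx_fifo_sweep` are hypothetical. On the target, the DMA stream would run in circular mode into `buf`, and `ndtr` would be read from the stream's remaining-count register (NDTR); here it is passed in as a parameter so the logic can be tried on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 128u  /* hypothetical size, at least twice the packet length */

typedef struct {
    uint8_t buf[RX_BUF_SIZE];  /* DMA target, written in circular mode */
    size_t  tail;              /* next unread index, owned by software */
} rx_fifo_t;

/* Copy out the bytes the DMA has written since the last sweep.
 * ndtr is the remaining-count value of the DMA stream; it counts down
 * from RX_BUF_SIZE and reloads when it reaches zero.
 * Returns the number of bytes copied into out (at most out_cap). */
static size_t rx_fifo_sweep(rx_fifo_t *f, uint32_t ndtr,
                            uint8_t *out, size_t out_cap)
{
    assert(f != NULL && out != NULL);
    assert(ndtr <= RX_BUF_SIZE);
    /* The DMA write position is the buffer size minus the remaining count. */
    size_t head = (RX_BUF_SIZE - ndtr) % RX_BUF_SIZE;
    size_t n = 0;
    while (f->tail != head && n < out_cap) {
        out[n++] = f->buf[f->tail];
        f->tail = (f->tail + 1u) % RX_BUF_SIZE;
    }
    return n;
}
```

Because the DMA is never stopped or restarted, there is no window in which the receiver is not listening. The sweep has to run often enough that fewer than RX_BUF_SIZE bytes arrive between calls, since an overwrite cannot be detected in this simple form.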

Enabling/disabling RX DMA has potential race conditions; make sure you're not losing data because you're not listening.

TSchm.1
Associate II

Hi Jan, hi Clive,

Thanks for your replies. I had already triggered the logic analyzer to a GPIO in the error case, and the data is OK for the analyzer but not for the STM32. Also, when I trigger a scope to the very same event, I cannot observe any excessive noise.

Having a closer look with the debugger tells me that the NF (noise detection) and FE (framing error) flags are set on the very first erroneous transmission. On every following transmission, these flags are not set, but the communication still goes wrong. So it seems to me that it is a bit of a synchronization issue, as you pointed out it might be. It seems like the STM is missing some bytes on the first transmission and does not recover from this. But as I said, the STM32 is the master, the timeout is long enough, and I stop the transfer after the timeout, so when the STM (as the master) initiates the next transfer, the synchronization should be restored.

I will investigate this further.

A corollary here about RS232 drivers: their bandwidth ceiling is below 1 Mbps, and the make, model and capacitors may limit it further.

Watch the capacitor values/characteristics, most people cut-n-paste other schematics, check what the datasheet actually specifies.

I have yet to see an RS232 transceiver that has problems with levels/thresholds at 9600 baud. The usual voltage swing is ±10 V into a receiver whose threshold and hysteresis are at TTL levels, i.e. hundreds of mV. We are also talking about a few µs of surprisingly well-controlled slew rate.

My tip #1 is ground/return issue.

My second tip is baudrate mismatch ("oh, the builtin RC oscillator is surely good enough").

JW