STM32F030 USART bug?

JHans.6
Associate II

When working with an STM32F030C8T6 board, communicating with another board over a serial link (115200bps), I suddenly started to get erroneous characters once in a while, when the remote board was sending data at the same time.

It's a block protocol, with the remote board sending 256-byte blocks and the F030 board sending ACKs (hex 0x06) along the way.

It had worked fine when the ACK from the F030 was sent after a full block. When sending while there was still incoming data, the start bit of the outgoing data was suddenly shortened a bit, resulting in incorrect data being received at the remote end.

The logic traces show this.

Start bit + one bit should be 17.38us (bit time 8.68us), which can be seen to be correct in the "normal startbit" screenshot. In the "short startbit" screenshot, this time is only 12.62us.
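For reference, the nominal figures follow from the baud rate alone. A small sketch in Python (the variable names are only illustrative; 12.62us is the value measured above):

```python
BAUD = 115200

bit_time_us = 1e6 / BAUD              # nominal duration of one bit
two_bits_us = 2 * bit_time_us         # start bit + first data bit
measured_short_us = 12.62             # from the "short startbit" capture

missing_us = two_bits_us - measured_short_us
missing_bits = missing_us / bit_time_us

print(f"bit time        : {bit_time_us:.2f} us")
print(f"start + one bit : {two_bits_us:.2f} us")
print(f"missing         : {missing_us:.2f} us ({missing_bits:.2f} bit)")
```

By this arithmetic the short capture is missing a little over half a bit time.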

(I don't know if it's actually the start bit or the first data bit that is corrupted, but I assume it is the start bit.) The following bits are all okay.

This causes the 0x06 to be read as 0x83 (as the analyser also shows). I've had a case where it was read as 0x07 instead of 0x06, which could indicate that the bit corruption is not just happening on the startbit.
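As a sanity check on those two values, here is a minimal Python sketch of an 8N1 frame (start bit, 8 data bits LSB first, stop bit). It assumes the receiver locks on one bit late, which is only one possible reading of the trace; the helper names are made up for illustration.

```python
def frame_bits(byte):
    """Line levels of an 8N1 frame: start (0), 8 data bits LSB first, stop (1)."""
    return [0] + [(byte >> i) & 1 for i in range(8)] + [1]

def decode_from(levels, start_index):
    """Decode the 8 data bits that follow a start bit assumed at start_index.
    The line idles high, so the tail is padded with 1s."""
    padded = levels + [1] * 10
    data = padded[start_index + 1 : start_index + 9]
    return sum(bit << i for i, bit in enumerate(data))

line = frame_bits(0x06)
normal = decode_from(line, 0)     # receiver aligned with the real start bit
slipped = decode_from(line, 1)    # receiver aligned one bit late (assumption)
bit0_high = 0x06 | 0x01           # only data bit 0 misread as 1

print(hex(normal), hex(slipped), hex(bit0_high))
```

Decoding one bit late turns 0x06 into 0x83, and misreading only data bit 0 as high gives 0x07, so both observed values fit a disturbance around the start bit and the first data bit.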

The data were captured on the TX/RX pins of the CPU, USART1.

USART1 was running code with interrupt-driven TX and RX. I've tried changing the TX to be "direct" (poll the TXE bit to see if it's ready, then write to TDR). This does NOT change the behaviour.

This seems to indicate that there might be an issue with the HW implementation of the USART.

I've checked the errata, but could not find anything related to this.

The CPU marking is STM32F030C8T6 AA039 1198 TWN AA 826

Any takers? Comments?

1 ACCEPTED SOLUTION

JHans.6
Associate II

Indeed a different HW issue, Pavel. It was a false alarm, obviously.

I already have series resistors between the modem and CPU, 1k.

When going to set up a scope test to grab the erroneous byte, as suggested by KIC8462852, I realised that the latest batch of boards had 10k resistors mounted instead of 1k.

This caused signals to float up a bit and bits to be detected incorrectly. Oddly enough this only happened when I was receiving data at the same time, and it happened simultaneously at the receiving end AND the analyser, so now, at least I know that the thresholds are very similar ;)
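To illustrate why a larger series resistor can lift the low level, here is a rough Python sketch of a plain resistive divider. It assumes something on the receiving side pulls the line up to the supply; the thread does not say what that is, so the 47k pull-up and 3.3V supply below are purely hypothetical values, and the function name is made up.

```python
def low_level_volts(vdd, r_series, r_pullup):
    """Voltage at the receiver input when the driver pulls low through
    r_series against a pull-up of r_pullup to vdd (simple divider)."""
    return vdd * r_series / (r_series + r_pullup)

VDD = 3.3            # hypothetical supply voltage
R_PULLUP = 47_000    # hypothetical pull-up on the receiving side

low_1k = low_level_volts(VDD, 1_000, R_PULLUP)
low_10k = low_level_volts(VDD, 10_000, R_PULLUP)

print(f"low level with 1k : {low_1k:.3f} V")
print(f"low level with 10k: {low_10k:.3f} V")
```

Under these assumed values the low level rises by almost an order of magnitude with 10k, which shrinks the noise margin; the real numbers depend on the actual circuit.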

So, as always, never trust HW you haven't assembled (or triple inspected) yourself.

Thanks to everyone for ideas and suggestions.

9 REPLIES
S.Ma
Principal

What do the rising and falling edges of the signals look like? (An oscilloscope should be used before any logic analyser.)

What is the clock source on each end? Crystal? Resonator? What are the tolerances/spread (PVT: process, voltage, temperature dependencies)?

What are the USART register values when code is running?

JHans.6
Associate II

It's not possible to grab this specific instance with the scope, as it happens only rarely. It is related to what happens on the RX line.

But signals are good, HSI is used with LSE calibration, <0.5% accuracy.
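To put the <0.5% figure in perspective, here is a rough Python estimate of how far the sampling point drifts over one 8N1 frame. It assumes the receiver resynchronises on every start edge and samples the stop bit 9.5 bit times later; treating the far end as also being 0.5% off is my assumption, not something stated here.

```python
def worst_case_drift_bits(clock_error, bits_to_last_sample=9.5):
    """Sampling-point drift, in bit times, at the centre of the stop bit
    of an 8N1 frame, for a given relative clock error."""
    return clock_error * bits_to_last_sample

one_side = worst_case_drift_bits(0.005)            # only this end is off by 0.5%
both_sides = worst_case_drift_bits(0.005 + 0.005)  # both ends off, opposite directions

print(f"drift, one side  : {one_side:.4f} bit")
print(f"drift, both sides: {both_sides:.4f} bit")
```

Either way the drift stays around a tenth of a bit or less, far from the half bit needed to sample the wrong bit, so clock tolerance alone does not look like the cause.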

I have no idea of the USART registers when the code is running, they obviously change, so cannot be captured.

Note that the USART works perfectly well, AS LONG as there's no data coming in on the RX line. This happens only when the RX is handling data.

Tell us about the hardware: is there any level conversion involved, how are the two boards mechanically arranged, is there any cabling, and how is the common ground/return arranged?

JW

JHans.6
Associate II

For simplicity, I described it as two boards. It's an F030 communicating through a GSM modem on the same board, to a remote Ubuntu server.

It's not really relevant, as the issue is measured directly on the F030.

I've sent several MB of data over the link with no issues, as long as I keep it as "half duplex" and wait until data has been received before sending an ACK byte back.

So, the link itself is perfect and stable.

This is a design that has been in production for over a year, but I just recently changed the protocol slightly to speed it up, by sending blocks in bursts of 10 and "picking up" 10 ACKs at the other end. That did NOT work, due to the corrupted transmit bytes.

JHans.6
Associate II

Note that not all of the ACKs are corrupted; it's roughly 1 in 10 or 20. It depends on the timing or the bits coming in on the RX line.

Here's how it is sent: (image: 0690X000008AkbfQAC.png)

Now this is completely out of the ordinary. The 'F0 parts have been around for quite some time, and if there were a hardware bug in full-duplex UART operation, I'm sure we'd already know about it.

So it's time for desperate measures:

  • read out and post the content of the USART registers (all of them), at any point, say at the beginning of communication, so we have something to chew on
  • cut the trace from Tx to the modem, insert a resistor, say somewhere between 100R and 1k, and connect the LA on both sides of that resistor
  • reduce the program to minimal but complete compilable form which still exhibits the problem, and post it

JW

If the MCU can detect the error, it can toggle a GPIO which will trigger the scope. Then, like an automated speed camera, the scope will catch the bug in the act.

Pavel A.
Evangelist III

> but I just recently changed the protocol slightly to speed it up,

Or... just revert to the previous variant. Better slower but working, than fast but buggy.

It may indeed be a hardware issue, though not in the STM32 itself.

-- pa
