What's the best protocol (USART mode?) for high-speed bidirectional comm between two STM32s?

TB · ‎2020-02-21

I’m designing a motor control PCB with two STM32G432RBs (situated close to each other on the PCB) driving a boatload of H-Bridges, encoders, servos, and other I/O. I’m planning a dedicated (synchronous) USART connection between the two STM32G4s to keep them coordinated. Speed and bidirectional comm matter in this setting (power not as much). What is the maximum data rate (Mbps) that I can reasonably expect in this setup? (I can only find one mention of a "max USART clock freq", of 21 MHz, in the datasheet.)

[Or should I use SPI instead? I am confused by the phrase "SPI-like" in the sentence from the datasheet "The USART1, USART2 and USART3 also provide a Smartcard mode (ISO 7816 compliant) and an SPI-like communication capability."....]

Finally: USART can run in many different "modes"; what is the best ("native"?) USART comm mode to use in such a setting?

Tesla DeLorean · ‎2020-02-21

SLIP ?

Typical baud ceiling is APB clock DIV8 or DIV16
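As a rough illustration of that ceiling, here is a minimal sketch that divides the USART kernel clock by the oversampling factor. The function name `uart_max_baud` and the 170 MHz figure used in the note below are assumptions for illustration; check the reference manual for the limits of the actual part and clock tree.

```c
#include <stdint.h>

/* Upper bound on the asynchronous baud rate for a given USART kernel clock.
 * oversampling must be 16 (the usual default) or 8; anything else returns 0.
 * Hypothetical helper, not part of any ST library. */
uint32_t uart_max_baud(uint32_t kernel_clk_hz, unsigned oversampling)
{
    if (oversampling != 8u && oversampling != 16u) {
        return 0u;
    }
    return kernel_clk_hz / oversampling;
}
```

Assuming a 170 MHz kernel clock, `uart_max_baud(170000000u, 8u)` gives 21.25 Mbaud and `uart_max_baud(170000000u, 16u)` gives 10.625 Mbaud, which is in the same ballpark as the 21 MHz figure quoted from the datasheet.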


TB · ‎2020-02-21

Rick Adams? RFC914 appendix D? Van Jacobson (CSLIP)? https://en.wikipedia.org/wiki/Serial_Line_Internet_Protocol

Is there any need for error correction? This seems like a (likely) common question with a myriad of possible answers, I was hoping that ST had a default "best" answer for such a situation...

waclawek.jan · ‎2020-02-21

Synchronous USART is in principle identical to SPI, except that its master has a 4x lower max baud rate, inserts pauses between bytes, limits frames to bytes, and is LSB-first only.

Compared to UART, USART/SPI is more fragile in a noisy environment due to the explicit clock signal.

As Clive said, for a link-level protocol you want some framing plus checksumming; SLIP may be a reasonable starting point/inspiration. Further details depend heavily on the particulars of your application; there's no one size fits all here.
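For reference, SLIP-style framing can be sketched in a few lines of C. The byte values (END 0xC0, ESC 0xDB, ESC_END 0xDC, ESC_ESC 0xDD) come from RFC 1055; the function names, the leading END byte, and the error conventions are choices made for this sketch only. A checksum would be appended to the payload before encoding.

```c
#include <stddef.h>
#include <stdint.h>

enum { SLIP_END = 0xC0, SLIP_ESC = 0xDB, SLIP_ESC_END = 0xDC, SLIP_ESC_ESC = 0xDD };

/* Encode len bytes of in[] as one SLIP frame in out[] (capacity cap).
 * Returns the frame length, or 0 if out[] is too small. */
size_t slip_encode(const uint8_t *in, size_t len, uint8_t *out, size_t cap)
{
    size_t n = 0;
    if (cap < 2) return 0;
    out[n++] = SLIP_END;                 /* leading END flushes any line noise */
    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];
        if (b == SLIP_END || b == SLIP_ESC) {
            if (n + 3 > cap) return 0;   /* 2 escape bytes + trailing END */
            out[n++] = SLIP_ESC;
            out[n++] = (b == SLIP_END) ? SLIP_ESC_END : SLIP_ESC_ESC;
        } else {
            if (n + 2 > cap) return 0;   /* 1 data byte + trailing END */
            out[n++] = b;
        }
    }
    out[n++] = SLIP_END;
    return n;
}

/* Decode the first non-empty frame found in in[].
 * Returns the payload length, or -1 on a bad escape or overflow of out[]. */
long slip_decode(const uint8_t *in, size_t len, uint8_t *out, size_t cap)
{
    size_t n = 0;
    int esc = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];
        if (b == SLIP_END) {
            if (esc) return -1;          /* END right after ESC is malformed */
            if (n > 0) break;            /* end of frame */
            continue;                    /* skip leading or empty frames */
        }
        if (esc) {
            if (b == SLIP_ESC_END) b = SLIP_END;
            else if (b == SLIP_ESC_ESC) b = SLIP_ESC;
            else return -1;
            esc = 0;
        } else if (b == SLIP_ESC) {
            esc = 1;
            continue;
        }
        if (n >= cap) return -1;
        out[n++] = b;
    }
    return esc ? -1 : (long)n;
}
```

Because END never appears inside an encoded payload, a receiver that loses bytes can resynchronize at the next END, which is the self-synchronizing property that makes this kind of framing attractive.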

JW

TB · ‎2020-02-21

Thank you very much to both Clive and JW for your thoughts thus far. I was hoping my question was framed in a way not leaving too many ambiguities on the particularities of the application. It's a dedicated STM32 <-> STM32 2-way channel, with the two devices placed close to each other on a single custom PCB (so there should be good impedance control and low noise on the data lines), and I need speed and, hopefully, error free 2-way comm between the two STM32s, but low power is not really an issue. I guess the only thing I haven't really specified for y'all yet is the size of the data packets to be shipped back and forth. At this point, I don't really know, so let me say a "mix" of small and big chunks of data. Does that help specify any potentially missing particularities? If there is one best answer for small data chunks, and another answer for large data chunks, that would also be helpful to know...

Tesla DeLorean · ‎2020-02-21

Problem is "Best" is hugely subjective. The ST U(S)ART implementation is pretty "naive" compared to other IP I've used.

Do people really want to use Async Serial at rates >= 1Mbps? It is not inherently robust. You can layer protection and retries over the basic transport.

How large are the packets? Could you use CAN? It has CRCs and acknowledgement methods built in.

Do you need to retry, or is the data throw-away/transient? For example I use RTCM3 data, it needs to be TIMELY, retrying and backlogging data is unhelpful.

Want CRC, or FEC/ECC?

SLIP is inherently self-synchronizing. You can come up with alternate packet forms, but you need to be able to recover from bit/byte loss.


TDK · ‎2020-02-21

I would use UART if the data rate is sufficient, up to several Mbps, and otherwise use SPI. The SPI will certainly be faster, but UART is nice in that both directions are asynchronous with respect to each other. It depends on your application details. I don't think packet size factors into it much. It's more a decision of data rate and responsiveness.


TB · ‎2020-02-21

A major thing I will be doing over this channel on this general-purpose motor control board is passing short data bursts back and forth. Given the limited number of timers on an STM32, I'll be driving 4 DRV881P H-bridges and reading 4 quadrature encoders on one STM32G4 (let's call it G4a), and driving up to an additional 4 H-bridges and reading up to an additional 4 quadrature encoders on the other STM32G4 (let's call it G4b). The estimation/control algorithm will be on G4a, which has to pass info back and forth quickly with G4b to operate properly. So, at least for this task, there will be short/fast/timely data bursts: up to four 16-bit to 32-bit encoder positions, and up to four 8-bit to 12-bit motor commands, which can each be bundled (i.e., four at a time) as they become available. (But I reckon, at some point, this channel will be used to pass back and forth much bigger chunks of data, for applications I don't yet foresee, and I want to be prepared for that.) As with Clive, I don't think it feasible in such time-critical problems to bother with the backlogging overhead necessary to enable automatic retries. So, an emphatic YES on implementing a simple ECC.
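To make the bundle sizes concrete, here is a minimal sketch of packing four 12-bit motor commands into 6 bytes and unpacking them again. The function names and the bit layout (two commands per 3 bytes, high bits first) are assumptions made for illustration, not an established format.

```c
#include <stdint.h>

/* Pack four 12-bit commands into 6 bytes, two commands per 3 bytes.
 * Bits above bit 11 of each command are discarded. */
void pack_cmds12(const uint16_t cmd[4], uint8_t out[6])
{
    for (int i = 0; i < 2; i++) {
        uint16_t a = cmd[2 * i] & 0x0FFFu;
        uint16_t b = cmd[2 * i + 1] & 0x0FFFu;
        out[3 * i]     = (uint8_t)(a >> 4);                        /* a[11:4]        */
        out[3 * i + 1] = (uint8_t)(((a & 0x0Fu) << 4) | (b >> 8)); /* a[3:0] b[11:8] */
        out[3 * i + 2] = (uint8_t)(b & 0xFFu);                     /* b[7:0]         */
    }
}

/* Inverse of pack_cmds12. */
void unpack_cmds12(const uint8_t in[6], uint16_t cmd[4])
{
    for (int i = 0; i < 2; i++) {
        cmd[2 * i]     = (uint16_t)((in[3 * i] << 4) | (in[3 * i + 1] >> 4));
        cmd[2 * i + 1] = (uint16_t)(((in[3 * i + 1] & 0x0Fu) << 8) | in[3 * i + 2]);
    }
}
```

With this layout a command bundle is 6 bytes, and a bundle of four 32-bit encoder positions would be 16 bytes, so both directions stay well within a few tens of bytes per exchange before any framing or check bits are added.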

Assuming good impedance control and low noise on the (short, hopefully clean) data lines to be used, I reckon that using CAN is unnecessary (though we are using the CAN-FD module in the STM32G4 for communication to some off-board (remote) peripherals over noisy/lossy data lines), and that (async) UART can't really be trusted as I push data rates above 1 Mbps (perhaps TDK disagrees?). However, I don't think I yet understand the subtleties of the USART-vs-SPI question, nor the what-kind-of-USART-protocol-to-use question.

I also don't know what library to use at the higher level for encoding/decoding. For very short messages, a super simple SED (d=2) parity code ([9,8], [17,16], etc.) might suffice. For longer messages, I am leaning towards a simple SECDED extended binary Hamming code, of length appropriate for the messages being sent. The SECDED (d=4) [128,120] and [64,57] extended binary Hamming codes (or some appropriately shortened versions thereof) seem reasonable, and fit the data bundles described in the first paragraph well. In either case, if the algorithm detects uncorrectable errors too frequently at the comm rate being used over the channel, then that rate can/should probably be throttled back automatically by the software controlling the channel.
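To show the SECDED mechanics at toy size, here is a minimal sketch of the [8,4] extended Hamming code, the smallest practical member of the same family as the [64,57] and [128,120] codes mentioned above. The function names and the bit layout (codeword bit i is Hamming position i, with bit 0 as the overall parity bit) are choices made for this sketch; a real implementation for longer codes would more likely use lookup tables or wider XOR reductions.

```c
#include <stdint.h>

/* Parity (XOR of all bits) of an 8-bit value. */
unsigned parity8(uint8_t x)
{
    x ^= (uint8_t)(x >> 4);
    x ^= (uint8_t)(x >> 2);
    x ^= (uint8_t)(x >> 1);
    return x & 1u;
}

/* Encode the low 4 bits of nibble. Data bits sit at positions 3, 5, 6, 7;
 * Hamming parity bits at positions 1, 2, 4; overall parity at position 0. */
uint8_t secded84_encode(uint8_t nibble)
{
    unsigned d0 = nibble & 1u, d1 = (nibble >> 1) & 1u;
    unsigned d2 = (nibble >> 2) & 1u, d3 = (nibble >> 3) & 1u;
    unsigned p1 = d0 ^ d1 ^ d3;          /* covers positions 3, 5, 7 */
    unsigned p2 = d0 ^ d2 ^ d3;          /* covers positions 3, 6, 7 */
    unsigned p4 = d1 ^ d2 ^ d3;          /* covers positions 5, 6, 7 */
    uint8_t cw = (uint8_t)((p1 << 1) | (p2 << 2) | (d0 << 3) | (p4 << 4) |
                           (d1 << 5) | (d2 << 6) | (d3 << 7));
    cw |= (uint8_t)parity8(cw);          /* make the total parity even */
    return cw;
}

/* Decode a codeword. Returns 0 if clean, 1 if a single-bit error was
 * corrected, -1 if a double-bit error was detected (*nibble untouched). */
int secded84_decode(uint8_t cw, uint8_t *nibble)
{
    unsigned syndrome = 0;
    int status = 0;
    for (unsigned pos = 1; pos < 8; pos++) {
        if (cw & (1u << pos)) syndrome ^= pos;   /* XOR of set positions */
    }
    if (parity8(cw)) {                   /* odd parity: single error */
        cw ^= (uint8_t)(1u << syndrome); /* syndrome 0 means bit 0 itself */
        status = 1;
    } else if (syndrome != 0) {          /* even parity, bad syndrome */
        return -1;
    }
    *nibble = (uint8_t)(((cw >> 3) & 1u) | (((cw >> 5) & 1u) << 1) |
                        (((cw >> 6) & 1u) << 2) | (((cw >> 7) & 1u) << 3));
    return status;
}
```

A count of the `-1` and `1` return values over time is the kind of statistic that could drive the automatic rate throttling described above.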

Anyway, my hope was that the ECC encoding/decoding stuff (including the automatic data rate adjustment on the channel) would all be available in an existing and STM32-optimized library which I had not yet found (the datasheet glows about ART and the DSP on the STM32, neither of which I know how to take advantage of). I also hoped that a preferred STM32-to-STM32 comm protocol would be identified for simple tasks like this, which I thought would be a common need. But if I have to do this all from scratch, and hand write an encoding/decoding algorithm on the CPU, so be it. I guess I'd lean towards some sort of bidirectional USART (or, two unidirectional USARTs, to avoid channel contention??), and I'll break out Kernighan & Ritchie and do it old school.

S.Ma · ‎2020-02-21

Be futureproof: connect both USART and SPI (that will be 4 wires by shorting pins together MOSI+TX, MISO+RX)

USART is simpler to use, and you need a SW mechanism to wrap and sync data payloads.

For debug, you can slow down to 115200 bps, hook up an HC-06, and monitor/debug in Tera Term.

SPI requires more SW coding investment upfront.

There will be a master and slaves. Regular time interval SPI exchange is needed to "poll" to check if slaves want to send data.

SPI+DMA+TIMER+Interrupt can hide the whole mechanism from bare-metal or RTOS code as a simple interrupt-based state machine.

SPI between 1 master and 12 slaves through connectors gets 12 Mbps over approx 1 meter without error (just checking through a fixed header).

To make it simpler when starting the implementation, fix the payload size (say 1024 bytes) and trigger an exchange every 2 msec.
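As a quick sanity check on that schedule, the payload bit rate it implies can be computed as below. The helper name `required_bps` is hypothetical, and the result ignores any framing, chip-select gaps, or CRC bytes.

```c
#include <stdint.h>

/* Payload bit rate needed to move payload_bytes every period_us microseconds.
 * Returns 0 for a zero period. Protocol overhead is not included. */
uint64_t required_bps(uint32_t payload_bytes, uint32_t period_us)
{
    if (period_us == 0u) {
        return 0u;
    }
    return ((uint64_t)payload_bytes * 8u * 1000000u) / period_us;
}
```

For 1024 bytes every 2 msec, `required_bps(1024u, 2000u)` comes to 4,096,000 bit/s per direction, which leaves headroom under the 12 Mbps figure quoted above.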

berendi · ‎2020-02-22

There are a few key differences between SPI and UART.

Bidirectional SPI needs 4 lines to communicate, data in, data out, clock and chip select. UART requires only 2, an input and an output, and optionally 2 more for flow control.

SPI is a master-slave protocol. The master controls all aspects of communication, the slave must have the answer ready (already present in the output FIFO or DMA buffer) when the master asks for it. The slave cannot initiate communication, it must wait until the master polls, because when the master does not want to communicate, there is no clock. This can be circumvented by having an extra signal line from the slave to the master, but then you'd need a fifth signal between the two MCUs. UART has no such control, there is no master-slave hierarchy, each side can send data whenever it wants to, unless the flow control line indicates that the other side is not ready to receive. The SPI slave has no facility for flow control, the master will just hold the clock if it can't supply or process data.

UART uses start and stop bits for framing before and after each byte, wasting 20% of the bandwidth on that. It requires the clock frequencies of the two MCUs to be within a few % of each other. One MCU can supply its clock on the MCO output to the other one for perfect clock synchronization.

It might not be immediately obvious from the documentation, but both UARTs and USARTs can work in asynchronous mode, and they are fully compatible as long as the "SPI like" synchronous features are not used.
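The 20% figure follows from the frame layout: with 8 data bits, no parity, and 1 stop bit, each byte costs 10 bit times on the wire. A minimal sketch of that arithmetic, with a hypothetical helper name:

```c
#include <stdint.h>

/* Payload bit rate of an asynchronous UART link.
 * Each frame is 1 start bit + data_bits + parity_bits + stop_bits. */
uint32_t uart_payload_bps(uint32_t baud, unsigned data_bits,
                          unsigned parity_bits, unsigned stop_bits)
{
    unsigned frame_bits = 1u + data_bits + parity_bits + stop_bits;
    return (uint32_t)(((uint64_t)baud * data_bits) / frame_bits);
}
```

At 1 Mbaud, 8N1 gives `uart_payload_bps(1000000u, 8u, 0u, 1u)` = 800,000 bit/s of payload. Enabling the hardware parity bit makes each frame 11 bits and drops that to 727,272 bit/s.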

USART in SPI mode combines the disadvantages of both, transmitting the start and stop bits (but without clock pulses) and requiring separate clock and chip select. I've never used it.

I would not worry much about signal integrity if the two MCUs are sitting next to each other on a PCB. I doubt you'd encounter a single bit error in your life. Go for the fastest possible ECC algorithm that you can find. You might not want to bother with sophisticated ECC at all, because it is somewhat computation intensive; instead, just send each packet twice at high speed, using the hardware parity checker of the UART or CRC checking on SPI.
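The send-it-twice idea can be sketched on the receiving side as follows. This uses a software CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF) purely as a stand-in for the hardware parity or SPI CRC check mentioned above; the function names and the packet layout (payload followed by the CRC, high byte first) are assumptions made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF, no reflection. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* A copy is payload_len payload bytes followed by the CRC, high byte first. */
int copy_is_valid(const uint8_t *copy, size_t payload_len)
{
    uint16_t rx = (uint16_t)(((uint16_t)copy[payload_len] << 8) |
                             copy[payload_len + 1]);
    return crc16_ccitt(copy, payload_len) == rx;
}

/* Return the first copy whose CRC checks out, or NULL if both are damaged. */
const uint8_t *pick_valid_copy(const uint8_t *first, const uint8_t *second,
                               size_t payload_len)
{
    if (copy_is_valid(first, payload_len)) return first;
    if (copy_is_valid(second, payload_len)) return second;
    return NULL;
}
```

The receiver does no correction at all here; it only needs one of the two copies to arrive intact, which keeps the per-packet CPU cost to a CRC pass or less if the check is done in hardware.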

UART transmit and receive channels have only the baud rate in common, data flow is completely independent, with separate FIFOs and DMA channels. Use two unidirectional UARTS only if it would simplify the PCB layout somehow.

Although UARTs on the G4 support character match and timeout detection, the receiving process would be simplified if the packets were of a uniform larger size, or a multiple of the smallest packet size.