STM32H7 UART DMA TX/RX issues

GStee.2
Associate II

We are currently seeing issues while trying to get UART communication working: we have two STM32H7 (STM32H753) boards connected via RS-232. Board 1 sends 4 kB of data each second and board 2 receives it.

Tx on board 1 is done through DMA via a 4 kB buffer. Rx on board 2 is done via a 256-byte DMA buffer, so to receive all 4 kB sent by board 1 we get 32 interrupts (16 half-transfer and 16 transfer-complete).
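
As a rough check of these numbers, the callback count and the time available between two callbacks can be computed as below. This is only a sketch: the function names are made up for illustration, and it assumes 8N1 framing (10 bits on the wire per byte), which the post does not state.

```c
#include <stdint.h>

/* Number of half-transfer + transfer-complete callbacks needed to
 * receive total_bytes through a DMA buffer of buf_bytes.
 * Each full buffer produces one HT and one TC event. */
uint32_t rx_callbacks_per_block(uint32_t total_bytes, uint32_t buf_bytes)
{
    return 2u * (total_bytes / buf_bytes);
}

/* Time in microseconds between two callbacks, i.e. the time needed to
 * fill half of the DMA buffer (result is truncated to whole microseconds).
 * ASSUMPTION: the caller passes bits_per_frame = 10 for 8N1
 * (start bit + 8 data bits + stop bit). */
uint32_t us_between_callbacks(uint32_t buf_bytes, uint32_t baud,
                              uint32_t bits_per_frame)
{
    uint64_t bits = (uint64_t)(buf_bytes / 2u) * bits_per_frame;
    return (uint32_t)((bits * 1000000u) / baud);
}
```

With a 4096-byte block, `rx_callbacks_per_block(4096, 256)` gives 32 and `rx_callbacks_per_block(4096, 512)` gives 16. Under the 8N1 assumption, at 921600 bit/s a 256-byte buffer leaves roughly 1.4 ms between callbacks and a 512-byte buffer roughly 2.8 ms.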

In this configuration we are able to receive all data properly. So far so good (we ran this test at two speeds: 921600 bit/s and 460800 bit/s).

To lower the number of interrupts we then increased the DMA Rx buffer from 256 bytes to 512 bytes. From that point on it goes wrong: on the oscilloscope we see that board 1 is sending data, but we don't see the data coming in on board 2.

We took into account the information in the knowledge base article "DMA is not working on STM32H7 devices" by aligning the Tx buffer on a 4 kB boundary and the Rx buffer on a 512-byte boundary. Furthermore, we disabled I/D caching for this test.
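
The alignment described here could be expressed as in the sketch below. The buffer names and the helper `is_aligned` are hypothetical, since the original code is not shown. The sketch only covers alignment; whether the buffers end up in a RAM region the DMA controller can reach depends on the linker script and is not addressed here.

```c
#include <stdint.h>
#include <stdalign.h>

#define TX_BUF_SIZE 4096u
#define RX_BUF_SIZE 512u

/* Hypothetical buffer names; each buffer is aligned to its own size,
 * matching the alignment described in the post. */
alignas(TX_BUF_SIZE) uint8_t tx_buf[TX_BUF_SIZE];
alignas(RX_BUF_SIZE) uint8_t rx_buf[RX_BUF_SIZE];

/* Returns 1 if p is a multiple of alignment, 0 otherwise.
 * Handy as a start-up sanity check before handing a buffer to the DMA. */
int is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t)p % alignment) == 0u;
}
```

A start-up check such as `is_aligned(rx_buf, RX_BUF_SIZE)` makes it obvious if a later change to the buffer declaration or the linker script silently breaks the alignment.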

Important remark: when we set a breakpoint in the callback of the HT/TC Rx interrupt, execution does stop in the callback, and when we continue from there we receive data for a very short while, but then it stops again (until we re-apply the breakpoint).

Any idea what could be causing this behavior? Please note that we are not using the FIFO mode of the UARTs.

24 REPLIES

Cache coherency will be a persistent issue.

For the RX failure, check for noise, framing and other reception errors as the cause of the stall.

Would generally recommend instrumentation over breakpoints, and avoid looking at peripheral registers in the debugger, as this is invasive.
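
One way to instrument instead of breaking is to keep plain counters that the callbacks increment and that are read out later (for example over a debug print). The sketch below is an assumption about how that might look: the struct, the function name and the flag bits are placeholders defined locally so the example compiles on its own, they are not the driver's definitions. In a HAL project the error mask would come from the UART handle inside the error callback.

```c
#include <stdint.h>

/* Placeholder flag bits, defined here only so the sketch is
 * self-contained. In a real project use the error flags provided
 * by the UART driver instead. */
#define ERR_PARITY  (1u << 0)
#define ERR_NOISE   (1u << 1)
#define ERR_FRAMING (1u << 2)
#define ERR_OVERRUN (1u << 3)

/* Counters are volatile because they are written from interrupt
 * context and read from the main loop. */
typedef struct {
    volatile uint32_t parity;
    volatile uint32_t noise;
    volatile uint32_t framing;
    volatile uint32_t overrun;
    volatile uint32_t rx_half;      /* half-transfer callbacks seen */
    volatile uint32_t rx_complete;  /* transfer-complete callbacks seen */
} uart_stats_t;

/* Call from the UART error callback with the driver's error mask. */
void uart_stats_count_errors(uart_stats_t *s, uint32_t error_code)
{
    if (error_code & ERR_PARITY)  { s->parity++;  }
    if (error_code & ERR_NOISE)   { s->noise++;   }
    if (error_code & ERR_FRAMING) { s->framing++; }
    if (error_code & ERR_OVERRUN) { s->overrun++; }
}
```

Incrementing `rx_half` and `rx_complete` in the two Rx callbacks shows whether the callbacks keep firing after reception appears to stall, without halting the core.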

GStee.2
Associate II

OK, but in case of reception errors I would expect these to show up as well with a DMA RX buffer size of 256 instead of 512, and this wasn't the case. We measured the CPU load in both cases and it was very low (as expected, since only our test task is running). The only actual difference between the two use cases is that we doubled the size of the DMA reception buffer and changed the buffer alignment according to the new buffer size.

Do you use circular DMA for Rx? How do you process the received data, exactly?

Try a single case, i.e. transmit only one 4k buffer, and count how far the receiver gets (how many bytes it receives).

Note that observing UART registers in the debugger will break things.

JW

gbm
Lead III

Looks like you are asking for trouble. First, it's not a good idea to send such big packets via UART. I would rather split the data into smaller fragments with some idle time between them, which is easy to implement by sending an idle frame. Think about what happens if there is any problem with data received via DMA in packets this big: there is no easy way to recover from the error.

My STM32 stuff on github - compact USB device stack and more: https://github.com/gbm-ii/gbmUSBdevice
GStee.2
Associate II

We are not using circular DMA (so direct mode is used). By using a counter we found out that our Rx callback was still being called but the data was not handled properly: we had moved to a 512-byte DMA buffer, but our length/position was still stored in a byte-sized variable (so 255 max).

After updating the data type for length/position we also got it working with a 512-byte DMA buffer (the test ran for more than 15 minutes without any issues).
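
The original code is not shown, so the following is only a hypothetical reconstruction of this failure mode: a position kept in a `uint8_t` silently wraps at 256, so it can never address the second half of a 512-byte buffer. The function names are invented for illustration.

```c
#include <stdint.h>

#define RX_BUF_SIZE 512u

/* Buggy variant: the result is truncated to 8 bits, so any offset
 * of 256 or more loses its upper bit. */
uint8_t advance_pos_u8(uint8_t pos, uint16_t received)
{
    return (uint8_t)((pos + received) % RX_BUF_SIZE);
}

/* Fixed variant: a 16-bit position covers the whole 512-byte buffer. */
uint16_t advance_pos_u16(uint16_t pos, uint16_t received)
{
    return (uint16_t)((pos + received) % RX_BUF_SIZE);
}
```

After the first half-transfer (256 bytes), `advance_pos_u8(0, 256)` yields 0 while `advance_pos_u16(0, 256)` yields 256. With a 256-byte buffer the half-buffer offset is 128, which still fits in a byte, which would explain why the smaller buffer worked.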

As a next step I wanted to use both TX and RX on each board (so full duplex communication meaning that both boards will send/receive 4kB each second). I do notice now however that bytes will get lost once full duplex comm. has started. As long as only one board is sending everything is being received properly by the other board but after enabling sending on the other board it goes wrong after a while...

The data that is being sent between the boards is a test pattern (an incrementing value from 0 to 255 that is continuously repeated). After a while one of the boards will report a failure on the received 4 kB block. When I look into the failed block it seems that a byte is missing, e.g. instead of ...7-8-9-10... the block contains ...7-8-10.... It is not clear yet whether this is caused by the receiver or the sender.

I already ran tests with a lower baud rate and can reproduce the problem there as well. Any idea what could be causing this? (I am now going to check for errors reported by the UART.) Please note that currently we are not using the FIFO of the UART.
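
A check for this kind of test pattern can report where the sequence breaks, which helps when correlating failures with buffer boundaries or Tx activity. This is a minimal sketch with a made-up function name, not the checker used in the original test.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first byte that breaks the incrementing
 * pattern (each byte equals the previous one plus 1, modulo 256),
 * or -1 if the whole block is intact. The starting value of the
 * block is not checked, only the byte-to-byte progression. */
long find_pattern_break(const uint8_t *block, size_t len)
{
    for (size_t i = 1; i < len; i++) {
        if (block[i] != (uint8_t)(block[i - 1] + 1u)) {
            return (long)i;
        }
    }
    return -1;
}
```

For the example in the post, a block containing 7, 8, 10, 11 returns index 2, the position where 9 was expected. Logging that index for every failed block shows whether the breaks cluster at particular offsets.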

GStee.2
Associate II

After some more testing I noticed that the issue only occurs when tx and rx are happening at the same time. When there is no overlap between tx and rx the test can run without any issues. When tx and rx overlap (even for a very short period of time) the issue will pop up at some point.

IIvan.22
Associate III

We observe similar behavior on the STM32H743. The MCU freezes solid when both TX and RX are configured to be served by DMA, even if only one of them is actually ongoing. All buffer sizes are affected, but it happens more often above 256 bytes. We observe this regardless of the state of bit 20 (mentioned in the errata) in the DMA stream CR register. Only the Reset pin works after that...

FBL
ST Employee

Hello @GStee.2​ ,

Please make sure that you are not facing the limitation "DMA stream could be locked when transferring data to/from USART/UART" documented in the errata sheet ES0396.

Hello F.Belaid,

The errata and the reference manual say to set bit 20 in all cases, but the HAL does so only for selected silicon revisions. Which one is correct? Are there any other settings that should be used as a workaround? In our case, on the STM32H743, bit 20 does not help.
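
For reference, setting bit 20 by hand, independently of what the HAL decides, might look like the sketch below. The macro and function names are hypothetical. The sketch assumes that bit 20 (which, as far as I can tell, the reference manual names TRBUFF) should only be changed while the stream is disabled, like the other configuration bits, and it takes a plain pointer so that it compiles without the device header.

```c
#include <stdint.h>

#define DMA_CR_EN    (1u << 0)   /* stream enable bit */
#define DMA_CR_BIT20 (1u << 20)  /* the bit discussed in the errata */

/* Sets bit 20 in a DMA stream CR register.
 * ASSUMPTION: the bit must only be written while the stream is
 * disabled, so the function refuses to touch an enabled stream.
 * Returns 1 if the bit was set, 0 if the stream was enabled. */
int dma_stream_set_bit20(volatile uint32_t *cr)
{
    if (*cr & DMA_CR_EN) {
        return 0;
    }
    *cr |= DMA_CR_BIT20;
    return 1;
}
```

On the target, `cr` would be the address of the CR register of the stream in question, taken from the device header, and the call would go after the HAL has initialised the stream but before the transfer is started.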