STM32H743 CANBus TX Timing

joh06937 · 2023-10-06

Hi all, I have what's probably a niche problem, but (seemingly) a problem nonetheless. I'm making a CANBus application that I'd like to use DAR (Disable Auto-Retransmit) mode for. I'm running into an issue where, after a certain amount of time as the only node on the bus (so each TX ends in an ACK error), the CAN peripheral stops handling TX requests made through FDCAN_TXBAR (the last-queued message never goes out on the wire) and no further IRQs occur.

I'm using just a single TX buffer (no queue or FIFO, just the one buffer). I've been probing the bus using a Saleae logic analyzer, and I've found an interesting thing about the timing of the CAN peripheral. A message will go out, the ACK error will occur near the end, and the peripheral will inject its own error frame: 6 dominant or recessive bits in a row (dominant before the node goes error passive, recessive after), followed by 8 error delimiter bits and 3 IFS bits. All of that is expected. However, somewhat unexpectedly, when I toggle a GPIO at the beginning of the CAN IRQ to debug the timing, I notice the interrupt fires only a few bits into the error frame's initial 6 bits (almost immediately after the missing ACK).
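To put numbers on that window, the bit counts above (6 error flag + 8 error delimiter + 3 IFS = 17 bit times) can be turned into a duration. This is only a rough sketch: the function name and the example bitrates are assumptions and do not come from the post, which never states the bus speed.

```c
#include <stdint.h>

/* Bit counts as described in the post for a lone node after an ACK error. */
#define CAN_ERROR_FLAG_BITS      6u
#define CAN_ERROR_DELIMITER_BITS 8u
#define CAN_IFS_BITS             3u

/* Hypothetical helper: time in microseconds (rounded up) from the start of
 * the error flag until the IFS has elapsed, at the given nominal bitrate.
 * Returns 0 for a bitrate of 0 to avoid dividing by zero. */
uint32_t can_error_recovery_us(uint32_t bitrate_bps)
{
    const uint64_t bits = CAN_ERROR_FLAG_BITS
                        + CAN_ERROR_DELIMITER_BITS
                        + CAN_IFS_BITS;

    if (bitrate_bps == 0u) {
        return 0u;
    }
    /* Ceiling division so the result never undershoots the real window. */
    return (uint32_t)((bits * 1000000u + bitrate_bps - 1u) / bitrate_bps);
}
```

At an assumed 500 kbit/s this gives 34 µs, so an IRQ that fires a few bits into the error flag leaves roughly 30 µs during which the bus is still busy with the error frame.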

This doesn't necessarily spell any doom yet, but I've noticed that if I let my application immediately send another message before the error delimiter and IFS bits have elapsed, I can get the CAN peripheral to go into the state I mentioned, where it stops sending out messages queued and requested using FDCAN_TXBAR. If I wait, say, 1 millisecond before setting the bit in FDCAN_TXBAR, I can send thousands of messages in a row without any issues. After those thousands of requests, if I let my application randomly _not_ wait the 1-millisecond delay, the CAN peripheral goes into its not-sending state.
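The delay workaround described above could be structured as a small hold-off guard instead of a blocking wait. The sketch below is an assumption-heavy illustration: `tx_holdoff_t`, both function names, and the microsecond timestamp source are hypothetical, and the guard itself does not touch FDCAN_TXBAR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hold-off state; timestamps come from any free-running
 * microsecond counter the application already has (an assumption). */
typedef struct {
    uint32_t last_error_us; /* time captured at the start of the error IRQ */
    uint32_t holdoff_us;    /* minimum wait before the next TX request */
    bool     error_seen;    /* false until the first error is recorded */
} tx_holdoff_t;

/* Call from the CAN IRQ when a transmit error is reported. */
void tx_holdoff_note_error(tx_holdoff_t *h, uint32_t now_us)
{
    h->last_error_us = now_us;
    h->error_seen = true;
}

/* Returns true once it should be safe to set the bit in FDCAN_TXBAR.
 * Unsigned subtraction keeps the comparison correct across counter wrap. */
bool tx_holdoff_may_request(const tx_holdoff_t *h, uint32_t now_us)
{
    if (!h->error_seen) {
        return true;
    }
    return (uint32_t)(now_us - h->last_error_us) >= h->holdoff_us;
}
```

With `holdoff_us` set to 1000 this mirrors the 1-millisecond wait from the post; a shorter value derived from the bit timing might also work, but that has not been verified here.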

Is this a known thing with the STM32H7 (or broader) CAN peripheral? Is there something somewhere that says I have to manually wait for those bits to elapse before performing a subsequent FDCAN_TXBAR request? Or maybe a configuration somewhere that I missed that controls the timing of when the IRQ fires?

Thanks!

joh06937 · 2023-10-06

Rereading this, I should clarify that when I do make the FDCAN_TXBAR request before the error delimiter and IFS have elapsed, it will sometimes successfully send the next message (maybe 6-12 times in a row) before it hits the unresponsive state (other times it fails on the first one). The CAN peripheral doesn't violate the CAN standard by skipping the delimiter and IFS; it does wait before sending the next-requested message. It just seems like the peripheral itself gets into a weird state if I request another transmission too soon.

I'm also going to do an experiment with a second buffer, as I could see an issue with writing a new message to the single buffer if the CAN peripheral is still doing things to it after the ACK failure (perhaps it doesn't save off a copy and it does some end-of-transmission work with the buffer?).

joh06937 · 2023-10-06

Very interestingly, it seems that:

1) If I only re-request the single buffer to be transmitted without writing new data to it and without any delays in IRQ handling, that doesn't solve the issue. It doesn't seem related to altering the buffer's contents.

2) If I add a second buffer and request sending the other buffer without any delays in IRQ handling (i.e. I never request the buffer whose error delimiter and IFS are still in progress), then the problem *does* go away. So this seems to come down to requesting a buffer to be sent when the CAN peripheral likely still thinks it's in the middle of sending that buffer.

I don't think there's really any "solution" to this in the end. I have a few different ways I can work around this (buffer swapping being the easiest). But maybe someday this post will be useful to someone else.
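For anyone landing here later, the buffer-swapping workaround can be sketched roughly as follows. This is not the code from my application: `tx_pingpong_t` and `tx_pingpong_request` are made-up names, it assumes two dedicated TX buffers are configured, and the register is passed in as a pointer so the same logic can be exercised on a host with a plain variable standing in for FDCAN_TXBAR.

```c
#include <stdint.h>

#define TX_BUFFER_COUNT 2u /* assumption: two dedicated TX buffers are set up */

/* Hypothetical ping-pong state. On target, txbar would point at the FDCAN
 * TXBAR register; on a host it can point at an ordinary variable. */
typedef struct {
    volatile uint32_t *txbar;
    uint32_t next; /* index of the buffer to request next */
} tx_pingpong_t;

/* Requests transmission of the next buffer and flips to the other one, so
 * the buffer that just failed is never re-requested back to back.
 * The caller must have written the frame into buffer p->next beforehand.
 * Returns the index of the buffer that was requested. */
uint32_t tx_pingpong_request(tx_pingpong_t *p)
{
    uint32_t idx = p->next;

    /* TXBAR is write-1-to-request; zero bits leave other buffers untouched. */
    *p->txbar = (uint32_t)1u << idx;

    p->next = (idx + 1u) % TX_BUFFER_COUNT;
    return idx;
}
```

Calling it from the IRQ path alternates requests between buffer 0 and buffer 1, which matches case 2) above where the problem went away.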

Karl Yamashita · 2023-10-06

You haven't shown any code so we don't know how you're handling the CAN transmit.
