2025-12-03 4:48 AM - last edited on 2025-12-03 8:27 AM by mƎALLEm
Hello,
We are facing an intermittent CAN-FD issue in the field and would appreciate guidance from the community.
Our system has exactly two devices on the bus, using a request–response architecture. The master sends a request every 30 ms and the slave responds. This setup is deployed in hundreds of units running continuously (24 hours a day). Of these, around 3 to 5 units per day show acknowledgement errors, which we track through the CAN protocol error counters.
The behaviour is unusual:
• The issue appears randomly on any unit.
• No bus-off condition is ever reported.
• Despite no bus-off, communication between the two nodes temporarily stops.
• Communication recovers automatically after a few seconds without any intervention.
Initially, we suspected a physical wiring problem. We re-checked all connectors and even secured them with glue. The bus has 120-ohm termination at both ends. However, the issue still appears randomly.
Below are the system details:
Microcontroller: STM32G0B1CBT6
Baudrate: 125 kbps
CAN bus length: ~100 cm
Termination: 120 ohms at both ends
FD-CAN Core Clock: 50 MHz
ClockDivider: 1
Bitrate Switching: Disabled
AutoRetransmission: Disabled
TransmitPause: Disabled
ProtocolException: Disabled
Nominal Bit Timing:
• Prescaler = 10
• SyncJumpWidth = 8
• TimeSeg1 = 31
• TimeSeg2 = 8
Data Bit Timing (BRS disabled, same as nominal):
• Prescaler = 10
• SyncJumpWidth = 8
• TimeSeg1 = 31
• TimeSeg2 = 8
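As a cross-check of the values above, here is a minimal sketch of how the bit rate and sample point follow from the kernel clock and the segment lengths. The struct and function names are hypothetical helpers for illustration, not part of the STM32 HAL; the only inputs are the numbers posted above.

```c
/* Nominal bit timing, all segment lengths in time quanta (tq). */
typedef struct {
    unsigned prescaler; /* kernel clock divider */
    unsigned sjw;       /* synchronization jump width */
    unsigned tseg1;     /* propagation + phase segment 1 */
    unsigned tseg2;     /* phase segment 2 */
} can_bit_timing;

/* tq per bit: 1 sync tq + TimeSeg1 + TimeSeg2. */
static unsigned tq_per_bit(const can_bit_timing *t)
{
    return 1u + t->tseg1 + t->tseg2;
}

/* Bit rate in bit/s for a given FDCAN kernel clock in Hz. */
static double bit_rate(unsigned long clk_hz, const can_bit_timing *t)
{
    return (double)clk_hz / ((double)t->prescaler * (double)tq_per_bit(t));
}

/* Sample point as a percentage of the bit time. */
static double sample_point_pct(const can_bit_timing *t)
{
    return 100.0 * (double)(1u + t->tseg1) / (double)tq_per_bit(t);
}
```

With the posted settings (50 MHz, Prescaler = 10, TimeSeg1 = 31, TimeSeg2 = 8) this gives 40 tq per bit, 125000 bit/s, and a sample point at 80 %.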
Filters:
• StdFiltersNbr = 1
• ExtFiltersNbr = 0
Physical wiring check
• Verified connector seating and cable condition
• Applied glue to prevent vibration-related disconnection
• Confirmed correct 120-ohm termination at both ends
Error counter monitoring
• ACK errors observed in protocol error counters
• No error-warning, error-passive, or bus-off states reported
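For reference, a small sketch of how the "last error code" (LEC) field could be decoded and tallied when logging these events. The numeric codes follow the Bosch M_CAN protocol status register, on which the STM32 FDCAN is based; on target the value would typically come from the `LastErrorCode` field filled in by `HAL_FDCAN_GetProtocolStatus()`. The `lec_tally` type and helper names are hypothetical and only for illustration.

```c
#include <stdint.h>

/* LEC values as defined for the M_CAN protocol status register. */
enum {
    LEC_NONE = 0, LEC_STUFF = 1, LEC_FORM = 2, LEC_ACK = 3,
    LEC_BIT1 = 4, LEC_BIT0 = 5, LEC_CRC = 6, LEC_NO_CHANGE = 7,
    LEC_COUNT = 8
};

/* Hypothetical per-unit tally: one counter per LEC value. */
typedef struct {
    uint32_t count[LEC_COUNT];
} lec_tally;

/* Human-readable name for a LEC value. */
static const char *lec_name(uint32_t lec)
{
    static const char *const names[LEC_COUNT] = {
        "no error", "stuff error", "form error", "ACK error",
        "bit1 error", "bit0 error", "CRC error", "no change"
    };
    return (lec < LEC_COUNT) ? names[lec] : "invalid";
}

/* Count real errors only; "no error", "no change" and invalid values are skipped. */
static void lec_record(lec_tally *t, uint32_t lec)
{
    if (lec == LEC_NONE || lec >= LEC_NO_CHANGE) {
        return;
    }
    t->count[lec]++;
}
```

Because the hardware resets the field to "no change" (7) when the status register is read, polling it repeatedly will mostly return 7, so the sketch ignores that value instead of counting it.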
Timing verification
• Checked nominal bit timing settings
• Ensured both nodes use identical configurations
• Bitrate switching is disabled on both sides
Bus recovery logic
• Bus-off recovery is implemented
• Never triggered during these events
Environmental factors
• Units run 24×7
• Errors occur randomly across different devices and locations
Master Device
↕ (approx. 100 cm cable)
Slave Device
Termination resistors (120 ohms) are present at both ends. No other nodes are connected.
Any insights or suggestions on what could cause intermittent ACK errors without bus-off would be greatly appreciated.
2025-12-05 2:25 AM
@TSola.1 wrote:
1. Clock Source
The nodes are running from the internal RC oscillator. Unfortunately, the hardware design does not include a crystal oscillator, so we cannot switch to an external clock source.
Unfortunately, the crystal is something crucial for CAN communication.
+ Read also this article: CAN reception issues: Reasons and general troubleshooting
2025-12-05 2:48 AM
I overlooked that.
The internal RC oscillator is not suited for most communication protocols, especially when the ambient temperature is not constant.
2025-12-05 3:47 AM
Okay, "HSI", bad idea...
Anyway, given the low bit rate, I would not give up yet.
Your settings:
Nominal Bit Timing:
Prescaler = 10
SyncJumpWidth = 8
TimeSeg1 = 31
TimeSeg2 = 8
1) reduce the prescaler so there's more room for fine tuning the segments
2) I think SyncJumpWidth must be smaller than TimeSeg2 (SJW is used to "lengthen" Tseg1 for sync purposes), so the maximum should be SJWmax = Tseg2 - 1
3) ... but use the maximum SJW!
4) check your HSI frequency and check for the STM's HSI calibration features (if any - never used that, always using external clock)
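To put numbers on points 1) to 4), here is a rough sketch that derives a timing set for a chosen prescaler and estimates the tolerable oscillator deviation. It uses the two classical CAN tolerance conditions, df <= SJW / (20 * NBT) and df <= min(PS1, PS2) / (2 * (13 * NBT - PS2)), as an approximation. It assumes phase segment 1 is at least as long as TimeSeg2, and that the FDCAN nominal limits are TimeSeg1 <= 256 and TimeSeg2 <= 128 (to be checked against the reference manual). All names are hypothetical.

```c
typedef struct {
    unsigned prescaler, tseg1, tseg2, sjw; /* segment lengths in tq */
} timing;

/* Estimated tolerable oscillator deviation as a fraction (0.01 = 1 %). */
static double osc_tolerance(const timing *t)
{
    double nbt = 1.0 + t->tseg1 + t->tseg2;              /* tq per bit */
    double resync = t->sjw / (20.0 * nbt);               /* resynchronization limit */
    double phase = t->tseg2 / (2.0 * (13.0 * nbt - t->tseg2)); /* assumes PS1 >= PS2 */
    return (resync < phase) ? resync : phase;
}

/*
 * Fill *out for the given prescaler and sample point (in permille).
 * sjw_margin is subtracted from TimeSeg2 to get SJW (0, or 1 for the
 * "SJW = Tseg2 - 1" rule suggested above).
 * Returns 1 on success, 0 if the clock does not divide evenly or a
 * segment falls outside the assumed limits.
 */
static int timing_for_prescaler(unsigned long clk_hz, unsigned long bitrate,
                                unsigned prescaler, unsigned sp_permille,
                                unsigned sjw_margin, timing *out)
{
    unsigned long long div = (unsigned long long)prescaler * bitrate;
    if (div == 0 || clk_hz % div != 0) {
        return 0;
    }
    unsigned long nbt = (unsigned long)(clk_hz / div);
    unsigned long before_sample = (nbt * sp_permille + 500) / 1000;
    if (before_sample < 2 || before_sample >= nbt) {
        return 0;
    }
    unsigned long tseg2 = nbt - before_sample;
    unsigned long tseg1 = before_sample - 1; /* minus the sync tq */
    if (tseg1 > 256 || tseg2 > 128 || tseg2 <= sjw_margin) {
        return 0;
    }
    out->prescaler = prescaler;
    out->tseg1 = (unsigned)tseg1;
    out->tseg2 = (unsigned)tseg2;
    out->sjw = (unsigned)(tseg2 - sjw_margin);
    return 1;
}
```

Under these assumptions, the posted setting (Prescaler = 10, 31/8, SJW = 8) tolerates roughly 0.78 % total deviation, and a prescaler of 5 with the same 80 % sample point yields 63/16 with the same ratio. A smaller prescaler therefore gives finer adjustment steps but does not by itself widen the tolerance; the result should be compared with the HSI accuracy over temperature from the datasheet.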
2025-12-05 4:25 AM - last edited on 2025-12-05 4:43 AM by mƎALLEm
Post edited by ST moderator to be in line with the community rules for code sharing. Next time, please use the </> button to paste your code or linker script content. Please read this post: How to insert source code.
1. Clock Configuration of STM32G0B1CBT6 with baudrate 12.5 kbps:
hfdcan1.Instance = FDCAN1;
hfdcan1.Init.ClockDivider = FDCAN_CLOCK_DIV1;
hfdcan1.Init.FrameFormat = FDCAN_FRAME_FD_NO_BRS;
hfdcan1.Init.Mode = FDCAN_MODE_NORMAL;
hfdcan1.Init.AutoRetransmission = DISABLE;
hfdcan1.Init.TransmitPause = DISABLE;
hfdcan1.Init.ProtocolException = DISABLE;
hfdcan1.Init.NominalPrescaler = 10;
hfdcan1.Init.NominalSyncJumpWidth = 2;
hfdcan1.Init.NominalTimeSeg1 = 15;
hfdcan1.Init.NominalTimeSeg2 = 5;
hfdcan1.Init.DataPrescaler = 10;
hfdcan1.Init.DataSyncJumpWidth = 2;
hfdcan1.Init.DataTimeSeg1 = 15;
hfdcan1.Init.DataTimeSeg2 = 5;
hfdcan1.Init.StdFiltersNbr = 1;
hfdcan1.Init.ExtFiltersNbr = 0;
hfdcan1.Init.TxFifoQueueMode = FDCAN_TX_FIFO_OPERATION;
2. Current Configuration of STM32H743 with baudrate 12.5 kbps:
hfdcan1.Instance = FDCAN1;
hfdcan1.Init.FrameFormat = FDCAN_FRAME_FD_NO_BRS;
hfdcan1.Init.Mode = FDCAN_MODE_NORMAL;
hfdcan1.Init.AutoRetransmission = DISABLE;
hfdcan1.Init.TransmitPause = DISABLE;
hfdcan1.Init.ProtocolException = DISABLE;
hfdcan1.Init.NominalPrescaler = 20;
hfdcan1.Init.NominalSyncJumpWidth = 2;
hfdcan1.Init.NominalTimeSeg1 = 15;
hfdcan1.Init.NominalTimeSeg2 = 5;
hfdcan1.Init.DataPrescaler = 20;
hfdcan1.Init.DataSyncJumpWidth = 2;
hfdcan1.Init.DataTimeSeg1 = 15;
hfdcan1.Init.DataTimeSeg2 = 5;
hfdcan1.Init.MessageRAMOffset = 0;
hfdcan1.Init.StdFiltersNbr = 1;
hfdcan1.Init.ExtFiltersNbr = 0;
hfdcan1.Init.RxFifo0ElmtsNbr = 1;
hfdcan1.Init.RxFifo0ElmtSize = FDCAN_DATA_BYTES_12;
hfdcan1.Init.RxFifo1ElmtsNbr = 1;
hfdcan1.Init.RxFifo1ElmtSize = FDCAN_DATA_BYTES_12;
hfdcan1.Init.RxBuffersNbr = 0;
hfdcan1.Init.RxBufferSize = FDCAN_DATA_BYTES_12;
hfdcan1.Init.TxEventsNbr = 1;
hfdcan1.Init.TxBuffersNbr = 1;
hfdcan1.Init.TxFifoQueueElmtsNbr = 1;
hfdcan1.Init.TxFifoQueueMode = FDCAN_TX_FIFO_OPERATION;
hfdcan1.Init.TxElmtSize = FDCAN_DATA_BYTES_8;
I will try reducing the prescaler and using the HSI calibration feature, if available.
2025-12-05 4:53 AM - edited 2025-12-05 4:54 AM
Even though I still insist on the usage of a crystal, please post some photos of your CAN network set-up: the different nodes with the CAN bus (the wiring).