CAN – Message Conflict Detection

Mark Edwards
Associate II
Posted on May 13, 2015 at 23:25

I am building a CAN Analyser using the STM32F429.

At present I am only using one of the CAN channels, and I have found that if there is a collision during transmission when the NART bit is set, the MPU continues as if no problem occurred despite the message not getting sent. With the NART bit cleared, the message is sent, but I still don't see any indication that a problem occurred.

So my question is: is there something I am missing?

At the moment the only way around the problem that I can see is to connect the CAN1 and CAN2 channels together, transmit on one channel, and monitor the received data on the second to ensure that the messages I send are actually sent.

I have designed in the capability to monitor the CAN TX/RX lines using EXTI, so I could examine the packet structure to look for collisions, but I feel that transmission errors should be visible in the CAN registers and I can't see where.

Schematic Attached.

#can-collisions
5 REPLIES
jpeacock
Associate II
Posted on May 14, 2015 at 14:09

A collision is normal and frequent, so it isn't considered to be a fault.  As soon as a bit difference occurs in the header, the colliding transmitter stops, waits, and retries (that's the automatic retransmission that setting NART disables).  The more nodes on the line and the more traffic being sent, the more likely a collision will occur.  Some CAN protocols have mechanisms to minimize this by setting time slices for each node to send periodic traffic.
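To illustrate the "bit difference in the header" point, here is a minimal sketch of bitwise arbitration between two standard 11-bit identifiers. It assumes the usual wired-AND bus model (dominant = 0, recessive = 1); the function name `can_arbitrate` and its return convention are made up for this example and are not part of any ST library.

```c
#include <stdint.h>

#define CAN_STD_ID_BITS 11

/*
 * Simulate arbitration between two nodes sending 11-bit IDs, MSB first.
 * The bus behaves as a wired-AND: a dominant 0 overrides a recessive 1.
 * A node that sends recessive but reads back dominant has lost and stops.
 *
 * Returns 0 if node A wins, 1 if node B wins, -1 if the IDs are identical
 * (no loser within the ID field). *lost_at_bit receives the ID bit index
 * (10 = MSB) at which the loser backed off, or -1 if there was no loser.
 */
int can_arbitrate(uint16_t id_a, uint16_t id_b, int *lost_at_bit)
{
    for (int bit = CAN_STD_ID_BITS - 1; bit >= 0; bit--) {
        unsigned a = (id_a >> bit) & 1u;
        unsigned b = (id_b >> bit) & 1u;
        unsigned bus = a & b;          /* wired-AND of both drivers */

        if (a != bus) {                /* A sent recessive, saw dominant */
            *lost_at_bit = bit;
            return 1;
        }
        if (b != bus) {                /* B sent recessive, saw dominant */
            *lost_at_bit = bit;
            return 0;
        }
    }
    *lost_at_bit = -1;
    return -1;
}
```

With IDs 0x123 and 0x124 the two frames are identical on the wire down to ID bit 2, where 0x124 sends recessive and backs off. Up to that point an observer on the bus sees a single clean frame, which is why the collision is hard to spot digitally.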

For a bus analyzer to catch this you'd have to detect a high bit for transmitter A being overridden a (very) short time later with a low bit from transmitter B.  Even then there's no guarantee you'd catch all cases since the high bit may start at the same time as the low bit, or slightly later but before the collision is detected by the second transmitter.  All you'd see is a very small amount of noise on the low bit.

I don't see a way to detect collisions digitally on the bus itself.  It might be possible by looking at the analog signal characteristics on the wire pair.  For a single node you can compare the CAN controller TX out (before the transceiver) to what's on the bus, adjusting for phase shift through the transceiver.

  Jack Peacock

Mark Edwards
Associate II
Posted on May 15, 2015 at 20:50

Thanks Jack. It’s nice to get some confirmation that I am not missing the obvious.

My analyser is designed to work on a specific application (lighting control) which uses a (coarse) non-standard implementation of the CAN standard.

I hadn’t even considered looking for slight changes in the CAN signal level but it’s something to bear in mind. My intention was to use the signal levels to profile devices attached to the network (so as to identify which devices are TX’ing data onto the network), but that’s certainly something to look out for.

So, I am going to have to drill down into the CAN data to see what is going on in the event of a missed message. But that raises the question of how to differentiate a stuffed bit from a data bit.

I am hoping that at a coarse level I can use timestamps on the EXTI interrupts for the signal transitions (or, at a finer level, direct monitoring of the CAN RX signal), although that assumes the latency between the core and the IO lines is somewhere near constant.
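As a rough sketch of the timestamp idea, the time between two consecutive edges can be rounded to a whole number of bit times to recover the run of identical bits in between. The function name `edges_to_bits`, the tick units, and the parameters are all assumptions for illustration; the timestamps are taken to come from a free-running 32-bit timer captured in the edge interrupt.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Expand edge timestamps into one sample per nominal bit time.
 *   edge_ticks    timer value captured at each edge (32-bit, may wrap)
 *   ticks_per_bit timer ticks in one nominal CAN bit time
 *   first_level   bus level after the first edge (0 after the SOF edge)
 * Returns the number of bits written to out, or -1 if out is too small
 * or ticks_per_bit is zero. The run after the last edge is not emitted
 * because its length is unknown without a closing edge.
 */
int edges_to_bits(const uint32_t *edge_ticks, size_t n_edges,
                  uint32_t ticks_per_bit, uint8_t first_level,
                  uint8_t *out, size_t out_cap)
{
    size_t n_out = 0;
    uint8_t level = first_level & 1u;

    if (ticks_per_bit == 0)
        return -1;

    for (size_t i = 0; i + 1 < n_edges; i++) {
        /* Unsigned subtraction stays correct across one timer wrap. */
        uint32_t delta = edge_ticks[i + 1] - edge_ticks[i];
        /* Round to the nearest whole bit to absorb latency jitter. */
        uint32_t n_bits = (delta + ticks_per_bit / 2u) / ticks_per_bit;

        for (uint32_t k = 0; k < n_bits; k++) {
            if (n_out >= out_cap)
                return -1;
            out[n_out++] = level;
        }
        level = (uint8_t)(level ^ 1u);   /* every edge toggles the level */
    }
    return (int)n_out;
}
```

Rounding tolerates jitter of up to about half a bit time per run, so the constant-latency assumption only has to hold to that degree. Stuff bits guarantee an edge at least every ten bit times within the stuffed part of a frame, so the error does not accumulate.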

As the above is implemented in silicon there have to be some ‘simple’ rules, but I have yet to find a definitive description of the process.

I suppose it’s now time to get really intimate with the CAN spec.
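For reference, the stuffing rule in the CAN specification is short: after five consecutive bits of the same level the transmitter inserts one bit of the opposite level, and the receiver discards it. Six identical bits in a row within the stuffed region is a stuff error. A minimal de-stuffing sketch follows, assuming the input is already one sample per bit and covers only the stuffed part of the frame (SOF through the CRC sequence); `can_destuff` is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Remove stuff bits from a sampled bit stream (one byte per bit, 0 or 1).
 * Returns the number of data bits written to out,
 *   -1 on a stuff error (sixth identical bit where a stuff bit was due),
 *   -2 if out is too small.
 */
int can_destuff(const uint8_t *in, size_t n_in, uint8_t *out, size_t out_cap)
{
    size_t n_out = 0;
    unsigned run = 0;      /* length of the current run of equal bits */
    uint8_t last = 2;      /* sentinel: matches neither 0 nor 1 */

    for (size_t i = 0; i < n_in; i++) {
        uint8_t bit = in[i] & 1u;

        if (run == 5) {
            /* A stuff bit is due here and must be the opposite level. */
            if (bit == last)
                return -1;
            last = bit;    /* the stuff bit itself starts the next run */
            run = 1;
            continue;      /* drop it from the output */
        }

        if (bit == last) {
            run++;
        } else {
            last = bit;
            run = 1;
        }

        if (n_out >= out_cap)
            return -2;
        out[n_out++] = bit;
    }
    return (int)n_out;
}
```

Position in the frame matters as well: the CRC delimiter, ACK slot and end-of-frame field are not stuffed, so the de-stuffer has to stop once the CRC sequence has been consumed.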
jpeacock
Associate II
Posted on May 18, 2015 at 19:35

There really isn't anything like a "missed" message on CAN except for a mangled transmission, in which case you have a hardware fault.  The philosophy is object-oriented messages at a rate sufficient to keep process data current.  In lighting there's the mechanical pan and tilt to aim spots and dimming levels, all of which take place in a relatively slow time period.  Motors take time to move, and dimming can't go any faster than the zero cross frequency (for AC) or flicker rate (for DC studio backlot lighting).  So it's far easier to send redundant messages across the bus rather than try to detect a collision.

Take a look at CANopen and how it sets up nodes with a time slice.  That's ideal for lighting, with a very low likelihood of collisions, and even if one occurs the automatic retransmission (NART cleared) will quickly resend before the next node slice starts.  Coordinate all the nodes with a sync message from the bus master (the lighting console) so the time slices don't drift.

Messages set a state so if a node receives the same command several times it ignores the redundant ones.  Even programmed movement like sweeping a spot across the stage can be synchronized to vary speed between set points along the path, with the console sending sync messages at the fixed points.

  Jack Peacock (ex-programmer for Las Vegas Strip resort signs)
este00
Associate II
Posted on May 19, 2015 at 19:30

Jack has it right.

Consider sending periodic messages every X ms. You don't care if they all get through on their first try or not.

CAN has automatic re-send. So if you lose arbitration or there is a bus-off condition, the module will re-send that message until it gets through, or you overwrite it in the TX buffer. One-time send can be enabled, then you could watch for arbitration loss, but meh, just let it do its thing. Watch for bus-error or bus-off states in the hardware, send periodic messages, move along.
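On the STM32 bxCAN, the outcome of a one-time send (NART set) is reported per mailbox in the transmit status register, and bus-error and bus-off state in the error status register. The sketch below decodes raw register values for mailbox 0. The bit positions are written from memory of the bxCAN chapter of the reference manual and should be checked against it; the enum and function names are invented for this example.

```c
#include <stdint.h>

/* CAN_TSR, mailbox 0 (positions assumed from the reference manual) */
#define TSR_RQCP0 (1u << 0)   /* request completed */
#define TSR_TXOK0 (1u << 1)   /* transmission OK */
#define TSR_ALST0 (1u << 2)   /* arbitration lost */
#define TSR_TERR0 (1u << 3)   /* transmission error */

/* CAN_ESR (positions assumed from the reference manual) */
#define ESR_EWGF  (1u << 0)   /* error warning */
#define ESR_EPVF  (1u << 1)   /* error passive */
#define ESR_BOFF  (1u << 2)   /* bus-off */

typedef enum {
    TX_PENDING,      /* request not yet completed */
    TX_OK,           /* frame went out and was acknowledged */
    TX_ARB_LOST,     /* lost arbitration to another node */
    TX_ERROR,        /* error detected during transmission */
    TX_FAILED_OTHER  /* completed without TXOK, e.g. aborted */
} tx_result_t;

/* Classify the result of the last request on transmit mailbox 0. */
tx_result_t decode_tx_mailbox0(uint32_t tsr)
{
    if (!(tsr & TSR_RQCP0)) return TX_PENDING;
    if (tsr & TSR_TXOK0)    return TX_OK;
    if (tsr & TSR_ALST0)    return TX_ARB_LOST;
    if (tsr & TSR_TERR0)    return TX_ERROR;
    return TX_FAILED_OTHER;
}

/* Last error code: 0 none, 1 stuff, 2 form, 3 ACK,
 * 4 bit recessive, 5 bit dominant, 6 CRC, 7 set by software. */
unsigned esr_last_error_code(uint32_t esr) { return (esr >> 4) & 0x7u; }

/* Transmit and receive error counters. */
unsigned esr_tx_error_count(uint32_t esr) { return (esr >> 16) & 0xFFu; }
unsigned esr_rx_error_count(uint32_t esr) { return (esr >> 24) & 0xFFu; }

/* Non-zero when the controller is in the bus-off state. */
int esr_is_bus_off(uint32_t esr) { return (esr & ESR_BOFF) != 0; }
```

On target these would be called with `CAN1->TSR` and `CAN1->ESR` from the CMSIS device header, after which RQCP0 is cleared by writing 1 to it. With NART cleared the hardware keeps retrying, so the ALST and TERR flags reflect only the most recent attempt.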

Mark Edwards
Associate II
Posted on May 20, 2015 at 20:06

>Jack - Not theatrical lighting (DMX/RDM products take care of that), this is for networks used in established architectural/domestic products. Also not a hardware fault, more a historical wetware fault (poorly implemented CAN application where each device can act independently as Master and conflicting messages are a fact of life), hence a custom/application-specific tester.

>este – Some of the products do automatically re-send, others (older products/incorrectly configured development products) may not. When a user presses a button to make a light come on they expect it to work. 99.99% of the time it does. But as products are being added to existing networks the traffic can increase significantly (power usage, occupancy, daylight linking, Li-Fi, temperature, etc.), and my customer has now taken notice of the occasional problems. They need to get a handle on this, and as this isn’t exactly a standard implementation, an off-the-shelf analyser doesn’t give the detail they require. Hence my need to detect occasional conflicts, corrupted messages and ghost messages.

But still, thanks both for your input.