SPI MISO "broken" after a while
‎2024-09-21 12:25 PM
Hello,
I have a Nucleo-F439ZI as SPI master connected to four Nucleo-F030R8 SPI slaves.
Initially, the SPI communication worked perfectly. After a while, however, the master received only zeros on the MISO line, even though the slaves were sending data != 0 (verified with a logic analyzer).
I first suspected a bug in my code, a wrong SPI configuration, wiring issues and so on. But I soon noticed that SPI1_DR always reads zero, even when the MISO pin (PA6) is tied to 3.3V! The port configuration looks fine (alternate function 5, push-pull, no pull-up/pull-down). SCLK and MOSI also work perfectly.
So I assumed a broken MCU and replaced the master Nucleo board with a new one, and everything worked again. I ran a test, sending/receiving more than 100,000,000 SPI frames without a single error (verified by a SW CRC check on the protocol layer).
One day later, I powered up the setup again without any change to HW or SW, and now the second board/MCU was broken. Same issue: SPI1_DR reads only 0, no matter what the actual input is. I tried a third Nucleo board and it worked again.
I checked GPIOA_IDR.IDR6 and it correctly reflects the logic level applied to the pin, so it's not a broken port input. I also tried another port pin as MISO input (PB4 [AF5]), with the same result. It looks like the input path is broken somewhere after the port multiplexer.
Since the original SW is quite complex (HAL, SPI with DMA and complex logic), I created a small test project that configures the GPIOs and SPI manually and just sends/receives SPI data in a while-loop, polling the SPI1_SR.RXNE flag. Same behavior here: it works on a new Nucleo board and receives no data on the first two "broken" boards.
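For reference, a polled transfer of the kind described above could look roughly like the following minimal sketch. The names `spi_regs_t`, `SPI_SR_RXNE_BIT`, `SPI_SR_TXE_BIT` and `spi_transfer_polled` are hypothetical and only used for illustration; this is not the original test project. The struct mirrors the first four SPI registers (CR1, CR2, SR, DR) so the function can also be exercised against a plain struct on a host PC.

```c
#include <stdint.h>

/* Hypothetical minimal view of the SPI register block:
 * CR1 at offset 0x00, CR2 at 0x04, SR at 0x08, DR at 0x0C. */
typedef struct {
    volatile uint32_t CR1;
    volatile uint32_t CR2;
    volatile uint32_t SR;
    volatile uint32_t DR;
} spi_regs_t;

#define SPI_SR_RXNE_BIT (1u << 0) /* receive buffer not empty */
#define SPI_SR_TXE_BIT  (1u << 1) /* transmit buffer empty */

/* Send one byte and return the byte clocked in on MISO during the same frame.
 * Assumes the peripheral is already configured and enabled (SPE = 1). */
uint8_t spi_transfer_polled(spi_regs_t *spi, uint8_t tx)
{
    while ((spi->SR & SPI_SR_TXE_BIT) == 0u) {
        /* wait until the transmit buffer can take a new byte */
    }
    spi->DR = tx;

    while ((spi->SR & SPI_SR_RXNE_BIT) == 0u) {
        /* wait until a received byte is available */
    }
    return (uint8_t)spi->DR;
}
```

On the target one would normally use the CMSIS `SPI1` pointer and the `SPI_SR_RXNE` / `SPI_SR_TXE` definitions from the device header instead of these stand-ins.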
Also I could not find any errata that would explain this behavior.
I have to mention that I use a ~45cm ribbon cable to connect the SPI master to the slaves, which I know is not really state-of-the-art. However, the signals look clean on the oscilloscope and logic analyzer. The layout of the ribbon cable is: CS-GND-MOSI-GND-MISO-GND-SCLK-GND.
Another observation: when I change the SPI mode from 0 to 3 (CPOL=1, CPHA=1) and tie the MISO pin to either GND or 3.3V, SPI1_DR reads whatever the initial state of the pin is when the SW is started:
- MISO connected to GND => start SW => DR reads 0 => connect MISO to 3.3V => DR still reads 0
- MISO connected to 3.3V => start SW => DR reads 0xFF => connect MISO to GND => DR still reads 0xFF
("start SW" means pressing "Reset the chip and restart debug session" followed by "Resume" in CubeIDE)
In SPI mode 0, it always reads 0, regardless of the initial state of MISO.
The settings are:
- f_PCLK2 = 42MHz
- SPI Prescaler 32 => Baudrate = 1.3125MHz
- Frame Format = Motorola
- Data Size = 8 Bits
- CPOL = 0
- CPHA = 0
- CRC = Disabled
- NSS controlled by SW
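As a quick cross-check of the baud rate in the settings above, the SPI clock is the peripheral clock divided by 2^(BR+1), where BR is the 3-bit prescaler field in SPI1_CR1. A minimal sketch (the function name `spi_sck_hz` is made up for this example):

```c
#include <stdint.h>

/* SCK frequency for a given peripheral clock and 3-bit BR field.
 * BR = 0 selects f_PCLK/2, BR = 4 selects f_PCLK/32, BR = 7 selects f_PCLK/256. */
uint32_t spi_sck_hz(uint32_t pclk_hz, uint32_t br_field)
{
    return pclk_hz >> ((br_field & 0x7u) + 1u);
}
```

With f_PCLK2 = 42 MHz and BR = 4 (prescaler 32), `spi_sck_hz(42000000u, 4u)` gives 1312500 Hz, which matches the 1.3125 MHz stated above.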
GPIOA Config:
- MODER6 = 0x2
- OTYPER6 = 0x0
- OSPEEDR6 = 0x3
- PUPDR6 = 0x0
- AFRL6 = 0x5
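The per-pin values above correspond to bit fields inside the GPIOA registers: 2 bits per pin in MODER, OSPEEDR and PUPDR, and 4 bits per pin in AFRL. The sketch below shows one way to apply them with a read-modify-write, so that the settings of other pins (for example the debug pins PA13/PA14) are preserved. The helper names are hypothetical.

```c
#include <stdint.h>

/* Replace the 2-bit field of `pin` in a MODER/OSPEEDR/PUPDR-style register value. */
uint32_t gpio_set_field2(uint32_t reg, uint32_t pin, uint32_t val)
{
    uint32_t shift = (pin & 0xFu) * 2u;
    return (reg & ~(0x3u << shift)) | ((val & 0x3u) << shift);
}

/* Replace the 4-bit alternate-function field of pin 0..7 in an AFRL register value. */
uint32_t gpio_set_afrl(uint32_t reg, uint32_t pin, uint32_t af)
{
    uint32_t shift = (pin & 0x7u) * 4u;
    return (reg & ~(0xFu << shift)) | ((af & 0xFu) << shift);
}
```

For PA6, MODER6 = 0x2 sets bits [13:12] (0x00002000), OSPEEDR6 = 0x3 sets the same bit positions (0x00003000), and AFRL6 = 0x5 lands in bits [27:24] (0x05000000).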
SPI1_CR1 Config:
- BIDIMODE = 0x0
- BIDIOE = 0x0
- CRCEN = 0x0
- CRCNEXT = 0x0
- DFF = 0x0
- RXONLY = 0x0
- SSM = 0x1
- SSI = 0x1
- LSBFIRST = 0x0
- SPE = 0x1
- BR = 0x4
- MSTR = 0x1
- CPOL = 0x0
- CPHA = 0x0
SPI1_CR2 = 0x0, SPI1_I2SCFGR = 0x0, SPI1_I2SPR = 0x0
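To compare the SPI1_CR1 bit list above against what a debugger shows as a single register value, the fields can be packed according to the CR1 bit layout (BIDIMODE at bit 15 down to CPHA at bit 0, with BR in bits [5:3]). This is only an illustrative helper; the type `spi_cr1_fields_t` and the function `spi_cr1_pack` are not part of any library.

```c
#include <stdint.h>

/* One member per SPI_CR1 field, in register order from bit 15 down to bit 0. */
typedef struct {
    uint8_t bidimode, bidioe, crcen, crcnext, dff, rxonly;
    uint8_t ssm, ssi, lsbfirst, spe;
    uint8_t br; /* 3-bit baud rate prescaler field */
    uint8_t mstr, cpol, cpha;
} spi_cr1_fields_t;

/* Pack the individual fields into the 16-bit CR1 register value. */
uint16_t spi_cr1_pack(const spi_cr1_fields_t *f)
{
    return (uint16_t)(
        ((f->bidimode & 1u) << 15) |
        ((f->bidioe   & 1u) << 14) |
        ((f->crcen    & 1u) << 13) |
        ((f->crcnext  & 1u) << 12) |
        ((f->dff      & 1u) << 11) |
        ((f->rxonly   & 1u) << 10) |
        ((f->ssm      & 1u) << 9)  |
        ((f->ssi      & 1u) << 8)  |
        ((f->lsbfirst & 1u) << 7)  |
        ((f->spe      & 1u) << 6)  |
        ((f->br       & 7u) << 3)  |
        ((f->mstr     & 1u) << 2)  |
        ((f->cpol     & 1u) << 1)  |
        ((f->cpha     & 1u) << 0));
}
```

With the values listed above (SSM = SSI = SPE = MSTR = 1, BR = 4, everything else 0) the result is 0x0364; switching to mode 3 (CPOL = CPHA = 1) gives 0x0367.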
The board is powered by a stable laboratory power supply with 8V on VIN (JP1 = OFF, JP3 = 5-6).
The boards were purchased from Digi-Key, so they should not be counterfeit.
A similar issue was observed here:
https://www.mikrocontroller.net/topic/549031
(unfortunately in German, and with no solution other than replacing the MCU).
and here:
(but also here, the solution was just to replace the chip)
What could be the reason for the broken MISO input?
Is there any recommendation for external circuit to protect the input?
If it had happened only once, I would just replace the chip and not bother any further. But since it has happened twice in a week, I have to find the root cause and a solution.
Labels: SPI, STM32F4 Series
‎2024-09-22 08:28 PM
Sounds like you are the victim of LATCH-UP. You need to change the connection hardware design and/or the power-on sequence. You will just keep breaking them until you fix it...
Kind regards
Pedro
‎2024-09-22 10:08 PM
So it looks like only the "alternate function path" was damaged. I'm interested in how that can happen.
Anyway, whenever you connect anything from outside the board, always use at least some ESD protection.
For high-speed SPI you can use the low-capacitance TVS diodes used for USB or other "user interfaces".
Protect every pin, including the data lines!
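To illustrate why low capacitance matters for the protection parts, here is a rough sketch that estimates the 10-90% rise time of a line modeled as a single RC low-pass (t_r of about 2.2 * R * C) and compares it with the SPI bit period. The component values used in the note below (33 ohm series resistance, 50 pF total for cable, TVS and pin) are pure assumptions for illustration, not measurements from this setup, and the function names are made up.

```c
#include <stdint.h>

/* 10-90% rise time of a first-order RC low-pass, in nanoseconds.
 * ohm * pF gives picoseconds, so divide by 1000 to get ns. */
double rc_rise_time_ns(double r_ohm, double c_pf)
{
    return 2.2 * r_ohm * c_pf / 1000.0;
}

/* Duration of one SPI bit in nanoseconds for a given SCK frequency. */
double spi_bit_period_ns(double sck_hz)
{
    return 1.0e9 / sck_hz;
}
```

With the assumed 33 ohm and 50 pF the rise time comes out at roughly 3.6 ns, which is small compared with the roughly 762 ns bit period at 1.3125 MHz. At much higher SPI clocks the same capacitance would take up a noticeable share of the bit time.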
‎2024-09-22 11:25 PM
Does SCLK work properly on the damaged boards?
JW
‎2024-09-23 02:57 AM
Yes, if it were caused by external ESD, I would have expected the Schmitt trigger of the port to break, not something "deep inside the chip"...
I'll try to add some TVS, thanks!
‎2024-09-23 02:59 AM
SCLK (and also MOSI) still work properly on the damaged boards.
Also, the RXNE flag is set as expected; the only problem is that DR always reads 0x00.
‎2024-09-23 04:23 AM
> SCLK (and also MOSI) still work properly on the damaged boards
Is SCLK=PA5?
Can you try to move it to PB3? (there may be some solder bridge on the Nucleo board on that pin, as it's TRACESWO).
JW
‎2024-09-23 04:52 PM
That is definitely Latch-up.
@_Daniel_ you need to do some reading on how to design hardware to avoid Latch-up.
Good luck.
Kind regards
Pedro
‎2024-09-23 05:52 PM
Also, don't fall into the trap where you think that ESD and Latch-up are the same thing - they are NOT!
Kind regards
Pedro
‎2024-09-23 10:14 PM
As far as I understand, latch-up is a fault caused by overvoltage, possibly due to ESD.
@PGump.1 so how would latch-up protection differ from standard ESD protection with TVS diodes (plus, depending on the circuit, series resistors, inductors, common-mode chokes, ...)?