Stuck USB peripheral does not recover after a NRST...needs power cycle...

ViaAppia
Associate II

Hi STM32 community.

I'm working on an embedded system that uses an RK3399 CPU (running Linux, acting as the USB host) and an STM32F302CC (running FreeRTOS, acting as a USB device). Both sit on a custom motherboard and communicate with each other through virtual COM ports over USB (using this library: https://github.com/IntergatedCircuits/USBDevice).

This embedded system has been in use for several years, and thousands of units are operational. 99.99% of the time (or something like that) everything works perfectly. But every now and then, under some special circumstances that are not yet fully understood (MCU brownout, maybe?), some devices end up in a weird state where the MCU seems to be running but the USB device (i.e. the STM32F302CC) does not enumerate in Linux on the CPU side. This shows up in the kernel log as something like this:

[81228.001719] usb 7-1.4: new full-speed USB device number 40 using xhci-hcd
[81228.065694] usb 7-1.4: device descriptor read/64, error -32
[81228.244730] usb 7-1.4: device descriptor read/64, error -32
[81228.420638] usb 7-1.4: new full-speed USB device number 41 using xhci-hcd
[81228.484767] usb 7-1.4: device descriptor read/64, error -32
[81228.660677] usb 7-1.4: device descriptor read/64, error -32
[81228.772925] usb 7-1-port4: attempt power cycle
[81229.352634] usb 7-1.4: new full-speed USB device number 42 using xhci-hcd
[81229.352740] usb 7-1.4: Device not responding to setup address.
[81229.556702] usb 7-1.4: Device not responding to setup address.
[81229.764596] usb 7-1.4: device not accepting address 42, error -71
[81229.835597] usb 7-1.4: new full-speed USB device number 43 using xhci-hcd
[81229.835696] usb 7-1.4: Device not responding to setup address.
[81230.044687] usb 7-1.4: Device not responding to setup address.
[81230.252587] usb 7-1.4: device not accepting address 43, error -71
[81230.260076] usb 7-1-port4: unable to enumerate USB device
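
For reference when reading the log above: the two codes are Linux errno values, where -32 is EPIPE and -71 is EPROTO. As far as I understand the kernel's USB error-code documentation, EPIPE is reported when an endpoint stalls, and EPROTO covers protocol-level failures such as no response within the bus turnaround time. The C sketch below pulls the code out of a log line and maps it to that reading. The function names are made up for illustration, and the hint strings are a summary, not kernel output.

```c
#include <stdlib.h>
#include <string.h>

/* Extract the numeric code that follows "error " in a kernel log line.
 * Returns 0 when the line carries no error code. */
int usb_log_error_code(const char *line)
{
    const char *p = strstr(line, "error ");
    if (p == NULL)
        return 0;
    return (int)strtol(p + strlen("error "), NULL, 10);
}

/* Map the two codes seen in the log to their usual meaning in the Linux
 * USB core. The numbers are the Linux errno values, hard-coded so the
 * sketch behaves the same on any build host. */
const char *usb_error_hint(int code)
{
    switch (code) {
    case -32: return "EPIPE: endpoint stalled";
    case -71: return "EPROTO: protocol error or no response";
    default:  return "unknown";
    }
}
```

If that reading is right, the log shows two different failure modes: a stall on the first descriptor reads, then no usable response at all once the hub has power cycled the port.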

User-space code in Linux notices that the expected serial port (a virtual COM port over USB) never appears and eventually performs an NRST reset of the MCU, using CPU GPIO pins connected to the MCU NRST pin. The CPU repeats the reset as needed (with a wait of several seconds between resets, of course) until the COM port shows up, which in this case it never does. These resets definitely do happen in the MCU (this can be confirmed, for example, by following the debug UART from the MCU or by looking at the NRST line with an oscilloscope). However, the MCU reset does not fix the USB enumeration issue: USB still does not enumerate in Linux after the MCU has been reset. Doing the same reset manually by shorting the MCU NRST pin directly on the circuit board does not clear the USB enumeration issue either. Rebooting the CPU does not help. Power cycling the CPU does not help either. But physically cutting the power completely from the MCU (POR/PDR reset) does help, every single time. After a POR/PDR reset everything works again.

One would assume that an NRST reset would be enough for a stuck USB peripheral to fully recover, but that does not seem to be the case. I cannot find anything about this in the STM32F302xB/C errata sheet (ES0231 Rev 6).

This is what the RM0365 reference manual says about resets:

9.1.1 Power reset
A power reset is generated when one of the following events occurs:
1. Power-on/power-down reset (POR/PDR reset)
2. When exiting Standby mode
A power reset sets all registers to their reset values except the RTC domain (see Figure 8).

9.1.2 System reset
A system reset sets all registers to their reset values except the reset flags in the clock
controller CSR register and the registers in the RTC domain (see Figure 8).
A system reset is generated when one of the following events occurs:
1. A low level on the NRST pin (external reset)
2. Window watchdog event (WWDG reset)
3. Independent watchdog event (IWDG reset)
4. A software reset (SW reset) (see Software reset)
5. Low-power management reset (see Low-power management reset)
6. Option byte loader reset (see Option byte loader reset)
7. A power reset
The reset source can be identified by checking the reset flags in the Control/Status register,
RCC_CSR (see Section 9.4.10: Control/status register (RCC_CSR)).
These sources act on the NRST pin and it is always kept low during the delay phase. The
RESET service routine vector is fixed at address 0x0000_0004 in the memory map.
The system reset signal provided to the device is output on the NRST pin. The pulse
generator guarantees a minimum reset pulse duration of 20 μs for each internal reset
source. In case of an external reset, the reset pulse is generated while the NRST pin is
asserted low.
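
Since the manual says the reset source can be read back from RCC_CSR, logging those flags at every boot would show whether a unit went through a POR/PDR (possible brownout) or only saw NRST resets before getting stuck. Below is a minimal C sketch of a decoder. The bit positions are as I remember them for the STM32F3 family and must be checked against RM0365 Section 9.4.10; the macro and function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* RCC_CSR reset-flag bits (assumed layout, verify against RM0365 9.4.10). */
#define CSR_OBLRSTF  (1u << 25) /* option byte loader reset */
#define CSR_PINRSTF  (1u << 26) /* NRST pin reset */
#define CSR_PORRSTF  (1u << 27) /* POR/PDR reset */
#define CSR_SFTRSTF  (1u << 28) /* software reset */
#define CSR_IWDGRSTF (1u << 29) /* independent watchdog reset */
#define CSR_WWDGRSTF (1u << 30) /* window watchdog reset */
#define CSR_LPWRRSTF (1u << 31) /* low-power management reset */

struct reset_flag { uint32_t mask; const char *name; };

static const struct reset_flag reset_flags[] = {
    { CSR_PORRSTF,  "POR"  }, { CSR_PINRSTF,  "PIN"  },
    { CSR_SFTRSTF,  "SFT"  }, { CSR_IWDGRSTF, "IWDG" },
    { CSR_WWDGRSTF, "WWDG" }, { CSR_LPWRRSTF, "LPWR" },
    { CSR_OBLRSTF,  "OBL"  },
};

/* Write the names of all set reset flags into out, separated by spaces.
 * Returns the number of flags written; stops early if out is too small. */
int describe_reset_flags(uint32_t csr, char *out, size_t out_len)
{
    int count = 0;
    size_t used = 0;

    if (out_len == 0)
        return 0;
    out[0] = '\0';

    for (size_t i = 0; i < sizeof reset_flags / sizeof reset_flags[0]; i++) {
        if ((csr & reset_flags[i].mask) == 0)
            continue;
        int n = snprintf(out + used, out_len - used, "%s%s",
                         count ? " " : "", reset_flags[i].name);
        if (n < 0 || (size_t)n >= out_len - used)
            break; /* buffer full, output was truncated */
        used += (size_t)n;
        count++;
    }
    return count;
}
```

In firmware the input would be the RCC_CSR value read early in main(), printed on the debug UART, and the flags would then be cleared through the RMVF bit so that the next boot reports only its own reset source. Because every reset source also acts on the NRST pin, a power reset is expected to show PIN together with POR, while a reset issued by the CPU should show PIN alone.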

 

The USB peripheral is already initialized in the order described in the reference manual, and I have discussed possible improvements with ChatGPT. But everything it has suggested either breaks USB enumeration completely or, even worse, causes the whole MCU to hang. And anyway, the current code works 99.99% of the time and has done so for several years, so it must be somewhat correct.

So what I'm mainly wondering is whether there are any known issues where the USB peripheral inside the STM32F302CC can end up in such a bad state that only a power cycle will reset it properly. If so, could anyone elaborate on what might cause this (brownout during USB operation etc.), so that even if I cannot recover from it, I can at least try to minimize the risk of it ending up in this state to begin with?

Currently the CPU cannot cut the power from the MCU (but I can do it by hand on the circuit board), so "fixing" it that way would require a new HW revision, and in any case it would not help the many devices already in circulation.

Any relevant input which even remotely could be related to what I'm seeing is appreciated!

5 REPLIES
TDK
Super User

I'd recommend getting a USB packet viewer to see if the host or the device is the problem. If the STM32 is sending out bad packets, then you have somewhere to start.

Inserting a 500ms pause before initializing USB can be helpful.

FBL
ST Employee

Hi @ViaAppia 

Have you tried using the ST USB Device library? In fact, we don't support the provided stack on our ST Community.



Since this is an embedded system where the USB D+ and D- lines between the STM32 and the CPU are connected directly on the circuit board, there are no USB plugs or test pads that I could solder wires to. The only option would be to solder test wires directly to the MCU or CPU pins where they are mounted on the motherboard. But even that becomes very complicated and error-prone because of how the device is assembled mechanically.

In addition, a device might need to operate for weeks or months before this kind of USB error even occurs. When/if it occurs, it is too late to do any HW modifications, since the device would then need to be completely disassembled, which would cut power from the STM32, which would make it recover from the problem anyway (because an MCU power cycle helps).

I have, however, logged USB traffic on the Linux side (CPU) using tcpdump and usbmon on a unit where this problem occurred. After looking at the resulting pcap in Wireshark and also letting several AIs analyze it, the conclusion is that the STM32 USB interface simply does not answer the bRequest 6 (GET_DESCRIPTOR) sent by the host (i.e. by Linux running on the RK3399 CPU) at all. This also matches what I'm seeing in dmesg.
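
For anyone who wants to spot the same request in their own capture, here is a small C sketch that builds the 8-byte SETUP packet for the standard GET_DESCRIPTOR(Device) request, using the field layout from the USB 2.0 specification (chapter 9). The function names are made up for this example, and the 64-byte length is taken from the "read/64" wording in the kernel log rather than from the capture itself.

```c
#include <stdint.h>

/* Encode a USB control SETUP packet into its 8-byte wire form.
 * The 16-bit fields are transmitted little-endian. */
void usb_encode_setup(uint8_t out[8], uint8_t bmRequestType, uint8_t bRequest,
                      uint16_t wValue, uint16_t wIndex, uint16_t wLength)
{
    out[0] = bmRequestType;
    out[1] = bRequest;
    out[2] = (uint8_t)(wValue & 0xFFu);
    out[3] = (uint8_t)(wValue >> 8);
    out[4] = (uint8_t)(wIndex & 0xFFu);
    out[5] = (uint8_t)(wIndex >> 8);
    out[6] = (uint8_t)(wLength & 0xFFu);
    out[7] = (uint8_t)(wLength >> 8);
}

/* GET_DESCRIPTOR for the device descriptor:
 *   bmRequestType 0x80 = device-to-host, standard request, device recipient
 *   bRequest      0x06 = GET_DESCRIPTOR
 *   wValue        high byte 0x01 = descriptor type DEVICE, low byte = index 0
 *   wIndex        0 (no language ID for a device descriptor) */
void usb_encode_get_device_descriptor(uint8_t out[8], uint16_t length)
{
    usb_encode_setup(out, 0x80u, 0x06u, (uint16_t)(0x01u << 8), 0x0000u, length);
}
```

With a requested length of 64 the packet bytes are 80 06 00 01 00 00 40 00, which is the pattern to look for in the setup data of the capture.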

Resetting (NRST) the STM32 does not help, no matter how many resets are done. Power cycling the STM32 however magically fixes everything, every time.

Unfortunately the CPU can only do MCU resets (NRST pin); it cannot cut the power from the MCU. But I can cut the power manually on the circuit board once the device case is opened up.

No, this project has used the USBDevice library since day 1. The engineer who originally wrote the MCU code no longer works here, so I do not know why this library was chosen instead of the ST USB Device library.

At this point in the investigation, rewriting the USB part of the MCU project is not yet on the table, as we are still trying to figure out what exactly is happening when this occurs.

gbm
Principal

I suspect a USB connect/disconnect issue. Check whether the device is disconnected (USBDP line pull-up turned off) right after reset. It should be disconnected immediately after reset, then connected (USBDP pull-up enabled) after 20..50 ms. If the pull-up resistor is permanently connected, use the GPIO trick to disable the pull-up.
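
The ordering described above can be sketched in C as follows. This is only an outline under assumptions: the four hooks are hypothetical names, and on real hardware they would wrap whatever GPIO, delay and USB stack calls the project already has (on the STM32F302CC the D+ line is PA12, as far as I know). The trace stubs are included only so that the sequence can be checked on a PC.

```c
#include <stdint.h>

/* Board hooks, all hypothetical names for this sketch. */
struct usb_reconnect_ops {
    void (*dp_drive_low)(void *ctx);          /* D+ as GPIO output, driven low */
    void (*delay_ms)(void *ctx, uint32_t ms); /* blocking delay */
    void (*dp_release)(void *ctx);            /* give the pin back to the USB peripheral */
    void (*usb_start)(void *ctx);             /* initialize and start the USB stack */
    void *ctx;
};

/* Hold D+ low so the host sees a disconnect, then release it and start USB. */
void usb_forced_reconnect(const struct usb_reconnect_ops *ops, uint32_t hold_ms)
{
    ops->dp_drive_low(ops->ctx);
    ops->delay_ms(ops->ctx, hold_ms);
    ops->dp_release(ops->ctx);
    ops->usb_start(ops->ctx);
}

/* Host-side trace stubs: record each step as one letter. */
struct reconnect_trace {
    char steps[8];      /* zero-initialize so it stays a valid C string */
    unsigned count;
    uint32_t waited_ms;
};

static void trace_push(struct reconnect_trace *t, char step)
{
    if (t->count < sizeof t->steps - 1)
        t->steps[t->count++] = step;
}

static void trace_low(void *ctx)     { trace_push(ctx, 'L'); }
static void trace_release(void *ctx) { trace_push(ctx, 'R'); }
static void trace_start(void *ctx)   { trace_push(ctx, 'S'); }

static void trace_delay(void *ctx, uint32_t ms)
{
    struct reconnect_trace *t = ctx;
    t->waited_ms += ms;
    trace_push(t, 'D');
}

/* Build an ops table that writes into the given trace. */
struct usb_reconnect_ops usb_reconnect_trace_ops(struct reconnect_trace *t)
{
    struct usb_reconnect_ops ops = {
        trace_low, trace_delay, trace_release, trace_start, t
    };
    return ops;
}
```

On target, usb_forced_reconnect() would be called once at boot, before the USB stack is started, with a hold time in the 20..50 ms range mentioned above. Whether driving D+ low against the pull-up resistor is acceptable depends on the board design, so that needs to be checked against the schematic first.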

My STM32 stuff on github - compact USB device stack and more: https://github.com/gbm-ii/gbmUSBdevice