2025-11-24 10:40 AM
I have a situation where a USB endpoint stops working due to a control packet being NAKed in the handshake phase. I've traced the problem down to the STM USB stack setting DCTL.SGONAK = 1 in response to an INCOMPISOOUT event, yet in the case when it fails, no global OUT NAK status notification appears in the RxFIFO and the GINTSTS.GONAKEFF interrupt status never occurs. However, the DCTL.GONSTS bit does change from 0 to 1 after SGONAK is set.
Yet, according to the reference manual for the STM32H757 (RM0399 Rev 4, page 2908):
It appears that a SETUP packet arrives immediately after the INCOMPISOOUT so the top of the RxFIFO contains a "setup data packet received" notification instead of a "global OUT NAK" status notification. Yet, no "global OUT NAK" is ever popped from the RxFIFO in subsequent interrupts and because no GONAKEFF interrupt occurs, the global OUT NAK status (reflected in DCTL.GONSTS) is never cleared and all OUT packets (including those involved in the handshake phase of control packets) are NAKed and USB communication breaks down.
Despite what the reference manual says, an AI search claims that setting DCTL.SGONAK = 1 does NOT insert a "global OUT NAK" status notification in the RxFIFO but should cause a GONAKEFF interrupt (provided the core was not already in the global OUT NAK state, but I already confirmed that DCTL.GONSTS was 0 upon entry to the interrupt which set DCTL.SGONAK to 1).
Yet, whichever is true, the reality is that GINTSTS.GONAKEFF is never set when failure occurs. Note that in previous handling of INCOMPISOOUT, everything seemed to work fine, as expected. It is only in the failure case that the GONAKEFF interrupt never occurs, and this failure seems to be related to a SETUP packet arriving at an inopportune time.
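For anyone following along, here is a minimal sketch of how a receive status word (as read from GRXSTSR/GRXSTSP) can be decoded to see whether the top of the RxFIFO holds a "global OUT NAK" or a "setup data packet received" entry. The field positions and PKTSTS codes are my reading of the device-mode layout in RM0399 and should be checked against your revision; the names `rx_status`, `decode_rx_status` and `pktsts_name` are hypothetical helpers, not part of the HAL.

```c
#include <assert.h>
#include <stdint.h>

/* Device-mode PKTSTS codes (assumed from RM0399; verify for your part). */
enum {
    PKTSTS_GLOBAL_OUT_NAK = 1, /* global OUT NAK status entry */
    PKTSTS_OUT_DATA       = 2, /* OUT data packet received */
    PKTSTS_OUT_COMPLETE   = 3, /* OUT transfer completed */
    PKTSTS_SETUP_COMPLETE = 4, /* SETUP transaction completed */
    PKTSTS_SETUP_DATA     = 6  /* SETUP data packet received */
};

/* Hypothetical container for the decoded fields of one status word. */
struct rx_status {
    unsigned epnum;  /* bits 3:0   endpoint number */
    unsigned bcnt;   /* bits 14:4  byte count */
    unsigned dpid;   /* bits 16:15 data PID */
    unsigned pktsts; /* bits 20:17 packet status */
};

/* Split a raw status word into its fields (assumed bit layout above). */
struct rx_status decode_rx_status(uint32_t word)
{
    struct rx_status s;
    s.epnum  = word & 0xFu;
    s.bcnt   = (word >> 4) & 0x7FFu;
    s.dpid   = (word >> 15) & 0x3u;
    s.pktsts = (word >> 17) & 0xFu;
    return s;
}

/* Human-readable name for a PKTSTS code, for logging register snapshots. */
const char *pktsts_name(unsigned pktsts)
{
    assert(pktsts <= 0xFu); /* PKTSTS is a 4-bit field */
    switch (pktsts) {
    case PKTSTS_GLOBAL_OUT_NAK: return "global OUT NAK";
    case PKTSTS_OUT_DATA:       return "OUT data packet received";
    case PKTSTS_OUT_COMPLETE:   return "OUT transfer completed";
    case PKTSTS_SETUP_COMPLETE: return "SETUP transaction completed";
    case PKTSTS_SETUP_DATA:     return "SETUP data packet received";
    default:                    return "reserved";
    }
}
```

Feeding a snapshot of GRXSTSR (which can be read without popping the FIFO) through `decode_rx_status` shows which kind of entry is waiting at the top at each interrupt.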
Any clarification on the actual flow of the USB core and help resolving the issue would be appreciated.
Sincerely,
Dan
2025-11-25 2:51 AM - edited 2025-11-25 7:53 AM
Hello @drmadill
Could you confirm that you are using the latest stm32h7xx-hal-driver? If so, could you provide a minimal example or setup to reproduce the issue, preferably based on an example from CubeH7? It could be an issue in the HAL driver.
About the GONAKEFF interrupt: the application can clear this interrupt by clearing the SGONAK bit in OTG_DCTL.
It may be that this was never implemented in the HAL without issues. But first, I'd like to review your current Rx FIFO size configuration. This article might be helpful. Some practical use cases are provided here as well.
To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.
2025-11-26 7:18 AM - edited 2025-12-03 7:57 AM
Thank you for your reply. I have "STM32Cube MCU Package for STM32H7 Series" version 1.12.1. The version on the GitHub page you referenced appears to have 1.11.2 as its latest tag, so would seem to be older, unless these versions are not comparable.
Providing a simple example could be difficult. This issue has shown up in a complicated composite device. The primary driver of the issue, however, appears to be the combination of an interface with bulk endpoints operating at the same time as an interface with an isochronous OUT endpoint.
I am still reading the other links you sent for more information, but what I could provide right away is a Beagle USB Analyzer trace correlated with the contents of all USB registers at the entry to and the exit from the HAL_PCD_IRQ_Handler for every USB PCD interrupt as the device crosses from the operating state to the inoperative state. The register contents are in an Excel spreadsheet, with changes highlighted. I have also made sheets decoding the relevant registers at every interrupt (pre and post). Would this information be of use to you? The Beagle trace, which may be viewed with the TotalPhase Data Center software, is 14.43 MB in size, and the Excel spreadsheet is 3.762 MB.
I have currently reserved 0x200 words for the RxFIFO (2048 bytes). I have not seen the B2BSTUP interrupt occur AFAIK.
Best regards,
Dan
2025-11-27 5:56 AM - edited 2025-11-27 6:35 AM
Hi @drmadill
Indeed, would you share the trace? Would you also share your TxFiFO sizing for each endpoint?
2025-11-27 11:27 AM
Here are the Beagle USB Analyzer traces (in the zip file) and the Excel spreadsheet containing the USB register contents.
The Interfaces sheet contains the interfaces and endpoint assignments, as well as the TxFIFO sizes for each IN endpoint.
The snapshots sheet contains all the register contents. The Timestamp field is the time at which the USB registers in a column were sampled, in nanoseconds, based on the ARM DWT timer. The Timestamp delta is the time between columns. The Post field indicates whether the USB register contents are at the start of the HAL_PCD_IRQ_Handler (post=0) or at the end of the HAL_PCD_IRQ_Handler (post=1). I have highlighted the columns representing the end of the IRQ in orange on the snapshots sheet. Red highlighting indicates that the field in the red column differs from the previous column so you can see at a glance when registers change value. Note that I did not read the DFIFO or GRXSTSP register contents to avoid interfering with the functionality of the USB stack.
The subsequent sheets are individual decoded register contents across the same time frame. I have put comments on the GRXSTSR sheet around column AZU and column BAO. I also added rows that provide the Beagle analyzer index (TotalPhase index), and sample time (TotalPhase time). I then take the Timestamp from the USB register snapshots and compute an Adjusted snapshot time that correlates the timestamp with the TotalPhase sample time.
If you zoom out on the GRXSTSR sheet you can see that the register contents stop changing at column BAR. It is at this point that the USB communications stop working properly (as can be seen in the Beagle analyzer as well). The INCOMPISOOUT event occurred in column BAO.
As a comparison, you can see what happens with a successfully processed INCOMPISOOUT event at column AZU. My comments may be a little out of date in terms of what might be happening, but should be pretty close.
Thank you for your help.
Best regards,
Dan
2025-12-02 2:28 AM - edited 2025-12-05 2:44 AM
Hi @drmadill
Upon further review, your RxFIFO setup is indeed correct. I realize I may have overstepped in the initial investigation. The root cause of the issue is more likely related to the TxFIFO configuration.
The total TxFIFO size is 512 words, so any adjustment to the control endpoint's TxFIFO size must be carefully considered to avoid impacting the overall FIFO allocation.
As I don't have your full SW setup, try decreasing the TxFIFO sizes. As I mentioned earlier, this article might be helpful. Some practical use cases are provided here for help.
2025-12-02 3:32 PM - edited 2025-12-02 4:04 PM
I'm confused. The total of my FIFO sizes is no more than the 4Kbytes (1024 words) available in the USB FIFO RAM. No FIFOs are overlapping. I thought the 288 locations was the minimum size for the Rx FIFO. Are you saying that specifying a larger size will mess up the operation of the USB stack? I did not expect that to be the case. The Tx FIFOs are all large enough for the size of the packets I'm sending for their respective IN endpoints.
2025-12-03 1:13 AM - edited 2025-12-03 1:14 AM
Hi @drmadill
I understand this can be confusing, but this is how the controller works. The total FIFO RAM is not all meant for data payload. Metadata is stored there as well, which is not explicitly mentioned. Check section 3.1 Rx FIFO size calculation and 3.2 Tx FIFO size calculation.
Caution:
2025-12-03 7:20 AM
I understand that there is metadata and I had already seen the formula you gave in the reference manual (section 60.11.3 of RM0399 Rev. 4) and ensured that my Rx FIFO allocation was at least as large as given by the formula (which I understand accounts for this metadata, like status notifications, global OUT NAK, etc.). Is there metadata beyond the "10 locations reserved in the receive FIFO to receive SETUP packets", the one location for Global OUT NAK, the status information (1 word) for each received packet and the transfer complete status information (1 word) pushed along with the last packet? Note that I had allocated more for the Rx FIFO than the 288 words you suggest, so shouldn't my allocation have covered any extra metadata? Note that I had allocated 512 words for the Rx FIFO, as I had seen in RM0399 and is, in fact, recommended in section 3.3 of the article you posted:
"Generally, the minimum RxFIFO in HS mode we allocate 512 words (0x200)"
I also had set the Tx FIFO for each IN endpoint to the maximum packet size (based on the USB descriptor) for that endpoint (understanding that there is a dedicated Tx FIFO for each IN endpoint). It is my understanding that on the Tx FIFO side, only the maximum packet size is required to be allocated, and not extra space for metadata per Tx FIFO / IN endpoint, as it says in RM0399 and section 3.2 of the article you posted:
"Tx FIFOs do not store metadata, only payload data."
I was already familiar with all the information in the article you posted and followed it carefully, as far as I know. All you have really told me that is new to me is that I should reduce the size of my FIFOs to the calculated values rather than maximize my use of the 4K available (between the Rx FIFO and Tx FIFOs). That's the part I don't understand, since I was under the impression that maximizing my use of the 4K available could result in better performance. In fact, in section 3.3 of the article you posted, it recommends larger FIFOs (as long as you don't exceed the 4Kbytes of USB RAM):
"Device RxFIFO = 13 + 2 × ((largest USB packet size)/4 + 1) + (2 × number of OUT EP) + 1
For each transmit FIFO
Device TxFIFO = 2 × MPS (for each EP)"
I don't wish to belabor this, but we are planning to use STM devices for other USB devices in the future, so I want to make sure our USB is robust. If there is memory required in the USB FIFO RAM beyond the amounts in the formulae above (with or without the double buffering factor of 2 on the largest packet size), then that is important to know because otherwise it becomes impossible to determine if there is enough RAM available for all the endpoints used (as it is possible to run out of USB RAM using the above formulae if there are too many endpoints with large MPS). For example, if I have a device in which the device RxFIFO and TxFIFO calculations above result in exactly all 4Kbyte of USB RAM being used (1024 words), then will the device still work reliably?
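To make the sizing question concrete, the formulas quoted above can be turned into a small calculation. This is only a sketch under stated assumptions: the function names are hypothetical, the RxFIFO result is taken to be in 32-bit words, "2 × MPS" is interpreted as twice the endpoint's maximum packet size rounded up to whole words, and whether EP0 counts toward "number of OUT EP" is left to the caller. It does not answer whether extra hidden overhead exists; it only checks a plan against the quoted formulas and the RAM size.

```c
#include <assert.h>
#include <stdint.h>

/* RxFIFO size in words, per the quoted formula:
 * 13 + 2 * (largest packet / 4 + 1) + 2 * (number of OUT EPs) + 1 */
uint32_t rx_fifo_words(uint32_t largest_packet_bytes, uint32_t out_ep_count)
{
    return 13u + 2u * (largest_packet_bytes / 4u + 1u)
               + 2u * out_ep_count + 1u;
}

/* TxFIFO size in words for one IN endpoint: 2 * MPS, with MPS rounded
 * up to whole 32-bit words (an assumption about the formula's units). */
uint32_t tx_fifo_words(uint32_t mps_bytes)
{
    assert(mps_bytes > 0u);
    return 2u * ((mps_bytes + 3u) / 4u);
}

/* Returns 1 if the RxFIFO plus all TxFIFOs fit in ram_words, else 0. */
int fifo_plan_fits(uint32_t rx_words, const uint32_t *tx_words,
                   uint32_t tx_count, uint32_t ram_words)
{
    uint32_t total = rx_words;
    for (uint32_t i = 0; i < tx_count; i++) {
        total += tx_words[i];
    }
    return total <= ram_words;
}
```

As an illustration with made-up endpoints (not my actual device): a 512-byte largest packet with two OUT endpoints gives `rx_fifo_words(512, 2)` = 276 words, and IN endpoints with MPS 64 and 512 give 32 and 256 words, for a total of 564 of the 1024 words.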
2025-12-05 2:50 AM
Hi @drmadill
Sorry for the misunderstanding, your configuration for the RxFIFO is definitely correct. However, try allocating the TxFIFOs for control EP0 and EP1 as follows:
/* Set Rx FIFO to accommodate 512 words*/
HAL_PCDEx_SetRxFiFo(&hpcd_USB1_OTG_HS, 0x200);
/* Set Tx FIFO 0 size to 16 words (64 bytes) for Control IN endpoint (EP0). */
HAL_PCDEx_SetTxFiFo(&hpcd_USB1_OTG_HS, 0, 0x10);
/* Set Tx FIFO 1 to 256 words (1KB) for Bulk IN endpoint */
HAL_PCDEx_SetTxFiFo(&hpcd_USB1_OTG_HS, 1, 0x100);
If this configuration works well, we can proceed and have a clearer focus to narrow down the root cause.
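As a sanity check on the sizes above, here is a rough sketch of the resulting FIFO RAM layout. It assumes the Tx FIFOs are placed back to back immediately after the Rx FIFO, in index order, which is how I understand the HAL computes the start offsets, and it uses the 1024-word RAM size mentioned earlier in the thread. The helper names are hypothetical and not part of the HAL.

```c
#include <assert.h>
#include <stdint.h>

/* Start offset (in words) of Tx FIFO `index`, assuming the Rx FIFO sits
 * at offset 0 and the Tx FIFOs follow it contiguously in index order. */
uint32_t tx_fifo_start(uint32_t rx_words, const uint32_t *tx_words,
                       uint32_t index)
{
    uint32_t start = rx_words;
    for (uint32_t i = 0; i < index; i++) {
        start += tx_words[i];
    }
    return start;
}

/* Words of FIFO RAM left unallocated after the Rx FIFO and all Tx FIFOs. */
uint32_t fifo_words_free(uint32_t rx_words, const uint32_t *tx_words,
                         uint32_t tx_count, uint32_t ram_words)
{
    /* The end of the last Tx FIFO equals the start of a FIFO after it. */
    uint32_t used = tx_fifo_start(rx_words, tx_words, tx_count);
    assert(used <= ram_words); /* the plan must not overrun FIFO RAM */
    return ram_words - used;
}
```

With Rx = 0x200 and Tx sizes {0x10, 0x100}, Tx FIFO 0 starts at word 0x200, Tx FIFO 1 at 0x210, and the layout ends at 0x310 (784 words), leaving 240 words unallocated.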