2023-12-24 11:59 AM
This behavior seems like a bug.
I’m using USBX CDC ACM on an H723 Nucleo board. I’m trying to work with the blocking nature of ux_device_class_cdc_acm_read. For my application, I would like to get the data as soon as it is received, up to 256 characters, but the length will be variable. I’ve tried a few things and found this behavior that seems wrong. My application currently just loops back any strings received.
The endpoint is set to 64 bytes. That was the setting from the example and seems OK. I’ve set the requested_length to 256 and allocated a buffer of that size, which is 4 times the size of the endpoint.
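For reference, this is a minimal sketch of the kind of loopback thread I mean. The cdc_acm_instance pointer is assumed to be saved in the CDC ACM activate callback; the instance and thread names are illustrative, not my actual code.

```c
/* Minimal loopback sketch, assuming cdc_acm_instance is set in the
   CDC ACM activate callback. */
#include "ux_api.h"
#include "ux_device_class_cdc_acm.h"

#define APP_RX_BUFFER_SIZE 256U   /* 4 x the 64-byte bulk endpoint */

extern UX_SLAVE_CLASS_CDC_ACM *cdc_acm_instance;  /* saved in the activate callback */

static UCHAR rx_buffer[APP_RX_BUFFER_SIZE];

VOID cdc_acm_loopback_thread_entry(ULONG arg)
{
  ULONG actual_length;
  (void)arg;

  while (1)
  {
    /* Blocks until the transfer completes (short packet, or 256 bytes received). */
    if (ux_device_class_cdc_acm_read(cdc_acm_instance, rx_buffer,
                                     APP_RX_BUFFER_SIZE, &actual_length) == UX_SUCCESS
        && actual_length > 0U)
    {
      /* Echo the received data back to the host. */
      ux_device_class_cdc_acm_write(cdc_acm_instance, rx_buffer,
                                    actual_length, &actual_length);
    }
  }
}
```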
I’m using Hterm 0.8.9 on Windows 10.
STM32CubeIDE 1.13.2, STM32CubeMX 6.9.2-RC4, STM32Cube MCU Package for H7 1.11.1, X-CUBE-AZRTOS-H7 3.2.0.
ThreadX and USBX are 6.2.1.
When I send a 63 byte string, the read call returns with a length of 63. Data and length are good.
When I send a 65 byte string, the read call returns with a length of 65. Everything is ok.
However, when I send a 64 byte string, the call does not return. I am monitoring the USB traffic with a Beagle 480 USB protocol analyzer. I get an error T, which is “Capture for transaction timed out while waiting for additional data.” If I then send a single additional byte, the read call returns with 65 characters: the original 64 plus the additional byte.
This certainly seems like a bug as it works at 63 and 65 characters and doesn’t return at 64.
The attached PDF contains the capture information.
2024-11-21 06:20 AM
Thanks for the reply and sorry for the late response. I'm not so keen on using the same workaround you did, as my requirement is quite high performance. It does seem like a weakness in the ST HAL layer, and I guess other USB CDC-ACM stacks built on top of that may well suffer the same issue. While I could work around it in the client (PC) app by, say, appending a byte whenever the amount of data sent out == 64, it doesn't seem right.
I wonder how other MCU platforms (non ST) handle it...
It would also be helpful if anyone here from ST could suggest anything.
2024-11-21 07:56 AM - edited 2024-11-21 07:59 AM
This is exactly the reason why a ZLP should be used. If the host requests more bytes than the endpoint packet size and the device sends an exact multiple of the endpoint packet size, the host controller expects more data and does not end the transaction. This is how USB works. It looks like the USBX driver does not inject ZLPs automatically because the developers considered it the user's responsibility: the driver cannot know whether, after sending a multiple of the endpoint packet size, the user will send more data or not.
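To illustrate on the device-to-host side, here is a rough sketch of a write helper that follows an exact multiple of the packet size with a zero-length packet. It assumes a 64-byte bulk endpoint and that calling ux_device_class_cdc_acm_write with a length of 0 produces a ZLP; treat both as assumptions to check against your USBX version.

```c
/* Sketch only: send a payload and, if it is an exact multiple of the bulk
   endpoint size, terminate the transfer with a zero-length packet.
   Assumes a 64-byte endpoint and that a zero-length write produces a ZLP. */
#include "ux_api.h"
#include "ux_device_class_cdc_acm.h"

#define CDC_BULK_EP_SIZE 64U

UINT cdc_acm_write_with_zlp(UX_SLAVE_CLASS_CDC_ACM *cdc_acm,
                            UCHAR *data, ULONG length)
{
  ULONG actual_length;
  UINT  status;

  status = ux_device_class_cdc_acm_write(cdc_acm, data, length, &actual_length);

  if (status == UX_SUCCESS && length > 0U && (length % CDC_BULK_EP_SIZE) == 0U)
  {
    /* The payload filled whole packets; send a ZLP so the host ends the transfer. */
    status = ux_device_class_cdc_acm_write(cdc_acm, data, 0U, &actual_length);
  }

  return status;
}
```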
2024-11-21 08:04 AM
Yes, and from the device's point of view this is exactly what happens on TX: the USB device stack is configured to send a ZLP when required, and the host is happy.
The problem I'm having is on receiving data from the host (a Windows or Linux PC), which appears not to send a ZLP in this scenario. So the device sits there waiting for more data if the last data packet was exactly 64 bytes. Since the host application is writing to a serial COM/tty port, which could be a traditional serial port or CDC-ACM, it doesn't (and shouldn't, IMO) know about ZLPs.
2024-11-21 10:36 AM
We certainly can't be the only developers to have encountered this. The Windows driver behavior is what it is and we have to work with it; similar situation with Linux. I'm not in a position to get out my USB bus analyzer to go back and check the actual communication.
2024-11-21 11:29 AM - edited 2024-11-21 11:31 AM
> We certainly can't be the only developers to have encountered this
So for the receiving device (STM32) side, the workaround already found looks good. As soon as the USB controller gets a full buffer (endpoint size) or less, just take that data and handle it. Whether USBX detects a timeout, and how well the [awesome] ST low-level driver interacts with USBX, are details.
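As a concrete sketch of that approach: request exactly one endpoint packet per read, so a full 64-byte packet completes the transfer on its own and is handed to the application immediately, with no wait for a short packet or ZLP. cdc_acm_instance and process_rx_chunk are placeholders for your own instance pointer and handler.

```c
/* Sketch of reading one endpoint packet at a time: with requested_length equal
   to wMaxPacketSize (64 here), a full packet terminates the transfer by itself,
   so an exact 64-byte write from the host is delivered immediately. */
#include "ux_api.h"
#include "ux_device_class_cdc_acm.h"

#define CDC_BULK_EP_SIZE 64U

extern UX_SLAVE_CLASS_CDC_ACM *cdc_acm_instance;          /* set in activate callback */
extern void process_rx_chunk(UCHAR *data, ULONG length);  /* hypothetical handler */

static UCHAR packet_buffer[CDC_BULK_EP_SIZE];

VOID cdc_acm_rx_thread_entry(ULONG arg)
{
  ULONG actual_length;
  (void)arg;

  while (1)
  {
    if (ux_device_class_cdc_acm_read(cdc_acm_instance, packet_buffer,
                                     CDC_BULK_EP_SIZE, &actual_length) == UX_SUCCESS)
    {
      /* actual_length is 1..64; accumulate or process the chunk here. */
      process_rx_chunk(packet_buffer, actual_length);
    }
  }
}
```

The trade-off is one read call per 64-byte packet, so any message framing has to be reassembled in the application.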
2024-11-21 01:31 PM
Having a timeout to handle exact multiples of the endpoint size seems a bit of a kludge and not really suited to a high-throughput or low-latency scenario.
2024-11-22 04:11 AM
@AHugh.2 IMHO the most concerning part of the problem here is the quality/stability of the low-level USB code (the ST library). If it really is as poor as reported and can cause loss of data on a "reliable" bulk pipe, that should be solved first.
How to handle exact multiples of the endpoint size vs. latency depends on the ability of the USB driver to pass data to the user. In an embedded RTOS everything lives in the same memory space, so as soon as the USB (DMA) fills an endpoint-size buffer, the user task can work on it immediately, with no latency involved. If more data follows, it is received in the background. Even in an OS like Linux, low latency can be achieved with shared memory and so on. A ZLP is an easy way to signal "end of data", but it is overhead, and not actually required for structureless serial data.