32F417: CDC_Receive_FS() or USBD_CDC_ReceivePacket() - any known bugs?

PHolt.1 · ‎2022-08-30

I am using USB for both MSC (removable drive) and CDC (virtual COM port).

MSC works perfectly.

CDC out has been working perfectly for ~2 years.

CDC in had never really been tested, and ~ 1 byte every 20 is corrupted.

I am running CDC to Teraterm on the PC, and the out mode has been used for debugs.

The in mode gets activated of you press keys on the keyboard.

I have a simple RTOS task which copies in data to output. That is where I see the corruption.

It is entirely fixable by disabling certain RTOS tasks! But the ones which "fix" it don't do anything related to USB. I am thus suspecting there is a critical region which is unprotected and RTOS related task switching messes it up.

These functions should be well known because it is generated by Cube MX. I don't use MX (the code generated tends to work but is bloated junk and often works by luck because the cloated code covers up timing issues) but it was used when this project was being created ~ 3 years ago by someone else.

PHolt.1 · ‎2022-09-06

I think USBD_CDC_ReceivePacket(&hUsbDeviceFS) is supposed to go after the copy loop.

This is the init code

and that buffer was 2k, not 64 bytes, so not sure what the intention was there.

I just looked up some code from a year ago, which may be the original.

waclawek.jan · ‎2022-09-06

Isn't code between /* USER CODE BEGIN x*/ and /* USER CODE END x*/ user code?

OK we probably won't know unless somebody tries to generate this in CubeMX (and that in the edition which was used to generate this).

Not that interested, I see little point in wasting much time with Cube/CubeMX generally.

JW

Piranha · ‎2022-09-06

> On the one hand you have a go at me for polling ETH RX and on the other you say the above.

Interrupts should be used, but they should do only the minimum and return as quickly as possible. Typically it means clearing the necessary flags and notifying the respective processing thread. The time consuming processing is then done by the thread. But ST's USB stack does all the processing in interrupt context...

About the CDC_Receive_FS() code...

Why ST defined the Len parameter as a pointer? Does it have to be modified or is it just because someone was dumb?
Why the data copy is implemented as a custom loop instead of a memcpy() call?

PHolt.1 · ‎2022-09-06

"Isn't code between /* USER CODE BEGIN x*/ and /* USER CODE END x*/ user code?"

Yes, but the style of the stuff in between (which I see in many places in this project) suggests it came from some "pre-cooked" source. The guy used MX to generate the "skeleton" and then copied across the generated code to the real project; that is a reasonable way to get going quickly with MX while avoiding getting into a total mess which people get into when they try to do everything with MX.

I am too not interested in spending much time on it. It seems to be working very well. I need to do some loopback tests (loopback tester on the VCP, written by somebody on freelancer.com, and a loopback routine on the target which echoes everything) to make sure there is no data loss but it looks good now, with data rates of 100-200 kbytes/sec and that is with the loopback task running at IDLE priority.

"The time consuming processing is then done by the thread. But ST's USB stack does all the processing in interrupt context..."

I have already written that I really dislike the MX-generated ISRs which wade through dozens of conditionals to skip past unused features (something you get away with OK at 168MHz) but some things are not simple to do "properly" and doing them in an ISR is okay.

Example: How would you implement MSC (removable device) writing to a FatFS FAT12 filesystem implemented on a serial (SPI) FLASH whose sector write time is 16ms? Reading is fine, at 200us. But even the 16ms write runs fine in the ISR, in the right scenario (e.g. where you are not expecting real time performance while somebody is writing the filesystem, because that side is used mostly for firmware updates). USB timeouts are much longer than 16ms. I posted on this extensively and nobody had a clue, other than telling me I am stupid, or that I should write a "driver" for the FLASH (which is BS because after the first sector, you have to tell the host to stop sending data somehow and it has to work with all versions of windoze and linux). The answer to MSC flow control may be something like the CDC method where you tell the host it can now send more data, but I spent weeks on that and never found any info or anybody who knew how to do it (it is some kind of SCSI emulation, but the "busy" return code was elusive.

"Does it have to be modified or is it just because someone was dumb?"

No idea. Probably the code originated elsewhere.

"Why the data copy is implemented as a custom loop instead of a memcpy() call?"

No idea but now it is moving the USB block (1-64 bytes for CDC) to a 1k circular buffer which could be done with memcpy (there are several cases to handle when moving linear to circular) but normally isn't. Performance here isn't a big fish to fry, especially when you look at how far the data travels in USB already. These multi block transfers are low priority on USB anyway.

I prefer EEVBLOG because it is a usable site, unlike this one where you have to click to see more posts.

Piranha · ‎2022-11-05

> How would you implement MSC (removable device) writing to a FatFS FAT12 filesystem implemented on a serial (SPI) FLASH whose sector write time is 16ms?

An external USB device (the host) doesn't care if your code is waiting those 16 ms in an interrupt or a thread. The difference is that doing it in an interrupt blocks other interrupts of the same or lower priority and absolutely all threads. Contrary doing it in a thread allows all interrupts to execute normally and, if done properly, even other threads can run during that time.

TinyUSB defers all of the user related processing to a thread/task.