cancel
Showing results for 
Search instead for 
Did you mean: 

STM32Cube USB CDC (possible) Bug Found - Rx Tx Race condition

markdunstan9
Associate II
Posted on August 18, 2015 at 16:17

I can see there are a few posts regarding this, but I'm not sure if anyone has actually pinpointed the exact issue.

I noticed that when using the transmit and receive functions (CDC_Transmit_FS() and CDC_Receive_FS()) they work completely fine on their own, but if trying to receive and then send more than the max packet size (64 bytes by default for Full Speed) the micro will eventually stop receiving (maybe after 1-3 65Byte+ packets).  

This is because while the micro is transmitting it is locking the PCD handle using '__HAL_LOCK(hpcd);' found inside USBD_CDC_TransmitPacket() -> USBD_LL_Transmit() -> HAL_PCD_EP_Transmit(). 

BUT the receive function also needs this handle as it calls the same '__HAL_LOCK(hpcd);' inside  USBD_CDC_ReceivePacket() -> USBD_LL_PrepareReceive() -> HAL_PCD_EP_Receive() and __HAL_LOCK(handle) is a preprocessor macro which can 'return HAL_BUSY'. Knowing that the receive function is called via interrupt, while the transmit is via user code, the issue is that the receive function can interrupt the transmit process while the transmit is locking the handle, causing the receive to leave the function before preparing the endpoint for reception as it should. 

Nothing will be received after this point, unless the user calls USBD_CDC_ReceivePacket to prepare the receive after transmitting (after calling USBD_CDC_TransmitPacket).

In relation to this, another poor piece of coding is that USBD_CDC_ReceivePacket() will return USBD_OK even in the case when the most inner function (HAL_PCD_EP_Receive) returns HAL_BUSY. They should change the appropriate functions to actually pass this status back to the highest level function so e.g. USBD_BUSY could be handled apropriately.             
11 REPLIES 11
robert-hartung
Associate II
Posted on August 18, 2015 at 20:09

I had somewhat of the same problem with my own USB Driver, that only uses the HAL_PCD_* functions.

I used to write all my transfer logic in the while(1) in my main and I am using a status flag to indicate that there is a request for sending/receiving data. This way I arbitrate the USB myself again. I don't know if this is intended behaviour of the low lever driver. Maybe someone from STM can clarify this issue!

My usecase is the following: If you send packages that are n*max_packet lengths, assuming for example 512 bytes packet size and 1024 bytes of data. The driver has to send two packets, each 512 bytes and a 0 byte packet (the ZLP packet) at the end to complete the transfer. But because the driver might be too fast, the interrupt might happen, while being in the first Transmit() method sending all the data, thus the handle is still locked, and I cannot send the ZLP instantly.

Therefore I am using an internal state for each endpoint. Because there will be only one interrupt for sending data (even if it is > 512 Byte), you can rely on that interrupt that the transmission is successful. Thus receiving this interrupt, you are save to set the IN endpoint to `IDLE` state. My functions waits for the EP to be IDLE again to send the ZLP package.

The code can be seen at https://gitlab.ibr.cs.tu-bs.de/stm32/generic_usb_driver/blob/master/src/generic_usb_driver.c#L718 ( GUD_SendData function).

- Robert

markb
Associate II
Posted on August 19, 2015 at 00:19

Hi,

I think it must have been the same guy who wrote the UART handling code because it has the same problem. You can't reliably post a new read request while a transmit request is being handled. These problems have been discussed quite a few times and various hacks have been posted to work around the broken HAL code. I have completely given up hope that ST will fix these issues, they don't appear to be interested at all. The HAL is a complete disgrace but as I actually have reliable working code for UART, USB FS and SD, I am putting up with it.

Cheers,

Mark

abarbieri9
Associate
Posted on August 26, 2015 at 12:02

Hi friends, I found the same problem on my projects using STMCumeMX.

In fact, I ''wrap'' the awful USB CDC template with a clever version, handled with circlular queue. My problem is that sometimes the function HAL_PCD_EP_Receive return (HAL_BUSY) I couldn't detect it, so my system cannot return in RX state.

This is absolutely awful!!

My workaround is to write my USBD_CDC_ReceivePacket, and returning the real result of the command...

Any others idea?

What is real ''bad'' is this code:

#if (USE_RTOS == 1)

 

#error '' USE_RTOS should be 0 in the current HAL release ''

 

#else

 

#define __HAL_LOCK(__HANDLE__)                                           \

 

    do{                                        \

 

        if((__HANDLE__)->Lock == HAL_LOCKED)  \

 

        {                                      \

 

            return HAL_BUSY;                    \

 

        }                                      \

 

        else                                   \

 

        {                                      \

 

            (__HANDLE__)->Lock = HAL_LOCKED;    \

 

        }                                      \

 

    }while (0)

 

 

#define __HAL_UNLOCK(__HANDLE__)                                          \

 

    do{                                       \

 

        (__HANDLE__)->Lock = HAL_UNLOCKED;   \

 

    }while (0)

 

#endif /* USE_RTOS */

 

When we could use RTOS?!?!?

estie
Associate II
Posted on September 07, 2015 at 11:01

I agree, this is very unprofessional. It is really basic, obvious coding errors. Their apparent lack of interest is fixing is even worse.

It looks like they got some summer student 'intern' to write this code without any supervision from a paid software engineer.

These Discovery boards look great and could be a good way to launch the chips and get them adopted. Part of the attraction is substantial body of useful firmware that is provided with the boards.

But what we ''discover'' is that ST can't even write reliable firmware for their own devices. This is not rocket science, it's fairly simple, standard serial interfacing.

I would have thought that they would be embarrassed and would be quick to provide a fix but they don't even seem to care. 

Really not impressive.

Thanks to all here for contributing to understanding the problem.

rex
Associate II
Posted on November 27, 2015 at 11:55

This was really helpful. Thanks! I have implemented passing the return code to my application and now I can detect this problem. Most of the time I get this in USBD_CDC_ReceivePacket and I can call that again at some later point.

Not quite as often I also get this problem when calling CDC_Transmit_HS. This is more complicated because in this case I have already read the packet I need to transmit from the queue and have to transmit it somehow as I cannot just put it back. Just trying to call CDC_Transmit_HS in a loop until OK is returned, chrashes the whole application. Just dropping the packet doesn't crash, but that isn't a feasible solution. Any ideas how to fix that?

I'm using RTOS in my application and at the moment I'm just doing a stress test of the USB interface but I get the busy for sending or receiving always in less than a minute. Adding more functionality I to my program increases the chance to get this error.

My application will use the USB connection quite a bit and with the added funktionality, I suspect, I will trigger this bug a lot. This is just one nightmare.

Amel NASRI
ST Employee
Posted on February 19, 2016 at 11:27

Hi All,

Please note that the reported bug is tracked internally. It will be fixed soon.

Sorry for any inconvenience this may cause.

All your feedback are welcome in order to improve our solutions.

-Mayla-

To give better visibility on the answered topics, please click on Accept as Solution on the reply which solved your issue or answered your question.

StefanoBettega1
Associate II
Posted on June 30, 2016 at 14:29

<flame_mode_on />

It's hilarious that in stm32f2xx_hal_def.h we have such a definition and in USB Host code we find USBH_USE_OS...

So, my question is: can we use an RTOS? Or can we use it only for the USB part?

<flame_mode_off />

StefanoBettega1
Associate II
Posted on July 11, 2016 at 10:35

As reported in this thread, it seems that there is a possible race condition in Tx and Rx interrupts. Is there any solution for this at this moment?

We are experiencing such a problem in our application, so that we digged out all of the tasks and interrupts, leaving only a main loop without OS, and only USBD and Systick (for HAL timing) as interrupt sources.

USB VCP behave like an old serial port on top of which we run a ''consolidated'' Q&A protocol: PC acts as master, querying device, and device answers to requets as slave. Eventually PC sends retries if a packet is lost or corrupted (we have a framing and checksum control in protocol).

Things normally works with small data, but as we try to transfer big data buffer (i.e. packets of 1010 bytes, that are 1000 bytes payload and 10 bytes of framing), data exchange stops. PC queries device, device prepares the transmission buffer and starts to send it out, but PC doesn't receive them. Probably data transfer is partially started, because if we dig into USB registers we see that count of transferred data is 640 (10 packets of 64 bytes), and peripheral seems to be stuck with TxState = 1.

Any suggestion on how to overcome this problem?

Thanks,

Stefano

cesarortizpan
Associate
Posted on October 28, 2016 at 09:00

Any new about it?

Any fix on de way?

When?