STM32Cube USB CDC (possible) Bug Found - Rx Tx Race condition

markdunstan9
Associate II
Posted on August 18, 2015 at 16:17

I can see there are a few posts regarding this, but I'm not sure if anyone has actually pinpointed the exact issue.

I noticed that the transmit and receive functions (CDC_Transmit_FS() and CDC_Receive_FS()) each work completely fine on their own, but when receiving and then sending more than the max packet size (64 bytes by default for Full Speed), the micro will eventually stop receiving (maybe after 1-3 packets of 65 bytes or more).

This is because, while the micro is transmitting, it locks the PCD handle using '__HAL_LOCK(hpcd);', found inside USBD_CDC_TransmitPacket() -> USBD_LL_Transmit() -> HAL_PCD_EP_Transmit().

BUT the receive function needs this handle too: it calls the same '__HAL_LOCK(hpcd);' inside USBD_CDC_ReceivePacket() -> USBD_LL_PrepareReceive() -> HAL_PCD_EP_Receive(), and __HAL_LOCK(handle) is a preprocessor macro which can 'return HAL_BUSY' from the enclosing function. The receive function is called via interrupt, while the transmit is called via user code, so the receive function can interrupt the transmit process while the transmit holds the lock. The receive then leaves the function before preparing the endpoint for reception as it should.
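The race described above can be reproduced on a host PC with a small simulation. This is only a sketch: the types, the `SKETCH_LOCK`/`SKETCH_UNLOCK` macros, the `rx_armed` flag and all `sketch_*` functions are hypothetical stand-ins that mimic the shape of the HAL code, not the real ST definitions.

```c
#include <assert.h>

/* Simplified stand-ins for the HAL types; NOT the real ST definitions. */
typedef enum { HAL_OK = 0, HAL_BUSY = 2 } HAL_StatusTypeDef;
typedef enum { HAL_UNLOCKED = 0, HAL_LOCKED = 1 } HAL_LockTypeDef;
typedef struct {
    HAL_LockTypeDef Lock;
    int rx_armed;   /* hypothetical flag: 1 when the OUT endpoint is ready */
} PcdSketch;

/* Same shape as __HAL_LOCK: returns from the ENCLOSING function when busy. */
#define SKETCH_LOCK(h) \
    do { if ((h)->Lock == HAL_LOCKED) { return HAL_BUSY; } \
         (h)->Lock = HAL_LOCKED; } while (0)
#define SKETCH_UNLOCK(h) do { (h)->Lock = HAL_UNLOCKED; } while (0)

/* Stand-in for HAL_PCD_EP_Receive(): re-arms the OUT endpoint. */
static HAL_StatusTypeDef sketch_ep_receive(PcdSketch *h)
{
    assert(h != 0);
    SKETCH_LOCK(h);          /* bails out here if transmit holds the lock */
    h->rx_armed = 1;
    SKETCH_UNLOCK(h);
    return HAL_OK;
}

/* Stand-in for HAL_PCD_EP_Transmit(); irq_mid_tx simulates the USB
 * interrupt firing between lock and unlock. */
static HAL_StatusTypeDef sketch_ep_transmit(PcdSketch *h, int irq_mid_tx)
{
    assert(h != 0);
    SKETCH_LOCK(h);
    if (irq_mid_tx) {
        (void)sketch_ep_receive(h);   /* status discarded by the caller */
    }
    SKETCH_UNLOCK(h);
    return HAL_OK;
}

/* Returns 1 if reception is armed after one transmit, else 0. */
int sketch_rx_survives(int irq_mid_tx)
{
    PcdSketch h = { HAL_UNLOCKED, 0 };   /* a packet was just consumed */
    (void)sketch_ep_transmit(&h, irq_mid_tx);
    if (!irq_mid_tx) {
        (void)sketch_ep_receive(&h);     /* interrupt arrives after transmit */
    }
    return h.rx_armed;
}
```

Calling `sketch_rx_survives(1)` leaves the endpoint unarmed because the simulated interrupt hits the locked handle, while `sketch_rx_survives(0)` re-arms it normally.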

Nothing will be received after this point unless the user calls USBD_CDC_ReceivePacket() to prepare the receive again after transmitting (that is, after calling USBD_CDC_TransmitPacket()).

In relation to this, another poor piece of coding is that USBD_CDC_ReceivePacket() will return USBD_OK even when the innermost function (HAL_PCD_EP_Receive()) returns HAL_BUSY. They should change the appropriate functions to actually pass this status back up to the highest-level function so that e.g. USBD_BUSY could be handled appropriately.
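A minimal sketch of what passing the status back up could look like, written as a host-side simulation. The enums are stand-ins rather than the real ST headers, and `sketch_map_status()`, the other `sketch_*` functions and the `simulated` parameter are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Stand-ins, NOT the real ST definitions. */
typedef enum { HAL_OK = 0, HAL_ERROR = 1, HAL_BUSY = 2, HAL_TIMEOUT = 3 } HAL_StatusTypeDef;
typedef enum { USBD_OK = 0, USBD_BUSY = 1, USBD_FAIL = 2 } USBD_StatusTypeDef;

/* Hypothetical helper: translate a HAL status into a USBD status. */
static USBD_StatusTypeDef sketch_map_status(HAL_StatusTypeDef hal)
{
    switch (hal) {
    case HAL_OK:   return USBD_OK;
    case HAL_BUSY: return USBD_BUSY;
    default:       return USBD_FAIL;   /* HAL_ERROR, HAL_TIMEOUT */
    }
}

/* Stand-in for the innermost call; 'simulated' is what the HAL returns. */
static HAL_StatusTypeDef sketch_hal_ep_receive(HAL_StatusTypeDef simulated)
{
    return simulated;
}

/* Stand-in for USBD_LL_PrepareReceive(): forwards the status instead of
 * returning USBD_OK unconditionally. */
static USBD_StatusTypeDef sketch_ll_prepare_receive(HAL_StatusTypeDef simulated)
{
    return sketch_map_status(sketch_hal_ep_receive(simulated));
}

/* Stand-in for USBD_CDC_ReceivePacket(). */
USBD_StatusTypeDef sketch_cdc_receive_packet(HAL_StatusTypeDef simulated)
{
    return sketch_ll_prepare_receive(simulated);
}

/* Caller-side handling: retry a bounded number of times while busy.
 * busy_rounds simulates how many calls hit a locked handle. */
int sketch_arm_with_retry(int busy_rounds, int max_tries)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; ++i) {
        HAL_StatusTypeDef hal = (i < busy_rounds) ? HAL_BUSY : HAL_OK;
        if (sketch_cdc_receive_packet(hal) == USBD_OK) {
            return i + 1;   /* number of attempts used */
        }
    }
    return -1;              /* still busy; caller must try again later */
}
```

With the status visible at the top level, the caller can at least notice that the endpoint was not armed and try again, instead of silently losing reception.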
11 REPLIES
arnold2
Associate II
Posted on November 03, 2016 at 15:07

Today I found this bug independently of this discussion thread. It is a shame that this bug has been known for more than half a year now and ST has not fixed it. Mayla from ST on 2/19/2016: "It will be fixed soon"...

No further comment...

It is quite annoying having to look at the code. The implementation of the HAL_LOCK/UNLOCK stuff is a mess: macros that silently return error codes from functions, combined with functions that ignore return values completely... Both of these should be avoided in any software, and if ST had done so when this code was written, the bug would not be there. By the way, there is a lot of code in the ST driver layer that ignores the return values of functions that can return errors! Please consider buying a copy of MISRA, ST 😉

For those of you who are interested in a solution: we have fixed it by surrounding all code from HAL_LOCK to HAL_UNLOCK (including those two macro "calls") in the function HAL_PCD_EP_Transmit() in file stm32f4xx_hal_pcd.c with a critical section, so it can no longer be preempted by the interrupt that fires when the USB device receives data from the host. In other words: HAL_PCD_EP_Transmit() and HAL_PCD_EP_Receive() must not preempt each other, or at least the code between LOCK and UNLOCK must not be preempted. Be aware, though, that Cube overwrites this code when the code is regenerated.
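The effect of such a critical section can be illustrated with a host-side simulation. This is a sketch under assumptions, not the actual patch: the interrupt mask is simulated with plain variables, and every `sketch_*` name, `SKETCH_LOCK`/`SKETCH_UNLOCK` and the `rx_armed` flag are hypothetical. On a real Cortex-M target the save/restore helpers would presumably map to the CMSIS intrinsics `__get_PRIMASK()`, `__disable_irq()` and `__set_PRIMASK()`.

```c
#include <assert.h>

/* Simplified stand-ins for the HAL types; NOT the real ST definitions. */
typedef enum { HAL_OK = 0, HAL_BUSY = 2 } HAL_StatusTypeDef;
typedef enum { HAL_UNLOCKED = 0, HAL_LOCKED = 1 } HAL_LockTypeDef;
typedef struct { HAL_LockTypeDef Lock; int rx_armed; } PcdSketch;

#define SKETCH_LOCK(h) \
    do { if ((h)->Lock == HAL_LOCKED) { return HAL_BUSY; } \
         (h)->Lock = HAL_LOCKED; } while (0)
#define SKETCH_UNLOCK(h) do { (h)->Lock = HAL_UNLOCKED; } while (0)

static PcdSketch g_pcd;      /* handle shared by the "ISR" and main code */
static int g_irq_masked;     /* simulated interrupt mask */
static int g_irq_pending;    /* simulated pending USB interrupt */

/* Stand-in for HAL_PCD_EP_Receive(): re-arms the OUT endpoint. */
static HAL_StatusTypeDef sketch_ep_receive(PcdSketch *h)
{
    SKETCH_LOCK(h);
    h->rx_armed = 1;
    SKETCH_UNLOCK(h);
    return HAL_OK;
}

static void sketch_usb_isr(void) { (void)sketch_ep_receive(&g_pcd); }

/* Simulated hardware: deliver the interrupt now, or hold it while masked. */
static void sketch_raise_usb_irq(void)
{
    if (g_irq_masked) { g_irq_pending = 1; } else { sketch_usb_isr(); }
}

/* Hypothetical helpers: save the mask state and mask, then restore it. */
static int sketch_irq_save(void)
{
    int prev = g_irq_masked;
    g_irq_masked = 1;
    return prev;
}

static void sketch_irq_restore(int prev)
{
    g_irq_masked = prev;
    if (!g_irq_masked && g_irq_pending) {
        g_irq_pending = 0;
        sketch_usb_isr();    /* deferred interrupt runs after the unlock */
    }
}

/* Locked region of the transmit path; host data arrives in the middle. */
static HAL_StatusTypeDef sketch_ep_transmit_locked(PcdSketch *h)
{
    SKETCH_LOCK(h);
    sketch_raise_usb_irq();
    SKETCH_UNLOCK(h);
    return HAL_OK;
}

/* Returns rx_armed after one transmit, with or without the guard. */
int sketch_transmit(int use_critical_section)
{
    g_pcd.Lock = HAL_UNLOCKED;
    g_pcd.rx_armed = 0;
    g_irq_masked = 0;
    g_irq_pending = 0;
    if (use_critical_section) {
        int prev = sketch_irq_save();
        HAL_StatusTypeDef st = sketch_ep_transmit_locked(&g_pcd);
        sketch_irq_restore(prev);   /* reached even if the lock was busy */
        assert(st == HAL_OK);
    } else {
        (void)sketch_ep_transmit_locked(&g_pcd);
    }
    return g_pcd.rx_armed;
}
```

One design choice in this sketch: the guard wraps the whole locked function instead of sitting between the macros, because the lock macro can return early and an in-place guard would then leave interrupts masked.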

Best Regards,

René

li kai
Associate II
Posted on June 30, 2017 at 18:58

https://community.st.com/0D50X00009XkhXwSAJ

'I also have right now something that looks like that. Spent a lot of time on it. Stuck.

Using USB Custom HID. Bulk transfers. 1ms IN, 10 ms OUT

Code generated with STM32CubeMX, on STM32F072RB on Discovery board. Even updated today the libs and regenerated the code : no progress

If I only transmit from PC, OK. Never blocks

If I only receive from PC, OK. Never Blocks

Now,

If I have IN reports at about every 16ms, and OUT reports about every 20ms, after some random time, really random, but never much more than 30 seconds, the OUT reports are stuck, that is, the transaction never completes.

After that, the IN reports can continue with no problem. I never have problems on the IN reports.

If I restart my PC application, I can even still enumerate, but cannot send any OUT report. A WriteFile() produces an ERROR_IO_PENDING that never ends.

Now

Here is the IMPORTANT information : if I disconnect the 'USB User' from the board, (still running thanks to the other USB connector for the supply and debug), then reconnect (which calls USBD_LL_Reset() among other things) , IT WORKS again, the OUT reports go through (at least for another 30sec...).

For me, this is the PROOF that it is a low level problem in ST's firmware.

I have even tried to activate double buffering : even worse. Can't make it work at all in double buffer.'

I ran into the same problem as above in my own development. I use the STM32Cube USB CDC class as a virtual COM port to transfer data to and from PC software. The data may come from two sides: one is the PC software and the other is a UART buffer. During development I found that after the MCU has received data from the PC software several times, it cannot receive data from the PC software any more (according to a bus analyzer, Bus Hound 6.1.0, the PC software can no longer send data to the MCU and cannot open the port successfully), but data transmitted to the PC software is still normal. If I re-plug the USB cable while debugging (without powering off), it works normally again. I know that the program then steps into this branch of HAL_PCD_IRQHandler():

    if (__HAL_PCD_GET_FLAG(hpcd, USB_ISTR_RESET))
    {
      __HAL_PCD_CLEAR_FLAG(hpcd, USB_ISTR_RESET);
      HAL_PCD_ResetCallback(hpcd);
      HAL_PCD_SetAddress(hpcd, 0U);
    }

However, re-plugging is not what we want. Tomorrow I will try the solution; I hope it works well.

That is, add

HAL_NVIC_DisableIRQ(USB_IRQn);

just before

__HAL_LOCK(hpcd);

and add

HAL_NVIC_EnableIRQ(USB_IRQn);

just after

__HAL_UNLOCK(hpcd);

Finally, I hope ST will provide an official fix for this problem.

I also want to point out that the UART HAL driver functions have the same problem, for example HAL_UART_Receive_DMA() and HAL_UART_Transmit_DMA().