2014-09-09 03:55 PM
Using STM32Cube_FW_F4_V1.1.0 on STM32F401
I am trying to maximize host-to-STM32 throughput on a full-speed (FS) bulk endpoint. With a 64-byte Rx buffer I only get about 5 Mbps, because the interrupt, the setup of the next Rx buffer and the restart of Rx take a large amount of time. With a much larger buffer I get 9 Mbps, but I don't get an Rx-complete interrupt unless the host sends a short packet. I need the bigger buffer for speed, but I can't count on the host sending non-64-byte packets, and if the host sends less than a full Rx buffer, the app doesn't get an interrupt. Is there some way to stop the Rx in the middle and recover the data already received (and its count) as if the buffer had filled, without dropping any in-process data (NAKing it is OK)? Or is there some other solution?
#usb-fs-bulk-rx-speed-stm32cube
2014-09-10 08:38 AM
You have two problems:
a) How to implement variable-length transfers.
b) How to move large data quickly over the bulk OUT endpoint (host --> device).

a) Variable-length transfer
In practice, there are two popular methods to send a variable-length transfer over a bulk/interrupt OUT endpoint.
1) The transfer length is told to the device beforehand, or in the first packet (transaction) of the transfer.
2) The transfer is always terminated by a short packet, including a ZLP (Zero-Length Packet).
The second method is applicable only to a limited set of PC drivers, like WinUSB (**1), because
- ZLP termination has to be requested from the driver by the PC application,
- but most PC in-box drivers don't provide any way to send a ZLP.
(**1) WinUSB sends a ZLP over an OUT endpoint when the SHORT_PACKET_TERMINATE policy is enabled.
http://msdn.microsoft.com/en-us/library/windows/hardware/ff728833%28v=vs.85%29.aspx
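As a rough illustration of the two methods above, here is a minimal sketch in plain C. The 4-byte little-endian length header and all function names are assumptions made for this example; they are not part of any USB class specification or of the STM32Cube library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BULK_FS_MPS 64u /* max packet size of a full-speed bulk endpoint */

/* Method 1 (hypothetical framing): the host writes the payload length as a
 * 4-byte little-endian value at the start of the first packet. */
static void put_length_header(uint8_t *first_packet, uint32_t payload_len)
{
    first_packet[0] = (uint8_t)(payload_len);
    first_packet[1] = (uint8_t)(payload_len >> 8);
    first_packet[2] = (uint8_t)(payload_len >> 16);
    first_packet[3] = (uint8_t)(payload_len >> 24);
}

/* Device side of method 1: read the length back from the first packet. */
static uint32_t get_length_header(const uint8_t *first_packet)
{
    return (uint32_t)first_packet[0]
         | ((uint32_t)first_packet[1] << 8)
         | ((uint32_t)first_packet[2] << 16)
         | ((uint32_t)first_packet[3] << 24);
}

/* Method 2: the transfer ends on the first packet shorter than MPS. When the
 * length is a multiple of MPS, the last data packet is full, so a ZLP has to
 * follow it to mark the end. */
static bool needs_zlp(uint32_t transfer_len, uint32_t mps)
{
    assert(mps != 0u);
    return (transfer_len % mps) == 0u;
}

/* Packets seen on the bus with short-packet termination: all full packets,
 * plus one terminating short packet (which may be a ZLP). */
static uint32_t packets_on_bus(uint32_t transfer_len, uint32_t mps)
{
    assert(mps != 0u);
    return transfer_len / mps + 1u;
}
```

For example, under method 2 a 128-byte transfer goes out as two full packets plus a ZLP, while a 100-byte transfer ends on its 36-byte second packet and needs no ZLP.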
For other drivers, like CDC (Virtual COM port), you have to apply the first method.

b) Transfer speed
The USB device engine on STM32F2/F4 is transfer-oriented by design; we have to set up the number of packets (transactions) and the entire transfer length in the endpoint register. To make the endpoint run quickly, we have to assign the real transfer length (or more) to the endpoint. The engine then works quickly without (much) intervention from the firmware.
1) The first packet holds the transfer length
The firmware sets up a transfer of MPS (MaxPacketSize) bytes (i.e. 64 bytes for full-speed) for the first packet. When it receives the first packet, the exact length of the transfer is known. The firmware then sets up another transfer for exactly the remaining length.
2) Short packet termination
In this method, your firmware assigns a transfer length large enough to cover the maximum possible transfer length in your protocol. When a short packet (including a ZLP) is received, the transfer terminates, and a transfer-completion interrupt should be generated. On the host side, the PC application has to add the ZLP explicitly, or it sets up the driver so that a ZLP is appended when the transfer length is a multiple of MPS.
If it is possible in your setup, the second method is the simpler one on both sides.

Tsuneo
2014-09-10 04:39 PM