cancel
Showing results for 
Search instead for 
Did you mean: 

32F417: How does USB CDC OUT (PC -> target) flow control work?

PHolt.1
Senior II

I have my board running well, and flow control works in the "IN" direction (board -> PC).

But there are strange issues. I am testing with a loopback program which has variable size packets and at a point which is very close to 1024 bytes I see a lot of errors.

I do have a 1024 byte buffer in that data flow but changing that to 2048 doesn't change anything, and I suspect the flow control isn't working.

As with everything-USB it is damn complicated but AFAICT the data path is:

1) USB (FS mode) does an interrupt on each RX packet, and it has 1-64 bytes of data.

2) An ISR copies this into a 64 byte buffer

3) The ISR calls this

/**
  * @brief  Data received over USB OUT endpoint are sent over CDC interface
  *         through this function.
  *
  * We get here when there is some OUT data (OUT = PC to target). That data is
  * loaded into cdc_receive_temp_buffer[64], supplied to us here as "buf".
  * This is the call flow: https://peter-ftp.co.uk/screenshots/202209065715674317.jpg
  * Data is then copied into our circular buffer cdc_receive_buffer[1024].
  * No more data can be sent by PC until USBD_CDC_ReceivePacket gets called.
  * This function is called by an ISR so no need to disable interrupts.
  *
  * @param  Buf: Buffer
  * @param  Len: Number of data received (in bytes)
  *
  */
 
static int8_t CDC_Receive_FS(uint8_t* buf, uint32_t *len)
{
	if (*len > 0)
	{
 
		// copy over the data from buf, length len, into our circular buffer
		for (uint16_t i = 0; i < *len; i++)
		{
			cdc_receive_buffer[cdc_receive_buffer_rx_index++] = buf[i];
			cdc_receive_buffer_rx_index %= CDC_RX_BUFFER_SIZE;
			if (cdc_receive_buffer_rx_index == cdc_receive_buffer_get_index)
			{
				// Buffer is full - discard a byte from it. This is necessary to resolve the ambiguity of
				// the two pointers being equal if FULL or EMPTY. We make sure they are never equal after an input.
				// Increment the get index to ensure that only the last BUFFER-1 bytes can be accessed
				cdc_receive_buffer_get_index++;
				cdc_receive_buffer_get_index %= CDC_RX_BUFFER_SIZE;
			}
		}
 
	}
 
	// Enable reception of next packet from Host
	USBD_CDC_ReceivePacket(&hUsbDeviceFS);
 
	return (USBD_OK);
}
 

4) The target application extracts data from the circular buffer

The last line above, USBD_CDC_ReceivePacket, is supposed to enable the next 1-64 byte packet to arrive from the PC. This is executed regardless of whether that circular buffer overflowed, which is not ideal because there is no flow control, but if I don't call USBD_CDC_ReceivePacket I need to call it later when the target application has extracted some data, but this is a difficult problem because with no data arriving that interrupt won't be happening. So I would need to call it from some foreground code, but then I have reentrancy issues.

The obvious way is to skip the USBD_CDC_ReceivePacket if that buffer is full and call it from the code which is extracting data from that circular buffer, whenever >=64 bytes of room is in there.

The above code is standard; it is the ST port of some common USB code for the 32F4 family, supplied with Cube IDE. As usual there are loads of people posting all over the place trying to solve this.

However I have established that when there are errors, the above buffer overflow condition is not happening. So it is something else.

AIUI, when a PC application sends data to USB VCP, it opens a file called say COM3 and sends data to it, in packets of any length. You can send 100k if you want. USB then packages this into 1-64 byte packets and sends it down the wire. There is no 1024 or whatever packet limit involved.

For an idea of how much code there is:

0693W00000UnKjsQAF.png

5 REPLIES 5
Pavel A.
Evangelist III

The flow control is very simple. When the device is not ready to receive the OUT data, it NAKs the OUT token, as usual. The controller should do this automatically, without any overhead on the firmware side.

When you call USBD_CDC_ReceivePacket you tell the controller to enable receive and get the following OUT data. Until you are ready to digest more data, don't call USBD_CDC_ReceivePacket.

The serial port driver on the host somehow passes back pressure onto the host apps (for example, as "queue full" ). It may buffer the outstanding data, or not. This is not the concern of the device.

PHolt.1
Senior II

OK, but that whole thing is called from an ISR which is generated by USB.

So if, upon finding the destination unable to consume more data, I don't call USBD_CDC_ReceivePacket, I don't get another chance to call it. No more data will ever arrive.

Or is that wrong... will USB keep repeating that interrupt?

My tests suggest that it does not. The moment that ISR executes without USBD_CDC_ReceivePacket being called, that is the end. No more USB data.

I have tried various schemes e.g.

static int8_t CDC_Receive_FS(uint8_t* buf, uint32_t *len)
{
 
	if (*len > 0)
	{
 
		// copy over the data from buf, length len, into our circular buffer
		for (uint16_t i = 0; i < *len; i++)
		{
			cdc_receive_buffer[cdc_receive_buffer_rx_index++] = buf[i];
			cdc_receive_buffer_rx_index %= CDC_RX_BUFFER_SIZE;
			if (cdc_receive_buffer_rx_index == cdc_receive_buffer_get_index)
			{
				// Buffer is full - discard a byte from it. This is necessary to resolve the ambiguity of
				// the two pointers being equal if FULL or EMPTY. We make sure they are never equal after an input.
				// Increment the get index to ensure that only the last BUFFER-1 bytes can be accessed
				cdc_receive_buffer_get_index++;
				cdc_receive_buffer_get_index %= CDC_RX_BUFFER_SIZE;
			}
		}
 
	}
 
	// Enable reception of next packet from Host, if there is room for 100+ bytes
	// USB FS sends 64 byte packets
 
	if ( (CDC_RX_BUFFER_SIZE - CDC_get_ipqcount()) > 100)
	{
		USBD_CDC_ReceivePacket(&hUsbDeviceFS);
		return (USBD_OK);
	}
	else
	{
		return (USBD_BUSY);
	}
 
}

This is USB CDC. The funny thing is that I have been around this block with USB MSC; specifically when the host is writing a file to a serial FLASH filesystem (FatFS) which has a 15ms sector write time. Despite countless forum posts, that was never resolved and I simply return USB_OK after the 15ms write is finished... It's a hack to have a USB ISR hung for 15ms but it works perfectly. I have since heard that USB FLASH sticks do exactly the same thing.

gbm
Lead III

The simplest, not very optimal solution, is to enable further data reception after processing the whole data buffer (without extra buffering/copying to another buffer). If you process the data outside of USB interrupt context, just make sure to disable interrupts before calling USBD_CDC_ReceivePacket() and enable them right after the call. You don't need to call USBD_CDC_SetRxBuffer if you use the same buffer for next data packet. In CDC_Receive_FS() make sure not to call data procesing if *Len is 0 - in that case just call ReceivePacket() immediately.

Usually the lack of extra buffering is not a big deal, as the next packet will arrive no sooner than in 1 ms.

A better solution involves the use of software-triggered interrupts.

PHolt.1
Senior II

Thank you! That is what I thought might be needed but I was expecting it to be a lot more complicated.

I copy the 64 byte USB buffer into a 1k circular buffer within the USB interrupt, so I can stop calling USBD_CDC_ReceivePacket once the space in this circular buffer goes below some amount.

Then when the RTOS task which empties this 1k circular buffer expands the space in it above some value, it will call USBD_CDC_ReceivePacket with interrupts disabled around the call. That long call chain is mostly passing pointers along...

It's also interesting to realise that at 1ms interrupt rate the CDC data rate must be limited to 64kbytes/sec. I measured values in that ballpark previously without realising this.

I can see that triggering that USB ISR in software, with len=0, would be neat, and is probably doable by setting the relevant IP bit, but what would that give me?

PHolt.1
Senior II

Update:

The above suggestion by gbm seems to work.

This runs from USB interrupt:

static int8_t CDC_Receive_FS(uint8_t* buf, uint32_t *len)
{
 
	// copy over the data from buf, length len, into our circular buffer
 
	if (*len > 0)
	{
		for (uint16_t i = 0; i < *len; i++)
		{
			cdc_receive_buffer[cdc_receive_buffer_rx_index++] = buf[i];
			cdc_receive_buffer_rx_index %= CDC_RX_BUFFER_SIZE;
			if (cdc_receive_buffer_rx_index == cdc_receive_buffer_get_index)
			{
				// Buffer is full - discard a byte from it.
				cdc_receive_buffer_get_index++;
				cdc_receive_buffer_get_index %= CDC_RX_BUFFER_SIZE;
			}
		}
	}
 
	// Run USBD_CDC_ReceivePacket only if there is enough room in the circular buffer
 
	if ( (CDC_RX_BUFFER_SIZE - CDC_get_ipqcount()) > 100)	// USB moves 0-64 bytes
	{
		// Enable reception of next packet from Host
		USBD_CDC_ReceivePacket(&hUsbDeviceFS);
	}
	else
	{
		CDC_rx_blocked=true;	// mark "USBD_CDC_ReceivePacket needed"
	}
 
	return (USBD_OK);
 
}

and this runs from foreground (actually an RTOS task but that's irrelevant:

// Read any available bytes from the RX circular buffer up to the maximum number given
 
uint16_t CDC_get_rx_bytes(
	uint8_t *data,			// Out 	Where to copy the received bytes
	uint16_t maxlength		// In	Max number of bytes to receive
)
{
	portENTER_CRITICAL();	// prevent interrupts while updating the index
 
	uint16_t length = 0;
	uint16_t count = CDC_get_ipqcount();
 
	length = MIN(maxlength, count);
 
	for (uint16_t i = 0; i < length; i++)
	{
		data[i] = cdc_receive_buffer[cdc_receive_buffer_get_index++];
		cdc_receive_buffer_get_index %= CDC_RX_BUFFER_SIZE;
	}
 
	// If USBD_CDC_ReceivePacket was skipped (in CDC_Receive_FS) and there is useful
	// room in the circular buffer for more data, then run it now
	// Must run with ints disabled
 
	if ( (CDC_rx_blocked) && ( (CDC_RX_BUFFER_SIZE - CDC_get_ipqcount()) > 100) )
	{
		// Enable reception of next packet from Host
		USBD_CDC_ReceivePacket(&hUsbDeviceFS);
		CDC_rx_blocked=false;
	}
 
	portEXIT_CRITICAL();
 
	return length;
}

The flag CDC_rx_blocked gets initialised to FALSE. It's a little bit like xon/xoff handshake actually.

I am testing this in various ways. I have set up an RTOS task which does loopback, and I am running some loopback tests on the PC..

One is a custom written loopback program (freelancer.com, $100) which sends blocks of up to 100k bytes of random data and receives them and compares. There are loopback testers but none work on a single port. This is running at 47kbytes/sec which is a reasonable % of the max theoretical CDC speed of 64kbytes/sec. Since 100k is way more than any buffering along the path in my board, flow control must be working.

Another is an upload of a 1mbyte text file with Teraterm and logging the returned data. This also works but one can make it lose data. I can't be sure it isn't an issue with Teraterm when logging data to a file, because if my board transmits text to Teraterm (no logging to a file) with upstream flow control disabled (actually I have this as an option for debug messages) Teraterm does lose some of it, so this is a mystery.

CDC_Receive_FS always returns OK status. As happened with the MSC case, the BUSY status just doesn't seem to work.