2025-07-15 11:07 PM - edited 2025-07-15 11:49 PM
I'm building a USB Audio Class 1.0 full-speed microphone on an STM32L4 (HSI48+CRS clock) using ST's USBD stack. So far I'm just streaming a 1 kHz test tone at 48 kHz/16-bit.
Platform support I need
- macOS: must use explicit or implicit feedback (Adaptive isn't supported on IN)
- Android: restricted to UAC1.0, and Android support is essential for me, so UAC2.0 is out of the question
- Linux: adaptive works fine, explicit does not yet work for me
- Windows: what's supported here?
Hardware & clock
- USB FS off HSI48 with CRS locked to SOF
- SOF interrupts enabled
- ADC sampling not in use yet (sine is generated and sent in `USBD_AUDIO_DataIn` callback)
hpcd_USB_FS.Init.Sof_enable = ENABLE;
Descriptors
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 0x007e
bNumInterfaces 2
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xa0
(Bus Powered)
Remote Wakeup
MaxPower 100mA
Interface Association:
bLength 8
bDescriptorType 11
bFirstInterface 0
bInterfaceCount 2
bFunctionClass 1 Audio
bFunctionSubClass 1 Control Device
bFunctionProtocol 0
iFunction 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 0
bInterfaceClass 1 Audio
bInterfaceSubClass 1 Control Device
bInterfaceProtocol 0
iInterface 0
AudioControl Interface Descriptor:
bLength 9
bDescriptorType 36
bDescriptorSubtype 1 (HEADER)
bcdADC 1.00
wTotalLength 0x0027
bInCollection 1
baInterfaceNr(0) 1
AudioControl Interface Descriptor:
bLength 12
bDescriptorType 36
bDescriptorSubtype 2 (INPUT_TERMINAL)
bTerminalID 1
wTerminalType 0x0201 Microphone
bAssocTerminal 0
bNrChannels 1
wChannelConfig 0x0000
iChannelNames 0
iTerminal 0
AudioControl Interface Descriptor:
bLength 9
bDescriptorType 36
bDescriptorSubtype 6 (FEATURE_UNIT)
bUnitID 2
bSourceID 1
bControlSize 1
bmaControls(0) 0x00
bmaControls(1) 0x00
iFeature 0
AudioControl Interface Descriptor:
bLength 9
bDescriptorType 36
bDescriptorSubtype 3 (OUTPUT_TERMINAL)
bTerminalID 3
wTerminalType 0x0101 USB Streaming
bAssocTerminal 0
bSourceID 2
iTerminal 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 1
bAlternateSetting 0
bNumEndpoints 0
bInterfaceClass 1 Audio
bInterfaceSubClass 2 Streaming
bInterfaceProtocol 0
iInterface 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 1
bAlternateSetting 1
bNumEndpoints 2
bInterfaceClass 1 Audio
bInterfaceSubClass 2 Streaming
bInterfaceProtocol 0
iInterface 0
AudioStreaming Interface Descriptor:
bLength 7
bDescriptorType 36
bDescriptorSubtype 1 (AS_GENERAL)
bTerminalLink 3
bDelay 1 frames
wFormatTag 0x0001 PCM
AudioStreaming Interface Descriptor:
bLength 11
bDescriptorType 36
bDescriptorSubtype 2 (FORMAT_TYPE)
bFormatType 1 (FORMAT_TYPE_I)
bNrChannels 1
bSubframeSize 2
bBitResolution 16
bSamFreqType 1 Discrete
tSamFreq[ 0] 48000
Endpoint Descriptor:
bLength 9
bDescriptorType 5
bEndpointAddress 0x83 EP 3 IN
bmAttributes 5
Transfer Type Isochronous
Synch Type Asynchronous
Usage Type Data
wMaxPacketSize 0x0060 1x 96 bytes
bInterval 1
bRefresh 0
bSynchAddress 4
AudioStreaming Endpoint Descriptor:
bLength 7
bDescriptorType 37
bDescriptorSubtype 1 (EP_GENERAL)
bmAttributes 0x00
bLockDelayUnits 0 Undefined
wLockDelay 0x0000
Endpoint Descriptor:
bLength 9
bDescriptorType 5
bEndpointAddress 0x84 EP 4 IN
bmAttributes 17
Transfer Type Isochronous
Synch Type None
Usage Type Feedback
wMaxPacketSize 0x0003 1x 3 bytes
bInterval 1
bRefresh 0
bSynchAddress 0
I'm also configuring PMA correctly as far as I know:
HAL_PCDEx_PMAConfig(&hpcd_USB_FS, AUDIO_IN_EP_ADDR, PCD_SNG_BUF, 0x180); // AUDIO IN
HAL_PCDEx_PMAConfig(&hpcd_USB_FS, AUDIO_FB_IN_EP_ADDR, PCD_SNG_BUF, 0x01E0); // AUDIO FEEDBACK
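As a quick sanity check on those two offsets, here is a minimal sketch that verifies the regions do not overlap. The `pma_region_t` type and the `pma_regions_disjoint` helper are hypothetical names, not part of the ST HAL, and the sizes (96 and 3 bytes) are taken from the wMaxPacketSize values in the descriptors above. It does not cover the EP0 buffers or the buffer descriptor table, whose offsets are not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type, not part of the ST HAL:
 * one single-buffered region of USB packet memory (PMA). */
typedef struct {
    uint16_t offset; /* byte offset passed to HAL_PCDEx_PMAConfig */
    uint16_t size;   /* endpoint max packet size in bytes */
} pma_region_t;

/* Returns true when no two regions share a byte of packet memory. */
static bool pma_regions_disjoint(const pma_region_t *r, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        for (unsigned j = i + 1; j < n; j++) {
            uint32_t end_i = (uint32_t)r[i].offset + r[i].size;
            uint32_t end_j = (uint32_t)r[j].offset + r[j].size;
            /* Half-open intervals [offset, end) overlap if each starts
             * before the other one ends. */
            if (r[i].offset < end_j && r[j].offset < end_i) {
                return false;
            }
        }
    }
    return true;
}

/* The two regions from the post: 96-byte audio IN, 3-byte feedback IN. */
static const pma_region_t audio_pma[] = {
    { 0x0180, 96 },
    { 0x01E0, 3 },
};
```

With these values the audio buffer ends exactly at 0x01E0, where the feedback buffer starts, so the two regions are adjacent but do not overlap.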
What I'm seeing
As far as I can tell, the host never issues a URB_SUBMIT for the feedback endpoint, so nothing shows up on that endpoint in Wireshark.
Questions
1. Do full-speed UAC1.0 drivers on macOS, Linux, Android, and Windows all support explicit-feedback async capture, or do they fall back to adaptive/sync differently?
2. What's the recommended FS/UAC1.0 approach to achieve cross-platform mic input?
3. Any known gotchas around implicit vs explicit feedback on UAC1.0 mics?
4. Is sending data in `USBD_AUDIO_DataIn` and feedback in `USBD_AUDIO_SOF`, primed at `USB_REQ_SET_INTERFACE alt==1`, the right pattern?
5. Should I ditch explicit and go with implicit feedback? Is support for that better or worse than explicit?
Thanks in advance for any pointers!
I can also share my full Wireshark dumps if anyone is interested. Here is my full usbd_audio.c (part of a composite setup as you can see):
#include "usbd_audio.h"
#include "usbd_ctlreq.h"
#include "stm32l4xx_hal.h"
#include "adc.h"
#include "firmware_config.h"
#include <string.h>
#include <math.h>
extern DMA_HandleTypeDef hdma_adc1;
// Compile-time check for packet size alignment
_Static_assert(AUDIO_IN_PACKET_BYTES % sizeof(int16_t) == 0,
"AUDIO_IN_PACKET_BYTES must be aligned to int16_t size");
static uint8_t dummy_zero_packet[AUDIO_IN_PACKET_BYTES];
// Explicit feedback state (Q10.14 samples per 1 ms frame)
volatile uint32_t sample_count = 0; // Inc'd in ADC DMA callback
static uint32_t last_count = 0;
static uint32_t feedback_acc = 0;
static uint32_t feedback_incr = 48U << 14; // Start nominal (48.0 in Q10.14 = 0x0C0000)
static uint8_t feedback_data[3];
static uint32_t ms_counter = 0;
static uint8_t USBD_AUDIO_Init(USBD_HandleTypeDef *pdev, uint8_t cfgidx);
static uint8_t USBD_AUDIO_DeInit(USBD_HandleTypeDef *pdev, uint8_t cfgidx);
static uint8_t USBD_AUDIO_Setup(USBD_HandleTypeDef *pdev, USBD_SetupReqTypedef *req);
static uint8_t USBD_AUDIO_DataIn(USBD_HandleTypeDef *pdev, uint8_t epnum);
static uint8_t USBD_AUDIO_SOF(USBD_HandleTypeDef *pdev);
static uint8_t USBD_AUDIO_EP0_RxReady(USBD_HandleTypeDef *pdev);
static uint8_t USBD_AUDIO_EP0_TxReady(USBD_HandleTypeDef *pdev);
static void *USBD_AUDIO_GetAudioHeaderDesc(uint8_t *pConfDesc);
USBD_ClassTypeDef USBD_AUDIO = {
.Init = USBD_AUDIO_Init,
.DeInit = USBD_AUDIO_DeInit,
.Setup = USBD_AUDIO_Setup,
.EP0_TxSent = USBD_AUDIO_EP0_TxReady,
.EP0_RxReady = USBD_AUDIO_EP0_RxReady,
.DataIn = USBD_AUDIO_DataIn,
.DataOut = NULL,
.SOF = USBD_AUDIO_SOF,
.IsoINIncomplete = NULL,
.IsoOUTIncomplete = NULL,
.GetHSConfigDescriptor = NULL,
.GetFSConfigDescriptor = NULL,
.GetOtherSpeedConfigDescriptor = NULL,
.GetDeviceQualifierDescriptor = NULL
};
/**
* @brief USBD_AUDIO_Init
* Initialize the AUDIO interface
* pdev: device instance
* cfgidx: Configuration index
* @retval status
*/
static uint8_t USBD_AUDIO_Init(USBD_HandleTypeDef *pdev, uint8_t cfgidx) {
USBD_AUDIO_HandleTypeDef *haudio;
haudio = USBD_malloc(sizeof(USBD_AUDIO_HandleTypeDef));
if (haudio == NULL) {
pdev->pClassDataCmsit[pdev->classId] = NULL;
return USBD_FAIL;
}
pdev->pClassDataCmsit[pdev->classId] = haudio;
pdev->pClassData = haudio;
// Open the isochronous IN endpoint for microphone streaming
if (USBD_LL_OpenEP(pdev, AUDIO_IN_EP_ADDR, USBD_EP_TYPE_ISOC, AUDIO_IN_PACKET_BYTES) != USBD_OK) return USBD_FAIL;
pdev->ep_in[AUDIO_IN_EP_ADDR & 0x0F].bInterval = AUDIO_FS_BINTERVAL;
pdev->ep_in[AUDIO_IN_EP_ADDR & 0x0F].is_used = 1U;
// Open the feedback endpoint for microphone streaming
if (USBD_LL_OpenEP(pdev, AUDIO_FB_IN_EP_ADDR, USBD_EP_TYPE_ISOC, AUDIO_FB_IN_PACKET_BYTES) != USBD_OK) return USBD_FAIL;
pdev->ep_in[AUDIO_FB_IN_EP_ADDR &0x0F].bInterval = AUDIO_FS_BINTERVAL;
pdev->ep_in[AUDIO_FB_IN_EP_ADDR & 0x0F].is_used = 1U;
// Flush endpoints
USBD_LL_FlushEP(pdev, AUDIO_IN_EP_ADDR);
USBD_LL_FlushEP(pdev, AUDIO_FB_IN_EP_ADDR);
// Initialize buffer pointers and state
haudio->alt_setting = 0U;
return USBD_OK;
}
/**
* @brief USBD_AUDIO_DeInit
* DeInitialize the AUDIO layer
* pdev: device instance
* cfgidx: Configuration index
* @retval status
*/
static uint8_t USBD_AUDIO_DeInit(USBD_HandleTypeDef *pdev, uint8_t cfgidx) {
UNUSED(cfgidx);
// Stop ADC streaming
ADC_StopStreaming();
// Close the AUDIO IN endpoint
(void)USBD_LL_CloseEP(pdev, AUDIO_IN_EP_ADDR);
pdev->ep_in[AUDIO_IN_EP_ADDR & 0x0F].is_used = 0U;
pdev->ep_in[AUDIO_IN_EP_ADDR & 0x0F].bInterval = 0U;
// Close the FEEDBACK IN endpoint
(void)USBD_LL_CloseEP(pdev, AUDIO_FB_IN_EP_ADDR);
pdev->ep_in[AUDIO_FB_IN_EP_ADDR & 0x0F].is_used = 0U;
pdev->ep_in[AUDIO_FB_IN_EP_ADDR & 0x0F].bInterval = 0U;
if (pdev->pClassDataCmsit[pdev->classId] != NULL) {
USBD_free(pdev->pClassDataCmsit[pdev->classId]);
pdev->pClassDataCmsit[pdev->classId] = NULL;
pdev->pClassData = NULL;
}
// Call the board‐level deinit (e.g. stop ADC/DMA)
((USBD_AUDIO_ItfTypeDef *)pdev->pUserData[pdev->classId])->DeInit(0U);
return USBD_OK;
}
/**
* @brief USBD_AUDIO_Setup
* Handle the AUDIO specific requests
* pdev: instance
* req: usb requests
* @retval status
*/
static uint8_t USBD_AUDIO_Setup(USBD_HandleTypeDef *pdev, USBD_SetupReqTypedef *req) {
USBD_AUDIO_HandleTypeDef *haudio = (USBD_AUDIO_HandleTypeDef *)pdev->pClassDataCmsit[pdev->classId];
uint16_t len;
uint8_t *pbuf;
uint16_t status_info = 0U;
USBD_StatusTypeDef ret = USBD_OK;
uint8_t recipient = req->bmRequest & USB_REQ_RECIPIENT_MASK;
if (haudio == NULL) {
return (uint8_t)USBD_FAIL;
}
switch (req->bmRequest & USB_REQ_TYPE_MASK) {
case USB_REQ_TYPE_CLASS:
if (recipient == USB_REQ_RECIPIENT_ENDPOINT) {
if (req->bRequest == AUDIO_REQ_GET_CUR) {
// 3-byte LSB-first of 48000 Hz
static const uint8_t freq3[3] = {
(uint8_t)(USBD_AUDIO_FREQ & 0xFF),
(uint8_t)((USBD_AUDIO_FREQ >> 8) & 0xFF),
(uint8_t)((USBD_AUDIO_FREQ >> 16) & 0xFF)
};
USBD_CtlSendData(pdev, (uint8_t *)freq3, 3);
return ret;
}
else if (req->bRequest == AUDIO_REQ_SET_CUR) {
haudio->control.cmd = AUDIO_REQ_SET_CUR;
haudio->control.len = MIN(req->wLength, 3);
USBD_CtlPrepareRx(pdev, haudio->control.data, haudio->control.len);
return ret;
}
}
/* all other class‐type requests we don’t support */
// USBD_CtlError(pdev, req);
// return USBD_FAIL;
break;
case USB_REQ_TYPE_STANDARD:
switch (req->bRequest){
case USB_REQ_GET_STATUS:
if (pdev->dev_state == USBD_STATE_CONFIGURED){
(void)USBD_CtlSendData(pdev, (uint8_t *)&status_info, 2U);
} else {
USBD_CtlError(pdev, req);
ret = USBD_FAIL;
}
break;
case USB_REQ_GET_DESCRIPTOR:
if ((req->wValue >> 8) == AUDIO_DESCRIPTOR_TYPE){
pbuf = (uint8_t *)USBD_AUDIO_GetAudioHeaderDesc(pdev->pConfDesc);
if (pbuf != NULL){
len = MIN(USB_AUDIO_DESC_SIZ, req->wLength);
(void)USBD_CtlSendData(pdev, pbuf, len);
} else {
USBD_CtlError(pdev, req);
ret = USBD_FAIL;
}
}
break;
case USB_REQ_GET_INTERFACE:
if (pdev->dev_state == USBD_STATE_CONFIGURED){
(void)USBD_CtlSendData(pdev, (uint8_t *)&haudio->alt_setting, 1U);
} else {
USBD_CtlError(pdev, req);
ret = USBD_FAIL;
}
break;
case USB_REQ_SET_INTERFACE:
if (pdev->dev_state == USBD_STATE_CONFIGURED){
uint8_t alt = (uint8_t)(req->wValue);
if (alt <= USBD_MAX_NUM_INTERFACES){
haudio->alt_setting = alt;
if(alt == 1){
// Start audio streaming
ADC_StartStreaming(); // Start ADC DMA
// Prime the audio endpoint with first packet
USBD_LL_FlushEP(pdev, AUDIO_IN_EP_ADDR);
USBD_LL_Transmit(pdev, AUDIO_IN_EP_ADDR, dummy_zero_packet, AUDIO_IN_PACKET_BYTES);
// Prime the feedback endpoint with first packet
// Reset feedback state
sample_count = 0;
last_count = 0;
feedback_acc = 0;
feedback_incr = 48U << 14; // Start nominal (48.0 in Q10.14 = 0x0C0000)
ms_counter = 0;
// Prime with initial feedback packet
feedback_data[0] = (uint8_t)(feedback_incr & 0xFF);
feedback_data[1] = (uint8_t)((feedback_incr >> 8) & 0xFF);
feedback_data[2] = (uint8_t)((feedback_incr >> 16) & 0xFF);
USBD_LL_FlushEP(pdev, AUDIO_FB_IN_EP_ADDR);
USBD_LL_Transmit(pdev, AUDIO_FB_IN_EP_ADDR, feedback_data, AUDIO_FB_IN_PACKET_BYTES);
} else {
// Stop audio streaming when switching to alt setting 0
ADC_StopStreaming();
}
} else {
USBD_CtlError(pdev, req);
ret = USBD_FAIL;
}
} else {
USBD_CtlError(pdev, req);
ret = USBD_FAIL;
}
break;
case USB_REQ_CLEAR_FEATURE:
break;
default:
USBD_CtlError(pdev, req);
ret = USBD_FAIL;
break;
}
break;
default:
USBD_CtlError(pdev, req);
ret = USBD_FAIL;
break;
}
return (uint8_t)ret;
}
/**
* @brief USBD_AUDIO_DataIn
* handle data IN Stage
* pdev: device instance
* epnum: endpoint index
* @retval status
*/
static uint8_t USBD_AUDIO_DataIn(USBD_HandleTypeDef *pdev, uint8_t epnum){
USBD_AUDIO_HandleTypeDef *haudio = (USBD_AUDIO_HandleTypeDef *)pdev->pClassDataCmsit[pdev->classId];
if (!haudio || haudio->alt_setting != 1) {
return USBD_OK;
}
// Handle Audio Data Endpoint
if (epnum == (AUDIO_IN_EP_ADDR & 0x7F)) {
static uint32_t sample_counter = 0;
int16_t* samples = (int16_t*)dummy_zero_packet;
for (int i = 0; i < 48; i++) {
// Generate 1kHz sine wave at 48kHz sample rate (one period = 48 samples)
float phase = (float)((sample_counter + i) % 48U) * 2.0f * 3.14159f * 1000.0f / 48000.0f;
samples[i] = (int16_t)(sinf(phase) * 8000);
}
// Keep the counter within one period so the float phase never loses precision
sample_counter = (sample_counter + 48U) % 48U;
USBD_LL_Transmit(pdev, AUDIO_IN_EP_ADDR, dummy_zero_packet, AUDIO_IN_PACKET_BYTES);
return USBD_OK;
}
return USBD_OK;
}
static uint8_t USBD_AUDIO_SOF(USBD_HandleTypeDef *pdev){
USBD_AUDIO_HandleTypeDef *haudio = (USBD_AUDIO_HandleTypeDef *)pdev->pClassDataCmsit[pdev->classId];
// Only send feedback when streaming is active
if (!haudio || haudio->alt_setting != 1) {
return USBD_OK;
}
// Measure drift every 100 ms and adapt feedback rate
if (++ms_counter >= 100) {
uint32_t delta = sample_count - last_count;
last_count = sample_count;
ms_counter = 0;
// Convert samples_per_100ms to Q10.14 per frame
// delta samples in 100ms
uint32_t new_incr = ((delta * (1UL << 14)) + 50) / 100;
// Smoothing
feedback_incr = (feedback_incr * 7 + new_incr) >> 3;
}
// Pack the current rate (Q10.14 samples per frame, not a running sum) into 3-byte format
feedback_data[0] = (uint8_t)(feedback_incr & 0xFF);
feedback_data[1] = (uint8_t)((feedback_incr >> 8) & 0xFF);
feedback_data[2] = (uint8_t)((feedback_incr >> 16) & 0xFF);
// Send feedback packet (no flush, let USB engine handle scheduling)
USBD_LL_Transmit(pdev, AUDIO_FB_IN_EP_ADDR, feedback_data, AUDIO_FB_IN_PACKET_BYTES);
return USBD_OK;
}
/**
* @brief USBD_AUDIO_EP0_RxReady
* handle EP0 Rx Ready event
* pdev: device instance
* @retval status
*/
static uint8_t USBD_AUDIO_EP0_RxReady(USBD_HandleTypeDef *pdev) {
UNUSED(pdev);
return USBD_OK;
}
/**
* @brief USBD_AUDIO_EP0_TxReady
* handle EP0 Tx Ready event
* pdev: device instance
* @retval status
*/
static uint8_t USBD_AUDIO_EP0_TxReady(USBD_HandleTypeDef *pdev){
UNUSED(pdev);
/* Only OUT control data are processed */
return (uint8_t)USBD_OK;
}
/**
* @brief USBD_AUDIO_RegisterInterface
* pdev: device instance
* fops: Audio interface callback
* @retval status
*/
uint8_t USBD_AUDIO_RegisterInterface(USBD_HandleTypeDef *pdev, USBD_AUDIO_ItfTypeDef *fops){
if (fops == NULL){
return (uint8_t)USBD_FAIL;
}
pdev->pUserData[pdev->classId] = fops;
return (uint8_t)USBD_OK;
}
/**
* @brief USBD_AUDIO_GetEpPcktSze
* pdev: device instance (reserved for future use)
* If: Interface number (reserved for future use)
* Ep: Endpoint number (reserved for future use)
* @retval status
*/
uint32_t USBD_AUDIO_GetEpPcktSze(USBD_HandleTypeDef *pdev, uint8_t If, uint8_t Ep){
// Return the wMaxPacketSize value in Bytes: 96
return AUDIO_IN_PACKET_BYTES;
}
/**
* @brief USBD_AUDIO_GetAudioHeaderDesc
* This function returns the Audio descriptor
* pdev: device instance
* pConfDesc: pointer to the configuration descriptor
* @retval pointer to the Audio AC Header descriptor
*/
static void *USBD_AUDIO_GetAudioHeaderDesc(uint8_t *pConfDesc){
USBD_ConfigDescTypeDef *desc = (USBD_ConfigDescTypeDef *)(void *)pConfDesc;
USBD_DescHeaderTypeDef *pdesc = (USBD_DescHeaderTypeDef *)(void *)pConfDesc;
uint8_t *pAudioDesc = NULL;
uint16_t ptr;
if (desc->wTotalLength > desc->bLength){
ptr = desc->bLength;
while (ptr < desc->wTotalLength){
pdesc = USBD_GetNextDesc((uint8_t *)pdesc, &ptr);
if ((pdesc->bDescriptorType == AUDIO_INTERFACE_DESCRIPTOR_TYPE) &&
(pdesc->bDescriptorSubType == AUDIO_CONTROL_HEADER)){
pAudioDesc = (uint8_t *)pdesc;
break;
}
}
}
return pAudioDesc;
}
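For reference, USB 2.0 encodes the full-speed explicit feedback value as unsigned 10.14 fixed point, in samples per 1 ms frame, so a nominal 48 kHz is 48 << 14 = 0x0C0000 and fits in 3 bytes. Below is a minimal sketch of that conversion and packing, assuming the rate is measured by counting samples over a number of frames. `fb_q10_14` and `fb_pack` are hypothetical helper names, not part of ST's stack.

```c
#include <stdint.h>

/* Convert a measured rate (samples counted over `frames` 1 ms frames)
 * into the 10.14 fixed-point samples-per-frame value.
 * Hypothetical helper; `frames` must be non-zero. */
static uint32_t fb_q10_14(uint32_t samples, uint32_t frames)
{
    /* 64-bit intermediate so samples << 14 cannot overflow;
     * adding frames / 2 rounds to nearest. */
    return (uint32_t)((((uint64_t)samples << 14) + frames / 2U) / frames);
}

/* Pack the low 24 bits, least significant byte first, into the
 * 3-byte feedback packet. */
static void fb_pack(uint32_t q10_14, uint8_t out[3])
{
    out[0] = (uint8_t)(q10_14 & 0xFFU);
    out[1] = (uint8_t)((q10_14 >> 8) & 0xFFU);
    out[2] = (uint8_t)((q10_14 >> 16) & 0xFFU);
}
```

For example, 4800 samples counted over 100 frames gives 0x0C0000, which packs to the bytes 0x00, 0x00, 0x0C.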
2025-07-16 12:32 AM
Is this the same question as here?
https://www.reddit.com/r/embedded/comments/1m0sbke/usb_uac10_mic_on_stm32l4_explicitfeedback_vs/
Just one question: what is implicit vs. explicit feedback on UAC1.0?
(I have only built a standard codec on USB, 48 kHz, isochronous; that worked.)
2025-07-16 4:05 AM
Yes, I also asked it over on r/embedded, but then thought that maybe this would be a better place to ask for this specific stack!
In my understanding, "explicit feedback" requires a separate feedback endpoint that publishes the real sample rate relative to the SOF timing as 3-byte values.
"Implicit feedback" does not use a feedback endpoint; instead, the host automatically adjusts its clock to match the data rate of the device. I think implicit is less widely supported than explicit, but as I said, it's very hard to find conclusive information about this online.
There is also a "synchronous" mode that does not use any feedback either, but requires the device to match the clock of the host, which means it must always send exactly 48 samples per frame despite ADC clock drift. I have considered implementing my own sample rate conversion (fractional SRC) for it, but this would increase complexity massively, and intuitively it should not be necessary in asynchronous mode.
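To illustrate the asynchronous case: the device can carry the fractional part of its actual rate from frame to frame and send whichever whole number of samples results (for example 48 most of the time and occasionally 47 or 49 at a nominal 48 kHz), so no SRC is involved. This is a rough sketch under that assumption. `frame_pacer_t` and `pacer_next` are made-up names, and the rate uses the same 10.14 samples-per-frame format as the feedback value.

```c
#include <stdint.h>

/* Hypothetical pacing state for an asynchronous IN endpoint. */
typedef struct {
    uint32_t rate_q14; /* actual rate in Q10.14 samples per 1 ms frame,
                          e.g. 48 << 14 for exactly 48 kHz */
    uint32_t frac_q14; /* fractional samples carried to the next frame */
} frame_pacer_t;

/* Returns how many whole samples to put in the next 1 ms packet and
 * keeps the remainder for the following frame. */
static uint32_t pacer_next(frame_pacer_t *p)
{
    uint32_t total = p->frac_q14 + p->rate_q14;
    p->frac_q14 = total & ((1U << 14) - 1U);
    return total >> 14;
}
```

Note that wMaxPacketSize would need headroom for the larger packet (for example 98 bytes for 49 mono 16-bit samples) for this to work; the 96 bytes in the descriptor above only fit 48 samples.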
2025-07-16 5:35 AM - edited 2025-07-16 5:38 AM
I can only tell you what I know:
Audio is always transferred isochronously, that is, sent at a fixed rate (from the host, PC) in 1 ms blocks; typically at 48 kHz that is 48 samples of 16 bits per channel, 192 B for stereo.
There is no (!) feedback, synchronisation, or anything like that; the device just has to recover the real clock from the 1 ms interval.
DACs like the PCM5102 do this with an internal PLL (because the master clock in the PC always has some drift or offset, typically 10^-5 or so).
How the "high-end" USB audio interfaces do it, at 96 kHz/24-bit or whatever, I don't know; I think it needs UAC2, and maybe they just transfer bulk data to a local buffer and then play it from their own precision master clock, totally independent of the host clock.
Optimum quality is only possible with local low-jitter crystal clocks.
So just use isochronous mode at 48 kHz/16-bit; that works, I tried. :)
It should also enumerate on Linux/Windows/Mac without any problem; it's what all cheap USB sound cards use.
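The block-size arithmetic above can be written out as a small sketch. `bytes_per_frame` is a hypothetical helper and assumes the sample rate is a whole multiple of 1000 Hz, so every 1 ms frame carries the same number of samples.

```c
#include <stdint.h>

/* Nominal isochronous payload per 1 ms full-speed frame, in bytes.
 * Assumes rate_hz is a whole multiple of 1000. */
static uint32_t bytes_per_frame(uint32_t rate_hz, uint32_t channels,
                                uint32_t bytes_per_sample)
{
    return (rate_hz / 1000U) * channels * bytes_per_sample;
}
```

Stereo 16-bit at 48 kHz gives 192 bytes, while the mono microphone in the original post gives 96 bytes, which matches the wMaxPacketSize of 0x0060 in its descriptors.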