USB UAC1.0 Mic on STM32L4: Explicit/Implicit-Feedback support

bvmd · ‎2025-07-15

I'm building a USB Audio Class 1.0 full-speed microphone on an STM32L4 (HSI48+CRS clock) using ST's USBD stack. So far I'm just streaming a 1 kHz test tone at 48 kHz/16-bit.

Platform support I need
- macOS: must use explicit or implicit feedback (Adaptive isn't supported on IN)
- Android: only supports UAC1.0, and Android support is essential for me, so UAC2.0 is out of the question
- Linux: adaptive works fine, explicit does not yet work for me
- Windows: what's supported here?

Hardware & clock
- USB FS off HSI48 with CRS locked to SOF
- SOF interrupts enabled
- ADC sampling not in use yet (sine is generated and sent in `USBD_AUDIO_DataIn` callback)

hpcd_USB_FS.Init.Sof_enable = ENABLE;

Descriptors

Configuration Descriptor:
  bLength                 9
  bDescriptorType         2
  wTotalLength       0x007e
  bNumInterfaces          2
  bConfigurationValue     1
  iConfiguration          0 
  bmAttributes         0xa0
    (Bus Powered)
    Remote Wakeup
  MaxPower              100mA
  Interface Association:
    bLength                 8
    bDescriptorType        11
    bFirstInterface         0
    bInterfaceCount         2
    bFunctionClass          1 Audio
    bFunctionSubClass       1 Control Device
    bFunctionProtocol       0 
    iFunction               0 
  Interface Descriptor:
    bLength                 9
    bDescriptorType         4
    bInterfaceNumber        0
    bAlternateSetting       0
    bNumEndpoints           0
    bInterfaceClass         1 Audio
    bInterfaceSubClass      1 Control Device
    bInterfaceProtocol      0 
    iInterface              0 
    AudioControl Interface Descriptor:
      bLength                 9
      bDescriptorType        36
      bDescriptorSubtype      1 (HEADER)
      bcdADC               1.00
      wTotalLength       0x0027
      bInCollection           1
      baInterfaceNr(0)        1
    AudioControl Interface Descriptor:
      bLength                12
      bDescriptorType        36
      bDescriptorSubtype      2 (INPUT_TERMINAL)
      bTerminalID             1
      wTerminalType      0x0201 Microphone
      bAssocTerminal          0
      bNrChannels             1
      wChannelConfig     0x0000
      iChannelNames           0 
      iTerminal               0 
    AudioControl Interface Descriptor:
      bLength                 9
      bDescriptorType        36
      bDescriptorSubtype      6 (FEATURE_UNIT)
      bUnitID                 2
      bSourceID               1
      bControlSize            1
      bmaControls(0)       0x00
      bmaControls(1)       0x00
      iFeature                0 
    AudioControl Interface Descriptor:
      bLength                 9
      bDescriptorType        36
      bDescriptorSubtype      3 (OUTPUT_TERMINAL)
      bTerminalID             3
      wTerminalType      0x0101 USB Streaming
      bAssocTerminal          0
      bSourceID               2
      iTerminal               0 
  Interface Descriptor:
    bLength                 9
    bDescriptorType         4
    bInterfaceNumber        1
    bAlternateSetting       0
    bNumEndpoints           0
    bInterfaceClass         1 Audio
    bInterfaceSubClass      2 Streaming
    bInterfaceProtocol      0 
    iInterface              0 
  Interface Descriptor:
    bLength                 9
    bDescriptorType         4
    bInterfaceNumber        1
    bAlternateSetting       1
    bNumEndpoints           2
    bInterfaceClass         1 Audio
    bInterfaceSubClass      2 Streaming
    bInterfaceProtocol      0 
    iInterface              0 
    AudioStreaming Interface Descriptor:
      bLength                 7
      bDescriptorType        36
      bDescriptorSubtype      1 (AS_GENERAL)
      bTerminalLink           3
      bDelay                  1 frames
      wFormatTag         0x0001 PCM
    AudioStreaming Interface Descriptor:
      bLength                11
      bDescriptorType        36
      bDescriptorSubtype      2 (FORMAT_TYPE)
      bFormatType             1 (FORMAT_TYPE_I)
      bNrChannels             1
      bSubframeSize           2
      bBitResolution         16
      bSamFreqType            1 Discrete
      tSamFreq[ 0]        48000
    Endpoint Descriptor:
      bLength                 9
      bDescriptorType         5
      bEndpointAddress     0x83  EP 3 IN
      bmAttributes            5
        Transfer Type            Isochronous
        Synch Type               Asynchronous
        Usage Type               Data
      wMaxPacketSize     0x0060  1x 96 bytes
      bInterval               1
      bRefresh                0
      bSynchAddress           4
      AudioStreaming Endpoint Descriptor:
        bLength                 7
        bDescriptorType        37
        bDescriptorSubtype      1 (EP_GENERAL)
        bmAttributes         0x00
        bLockDelayUnits         0 Undefined
        wLockDelay         0x0000
    Endpoint Descriptor:
      bLength                 9
      bDescriptorType         5
      bEndpointAddress     0x84  EP 4 IN
      bmAttributes           17
        Transfer Type            Isochronous
        Synch Type               None
        Usage Type               Feedback
      wMaxPacketSize     0x0003  1x 3 bytes
      bInterval               1
      bRefresh                0
      bSynchAddress           0

I'm also configuring PMA correctly as far as I know:

HAL_PCDEx_PMAConfig(&hpcd_USB_FS, AUDIO_IN_EP_ADDR, PCD_SNG_BUF, 0x180); // AUDIO IN
HAL_PCDEx_PMAConfig(&hpcd_USB_FS, AUDIO_FB_IN_EP_ADDR, PCD_SNG_BUF, 0x01E0); // AUDIO FEEDBACK
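
As a sanity check on that layout, here is a minimal sketch of an overlap test for the PMA regions. `pma_region_t`, `pma_regions_ok` and the 1024-byte PMA size used in the example are illustrative assumptions of mine, not part of ST's stack; the real PMA size and the space reserved for the buffer descriptor table should be taken from the reference manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: one PMA buffer as passed to HAL_PCDEx_PMAConfig */
typedef struct {
    uint16_t addr;  /* PMA byte offset */
    uint16_t size;  /* buffer length in bytes */
} pma_region_t;

/* Returns true if every region fits inside the PMA and no two regions overlap. */
static bool pma_regions_ok(const pma_region_t *r, size_t n, uint32_t pma_size)
{
    assert(r != NULL || n == 0U);
    for (size_t i = 0; i < n; i++) {
        uint32_t end_i = (uint32_t)r[i].addr + r[i].size;
        if (end_i > pma_size) {
            return false;  /* region runs past the end of the PMA */
        }
        for (size_t j = i + 1; j < n; j++) {
            uint32_t end_j = (uint32_t)r[j].addr + r[j].size;
            /* Half-open intervals [addr, end) overlap if each starts before the other ends */
            if (r[i].addr < end_j && r[j].addr < end_i) {
                return false;
            }
        }
    }
    return true;
}
```

With the values above, `{0x180, 96}` and `{0x1E0, 3}` pass. A 98-byte audio buffer at 0x180 (room for 49 mono 16-bit samples) would run into the feedback buffer at 0x1E0.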

What I'm seeing

Linux: Adaptive works flawlessly if I just always send 48 samples in `DataIn`; but no traffic ever on EP 4 when using Asynchronous with explicit feedback.
macOS: Plays sine for a few seconds, then babbles/cuts out (continuous stream of `babble error` in logs). No obvious EP 4 polling in the logs.
Windows: Haven't tested yet. Anyone confirm?

I suspect the host never submits a URB for the feedback endpoint (no URB_SUBMIT), which would explain why nothing shows up on that endpoint in Wireshark.

Questions

1. Do full-speed UAC1.0 drivers on macOS, Linux, Android, and Windows all support explicit-feedback async capture, or do they fall back to adaptive/sync differently?
2. What's the recommended FS/UAC1.0 approach to achieve cross-platform mic input?
3. Any known gotchas around implicit vs explicit feedback on UAC1.0 mics?
4. Is sending data in `USBD_AUDIO_DataIn` and feedback in `USBD_AUDIO_SOF`, primed at `USB_REQ_SET_INTERFACE alt==1`, the right pattern?
5. Should I ditch explicit feedback and go with implicit feedback instead? Is support for that better or worse than for explicit?

Thanks in advance for any pointers!

I can also share my full Wireshark dumps if anyone is interested. Here is my full usbd_audio.c (part of a composite setup as you can see):

#include "usbd_audio.h"
#include "usbd_ctlreq.h"
#include "stm32l4xx_hal.h"
#include "adc.h"
#include "firmware_config.h"

#include <string.h>
#include <math.h>

extern DMA_HandleTypeDef hdma_adc1;

// Compile-time check for packet size alignment
_Static_assert(AUDIO_IN_PACKET_BYTES % sizeof(int16_t) == 0, 
               "AUDIO_IN_PACKET_BYTES must be aligned to int16_t size");

static uint8_t dummy_zero_packet[AUDIO_IN_PACKET_BYTES];

// Explicit feedback state
volatile uint32_t sample_count = 0;       // Inc'd in ADC DMA callback
static uint32_t last_count = 0;
static uint32_t feedback_acc = 0;
static uint32_t feedback_incr = 48U << 14;       // Start nominal (48.0 samples/frame in Q10.14)
static uint8_t feedback_data[3];
static uint32_t ms_counter = 0;

static uint8_t USBD_AUDIO_Init(USBD_HandleTypeDef *pdev, uint8_t cfgidx);
static uint8_t USBD_AUDIO_DeInit(USBD_HandleTypeDef *pdev, uint8_t cfgidx);
static uint8_t USBD_AUDIO_Setup(USBD_HandleTypeDef *pdev, USBD_SetupReqTypedef *req);
static uint8_t USBD_AUDIO_DataIn(USBD_HandleTypeDef *pdev, uint8_t epnum);
static uint8_t USBD_AUDIO_SOF(USBD_HandleTypeDef *pdev);
static uint8_t USBD_AUDIO_EP0_RxReady(USBD_HandleTypeDef *pdev);
static uint8_t USBD_AUDIO_EP0_TxReady(USBD_HandleTypeDef *pdev);
static void *USBD_AUDIO_GetAudioHeaderDesc(uint8_t *pConfDesc);

USBD_ClassTypeDef USBD_AUDIO = {
    .Init                             = USBD_AUDIO_Init,
    .DeInit                           = USBD_AUDIO_DeInit,
    .Setup                            = USBD_AUDIO_Setup,
    .EP0_TxSent                       = USBD_AUDIO_EP0_TxReady,
    .EP0_RxReady                      = USBD_AUDIO_EP0_RxReady,
    .DataIn                           = USBD_AUDIO_DataIn,
    .DataOut                          = NULL,
    .SOF                              = USBD_AUDIO_SOF,
    .IsoINIncomplete                  = NULL,
    .IsoOUTIncomplete                 = NULL,
    .GetHSConfigDescriptor            = NULL,
    .GetFSConfigDescriptor            = NULL,
    .GetOtherSpeedConfigDescriptor    = NULL,
    .GetDeviceQualifierDescriptor     = NULL
};

/**
  * @brief  USBD_AUDIO_Init
  *         Initialize the AUDIO interface
  *   pdev: device instance
  *   cfgidx: Configuration index
  * @retval status
  */
static uint8_t USBD_AUDIO_Init(USBD_HandleTypeDef *pdev, uint8_t cfgidx) {
    USBD_AUDIO_HandleTypeDef *haudio;

    haudio = USBD_malloc(sizeof(USBD_AUDIO_HandleTypeDef));
    if (haudio == NULL) {
        pdev->pClassDataCmsit[pdev->classId] = NULL;
        return USBD_FAIL;
    }

    pdev->pClassDataCmsit[pdev->classId] = haudio;
    pdev->pClassData = haudio;

    // Open the isochronous IN endpoint for microphone streaming
    if (USBD_LL_OpenEP(pdev, AUDIO_IN_EP_ADDR, USBD_EP_TYPE_ISOC, AUDIO_IN_PACKET_BYTES) != USBD_OK) return USBD_FAIL;
    pdev->ep_in[AUDIO_IN_EP_ADDR & 0x0F].bInterval = AUDIO_FS_BINTERVAL;
    pdev->ep_in[AUDIO_IN_EP_ADDR & 0x0F].is_used = 1U;
    
    // Open the feedback endpoint for microphone streaming
    if (USBD_LL_OpenEP(pdev, AUDIO_FB_IN_EP_ADDR, USBD_EP_TYPE_ISOC, AUDIO_FB_IN_PACKET_BYTES) != USBD_OK) return USBD_FAIL;
    pdev->ep_in[AUDIO_FB_IN_EP_ADDR &0x0F].bInterval = AUDIO_FS_BINTERVAL;
    pdev->ep_in[AUDIO_FB_IN_EP_ADDR & 0x0F].is_used = 1U;

    // Flush endpoints
    USBD_LL_FlushEP(pdev, AUDIO_IN_EP_ADDR);
    USBD_LL_FlushEP(pdev, AUDIO_FB_IN_EP_ADDR);

    // Initialize buffer pointers and state
    haudio->alt_setting = 0U;

    return USBD_OK;
}

/**
  * @brief  USBD_AUDIO_DeInit
  *         DeInitialize the AUDIO layer
  *   pdev: device instance
  *   cfgidx: Configuration index
  * @retval status
  */
static uint8_t USBD_AUDIO_DeInit(USBD_HandleTypeDef *pdev, uint8_t cfgidx) {
    UNUSED(cfgidx);

    // Stop ADC streaming
    ADC_StopStreaming();

    // Close the AUDIO IN endpoint
    (void)USBD_LL_CloseEP(pdev, AUDIO_IN_EP_ADDR);
    pdev->ep_in[AUDIO_IN_EP_ADDR & 0x0F].is_used = 0U;
    pdev->ep_in[AUDIO_IN_EP_ADDR & 0x0F].bInterval = 0U;

    // Close the FEEDBACK IN endpoint
    (void)USBD_LL_CloseEP(pdev, AUDIO_FB_IN_EP_ADDR);
    pdev->ep_in[AUDIO_FB_IN_EP_ADDR & 0x0F].is_used = 0U;
    pdev->ep_in[AUDIO_FB_IN_EP_ADDR & 0x0F].bInterval = 0U;

    if (pdev->pClassDataCmsit[pdev->classId] != NULL) {
        USBD_free(pdev->pClassDataCmsit[pdev->classId]);
        pdev->pClassDataCmsit[pdev->classId] = NULL;
        pdev->pClassData = NULL;
    }

    // Call the board‐level deinit (e.g. stop ADC/DMA)
    ((USBD_AUDIO_ItfTypeDef *)pdev->pUserData[pdev->classId])->DeInit(0U);

    return USBD_OK;
}

/**
  * @brief  USBD_AUDIO_Setup
  *         Handle the AUDIO specific requests
  *   pdev: instance
  *   req: usb requests
  * @retval status
  */
static uint8_t USBD_AUDIO_Setup(USBD_HandleTypeDef *pdev, USBD_SetupReqTypedef *req) {
    USBD_AUDIO_HandleTypeDef *haudio = (USBD_AUDIO_HandleTypeDef *)pdev->pClassDataCmsit[pdev->classId];
    uint16_t len;
    uint8_t *pbuf;
    uint16_t status_info = 0U;
    USBD_StatusTypeDef ret = USBD_OK;
    uint8_t recipient = req->bmRequest & USB_REQ_RECIPIENT_MASK;

    if (haudio == NULL) {
        return (uint8_t)USBD_FAIL;
    }

    switch (req->bmRequest & USB_REQ_TYPE_MASK) {
        case USB_REQ_TYPE_CLASS:
            if (recipient == USB_REQ_RECIPIENT_ENDPOINT) {
                if (req->bRequest == AUDIO_REQ_GET_CUR) {
                    // 3-byte LSB-first of 48000 Hz
                    static const uint8_t freq3[3] = {
                        (uint8_t)(USBD_AUDIO_FREQ & 0xFF),
                        (uint8_t)((USBD_AUDIO_FREQ >> 8) & 0xFF),
                        (uint8_t)((USBD_AUDIO_FREQ >> 16) & 0xFF)
                    };
                    USBD_CtlSendData(pdev, (uint8_t *)freq3, 3);
                    return ret;
                }
                else if (req->bRequest == AUDIO_REQ_SET_CUR) {
                    haudio->control.cmd = AUDIO_REQ_SET_CUR;
                    haudio->control.len = MIN(req->wLength, 3);
                    USBD_CtlPrepareRx(pdev, haudio->control.data, haudio->control.len);
                    return ret;
                }
            }
            /* all other class‐type requests we don’t support */
            // USBD_CtlError(pdev, req);
            // return USBD_FAIL;
            break;

        case USB_REQ_TYPE_STANDARD:
            switch (req->bRequest){
                case USB_REQ_GET_STATUS:
                    if (pdev->dev_state == USBD_STATE_CONFIGURED){
                        (void)USBD_CtlSendData(pdev, (uint8_t *)&status_info, 2U);
                    } else {
                        USBD_CtlError(pdev, req);
                        ret = USBD_FAIL;
                    }
                    break;

                case USB_REQ_GET_DESCRIPTOR:
                    if ((req->wValue >> 8) == AUDIO_DESCRIPTOR_TYPE){
                        pbuf = (uint8_t *)USBD_AUDIO_GetAudioHeaderDesc(pdev->pConfDesc);
                        if (pbuf != NULL){
                            len = MIN(USB_AUDIO_DESC_SIZ, req->wLength);
                            (void)USBD_CtlSendData(pdev, pbuf, len);
                        } else {
                            USBD_CtlError(pdev, req);
                            ret = USBD_FAIL;
                        }
                    }
                    break;

                case USB_REQ_GET_INTERFACE:
                    if (pdev->dev_state == USBD_STATE_CONFIGURED){
                        (void)USBD_CtlSendData(pdev, (uint8_t *)&haudio->alt_setting, 1U);
                    } else {
                        USBD_CtlError(pdev, req);
                        ret = USBD_FAIL;
                    }
                    break;

                case USB_REQ_SET_INTERFACE:
                    if (pdev->dev_state == USBD_STATE_CONFIGURED){
                        uint8_t alt = (uint8_t)(req->wValue);
                        if (alt <= USBD_MAX_NUM_INTERFACES){
                            haudio->alt_setting = alt;

                            if(alt == 1){
                                // Start audio streaming
                                ADC_StartStreaming();  // Start ADC DMA
                                
                                // Prime the audio endpoint with first packet
                                USBD_LL_FlushEP(pdev, AUDIO_IN_EP_ADDR);
                                USBD_LL_Transmit(pdev, AUDIO_IN_EP_ADDR, dummy_zero_packet, AUDIO_IN_PACKET_BYTES);
                                
                                // Prime the feedback endpoint with first packet
                                // Reset feedback state
                                sample_count = 0;
                                last_count = 0;
                                feedback_acc = 0;
                                feedback_incr = 48U << 14;  // Start nominal (48.0 samples/frame in Q10.14)
                                ms_counter = 0;
                                
                                // Prime with initial feedback packet
                                feedback_data[0] = (uint8_t)(feedback_incr & 0xFF);
                                feedback_data[1] = (uint8_t)((feedback_incr >> 8) & 0xFF);
                                feedback_data[2] = (uint8_t)((feedback_incr >> 16) & 0xFF);
                                
                                USBD_LL_FlushEP(pdev, AUDIO_FB_IN_EP_ADDR);
                                USBD_LL_Transmit(pdev, AUDIO_FB_IN_EP_ADDR, feedback_data, AUDIO_FB_IN_PACKET_BYTES);
                            } else {
                                // Stop audio streaming when switching to alt setting 0
                                ADC_StopStreaming();
                            }
                        } else {
                            USBD_CtlError(pdev, req);
                            ret = USBD_FAIL;
                        }
                    } else {
                        USBD_CtlError(pdev, req);
                        ret = USBD_FAIL;
                    }
                    break;

                case USB_REQ_CLEAR_FEATURE:
                    break;

                default:
                    USBD_CtlError(pdev, req);
                    ret = USBD_FAIL;
                    break;
            }
            break;

        default:
            USBD_CtlError(pdev, req);
            ret = USBD_FAIL;
            break;
    }

    return (uint8_t)ret;
}

/**
  * @brief  USBD_AUDIO_DataIn
  *         handle data IN Stage
  *   pdev: device instance
  *   epnum: endpoint index
  * @retval status
  */
static uint8_t USBD_AUDIO_DataIn(USBD_HandleTypeDef *pdev, uint8_t epnum){
    USBD_AUDIO_HandleTypeDef *haudio = (USBD_AUDIO_HandleTypeDef *)pdev->pClassDataCmsit[pdev->classId];

    if (!haudio || haudio->alt_setting != 1) {
        return USBD_OK;
    }

    // Handle Audio Data Endpoint
    if (epnum == (AUDIO_IN_EP_ADDR & 0x7F)) {
        static uint32_t sample_counter = 0;
        int16_t* samples = (int16_t*)dummy_zero_packet;
        
        for (int i = 0; i < 48; i++) {
            // Generate 1kHz sine wave at 48kHz sample rate 
            float phase = (sample_counter + i) * 2.0f * 3.14159f * 1000.0f / 48000.0f;
            samples[i] = (int16_t)(sinf(phase) * 8000);
        }
        // Wrap once per second (a whole number of 1 kHz periods) so the
        // float phase does not lose precision as the counter grows
        sample_counter = (sample_counter + 48) % 48000;
        
        USBD_LL_Transmit(pdev, AUDIO_IN_EP_ADDR, dummy_zero_packet, AUDIO_IN_PACKET_BYTES);
        
        return USBD_OK;
    }
    
    return USBD_OK;
}

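/**
  * @brief  USBD_AUDIO_SOF
  *         handle SOF event: update and send the feedback value
  *   pdev: device instance
  * @retval status
  */
static uint8_t USBD_AUDIO_SOF(USBD_HandleTypeDef *pdev){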
    USBD_AUDIO_HandleTypeDef *haudio = (USBD_AUDIO_HandleTypeDef *)pdev->pClassDataCmsit[pdev->classId];
    
    // Only send feedback when streaming is active
    if (!haudio || haudio->alt_setting != 1) {
        return USBD_OK;
    }
    
    // Measure drift every 100 ms and adapt feedback rate
    if (++ms_counter >= 100) {
        uint32_t delta = sample_count - last_count;
        last_count = sample_count;
        ms_counter = 0;
        
        // Convert samples_per_100ms to Q10.14 per frame
        // delta samples in 100ms
        uint32_t new_incr = ((delta * (1UL << 14)) + 50) / 100;
        
        // Smoothing
        feedback_incr = (feedback_incr * 7 + new_incr) >> 3;
    }
    
    // Pack the current rate estimate (Q10.14 samples per frame) into the 3-byte format.
    // The feedback value is the rate itself, not a running sum, so feedback_acc is not used here.
    feedback_data[0] = (uint8_t)(feedback_incr & 0xFF);
    feedback_data[1] = (uint8_t)((feedback_incr >> 8) & 0xFF);
    feedback_data[2] = (uint8_t)((feedback_incr >> 16) & 0xFF);
    
    // Send feedback packet (no flush, let USB engine handle scheduling)
    USBD_LL_Transmit(pdev, AUDIO_FB_IN_EP_ADDR, feedback_data, AUDIO_FB_IN_PACKET_BYTES);
    
    return USBD_OK;
}

/**
  * @brief  USBD_AUDIO_EP0_RxReady
  *         handle EP0 Rx Ready event
  *   pdev: device instance
  * @retval status
  */
static uint8_t USBD_AUDIO_EP0_RxReady(USBD_HandleTypeDef *pdev) {
    UNUSED(pdev);
    return USBD_OK;
}

/**
  * @brief  USBD_AUDIO_EP0_TxReady
  *         handle EP0 Tx Ready event
  *   pdev: device instance
  * @retval status
  */
static uint8_t USBD_AUDIO_EP0_TxReady(USBD_HandleTypeDef *pdev){
    UNUSED(pdev);

    /* Only OUT control data are processed */
    return (uint8_t)USBD_OK;
}

/**
  * @brief  USBD_AUDIO_RegisterInterface
  *   pdev: device instance
  *   fops: Audio interface callback
  * @retval status
  */
uint8_t USBD_AUDIO_RegisterInterface(USBD_HandleTypeDef *pdev, USBD_AUDIO_ItfTypeDef *fops){
    if (fops == NULL){
        return (uint8_t)USBD_FAIL;
    }

    pdev->pUserData[pdev->classId] = fops;

    return (uint8_t)USBD_OK;
}

/**
  * @brief  USBD_AUDIO_GetEpPcktSze
  *   pdev: device instance (reserved for future use)
  *   If: Interface number (reserved for future use)
  *   Ep: Endpoint number (reserved for future use)
  * @retval wMaxPacketSize of the audio IN endpoint, in bytes
  */
uint32_t USBD_AUDIO_GetEpPcktSze(USBD_HandleTypeDef *pdev, uint8_t If, uint8_t Ep){
    // Return the wMaxPacketSize value in Bytes: 96
    return AUDIO_IN_PACKET_BYTES;
}

/**
  * @brief  USBD_AUDIO_GetAudioHeaderDesc
  *         This function returns the Audio descriptor
  *   pConfDesc:  pointer to the configuration descriptor
  * @retval pointer to the Audio AC Header descriptor
  */
static void *USBD_AUDIO_GetAudioHeaderDesc(uint8_t *pConfDesc){
    USBD_ConfigDescTypeDef *desc = (USBD_ConfigDescTypeDef *)(void *)pConfDesc;
    USBD_DescHeaderTypeDef *pdesc = (USBD_DescHeaderTypeDef *)(void *)pConfDesc;
    uint8_t *pAudioDesc =  NULL;
    uint16_t ptr;

    if (desc->wTotalLength > desc->bLength){
        ptr = desc->bLength;

        while (ptr < desc->wTotalLength){
            pdesc = USBD_GetNextDesc((uint8_t *)pdesc, &ptr);
            if ((pdesc->bDescriptorType == AUDIO_INTERFACE_DESCRIPTOR_TYPE) &&
                    (pdesc->bDescriptorSubType == AUDIO_CONTROL_HEADER)){
                pAudioDesc = (uint8_t *)pdesc;
                break;
            }
        }
    }

    return pAudioDesc;
}

AScha.3 · ‎2025-07-16

Same question here ...?

https://www.reddit.com/r/embedded/comments/1m0sbke/usb_uac10_mic_on_stm32l4_explicitfeedback_vs/

Just: what is implicit vs explicit feedback on UAC1.0?

(I have only made a standard codec on USB, 48 kHz, isochronous; that was working.)

If you feel a post has answered your question, please click "Accept as Solution".

bvmd · ‎2025-07-16

Yes, I also asked it over on r/embedded, but then thought that maybe this would be a better place to ask for this specific stack!

In my understanding "explicit feedback" requires a separate feedback endpoint that publishes the real sample rate relative to the SOF timing with 3-byte values.
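
To make the 3-byte value concrete, here is a minimal sketch of how I understand the full-speed encoding: samples per 1 ms frame in 10.14 fixed point, sent LSB first. The helper names `fb_q10_14_from_samples` and `fb_pack3` are made up for illustration and are not part of ST's stack.

```c
#include <assert.h>
#include <stdint.h>

/* Convert "samples counted over `frames` SOF intervals" to samples per frame
 * in 10.14 fixed point, rounded to nearest. */
static uint32_t fb_q10_14_from_samples(uint32_t samples, uint32_t frames)
{
    assert(frames > 0U);
    return (uint32_t)((((uint64_t)samples << 14) + frames / 2U) / frames);
}

/* Pack the low 24 bits of the 10.14 value, least significant byte first. */
static void fb_pack3(uint32_t q10_14, uint8_t out[3])
{
    out[0] = (uint8_t)(q10_14 & 0xFFU);
    out[1] = (uint8_t)((q10_14 >> 8) & 0xFFU);
    out[2] = (uint8_t)((q10_14 >> 16) & 0xFFU);
}
```

For example, 4800 samples counted over 100 frames gives 48.0 samples per frame, which is `48 << 14 = 0x0C0000` and goes on the wire as `00 00 0C`.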

"Implicit feedback" does not use a feedback endpoint; instead the host automatically adjusts its clock to match the data rate of the device. I think implicit is less widely supported than explicit, but like I said, it's very hard to find conclusive information about this online.

There is also "synchronous" mode, which does not use any feedback either but requires the device to match the clock of the host. This would mean it should always send exactly 48 samples per frame despite ADC clock drift. I have considered implementing my own sample rate conversion (fractional SRC) for it, but this would increase complexity massively, and instinctively this should not be necessary with asynchronous mode.
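
For contrast, this is roughly what I would expect an asynchronous IN endpoint to do instead of SRC: let the packet length follow the measured ADC rate, so most frames carry 48 samples and an occasional frame carries 47 or 49. This is only a sketch under that assumption; `pkt_sizer_t` and `pkt_sizer_next` are hypothetical names, and the rate is assumed to be available as samples per frame in 10.14 fixed point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t rate_q10_14;  /* measured samples per frame, 10.14 fixed point */
    uint32_t frac;         /* fractional samples carried over between frames */
} pkt_sizer_t;

/* Returns how many samples to put in the next 1 ms packet. */
static uint32_t pkt_sizer_next(pkt_sizer_t *s)
{
    assert(s != NULL);
    s->frac += s->rate_q10_14;
    uint32_t n = s->frac >> 14;        /* whole samples due this frame */
    s->frac &= (1UL << 14) - 1UL;      /* keep only the fractional part */
    return n;
}
```

Note that a 49-sample mono 16-bit packet is 98 bytes, which does not fit the 96-byte wMaxPacketSize in the descriptor above, so this approach assumes the endpoint size is raised accordingly.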

AScha.3 · ‎2025-07-16

I can only tell you what I know:

Audio is always transferred isochronously: the host (PC) sends it at a fixed rate in 1 ms blocks, typically at 48 kHz: 48 samples, 16-bit, 192 bytes.

There is no (!) feedback, synchronisation or anything like that; the device just has to recover the real clock from the 1 ms interval.

DACs like the PCM5102 do this with an internal PLL (because the master clock in the PC always has some drift or offset, typically around 10^-5).

How the "high-end" USB audio interfaces do it at 96 kHz/24-bit or whatever, I don't know. I think that needs UAC2, and maybe they just transfer bulk data into a local buffer and play it back from their own precision master clock, totally independent of the host clock.

Optimum quality is only possible with a local low-jitter crystal clock.

So just use isochronous mode at 48 kHz/16-bit. That works; I tried it. :)

It should also enumerate on Linux/Windows/macOS without any problem; it's what all cheap USB sound cards use.


bvmd · ‎2025-07-16

Thank you for your comments and insights!

I'm pretty sure you mean "synchronous", right? As isochronous is either synchronous or asynchronous, as far as I know. I'm also fairly confident UAC1.0 does support asynchronous (with some form of feedback) in one way or another. At the end of the day I don't really care how it should work, as long as it does work with low latency and no pops or underruns or drift on my STM32L4 without HSE crystal.

I've been researching the code here today:

There are defines like `USE_AUDIO_RECORDING_USB_IMPLICIT_SYNCHRO`.

 /**
 * @brief The microphone node declaration
 *
 * The session will communicate with microphone node using this structure
 * When USE_AUDIO_RECORDING_USB_IMPLICIT_SYNCHRO is activated, the session will estimate the microphone relative frequency:
 * Two fuction handler will be provided by microphone to estimate frequency.
 * The first function reset a  counter variable. This counter will compute read samples in byte
 * The second function will return this counter value and reset it. 
 *      Thus, this function will provide the amount of data captured in byte since the last call
 * 
 */

But I'm having a hard time figuring this code out. I cannot really find where this implicit feedback is used, or whether it is even implemented in these examples.

The readme does say:

- Recording synchronization using add/remove(implicit synchronization)

I interpret this as the device either dropping or duplicating one sample per packet. The confusing thing is that this is not really "implicit" feedback. In my understanding, real implicit feedback is not done on the device but on the host, where the host adjusts its clock depending on the number of samples per frame it sees from the device:

https://developer.apple.com/library/archive/technotes/tn2274/_index.html
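
My reading of the add/remove approach, as a sketch: the packet always carries 48 samples, and when the ADC delivered one sample too few or too many in that frame, the device duplicates or drops one. This is not taken from ST's example code; `fill_packet_add_remove` and `PKT_SAMPLES` are hypothetical names, and a real implementation would probably interpolate rather than duplicate the last sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKT_SAMPLES 48U

/* Fill one fixed-size packet from `avail` captured samples (47, 48 or 49).
 * Returns the number of source samples consumed. */
static size_t fill_packet_add_remove(const int16_t *src, size_t avail,
                                     int16_t dst[PKT_SAMPLES])
{
    assert(avail >= PKT_SAMPLES - 1U && avail <= PKT_SAMPLES + 1U);

    size_t n = (avail < PKT_SAMPLES) ? avail : PKT_SAMPLES;
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
    if (avail < PKT_SAMPLES) {
        /* One sample short: repeat the last captured sample */
        dst[PKT_SAMPLES - 1U] = src[avail - 1U];
    }
    /* One sample too many: the 49th is consumed but not sent */
    return avail;
}
```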

It's all very confusing to me. I know I have used cheap USB mics in the past that worked perfectly fine for extended periods of time without any high precision clocks.

----------

I might have found another problem in my current implementation though. The USB Device Class Definition for Audio Devices says about bRefresh:

bRefresh:

This field indicates the rate at which an isochronous synchronization pipe provides new synchronization feedback data. This rate must be a power of 2, therefore only the power is reported back and the range of this field is from 1 (2 ms) to 9 (512 ms).

Source: https://www.usb.org/sites/default/files/audio10.pdf

I currently still have it at `0` for the feedback endpoint EP4. Fixing that is what I will try next, to see whether I then get IN URBs for the feedback endpoint.
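
A small sketch of how I read that field: the refresh period is 2^bRefresh milliseconds, so new feedback data would only be needed on every 2^bRefresh-th frame. `fb_refresh_period_ms` and `fb_due` are illustrative helper names, and counting frames from the start of streaming is my own assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Refresh period in ms for a UAC1.0 bRefresh value (valid range 1..9). */
static uint32_t fb_refresh_period_ms(uint8_t bRefresh)
{
    assert(bRefresh >= 1U && bRefresh <= 9U);
    return (uint32_t)1U << bRefresh;
}

/* True on frames where a new feedback value is due. */
static bool fb_due(uint32_t frame_number, uint8_t bRefresh)
{
    /* The period is a power of two, so a mask replaces the modulo */
    return (frame_number & (fb_refresh_period_ms(bRefresh) - 1U)) == 0U;
}
```

So `bRefresh = 1` means a new value every 2 ms and `bRefresh = 9` every 512 ms, while my current `0` is outside the range the spec allows.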