
USB Audio and I2S MCLK? Independent Word Clocks!

Question asked by jaekel.torsten on Oct 30, 2017
Latest reply on Nov 2, 2017 by jaekel.torsten

I work (again) on USB Audio on STM boards. It works, I get the audio from PC to board (or vice versa), so far OK. FW works, example/demo code is helpful, no issues from a SW coding point of view.

 

My issue is: How to deal with two independent audio clocks?

The DAC on the board gets its MCLK (the master clock, from which the word clock is derived) from the MCU (or something else). USB audio comes in, e.g. every 1 ms there is a new audio packet, e.g. 192 bytes (48 stereo samples) at 48 kHz sample rate, 16-bit stereo. But this timing is generated by the PC: the host generates the 1 ms period, independently of the MCU board (and its I2S/DAC).
So, in the end, USB and the DAC on the board run from different, completely independent, unsynchronized clocks.

It results in the following:

  1. it is a bit tricky to synchronize the USB with the Audio DAC and its DMAs: it depends on when USB kicks in and when it gets a packet with audio frames
  2. the two are completely unrelated to each other: a USB frame can become ready at the same time as the DMA HALF_FULL or FULL event, so if we are not smart there might be a race condition (think also about a preemptive RTOS or interleaved interrupts) - OK, it is possible to handle and solve this "USB kicks in in a random way and never in 'sync' with DAC DMAs".
  3. but the biggest issue is the clock itself: both clocks are never identical, they drift apart over time!
    It means: USB In can be a bit faster than the DAC plays out the audio (or vice versa). So, periodically you get an "overrun": you receive more audio samples than the DAC can play (due to the clock setting).
    And this drift has two causes: the PC might not be accurate (a bit off, or it jitters a bit) relative to the nominal word clock, and the MCU's I2S PLL clock generation is not "accurate" either and could even differ from board to board. Or the MCU MCLK has jitter (unknown if and how much).
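
To put numbers on the drift, here is a rough sketch of the arithmetic. The function names are hypothetical and the rates in the usage note are illustrative, not measured on any board. The surplus per second is simply the difference of the two actual sample rates, so the time until one packet's worth of samples piles up is the slack divided by that difference.

```c
#include <assert.h>

/* Relative offset of the USB-side rate against the DAC-side rate, in ppm. */
static double drift_ppm(double usb_rate_hz, double dac_rate_hz)
{
    assert(dac_rate_hz > 0.0);
    return (usb_rate_hz - dac_rate_hz) / dac_rate_hz * 1e6;
}

/* Seconds until `slack_frames` surplus (or missing) sample frames accumulate. */
static double seconds_until_slip(double usb_rate_hz, double dac_rate_hz,
                                 double slack_frames)
{
    double diff = usb_rate_hz - dac_rate_hz;
    if (diff < 0.0) {
        diff = -diff;               /* DAC faster than USB: underrun instead */
    }
    assert(diff > 0.0);             /* identical rates never slip */
    return slack_frames / diff;
}
```

For example, `drift_ppm(48004.8, 48000.0)` is about 100 ppm, and `seconds_until_slip(48004.8, 48000.0, 48.0)` is about 10 s: with a 100 ppm mismatch, one full 1 ms packet (48 stereo frames) is surplus roughly every 10 seconds.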

 

OK, you can try to tweak the PLL configuration, but still: both clocks are never really in sync, never both exactly 48.0000xxx kHz, and there is no way to avoid getting more audio samples than the DAC plays (or vice versa).

There are only two options to solve the problem.
What would be your suggestion? How would you deal with two independent Audio Word Clocks?

  1. OK, if USB is a bit too fast - and a new USB frame (e.g. with 48 stereo samples) does not fit - drop it. This is my current approach and it seems to work acceptably: on regular music I do not hear an effect.
    But it is actually not really nice: if you analyze the LineOut audio and expect a perfect sine wave sent from the host PC, it will have some "phase jumps"; e.g. for me, approx. every 10 seconds a piece of audio is missing.
  2. OK, let's trim the PLL. Reconfigure the I2S PLL in the MCU (if the MCU provides the MCLK for the DAC), hoping it is possible to increment/decrement a divider of the I2S PLL without stalling the clock or generating clock glitches or phase jumps.
    My concerns are these:
  • The PLL can be set just with integer dividers, not as finely as a PLL that takes fractional dividers. OK, it might result in a larger frequency change (an audio pitch shift in the end), but potentially still small enough not to be noticed by the human ear.
  • My thinking is: assuming it works, setting the PLL slower or faster depending on the relation between USB input buffers and DAC output buffers will add jitter (for sure):
    I no longer lose audio samples, but the audio clock, the MCLK, and therefore the output signal will see small changes in frequency. In the simple case: jumping between two MCLK frequencies, always toggling around the nominal word clock but never at the "correct nominal" clock. Let's assume that, instead of an accurate 48 kHz, it now toggles between 47.8 and 48.2 kHz (with the question of how often/fast).
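
The "toggle between two frequencies" idea can be sketched as a small decision function driven by the buffer fill level. This is only a sketch under assumptions: `next_trim`, `trim_t` and the 1/4, 1/2, 3/4 thresholds are hypothetical, and actually applying the trim (rewriting the I2S PLL settings without glitches) is the part not shown here. The hysteresis is there so the clock does not flip on every packet, which limits how often the frequency jumps.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TRIM_SLOWER = -1, TRIM_NOMINAL = 0, TRIM_FASTER = 1 } trim_t;

/* Pick the next DAC clock trim from the fill level of the USB-to-DAC buffer.
 * Buffer filling up  -> USB is faster than the DAC -> run the DAC faster.
 * Buffer draining    -> USB is slower than the DAC -> run the DAC slower.
 * A trim is only released once the fill level is back at the middle. */
static trim_t next_trim(size_t fill, size_t capacity, trim_t current)
{
    assert(capacity > 0 && fill <= capacity);

    size_t low  = capacity / 4;              /* hypothetical thresholds */
    size_t mid  = capacity / 2;
    size_t high = capacity - capacity / 4;

    if (fill >= high) {
        return TRIM_FASTER;
    }
    if (fill <= low) {
        return TRIM_SLOWER;
    }
    if (current == TRIM_FASTER && fill <= mid) {
        return TRIM_NOMINAL;
    }
    if (current == TRIM_SLOWER && fill >= mid) {
        return TRIM_NOMINAL;
    }
    return current;                          /* inside the hysteresis band */
}
```

Called once per USB packet (or per DMA half-transfer), this changes the trim only when a threshold is crossed, so the rate of frequency jumps depends on the drift and the buffer size rather than on the packet rate.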

 

My fear is: this additional jitter - will it be audible? Could the audio sound start "breathing"?
(eliminating jitter from USB is not the issue here, fine, with large enough buffers (and delay) - we can assume USB in is jitter free)

 

What would be your option and opinion?

  • a) drop a small frame (or "merge" with the very latest, in case USB is a bit faster)?
  • b) adjust the PLL and create an "adaptive clock recovery"? (a feedback from USB "clock" to MCLK generation for DAC)
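
For option a), a minimal sketch of "drop the packet when it does not fit" could look like the following. The type `audio_ring_t`, the functions and the 1024-byte size are hypothetical and not taken from any ST example code. It is also not interrupt-safe as written: in real firmware, `fill` is shared between the USB callback and the DAC DMA callback and would need a critical section or an atomic update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_BYTES 1024u   /* hypothetical size, larger than one USB packet */

typedef struct {
    uint8_t  data[RING_BYTES];
    size_t   head;         /* write index, advanced by the USB side */
    size_t   tail;         /* read index, advanced by the DAC side  */
    size_t   fill;         /* bytes currently stored                */
    uint32_t dropped;      /* number of whole packets discarded     */
} audio_ring_t;

/* Store one USB audio packet; drop the whole packet if it does not fit. */
static bool ring_push_packet(audio_ring_t *r, const uint8_t *pkt, size_t len)
{
    assert(len <= RING_BYTES);
    if (len > RING_BYTES - r->fill) {
        r->dropped++;                       /* overrun: USB ran ahead of DAC */
        return false;
    }
    size_t first = RING_BYTES - r->head;    /* bytes until the wrap point */
    if (first > len) {
        first = len;
    }
    memcpy(&r->data[r->head], pkt, first);
    memcpy(&r->data[0], pkt + first, len - first);
    r->head = (r->head + len) % RING_BYTES;
    r->fill += len;
    return true;
}

/* Fetch up to `len` bytes for the DAC; returns the number of bytes copied. */
static size_t ring_pop(audio_ring_t *r, uint8_t *out, size_t len)
{
    if (len > r->fill) {
        len = r->fill;                      /* underrun: caller pads the rest */
    }
    size_t first = RING_BYTES - r->tail;
    if (first > len) {
        first = len;
    }
    memcpy(out, &r->data[r->tail], first);
    memcpy(out + first, &r->data[0], len - first);
    r->tail = (r->tail + len) % RING_BYTES;
    r->fill -= len;
    return len;
}
```

Dropping whole packets keeps left/right samples aligned, at the cost of the audible "phase jump"; the `dropped` counter makes it easy to log how often that happens.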

 

From a HW point of view, the actual correct solution would be this IMO:

  • the USB should be the master clock: derive a clock from the incoming isochronous (1 kHz) USB frame timing and use it to generate the MCLK for the DAC (e.g. via a PLL chip)
  • use a real PLL which also allows fractional dividers, so that the fine tuning of the MCLK is much more granular and the jitter much smaller (changes happen less often, very smoothly and not so drastically as on the MCU I2S PLL with integer dividers and the uncertainty about how the clock behaves on the MCU)
  • or: use a real USB-audio receiver chip as a USB-to-I2S bridge and connect the DAC to this chip via I2S (the SAI peripherals); instead of MCLK from the MCU, the I2S clocks come from an external clock source, here the USB bridge.
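
Why fractional dividers matter can be shown with a short calculation. In a PLL whose output is proportional to a multiplier N, changing N by one step shifts the output by a relative 1/N; with extra fractional bits the step shrinks by a factor of 2^bits. The functions below are a hypothetical sketch, and the values N = 250 and 13 fractional bits in the usage note are purely illustrative, not taken from any datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest relative frequency step, in ppm, when an integer multiplier n
 * is changed by one. */
static double integer_step_ppm(uint32_t n)
{
    assert(n > 0);
    return 1e6 / (double)n;
}

/* Smallest relative step, in ppm, when the multiplier has `frac_bits`
 * additional bits of fractional resolution. */
static double fractional_step_ppm(uint32_t n, unsigned frac_bits)
{
    assert(n > 0 && frac_bits < 32);
    return 1e6 / ((double)n * (double)(1u << frac_bits));
}
```

With these illustrative values, `integer_step_ppm(250)` gives 4000 ppm (0.4 %, i.e. 192 Hz at 48 kHz), while `fractional_step_ppm(250, 13)` gives about 0.49 ppm, which is below the drift of typical crystal oscillators.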

 

Conclusion:
As on many of the DISCOVERY boards, where the I2S clock is generated by the MCU: even though MCU and board allow implementing USB Audio, the issue remains that the audio word clocks are not in sync. It is not really possible to sync both in a way that the audio stays free of artefacts over a longer period of time (my goal is: at least 5 minutes free of any lost audio frame, no buffer over/underruns during 5 minutes of sound). We have to cope with discarding some samples (or filling a gap), or we have to add "artificial" jitter. Keeping the word clocks "in sync" for 5 minutes would need a very accurate and stable clock configuration, which is nearly impossible to achieve (with MCUs).


Please, tell me your thoughts and how you have solved such an issue. Thank you.
Many regards
Torsten
