
USB audio (isochronous async EP + feedback EP) - sync problems (SOLVED!)

ColdWeather
Senior
Posted on October 09, 2016 at 12:35

Hello!

Using my own board with an F103 and a TAS5715 Class-D amplifier on it, I'm trying to implement a USB speaker.

Learning from various application notes, I modified the speaker example from the STM USB FS Device Stack V3.3.0 by changing the isochronous OUT EP to its asynchronous submode and adding an isochronous IN EP for feedback. To obtain better audio quality, the USB "properties" (descriptors, EP buffer sizes, etc.) have been further changed to stereo 16-bit channels at 44.1 kHz.

Windows 10 recognizes the device, and in general, it works: I hear the sound!

Well, according to the F103 docs, with a 72 MHz system clock the actual I2S sample rate is 43.269 kHz, which is what I really observe (my scope shows 43.2694 kHz ;)). I send this value via the feedback EP (in the 10.14 format), and Windows seems to accept and process it: if I intentionally skip sending the feedback, the sound becomes robot-like, with other distortion effects.

OK, I'm still working on proper synchronisation, adjusting the feedback value depending on the buffer level, but the issue is of another kind: every 10 seconds (quite exactly, as far as I can measure with a stopwatch) I hear short dropouts in the sound. I found out that in those cases the F103 receives a buffer filled with zeroes only. Nothing in my code does this, so the zeroes must come via USB. I suppose it must be a Windows driver issue.
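
For reference, a minimal sketch of how a rate such as 43.269 kHz could be packed into the 3-byte 10.14 feedback value of a full-speed endpoint (samples per 1 ms frame with 14 fractional bits, sent little-endian). The function names are made up for illustration and are not part of the ST stack:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a sample rate in Hz to 10.14 fixed point "samples per 1 ms frame".
 * Hypothetical helper, not an ST library call. */
uint32_t fb_from_rate_hz(uint32_t rate_hz)
{
    /* samples per frame = rate_hz / 1000; scale by 2^14 and round to nearest */
    return (uint32_t)((((uint64_t)rate_hz << 14) + 500u) / 1000u);
}

/* Pack the 10.14 value into the 3 bytes sent on the feedback IN endpoint,
 * least significant byte first. */
void fb_pack(uint32_t fb_10_14, uint8_t out[3])
{
    assert(out != 0);
    out[0] = (uint8_t)(fb_10_14 & 0xFFu);
    out[1] = (uint8_t)((fb_10_14 >> 8) & 0xFFu);
    out[2] = (uint8_t)((fb_10_14 >> 16) & 0xFFu);
}
```

With these helpers, 43269 Hz gives 0x0AD137 and the nominal 44100 Hz gives 0x0B0666.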

Any ideas and experience?

TIA.

P.S. If I don't send the feedback, the periodic distortions are audible but there are NO dropouts at all. A further sign that it must be a sync issue on the host.

#feedback #usb #audio #usb-isochronous #asynchronous #dropout
6 REPLIES
tsuneo
Senior
Posted on October 09, 2016 at 15:55

> every 10 seconds (quite exactly as far as I can measure it by a stop watch) I hear short dropouts in the sound.

 

> The further sign, it must be a sync issue on the host.

When the actual feedback value is too far from the sampling frequency declared in the Type I format descriptor, the host is likely to be confused.

To run an asynchronous USB speaker on Windows, careful setting of the Windows audio control panel is required, and also a suitable playback application.

Refer to these links of async audio DAC manufacturers for the details.

"Ayre - Windows Vista/7 Setup:"

https://www.ayre.com/usb-dac-windows-vista7.htm

"Ayre - Windows 8 Setup:"

https://www.ayre.com/usb-dac-windows-8.htm

"Ayre - Computer Audio Playback:"

https://www.ayre.com/usb-dac-windows-playback.htm

"Wavelength Audio - Windows Setup"

http://www.usbdacs.com/Windows/Windows.html

> I'm still working on the proper synchronisation to adjust the feedback value depending on the buffer level

If you calculate the feedback value at the time of isoc OUT transfer completion, it will cause jitter, because this timing moves around from frame to frame.

- In a frame where an isoc IN is due, Windows schedules the isoc IN first and the isoc OUT next. (The isoc IN occurs only every 2, 4, 8, 16 or 32 frames, as specified in the bRefresh field of the AS Isochronous Synch Endpoint Descriptor.) In just those frames, the isoc OUT is delayed by the isoc IN transaction.

- Even in a frame without an isoc IN, the total number of bits in the isoc OUT transaction differs depending on its data, because of bit stuffing on the USB wire.

I've seen many wrong audio implementations that use this completion timing.

You should calculate the feedback at the steady SOF timing.

A timer gives a better implementation than buffer level, as briefly described in

https://my.st.com/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/Flat.aspx?RootFolder=/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/oddeven%20bit%20in%20assynchronous#DisplayLink69529

For STM32 parts with OTG_FS/OTG_HS, a SOF trigger is provided to Timer2 (TIM2_OR:ITR1_RMP). On these STM32s, precise measurement is done purely in hardware. For the STM32F103, unfortunately, no hardware SOF trigger is provided; the SOF interrupt is used to capture the timer value.
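
A minimal sketch of the SOF-interrupt variant, under stated assumptions: a free-running 16-bit timer clocked at 36 MHz (so that it wraps less than once per 1 ms frame) is read in the SOF interrupt, and one audio sample lasts 832 timer ticks (which corresponds to about 43.269 kHz). All names and parameters are hypothetical and not taken from the ST library; interrupt latency adds jitter that this sketch does not address.

```c
#include <assert.h>
#include <stdint.h>

/* Accumulates timer ticks between SOFs and yields a 10.14 feedback value. */
typedef struct {
    uint16_t last_capture;  /* timer count read at the previous SOF */
    uint32_t ticks;         /* ticks accumulated in the current window */
    uint16_t frames;        /* SOF periods seen in the current window */
    uint8_t  primed;        /* 0 until the first SOF has been captured */
} sof_meter_t;

/* Call from the SOF interrupt with the current timer count.
 * Returns 1 and writes *fb_10_14 once window_frames SOF periods have been
 * measured; returns 0 otherwise. */
int sof_meter_on_sof(sof_meter_t *m, uint16_t timer_now,
                     uint16_t window_frames, uint32_t ticks_per_sample,
                     uint32_t *fb_10_14)
{
    assert(m != 0 && fb_10_14 != 0);
    assert(window_frames > 0 && ticks_per_sample > 0);

    if (!m->primed) {               /* first SOF only sets the reference */
        m->last_capture = timer_now;
        m->primed = 1;
        return 0;
    }

    /* 16-bit subtraction stays correct across one timer wrap */
    m->ticks += (uint16_t)(timer_now - m->last_capture);
    m->last_capture = timer_now;

    if (++m->frames < window_frames)
        return 0;

    /* samples per frame = ticks / (ticks_per_sample * frames), in 10.14 */
    uint64_t den = (uint64_t)ticks_per_sample * window_frames;
    *fb_10_14 = (uint32_t)((((uint64_t)m->ticks << 14) + den / 2u) / den);

    m->ticks = 0;
    m->frames = 0;
    return 1;
}
```

A longer window averages out more of the interrupt jitter at the cost of a slower update.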

Tsuneo

ColdWeather
Senior
Posted on October 09, 2016 at 19:03

Thank you, Tsuneo, for your attention to my problems and the answer.

First of all, you are right about:

> To run asynchronous USB speaker on Windows, careful setting of Windows audio control panel is required, and also a suitable playback application.

I have been using the VLC player, which makes it easy to switch between output devices. I was already about to boot Ubuntu to test whether the dropouts persist there. But just before that I tried Windows Media Player, and voilà! No more dropouts! I love VLC, but sometimes... Well, one issue less. At least I know now that it was not my software's fault.

> You should calculate the feedback at steady SOF timing.

> For STM32F103, Unfortunately, no hardware SOF trigger is provided. SOF interrupt is applied to capture the timer value.

I'm continuing to try various solutions.

I send the data to I2S by DMA now, because the CPU was overloaded by I2S interrupts at 88 kHz (stereo!). The DMA runs in circular mode over a ring buffer sized for two (double-buffered EP) USB frames of 176/180 bytes each for 44.1 kHz 16-bit stereo (that is, up to 90 16-bit samples per frame).

Each EP OUT interrupt copies the received portion from one of the USB buffers into the ring buffer and adds the number of copied samples to a free running counter.

Meanwhile the DMA follows at almost the same rate and feeds I2S with samples from the buffer. I use the DMA half-transfer and transfer-complete interrupts to subtract the number of transferred samples from the free-running counter mentioned above to find the buffer level.
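
A minimal sketch of this bookkeeping, assuming a ring of 180 16-bit samples (two USB frames of up to 90 samples each). The handler names are hypothetical; in real firmware they would be called from the EP OUT and DMA interrupt handlers:

```c
#include <assert.h>
#include <stdint.h>

#define RING_SAMPLES 180u   /* two USB frames of up to 90 16-bit samples */

/* Fill level of the ring buffer, counted in 16-bit samples. */
static volatile int32_t buffer_level;

/* Call from the EP OUT handler after copying a packet into the ring. */
void on_usb_out_packet(uint32_t bytes_received)
{
    assert((bytes_received % 2u) == 0u);    /* whole 16-bit samples only */
    buffer_level += (int32_t)(bytes_received / 2u);
}

/* Call from both the DMA half-transfer and transfer-complete interrupts:
 * each of them means half of the ring has been sent to I2S. */
void on_dma_half_done(void)
{
    buffer_level -= (int32_t)(RING_SAMPLES / 2u);
}

int32_t get_buffer_level(void)
{
    return buffer_level;
}
```

The read-modify-write on `buffer_level` is only safe if the two interrupts cannot preempt each other, for example because they share the same priority.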

The question is, what changes can/should be applied to the feedback value? At the moment it is fixed at 43.269 kHz by default.

ColdWeather
Senior
Posted on October 11, 2016 at 12:18

Hello, Tsuneo,

> A timer gives better implementation than buffer level, briefly described in /public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/Flat.aspx?RootFolder=/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/oddeven%20bit%20in%20assynchronous#DisplayLink69529

> For STM32F103, Unfortunately, no hardware SOF trigger is provided. SOF interrupt is applied to capture the timer value.

I've been working these days on a solution to synchronize USB and I2S, and I got it.

In general I cannot agree with the claim "A timer gives better implementation than buffer level". Actually, the buffer level is the ultimate output of all control loops.

As I wrote two days ago, in my implementation the EP OUT interrupt adds the number of just-copied samples to a variable. The DMA interrupt on HT and TC in turn subtracts the number of just-transmitted samples (1/2 of the whole ring buffer size) from this variable. Thus, this variable reflects the buffer level. A trick: both interrupts have the same priority, so neither can preempt the other and the operations on the variable are effectively atomic.

The next steps I do are:

- in the DMA interrupt, after the subtraction, I smooth the buffer level slightly with an RC filter (out += (in - out)/rc, where rc is 4 now);

- in the feedback EP IN handler I compute a correction to the sample rate. I set the buffer level set point at 2/3 of the whole buffer size. The reason for 2/3 rather than 1/2 is that I seem to have noticed occasional USB packet losses when the buffer almost underruns, so I give USB an advantage. I apply a linear (proportional) control function to compute the necessary deviation from my average sample rate (43.269 kHz), depending on the distance of the current buffer level from the set point and its sign. The result is then sent as the feedback in the 10.14 format.
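
A minimal sketch of these two steps, not the actual firmware. The gain `kp_10_14` and the clamp `max_corr_10_14` are hypothetical parameters; the filter state is kept scaled by 256 here (an addition to the formula above) so that small differences are not lost to integer truncation:

```c
#include <assert.h>
#include <stdint.h>

#define FB_NOMINAL_10_14 708919   /* 43.269 samples/frame * 2^14, rounded */
#define RC               4        /* filter constant, as in the post */

/* State of the RC filter, scaled by 256. */
typedef struct {
    int32_t out_q8;
} rc_filter_t;

/* One filter step: out += (in - out) / rc. Returns the filtered level. */
int32_t rc_filter_step(rc_filter_t *f, int32_t level)
{
    assert(f != 0);
    f->out_q8 += ((level * 256) - f->out_q8) / RC;
    return f->out_q8 / 256;
}

/* Proportional law: a buffer fuller than the set point asks the host for
 * fewer samples (lower feedback), an emptier one for more. */
uint32_t feedback_from_level(int32_t filtered_level, int32_t set_point,
                             int32_t kp_10_14, int32_t max_corr_10_14)
{
    int32_t corr = (set_point - filtered_level) * kp_10_14;

    /* keep the reported rate within a bounded distance of the nominal one */
    if (corr > max_corr_10_14)
        corr = max_corr_10_14;
    if (corr < -max_corr_10_14)
        corr = -max_corr_10_14;

    return (uint32_t)(FB_NOMINAL_10_14 + corr);
}
```

With a ring of 180 samples, the 2/3 set point would be 120.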

As far as it was possible, I gathered the results of the control loop: see the attached picture.

________________

Attachments :

sample_rate_control.jpg : https://st--c.eu10.content.force.com/sfc/dist/version/download/?oid=00Db0000000YtG6&ids=0680X000006I0Zx&d=%2Fa%2F0X0000000bb6%2FAg7XecfuHTwwg1FWclNCK9T5k.Z.iK6Bswj7kgqmqGU&asPdf=false
ColdWeather
Senior
Posted on October 13, 2016 at 13:24

I applied yet another control function to compute the feedback value: a power-2.2 function (like the gamma curve used by TVs and monitors to match eye perception): y=k*x^2.2. The function rises slowly for small deviations, giving a stable set point, and accelerates substantially for larger deviations to compensate quickly. The attached picture shows far better stability of the feedback value (sample rate).
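
A minimal sketch of such a curve, under stated assumptions. The post does not say how x^2.2 is evaluated on the F103; this sketch swaps in a 9-point lookup table with linear interpolation (an assumption, chosen because the part has no FPU) and applies the curve symmetrically to the sign of the deviation. The function name and Q15 scaling are hypothetical, and the factor k is left to the caller:

```c
#include <assert.h>
#include <stdint.h>

/* x^2.2 sampled at x = 0, 1/8, ..., 1, scaled by 32768 (Q15) and rounded. */
static const int32_t GAMMA22_Q15[9] = {
    0, 338, 1552, 3787, 7132, 11652, 17401, 24427, 32768
};

/* Returns sign(d) * (|d| / max_dev)^2.2 in Q15, where d is the signed
 * distance of the buffer level from the set point. |d| is clamped to
 * max_dev; values between table points are linearly interpolated. */
int32_t gamma22_q15(int32_t deviation, int32_t max_dev)
{
    assert(max_dev > 0);

    int32_t mag = deviation < 0 ? -deviation : deviation;
    if (mag > max_dev)
        mag = max_dev;

    /* position along the 8 table segments, in Q15 */
    int64_t pos_q15 = ((int64_t)mag * 8 * 32768) / max_dev;
    int32_t idx = (int32_t)(pos_q15 >> 15);
    int32_t y;

    if (idx >= 8) {
        y = GAMMA22_Q15[8];
    } else {
        int32_t frac = (int32_t)(pos_q15 & 0x7FFF);
        int32_t step = GAMMA22_Q15[idx + 1] - GAMMA22_Q15[idx];
        y = GAMMA22_Q15[idx] + (int32_t)(((int64_t)step * frac) >> 15);
    }

    return deviation < 0 ? -y : y;
}
```

The correction to the feedback value would then be k times this result, rescaled from Q15.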

________________

Attachments :

sample_rate_control_gamma.jpg : https://st--c.eu10.content.force.com/sfc/dist/version/download/?oid=00Db0000000YtG6&ids=0680X000006I0bY&d=%2Fa%2F0X0000000bb3%2F_QAK6NukJq5ASRZGUxedlQaXHzbnzlTN7QPWW2MXbA4&asPdf=false
tsuneo
Senior
Posted on October 17, 2016 at 10:37

> In general I cannot agree with the claim "A timer gives better implementation than buffer level". Actually, the buffer level is that ultimate output result of all control loops.

Many implementers of asynchronous sinks (speakers) have been misled by the word "feedback". They try to integrate an unnecessary "controller" on the device. You should carefully read this section of the USB 2.0 spec to understand what the spec expects of an asynchronous sink device. Here are excerpts from the spec.

5.12.4.2 Feedback (http://www.usb.org/developers/docs/usb20_docs/usb_20_091zip)

> An asynchronous sink must provide explicit feedback to the host by indicating accurately what its desired data rate (Ff) is, relative to the USB (micro)frame frequency. This allows the host to continuously adjust the number of samples sent to the sink so that neither underflow or overflow of the data buffer occurs... To generate the desired data rate Ff, the device must measure its actual sampling rate Fs, referenced to the USB notion of time, i.e., the USB (micro)frame frequency.

The following sentences of the spec describe how to measure the feedback value using the device master clock/divider and a counter gated by SOF timing, as shown in

https://my.st.com/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/Flat.aspx?RootFolder=/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/oddeven%20bit%20in%20assynchronous#DisplayLink69529

In this context, the device measures, and the host adjusts. That is, the device is just a sensor; the host is the controller. You have to remember that, in a feedback system, a controller doesn't supply the "feedback"; a sensor does.
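
To illustrate the host side of "device measures, host adjusts", here is a purely illustrative sketch of how a host could turn a 10.14 feedback value into a per-frame sample count by carrying the fractional part from frame to frame. It is not a description of what the Windows driver actually does; the names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Fractional samples carried over between frames, in 10.14 format. */
typedef struct {
    uint32_t acc_10_14;
} host_pacer_t;

/* Returns the number of samples (per channel) to send in the next 1 ms
 * frame, given the most recent feedback value reported by the device. */
uint32_t host_samples_for_frame(host_pacer_t *p, uint32_t fb_10_14)
{
    assert(p != 0);
    p->acc_10_14 += fb_10_14;
    uint32_t n = p->acc_10_14 >> 14;    /* whole samples for this frame */
    p->acc_10_14 &= 0x3FFFu;            /* keep only the fractional rest */
    return n;
}
```

With a feedback value of 44.5 samples per frame, this yields packets of 44 and 45 samples alternately; for 16-bit stereo that is 176 and 180 bytes.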

At the end of this section, the spec notes the following about adjusting the feedback value:

> It is possible that the source will deliver one too many or one too few samples over a long period due to errors or accumulated inaccuracies in measuring Ff. The sink must have sufficient buffer capability to accommodate this. When the sink recognizes this condition, it should adjust the reported Ff value to correct it. This may also be necessary to compensate for relative clock drifts.

This adjustment is supposed to occur every ONE second or so (with 10.10 accuracy/1ms frame). It doesn't mean every feedback IN transaction.

The timer/counter method provides an accurate feedback value at the first feedback IN transaction, because SOF and the master clock are available before the host starts audio streaming. On the other hand, all "buffer" methods, including yours, take a couple of seconds after the start of audio streaming to stabilize to the required 10.10 accuracy, because the host should have this time constant.

Do you still insist that "buffer" methods are better than the original one described in the USB spec?

Tsuneo

ColdWeather
Senior
Posted on October 17, 2016 at 16:10

Hello, Tsuneo,

thank you for your reply and arguments!

You, I, and the authors of the USB specification all agree on one point: any measurement is subject to errors. Providing the host with the desired sample rate, measured as the ratio between the SOF period (host clock) and the sink's internal clock, does not eliminate buffer under-/overruns in principle: "It is possible that the source will deliver one too many or one too few samples over a long period due to errors or accumulated inaccuracies in measuring Ff." This sentence just confirms that the buffer level is, at the bottom line, decisive for the whole system.

The intention of the USB specification regarding the audio system is to provide "error-free" playback as far as possible at any time, including at the very beginning. As a solution, the USB specification prescribes measuring the SOF period "in advance" so that the sample rate is "ready" even for the very first feedback IN request, when (absolutely true) any buffer level state is not yet valid.

Now the theory meets reality.

In general, it does not matter where the very first sample rate feedback value comes from: from live SOF period measurements by the CPU, or from documentation written by people who have made the measurements before. Designing a system with all its restrictions and limitations (for instance, an F103 without any SOF connection to timers, memory limits for buffers because of other tasks the CPU has to do, etc.), I knew, for example, that the sample rate would be 43.269 kHz, and I provide this value as the very first feedback. The point is that this value is no better or worse than any live SOF measurement subject to errors. After that I just do the inevitable, Mr. Anderson: I measure the buffer level only.

I don't agree with the claim that the sink does not "control" anything at all.

Well, it provides the host with the "measured" value as a sensor, and the host decides how many samples to send. But the sink still has to compute the feedback value: only the sink can estimate the USB packet "utilization" and react to any packet losses in real life. The behaviour of the designed system shows that computing the feedback value according to known control algorithms (PID, without the I and D for now) and applying the special function helps the host keep the final data rate precise enough to prevent any buffer edge effects.

You see, the system is not limited to the buffer level measurement only. It measures the SOF, too :).

Best regards,

Igor "ColdWeather" Ivanov.