View Full Version : Sync question for you.


Alastair Brown
June 23rd, 2008, 04:39 AM
Posting this here after some advice on the Vegas forum.

Here is one for you audio experts.

I recorded some audio using an Olympus DS-30 Digital Voice Recorder. It was at its highest quality setting, recording in wma format at 44.1 kHz / 128 kbps.
My problem arose when I then tried to sync this up to my XH-A1 HD Footage.
The sync very quickly drifted out.
I'm a Vegas user.
Only solution I could get was to go to the clip properties for the Olympus track and do the following:-
Go to Time stretch/pitch shift
Method Classic
Set the new length value to approx 99.76% of the original length. (too tired to work it out exactly..sorry!)
You then have to go back and re-sync the start of the clip.
After that you go to the end of the clip and just adjust the new length a fraction at a time until you are in sync.
Took a few attempts having to adjust/re-sync the start and check but it's now spot on.
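The trial-and-error in the steps above can be shortened if the same two events (for example a clap near the start and another near the end) can be located on both the camera track and the recorder track. A minimal sketch in Python; the function name and the example times are hypothetical, not measurements from this thread:

```python
def stretch_percent(cam_start, cam_end, rec_start, rec_end):
    """New clip length, as a percentage of the original, that makes the
    recorder audio span the same time as the camera between two sync points."""
    cam_span = cam_end - cam_start
    rec_span = rec_end - rec_start
    if cam_span <= 0 or rec_span <= 0:
        raise ValueError("end sync point must come after start sync point")
    return 100.0 * cam_span / rec_span

# Hypothetical example: two claps 3600.00 s apart on the camera track
# appear 3608.66 s apart on the recorder track.
print(round(stretch_percent(10.0, 3610.0, 12.0, 3620.66), 2))
```

The result is the figure to enter as the new length; the start of the clip still has to be re-aligned once after stretching.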


So... anybody got an explanation for what's going on, or a more elegant solution that isn't so hit-or-miss?

Up until now, I have been using a mini disc recorder, which worked great apart from the slow upload times.

Tom Hardwick
June 23rd, 2008, 05:43 AM
I'm the same as you Alastair - I've used Minidisc for years and they happily sync up on my timeline even for hour-long takes in RC churches. So I thought it was time I moved into the digital age and bought myself a Zoom H2, but whatever format I record in (wav or MP3) and at whatever quality setting, sync is soon lost. More post fiddling needed.

tom.

Dave Anderson
June 23rd, 2008, 07:15 AM
It is my understanding, from everything I have read here, that drifting is common. How much drift there is will always vary due to the time clocks in each recording unit.

As for the DS-30, I've also read that it is best to resample the audio from the 44.1 kHz wma format to a 16-bit 48 kHz wav. I also understand this will help with the drift.

Then I came across this forum, http://www.mfbb.net/myvideoproblems/myvideoproblems-about25.html. It talks about synchronising external audio with video audio. I haven't tried it yet, but I'll be doing my first wedding this weekend using the ds-30 and I hope that discussion will help me.

Richard Gooderick
June 23rd, 2008, 08:14 AM
Same here. I've been using an HHB minidisc and it has stayed in sync for very long takes.

For what it's worth I am now also using a Fostex FR2 LE (flash memory recorder) with similar results.

I have read that an audio recorder with timecode is de rigueur in some quarters but this system seems to work OK for me.

Brooks Harrington
June 23rd, 2008, 10:13 AM
First thing is to render a Sample Rate Conversion from 44.1 to 48K before importing to work with picture.

Steve House
June 23rd, 2008, 10:14 AM
In a digital video workflow, with wav files produced by the audio recorder, timecode DOES NOT in itself prevent sync drift. What it does is establish a 'line-up' point similar to a slate that lets you align the video and audio files in your editor. But that's only one point ... if the files are playing back at slightly different rates (caused by the video and audio sample clocks not being identical) they will drift out of sync when you move away from that alignment frame. What is necessary to prevent drift is to slave the camera sync and the audio sample clock to a common timebase. This can be done in a variety of ways - one, for example, is to use an audio recorder such as the Tascam HD-P2 or the new Sound Devices 788t that accepts a composite video or video blackburst signal from the camera and slaves its clock to it. Multi-camera shoots and shoots where you need the camera and recorder to be untethered use tools like Lock-it Boxes, which are highly accurate clocks that are tuned to each other and attach to both the cameras and the recorder, supplying genlock to the camera and word-clock to the audio recorder (obviously this requires you to be using pro cameras that accept genlock sync input - a rarity in prosumer cameras).

Alastair Brown
June 23rd, 2008, 10:49 AM
First thing is to render a Sample Rate Conversion from 44.1 to 48K before importing to work with picture.


This one keeps cropping up. However, if I do this and then place both the wav and wma files together on the timeline, they align perfectly, which means the sync issue will still happen.

I still don't get how a mini-disc IS in sync yet a digital recorder isn't. Heard that ipods also don't drift.

Steve House
June 23rd, 2008, 10:51 AM
This one keeps cropping up. However, if I do this and then place both the wav and wma files together on the timeline, they align perfectly, which means the sync issue will still happen.

I still don't get how a mini-disc IS in sync yet a digital recorder isn't. Heard that ipods also don't drift.

I think it's more a matter of degree than anything else.

Brooks Harrington
June 23rd, 2008, 11:38 AM
In Vegas are you using a 48K sample rate for your timeline? I'm assuming you are.
To put a 44.1k file in a 48k timeline is asking for trouble.
Proceed in a logical manner.

Pietro Impagliazzo
June 23rd, 2008, 12:20 PM
I was going to ask the same.

If you're recording to sync up with video, 48 kHz would be the way to go.

Someone recently said (can't remember who) that 44.1 kHz won't sync up with video properly because some things don't match (I'm sorry, can't remember exactly what).

Alastair Brown
June 23rd, 2008, 01:52 PM
In Vegas are you using a 48K sample rate for your timeline? I'm assuming you are.
To put a 44.1k file in a 48k timeline is asking for trouble.
Proceed in a logical manner.

Tried all sorts of combinations and still the same. I think the article Dave Anderson posted has come closest to explaining what's going on. SUKS, but at least it's a reason!

http://www.mfbb.net/myvideoproblems/myvideoproblems-about25.html

Seth Bloombaum
June 23rd, 2008, 02:39 PM
The link to Anderson's explanation is broken.

The reason is quite simple: most consumer/music market audio devices aren't designed to tightly meet the timebase standard; that is, they run on internal clocks that in many cases aren't running at precisely the right rate. This means that instead of recording 44,100 samples in a second, the number of samples will be off. The NLE assumes that 44,100, or 48,000 samples equals one second on the timeline.
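To put a number on "the number of samples will be off": if the file's length as shown by the NLE and the true length of the event (for example from the camera timeline) are both known, the recorder's real sample rate can be estimated. A rough sketch in Python; the function names and example figures are hypothetical:

```python
def actual_sample_rate(nominal_hz, file_seconds, true_seconds):
    """Estimate the recorder's real sample rate.

    file_seconds: duration the NLE reports, assuming nominal_hz samples = 1 s.
    true_seconds: how long the recorded event really lasted.
    """
    samples_in_file = nominal_hz * file_seconds
    return samples_in_file / true_seconds

def clock_error_ppm(nominal_hz, actual_hz):
    """Clock error in parts per million (positive means the clock runs fast)."""
    return (actual_hz / nominal_hz - 1.0) * 1e6

# Hypothetical: a one-hour event shows up as 3608.66 s of 44.1 kHz audio.
rate = actual_sample_rate(44100, 3608.66, 3600.0)
print(round(rate, 1), "Hz,", round(clock_error_ppm(44100, rate)), "ppm")
```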

A little error of a few percentage points isn't an issue unless you're trying to sync to another device.

An hour-long take with my HHB Minidisc has always been right on the money, my Zoom H4 not so much.

So, you make adjustments or buy better equipment. Your garden-variety dictation device, like the Olympus, was never designed for sync. This is rarely a problem in short takes, but always shows up in event coverage where you have hour-long takes.

Alastair Brown
June 23rd, 2008, 02:50 PM
My mini-disc is the same. Never out, never a problem. So what's the difference? They are still recording zeros and ones?

Link sorted.

Dave Blackhurst
June 23rd, 2008, 03:52 PM
Here's the easy explanation - the internal workings of the recording device have a "clock" - much like the things that regulate the speed of an old analog tape deck.

These little "clock" chips come off a production line somewhere in China most likely, produced with some tolerancing no doubt, but unknown Quality Control. All it takes is a slight variation (or a minor design flaw), and the clocks in your devices run slightly slower or faster.

Remember the old saying, "man who has two watches never knows what time it is". Your cam has one "watch", your sound recorder another - add to the mix different sample rates, and now you have most of the problem sussed out.

At least it's not like old analog gear that could speed up and slow down WHILE recording or playing back... the "drift" should be consistent, and easier to correct for.

Dave Anderson
June 24th, 2008, 07:15 AM
Just as 2 clocks are never the same time, off by fractions of a second, 2 time-clocks in recording devices are never the same.

To minimize the drift, and if you don't have access to a common time-sync device, you should try and use the same model camera or at least from the same manufacturer.

I have 2 of the same model cams and I have experienced a drift after 2 hours of continuous video. The drift was 1 frame, but still a drift.

It all goes back to the chip that keeps time in the recording unit. No 2 chips will ever be 100% in sync. They should be close and I would even say at short lengths it also should not be a problem. But after an hour I would understand a slight drift and would not mind correcting for it if the price was good audio.

A. J. deLange
June 24th, 2008, 03:39 PM
Just as 2 clocks are never the same time, off by fractions of a second, 2 time-clocks in recording devices are never the same.

The fact that a pair of clocks read a fraction of a second apart at any particular time is half the story, i.e. the offset. The other half is that the rates are slightly different. In the time one ticks off 100 seconds the other may tick off 101 (or 99 or some other number). It is the role of wordclock, blackburst or video sync to make two entities' (cameras', recorders') clocks run at the same rate. It is the role of time code to mark recordings at the same instant of time with the same "address" (time stamp) though many pieces of equipment are capable of using time code for the rate as well. There are 80 bits of timecode information in each of 30 time code frames (not necessarily the same as video frames) per second for 2400 bps which when multiplied by 20 gives 48,000 (or by 40 gives 96,000) sps wordclock for audio. When the frame rate is 30 that's 1600 audio samples per frame. When it's 29.97... it's 1601.6 and it gets trickier. And so on for other frame rates.
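The arithmetic in this post can be checked with exact fractions. A small sketch in Python, assuming "29.97..." means the NTSC rate of 30000/1001 frames per second; the function name is made up for illustration:

```python
from fractions import Fraction

def samples_per_frame(sample_rate, frame_rate):
    """Exact number of audio samples that fit in one frame."""
    return Fraction(sample_rate) / Fraction(frame_rate)

# 80 bits per time code frame, 30 frames per second
ltc_bits_per_second = 80 * 30                 # 2400 bps
wordclock_48k = ltc_bits_per_second * 20      # 48,000 samples per second
wordclock_96k = ltc_bits_per_second * 40      # 96,000 samples per second

NTSC = Fraction(30000, 1001)                  # assumed meaning of "29.97..."

print(samples_per_frame(48000, 30))           # whole number of samples per frame
print(float(samples_per_frame(48000, NTSC)))  # not a whole number, hence "trickier"
```

Because the NTSC result is not a whole number, samples only line up with frame boundaries every five frames (8008 samples).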

Having things run at the same rate is more important than time stamping because you can usually figure out the offset in post but time code is handy. The trick is to make sure you have synchronized blackburst for the camera(s) and wordclock for the recorders. Both are available from the aforementioned Lockit and Clockit boxes. Synchronizers (e.g. MOTU Time Piece) can derive word clock from time code as can some recorders so a camera which puts out LTC (Linear Time Code), which many do over their LANC ports, can often be sync'ed to a recorder. The combinations of ways in which this can be done are numerous.

Always keep in mind that a LTC signal is an audio signal which, if recorded on a spare track, will ultimately always be available to tell you what time it is (so long as you have a way to read it).

Steve House
June 24th, 2008, 04:49 PM
My mini-disc is the same. Never out, never a problem. So what's the difference? They are still recording zeros and ones?

Link sorted.


The difference is due to the way the recording is transferred into the NLE with a minidisc versus a file-based recorder. While they are both recording ones and zeros, the file-based recorder records a bwf (wav) file which is simply copied over onto the editing computer - it's a straight file copy. The minidisc, otoh, effectively "plays" the recording to transfer it into the editing workstation, sort of like hooking the digital output of one device into the digital input of another. It's still a digital transfer but it's not just a direct file copy. Why should this matter? Consider two recorders, one file-based and the other a minidisc. For simplicity let's say they are both recording at a 48 kHz sample rate and both of them have sample clocks that run 10% too fast. They're recording a sound that lasts exactly 1 second. In each case the recording is supposed to contain 48000 samples but because the clock is ticking too fast, it actually has 52800 samples (48000 plus 4800). With the file-based recorder, we copy its file into an editing computer whose clock is running exactly on-spec at 48 kHz and play it back. After 1 second of playback time it will have played 48000 samples. But our file still has another 4800 samples to go, which will take the workstation another 0.1 second to get through. In other words, the editing computer will take 1.1 seconds to play back the number of samples that were originally recorded in 1 second. The sound is running slow. But consider the same scenario with the minidisc. Even though its clock is also ticking 10% too fast and it records 52800 samples in one second instead of 48000, when it is being transferred into the computer it is effectively being "played" on the recorder itself, the playback rate governed not by the workstation's more accurate clock but instead by the recorder's 10% fast clock. The result is that the 52800 samples play in 1 second, NOT 1.1 seconds, and the event time isn't distorted.
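The two cases in this example come down to which clock divides the sample count. A minimal sketch in Python using the post's own figures (a 48 kHz nominal rate and a clock 10% fast); the function and variable names are made up for illustration:

```python
def playback_seconds(sample_count, playback_rate_hz):
    """Time taken to play a given number of samples at a given clock rate."""
    return sample_count / playback_rate_hz

NOMINAL_HZ = 48000     # rate the editing computer assumes
RECORDER_HZ = 52800    # recorder clock running 10% fast

samples = RECORDER_HZ * 1   # samples captured during a 1-second event

file_copy = playback_seconds(samples, NOMINAL_HZ)    # file-based recorder path
real_time = playback_seconds(samples, RECORDER_HZ)   # minidisc-style transfer

print(f"file copy: {file_copy:.2f} s, real-time transfer: {real_time:.2f} s")
# prints "file copy: 1.10 s, real-time transfer: 1.00 s"
```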

It's a shame the whole digital rights management brouhaha led Sony to effectively cripple the minidisc technology. It had some distinct advantages that gave it a lot of promise, including the aforementioned behavior. But I'm afraid it's now a technology that's as anachronistic as DAT or analogue audio tape, or 2 inch quad, 1 inch helical, or VHS and Beta videotapes.

Bill Davis
June 24th, 2008, 05:45 PM
I think this is getting WAY too complicated to solve his actual problem.

You DO NOT need extra hardware to solve this.

Look, the problem is fundamental.
The standard for DV (plain old digital video) is a 25 Mbps video stream and two 48 kHz audio streams.
If you bring in something other than that, it needs to be re-calculated. That's all.

Fixing this is largely trivial.

Just export your audio to a file and RESAMPLE that file into 48khz audio. Period. End of problem.

I work with Quicktime and Final Cut Pro - not Vegas - but back in the early versions of FCP before they coded in "on the fly" audio resampling - I had to do this all the time.

We'd get CD audio that was 44.1khz and if it simply got stuck on the timeline - the audio would drift.

The solution was always to simply export the audio from the timeline as an AIFF file, open it in Quicktime Pro, change the sample rate to 48 kHz, and replace the original timeline audio with the newly resampled file.

Presto - NO MORE DRIFT.

I'm confident you have equally simple and easy to operate tools in the PC/Vegas world.

This is just MATH - nothing crazy - and we all know that computers are REALLY good at math.

Just re-sample the audio and get on with building your video.

Good luck.

A. J. deLange
June 24th, 2008, 07:05 PM
Just export your audio to a file and RESAMPLE that file into 48khz audio. Period. End of problem.


This will work if and only if the other (recorder) clock is synchronous (in terms of rate) with the camera clock. Interpolation to 48kHz from 44.1kHz only means that the resampler computes 480 evenly spaced output samples from every 441 input samples. Were the recorder clock off 1% its sample rate would be 44.541 kHz and the resampled file would be sampled at an effective 48.480 kHz which would drift relative to camera sound and video. So if you are lucky (as some have said they are) or you can tolerate small amounts of relative drift this method is OK. Otherwise, you must do something additional. Solutions in software are possible if you have in and out sync points on recording and video. Two clapper strokes 10 sec apart would be 445,410 samples apart on the recorder with 1% clock error. In many NLEs and DAWs one can put cursors on such audio to stretch or shrink it to match the video in which case the software would interpolate 480,000 samples from the 445,410 input samples (arbitrary sample rate conversion is possible - it's just math though not trivial math) thus effectively locking recorder and camera clocks.
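The numbers in this post can be reproduced exactly. A sketch in Python, assuming the 1% clock error and the two clapper strokes 10 seconds apart described above; the function names are hypothetical:

```python
from fractions import Fraction

def naive_resample_count(input_samples, nominal_in=44100, nominal_out=48000):
    """Output sample count from a plain nominal-rate conversion."""
    return input_samples * Fraction(nominal_out, nominal_in)

def sync_point_ratio(input_samples, true_seconds, nominal_out=48000):
    """Resampling ratio that maps the input samples onto the true duration."""
    return Fraction(round(true_seconds * nominal_out), input_samples)

true_rate = 44541                  # 44.1 kHz clock running 1% fast
input_samples = true_rate * 10     # clapper strokes 10 s apart: 445,410 samples

naive = naive_resample_count(input_samples)
print(naive, "samples, playing for", float(naive / 48000), "s at 48 kHz")

ratio = sync_point_ratio(input_samples, 10)
print("stretch-to-fit ratio:", float(ratio), "giving", input_samples * ratio, "samples")
```

The plain 44.1-to-48 conversion keeps the 1% error (the 10-second event still plays for 10.1 seconds), while the ratio derived from the two sync points yields exactly 480,000 samples.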

Were the quote true there would be no market for synchronizers, black burst generators, cameras with "Jack Packs", Lockits, Clockits etc. This gear is expensive, sometimes tricky to master and makes for more complicated setups so if synced recording is an occasional requirement and if approximate sync is sufficient by all means use resampling (with stretching if necessary) and manual time alignment. For professional work electronic syncing is a must.

Bill Davis
June 24th, 2008, 10:46 PM
AJ,

Don't mean to be a pest about this but I haven't seen a temporal offset DUE TO SAMPLE RATE problems that was as hard to solve as you allege since engineers went to crystal controlled clocks a LONG time ago.

(The closest I remember "in the modern age" is back in the late 90's when Canon released a camcorder (the ORIGINAL XL-1?) that had a slightly wonky sample rate. So the guys at Apple had to put a check box into Final Cut Pro 1.5 for guys who were shooting that particular camcorder. Even THEN checking the box let the Mac re-calculate and - POOF - no drift - everything was fine.)

The kind of delay that the tools you mention in your post are designed to deal with is TRANSMISSION delay. Like sending audio or video through a few hundred feet of co-ax plus a few routers and switches for good measure in a studio facility. Yeah, those kind of timing delays may ABSOLUTELY need hardware to address.

But blackburst and genlock were developed to keep SEPARATE machines synced in REAL TIME. The OP is NOT dealing with separate machines. Just reading the same file on two separate devices.

And I haven't seen anyone design a camcorder circuit that drifts as badly as you're talking about in DECADES. Plus it's writing its data to a digital stream right AT THE ENCODER.

Then taking those ones and zeros and digitally transferring them (with checksum and "keep the numbers accurate" math going on) into yet another system (likely IEEE1394 buss fed or similar) where there is AGAIN essentially NO opportunity for the kind of transmission delay you note.

What the OP is facing is NOT complicated. It's just MATH. Pretty simple math for any reasonable computer processor at that.

I'm saying this is a solution because I've been DOING it successfully for a decade now.

In fact, it's a stupid debate.

OP - just TRY what I suggested. Take the audio file and re-sample it right in your computer. At the MOST, you'll need a free downloadable audio program that can re-sample WAV or AIFF files. So my suggestion means you don't have to go drag down the 100 pound Allied catalog or spend the weekend on-line ordering stuff then waiting for UPS. It costs you NOTHING.

Just try it.

And I bet you report back that everything on your timeline comes out JUST FINE.

Sorry guys, but I got to tell you that there's a HUGE tendency in this business to try and "over think" solutions. (not saying that there aren't things that need serious effort and expensive gear to solve, just that reading a DV tape into a computer and keeping the audio and video in sync isn't one of them) Not after twenty years of constant NLE refinement.

If your signals are drifting that badly, you're doing something WRONG. Perhaps simply recording at a rate your timeline isn't properly set up to handle. No need for anguish or expense. Just tell the computer to FIX it and move on.

Good luck.

Ty Ford
June 25th, 2008, 05:20 AM
Yes, but what's interesting is that the humble consumer minidisc works, but the H4 doesn't.

Regards,

Ty Ford

Michael Liebergot
June 25th, 2008, 06:16 AM
Yes, but what's interesting is that the humble consumer minidisc works, but the H4 doesn't.

Regards,

Ty Ford
That's because the Zoom recorders use reeeeeeeeal cheap crystals from China.
I believe Sony, Marantz, Edirol use crystals from Japan.

Funny it used to be a running joke about things being made in Japan.
Now China is the joke, and Japan means quality.

Go figure.

A. J. deLange
June 25th, 2008, 07:00 AM
Whether the simple technique Bill advocates works or not depends on 3 things: the luck of the draw, the practices of the manufacturers of both pieces of equipment and what your tolerance for relative drift is. A decent crystal oscillator is typically specced to about plus or minus 25 ppm. If you are lucky both pieces of equipment will have the same error or nearly the same error. If you are unlucky one piece may be off plus 25 and the other off -25 for a difference of 50. That's 50 seconds time base error in a million or 1 frame every 11 minutes. Even so for some applications that's acceptable while for others it is not. It is not, of course, likely that one crystal would be at the maximum allowable drift and the other at the other. More likely one would be at say plus 12 and the other at say plus 17 for a difference of 5 (and it is definitely difference in running rates of oscillators that we are speaking of and not cable or switcher delay), something like 1/10 the error in the max case, which would amount to 1 frame in 111 minutes. Better, but in some applications still intolerable.
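The frame-drift figures above follow directly from the ppm difference between the two clocks. A quick sketch in Python, assuming a 29.97 fps (30000/1001) frame rate; the function name is hypothetical:

```python
def minutes_per_frame_of_drift(ppm_difference, frame_rate=30000 / 1001):
    """Minutes of recording before two clocks differ by one video frame."""
    frame_seconds = 1.0 / frame_rate
    seconds_to_drift = frame_seconds / (ppm_difference * 1e-6)
    return seconds_to_drift / 60.0

# Worst case: +25 ppm against -25 ppm
print(round(minutes_per_frame_of_drift(50), 1), "minutes")
# More typical: +12 ppm against +17 ppm
print(round(minutes_per_frame_of_drift(5), 1), "minutes")
```

A 50 ppm difference gives roughly 11 minutes per frame and a 5 ppm difference roughly 111 minutes, matching the figures in the post.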

To the crystal's basic tolerance we must add temperature instability, voltage instability and ageing which taken together simply mean that the drift between units can be greater than the example numbers I've thrown out. All of these, including tolerance, can be compensated for - voltage instability by good voltage regulation, temperature instability by ovenizing, basic tolerance by careful selection and ageing by recalibration. The Lockit boxes solve the sync problem using high quality crystal oscillators. Ambient gives careful consideration in their design to all these factors. The manufacturer of a consumer camera (which costs less than a Lockit) does not nor, in general, do the manufacturers of prosumer or even pro cameras to the same extent. If they did, Ambient would not have a market. Even with the Lockit approach (in which each crystal is locked to a master at the beginning of the shoot/day/scene) there will be some drift though it is hundredths of a frame (10's of audio samples at 48K per day). In some cases this is not tolerable and in those hard wiring or radio linked sync is required.

I hope that this helps to make it clear why some people can get by without sync and others can't. One thing that has not been mentioned which may help when running without is to try to have all pieces of gear at the same temperature.

Again I emphasize that I am not talking about transmission delays which result in clock phase error but rather clock rate (phase rate) error. Transmission delay must be accounted for if all signals resulting from an event are to arrive at a given point at the same epoch. Time code is there to help with this but cameras, for example, often have controls which allow pre or post triggering adjustment. With sound it is not the cables or radio links (which contribute delays in the 10's to hundreds of nanoseconds) but propagation of soundwaves in the air. A camera 30 feet away from a podium will record sound 30 ms (1440 samples at 48K) later than the speaker's mic. But it is a fixed offset. Certain that camera and speaker's recorder clocks are running at the same RATE (seconds per second - the clocks can be different) one can confidently make the PHASE adjustment once and rely on it even if the guy goes on for an hour or more.
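The fixed acoustic offset can be estimated from distance alone. A small sketch in Python, assuming a speed of sound of about 1125 ft/s (roughly 343 m/s at room temperature); the constant and function name are illustrative. The post's 30 ms figure corresponds to the common rule of thumb of about 1 ms per foot, and the value computed here comes out slightly lower:

```python
SPEED_OF_SOUND_FT_PER_S = 1125.0   # assumption: about 343 m/s, room temperature

def acoustic_delay(distance_ft, sample_rate=48000):
    """Return (delay in seconds, delay in whole samples) for a given distance."""
    delay_s = distance_ft / SPEED_OF_SOUND_FT_PER_S
    return delay_s, round(delay_s * sample_rate)

delay_s, delay_samples = acoustic_delay(30)
print(f"{delay_s * 1000:.1f} ms, {delay_samples} samples at 48 kHz")
```

Either way it is a constant offset, so it is corrected once by sliding the track rather than by stretching it.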

Since people are throwing out personal experiences I'll give mine. Two XL series cameras drift enough that the differences are plain to the casual observer after about 15 minutes. Now these are my 2 cameras and your 2 may be fine for an hour (or for 5 minutes).

Dave Anderson
June 25th, 2008, 07:32 AM
Wow, this thread went way beyond me. But from what I've read, I have a question. Why can't we just hook our recorders to the line-in of our computers, play the file back on the original recorder and capture it on the pc just like we capture the video from our cams? Wouldn't this make the sampling problem not a problem?

If I'm understanding what I've read, then capturing everything from all the different devices to the same device should result in everything being in sync. Yes it would take more time to capture the audio in real time, but it would be worth it for me to get great audio.

Steve House
June 25th, 2008, 08:27 AM
Wow, this thread went way beyond me. But from what I've read, I have a question. Why can't we just hook our recorders to the line-in of our computers, play the file back on the original recorder and capture it on the pc just like we capture the video from our cams? Wouldn't this make the sampling problem not a problem?

If I'm understanding what I've read, then capturing everything from all the different devices to the same device should result in everything being in sync. Yes it would take more time to capture the audio in real time, but it would be worth it for me to get great audio.


You could do that and probably reduce the drift but you would be going between the analog and digital domains with the subsequent risk of generational losses. Like copying an old VHS tape to another tape or an audio cassette to another audio cassette. You could probably get away with it for a few generations with pro gear, but quality loss would be a risk.

Dave Anderson
June 25th, 2008, 08:41 AM
You could do that and probably reduce the drift but you would be going between the analog and digital domains with the subsequent risk of generational losses. Like copying an old VHS tape to another tape or an audio cassette to another audio cassette.

Ah, didn't think about that. Thanks.

A. J. deLange
June 25th, 2008, 09:29 AM
Why can't we just hook our recorders to the line-in of our computers, play the file back on the original recorder and capture it on the pc just like we capture the video from our cams? Wouldn't this make the sampling problem not a problem?


When you do what you propose the player produces signal at every instant of time at the correct rate (assuming the playback and record clocks are the same which they should be for the most part except for temperature and voltage induced drift) and the "recorder" samples this continuous waveform at the correct times. This is essentially the same as what happens when you import a file from a recorder and tell your DAW/NLE the start and end points of the file (or some part of it) relative to reference start and end points (slate or other short duration sounds) in the camera audio or video. The software computes the values (usually between recorder samples) at the times it needs samples to be "locked" to the video. There is no dual (D/A plus A/D) degradation but the filters which do the interpolation must be properly designed to maintain NPR (noise power ratio - a type of signal to noise ratio).

Jeff Kellam
June 25th, 2008, 01:45 PM
Ah, didn't think about that. Thanks.

Most recorders and soundcards have optical (digital) inputs/outputs nowadays.

Richard Gooderick
June 25th, 2008, 04:23 PM
That's because the Zoom recorders use reeeeeeeeal cheap crystals from China.
I believe Sony, Marantz, Edirol use crystals from Japan.

Funny it used to be a running joke about things being made in Japan.
Now China is the joke, and Japan means quality.

Go figure.

My Fostex FR2 LE is made in China and it works fine thank you.

Steve House
June 25th, 2008, 04:32 PM
.... There is no dual (D/A plus A/D) degradation but the filters which do the interpolation must be properly designed to maintain NPR (noise power ratio - a type of signal to noise ratio).

True, but while the signal is in the analog domain we have to be careful how we handle it or else we run the risk of degrading it. Being "careful" with analog signal handling is why a Nagra IV was slightly more expensive than a consumer cassette recorder of similar vintage <grin>.

Shaun Roemich
June 25th, 2008, 04:38 PM
Long story short: Listen to Bill, convert the 44.1Khz files to 48KHz and when everything works, then we can all sit around navel gazing. I would have thought NLEs could do this seamlessly by now but they don't. Convert ALL audio to the frequency of your timeline. Yes, in 30 minutes there may be SOME drift but you will be close. Like 1 or 2 frames close, barring anything unforeseen.

Do it and then tell us how well it worked.