Sample Rate Conversion in Double System Sound at DVinfo.net

Steve House · December 24th, 2005, 09:25 AM

Quote:

Originally Posted by Ty Ford

Actually, the Sony M100 Hi-MD recorder does 16-bit 44.1 wav files and sounds pretty darn nice. ... (That would require some sample rate conversion, which might be problematic ...)

Happened to wake up this morning thinking about sample rate conversions (the mind works in strange ways) and maintaining sync. With audio-for-video using a 48k sample rate but the most common rate with inexpensive recorders being the CD norm of 44.1k, as you point out we need to convert the rate when importing the audio into our project. If we don't, one second's worth of audio, 44,100 samples, played at 48,000 samples per second will play back in 0.91875 seconds, falling out of sync at a rate of about 2.4 frames for each second of video at 30 fps. So we convert by resampling as you note. To increase the 44,100 samples in each second of audio to 48,000, we have to add 3900 samples. But where does the data for those samples come from? We could duplicate roughly every 11th sample in the source file, or we could look at each pair of samples in the original file and interpolate between them to come up with a "guess-timate" of what the extra sample or two would have been had it been recorded at 48k. But it seems to me that either method would potentially introduce noise and/or distortion.
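The sync arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the NTSC frame rate of 30000/1001 (about 29.97 fps) is an assumption, since the post does not say which frame rate is in use.

```python
from fractions import Fraction

SOURCE_RATE = 44_100   # samples per second as recorded
PROJECT_RATE = 48_000  # samples per second assumed by the video project


def playback_seconds(samples, rate):
    """Duration of `samples` samples when played at `rate` samples per second."""
    return Fraction(samples, rate)


def drift_frames_per_second(source_rate, project_rate, fps):
    """Frames of sync lost per recorded second if the rate is not converted."""
    played = playback_seconds(source_rate, project_rate)  # one recorded second
    return float((1 - played) * fps)


ntsc = Fraction(30000, 1001)  # assumed frame rate, not stated in the post
print(float(playback_seconds(SOURCE_RATE, PROJECT_RATE)))        # 0.91875
print(drift_frames_per_second(SOURCE_RATE, PROJECT_RATE, ntsc))  # roughly 2.4
```

At 25 fps the same calculation gives about 2 frames per second, so the exact drift depends on the project frame rate.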

Sample rate conversions where the two rates are even integer multiples of each other seem like they would be straightforward and free of introduced distortion: to record in 96k and resample to 48k you'd just drop every alternate sample. Going the other way, you'd just have to add a sample between each original pair of samples that is the average of the two original samples. But where they're not even integer multiples, such as a 44.1k file going into a 48k project, it looks like there is at least the potential for a significant loss of quality from the resampling process itself.
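To make the integer versus non-integer distinction concrete, here is a small Python sketch. The helper names are hypothetical, and the two functions implement only the naive halving and doubling exactly as described in the post; they are not what a real converter does, since practical tools filter the signal as part of the conversion.

```python
from fractions import Fraction


def rate_ratio(src_rate, dst_rate):
    """Reduced ratio of output samples to input samples."""
    return Fraction(dst_rate, src_rate)


def drop_alternate(samples):
    """Naive 2:1 reduction from the post: keep every other sample."""
    return samples[::2]


def insert_averages(samples):
    """Naive 1:2 increase from the post: add the mean of each neighbouring pair."""
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        out.extend([a, (a + b) / 2])
    out.append(samples[-1])  # the last sample has no right-hand neighbour
    return out


print(rate_ratio(96_000, 48_000))  # 1/2
print(rate_ratio(44_100, 48_000))  # 160/147
print(insert_averages([0, 2, 4]))  # [0, 1.0, 2, 3.0, 4]
```

The 160/147 result is the awkward part: 160 output samples have to be produced for every 147 input samples, so most output samples fall between the original sample instants.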

Comments?

Bob Grant · December 24th, 2005, 05:15 PM

I use Sony's Vegas, which by default resamples perfectly. You face the same issues taking audio from CDs and from DAT recordings made at 44.1 kHz. None of this should be a problem with modern software. I know with early versions of FCP one did have to have everything on the timeline at the same sample rate, which meant a trip through another app, but I think that's been addressed.
Apart from that, though, a much bigger problem is that without being able to genlock everything on the shoot, things do run at slightly different clock rates. The biggest problem I have is the CD players at live events.
In post I replace the feed from the desk with audio from the CDs and it can be half a beat out at the end of a track. Again with Vegas not a problem: just line up the waveforms at the start and Ctrl-drag the end to line up the waveforms at the end. Typically less than 0.2% correction is required.
Most recorders these days do support 48 kHz, though; in fact I can't say I've noticed one that doesn't. I usually record at 24-bit 48 kHz. That way I can keep my levels a bit lower so there's no risk of clipping and still have enough headroom to bring the levels up in post.
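The size of the end-to-end correction for clock drift can be estimated with a one-line calculation. A minimal sketch in Python, with a hypothetical function name and made-up example numbers (a 4-minute track whose replacement audio runs a quarter of a second long, which is half a beat at 120 bpm):

```python
def stretch_percent(reference_seconds, drifted_seconds):
    """Percent by which the drifted clip must be stretched (+) or squeezed (-)
    so that it matches the reference length."""
    return (reference_seconds / drifted_seconds - 1) * 100


# Hypothetical example: the clip should last 240 s but actually runs 240.25 s.
print(round(stretch_percent(240.0, 240.25), 3))  # -0.104
```

With these assumed numbers the correction is about a tenth of a percent, which sits comfortably inside the "less than 0.2%" range mentioned above.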

Steve House · December 25th, 2005, 08:18 AM

Quote:

Originally Posted by Bob Grant

I use Sony's Vegas, which by default resamples perfectly. You face the same issues taking audio from CDs and from DAT recordings made at 44.1 kHz. None of this should be a problem with modern software. I know with early versions of FCP one did have to have everything on the timeline at the same sample rate, which meant a trip through another app, but I think that's been addressed.
Apart from that, though, a much bigger problem is that without being able to genlock everything on the shoot, things do run at slightly different clock rates. The biggest problem I have is the CD players at live events.
In post I replace the feed from the desk with audio from the CDs and it can be half a beat out at the end of a track. Again with Vegas not a problem: just line up the waveforms at the start and Ctrl-drag the end to line up the waveforms at the end. Typically less than 0.2% correction is required.
Most recorders these days do support 48 kHz, though; in fact I can't say I've noticed one that doesn't. I usually record at 24-bit 48 kHz. That way I can keep my levels a bit lower so there's no risk of clipping and still have enough headroom to bring the levels up in post.

I was thinking of the folks who record sound on iRivers or other 44.1 kHz-only recorders versus recording at 48 kHz right from the start. Say you make two recordings, splitting the signal after the mic and mixer so everything else is identical, and record two simultaneous tracks, one at 44.1 and the other at 48. Drop both into Vegas and resample the 44.1 track to 48. If you then A/B'd the playback, could you tell a difference?

Ben De Rydt · December 26th, 2005, 05:58 AM

Quote:

Originally Posted by Steve House

Happened to wake up this morning thinking about sample rate conversions (the mind works in strange ways) and maintaining sync. ... To increase the 44,100 samples in each second of audio to 48,000, we have to add 3900 samples. But where does the data for those samples come from?

That's the magic of digital signal processing (DSP). A 44100 samples per second audio stream contains all information necessary to do a perfect reconstruction* of the original 0 to 22050 Hz audio signal. This reconstructed signal can be resampled to 48000 samples per second without losing anything*. Things get trickier if you want to resample to a lower sampling rate, say 22kHz, because then a low pass filter needs to be applied before resampling.

So, in theory, it is perfectly feasible to resample a 44100 Hz digital audio stream to a 48000 Hz one losslessly. In practice there is one big problem: the algorithm needs every input sample from -infinity to +infinity for one output sample. And it needs to look through all those again for the next output sample. This is clearly not workable, so windowing filters are used. These filters limit the number of input samples needed and describe their respective weights. I can't find any documentation about the windows used by popular audio programs for resampling, not even the number of input samples they use.
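As a rough illustration of the windowed approach described above, here is a minimal pure-Python sketch. The Hann window and the 16-sample half width are assumptions chosen for the example, not the settings of any particular audio program, and the function names are hypothetical. It only handles conversion to an equal or higher rate.

```python
import math


def hann_windowed_sinc(x, half_width):
    """Sinc kernel tapered by a Hann window; zero beyond +/- half_width samples."""
    if abs(x) >= half_width:
        return 0.0
    sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
    window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
    return sinc * window


def resample(samples, src_rate, dst_rate, half_width=16):
    """Upsample by evaluating a windowed-sinc interpolation at each output instant."""
    if dst_rate < src_rate:
        raise ValueError("this sketch has no low-pass step for downsampling")
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for n in range(n_out):
        t = n * src_rate / dst_rate  # output instant in input-sample units
        centre = math.floor(t)
        acc = 0.0
        # Only the 2 * half_width nearest input samples contribute.
        for k in range(centre - half_width + 1, centre + half_width + 1):
            if 0 <= k < len(samples):
                acc += samples[k] * hann_windowed_sinc(t - k, half_width)
        out.append(acc)
    return out


# 10 ms of a 1 kHz tone at 44.1k becomes 480 samples at 48k.
tone = [math.sin(2 * math.pi * 1000 * k / 44100) for k in range(441)]
converted = resample(tone, 44100, 48000)
print(len(converted))  # 480
```

Each output sample here uses 32 input samples instead of an infinite number. A wider window trades more computation for a closer match to the ideal reconstruction, and samples near the start and end of the clip are less accurate because part of the window falls outside the recording.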

* aside from quantisation errors. All frequencies will be accurately represented but there might be some error in their respective levels due to the 16-bit digitisation process. There will be some rounding errors on resampling too.
