April 8th, 2006, 07:00 PM | #1 |
Regular Crew
Join Date: Oct 2004
Location: Pittsburgh, PA
Posts: 28
|
44.1kHz vs 48kHz: How important is it?
Hi,
I'm working on a project that I'm showing tomorrow morning, and I went over to where I'm presenting it so I could test things out. The video is going to work out fine, but when I listened to the audio, things weren't so great: all of the high-frequency sounds were distorted... a very annoying problem. While trying to track down the cause, I noticed that my soundtracks were audio files at 44.1kHz, while the project settings and export settings for my project in Premiere are all at 48kHz. I tried rendering an audio clip at 48kHz, then another at 44.1kHz, and the difference is amazing:

As you can hear in the audio clip, the 48kHz version is very distorted - especially at the higher frequencies - but the 44.1kHz version sounds beautiful. I find it difficult to understand why taking a 44.1kHz file and upsampling it to a higher rate would degrade its quality. Is it for the same reason that doubling the size of an image in Photoshop degrades it?

My other question is: what should I do now? I have a Matrox plugin for Premiere, and it won't let me change the audio in my project settings to anything other than 48kHz, so I'm stuck there. At first I thought of re-encoding all of my sound files at 48kHz to match my project settings, but it occurs to me that probably wouldn't help anything - since the upsampling seems to be the problem in itself. So I suppose my only solution is to work in 48kHz in Premiere and remember to render out at 44.1kHz? It's a solution that apparently works, it just seems like a bit of a hack. While editing in Premiere the sounds are distorted too (because my project settings are stuck at 48kHz), which is disappointing since I can't hear them in good quality until the end.

I've never had any formal training with audio frequencies - I've just been told that you need to keep your sample-rate settings the same, and I've never understood why. I guess this is it. So am I missing anything? Would you agree with the conclusions I've drawn? Any other thoughts? Thanks in advance. :)
April 8th, 2006, 08:41 PM | #2 |
Inner Circle
Join Date: Apr 2003
Location: Aus
Posts: 3,884
|
There are a few things to consider...

Audio for DV/DVD/HDV/DVCProHD100 is acquired at 48kHz, whereas CD audio and most consumer soundcards run at 44.1kHz. When you're editing, your NLE downsamples that 48kHz audio to 44.1kHz to fit the hardware restrictions of the soundcard itself; then, when you output to DVD, it has to scale that back up to 48kHz. If, however, your soundcard is natively 48kHz (like a SoundBlaster or a few other higher-end cards), you should theoretically be able to bypass that resampling stage. Without more info, what you're probably experiencing is a downconvert from 48kHz to 44.1kHz; you're then adding your filters and mixing in a 44.1kHz environment, then dithering back up to 48kHz - and more than likely your soundcard can't interpolate cleanly. I'm trying to keep this very basic because you mentioned that you're not too accustomed to audio, but this seems to be it.

You also mention Matrox. From here, what you're probably doing during capture is connecting the audio out of the Matrox to the line in of your soundcard, as this is how the RTX100 works, irrespective of whether you're using the breakout box for analogue or digital capture. The interface is analogue - pretty clean at that, but still analogue - so if you captured with the RTX (as opposed to a straight 1394 capture bypassing the RTX), your capture won't be as clean.

I guess from here your next option is to output your video and audio, adding a 10-second test pattern with a 1kHz tone. Then create a new project at 48kHz and stick to it: remaster your audio, resample it to 48kHz in Audition, and swap in that audio track - or export the two files (video and audio), create a new project, import your video, resample your audio in Audition, then import THAT. I'm not an Adobe fan, simply due to the restrictions their products have with formats, but this should work.
April 9th, 2006, 05:29 AM | #3 |
Inner Circle
Join Date: Mar 2005
Location: Hamilton, Ontario, Canada
Posts: 5,742
|
Upping the sample rate after recording will not add information that's not already there - 1 second's worth of water that flowed under the bridge was captured and divided up into 44 thousand buckets. Will taking that same water and redistributing it into 48 thousand buckets give you more water?
Maybe it's my geezer's ears, but I don't hear much difference between the two clips on my system, playing back through either Sound Forge or Adobe Audition 2, an Echo AudioFire interface, and Sony MDR7506 headphones. Maybe a little metallic brittleness on the high end of the 48kHz file, but nothing I'd find objectionable. Both files have a noticeable bit of room reverb when you listen closely, if that's what you're hearing.

As an aside, the sample rate for HDV audio is 48kHz by specification, so rendering your project at 44.1kHz for the final output is not going to work out - it's simply not an option. So at some point the 44.1kHz files you're importing - music from CDs, etc. - have to get converted to 48kHz. While most NLEs can do a pretty decent job of it, IMHO you'll get more reliable results if you resample them one at a time, converting to 48kHz in an audio program before putting them into your project.

If you have a choice, such as when working with original wild sound you're recording yourself, try to record at 48kHz or 96kHz. Sample rate conversions are less likely to introduce distortion when downconverting than when upconverting (i.e. 48kHz --> 44.1kHz goes better than 44.1kHz --> 48kHz, so if you need output to CD as well as video, recording at 48kHz is still the way to go), and also when the two rates are integer ratios of each other (96kHz --> 48kHz is better than 88.2kHz --> 48kHz). That integer relationship is why a number of cards and digital recorders give you the choice of 44.1, 48, 88.2, and 96kHz: 44.1 or 88.2 if the final destination is audio CD, and 48 or 96 if the final output will be video or DVD-A audio.

Another factor that can affect your monitoring is the behaviour of many soundcards when presented with material at differing sample rates. The card itself often locks on to the lowest rate presented to it in a session. If you send it a stream containing 44.1kHz material, it will set itself to that rate, and any 48kHz material is going to sound distorted even if the file itself is okay. Some big offenders are the computer's system sounds - many of the default Windows sounds are at 22 or even 11kHz. All it takes is a "click" when you hit a mouse button or a "ding" when email arrives to reset the card to that even lower sample rate and make the whole project sound distorted. Go into the Control Panel and disable the system sounds altogether - do you really need the computer to say "You have mail!"? <grin> Not so bad when editing, but think what happens if it does that while you're recording from an external source!
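The "water into buckets" point, and the easy integer-ratio case, can be sketched in a few lines of Python. This is purely illustrative toy code (the function names are mine, and real converters low-pass filter as well as interpolate): upsampling creates new sample slots, but every new value is computed from the old ones, so round-tripping gives back exactly what you started with - nothing was gained.

```python
# Toy illustration: 2x upsampling adds slots, not information.

def upsample_2x(samples):
    """Insert one linearly interpolated sample between each pair."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        out.append((a + b) / 2.0)  # the new sample is just the average
    out.append(samples[-1])
    return out

def downsample_2x(samples):
    """Keep every other sample - the naive integer-ratio case (96k -> 48k).
    A real converter would low-pass filter first to prevent aliasing."""
    return samples[::2]

original = [0.0, 1.0, 4.0, 9.0, 16.0]
up = upsample_2x(original)
# The round trip recovers the original exactly: no new water appeared.
assert downsample_2x(up) == original
```

The fractional-ratio case (44.1k <-> 48k), where the new slots do not line up with the old ones at all, is where the interesting errors creep in.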
__________________
Good news, Cousins! This week's chocolate ration is 15 grams!
April 9th, 2006, 07:49 AM | #4 |
Fred Retread
Join Date: Jul 2004
Location: Hartford, CT
Posts: 1,227
|
Steve, without taking anything away from the appreciation we all have for experienced hands like DSE and others, I'd like to acknowledge the value you add to this forum. You really do your homework, and you have a knack for explaining things clearly. You help a lot of people.

I have geezer's ears too, and I can't hear the difference in those samples. And according to my understanding, if anyone really can hear the difference between a 44.1 kHz and a 48 kHz sampled signal, simply on the basis of the sampling rate, Nyquist was wrong. If there really is an audible difference that better ears can hear, I suspect it has to do with distortion added in processing. However, I don't understand how that would work - do you? For instance, do you happen to know what kinds of distortion are introduced in upconverting and downconverting, and why one direction is more prone to distortion than the other? BTW, what is "dithering" in this context?
__________________
"Nothing in the world can take the place of persistence..." - Calvin Coolidge "My brain is wired to want to know how other things are wired." - Me |
April 9th, 2006, 08:49 AM | #5 |
Major Player
Join Date: May 2005
Location: Stockholm - Sweden
Posts: 344
|
I can hear a very noticeable difference between those two clips. Listen to Test-audio-48khz.wav when he says "But I like candy" and compare it to the very same sentence in Test-audio-44khz.wav. Steve and Fred, can't you hear the difference? :)

Test-audio-48khz.wav sounds like a heavily compressed .wma file where the treble is "fuzzy". Allen, Matrox cards demand 48kHz audio for best results; using other sample rates will give poor results, as you have noticed. Convert the 44.1kHz audio to 48kHz in any audio application, import it into Premiere, and use it instead of the 44.1kHz audio. You will hear the difference, guaranteed. /Roger
April 9th, 2006, 09:34 AM | #6 | |
Inner Circle
Join Date: Mar 2005
Location: Hamilton, Ontario, Canada
Posts: 5,742
|
Quote:
Thank you for the compliment - it's really appreciated. I'm just learning myself, but I'm a quick study, I have a bit of professional experience from back in the analog days, and I like helping others - it helps me understand it all better when I have to explain it.

I agree - I doubt anyone can hear the difference between recordings made at 44.1 and 48 either. I'm thinking of interpolation errors that could creep in during the conversion process more than anything else. Because 48kHz has more sampling points than 44.1kHz, and they don't line up in a nice neat one-to-one correspondence, an awful lot of the new samples are going to lie between the 44.1 samples, while a lot of the 44.1 samples are going to be discarded. Let's say one of two adjacent samples has double the numerical value of its neighbour. The system has to come up with a number to assign to the new 48k sample being created between them - what value should it pick? The higher? The lower? Halfway between? I can see the algorithm assuming one number when it really would have been something else had we digitized the original signal directly.

That's also why resampling at integer multiples or divisors is less likely to distort - the math is simpler and less prone to round-off error. If we're downsampling from 96 to 48, all we need do is throw away every other sample. If we're upsampling, each new sample lies exactly halfway between two old ones, and all we have to do is set its value to the average of the old samples on either side. But if one new sample has to be inserted at 0.345654 of the time between two old samples and the next one at 0.897667, with only a few of the new samples falling exactly on the same time mark as the old ones, now we have an interpolation problem. Of course I might just be splitting hairs, and I'd never let this prevent me from doing sample-rate conversions to use the files I needed.

But if I were choosing a recorder for double-system sound, or choosing a soundcard, an inability to record at 48kHz would be an absolute deal-breaker. When making a recording on the computer, such as VO to use in a video, or if I had a choice of settings on location or in a sound library purchase, I'd never intentionally go with 44.1 if 48 was an option.
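That fractional-position interpolation can be sketched in a few lines of Python. This is a naive linear-interpolation resampler for illustration only (the function name is mine, and real converters also filter as they interpolate): notice how each output sample lands at a fraction of the way between input samples that keeps changing.

```python
# Naive linear-interpolation resampler, illustrating the fractional
# sample positions discussed above. Not production code.

def resample_linear(samples, src_rate, dst_rate):
    ratio = src_rate / dst_rate   # 44100/48000 = 0.91875 input samples per output sample
    out = []
    n = 0
    while True:
        pos = n * ratio           # output sample n's position, in input-sample units
        i = int(pos)
        if i + 1 >= len(samples):
            break
        frac = pos - i            # the fractional offset, almost never exactly 0
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        n += 1
    return out

# A short 44.1 kHz ramp resampled to 48 kHz: more samples out than in.
ramp = [float(k) for k in range(441)]
out = resample_linear(ramp, 44100, 48000)
```

For a straight ramp the linear guess happens to be exact; the audible trouble appears on curved waveforms - i.e. high frequencies, where only a few samples span each cycle - which matches the symptom described in the first post.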
April 9th, 2006, 09:42 AM | #7 | |
Inner Circle
Join Date: Mar 2005
Location: Hamilton, Ontario, Canada
Posts: 5,742
|
Quote:
April 9th, 2006, 04:21 PM | #8 | |
Fred Retread
Join Date: Jul 2004
Location: Hartford, CT
Posts: 1,227
|
Quote:
April 10th, 2006, 12:33 PM | #9 |
Major Player
Join Date: Jun 2004
Location: McLean, VA United States
Posts: 749
|
The fact that interpolated samples don't fall at convenient spots doesn't limit the quality of the interpolation. The interpolated value is not calculated from just the adjacent samples but is a weighted sum of many samples before and after the pair closest to the new sample point. The quality of the interpolated result depends on how many samples are used and on the weights applied to them before summation. Consider a rate conversion in which a new sample is required 10% of the way between old samples. The multiplicity of points used implements a filter with a very flat amplitude response and a very linear phase response, resulting in a delay of 1/10th of the original sample period. The next sample may fall 2/10ths of the way between existing samples, so another filter with similar amplitude response and equally linear phase, but with a different slope (to give 2/10ths of a sample of delay), will be required. Etc. The filters that do this are called "polyphase" filters for this reason. The quality of the interpolation depends on how well these filters achieve flat amplitude and linear phase.

"Rational" conversions, i.e. to 3/4 or 8/7 or 15/16ths of the original frequency, can be done without polyphase filters. For example, 15/16ths can be done by upconverting by a factor of 16 and downconverting by 15 (the up and down conversions can themselves be staged using the factors of the numerator and denominator, e.g. up by 4, then down by 3, then up by 4, then down by 5) using fixed, symmetrical-weight (but otherwise similar) filters. Again the quality depends on the length of the filter, the flatness of its amplitude response, and the linearity of its phase (which is easy to get in filters of this sort).

Once the signal is in the digital domain - provided it was captured free of aliasing, with an NPR (an A/D quality measure) commensurate with the number of bits (and that's a whole additional set of issues) - it is possible to convert to any other sample rate without audible degradation, if it is done right. Obviously engineering trades are necessary, and "right" here isn't a black-and-white thing. In general it is good to sample at as high a rate as possible so that the analogue anti-aliasing filter has a long transition band; this minimizes phase distortion at the high frequencies. Then anti-aliasing can be done with a "brick wall" digital filter, and interpolation to any rate greater than twice the highest frequency passed by the anti-aliasing filter is possible. Again I reiterate that while how it should be done is engineering's concern, how it is done may be determined more by marketing. See Dilbert.
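To make the rational-conversion arithmetic concrete, here is a short Python sketch (the function name is mine, not any standard API): the up/down factors a converter needs are just the reduced numerator and denominator of the rate ratio.

```python
from math import gcd

def conversion_factors(src_rate, dst_rate):
    """Reduced (upsample-by-L, downsample-by-M) factors for a rational
    sample-rate conversion from src_rate to dst_rate."""
    g = gcd(src_rate, dst_rate)
    return dst_rate // g, src_rate // g

# 96 kHz -> 48 kHz: up by 1, down by 2 -- the easy integer case.
print(conversion_factors(96000, 48000))   # (1, 2)

# 44.1 kHz -> 48 kHz: up by 160, down by 147.
print(conversion_factors(44100, 48000))   # (160, 147)
```

Since 160 factors as 2·2·2·2·2·5 and 147 as 3·7·7, the 44.1-to-48 conversion can likewise be staged in smaller up/down steps, exactly as described above for 15/16ths - but the large overall factors are why this particular pair of rates takes long filters to convert well.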
April 10th, 2006, 02:58 PM | #10 | |
Inner Circle
Join Date: Mar 2005
Location: Hamilton, Ontario, Canada
Posts: 5,742
|
Quote:
April 10th, 2006, 03:55 PM | #11 |
Major Player
Join Date: Jun 2004
Location: McLean, VA United States
Posts: 749
|
I'll wager a beer of your choice that the new high-frequency energy is aliasing (and that's what it sounds like to me, too), though I'd have to understand a lot more of the details of how the various bits of software and hardware involved actually work before I could say so with certainty.

Fred asked what dithering is. The quantizing noise produced by A/D conversion, or by rounding after doing arithmetic on samples (such as the filtering involved in up- or down-sampling), is correlated with the signal. If the signal has a broad spectrum, the quantizing noise will also have a broad spectrum, i.e. it will be noise-like. However, if the signal has a narrow spectrum - such as a pure tone from, for example, a solo violin - the quantizing-noise spectrum will consist of narrow lines as well. There are those who argue that these are more annoying than noise-like quantizing noise. To decorrelate the quantizing noise from the signal, it is relatively common practice to add low-level noise to the signal prior to A/D conversion or rounding. If the level is low enough, the degradation in signal-to-noise ratio is not noticeable (especially if the noise is filtered to put most of its spectral content at the higher frequencies, where we don't hear it so well - or at all, if you are my age). This is dithering. The web site http://www.digido.com/portal/pmodule...er_page_id=27/ goes into lots more detail.
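A tiny numeric sketch of that idea in Python, using toy numbers rather than real audio (TPDF dither with a rounding quantizer; all names here are my own): a level smaller than one quantizer step vanishes entirely without dither, but survives in the average once dither decorrelates the rounding error from the signal.

```python
import random

random.seed(0)
STEP = 1.0                      # quantizer step size (one LSB)

def quantize(x):
    """Round to the nearest quantizer level."""
    return round(x / STEP) * STEP

def tpdf_dither():
    """Triangular-PDF noise, +/- one LSB peak (sum of two uniforms)."""
    return (random.random() - random.random()) * STEP

signal = 0.4 * STEP             # a DC level well below one LSB

# Without dither the quantizer rounds 0.4 LSB to 0 every single time:
# the signal is simply gone.
assert quantize(signal) == 0.0

# With dither each sample is noisy, but the average preserves the level,
# because the rounding error is no longer correlated with the signal.
n = 200_000
avg = sum(quantize(signal + tpdf_dither()) for _ in range(n)) / n
print(avg)                      # close to 0.4
```

The same mechanism is what turns the narrow-line quantizing distortion on a pure tone into benign broadband noise.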