Are microphones sensitive enough for high-res audio?
Yi Fong Yu April 11th, 2008, 11:55 AM i'm an audio n00bie. i've always had this question about microphones. are the decently priced mics technically capable of capturing 24-bit/192kHz uncompressed? how about 32-bit?
there is a lot of talk about high-res audio on Blu-Ray movie playback nowadays and you can now burn PCM audio on Blu-Rays. i'm curious whether our analog equipment can even capture something that high-res, and whether it is dynamic enough.
i know that as long as the editing chain (after capture of the audio files) through to the ultimate end rendering is kept untouched by downrezzing, high-res lossless audio is possible.
Petri Kaipiainen April 11th, 2008, 01:55 PM The answer is no.
Even the very best microphones have only about 110 dB S/N ratio, which at roughly 6 dB per bit equals about 18-19 bit sample depth. The best electronics and ADCs have about 20-22 bit resolution. Real life itself does not have 32 bit dynamic range (except if you count nuclear blasts).
But no worry, nothing really has more dynamic range than 16 bits, and nobody has systems and listening rooms where they could hear that kind of dynamic range anyway.
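As a rough check on the figures above, the usual rule of thumb is about 6.02 dB of dynamic range per bit. The sketch below is only an illustration: the function names are made up for this example, and it ignores the extra 1.76 dB term that applies to a full-scale sine wave.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def bits_for_dynamic_range(db):
    """Equivalent bit depth for a given dynamic range in dB (rule of thumb)."""
    return db / DB_PER_BIT

def dynamic_range_for_bits(bits):
    """Theoretical dynamic range in dB for a given bit depth (rule of thumb)."""
    return bits * DB_PER_BIT

mic_bits = bits_for_dynamic_range(110)   # a 110 dB microphone
cd_range = dynamic_range_for_bits(16)    # 16-bit PCM
print(f"110 dB is about {mic_bits:.1f} bits; 16 bits is about {cd_range:.1f} dB")
```

With these assumptions a 110 dB microphone lands a little above 18 bits, which is in line with the 18-19 bit estimate.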
Steve House April 11th, 2008, 02:16 PM While some golden-eared types with top of the line equipment may be able to hear a subtle difference between sound recorded in 16 bit and sound recorded in 24 bit, it's not at all apparent. Where 24 bit recording becomes important is when you mix various elements in the digital domain. The noise floor of each source is low, but noise is additive, so mixing a number of digital sources raises the effective noise floor. The more sources you mix, the higher the effective noise floor becomes. So to deliver a 16 bit track consisting of a number of mixed sources with the maximum resolution 16 bits are capable of, you have to record, mix, and master in 24 or 32 bit and down convert to 16 bits at the final rendering.
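To put a number on the additive noise floor, here is a minimal sketch. It assumes every track has the same noise floor, that the noise is uncorrelated between tracks, and that tracks are summed at unity gain; under those assumptions noise powers add, so the floor rises by 10*log10(N) dB. The function name and the -96 dBFS starting point are illustrative choices, not measurements.

```python
import math

def mixed_noise_floor_db(track_floor_db, num_tracks):
    """Noise floor after summing tracks with equal, uncorrelated noise at unity gain."""
    return track_floor_db + 10 * math.log10(num_tracks)

two = mixed_noise_floor_db(-96.0, 2)          # two 16-bit-class tracks
thirty_two = mixed_noise_floor_db(-96.0, 32)  # a larger film-style mix
print(f"2 tracks: {two:.1f} dBFS, 32 tracks: {thirty_two:.1f} dBFS")
```

Under these assumptions, doubling the track count costs about 3 dB, and 32 tracks cost about 15 dB, which is roughly two and a half bits.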
Yi Fong Yu April 11th, 2008, 03:06 PM petri,
u know i suspected somn like that =P.
if this is the case, then how come the world is in an uproar over Blu-Rays delivering lossless audio? i mean i can definitely hear the difference of the dynamic range, but i can't tell the difference between 16 vs 24bits.
steve,
gotcha, it's that old adage of recording @the highest quality $ can afford =D. even so, recording@32-bit is impracticable at the moment, kind of useless, agreed?
as for mixing elements. i think all sources should come together and be rendered only ONCE for final output. is this what is done or NOT done most of the time? converting it several times is definitely detrimental to the source.
PS this means when people buy an audio recorder, 16 can work just as well as 24bit versions yesh?
Jon Fairhurst April 11th, 2008, 04:08 PM 24-bits has three advantages:
1) When recording, you can leave tons of headroom without getting a noisy and quantized result. With 16-bits, you had to record on the hot side for optimum results, which can result in clipping if you're not careful and lucky.
2) Every time you add two numbers digitally, the result can need one more bit. Every time you multiply two numbers digitally, the result can need twice as many bits. 24-bit PCM or other higher resolution formats (32-bit PCM, 32-bit float or 64-bit float) allow you to apply effects and mix while maintaining accuracy. In the final step you can dither and round to 24-bits or 16-bits once for a nice result.
If you were to dither/round to 16-bits after every effect in a complex chain/mix, the digitizing noise would be quite audible.
3) A good 24-bit disc doesn't make cleaner sounds, per se, but the phase information is more precise. This yields a better soundstage in critical listening situations. When comparing DVD-As to CDs I don't hear better sound, but if I sit in the sweet spot and focus, I perceive a better image.
I rarely do sweet spot, eyes closed listening. I'm usually doing something else while enjoying music. So I save my money and buy CDs.
Anyway, when talking about 24-bits it's important to talk about the three phases (capture, mixing and playback) separately. The role of bit depth is unique to each.
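The point in 2) about rounding once at the end rather than after every step can be tried with a small simulation. This is a sketch under stated assumptions: the 20 gain values are arbitrary stand-ins for an effects chain, the test tone is a 997 Hz sine at 48 kHz, and plain rounding is used instead of dither to keep the example short.

```python
import math

def quantize(x, bits):
    """Round x (full scale = +/-1.0) to the nearest step of a signed `bits`-bit grid."""
    step = 1.0 / (2 ** (bits - 1))
    return round(x / step) * step

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

def run_chain(signal, gains, bits_per_stage=None):
    """Apply each gain in turn; optionally round to `bits_per_stage` after every stage."""
    out = []
    for s in signal:
        for g in gains:
            s *= g
            if bits_per_stage is not None:
                s = quantize(s, bits_per_stage)
        out.append(s)
    return out

# Hypothetical 20-stage chain of small gain changes (values chosen arbitrarily).
gains = [1.07, 0.93, 1.11, 0.89, 1.05, 0.96, 1.13, 0.88, 1.03, 0.97,
         1.09, 0.91, 1.06, 0.94, 1.12, 0.90, 1.04, 0.95, 1.08, 0.92]
signal = [0.5 * math.sin(2 * math.pi * 997 * n / 48000) for n in range(4800)]

reference = run_chain(signal, gains)                 # floating point throughout
once = [quantize(s, 16) for s in reference]          # round to 16 bits at the end only
every = run_chain(signal, gains, bits_per_stage=16)  # round to 16 bits after each stage

err_once = rms([a - b for a, b in zip(once, reference)])
err_every = rms([a - b for a, b in zip(every, reference)])
print(f"error when rounding every stage is {err_every / err_once:.1f}x the single-rounding error")
```

The per-stage rounding errors pile up roughly like independent noise sources, so the error grows with the length of the chain, while the single final rounding stays at one step's worth of error.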
Steve House April 11th, 2008, 05:22 PM Yi Fong Yu wrote:
"...steve,
gotcha, it's that old adage of recording @the highest quality $ can afford =D. even so, recording@32-bit is impracticable at the moment, kind of useless, agreed?
as for mixing elements. i think all sources should come together and be rendered only ONCE for final output. is this what is done or NOT done most of the time? converting it several times is definitely detrimental to the source.
PS this means when people buy an audio recorder, 16 can work just as well as 24bit versions yesh?"
When I referred to mixing, I wasn't thinking of multiple passes but rather the number of parallel source tracks going into the final mix. In a film mix you have tracks for dialog (sometimes checkerboarded, so you have many parallel tracks), music, ambience, room tone, foley, production effects, etc, etc, etc. When you do the final mix all of these tracks are combined into the stereo or 5.1 final track. You're adding many tracks together at once, and when you're mixing and processing in the digital domain, mixing 'in the box' rather than sending it out as audio through an analog board with summing amplifiers, the advantages of higher bit depths that Jon mentions begin to make a difference in the accuracy of the final result.
John Miller April 11th, 2008, 05:38 PM The microphone is nothing more than a transducer that generates an analog signal. Bit depths of digital signals are irrelevant to the microphone. It's the analog-to-digital conversion that is the key point, and that can be done at 32-bit if you like, but unless the microphone is extraordinarily low noise (and the amplifiers too), half those bits will be random noise. The dynamic range of the digitised audio is fundamentally limited by the sum of all the noise sources.
Does this mean Blu-Ray audio claims are wrong? Absolutely not.
Once the audio has been recorded, the dynamic range can be increased or decreased. This has been done for years with electric guitars using expanders and compressors.
So, any audio can be modified to sound as if it has a wider dynamic range or narrower.
Petri Kaipiainen April 12th, 2008, 07:38 AM There are certain "quality windows" so to say in recording and reproduction. The smallest window determines the final quality of the system.
- listening room, reproduction system: typically 70-90 dB (13 to 15 bits) in a normal home (ambient noise is about 40-50 dB minimum, very few systems can put out clean 110 dB SPL full range)
- 16/44 digital, 5Hz-20 kHz range, 96+ dB dynamic range
- Analog components (mics & mic pre-amps) at best give about 110 dB, 10 Hz - 40 kHz, about 19 bits / 96 kHz sampling
- 24/96 sampling at best is better than any analog component before or after recording, an overkill in that sense.
- 24/192 kHz recording really does not bring any improvement, see above.
Jon Fairhurst April 12th, 2008, 11:11 AM Petri,
That's an excellent analysis, based on maximum signal to noise floor ratios. However, this doesn't consider the effect on phase accuracy introduced to the signal by bit and time quantization.
Also, the noise in an ambient environment is completely uncorrelated with the signal, so the brain is able to do a good job of separating the two sources. Time and level quantization noise *is* correlated to the signal. In effect, this noise is silent and "rides" on the signal. The brain can no longer separate the signal from the noise. The signal is just, well, different.
For instance, with low-bit rate MP3 Internet radio (an extreme example), there's no hiss or background noise. All noise is in the signal. Personally, I find that it's okay for short clips, but I get fatigued within ten or twenty minutes. In effect, it doesn't sound nearly as bad as it feels. No Internet radio for me!
For me DVD-A doesn't sound better than CD, but I've found that it can produce a bigger, clearer soundstage due to more accurate phase information. It could possibly be less fatiguing as well, under controlled, long-term listening situations.
It's certainly subtle. And since I don't listen to music for long periods with my head clamped into the "sweet spot", CDs are fine with me. However, I do know people who listen to pre-recorded music clinically for long periods. For them, I'd recommend SACD or DVD-A over CD.
Petri Kaipiainen April 12th, 2008, 02:09 PM There is some possibility that lowpass filters cause phase problems in the highest frequencies near 20 kHz in 44 kHz sample rate recordings. Still there is no conclusive evidence that it can be heard in real life. As far as I know there are no reliable double blind tests made where people could hear the difference between 44/48/96 kHz sample rates. Many people claim they hear it, but...
Here is an interesting test file with which you can test whether you can hear the difference between 16/44 and 24/96: http://hosted.filefront.com/Jullepoika/ Download also the explanation TXT file. The main file is 470 MB in size and requires true 24/96 playback capability.
The final quality of the recording is not really dependent on possible subtle differences in sample rate, but on mic choice and placement and choice of venue. Talking about classical acoustic stuff here of course. When shooting video, this hi-fi obsession with higher sample rates and bit depths is even less relevant. I dare to say only one home theatre system in 10000 is even in theory capable of discerning between 16/48 and 24/96, and out of those super systems maybe one in 100 listeners could hear any difference. That's about 10 persons in the whole USA. Nothing to worry about on the practical level of video production.
Steve Oakley April 12th, 2008, 02:32 PM you know if it were just about sample rate to highest frequency, there wouldn't be much argument that even 44.1 is past what most people can hear and a lot of sound systems can even reproduce. however, I just posted this somewhere else a few days ago - in commenting on how I have found that FCP can now capture 192/24 !
no, not even dogs can hear 96khz frequencies, that's not the point. only some mics can make it to 50khz with reasonable level, very expensive ones. it's that you can have more samples per waveform, and a more accurate reproduction of the normally heard frequencies, especially the higher ones of 6khz-15khz. for example, at 10khz, using 48K sampling you get almost 5 samples, the low points, the 2 middles, and hopefully the high point, or maybe not. is it a triangle wave, sine wave, or something else ? can't tell because there are not enough samples. do it again at 192khz and you get 19.2 samples. you now not only know what wave it is, but you've captured much more of its nuances in more than enough detail to reproduce it so that a person would not detect it's not real.
how would the average person perceive it ? as warmer and more detailed, more "live". most high end audio guys say it sounds as good as analog tape. if you've not heard a 2" audio recorder in a studio playing back an original recording... it's a long way from what you get at iTunes ;). it's pretty amazing how good it sounds, nothing like what you hear on CD or record.
for normal video purposes, 96k or 192k is overkill. for high end features, what they ( FCP/apple) are now stepping into. very nice marketing item to "play like the big boys"
Jon Fairhurst April 12th, 2008, 02:58 PM Petri Kaipiainen wrote: "The final quality of the recording is not really dependent on possible subtle differences in sample rate, but on mic choice and placement and choice of venue. Talking about classical acoustic stuff here of course."
Absolutely true! Not everybody will agree on which mic sounds better, or which mic placement is the best, but most everybody will be able to hear a difference between two differently engineered recordings.
I really like the ability to record at 24-bits, so I can leave lots of headroom without getting into the noise. And I store intermediate files in high-res, because it's good practice and has little cost. But I'm happy delivering the end product in 16-bits. The artistry and engineering are orders of magnitude more important than a few more bits or kHz.
A. J. deLange April 12th, 2008, 03:05 PM Petri Kaipiainen wrote: "There is some possibility that lowpass filters cause phase problems in the highest frequencies near 20 kHz in 44 kHz sample rate recordings."
That's why clever designers don't rely on steep analogue antialiasing filters. They oversample appreciably, which makes it possible to use a very sloppy analogue antialiasing filter that only has to prevent folding at the higher (than 44 kHz) sampling rate, then build a super phase linear FIR digital antialiasing filter into a decimator. The process puts out samples at 44 kHz but there is no aliasing or phase distortion.
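The digital half of that scheme can be sketched in a few lines. This is an illustration under assumptions, not how any particular converter does it: a 4x oversampled rate of 176.4 kHz, a 101-tap Hamming-windowed sinc low-pass with a 20 kHz cutoff, and decimation by 4 down to 44.1 kHz. Symmetric taps are what make the filter linear phase. The analogue pre-filter is not modelled.

```python
import math

def lowpass_taps(num_taps, cutoff_hz, fs_hz):
    """Windowed-sinc (Hamming) low-pass FIR; symmetric taps give linear phase."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / fs_hz  # normalised cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity gain at DC

def filter_and_decimate(signal, taps, factor):
    """Filter, but compute only every `factor`-th output sample (valid region only)."""
    length = len(taps)
    out = []
    for start in range(0, len(signal) - length + 1, factor):
        out.append(sum(taps[k] * signal[start + length - 1 - k] for k in range(length)))
    return out

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

FS_HIGH = 176_400.0  # assumed oversampled rate, 4 x 44.1 kHz
taps = lowpass_taps(101, 20_000.0, FS_HIGH)

audible = [math.sin(2 * math.pi * 1_000 * i / FS_HIGH) for i in range(4000)]
ultrasonic = [math.sin(2 * math.pi * 30_000 * i / FS_HIGH) for i in range(4000)]

kept = rms(filter_and_decimate(audible, taps, 4)) / rms(audible)
leak = rms(filter_and_decimate(ultrasonic, taps, 4)) / rms(ultrasonic)
print(f"1 kHz tone kept at {kept:.2f}x, 30 kHz tone reduced to {leak:.4f}x")
```

The 30 kHz tone would otherwise fold down to 14.1 kHz at the 44.1 kHz output rate, so how little of it survives the filter is the quantity of interest here.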
Petri Kaipiainen April 13th, 2008, 12:24 AM Steve Oakley wrote: "it's that you can have more samples per waveform, and a more accurate reproduction of the normally heard frequencies, especially the higher ones of 6khz-15khz. for example, at 10khz, using 48K sampling you get almost 5 samples, the low points, the 2 middles, and hopefully the high point, or maybe not. is it a triangle wave, sine wave, or something else ? can't tell because there are not enough samples. do it again at 192khz and you get 19.2 samples. you now not only know what wave it is, but you've captured much more of its nuances in more than enough detail to reproduce it so that a person would not detect it's not real."
Not so.
Sound is made up from a bunch of sine waves, and the sum of these forms the complicated composite wave we can see in an audio editor. All such complex waveforms can be reduced to a set of sine waves (Fourier series). The fact is humans can hear only up to a certain frequency (about 20 kHz), so higher components cannot be heard even if they are captured and reproduced. The beauty of the sampling theory is that all waveforms whose components lie below half of the sampling rate can be reproduced PERFECTLY. Having more samples than necessary DOES NOT improve the quality of those signals, it only captures higher frequencies as well. Which in this case is not necessary, as those components lie above our hearing limits.
The view you brought up is a common fallacy and not true.
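The 10 kHz at 48 kHz case can be checked numerically. The sketch below samples a 10 kHz sine at 48 kHz (about 4.8 samples per cycle) and rebuilds the values in between the sample points with sinc interpolation, the reconstruction the sampling theorem describes. It is an approximation by construction: a real proof needs infinitely many samples, and here only 500 on each side of the evaluation point are used, so a small residual error remains. The constants and function names are choices made for this example.

```python
import math

FS = 48_000.0      # sample rate in Hz
F = 10_000.0       # tone frequency in Hz, below FS / 2
HALF_WIDTH = 500   # samples used on each side of the evaluation point

def tone(t):
    return math.sin(2 * math.pi * F * t)

def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(t):
    """Truncated Whittaker-Shannon interpolation from samples taken at n / FS."""
    centre = round(t * FS)
    total = 0.0
    for n in range(centre - HALF_WIDTH, centre + HALF_WIDTH + 1):
        total += tone(n / FS) * sinc(t * FS - n)
    return total

# Evaluate at 16 points per sample interval, so most fall between sample instants.
times = [(1000 + k / 16) / FS for k in range(64)]
max_error = max(abs(reconstruct(t) - tone(t)) for t in times)
print(f"largest reconstruction error: {max_error:.5f} (tone amplitude is 1.0)")
```

The remaining error comes from truncating the sum, and it shrinks as HALF_WIDTH grows. The number of samples per cycle is not what limits the accuracy.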
Wayne Brissette April 13th, 2008, 07:09 AM While I have not heard or seen this mic, the Sanken CO-100K looks very interesting for those who want to try to test/hear if a high def mic exists.
http://www.sanken-mic.com/en/product/product.cfm/3.1000400
Personally, I'm kind of a skeptic in this area. I don't think we'll really ever be at high def, and even if we get there, I'm not convinced that people will really be able to tell the difference.
Wayne
Petri Kaipiainen April 13th, 2008, 08:08 AM The major problem with this so called High Definition recording is the fact that it is possible to achieve about 22 bit resolution with the best mics, and frequency extension to above 50 kHz, but never with the same mic!!!
It is possible to have quiet mics only with large diaphragms, but high frequency extension only with small diaphragm mics. These requirements exclude each other. Thus true high definition is just a dream; having a high bit depth and sampling rate does not do any good if mics and analog components cannot deliver, and they cannot deliver even 24/96 quality.
Sanken super mic CO-100K: noise level 22 dB-A, max sound pressure 125 dB. Basically 17 bit performance in dynamic range; CD can achieve this with proper dithering.
DPA large diaphragm 130 volt mic 4041-S: noise level 7 dB-A, max sound pressure 144 dB. Basically 22 bit performance. But frequency range only extends to 20 kHz.
So there you are, no matter what marketing men try to sell, real high extension and low noise is impossible to achieve at the same time with microphones, only by synthesis.
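The bit figures for the two microphones follow from the quoted specs by simple arithmetic. The sketch below takes dynamic range as maximum SPL minus self noise and divides by about 6.02 dB per bit. Treating an A-weighted noise figure and a max SPL figure as directly comparable is a simplification, and the function name is made up for this example.

```python
DB_PER_BIT = 6.02  # rule-of-thumb dynamic range per bit

def mic_equivalent_bits(max_spl_db, self_noise_dba):
    """Approximate bit depth matching a mic's usable dynamic range."""
    return (max_spl_db - self_noise_dba) / DB_PER_BIT

sanken = mic_equivalent_bits(125, 22)  # CO-100K figures quoted above
dpa = mic_equivalent_bits(144, 7)      # 4041-S figures quoted above
print(f"CO-100K: about {sanken:.1f} bits, 4041-S: about {dpa:.1f} bits")
```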
Steve Oakley April 13th, 2008, 09:51 AM Petri Kaipiainen wrote:
"Not so.
Sound is made up from a bunch of sine waves, and the sum of these forms the complicated composite wave we can see in an audio editor. All such complex waveforms can be reduced to a set of sine waves (Fourier series). The fact is humans can hear only up to a certain frequency (about 20 kHz), so higher components cannot be heard even if they are captured and reproduced. The beauty of the sampling theory is that all waveforms whose components lie below half of the sampling rate can be reproduced PERFECTLY. Having more samples than necessary DOES NOT improve the quality of those signals, it only captures higher frequencies as well. Which in this case is not necessary, as those components lie above our hearing limits.
The view you brought up is a common fallacy and not true."
ok, if the reproduction was perfect, why can I hear the difference between 44.1 and 48 K ? and certainly 96k ? I think even a "tin ear" could pick out 32K easily enough. the problem between theory and practice is reality. if one were to model sound in terms of pure sine waves, then you lose other transients which are not curved. you would be rounding them off and creating a less accurate sample of them. the more samples, the more accurately the waveform is sampled, even if there is some mathematical cheating of rounding. otherwise no one would be able to hear the difference between sample rates, and clearly many people can. what people lose sight of is that 44.1 is a number from the 1960's. back then digital technology was pretty limited and 44.1 kHz was considered the threshold where someone started to have a problem telling live from source, but it was not the ideal point, just the minimum. if they could have driven faster sample rates back then, they would have, but the technology became too difficult or expensive.
Petri Kaipiainen April 13th, 2008, 10:00 AM Have you ever tried a blind test where the only difference in the signals is the sample rate (high cutoff frequency)? If you really can reliably hear a difference between 44, 48 and 96 you are a truly exceptional person. Most of us cannot hear much above 16 kHz.
We do lose those transients which are too sharp, true, but those transients contain frequencies which are outside of human hearing. And even those "which are not curved" are composed of even higher sine waves... But the crux of the matter is that those frequencies are not heard, thus they need not be recorded! There is nothing lost in what we hear.
You say "many people can [hear the difference between sample rates]", but how come in the AES test out of 100 test subjects NONE was able to tell the difference between SACD and CD? Wrong people chosen for the test, again, why did they not ask you or many other of those who claim to hear the difference? Maybe nobody can hear the difference in real life, after all?
Try the test file at http://hosted.filefront.com/Jullepoika/ to see if you really can tell the difference between 16/44 and 24/96. You only need a true 24/96 capable system to test it.
A. J. deLange April 13th, 2008, 02:42 PM Steve Oakley wrote: "ok, if the reproduction was perfect, why can I hear the difference between 44.1 and 48 K ? and certainly 96k ?"
The reproduction isn't perfect but it might as well be.
There are 2 possibilities as to why you may hear, or think you hear, differences. One, the more likely these days, is that the marvellous piece of equipment between your ears is equipped with a special processing unit called the "imagination". It is an extremely powerful device capable of permitting you to hear what you expect to hear even when it isn't there. To say that you can hear a difference is only meaningful if you did the test with an A,B,C box in which two of the buttons are wired to one signal and the third to the other. You cannot know which is which and ideally the person conducting the test shouldn't either. This is called a double blind triangle test. You must pick which two of the buttons are wired to the same signal and determine which sounds better, and you must do this on enough different sound samples that the result has statistical significance. When triangle tests are performed, the ability to hear the distinctions that are widely reported in the popular audiophile press disappears. That's why you don't see much about them. A colleague mentioned to me the other day that he sometimes reads a forum where the rules have only one prohibition: mention of triangle tests.
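What "statistical significance" means for a triangle test can be sketched with a binomial calculation. The assumption is that a listener who hears no difference picks the odd button out correctly one time in three, so the question is how likely a given score is from guessing alone. The function name and the 12-trial example are illustrative, not taken from any published test protocol.

```python
from math import comb

def triangle_p_value(correct, trials, p_chance=1 / 3):
    """Probability of at least `correct` right answers out of `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_chance ** k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

lucky = triangle_p_value(4, 12)        # 4 of 12 correct
convincing = triangle_p_value(10, 12)  # 10 of 12 correct
print(f"4/12 by chance: {lucky:.3f}, 10/12 by chance: {convincing:.5f}")
```

A score of 4 out of 12 is what guessing produces on average, so it shows nothing. A score of 10 out of 12 would be very hard to reach by guessing.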
The other possibility is that the low sampling rate system was not properly implemented. In today's world that's not likely but it certainly has happened in the past and will probably continue to do so. Most OEMs use A/D converters in which proper DSP is built in. The architecture is usually sigma-delta in which at most a few bits (1-4) are sampled at a rate of several MHz and these samples filtered and decimated to provide the outputs of up to 24 bits. The manufacturers test their designs extensively and report test data to the OEMs who integrate the chips into the stuff we buy. You'd have to work pretty hard at it to screw up with one of these devices but I suppose it can be done.
The sampling theorem (note: it is a theorem - it has a proof) requires that signals be "strictly band limited". This requires that they be of infinite duration. It should thus be obvious that there are no devices which adhere exactly to the sampling theorem, because in the real world there are no strictly band limited signals. Nonetheless, for all practical purposes the sampling theorem does apply. The "errors" (quantizing noise, aliasing) can be controlled with proper DSP and pushed down to better than 100 dB below the signal (1 microvolt of noise re a 1 volt signal is 120 dB). I won't say there is no one who can perceive that, but 999 out of a thousand who think they can are fooling themselves.
Yi Fong Yu April 14th, 2008, 09:45 AM so, to sum it up:
1. mics can't capture more than what humans can hear.
2. humans can't hear worth a damn anyways
3. capture @the highest quality you can so when you render after final edit, the quality is maintained.
very interesting =D. thanks ya'll =D