Digital audio fundamental question

After such a brilliant, seminal paper on your experiences with hifi, how could you then follow up with this question??
Nothing seminal about it :).
That one was purely based on listening experience over many years, and with the various equipment used.
And based on knowing that the brain has an amazing way of compensating for many "defects", so that one can still enjoy music of perceived high quality, if one gets it to work for you and not against you.
This question shows that my knowledge of the fundamentals was, and in many ways still is, flawed.
It should also demonstrate that these are two very different things, which can easily operate in separate domains.
See post # 40 here as well.
 
Let me test my understanding by taking a stab at an answer:).
We do need dips, but these are nothing more than the variations in peak heights, all the way down to zero height. So if all the peak heights are captured, so are the dips.
And the wave is just a representation of the sound vibrations, using successive peak-height information and "how many peaks are present in a second" information. Thinking of the wave as more than that leads to thinking that there have to be data points for other parts of the "wave". There aren't any, in reality.
I am now pretty sure that the wave you see on an oscilloscope is drawn by the electronics in the instrument from just these two bits of information, and we end up thinking that there are data points for all the points on the lines that connect these information points.
For digital to be 100% truthful, it has to carry all the information about the peak heights for as many times as they occur in a second. And sampling 40000 times a second allows this to be done for all the data points that exist for sound frequencies up to 20 kHz. The 44100 times is for a margin of safety.
The times when this does not happen are not because of a flaw in this reasoning, but because of engineering constraints that have to be overcome in converting this state of affairs into just as truthful electrical signals that reach the speakers.
Digressing again, my experience of modern digital audio equipment tells me that these constraints have now been overcome to the point that further progress isn't audible in a well-constructed listening test. And the solutions that overcame them have by now also found their way into cheap digital components.
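For what it's worth, the narrow numerical claim above, that 44100 samples a second is enough for content up to 20 kHz, can be checked directly. A minimal sketch in Python with NumPy (the tone frequency and one-second duration are arbitrary choices of mine, not anything from the thread):

```python
import numpy as np

fs = 44100                           # CD sampling rate, samples per second
f_tone = 20000                       # a tone at the top of the audible band, Hz
t = np.arange(fs) / fs               # one second of sampling instants
x = np.sin(2 * np.pi * f_tone * t)   # the sampled tone

# the sampled data still contains the tone at exactly 20 kHz
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), d=1 / fs)
print(f"Strongest component: {freqs[np.argmax(spectrum)]:.0f} Hz")  # -> 20000 Hz
```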

Sawyer,

That sound is a wave is quite an established scientific fact. You can find lots of physics articles on it. The whole speaker industry is based on it. To see it visually, look up Rubens tube experiments. Here is one link: https://www.youtube.com/watch?v=HpovwbPGEoo

As for instruments drawing a wave from just two peaks, well, they can't without introducing artifacts called folding and aliasing. You don't have to take my word for it; the Nyquist-Shannon theorem exists precisely to avoid these artifacts.
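To see the folding numerically, here is a small sketch (Python with NumPy; the 30 kHz tone is an arbitrary choice): a tone above half the sampling rate does not disappear, it reappears at the wrong frequency.

```python
import numpy as np

fs = 44100
f_tone = 30000                       # above the fs/2 = 22050 Hz Nyquist limit
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * f_tone * t)

# the sampled data shows a tone folded down to 44100 - 30000 = 14100 Hz
freqs = np.fft.rfftfreq(len(x), d=1 / fs)
peak = freqs[np.argmax(np.abs(np.fft.rfft(x)))]
print(f"Apparent frequency: {peak:.0f} Hz")  # -> 14100 Hz, not 30000 Hz
```

This is why converters put an anti-aliasing filter in front of the sampler: anything above half the sampling rate has to be removed before sampling, or it masquerades as audible content.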
 
Sawyer,

That sound is a wave is quite an established scientific fact. You can find lots of physics articles on it. The whole speaker industry is based on it. To see it visually, look up Rubens tube experiments. Here is one link: https://www.youtube.com/watch?v=HpovwbPGEoo
Umm...not quite.
See:
Sound is a Pressure Wave
And a quote from this:
The above diagram can be somewhat misleading if you are not careful. The representation of sound by a sine wave is merely an attempt to illustrate the sinusoidal nature of the pressure-time fluctuations. Do not conclude that sound is a transverse wave that has crests and troughs. Unquote
As you know, all that speakers do is to vibrate back and forth. They don't produce sine waves.
 
1. How would sound in the air look if we could see it?
I got the below impression from my Physics degree classes. :) There could be any number of disturbances before the sound enters your ear drum. Still, I am not sure how it would be different if produced from digital-to-analog conversion. But the content of the music could be different on a pure analog rig vs. a digital rig.
[Attached images: sound_waves.gif, u11l1c2.gif, soundwaves1.png]
 
It is also interesting to note that while we debate so much about audio, our vision easily perceives a discrete set of images as absolutely smooth motion video, as long as the discrete set of pictures get refreshed (change themselves) at a rate of more than a measly 60 frames a second.

Why do we not talk about information loss etc between the samples? Is it because we don't really care? And ironically, vision is by far our most important sense and sensory input.
 
Leaving the discussion aside, have you guys been at it all night? My wife would safely assume I was having a secret affair if I kept posting messages at these hours of the night. Heck, I was questioned over it just last night when I checked this thread at about 11:45, when it had just 2 pages & 19 replies!

Please continue your valuable discussion; I am trying to catch up, but it feels like it is all going over my head now. Coming back to the subject: is a lossless format like, say, FLAC or ALAC truly lossless when compared to its analog recording? Of course, I am assuming here that the analog recording is lossless.
 
It is also interesting to note that while we debate so much about audio, our vision easily perceives a discrete set of images as absolutely smooth motion video, as long as the discrete set of pictures get refreshed (change themselves) at a rate of more than a measly 60 frames a second.

Why do we not talk about information loss etc between the samples? Is it because we don't really care? And ironically, vision is by far our most important sense and sensory input.
Actually, that was the analogy I used in the past to rebut people who claimed that the soul in the music lives between the 0s and 1s, and can therefore only be heard on analog systems.
But I now understand that it is a defective analogy, because in video the brain is being fooled... the persistence of vision thing.
In digital audio, depending on the frequency being heard and the sampling rates employed, it turns out that there is nothing in between the 0s and the 1s to be lost. This is an external reality, not a brain-perceived one.
 
is a lossless format like, say, FLAC or ALAC truly lossless when compared to its analog recording? Of course, I am assuming here that the analog recording is lossless.
Yes, it is: for the frequencies in question, at the sampling rates employed. And the compression itself (FLAC, ALAC) discards nothing; it returns the PCM samples bit for bit.
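Anyone who wants to check that last claim can do so in a few lines. A hedged sketch in Python: it assumes the third-party soundfile package (a libsndfile wrapper) and NumPy, and the 1 kHz tone is just a stand-in for real PCM audio. It shows losslessness in the precise sense that matters: the FLAC round trip returns the PCM samples bit for bit.

```python
import numpy as np
import soundfile as sf               # pip install soundfile

fs = 44100
t = np.arange(fs) / fs
pcm = (0.5 * np.sin(2 * np.pi * 1000 * t) * 32767).astype(np.int16)

sf.write("tone.flac", pcm, fs, subtype="PCM_16")   # FLAC-compress the samples
decoded, _ = sf.read("tone.flac", dtype="int16")   # decode them back

print("Bit-identical after the round trip:", np.array_equal(pcm, decoded))  # True
```

Whether the PCM itself captured everything in the analog recording is the separate Nyquist question being discussed above; the FLAC/ALAC step adds no loss of its own.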
 
Yes, I do, when something is gnawing at me like this little thing has been all of yesterday!
OK, then, about the 16 bits: it would mean that they fully cover the silent side of the range, and more than 16 is for the loudness, to go, as you said, towards deafness.
So 16 is selected for how loud it can go, taking into account the levels humans can listen to without damaging their ears?
But why can't the same thing be achieved with, say, 8 bits and more amplification?
Bumping this one up for an answer. I think I have it, but am not sure: I have figured out that the bits are the size of each sample taken at the designed frequency, the latter being 44100 times a second for CDs.
 
Leaving the discussion aside, have you guys been at it all night? My wife would safely assume I was having a secret affair if I kept posting messages at these hours of the night. Heck, I was questioned over it just last night when I checked this thread at about 11:45, when it had just 2 pages & 19 replies!

Please continue your valuable discussion; I am trying to catch up, but it feels like it is all going over my head now.
I woke up at 2 am because this was bugging me...
Saw the light at 3 am:). Thanks largely to Arun and Thad.
Take the time to read it at leisure. I have learnt more about this sampling and sound waves thing in this thread than I have in the last few years, where digital audio is concerned.
 
It is also interesting to note that while we debate so much about audio, our vision easily perceives a discrete set of images as absolutely smooth motion video, as long as the discrete set of pictures get refreshed (change themselves) at a rate of more than a measly 60 frames a second.

Film cinema traditionally uses 24 frames per second. See this. One doesn't need as many as 60 frames per second to fool the eyes into believing that what they are seeing is continuous motion video and not a series of discrete frames; 24 is sufficient.
 
Bumping this one up for an answer. I think I have it, but am not sure: I have figured out that the bits are the size of each sample taken at the designed frequency, the latter being 44100 times a second for CDs.

Yes, bits are the sample size. Each 16-bit sample can take 65,536 possible values, whereas a 24-bit sample can take 16,777,216. The bits determine the signal-to-noise ratio: each bit gives approximately 6 dB of SNR. So a CD has around 96 dB SNR, whereas 24-bit has about 144 dB. Now, this is SNR, so if we assume a noise floor at 40 dB, then a CD can go as loud as 136 dB. Higher bit depths give us more dynamic range. Again, there is a physical limit to how much we humans can perceive and appreciate.

Also, this bit depth is not applicable to lossy compression codecs like MP3 etc. That's why we notice a loss of dynamics on heavily compressed tracks.

About loudness: 8-bit can also go loud with amplification. But that lifts the entire spectrum, noise included. The difference between 8-bit and 16-bit is that the signal-to-noise ratio is 48 dB vs 96 dB. That gives better dynamics.

The music industry does not have a standard for loudness levels. The movie industry does. THX reference sound is 85 dB + 20 dB of headroom. Since our hearing is not so good at low frequencies, to get the same perceived loudness, the reference for the LFE channel is 95 dB + 20 dB.
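The "approximately 6 dB per bit" rule can be checked numerically. A rough sketch (Python with NumPy; the 997 Hz full-scale test tone is an arbitrary choice of mine), which lands close to the textbook figure of 6.02 x bits + 1.76 dB for a full-scale sine:

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 997 * t)              # full-scale test tone

for bits in (8, 16, 24):
    scale = 2 ** (bits - 1) - 1              # e.g. 127, 32767, 8388607
    q = np.round(x * scale) / scale          # quantize, then scale back
    noise = q - x                            # the quantization error
    snr_db = 10 * np.log10(np.mean(x ** 2) / np.mean(noise ** 2))
    print(f"{bits:2d} bits: SNR = {snr_db:5.1f} dB")  # roughly 50, 98, 146 dB
```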
 
Yes, bits are the sample size. Each 16-bit sample can take 65,536 possible values, whereas a 24-bit sample can take 16,777,216. The bits determine the signal-to-noise ratio: each bit gives approximately 6 dB of SNR. So a CD has around 96 dB SNR, whereas 24-bit has about 144 dB. Now, this is SNR, so if we assume a noise floor at 40 dB, then a CD can go as loud as 136 dB. Higher bit depths give us more dynamic range. Again, there is a physical limit to how much we humans can perceive and appreciate.

Also, this bit depth is not applicable to lossy compression codecs like MP3 etc. That's why we notice a loss of dynamics on heavily compressed tracks.

About loudness: 8-bit can also go loud with amplification. But that lifts the entire spectrum, noise included. The difference between 8-bit and 16-bit is that the signal-to-noise ratio is 48 dB vs 96 dB. That gives better dynamics.
I think I understand.
Just to be sure, based on the above: if 8 bits were employed, and the noise floor kept at 40 dB, 8-bit would be able to go as loud as 88 dB? Up to 88 dB sound levels, it would sound exactly the same as a CD for dynamic range? Or, more relevant, the use of 24 bits would be perceived only at sound levels in excess of 136 dB? Who can even survive those? Or, if the 24 bits are used instead to drop the noise floor to below 40 dB, who can hear that change?
Note that I used 40 only because you did; I have no idea what 40 dB sounds like. Regardless, the question remains the same in principle even if you move around what an ideal noise floor should be for the silences between the notes to be "heard".
 
I think I understand.
Just to be sure, based on the above: if 8 bits were employed, and the noise floor kept at 40 dB, 8-bit would be able to go as loud as 88 dB? Up to 88 dB sound levels, it would sound exactly the same as a CD for dynamic range? Or, more relevant, the use of 24 bits would be perceived only at sound levels in excess of 136 dB? Who can even survive those? Or, if the 24 bits are used instead to drop the noise floor to below 40 dB, who can hear that change?
Note that I used 40 only because you did; I have no idea what 40 dB sounds like. Regardless, the question remains the same in principle even if you move around what an ideal noise floor should be for the silences between the notes to be "heard".

Yes, you can put it that way. Although, think of it as more resolution. If you are playing only one or two instruments, then 8-bit can probably do a good job and store all that info. But if there is an orchestra playing lots of instruments, all operating at multiple frequencies and various amplitudes, then more bits will store a lot more detail. 8-bit will round away some of that detail because it can only represent 256 levels, versus 65,536 for 16-bit. That's where you would notice the big difference, irrespective of loudness levels.
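A hedged illustration of that point (Python with NumPy; the tone frequencies and the quiet-detail level are made-up values, not from the thread): a quiet detail that 16 bits keeps comfortably above the quantization noise gets buried by the 8-bit noise floor.

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs
loud = 0.9 * np.sin(2 * np.pi * 440 * t)      # a loud instrument
quiet = 1e-3 * np.sin(2 * np.pi * 5000 * t)   # a detail at about -63 dBFS
mix = loud + quiet

for bits in (8, 16):
    scale = 2 ** (bits - 1) - 1
    q = np.round(mix * scale) / scale          # quantize, then scale back
    noise_db = 20 * np.log10(np.std(q - mix))
    print(f"{bits} bits: quantization noise at about {noise_db:.0f} dBFS")
    # 8 bits  -> about -53 dBFS: the -63 dBFS detail is below the noise
    # 16 bits -> about -101 dBFS: the detail sits well clear of the floor
```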
 
Umm, I won't say completely. It means a minimum of 80000 samples per second is needed to recreate a 40000 Hz frequency. More samples will give a better waveform.

No they don't.

That is the number-one biggest misconception about digital music. This has nothing to do with ears, belief, faith, or what format we choose to buy our music in; it is simple science. Simple science that, as I said, I for one only became aware of amazingly recently in my long[ish] life.

More samples do not give a better, or more accurate, waveform. They do give the possibility of a bigger frequency range, and whether that is better or not is another issue entirely.

Please consult the science on this: it is all over the web, and not too hard to find, but we have to be able to accept the counter-intuitive here. "Common" sense does not help: more samples do not leave less out!

Will try to catch up with the rest of the thread later, have to go out now :(
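A sketch supporting this claim (Python; assumes NumPy and SciPy; the 10 kHz tone and the grid sizes are arbitrary choices of mine): reconstruct the same band-limited signal from 44.1 kHz samples and from 88.2 kHz samples. Both come back essentially exact, so the extra samples added nothing.

```python
import numpy as np
from scipy.signal import resample

f_tone = 10000                      # a tone well inside both Nyquist limits
dense = 352800                      # dense reference grid: 8 x 44100 points per second
t_ref = np.arange(dense) / dense
reference = np.sin(2 * np.pi * f_tone * t_ref)

for fs in (44100, 88200):
    samples = np.sin(2 * np.pi * f_tone * np.arange(fs) / fs)
    rebuilt = resample(samples, dense)     # FFT-based band-limited reconstruction
    err = np.max(np.abs(rebuilt - reference))
    print(f"Sampled at {fs} Hz: worst-case reconstruction error {err:.1e}")
    # both rates recover the waveform to numerical precision
```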
 
Please consult the science on this: it is all over the web, and not too hard to find, but we have to be able to accept the counter-intuitive here. "Common" sense does not help: more samples do not leave less out!

More samples are not necessary because there is nothing more left to sample.

The problem is there is plenty on the web that says the opposite - out of incomplete knowledge, or to mislead. Examples:
Quote:
The sample rate is the number of times an analog signal is measured, or sampled, per second. You can also think of the sample rate as the number of electronic snapshots made of the sound wave per second. Higher sample rates result in higher sound quality because the analog waveform is more closely approximated by the discrete samples.
Unquote. From the Soundtrack Pro 3 User Manual.
And, Quote:
Fill In The Blanks
The basic DAC creates an analog voltage equivalent to the numeric value read from the file. But we know that there is data missing between each sample, and that the original level may have been a little more or less than the level recorded (quantization error). This is where the science ends and the art begins as we try to guess what the errors were and what is missing between the blanks.
Digital Filter - Fills in the Blanks

The digital filter is the computer algorithm that looks at the digital audio in the past, and the digital audio in the future and tries to figure out what was going on with the music, and tries to shape the final analog output to match as closely as possible that original waveform, now missing for ever.
Unquote. This, from How DACs Work.

One of the above is from a hawker of DACs that cost up to USD 70,000. No more comment required? :rolleyes:

In my defense I will say that, given such stuff floating around in abundance, and given the fact that I am not an engineer, I have a good excuse for my ignorance of this subject till now.
 
For a one-pager on Nyquist etc., see:
4200 - Conversion Principles - Sampling
Note how there is no stepped waveform. The sampling is at discrete points. Even then, be careful how much you analyse this sort of representation.
This is the crux of the theorem, quoted below:
A continuous band-limited signal can be replaced by a discrete sequence of samples without loss of information, and the original signal can be reconstructed from those samples.
Aliasing/anti-aliasing is also covered.
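The reconstruction half of that quoted sentence can be demonstrated directly with Whittaker-Shannon sinc interpolation, which is the mathematical form of what a reconstruction filter does. A minimal sketch (Python with NumPy; the toy sampling rate and tone frequency are arbitrary choices), evaluating the rebuilt waveform at instants that fall between the stored samples:

```python
import numpy as np

fs = 8000                             # toy sampling rate, keeps the sums small
f_tone = 1234.5                       # any frequency below fs/2 = 4000 Hz works
n = np.arange(2048)                   # 2048 stored samples
samples = np.sin(2 * np.pi * f_tone * n / fs)

# evaluate the reconstruction at 400 instants BETWEEN the samples,
# away from the edges so the truncated sinc tails stay small
t = (800 + np.arange(400) + 0.5) / fs
rebuilt = np.array([np.sum(samples * np.sinc(fs * ti - n)) for ti in t])
truth = np.sin(2 * np.pi * f_tone * t)

print(f"Worst-case error between the samples: {np.max(np.abs(rebuilt - truth)):.1e}")
# small (finite-sum truncation only); it shrinks as more samples enter the sum
```

Nothing "between the samples" had to be guessed: for a band-limited signal, the in-between values follow uniquely from the samples themselves.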
 