The sampling theorem of information theory mentioned here should be used only as a guide. The reason it cannot be implemented ideally in a CD player or a DAC is NOT merely that there are engineering or practical limitations in making a perfect low-pass filter. Even if we could make perfect low-pass filters that bandlimit an audio signal to a maximum frequency of 22 kHz (less than half the CD sampling frequency of 44.1 kHz, as the theorem requires), we would still NOT have the original analog audio signal reproduced.
Why?
I actually have explained the reason in simple terms in post no.4 in the following thread:
http://www.hifivision.com/cd-players/2930-24-bit-192khz pointless.html
Let me try to say this again briefly both in simple terms and a bit technically.
In simple terms, the data contained in a redbook CD (16 bit, 44.1 kHz) is a finite amount of data (a finite stretch of time, a finite number of sampled data points, etc.). The original signal has an infinite number of data points. It is NOT possible to get infinite output from a finite amount of data, as simple as that. So one needs to put in mocked-up data by hand to fill the gap between the sampled points of time. In scientific terms this process is called INTERPOLATION.
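To make the idea concrete, here is a toy sketch of the simplest possible interpolation (a straight line between neighbouring samples). This is purely illustrative, not how a real DAC reconstructs audio; the function name and the five-point "sine" are my own invention:

```python
# Toy illustration only: given a handful of sampled points, linear
# interpolation "mocks up" a value between them by drawing a straight line.

def linear_interpolate(samples, t, dt):
    """Estimate the signal value at continuous time t from samples
    taken dt seconds apart, using the two neighbouring samples."""
    i = int(t // dt)                      # index of the sample just before t
    if i >= len(samples) - 1:
        return samples[-1]
    frac = (t - i * dt) / dt              # how far t lies between the two samples
    return (1 - frac) * samples[i] + frac * samples[i + 1]

samples = [0.0, 1.0, 0.0, -1.0, 0.0]      # 5 points of a crude sine wave
dt = 0.25
print(linear_interpolate(samples, 0.125, dt))  # midway between 0.0 and 1.0 -> 0.5
```

Every value the function returns between two sample instants is data that was never on the disc; only the five listed numbers were.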
Actually the proof of the theorem effectively assumes
1) a (denumerably) infinite set of sampling points, which obviously can never be realized when discretising and digitizing a continuous (analog) signal that exists for a finite time, and
2) a bandlimited signal (i.e., a signal which has a certain maximum frequency), which again is mathematically NOT realizable for a signal that exists for a finite time.
The first of the above assumptions has to do with the superposition of an infinite number of "sinc" functions (one for each discrete point of time).
The impossibility of the second assumption shows that even if we could make perfect low pass filters which would block frequencies above a certain number perfectly, it would not help reproduce the original continuous signal.
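The sinc superposition can be sketched numerically. Below is a minimal version of the reconstruction sum, truncated to a finite list of samples (which is exactly the point: with only finitely many terms it is an approximation, not the exact signal). The sample rate and test signal are arbitrary choices of mine:

```python
import math

def sinc(x):
    """The normalized sinc function, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, t, dt):
    """Truncated Whittaker-Shannon reconstruction: one sinc per sample.
    With a denumerably infinite set of samples of a bandlimited signal
    this sum would be exact; with a finite list it is only approximate."""
    return sum(s * sinc((t - n * dt) / dt) for n, s in enumerate(samples))

# Sample a 1 Hz sine at 8 Hz for one second, then evaluate off-grid.
dt = 1.0 / 8.0
samples = [math.sin(2 * math.pi * n * dt) for n in range(8)]
print(reconstruct(samples, 0.3, dt), math.sin(2 * math.pi * 0.3))
```

The two printed numbers are close but not equal: the truncation of the infinite sum is where the reconstruction error of a real, finite recording comes from.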
Two things can come to help here. One is surely the lack of human perception of frequencies above a certain maximum. A well-implemented CD player or DAC would actually try to push the differences between the reproduced analog signal and the original analog signal to frequencies well above the human perception limit.
This is where a bit of maths is involved, apart from good engineering. Even for standard CD players with a sampling rate of 44.1 kHz, a good scheme for interpolating the discretized data is required. The theorem does NOT supply one.
One approach has been to upsample, that is, to increase the number of data points by a good mathematical programme that raises the sampling frequency. Most people naively think that this is an artificial way of adding data that was not there in the first place. True. But even if you do not upsample, you need to put in data (by some interpolation) by hand to make a bunch of discrete data points continuous. THERE IS NO ESCAPE FROM THIS, no matter what you do.
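A minimal sketch of what 2x upsampling amounts to: every other output sample is an original sample, and the new in-between samples are computed by an interpolation filter. The crude 4-tap weights below are my own stand-in; real oversampling DACs use much longer sinc-derived filters:

```python
# Sketch of 2x upsampling: the in-between points are COMPUTED, not recorded.

def upsample_2x(samples):
    out = []
    for i, s in enumerate(samples):
        out.append(s)                      # keep the original sample
        # interpolate the midpoint from two neighbours on each side,
        # clamping at the edges of the list
        a = samples[i - 1] if i - 1 >= 0 else s
        c = samples[i + 1] if i + 1 < len(samples) else s
        d = samples[i + 2] if i + 2 < len(samples) else c
        # crude 4-tap cubic-like weights; a real filter uses many sinc taps
        out.append((-a + 9 * s + 9 * c - d) / 16.0)
    return out

print(upsample_2x([0.0, 1.0, 0.0, -1.0]))
```

Note that the original four samples survive unchanged at the even positions; everything at the odd positions is interpolated data, which is the "added data" that people object to.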
Many people also think that the only job of upsampling is to raise the maximum frequency of the bandlimited input audio signal so that a less-than-perfect low-pass filter does not introduce serious imperfections. True. But that's not the only way upsampling can be used. The proof of the theorem clearly shows that an increase in the number of sampling points helps produce a better interpolation, something also consistent with common sense.
Now, I would like to add my practical experience with this. I have a recorder that records into some kind of memory and can record up to 24 bit / 96 kHz. It's a Sony professional product. Most of us think that Sony makes only mass-market products. Not quite true. They also make professional-quality AV equipment which is usually not listed on their usual website. For example, my recorder, the Sony PCM DM-50, is shown only on the Sony professional website. It does studio-quality recording at much better resolution than redbook CDs.
Now I have used this recorder to record live music at a variety of bit depths and sampling frequencies, from 16 bit / 44.1 kHz up to 24 bit / 96 kHz. When I play back the recordings through my analog amplifier (with the D/A conversion taking place inside the recorder), the difference is obvious. One does NOT need to be an expert to find that 24 bit / 96 kHz is by far the best.
At the end, one question: I am actually a bit confused about the proof of the theorem. It seems to say that one can get a non-denumerable infinity (of signal values) from a denumerable infinity (of samples), something I thought was mathematically NOT possible. Can anybody knowledgeable explain? Probably this is why the bandlimitation of the original signal comes in, so that Fourier modes (i.e., frequencies) above a certain number are not involved.