myles-garrison
Digital Audio
What do we mean by “digital”? How do we produce, process, and play back?
Why is physics important? What are the limitations and possibilities?
Digital vs. Analog
Digital:
Discrete data
Reproducible with 100% fidelity
Can be stored using any digital medium
Frequency and amplitude ranges limited by digitization
Analog:
Continuous data
Reproduction introduces new noise
Storage limited by physical size
Virtually unlimited frequency and amplitude ranges
Physics of Digitization
Sound (pressure wave) is transduced into an electrical signal (usually voltage)
Signal “read” by A-D converter to discrete values
Time sequence of signal values encoded in a computer
Sampling Basics
Sample Rate: Number of values encoded per second (the reciprocal of the time interval between successive samples)
Sample Depth (or Bit Depth): Number of bits used to encode each value
Bit Rate = (Sample Rate) x (Bit Depth)
For example, “CD Quality” audio is 44.1 kHz at 16 bits = 7.056E5 bps per channel, or 1411 kbps total for two (stereo) channels
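The bit rate arithmetic can be checked with a minimal sketch. The function name `bit_rate` and its `channels` parameter are illustrative, not from the slides; the two-channel case assumes stereo CD audio.

```python
def bit_rate(sample_rate_hz, bit_depth, channels=1):
    """Uncompressed bit rate in bits per second (hypothetical helper)."""
    return sample_rate_hz * bit_depth * channels


# CD-quality figures: 44.1 kHz sample rate, 16 bits per sample.
per_channel_bps = bit_rate(44_100, 16)           # one channel
stereo_bps = bit_rate(44_100, 16, channels=2)    # two channels (stereo)

print(per_channel_bps, "bps per channel")
print(stereo_bps / 1000, "kbps total")
```

Dividing the stereo figure by 1000 gives the familiar value of roughly 1411 kbps.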
Sample Rate (Sample Frequency)
Sample Period = 0.5s
Sample Rate = 1/0.5s = 2Hz
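As a rough illustration of what sampling at a 0.5 s period means, the sketch below reads a continuous function at evenly spaced times. The helper `sample` and the 0.25 Hz test tone are assumptions chosen for the example.

```python
import math


def sample(signal, sample_rate_hz, duration_s):
    """Return (time, value) pairs taken every 1/sample_rate_hz seconds."""
    period = 1.0 / sample_rate_hz
    count = int(round(duration_s * sample_rate_hz))
    return [(n * period, signal(n * period)) for n in range(count)]


def tone(t):
    """A 0.25 Hz sine wave (illustrative test signal)."""
    return math.sin(2 * math.pi * 0.25 * t)


# Sample rate 2 Hz, i.e. one sample every 0.5 s, for 4 seconds.
samples = sample(tone, sample_rate_hz=2.0, duration_s=4.0)
for t, value in samples:
    print(f"t = {t:.1f} s  value = {value:+.3f}")
```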
Sample Rate Matters! (Mathematica Demo 1)
What do the samples actually represent?
Nyquist-Shannon Sampling Theorem
“If a function x(t) contains no frequencies higher than B, it is completely determined by giving its ordinates at a series of points spaced 1/(2B) seconds apart.”
A necessary condition for digitizing a signal so that it can be faithfully reconstructed is that the sample rate is at least twice as high as the highest frequency present in the signal.
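One way to see the theorem in action is sinc (Whittaker-Shannon) interpolation, which rebuilds the value of a band-limited signal between its samples. The sketch below is an approximation: the ideal sum runs over infinitely many samples, and `n_terms`, the 1 Hz test tone, and the 8 Hz sample rate are assumptions made for the example.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(signal, sample_rate_hz, t, n_terms=1000):
    """Estimate signal(t) using only its samples at n / sample_rate_hz.

    Truncates the ideal infinite sum to 2 * n_terms + 1 samples,
    so the result is close to, but not exactly, the true value.
    """
    return sum(
        signal(n / sample_rate_hz) * sinc(sample_rate_hz * t - n)
        for n in range(-n_terms, n_terms + 1)
    )


def tone(t):
    """A 1 Hz sine, well below half of the 8 Hz sample rate used here."""
    return math.sin(2 * math.pi * 1.0 * t)


estimate = reconstruct(tone, sample_rate_hz=8.0, t=0.3)
print(estimate, "vs true value", tone(0.3))
```

The point t = 0.3 s falls between sample instants, yet the samples alone recover it to within the truncation error.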
What can go wrong?
Aliasing: Frequencies above half the sample rate contribute signal components that are indistinguishable from, and perceived as, lower frequencies (Mathematica Demo 2)
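Aliasing can be sketched numerically: a tone above half the sample rate produces exactly the same samples (up to sign) as a lower-frequency tone. The helper `alias_frequency` and the 7 Hz / 10 Hz figures are illustrative choices.

```python
import math


def alias_frequency(f_hz, sample_rate_hz):
    """Apparent frequency, between 0 and sample_rate_hz / 2, of a sampled tone."""
    f = f_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)


sample_rate = 10.0  # Hz, so frequencies above 5 Hz will alias
high = [math.sin(2 * math.pi * 7.0 * n / sample_rate) for n in range(20)]
low = [math.sin(2 * math.pi * 3.0 * n / sample_rate) for n in range(20)]

# The 7 Hz samples match the 3 Hz samples with the sign flipped.
max_difference = max(abs(h + l) for h, l in zip(high, low))
print(alias_frequency(7.0, sample_rate), max_difference)
```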
Bit Depth
Number of bits used to represent each sampled value
Available discrete values: n = 2^b (b = bit depth)
Here there are only 5 discrete values, so 3 bits per sample
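The relationship between bit depth and available levels can be sketched as follows. The uniform quantizer over the range [-1, 1] is an assumption for illustration; real converters differ in how they place levels.

```python
import math


def levels(bits):
    """Number of discrete values available with the given bit depth."""
    return 2 ** bits


def bits_needed(n_values):
    """Smallest bit depth that can label n_values distinct levels."""
    return math.ceil(math.log2(n_values))


def quantize(x, bits):
    """Snap x in [-1, 1] to the nearest of 2**bits evenly spaced levels."""
    n = levels(bits)
    step = 2.0 / (n - 1)
    index = round((x + 1.0) / step)
    index = max(0, min(n - 1, index))  # keep out-of-range input on the grid
    return -1.0 + index * step


print(bits_needed(5))      # 5 values do not fit in 2 bits
print(levels(16))          # levels available at CD bit depth
print(quantize(0.1, 3))    # nearest of 8 levels to 0.1
```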
Dynamic Range: Ability to represent small- and large-amplitude signals in the same scheme
Clipping: Large signals are cut off, introducing high harmonics
Masking: Small signals are “drowned out”
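The claim that clipping introduces high harmonics can be checked with a small sketch: a pure sine has no third harmonic, but the same sine clipped at half its amplitude does. The clip level, the 64-sample length, and the helper names are assumptions for the example.

```python
import cmath
import math


def clip(x, limit):
    """Cut off any value whose magnitude exceeds limit."""
    return max(-limit, min(limit, x))


def harmonic_magnitude(signal, k):
    """Magnitude of DFT bin k, i.e. the k-th harmonic of a one-cycle signal."""
    n = len(signal)
    return abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                   for i, x in enumerate(signal)))


N = 64
clean = [math.sin(2 * math.pi * i / N) for i in range(N)]  # exactly one cycle
clipped = [clip(x, 0.5) for x in clean]

print("3rd harmonic, clean:  ", harmonic_magnitude(clean, 3))
print("3rd harmonic, clipped:", harmonic_magnitude(clipped, 3))
```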
Signal-to-Noise Ratio (S/N)
Ratio of meaningful signal power to unwanted signal power
In sound, the “audible power” (measured in decibels, a logarithmic scale) is skewed from the actual power
Best case scenario: noise is in the first bit:
S/N (dB) = 10 log10(2^b) = 3.01 b (per channel)
Human ear sensitivity covers a range of more than 120 dB! (~40 bits)
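The slide's formula can be evaluated directly, as in the sketch below. The `factor` parameter is an addition for illustration: the slide's convention (factor 10) treats 2^b as a power ratio, while a commonly quoted alternative (factor 20) treats it as an amplitude ratio and gives about 6.02 dB per bit.

```python
import math


def snr_db(bits, factor=10):
    """Best-case S/N in dB: factor * log10(2**bits).

    factor=10 follows the slide (2**bits as a power ratio);
    factor=20 is the amplitude-ratio convention (an alternative, not from the slide).
    """
    return factor * math.log10(2 ** bits)


def bits_for_range(range_db, factor=10):
    """Smallest bit depth whose best-case S/N covers range_db."""
    return math.ceil(range_db / snr_db(1, factor))


print(snr_db(1))              # dB gained per bit under the slide's convention
print(snr_db(16))             # 16-bit audio
print(bits_for_range(120))    # bits to span 120 dB, slide's convention
print(bits_for_range(120, factor=20))  # same range, amplitude convention
```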
Digital Audio Compression
Analog signals are practically incompressible
Raw audio signals are similarly hard to reduce using standard (lossless) file compression (Shannon Information Theory)
Psycho-acoustic models may be helpful! (lossy)
MP3 Codec
Divide the file into packets and find the Fourier power spectrum via DFT
Throw out easily masked frequencies to reach desired bit rate
Dither regions with different dynamic ranges or where the bit depth must be lowered to match desired bit rate
Perform traditional redundancy compression
(Ratatat samples)
Discrete Fourier Transform
Frequency Limit = ½ × Sample Frequency (Nyquist)
Frequency Resolution = 1/Signal Duration (Mathematica Demo 3)
Usually frequency resolution is much sharper than the ear can detect
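The two DFT limits can be illustrated with a direct (slow) transform. The sample rate, duration, and 10 Hz test tone are arbitrary choices for the example, and the naive O(n^2) `dft` is a sketch rather than how a codec would compute it.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(signal))
            for k in range(n)]


sample_rate = 64.0                 # Hz
duration = 2.0                     # seconds of signal analyzed
n = int(sample_rate * duration)    # number of samples
tone_hz = 10.0

signal = [math.sin(2 * math.pi * tone_hz * i / sample_rate) for i in range(n)]
spectrum = dft(signal)

nyquist = sample_rate / 2          # highest frequency the DFT can represent
resolution = 1.0 / duration        # spacing between adjacent frequency bins

# Only bins up to n/2 are independent for a real-valued signal.
peak_bin = max(range(n // 2 + 1), key=lambda k: abs(spectrum[k]))
peak_hz = peak_bin * resolution
print(nyquist, resolution, peak_hz)
```

Analyzing a longer stretch of signal narrows the bin spacing, while raising the sample rate extends the frequency limit.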
Dithering
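A minimal sketch of the idea behind dithering: adding a small amount of random noise before rounding lets the average of many quantized samples track a level that plain rounding would lose entirely. The uniform noise of one step width, the fixed seed, and the level 0.3 are assumptions for illustration; the slides do not specify a dither scheme.

```python
import math
import random


def quantize(x):
    """Round to the nearest integer level (halves round up)."""
    return math.floor(x + 0.5)


def quantize_with_dither(x, rng):
    """Add uniform noise spanning one quantization step, then round."""
    return quantize(x + rng.uniform(-0.5, 0.5))


rng = random.Random(0)   # fixed seed so the run is repeatable
level = 0.3              # a quiet signal, smaller than one quantization step
count = 10_000

plain = [quantize(level) for _ in range(count)]
dithered = [quantize_with_dither(level, rng) for _ in range(count)]

plain_mean = sum(plain) / count        # the signal vanishes
dithered_mean = sum(dithered) / count  # the signal survives, on average
print(plain_mean, dithered_mean)
```

The cost is added noise on each individual sample, traded for quantization error that no longer tracks the signal.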
Digital Signal Processing (DSP)
Non-linear (i.e., atemporal)
Real-time effects subject to latency and buffering memory
Filters and envelopes extremely difficult/expensive to achieve with analog techniques
Easier non-destructive editing
Perfect fidelity in copying
Some Common DSP Effects
Vocoder vs Autotune (Daft Punk)
Delay/Echo (U2, David Gray)
Filter/Flange (Foster the People, Dizzy Gillespie)
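Of these effects, delay/echo is the simplest to sketch: each output sample is the input plus an attenuated copy of the input from some number of samples earlier. The function name `echo` and the `decay` parameter are illustrative; real effects units typically add feedback and filtering.

```python
def echo(signal, delay_samples, decay=0.5):
    """Single feed-forward echo: y[n] = x[n] + decay * x[n - delay_samples]."""
    out = list(signal) + [0.0] * delay_samples   # room for the echo tail
    for i, x in enumerate(signal):
        out[i + delay_samples] += decay * x
    return out


# A single click followed by silence: the echo appears two samples later.
print(echo([1, 0, 0, 0], delay_samples=2, decay=0.5))
```

At a 44.1 kHz sample rate, a delay of 22,050 samples would correspond to a half-second echo.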
Digital Synthesis (If you can write an equation, you can hear it!)
Sound engineering for movies/TV
Arbitrary mathematical functions can be generated (Mathematica Demo 4)
Sounds not identifiable by the ear/brain
(Chemical Brothers and Skrillex samples)
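To make "write an equation, hear it" concrete, the sketch below renders an arbitrary function of time as 16-bit mono WAV data using only the Python standard library. The function `synthesize`, the vibrato equation, and the 8 kHz sample rate are assumptions for the example.

```python
import io
import math
import struct
import wave


def synthesize(func, duration_s, sample_rate=44_100):
    """Render func(t), expected in [-1, 1], as 16-bit mono WAV bytes."""
    n = int(round(duration_s * sample_rate))
    frames = bytearray()
    for i in range(n):
        value = max(-1.0, min(1.0, func(i / sample_rate)))  # clip out-of-range
        frames += struct.pack("<h", int(round(value * 32767)))
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


def wobble(t):
    """A 440 Hz tone whose phase wobbles five times per second (illustrative)."""
    return math.sin(2 * math.pi * 440 * t + 3 * math.sin(2 * math.pi * 5 * t))


wav_bytes = synthesize(wobble, duration_s=0.5, sample_rate=8_000)
print(len(wav_bytes), "bytes of WAV data")
```

Writing `wav_bytes` to a file with a `.wav` extension should produce something most audio players can open.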