Digital Representation of Audio Information
Introduction
Kevin D. Donohue
Question!
If a tree falls in the forest and nobody is there to hear it, will
it make a sound?
Sound provided by http://www.therecordist.com/downloads.html
• Merriam-Webster Dictionary:
• Sound
a : a particular auditory impression
b : the sensation perceived by the sense of hearing
c : mechanical radiant energy that is transmitted by longitudinal pressure waves in a material medium (as air) and is the objective cause of hearing.
Electronic Audio Systems
Amplification, Signal
Synthetic Audio: Imitating Nature
1780 Wolfgang von Kempelen’s Speaking Machine
Mid 1800’s Charles Wheatstone
Late 1800’s Alexander Graham Bell
1939 Homer Dudley’s Voder http://www.acoustics.hut.fi/~slemmett/wave/track01f.wav
1898 Thaddeus Cahill’s Telharmonium (First Music Synthesizer)
1919 Lev Theremin’s Theremin
Communication channels (acoustic and electric)
1874/1876 (Antonio Meucci’s) Alexander Graham Bell’s Telephone
1940’s Homer Dudley’s Channel Vocoder, first analysis-synthesis system
Voice-Coding Models
The general speech model:
Speech sounds can be analyzed by determining the states of the vocal system components (vocal cords, tract, lips, tongue … ) for each fundamental sound of speech (phoneme).
Unvoiced Speech
Voiced Speech
Spectral Analysis Voiced Speech
Spectral envelope => vocal tract formants
Harmonic peaks => vocal cord pitch
Time Analysis Voiced Speech
Time envelope => volume dynamics
Oscillations => vocal cord motion
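The link between harmonic peaks and pitch can be illustrated with a minimal sketch. It does not use real speech: the test signal, the 120 Hz pitch, and the harmonic amplitudes are all assumed values chosen for illustration, and the strongest peak coincides with the pitch only because the signal is built that way.

```matlab
% Hypothetical voiced-like test signal: harmonics of an assumed 120 Hz pitch
fs = 8000;                          % Sampling frequency in Hz
f0 = 120;                           % Assumed pitch (fundamental) in Hz
t = (0:fs-1)/fs;                    % One second of sampled time
amps = [1 0.6 0.4 0.25 0.15];       % Assumed harmonic amplitudes (stand-in for a spectral envelope)
sig = zeros(size(t));
for k = 1:length(amps)
    sig = sig + amps(k)*cos(2*pi*k*f0*t);    % Add the k-th harmonic
end
N = length(sig);
mag = abs(fft(sig))/N;              % Magnitude spectrum
faxis = (0:N-1)*fs/N;               % Frequency axis in Hz
half = 1:floor(N/2);                % Keep frequencies below fs/2
[~, idx] = max(mag(half));          % Locate the strongest harmonic peak
pitchEst = faxis(idx);              % Its frequency (the fundamental in this constructed signal)
plot(faxis(half), mag(half)); xlabel('Hz'); ylabel('Magnitude')
```

In recorded speech the strongest harmonic is not always the fundamental, so the spacing between harmonic peaks is usually a safer indicator of pitch than the single largest peak.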
Spectrogram Analysis
1920’s Radio Rex
1950’s (Bell Labs) Digit Recognition
  Spectral/Formant analysis
  Filter Banks
1960’s Neural Networks
1970’s ARPA Project for Speech Understanding
  Applications of spectral analysis methods: FFT, Cepstral/homomorphic, LPC
1970’s Application of pattern matching methods: DTW and HMM
Speech Recognition
1980’s
  Standardized training and test with large corpora (TIMIT) (RM) (DARPA)
  New front ends (feature extractors), more perceptually based
  Dominance/development of HMM
  Backpropagation and neural networks
  Rule-based AI systems
Specification of Speech Recognition
Speaker dependent or independent
Recognize isolated, continuous, or spot speech
Vocabulary size, grammar perplexity, speaking style
Recording conditions
Components of Speech Recognition
Input Speech Detected Speech String
Matlab Examples
%% Create and play a 2 second 440 Hz tone in Matlab:
fs = 8000;                      % Set a sampling frequency
fq = 440;                       % Frequency to play
t = [0:round(2*fs)-1]/fs;       % Sampled time axis
sig = cos(2*pi*fq*t);           % Create sampled signal
soundsc(sig,fs)                 % Play it
plot(t,sig); xlabel('Seconds'); ylabel('Amplitude')
audiowrite('t440.wav',sig,fs)   % Save as a wave file (audiowrite replaces the older wavwrite)
clear                           % Remove all variables from workspace
%% Reload tone and weight it with a decaying exponential of time constant .6 seconds
tc = .6;                            % Set time constant
[y, fs] = audioread('t440.wav');    % Read in wave file (audioread replaces the older wavread)
t = [0:length(y)-1]'/fs;            % Create sampled time axis
dw = exp(-t/tc);                    % Compute sampled decaying exponential
dsig = y.*dw;                       % Multiply sinusoid with decaying exponential
soundsc(dsig,fs)                    % Play it
plot(t,dsig); xlabel('Seconds'); ylabel('Amplitude')
Matlab Examples
Explore demo and help files
>> help script
 SCRIPT About MATLAB scripts and M-files.
    A SCRIPT file is an external file that contains a sequence of MATLAB statements. By typing the filename, subsequent MATLAB input is obtained from the file. SCRIPT files have a filename extension of ".m" and are often called "M-files". To make a SCRIPT file into a function, see FUNCTION.
    See also type, echo.
    Reference page in Help browser
       doc script
In the help window (click on the question mark), go through the section on programming, then go to the demo tab and view a few of the demos.
Matlab Examples
• In class examples …
Matlab Exercise Use the sine/cosine function in Matlab to write a
function that generates a minor scale (for testing the function use
start tones between 100 and 440 Hz with a sampling rate of 8 kHz).
Let the Matlab function input arguments be the starting frequency
and the time interval for each scale tone in seconds. Let the
output be a vector of samples that can be played with Matlab
command “soundsc(v,8000)” (where v is the vector output of your
function).
The frequency range of a scale covers one octave, which implies the
last frequency is twice the starting frequency. On most fixed-pitch
instruments, 12 semitones (half steps) make up the notes within
an octave. A minor scale increases sequentially by a whole step, half
step, whole, whole, half, whole, and whole step (8 notes altogether,
including the starting note).
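The step pattern above can be turned into note frequencies as in the following sketch. It assumes equal temperament (each semitone is a factor of 2^(1/12)) and an example starting frequency of 220 Hz; it only computes the frequencies and leaves generating the tone samples to the exercise.

```matlab
f0 = 220;                       % Example starting frequency in Hz (assumed for illustration)
steps = [2 1 2 2 1 2 2];        % Whole step = 2 semitones, half step = 1 semitone
semis = [0 cumsum(steps)];      % Semitone offsets of the 8 scale notes from the start
freqs = f0 * 2.^(semis/12);     % Note frequencies, assuming equal temperament
disp([semis' freqs'])           % Offsets alongside frequencies in Hz
```

The offsets sum to 12 semitones, so the last frequency is twice the first, as the octave requires.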
Matlab Exercise - Scales
Interval             Just                    Pythagorean               Equal Temperament
Interval - 0  (1)    1/1 = 1                 1/1 = 1                   2^(0) = 1
Interval - 1         16/15                   256/243                   2^(1/12)
Interval - 2  (2)    10/9 (or 9/8)           9/8                       2^(2/12)
Interval - 3  (3)    6/5                     32/27                     2^(3/12)
Interval - 4         5/4                     81/64                     2^(4/12)
Interval - 5  (4)    4/3                     4/3                       2^(5/12)
Interval - 6         45/32 (or 64/45)        1024/729 (or 729/512)     2^(6/12)
Interval - 7  (5)    3/2                     3/2                       2^(7/12)
Interval - 8  (6)    8/5                     128/81                    2^(8/12)
Interval - 9         5/3                     27/16                     2^(9/12)
Interval - 10 (7)    7/4 (or 16/9 or 9/5)    16/9                      2^(10/12)
Interval - 11        15/8                    243/128                   2^(11/12)
Interval - 12 (8)    2/1 = 2                 2/1 = 2                   2^(12/12) = 2
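To see how far the three tunings differ for the minor-scale notes, the ratios can be evaluated numerically. This is a sketch only: where the table lists alternative just ratios, one is picked arbitrarily (9/8 for interval 2 and 9/5 for interval 10), and the percent deviation from equal temperament is a comparison measure chosen here for illustration.

```matlab
semis = [0 2 3 5 7 8 10 12];                    % Semitone intervals of the minor-scale notes
just  = [1 9/8 6/5 4/3 3/2 8/5 9/5 2];          % Just ratios (one choice where alternatives exist)
pyth  = [1 9/8 32/27 4/3 3/2 128/81 16/9 2];    % Pythagorean ratios
equal = 2.^(semis/12);                          % Equal temperament ratios
pctJust = 100*(just - equal)./equal;            % Percent deviation of just from equal temperament
pctPyth = 100*(pyth - equal)./equal;            % Percent deviation of Pythagorean from equal temperament
disp([semis' just' pyth' equal' pctJust' pctPyth'])
```

Multiplying any row of ratios by a starting frequency gives the scale frequencies in that tuning.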
Matlab Exercise – Famous Notes
Middle C = 261.626 Hz (standard tuning)
Concert A (A above middle C) = 440 Hz
Middle C = 256 Hz (scientific tuning)
Lowest note on piano A = 27.5 Hz
Highest note on piano C = 4186.009 Hz
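The standard-tuning values can be checked from concert A, assuming equal temperament with 12 semitones per octave. The semitone counts in this sketch (9 below, 39 above, and 4 octaves below concert A) come from counting piano keys and are not stated in the notes above.

```matlab
fA = 440;                       % Concert A in Hz
midC  = fA * 2^(-9/12);         % Middle C: 9 semitones below concert A
lowA  = fA / 2^4;               % Lowest piano A: 4 octaves below concert A
highC = fA * 2^(39/12);         % Highest piano C: 39 semitones above concert A
fprintf('%.3f %.3f %.3f\n', midC, lowA, highC)
```

These agree with the listed values to the precision shown. The scientific-tuning middle C of 256 Hz is a power of two (2^8) and does not follow from concert A at 440 Hz.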
EE513 Audio Signals and Systems
Question!
Ambiguity!