

1

Lab Preparation

• Initial focus on Speaker Verification
  – Tools
  – Expertise
  – Good example

• “Biometric technologies are automated methods of verifying or recognising the identity of a living person based on a physical or behavioural characteristic”

2

MATLAB

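function sig = makesine(f, fs, timelen)
% MAKESINE  Generate and plot a sine wave.
%   f       frequency in Hz
%   fs      sampling rate in Hz
%   timelen duration in seconds
t = 0:(1/fs):timelen-(1/fs);   % sample times, one every 1/fs seconds
sig = sin(2*pi*f*t);           % sine wave at frequency f
plot(t, sig);                  % amplitude against time
grid on;                       % switch the grid on explicitly (bare "grid" only toggles it)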
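A call such as makesine(440, 8000, 0.5) would plot half a second of a 440 Hz tone. The sketch below repeats an equivalent computation inline so that it runs without the function file; the values 440 Hz, 8 kHz and 0.5 s are illustrative assumptions, not taken from the slides.

```matlab
% Assumed example values: 440 Hz tone, 8 kHz sampling rate, 0.5 s duration
f = 440; fs = 8000; timelen = 0.5;

n = round(timelen * fs);       % number of samples in the signal
t = (0:n-1) / fs;              % sample times in seconds (same grid as makesine)
sig = sin(2*pi*f*t);           % the sine wave itself

samplesPerCycle = fs / f;      % how many samples describe one period
fprintf('%d samples, %.2f samples per cycle\n', n, samplesPerCycle);

% soundsc(sig, fs);            % uncomment to listen to the tone
```

Raising fs gives more samples per cycle and a smoother plot; f must stay below fs/2 for the sine to be represented correctly.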

3

Speech Signals

• Praat

• Waveforms

• F0/Pitch

• Spectra

• Time domain measurements & analysis

• Frequency domain measurements & analysis

• Male vs female speech

4

Sounds and Speech

• Words contain sequences of sounds

• Each sound (phone) is produced by sending signals from the brain to the vocal articulators

• The vocal articulators produce variations in air pressure

• These variations are transmitted through the air as complex waves

• These waves are received by the ear and signals are sent to the brain

5

Praat: Speech Analysis Tool

Waveform, Spectrogram, Pitch, Formants

6

Waveforms

• Plot of change in air pressure with time
• Amplitude
  – Compression/Rarefaction
  – Speech: intensity/loudness
• Frequency (f)
  – Cycles per second (Hz)
• Speed (c)
  – Metres per second (m/s)
• Wavelength (λ)
  – Metres (m) / Microns / Angstroms (Å)
• Related by c = fλ
  – Won't concern us for now (speed, wavelength and this relation)
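As a small worked example, assuming the relation intended on this slide is the standard one (speed = frequency × wavelength, c = fλ), the wavelength of a sound follows from λ = c / f. The speed of sound used here (343 m/s, air at about 20 °C) and the example frequencies are assumptions for illustration.

```matlab
c = 343;                 % assumed speed of sound in air at about 20 degrees C (m/s)
f = [100 200 1000];      % illustrative frequencies in Hz
lambda = c ./ f;         % wavelength in metres, from c = f * lambda

for k = 1:numel(f)
    fprintf('%5d Hz -> %.3f m\n', f(k), lambda(k));
end
```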

7

Fundamental Frequency

8

Fundamental Frequency

• F0 (pronounced "F-zero")

• Rate of opening/closing of the glottis

• Vocal folds do not vibrate like strings, but F0 depends on similar factors (length, tension, mass)

• Perceptual correlate is pitch

• Do not confuse with the formant frequencies F1, F2, …!
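One common way to estimate F0 from a short frame is to look for the lag at which the signal best matches a shifted copy of itself (autocorrelation). The following is a minimal sketch on a synthetic harmonic signal, not the method Praat uses; the sampling rate, frame length, test F0 and the 60 to 400 Hz search range are all assumptions.

```matlab
fs = 16000;                          % assumed sampling rate (Hz)
f0 = 125;                            % assumed true F0 of the test signal (Hz)
t  = (0:round(0.05*fs)-1) / fs;      % 50 ms analysis frame

% Harmonic-rich test signal: fundamental plus two weaker harmonics
x = sin(2*pi*f0*t) + 0.5*sin(2*pi*2*f0*t) + 0.25*sin(2*pi*3*f0*t);

minLag = floor(fs / 400);            % shortest period searched (400 Hz)
maxLag = ceil(fs / 60);              % longest period searched (60 Hz)
lags = minLag:maxLag;
N = numel(x);
r = zeros(size(lags));
for k = 1:numel(lags)
    L = lags(k);
    % Biased autocorrelation (divide by N): favours the shortest matching period
    r(k) = sum(x(1:N-L) .* x(1+L:N)) / N;
end

[~, idx] = max(r);                   % lag with the strongest self-similarity
f0_est = fs / lags(idx);             % convert period in samples to Hz
fprintf('Estimated F0: %.1f Hz\n', f0_est);
```

The estimate is limited to whole-sample lags, so its resolution gets coarser as F0 rises. Note that the peak found here is the repetition rate of the waveform, which is unrelated to the formant frequencies.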

9

Constant vocal tract (VT) configuration; varying pitch

10

Spectra

• Think of a graphic equalizer
• Speech made from waves of many frequencies
• Spectrum plots (log) power against frequency
• Peaks related to resonant frequencies in VT
  – Formants
    • Centre frequency
    • Bandwidth
• Spectral slice
• Spectrogram
  – Overhead view of slices against time
  – Darkness related to power
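A single spectral slice can be sketched in MATLAB by windowing one frame and taking the FFT, then plotting log power against frequency. The signal below is synthetic (two sine components standing in for spectral peaks), and the sampling rate, frame length and component frequencies are assumptions chosen for illustration.

```matlab
fs = 8000;                                   % assumed sampling rate (Hz)
N  = 512;                                    % frame length in samples
t  = (0:N-1) / fs;

% Synthetic input: strong 500 Hz component and weaker 1500 Hz component
x  = sin(2*pi*500*t) + 0.3*sin(2*pi*1500*t);

w  = 0.54 - 0.46*cos(2*pi*(0:N-1)/(N-1));    % Hamming window, written out by hand
X  = fft(x .* w);                            % spectrum of the windowed frame

P_dB = 20*log10(abs(X(1:N/2+1)) + eps);      % log power, non-negative frequencies only
f    = (0:N/2) * fs / N;                     % frequency axis in Hz

[~, k] = max(P_dB);                          % strongest peak in the slice
fprintf('Strongest peak near %.1f Hz\n', f(k));
% plot(f, P_dB); xlabel('Frequency (Hz)'); ylabel('Power (dB)');
```

Repeating this for successive overlapping frames and stacking the slices side by side, with darkness mapped to P_dB, gives the spectrogram view.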

11

Vowels vs Fricatives

12

Spectrum /u/

13

Spectrum /i/

14

Spectrum S

15

Constant pitch; varying vocal tract configuration

16

Time-domain Analysis

17

Male speech

18

Female speech

19

Spectral Tilt

• Refers to general slope of spectrum
  – Higher formants are weaker than lower formants
• Phonation is most significant factor
• Greater spectral tilt in female speech
  – Ratio of lower formant amplitude to higher formant amplitude is greater in female speech
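Spectral tilt can be measured in several ways. One simple option, used here as an assumption rather than as the definition behind this slide, is to fit a straight line to harmonic levels in dB against log frequency and read off the slope in dB per octave. The sketch compares two synthetic signals whose harmonic amplitudes fall as 1/k and 1/k²; all parameter values are illustrative.

```matlab
fs = 16000; N = 1600;                   % assumed: 100 ms frame at 16 kHz (10 Hz bins)
t  = (0:N-1) / fs;
f0 = 200;                               % assumed F0 (Hz); harmonics fall on exact bins
K  = 1:20;                              % harmonics up to 4 kHz
rolloff = [1 2];                        % amplitude ~ 1/k^p: p = 1 shallow, p = 2 steep
slope = zeros(size(rolloff));

for i = 1:numel(rolloff)
    x = zeros(1, N);
    for k = K
        x = x + (1 / k^rolloff(i)) * sin(2*pi*k*f0*t);
    end
    X = abs(fft(x)) / (N/2);            % scale so a unit sine gives amplitude 1
    bins = K * f0 * N / fs + 1;         % 1-based FFT bin of each harmonic
    level_dB = 20*log10(X(bins));       % harmonic levels in dB
    c = polyfit(log2(K * f0), level_dB, 1);   % line fit: slope in dB per octave
    slope(i) = c(1);
end

fprintf('Tilt: %.1f and %.1f dB/octave\n', slope(1), slope(2));
% prints "Tilt: -6.0 and -12.0 dB/octave"
```

The more negative slope corresponds to the greater tilt: the higher harmonics are weaker relative to the lower ones.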