Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics Jonas Lindh – [email protected] http://www.ling.gu.se/~jonas Department of Linguistics, Göteborg University and GSLT (Graduate School of Language Technology) IAFPA 2006


Page 1: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Jonas Lindh – [email protected]
http://www.ling.gu.se/~jonas
Department of Linguistics, Göteborg University and GSLT (Graduate School of Language Technology)

IAFPA 2006

Page 2: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Outline

• Background and Introduction
– F0 and Forensic Phonetics
– Modulation theory of speech
• Hypotheses
• Methods
• Results
– F0 statistics for young Swedish males
– Robustness test
– Vocal effort test
– Liveliness illustration
• Conclusions
• Future Work

Page 3: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Background and Introduction

• F0 is regarded as a reliable parameter for speaker identification (French, 1990; Hollien, 1990; Künzel, 1987; Nolan, 1983; all cited in Braun, 1995).
• Technical, physiological and psychological factors (Braun, 1995).
• Fundamental frequency measures.
• Some previous studies and results.

Page 4: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Background and Introduction (Braun, 1995)

• Technical factors
– Tape speed unfortunately still a problem.
– Sample durations (50, 75, 14, 120 s?).
• Physiological factors
– Age, smoking, operations.
– Larynx size, shape and mass.
– Between-speaker variation.
• Psychological factors
– Noise level, emotions, time of the day.
– Vocal effort, speaking rate, F0 dynamics, voice quality.
– Within-speaker variation.

Page 5: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Background and Introduction

• Fundamental frequency measures
– Average

– Standard deviation

– Median

– Interquartile range

– F0 mode

– Base value! Modulation theory of speech.
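The first four measures in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function name `f0_summary`, the sample values in `track`, and the choice of the sample standard deviation and the default quartile method of the `statistics` module are all assumptions.

```python
import statistics

def f0_summary(f0_hz):
    """Return basic long-term F0 measures for voiced-frame F0 values in Hz."""
    values = sorted(f0_hz)
    # Quartile cut points (default "exclusive" method of the statistics module)
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values),        # sample standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,                        # interquartile range
    }

# Hypothetical F0 values (Hz) for one short recording
track = [98.0, 104.0, 110.0, 112.0, 115.0, 118.0, 121.0, 127.0, 135.0, 160.0]
summary = f0_summary(track)
print(summary)
```

In this made-up track the single high value (160 Hz) pulls the mean above the median, which is the kind of outlier sensitivity the hypotheses address.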

Page 6: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Modulation theory of speech

• The theory /…/ considers speech signals as the result of allowing conventional gestures to modulate a carrier signal that has the personal characteristics of the speaker. This implies that in general the conventional information can only be retrieved by demodulation. In order to perceive the phonetic quality of a speech signal, listeners evaluate the deviations of the properties of the signal (F0, formant frequencies, etc.) from those they expect of a neutral vocalization produced by the speaker with properties given by his age, sex, vocal effort, speech rate, etc. (part of abstract, Traunmüller, 1994)

Page 7: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

F0 Liveliness

Liveliness class   European lang. (SD, N)   Chinese lang. (SD, N)
(4) Very high      4.8  + +
(3) High           4.0  + – –
(2) Moderate       2.8  – + – – –           4.0  – –
(1) Low            2.1  –

Average F0‑variation (SD in semitones) as a function of the type of speech as classified in.

Under ‘Type’, the speech samples are classified according to their expected liveliness (Traunmüller & Eriksson, 1995).

Page 8: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

F0 Mean, SD and ‘liveliness’

Investigation                      Type  n    Sex  Age    F0 (Hz)  SD (semitones)
Rappaport (1958), German           1     190  m    -      129      2.3
Chevrie-Muller et al. (1967), Fr   2     21   m    20–61  145      2.5
Boë et al. (1975), Fr              2     30   m    -      118      2.8
Takefuta et al. (1972), English    4     24   m    -      127      3.8
Chen (1974), Mandarin Chinese      2     2    m    30–50  108      4.1
Rose (1991), Wú                    2     4    m    25–62  170      4.1
Kitzing (1979), Swedish            2     51   m    21–70  110      3.0
Pegoraro Krook (1988), Swedish     2     198  m    20–79  113      2.6

Page 9: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

F0 Mean, SD and ‘liveliness’

Investigation                      Type  n    Sex  Age    F0 (Hz)  SD (semitones)
Johns-Lewis (1986), English:
– Conversation                     2     5    m    24–49  101      3.4
– Reading                          3     5    m    24–49  128      4.35
– Acting                           4     5    m    24–49  142      4.85
Graddol (1986), English:
– Reading passage A                2     12   m    25–40  119      3.6
– Reading passage B                3     12   m    25–40  131      4.55

Average/investigation              -     10   m    -      124      3.4
Average/balanced speaker           -     471  m    -      119      2.8

Page 10: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

F0 Liveliness (Traunmüller & Eriksson, 1995)

• The SD of F0 increases with increasing ‘liveliness’ of the discourse.

• The SD of F0 seems to be larger in tone languages than in non‑tone languages.
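Because the liveliness figures are standard deviations in semitones, it may help to see how an F0 track in Hz is converted. The sketch below uses the usual definition of 12 semitones per octave. The function name `sd_semitones` and the 100 Hz reference are assumptions for illustration, and the reference cancels out of the SD anyway.

```python
import math
import statistics

def sd_semitones(f0_hz, ref_hz=100.0):
    """Standard deviation of F0 after conversion from Hz to semitones re ref_hz."""
    semitones = [12.0 * math.log2(f / ref_hz) for f in f0_hz]
    return statistics.stdev(semitones)

# Hypothetical F0 values (Hz)
track = [98.0, 104.0, 110.0, 112.0, 115.0, 118.0, 121.0, 127.0, 135.0, 160.0]
print(round(sd_semitones(track), 2))

# A voice one octave higher with the same relative excursions has the same SD
# in semitones, although its SD in Hz is twice as large.
octave_up = [2.0 * f for f in track]
print(round(sd_semitones(octave_up), 2))
```

The log scale makes SD values comparable across speakers with different average F0, which is presumably why the cross-study tables report SD in semitones.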

Page 11: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

F0 baseline (Traunmüller & Eriksson, 1995)

• Fb = Fmean – k · SD(F), where SD(F) is the standard deviation of F0.
• k is a constant (approx. 1.43).
• Approx. 5% of F0 values fall below Fb.
• Different liveliness, same Fb.

• Tested by changing the factor and not Fb when resynthesizing natural speech.

• ke = 0.156, 0.414, 0.704, 1.000, 1.290, 1.566, 1.830
• Test sentence: “Det finns folkstammar som äter både kattkött och hundkött” (“There are tribes that eat both cat meat and dog meat”).
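A minimal sketch of the base value computation follows. It reads the dispersion term in the baseline formula as the standard deviation of F0 in Hz. That reading is an assumption, though it agrees with the results slides (120.8 - 1.43 × 24.1 ≈ 86.3 Hz). The function names and the choice of the sample standard deviation are also assumptions.

```python
import statistics

K = 1.43  # constant from the slide (approximate)

def base_value(f0_hz, k=K):
    """Base value Fb: mean F0 minus k standard deviations (Hz)."""
    return statistics.fmean(f0_hz) - k * statistics.stdev(f0_hz)

def share_below(f0_hz, threshold_hz):
    """Fraction of F0 values that lie below threshold_hz."""
    return sum(f < threshold_hz for f in f0_hz) / len(f0_hz)

# Hypothetical F0 values (Hz)
track = [98.0, 104.0, 110.0, 112.0, 115.0, 118.0, 121.0, 127.0, 135.0, 160.0]
fb = base_value(track)
print(round(fb, 1), share_below(track, fb))
```

On real recordings, `share_below(track, base_value(track))` can be compared with the roughly 5% stated on the slide. The exact share depends on the shape of the speaker's F0 distribution.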

Page 12: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Hypotheses concerning F0 for young Swedish males

• The F0 median is more robust than the F0 mean when it comes to technical factors, i.e. less sensitive to outliers.

• The base value shows the least within-speaker variation of the presented measures within a voice modality (e.g. creaky voice, shouting or raising one’s voice).

• The 5% limit frequency (alternative baseline) is more robust than the base value when the technical problem is positive octave jumps, i.e. F0 tracked an octave too high.
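The reasoning behind the first and third hypotheses can be illustrated with synthetic data. The sketch draws a made-up F0 track, doubles about 3% of the frames to imitate positive octave jumps, and compares how far each measure moves. The distribution, the jump rate, and all names are assumptions chosen for illustration. None of it comes from the study's recordings.

```python
import random
import statistics

def measures(f0_hz, k=1.43):
    """Mean, median, base value and 5% limit frequency of an F0 track (Hz)."""
    values = sorted(f0_hz)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    # 5% limit frequency: the value below which about 5% of the frames fall
    limit_5 = statistics.quantiles(values, n=20)[0]
    return {
        "mean": mean,
        "median": statistics.median(values),
        "base": mean - k * sd,
        "limit_5": limit_5,
    }

rng = random.Random(1)
clean = [rng.gauss(120.0, 20.0) for _ in range(2000)]
# Simulated tracking errors: about 3% of frames are reported one octave too high
jumped = [f * 2.0 if rng.random() < 0.03 else f for f in clean]

before, after = measures(clean), measures(jumped)
shift = {name: after[name] - before[name] for name in before}
for name, delta in shift.items():
    print(f"{name:8s} {delta:+.1f} Hz")
```

Octave jumps add a few very high values. They raise the mean and inflate the standard deviation, so the base value (mean minus k standard deviations) is pushed downward, while the median and the 5% limit depend only on rank order and barely move.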

Page 13: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Methods

• The software Praat (Boersma & Weenink, 2005) was used to automatically extract F0 data from 109 young male speakers (20-30 years old).
– The group exists as such in the Swedia database.
– 62% of convicted criminals in Sweden 2004 (25-35).

• The recordings were taken from the Swedia database (<http://www.swedia.nu>) – spontaneous speech.

• Mean duration of 52.3 sec.

Page 14: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Methods

• Edited out interviewer.
• Manual check of octave jumps.
• Collection of the 5% limit frequency, F0 mode (histograms for each speaker’s F0 distribution) and interquartile range is ongoing.
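One way to obtain an F0 mode from a per-speaker histogram is sketched below. The 10 Hz bin width and the function names are assumptions, as is the convention of reporting the centre of the fullest bin. The slides do not say how the mode is to be defined.

```python
from collections import Counter

def f0_histogram(f0_hz, bin_width_hz=10.0):
    """Count F0 values per bin; keys are the lower bin edges in Hz."""
    counts = Counter(int(f // bin_width_hz) * bin_width_hz for f in f0_hz)
    return dict(sorted(counts.items()))

def f0_mode(f0_hz, bin_width_hz=10.0):
    """Estimate the F0 mode as the centre of the most populated bin."""
    hist = f0_histogram(f0_hz, bin_width_hz)
    lower_edge = max(hist, key=hist.get)  # first fullest bin if there is a tie
    return lower_edge + bin_width_hz / 2.0

# Hypothetical F0 values (Hz)
track = [98.0, 104.0, 106.0, 109.0, 112.0, 118.0, 131.0]
print(f0_histogram(track))
print(f0_mode(track))
```

The estimate depends on the bin width, so the same width should be used for every speaker being compared.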

Page 15: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Methods

• A small robustness test was made by measuring F0 in simultaneous recordings on four different devices (material from Livijn, 2004).

– The North wind and the sun (in Swedish).

– MCA, Cassette, Mobile and digital (Reference).

Page 16: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Methods

• Vocal effort test.

• 5 male speakers from Eriksson & Traunmüller (2000)

• High quality recordings.

• 5 distances per subject, outdoors (0.3, 1.5, 7.5, 37.5 and 187.5 m)

– “Jag tog ett violett, åtta svarta och sex vita.” (“I took one violet, eight black and six white.”)

Page 17: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Methods

• A liveliness illustration

• Recordings of a simulated carrier signal + a neutral, happy, sad and angry voice.

Page 18: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

Mean distribution of F0 for YM

[Figure: histogram. x-axis: F0 mean (Hz); y-axis: N speakers.]

Bin (Hz):    70  80  90  100  110  120  130  140  150  160  170  More
N speakers:   0   0   1    8   21   28   22   14   10    1    4     0

• Mean of means 120.8 Hz – 65% between 100 and 130 Hz
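The 65% figure can be cross-checked against the bar labels of the histogram. The sketch assumes that each bin is labelled by its upper edge (spreadsheet-style binning) and that the bar labels read 0, 0, 1, 8, 21, 28, 22, 14, 10, 1, 4 plus an empty overflow bin. Both are readings of the chart rather than statements from the slide.

```python
# Bar labels read off the histogram; each bin is assumed to be labelled
# by its upper edge in Hz (the empty overflow bin is left out).
upper_edges = [70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170]
counts = [0, 0, 1, 8, 21, 28, 22, 14, 10, 1, 4]

def share_between(low_hz, high_hz):
    """Share of speakers whose bin lies in the interval (low_hz, high_hz]."""
    inside = sum(c for edge, c in zip(upper_edges, counts) if low_hz < edge <= high_hz)
    return inside / sum(counts)

total = sum(counts)
share = share_between(100, 130)
print(total, f"{share:.1%}")
```

Under these assumptions the counts add up to the 109 speakers named in the Methods, and 71 of them fall between 100 and 130 Hz.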

Page 19: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

F0 mean trend

[Figure: F0 mean per speaker. x-axis: Speakers (0–110); y-axis: F0 mean (Hz), 70–180.]

Page 20: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

Median distribution of F0 for YM

[Figure: histogram. x-axis: F0 median (Hz); y-axis: N speakers.]

Bin (Hz):    70  80  90  100  110  120  130  140  150  160  170  More
N speakers:   0   0   5   10   31   22   21   10    6    2    2     0

• Mean of medians 115.8 Hz – 68% between 100 and 130 Hz

Page 21: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

F0 Median trend

[Figure: F0 median per speaker. x-axis: Speakers (0–110); y-axis: Medians (Hz), 70–170.]

Page 22: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

Standard deviations of F0 for YM

[Figure: histogram. x-axis: F0 standard deviation (Hz); y-axis: N speakers.]

Bin (Hz):     5  10  15  20  25  30  35  40  45  50  55  More
N speakers:   0   2  15  27  19  14  15  11   4   1   1     0

• Mean of SDs 24.1 Hz – 56% between 10 and 25 Hz

Page 23: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

• Mean of baselines 86.3 Hz – 68% between 70 and 100 Hz

Baseline frequencies for YM

[Figure: histogram. x-axis: F0 baseline (Hz); y-axis: N speakers.]

Bin (Hz):    30  40  50  60  70  80  90  100  110  120  130  More
N speakers:   0   1   1   1  15  16  31   27   13    3    1     0
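The reported mean of baselines can be checked against the mean of means and the mean of standard deviations from the earlier results slides. The check assumes that each speaker's base value is computed as mean minus k times the standard deviation, with k = 1.43. Under that assumption, the mean of the base values equals the mean of means minus k times the mean of standard deviations, because the formula is linear.

```python
K = 1.43                 # constant from the baseline slide (approximate)
MEAN_OF_MEANS = 120.8    # Hz, from the F0 mean results slide
MEAN_OF_SDS = 24.1       # Hz, from the standard deviation results slide
REPORTED_BASELINE = 86.3 # Hz, mean of baselines as reported

expected = MEAN_OF_MEANS - K * MEAN_OF_SDS
print(round(expected, 1), REPORTED_BASELINE)
```

The two values agree to one decimal, which supports reading the baseline as mean minus 1.43 standard deviations, computed in Hz.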

Page 24: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

F0 baseline trend

[Figure: F0 baseline per speaker. x-axis: Speakers (0–110); y-axis: Baselines (Hz), 40–140.]

Page 25: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

F0 Measure Robustness

[Figure: F0 measures per recording device. x-axis: Recording device (REF, REF_band, MOB, MOB_band, MCA, MCA_band, CAS, CAS_band); y-axis: Frequency (Hz), 20–140; series: Mean, STD, Base, Median, Alt-IQ-base, Alt-base.]

Page 26: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

F0 measures of modal to shout

[Figure: F0 measures per speaker and vocal effort level. x-axis: Speakers, effort 1–5 (Harald 1–5, Henrik 1–5, Niclas 1–5, Peter 1–5, Prefekt 1–5, Stark 1–5); y-axis: Hz, 5–345; series: Mean, STD, Base, Median, Alt-IQ-base, Alt-base.]

Page 27: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Results

Liveliness illustration

[Figure: F0 measures per voice type. x-axis: Liveliness (carrier, neutral, happy, sad, angry); y-axis: F0 (Hz), 0–110; series: Mean, STD, Base, Median, Alt-IQ-base, Alt-base.]

Page 28: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Conclusions

• The median is more robust than the mean when it comes to technical factors, i.e. less sensitive to outliers.
– Yes. Manual check and results confirm this.
• The base value shows the least within-speaker variation of the presented measures within a voice modality.
– Yes. Shouting or raising one’s voice can mean raising one’s base value.
– 68% within 30 Hz, same as median.
• The 5% limit frequency is more robust than the base value when the technical problem is positive octave jumps.
– Yes. Robustness test.

Page 29: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Conclusions

• F0 should be measured in case work.

• If baseline values are different, there should be a reasonable explanation for it not to indicate speaker difference.
– Such as ‘voice modality’ (creak, shout etc.) differences.

Page 30: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Future work

• F0 mode (ongoing) and individual histograms.

• More measures on different “liveliness” levels for same and different speakers on different recording devices.

• Sample size vs. content.

• Authentic case material.

• Separate study of creaky voice.

Page 31: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

Thank you

for your attention.

Questions?

[email protected]

http://www.ling.gu.se/~jonas

Page 32: Preliminary F0 Statistics for Young Swedish Males and Forensic Phonetics

References

Boersma, P. & Weenink, D. (2005) Praat: doing phonetics by computer (Version 4.3.27) [Computer program]. Retrieved October 7, 2005, from http://www.praat.org/

Braun, A. (1995) Fundamental frequency – how speaker-specific is it?, in Braun and Köster (eds) (1995): 9-23.

Brottsförebyggande Rådet: [www] Retrieved November 26, 2005, from http://www.bra.se/

Bruce, G. (1982) Developing the Swedish Intonation Model. In Working Papers 22 (Lund University, Dep. of Linguistics), 51-116.

Jassem, W., Steffen-Batog, S. & Czajka, M. (1973) Statistical characteristics of short-term average F0 distributions as personal voice features, in W. Jassem (ed.) (1973) Speech Analysis and Synthesis vol. 3: 209-25, Warsaw: Polish Academy of Science.

Kitzing, P. (1979) Glottografisk frekvensindikering: En undersökningsmetod för mätning av röstläge och röstomfång samt framställning av röstfrekvensdistributionen (Lund University, Malmö).

Nolan, F. (1983) The Phonetic Bases of Speaker Recognition, Cambridge: Cambridge University Press.

Rose, P. (1991) How effective are long term mean and standard deviation as normalisation parameters for tonal fundamental frequency?, Speech Communication 10: 229-247.

Rose, P. (2002) Forensic Speaker Identification. New York: Taylor & Francis.

Traunmüller, H. (1994) Conventional, biological, and environmental factors in speech communication: A modulation theory. Phonetica 51: 170-183.

Traunmüller, H. & Eriksson, A. (1995) The frequency range of the voice fundamental in the speech of male and female adults. Unpublished manuscript (can be retrieved from http://www.ling.su.se/staff/hartmut/aktupub.htm).

Traunmüller, H. & Eriksson, A. (1995) The perceptual evaluation of F0-excursions in speech as evidenced in liveliness estimations. J. Acoust. Soc. Am. 97: 1905-1915.

Traunmüller, H. & Eriksson, A. (2000) Acoustic effects of variation in vocal effort by men, women, and children. J. Acoust. Soc. Am. 107: 3438-3451.
