University of Sheffield Department of Computer Science “Voice Stress Analysis: Detection of Deception” Xianfeng Liu MSc in Advanced Computer Science August 2005 Supervisor: Prof. Roger K. Moore Voice Stress Analysis: Stress and Detection of Deception Department of Computer Science, The University of Sheffield

University of Sheffield Department of Computer Science

“Voice Stress Analysis: Detection of Deception”

Xianfeng Liu MSc in Advanced Computer Science

August 2005

Supervisor: Prof. Roger K. Moore

Voice Stress Analysis: Stress and Detection of Deception Department of Computer Science, The University of Sheffield

Plagiarism Declaration

All sentences or passages quoted in this dissertation from other people's work have been specifically acknowledged by clear cross-referencing to author, work and page(s). Any illustrations which are not the work of the author of this dissertation have been used with the explicit permission of the originator and are specifically acknowledged. I understand that failure to do this amounts to plagiarism and will be considered grounds for failure in this dissertation and the degree examination as a whole.

Name: Xianfeng LIU
Signature:
Date: 31/Aug/2004

Abstract

Voice Stress Analysis (VSA) technology has been used in the lie detection field for several decades. It originated from the concept of micro muscle tremors (MMT), which were considered to be an indicator of deception. However, the effectiveness of VSA devices for detecting deception is still questionable. In this project, jitter (micro tremor) and pitch were used as two features to detect deception among several subjects. A deception-encouraged card game was designed to build up a deceptive/non-deceptive database. In order to achieve open-set performance, the entire data set was first divided into 22 equal-size groups, each containing the same number of non-deceptive and deceptive samples. The results show that pitch can detect deception, but it is speaker dependent. Detection probabilities for jitter, in both the speaker-dependent and speaker-independent methods, are not significantly higher than chance level.

Acknowledgements

The completion of this dissertation was made possible through the support and cooperation given by my supervisor, Prof. Roger Moore. I would like to express my sincere gratitude to him for his enlightening guidance and encouragement throughout the academic year. I would also like to thank Dr Jon Barker, who gave me a lot of help with the voice data collection. Lastly, I want to thank my friends for their constant support through the year.

Table of Contents

Plagiarism Declaration
Abstract
Acknowledgements

Chapter 1 Introduction

Chapter 2 Literature survey
    2.1 Overview of stress
    2.2 Voice Stress Analysis (VSA)
    2.3 Deception Detection (DD)
    2.4 Bayesian Hypothesis Testing

Chapter 3 Requirements and Analysis
    3.1 Introduction
    3.2 Data Collection
    3.3 Analysis Software

Chapter 4 Design and Implementation
    4.1 Data Collection
    4.2 Pitch Extraction
    4.3 Jitter Extraction
    4.4 Classification and Detection

Chapter 5 Results and Discussion
    5.1 Probability Density Functions
    5.2 Final Detection probability

Chapter 6 Conclusion and Future Expectation

References

Appendices
    Appendix I: Deception Encouraged Card Game Instruction
        Object of the game
        Equipment
        Definitions
        Play
        The Winner
        Additional Rules
    Appendix II: Extracted Pitch Value for Non-deceptive Voices
    Appendix III: Extracted Pitch Value for Deceptive Voices
    Appendix IV: Extracted Jitter Value for Non-deceptive Voices
    Appendix V: Extracted Jitter Value for Deceptive Voices

List of Figures

Fig 2.1 Waveform used to measure stress based on energy in the waveform (Little or no stress)
Fig 2.2 Waveform used to measure stress based on energy in the waveform (Medium Stress)
Fig 2.3 Waveform used to measure stress based on energy in the waveform (Hard Stress)
Fig 2.4 Truthful/Deceptive options with inconclusive decisions excluded
Fig 2.5 A real deception detection case solved by CVSA
Fig 3.1 Voice Data Collection
Fig 3.2 Distribution Fitting Tool
Fig 3.3 Labeling and segmentation by Praat
Fig 3.4 TextGrid file format
Fig 3.5 Use Transcriber to segment and label
Fig 4.1 Knock-out game table
Fig 4.2 Card game snap
Fig 4.3 Experiment Setup and workflow
Fig 4.4 PitchTier
Fig 4.5 Pitch extraction script for Praat
Fig 4.6 Pulses (Jitter)
Fig 4.7 Jitter extraction script for Praat
Fig 5.1 Speaker independent pitch pdfs
Fig 5.2 Speaker independent jitter pdfs
Fig 5.3 PDF of Pitch
Fig 5.4 PDF of Jitter
Fig 5.5 Final detection probability of pitch and jitter

List of Tables

Tab 2.1 Cost comparisons between VSA and Polygraph
Tab 4.1 Mean and variance value of Pitch by speaker
Tab 4.2 Mean and variance value of Jitter by speaker
Tab 5.1 Final detection probabilities

Chapter 1

Introduction

Voice Stress Analysis (VSA) was researched and developed over the past few decades by private individuals and the U.S. Army. During that time the theory was introduced to the lie detection field, and more recently it has become increasingly popular. For instance, in the UK, voice stress analysis has been used in commercial products and has received much positive feedback. Moreover, police forces, some insurance companies and even telecom companies are using this technology on incoming telephone calls to detect fraud [1] (News 2003). Also, a number of shops are selling 'love detectors' based on some form of voice analysis [2]. They also claim that voice stress analysis technology is effective in detecting deception and distinguishing different human emotions.

Although some companies have started to employ the theory commercially, many people still doubt the correctness of the method. Researchers have claimed that high levels of stress do not necessarily correlate with deception; the relationship between them still needs to be proved. In addition, even if the correlation between the two can be established, an innocent person will also become stressed in a tough situation.

One theory behind voice stress analysis is that there are inaudible vibrations known as "micro tremors" in the voice. It is claimed that the micro tremors change when a person is telling a lie [3]. On the other hand, audible features of the human voice are also considered. In this project, fundamental frequency (pitch) and micro tremor (jitter) were examined in several subjects' voice recordings. The purpose is to study the possibility of detecting deception from the human voice, especially the stressed voice. An experiment was designed to analyze the probability of correct detection using these two features separately. Eight subjects were invited to play a so-called "deception encouraged" card game, and their performance was recorded. In the analysis stage, a Bayesian hypothesis test [4] was employed for deception classification.

The main objective of the project is to find out whether voice-based analysis methods are able to detect deception. Fundamental frequency and micro tremor were extracted separately from a database, and the same statistical analysis algorithm was used on both. To account for variation between individuals, experiments were run in both speaker-dependent and speaker-independent conditions. The final detection probability shows the effectiveness of the method based on VSA theory and also the effect of speaker variation.

In order to achieve open-set performance, the entire data set was first divided into 22 equal-size groups, each containing the same number of non-deceptive and deceptive samples. The results show that pitch can detect deception, but it is speaker dependent. Detection probabilities for jitter, in both the speaker-dependent and speaker-independent methods, are not significantly higher than chance.

This report is organized as follows. Chapter 2 presents a review of some of the different approaches to deception detection and an overview of some of the work being carried out in this area by other researchers. Chapter 3 discusses the requirements of the project and some of the background of the techniques that have been used. Chapter 4 gives an overview of the work that has been done. Chapter 5 presents and discusses the results, and Chapter 6 gives the conclusions and future work.

Chapter 2

Literature survey

2.1 Overview of stress

There have been a number of studies carried out to determine the relationship between what voice stress analyzers detect and emotional stress. It is therefore useful to identify vocal stress and define its nature.

The typical common definition of stress considers a mechanical issue: “… when a stress is applied to a body … a corresponding strain is produced” [5]. However, this definition is not very useful because, even though the stressor itself is defined, the subject's emotional state is unknown.

Another definition discussed in [5] is: “Stress is observable variability in certain speech features due to a combination of unconscious response to stressors and/or conscious control”. This definition not only emphasizes the nature of stress as a variability but also refers to an “unstressed” state. The problem is that it is unclear how to define a meaningful “unstressed” reference condition. Nor does this definition clearly explain which features, or combinations of feature variations, correlate with different stressors [5].

Another typical common definition of stress is non-specific: “Stress is a general arousal or change in physiology.” It holds that stress is directly correlated with individual security and emotion. This notion of stress indicates a variation of arousal, so it is useful, but it does not make it easy to predict the specific outcome [5].

In the ESCA-NATO Workshop on Speech Under Stress, one definition was discussed which proposed to separate the cause of stress from the effect of stress: “Stress is an effect on the production of speech (manifested along a range of dimensions), caused by exposure to a stressor”. According to this definition, voice stress can be described in terms of both a cause and an effect. In other words, stress is an effect on a human caused by a stressor [6].

It is not easy to arrive at a single definition of stress that satisfies every perspective across different research domains; the concept of stress is broad and depends on the specific study. There are still many other definitions of stress made by other researchers. Each of them has its own particular emphasis and cannot be considered wrong. There is no need to settle on a unified definition of voice stress.

2.2 Voice Stress Analysis (VSA)

Voice stress analysis originated from the concept that when a person is under stress, especially if the speaker is under jeopardy, the body prepares to fight by increasing the readiness of its muscles to spring into action. Changes in the acoustic speech signal due to stress are mainly caused by these stress reactions, which also affect speech production through respiration and muscle tension. Hence, it should be possible to establish whether a person is stressed just by analyzing his or her voice [7].

The term for the muscle vibration is “micro-muscle tremors” (MMT) or “micro tremor”. The micro tremors occur in the muscles that make up the vocal tract and are transmitted through the speech. The tremor is described as a slight oscillation of several cycles per second, which is claimed to be the sole source for detecting whether an individual is lying [8].

Much work on stress analysis in real-life situations concentrates on communication under dangerous (jeopardy) conditions. In many of these studies [1] an increase in the fundamental frequency (F0) of the voice in situations of increasing danger is reported. Williams and Stevens [1] reported an increase in F0 range and abrupt fluctuations of the F0 contour with increasing stress. In a Russian study [15] the voices of astronauts were examined and changes in spectral energy distribution (the spectral centroid moving to higher frequency) were reported.

Based on the theories mentioned above, lie detection devices have been invented and deployed. These devices offer several potential advantages over the standard polygraph (see Tab 2.1 below). Firstly, the training time is shorter than for polygraph training, and there are no academic prerequisites to receive that training. Secondly, a VSA examination takes little time, averaging about 45 minutes per session. Most conveniently, compared with the traditional polygraph method there are no sensors placed on the body, only a small microphone clipped to the examinee's clothing. Since only the input voice is used, the examinee need not even be present during the examination. Moreover, a recording from a remote location can be processed with the equipment.

Tab 2.1 Cost comparisons between VSA and Polygraph (Taken from [8])

VSA systems can be broken into two separate categories: energy-based systems and frequency-based systems. The majority of the systems evaluated are based on the detection of the MMT [9]. The author of [9] carried out an experiment based on this theory and claimed that when voice data is processed through a bank of filters, a series of waveforms can be obtained which may represent the stress in the voice. In the case of a non-stressed response, the shape of the waveform resembles a Christmas tree (see Fig 2.1 [9]). As stress increases, the shape becomes flatter (see Fig 2.2 [9] and Fig 2.3 [9]). A waveform which shows signs of significant stress can be labeled as deceptive. This type of technology is known as energy-based VSA.

Fig 2.1 Waveform used to measure stress based on energy in the waveform (Little or no stress) [Taken from [9]]

Fig 2.2 Waveform used to measure stress based on energy in the waveform (Medium Stress) [Taken from [9]]

Fig 2.3 Waveform used to measure stress based on energy in the waveform (Hard Stress) [Taken from [9]]

The other kind of system is the frequency-based VSA system, which is claimed to be able to identify changes within frequency bands and in the distribution of the frequencies within those bands [9]. In these systems, a continuum of stress can be identified, and the position of the relevant stress on this continuum is used to determine whether the answers are deceptive or non-deceptive.

The aim of VSA is to analyze the levels of vibration, which are taken as measures of stress. However, stress can also be brought on by a simple thought or memory, such as suddenly remembering a possible future danger. Because environments and situations differ, different people may have different stress levels, and these levels change from day to day along with a person's mood. It is well accepted that it is possible to decode emotions from speech. The reason is that changes of emotion may indicate the "perceived jeopardy" or "danger" of statements [10].

2.3 Deception Detection (DD)

In the last 25 years, voice stress analysis technologies have been introduced in the lie detection area, where they are claimed to be more convenient and accurate than the traditional polygraph. Police agencies and insurance companies are now trying to use them to detect fraud. For instance, in the UK, a car insurer (Highway Insurance) which introduced phone lie detectors says a quarter of all vehicle theft claims have been withdrawn since the initiative began [1].

Deception detection aims to determine whether information is deceptive or non-deceptive. There are two main types of approach to detecting deception: the manual approach and the automatic approach. Compared with the manual approach, the automatic approach is more efficient and easier to use. However, because of the high stakes of some deception situations, manual DD is still very common among domain experts in the field.

DD approaches usually depend on a set of cues that are indicative of deception. Traditional deception research has identified a rich set of cues to deception that have been tested in lab or field environments. Recently, attempts have been made to discover cues automatically in order to differentiate deception from truth. This saves a great deal of time because it avoids encoding deception data manually. Humans' ability to identify deception is no better than chance [11]. The automatic discovery of cues to deception has improved detection performance to some degree [12].

NITV carried out a study based on VSA [13], comparing the traditional polygraph method with two VSA products. Fig 2.4 illustrates that both the polygraph and VSA examiners were able to discriminate accurately between truthful and deceptive cases. Both achieved a rate higher than chance. The polygraph examiner achieved an overall accuracy rate of 74%, while the VSA users' overall accuracy in this case was 84%. VSA is based on the principle of measuring the amount of change in the parameters associated with the involuntary dissipation of the FM component of the voice. These changes, which are related to psychological stress induced by fear, anxiety, guilt, or conflict, can be useful in the detection of deception [13].

Fig 2.4 Truthful/Deceptive options with inconclusive decisions excluded [Taken from [13]]

Another example of the practical application of VSA is a real child abuse case solved with the Computer Voice Stress Analyzer (CVSA), released by the National Institute for Truth Verification (NITV) [13]. The victim was a two-and-a-half-year-old child. The suspects were her mother and her mother's roommate. Both were given CVSA exams and both charts were “DI”. Fig 2.5 illustrates copies of both charts. Comparing these charts with Fig 2.1 to Fig 2.3, it can be concluded that the two suspects became highly stressed when they were asked the specific questions (see the responses circled in red).

It was reported that “voice stress analysis should be considered as an investigative aid to preferably be used after investigators have thoroughly collected all relevant evidence of a criminal offense”. In other words, VSA should not be used as a device for the purpose of coercion. The truth verification examination should be a part of the investigation, not the whole investigation. This requires not only a highly accurate voice stress analysis device but also a high level of professional interrogation skill.

Fig 2.5 A real deception detection case solved by CVSA [Taken from [13]]

On the other hand, controversies over the use of voice stress analysis have accompanied the technology since it first appeared. The accuracy of VSA in detecting deception was evaluated in [14]. One hundred and nine people participated in the experiment, half of whom were asked to commit a realistic and engaging mock crime. The other half had little knowledge of the mock theft. The CVSA examiners scored the exams in accordance with NITV procedures, and the charts were blind-scored by three other evaluators. Blind CVSA evaluators made correct decisions in 49.8% of cases, while the testing examiners achieved 48.6% accuracy. Neither of these accuracy rates exceeded the chance accuracy of 50%. The author of [14] therefore concluded that VSA is not reliable for lie detection because its sensitivity in detecting lies was very low.

Another study discussed and analyzed the major empirical evidence claimed by voice stress analysis proponents, specifically the statement that voice stress devices are effective in deception detection. In this study, the sub-audible micro tremor signal used by voice stress devices was extracted from the vocal spectrum. The results showed that the promise of voice stress analysis in the lie detection field was not, and may never be, a reality. It was stated that reliable evidence that VSA devices were useful in detecting deception did exist, but that there was no correlation with stress [15].

To date, a number of people believe that voice stress analyzers are not reliable for the detection of deception, and many experiments have been carried out to support their arguments. However, most of this laboratory-based evidence was obtained in “game playing” situations with a low level of jeopardy, so the conclusion that VSA is unreliable is itself open to question. The U.S. Supreme Court, after hearing all of the 'studies' presented on the accuracy of the polygraph (with over 70 years of development) and VSA, declared that "There is no consensus in the scientific community that polygraph evidence is reliable" [16]. Although progress has been made, most attempts have not been successful. Applying these techniques to critical decisions in real-world situations is still a long way off.

2.4 Bayesian Hypothesis Testing

Bayesian hypothesis testing is less formal than non-Bayesian varieties. In fact, Bayesian researchers typically summarize the posterior distribution without applying a rigid decision process. Since social scientists do not actually make important decisions based on their findings, posterior summaries are more than adequate. If a formal process is required, Bayesian decision theory is the natural choice, because it yields a probability distribution over the parameter space, from which expected utility calculations can be made based on the costs and benefits of different outcomes. In this project, pairwise (deceptive/non-deceptive) classification was considered. The classifier is similar to Bayesian hypothesis testing. In the Bayesian hypothesis method, there are two hypotheses, termed H0 and H1. In this project, under H0 the speech is non-deceptive, whereas under H1 the speech is deceptive. Putting all the features into a feature vector $x = (x_1, \ldots, x_M)$, where $M$ is the size of the vector, the two conditional probability density functions (PDFs) $p(x|H_0)$ and $p(x|H_1)$ are estimated [15]. With these PDFs, the likelihood ratio $\lambda$ is then defined as:

$$\lambda = \frac{p(x|H_1)}{p(x|H_0)} \qquad \text{Eq 2.1}$$

Whether the input speech is deceptive or non-deceptive is decided by comparing the likelihood ratio with a pre-defined value $\beta$. If $\lambda$ is bigger than $\beta$, the input speech is considered deceptive; otherwise it is classified as non-deceptive. The value of $\beta$ depends on the criterion used for detection. In a classification system, the criterion should be selected so that the two important error probabilities, the false acceptance rate (FAR) and the false rejection rate (FRR), are as low as possible. For a stress classification system, the value of $\beta$ is chosen at the equal error rate (EER), where FAR = FRR.

In order to form the likelihood ratio in Eq 2.1, the PDFs of both deceptive and non-deceptive speech must first be estimated. Assume that all the components $x_1, x_2, \ldots, x_M$ of the feature vector $x$ are independent and identically distributed Gaussian random variables with mean $\mu_n$ and variance $\sigma_n^2$ under non-deceptive conditions, but with a different mean $\mu_s$ and variance $\sigma_s^2$ under deceptive conditions. Then the individual feature component PDFs conditioned on neutral (H0) or stressed (H1) speech are as follows,

$$f(x_i|H_0) = \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left(-\frac{(x_i-\mu_n)^2}{2\sigma_n^2}\right) \qquad \text{Eq 2.2}$$

$$f(x_i|H_1) = \frac{1}{\sqrt{2\pi\sigma_s^2}} \exp\left(-\frac{(x_i-\mu_s)^2}{2\sigma_s^2}\right) \qquad \text{Eq 2.3}$$

With these PDFs and assuming statistical independence, the overall conditional probabilities p(x|H0) and p(x|H1) can be computed as,


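$$p(x|H_0) = \left(2\pi\sigma_n^2\right)^{-M/2} \exp\left(-\frac{1}{2\sigma_n^2}\sum_{i=1}^{M}(x_i-\mu_n)^2\right) \qquad \text{Eq 2.4}$$

$$p(x|H_1) = \left(2\pi\sigma_s^2\right)^{-M/2} \exp\left(-\frac{1}{2\sigma_s^2}\sum_{i=1}^{M}(x_i-\mu_s)^2\right) \qquad \text{Eq 2.5}$$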
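As an illustration of Eq 2.4 and Eq 2.5, the following minimal Python sketch evaluates the joint density of a feature vector in the log domain, which avoids numerical underflow when $M$ is large. The function name, the feature values and the class parameters are hypothetical and are not taken from the project data.

```python
import math

def log_joint_gaussian(x, mu, var):
    """Logarithm of Eq 2.4 / Eq 2.5 for i.i.d. Gaussian components.

    x   : list of feature values x_1..x_M
    mu  : class mean (mu_n or mu_s)
    var : class variance (sigma_n^2 or sigma_s^2)
    """
    m = len(x)
    squared_dev = sum((xi - mu) ** 2 for xi in x)
    return -0.5 * m * math.log(2.0 * math.pi * var) - squared_dev / (2.0 * var)

# Hypothetical feature vector and class parameters (illustrative only).
x = [148.0, 151.5, 146.2]
log_p_h0 = log_joint_gaussian(x, mu=147.0, var=30.0)  # non-deceptive model
log_p_h1 = log_joint_gaussian(x, mu=155.0, var=70.0)  # deceptive model
```

Exponentiating either value recovers the product of the individual component densities of Eq 2.2 or Eq 2.3.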

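Substituting Eq 2.4 and Eq 2.5 into Eq 2.1, the likelihood ratio can be computed as,

$$\lambda = \frac{p(x|H_1)}{p(x|H_0)} = \left(\frac{\sigma_n}{\sigma_s}\right)^{M} \exp\left(\frac{1}{2\sigma_n^2}\sum_{i=1}^{M}(x_i-\mu_n)^2 - \frac{1}{2\sigma_s^2}\sum_{i=1}^{M}(x_i-\mu_s)^2\right) \qquad \text{Eq 2.6}$$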

Taking the logarithm of each side, the log likelihood ratio is obtained as follows,

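$$\ln\lambda = M\ln\left(\frac{\sigma_n}{\sigma_s}\right) + \frac{M}{2\sigma_n^2}\left(\hat{\sigma}^2 + (\hat{\mu}-\mu_n)^2\right) - \frac{M}{2\sigma_s^2}\left(\hat{\sigma}^2 + (\hat{\mu}-\mu_s)^2\right) \qquad \text{Eq 2.7}$$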

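where $\hat{\mu}$ and $\hat{\sigma}^2$ are the estimated mean and variance of the input sample feature vector $x$, which are defined as,

$$\hat{\mu} = \frac{1}{M}\sum_{i=1}^{M} x_i \qquad \text{Eq 2.8}$$

$$\hat{\sigma}^2 = \frac{1}{M}\sum_{i=1}^{M}(x_i-\hat{\mu})^2 \qquad \text{Eq 2.9}$$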
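The closed form of the log likelihood ratio can be sketched in a few lines of Python. This is only an illustration under the Gaussian assumption stated above; the function names, the feature values, the class parameters and the threshold are hypothetical.

```python
import math

def sample_mean_var(x):
    """Eq 2.8 and Eq 2.9: sample mean and (1/M-normalised) sample variance."""
    m = len(x)
    mu_hat = sum(x) / m
    var_hat = sum((xi - mu_hat) ** 2 for xi in x) / m
    return mu_hat, var_hat

def log_likelihood_ratio(x, mu_n, var_n, mu_s, var_s):
    """Eq 2.7: ln(lambda) expressed through the sample mean and variance of x."""
    m = len(x)
    mu_hat, var_hat = sample_mean_var(x)
    # M * ln(sigma_n / sigma_s) equals (M / 2) * ln(var_n / var_s).
    return (0.5 * m * math.log(var_n / var_s)
            + m * (var_hat + (mu_hat - mu_n) ** 2) / (2.0 * var_n)
            - m * (var_hat + (mu_hat - mu_s) ** 2) / (2.0 * var_s))

# Hypothetical feature vector, class parameters and threshold.
x = [150.0, 158.0, 162.0, 149.0]
llr = log_likelihood_ratio(x, mu_n=147.0, var_n=30.0, mu_s=155.0, var_s=70.0)
beta = 1.0
# Comparing ln(lambda) with ln(beta) is equivalent to comparing lambda with beta.
is_deceptive = llr > math.log(beta)
```

Working with the sample mean and variance means that the sums over individual components only need to be computed once per input vector.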

The decision of whether the input speech is deceptive or non-deceptive is made by comparing the likelihood ratio (Eq 2.1) or, for Gaussian distributed features, the log likelihood ratio (Eq 2.7) with a pre-defined threshold $\beta$. For a deception classification system, however, we are only interested in the overall accuracy and have no preference for either FAR or FRR. Therefore, the value of $\beta$ corresponding to the equal error rate (EER), where FAR = FRR, is selected. In the experiments performed here, FAR and FRR were calculated as the ratio of the number of falsely accepted vowels to the total number of speech samples, and the ratio of the number of falsely rejected vowels to the total number of vowels, respectively. By varying the threshold, the value of $\beta$ corresponding to the EER can be found.
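The threshold search can be sketched as a simple sweep over candidate values. The sketch below makes two assumptions that are not stated in the text: a non-deceptive sample scored above the threshold counts as a false acceptance, and each rate is normalised by the size of its own class. The scores and labels are hypothetical.

```python
def far_frr(scores, labels, beta):
    """Return (FAR, FRR) at threshold beta.

    labels: 1 = deceptive, 0 = non-deceptive.
    Assumption: score > beta means "classified as deceptive".
    """
    n_neg = sum(1 for y in labels if y == 0)
    n_pos = sum(1 for y in labels if y == 1)
    false_accepts = sum(1 for s, y in zip(scores, labels) if y == 0 and s > beta)
    false_rejects = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= beta)
    return false_accepts / n_neg, false_rejects / n_pos

def eer_threshold(scores, labels):
    """Sweep the observed scores and return (beta, FAR, FRR) with minimal |FAR - FRR|."""
    best = None
    for beta in sorted(set(scores)):
        far, frr = far_frr(scores, labels, beta)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, beta, far, frr)
    return best[1], best[2], best[3]

# Hypothetical log likelihood ratios for four non-deceptive and four deceptive samples.
scores = [-2.0, -1.0, -0.5, 0.3, -0.2, 0.4, 1.1, 2.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
beta, far, frr = eer_threshold(scores, labels)  # beta = -0.2, FAR = FRR = 0.25
```

With a finite data set FAR and FRR may never be exactly equal, which is why the sketch selects the threshold with the smallest gap between them.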


Chapter 3

Requirements and Analysis

3.1 Introduction

This project aimed to develop a system for classifying deceptive and non-deceptive voice data. A review of the relevant techniques was presented in the previous chapter. In addition to these techniques, certain features of the voice data had to be extracted from a sufficiently large database. No ready-to-use database was available for the project; hence, a data collection procedure was designed and set up to create the speech database. Around 450 pieces of voice data (both deceptive and non-deceptive) were collected. However, segmentation of these data could only be done manually. There were two stages of segmentation in this project. In the first stage, a segmentation tool called “Transcriber” was employed for segmentation and labeling. In the second stage, all the data were segmented and labeled again using Praat. Praat can automatically extract the segments and save each of them as a separate file. It can also extract pitch and jitter from selected voice data (the script code can be found in the Program Codes). To obtain the detection probability of the classifier, one approach is to classify the signal manually first, then classify it using the system, and compare the two. The classifier is based on Bayesian hypothesis testing, which was described in the previous chapter. In this phase, the Statistics Toolbox of MATLAB was employed to analyze the features statistically.

3.2 Data Collection

The objective of this research is to find out whether voice stress analysis (VSA) technologies can detect deception. Obtaining deceptive and non-deceptive voice data is therefore an essential part of the work. In previous studies, these data were obtained by asking subjects to act as liars. For instance, in Horvath's and Heisse's studies [17] [18] and many other works, a scenario was made up in which the participating subjects acted as liars and judges (criminals and judges). All their spoken performances were recorded as the voice data used in the experiments. However, the main disadvantage of this method is that all the subjects knew that they were acting, so they might not have felt jeopardy. Hence, their voice data may not show distinct differences between deceptive and non-deceptive speech. In this project, to avoid this problem, a card guessing game was designed for voice data collection (see Appendices I). Subjects were encouraged to lie in order to win the game, rather than to act as liars. Only by lying could a player beat his or her opponents and win the final prize (see Fig 3.1).

Fig 3.1 Voice Data Collection

3.3 Analysis Software

Three software packages were used in the project: MATLAB, Praat and Transcriber. Each was chosen for features that would aid the completion of this project.

3.3.1 Matlab

This project used MATLAB to implement the feature extraction system and classifiers. The name MATLAB stands for matrix laboratory. It is a high performance language for technical computing. It integrates computation, visualisation and programming in an easy to use environment where problems and their solutions are expressed in familiar mathematical notation. It is typically used for mathematics and computation, algorithm development, modelling, simulation and prototyping, scientific and engineering graphics, and application development, including graphical user interface building [19]. The main feature of MATLAB is its ease of use when dealing with complex mathematical manipulations and visualization. Unlike other high level programming languages, MATLAB provides an easy language for vector and matrix manipulation. The advantage is that each discrete time signal can be represented as a matrix. In the project, each speaker's non-deceptive and deceptive voice data were loaded as separate vectors. Moreover, MATLAB provides add-ins in the form of toolboxes that can be used to extend its functionality. The Statistics Toolbox was used in the project for modelling the classifier based on Bayesian hypothesis testing theory. It includes graphical user interfaces (GUIs) and command line tools that make it easy to look at probability distributions, fit them to data, or generate random samples from them. The Distribution Fitting Tool is a GUI (see Fig 3.2) that enables users to learn about a variety of probability distributions; for example, a probability density function or cumulative distribution function can be graphed to investigate how a distribution's parameters affect its position and shape. The Distribution Fitting Tool fits data using 16 predefined probability distributions, a nonparametric (kernel smoothing) estimator, or a custom distribution [Matlab 7.04 help].

Fig 3.2 Distribution Fitting Tool

A second GUI provides a random number generator to simulate behavior associated with particular distributions. Random data can be used in testing hypotheses or models under different conditions.

3.3.2 Praat


Another software package used extensively in this project is Praat. Praat is a program designed for phonetics research, used to analyze, synthesize and manipulate speech. It can also generate various pictures to support speech analysis. Praat has been developed at the Institute of Phonetic Sciences, University of Amsterdam. Apart from the features mentioned above, it also supports signal labeling, segmentation and speech manipulation, and provides a high level scripting language. All the features are available through interactive menus. In this project, Praat was employed for precise speech segmentation and feature extraction. Non-deceptive and deceptive speech signals were segmented into many small individual wave files; then the pitch and jitter of each voice file could be extracted automatically. The following figure shows how Praat deals with speech waveform labeling and segmentation.

Fig 3.3 Labeling and segmentation by Praat

Two annotation tiers were used. The deceptive and non-deceptive segmentations were each saved in one tier, so that the corpus could be extracted from each tier. The format of the text file in which the annotation information is saved is shown below (see Fig 3.4).


Fig 3.4 TextGrid file format

3.3.3 Transcriber

Transcriber is open-source software for assisting the manual annotation of speech signals. It provides a graphical user interface (GUI) for segmenting speech recordings, transcribing them, and labeling speech turns, which makes it useful in speech research. In this project, Transcriber was used to segment and label the recordings of the subjects after the card game. The sentences of each subject were labeled “T” for truth, “L” for lie and “R” for reply (Accept/Do not Accept). The software is free and has a very good GUI (see Fig 3.5). A big advantage of this software, as with some other computer-based transcription systems, is that the transcript is synchronized with the audio file. Thus, when using Transcriber, any portion of the transcript can immediately be played with the corresponding audio. It is also very easy to search for all sections marked [inaudible] and listen to the corresponding audio.


Fig 3.5 Use Transcriber to segment and label


Chapter 4

Design and Implementation

4.1 Data Collection

Data collection in this project was divided into two major phases. First, a trial collection was carried out. Four subjects were invited to play the game in the departmental recording booth. Two laptops with recording software installed were used for real-time data storage. The trial experiment had two objectives. One was to test the hardware and software and make sure they worked properly. The other was to calculate how many non-deceptive and deceptive sentences could be obtained from one game. For significant differences between the non-deceptive and deceptive data to be detectable statistically, there should be at least about 300 experimental recordings. Thus, after the trial, the number of cards was increased to 22 for each player to make sure that sufficient data could be collected in the formal collection experiment. Second, the formal collection experiment was carried out. In this stage, 8 subjects were involved, all from the Department of Computer Science. Two of them are research staff of the department and the rest are master's students. It is very hard to devise a test in which the subject is sufficiently involved to produce speech with a high stress level. Therefore, all of them were asked to play the card game as a knock-out competition, in which the final winner would receive the prize (see Fig 4.1).

Fig 4.1 Knock-out game table


Each player was equipped with a headset microphone, and the game was also filmed with a digital camera. During the game, both players had to show their cards to the camera before playing them. The video recorded all the cards played and also the players' facial expressions, which might be used for classification in later facial experiments. With both the voice data and the video data of the game available, the deceptive utterances could be picked out afterwards, as in the following picture (see Fig 4.2).

Fig 4.2 Card game snap

According to the instructions of the game (Appendices I), telling the truth is always safe for the players, but three consecutive “truth” statements found by the opponent cause the player to lose that round. To some extent, deception is thus encouraged, or even forced upon the players. This additional rule balanced the number of deceptive and non-deceptive utterances in the game. For a test of significant difference, not only should the total amount of data be sufficient, but the data should also not be polarized. The final database contains 336 speech recordings, comprising 181 non-deceptive and 155 deceptive utterances (the corresponding data values can be found in the appendices). Moreover, a prize of 50 pounds was funded by the Department of Computer Science, which may have stimulated the subjects to perceive jeopardy when telling lies. The experiment focused on analyzing features of these data to identify any difference between the two kinds of data (deceptive/non-deceptive). If significant differences do exist, this indicates that deception can be detected using the VSA method. The following figure (Fig 4.3) shows the experiment setup and workflow.

Fig 4.3 Experiment Setup and workflow

Following the figure above (see Fig 4.3), the steps of data collection are described in detail below.

Step 1: Collection method design. A card game (Appendices I) was designed, inspired by a German dice game. The main idea of the game is to make people lie in order to win.

Step 2: Hardware setup. The quality of the recording is vitally important in this project. A noise-reduced recording booth (Department of Computer Science, Sheffield University) was used for recording. Two headset microphones were plugged into two laptops with recording software pre-installed (WaveCN V1.08). Moreover, a Canon digital video camera on a tripod was set up in the booth. Each player's 22 cards were randomly selected from 2 new packs of playing cards, so that a player could not guess the opponent's cards by memorizing the cards already played.

Step 3: Trial data collection. The trial experiment was essential. First, all the devices were checked for the following formal experiment. Second, voice data from the trial could show whether adjustments were needed. Third, the average number of deceptive and non-deceptive utterances was calculated, which showed whether more cards were needed per player to obtain more voice data for statistical analysis. Four subjects were involved in the trial experiment. They initially played with 11 cards each, then 15 cards, and finally 22 cards each, which was the number used in the formal experiment.

Step 4: Method adjustment and formal collection design. Given the number of available subjects and the amount of voice data needed, a knock-out competition was chosen. In total, 8 subjects participated and 7 games were played in the competition (see Fig 4.1). The final winner would win a 30 pound prize (50 pounds were funded by the Department of Computer Science, of which 20 were used in the trial stage).

Step 5: Implementation. A time was arranged for each subject and the experiment was carried out.

Step 6: Data transfer and conversion, labeling and segmentation. All the wave files were transferred to one computer and given normalized names. Then, Transcriber and Praat were employed for voice labeling and segmentation. Moreover, the video tapes were converted into Windows Media Video (WMV) format using Windows Movie Maker.

Step 7: Feature extraction. In this project, only two features were used: pitch and jitter. Their values were extracted automatically using Praat. Then, both pitch and jitter values were divided into several open sets, speaker dependent and speaker independent, for the following classification.

Step 8: Classification and comparison. Bayesian hypothesis testing was used for deception classification and to obtain the final detection probability.

4.2 Pitch Extraction

Pitch has been reported to be able to detect stress [20]. Several features are used together here, because it has been shown that more features can detect stress better than one. Therefore, jitter is used as a second feature alongside pitch for data analysis. In this project, Praat was employed for pitch extraction. Praat uses the autocorrelation method for pitch analysis. The algorithm performs acoustic periodicity detection on the basis of an accurate autocorrelation method [21]. It is reported in [22] that this method is more accurate and noise-resistant than the cepstrum method and also than the original autocorrelation methods. The algorithm used is:

$$r_x(\tau) \approx \frac{r_{xw}(\tau)}{r_w(\tau)} \qquad \text{Eq 4.1}$$

To estimate the autocorrelation $r_x(\tau)$ of the original signal segment, it divides the autocorrelation $r_{xw}(\tau)$ of the windowed signal by the autocorrelation $r_w(\tau)$ of the window. This estimation can easily be seen to be exact for the constant signal x(t) = 1 (without subtracting the mean, of course); for periodic signals, it brings the autocorrelation peaks very near to 1. The first consideration for stress evaluation involves characteristics of the fundamental frequency f0, including contours, mean, variability, and distribution. The differences in mean, variance, and distribution of pitch (f0) were therefore also considered. The statistical tests performed assume the sample variables to be Gaussian distributed, so a comparison of f0 distribution contours will be performed.
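The effect of dividing by the window autocorrelation can be checked numerically. The following Python sketch is only an illustration of Eq 4.1 and is not Praat's implementation: the Hann window, the segment length and the test signals are assumptions chosen for the example, and the mean is not subtracted.

```python
import math

def autocorr(a, lag):
    """Unnormalised autocorrelation sum of sequence a at the given lag."""
    return sum(a[t] * a[t + lag] for t in range(len(a) - lag))

def hann(n):
    """Hann window of length n (an assumed window choice for this sketch)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n) for t in range(n)]

def estimated_autocorr(x, lag):
    """Eq 4.1: r_x(lag) ~ r_xw(lag) / r_w(lag), each normalised by its lag-0 value."""
    w = hann(len(x))
    xw = [xi * wi for xi, wi in zip(x, w)]
    r_xw = autocorr(xw, lag) / autocorr(xw, 0)
    r_w = autocorr(w, lag) / autocorr(w, 0)
    return r_xw / r_w

n, period = 400, 50  # hypothetical segment length and period, in samples
constant = [1.0] * n
sine = [math.sin(2.0 * math.pi * t / period) for t in range(n)]
r_const = estimated_autocorr(constant, period)  # exact for x(t) = 1
r_sine = estimated_autocorr(sine, period)       # very near 1 at the true period
```

Without the division, the peak of the windowed autocorrelation at the true period would be attenuated by the window, which is the bias this estimate removes.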

All the fundamental frequency information was stored and represented as a PitchTier, one of the object types in Praat. The object represents a time-stamped pitch contour, i.e. it contains a number of (time, pitch) points, without voiced/unvoiced information. For instance, if a PitchTier contains two points, namely 150 Hz at a time of 0.5 seconds and 200 Hz at a time of 1.5 seconds, then this is to be interpreted as a pitch contour that is constant at 150 Hz for all times before 0.5 seconds, constant at 200 Hz for all times after 1.5 seconds, and linearly interpolated for all times between 0.5 and 1.5 seconds (i.e. 160 Hz at 0.7 seconds, 180 Hz at 1.1 seconds, and so on). In this project, the interval was set to 0.01 seconds, which means that a pitch value was extracted every 0.01 seconds. A sample is shown in Fig 4.4.
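The interpretation of a PitchTier as a piecewise linear contour can be sketched as follows. The function below is a hypothetical helper written for illustration, not part of Praat; it assumes the points are sorted by strictly increasing time.

```python
def pitch_at(points, t):
    """Value of a PitchTier-style contour at time t.

    points: list of (time_s, pitch_hz) tuples sorted by strictly increasing time.
    The contour is constant before the first point and after the last point,
    and linearly interpolated between neighbouring points.
    """
    if t <= points[0][0]:
        return points[0][1]
    if t >= points[-1][0]:
        return points[-1][1]
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return f0 + (f1 - f0) * (t - t0) / (t1 - t0)

# The two-point example from the text: 150 Hz at 0.5 s and 200 Hz at 1.5 s.
points = [(0.5, 150.0), (1.5, 200.0)]
before = pitch_at(points, 0.2)  # constant 150 Hz before the first point
middle = pitch_at(points, 1.0)  # halfway between the points: 175 Hz
after = pitch_at(points, 2.0)   # constant 200 Hz after the last point
```

Sampling such a contour every 0.01 seconds and averaging the samples gives one way to obtain a single average pitch value per file.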

Fig 4.4 PitchTier


Fig 4.5 Pitch extraction script for Praat

A Praat script was written to do this job automatically and save each wave file's pitch information into a single “.PitchTier” file. Each file's average pitch value could then be calculated. Another text file was used to save all the average pitch values (see Fig 4.5).

4.3 Jitter Extraction

Jitter has also been extracted from the voice data. Jitter is the perturbation in the vibration of the vocal cords. It is known as period-to-period fluctuation in F0, and it causes the fundamental frequency to vary from cycle to cycle. Perturbation means a deviation from steadiness [7]. Let $\alpha_i$ be any cyclic parameter (amplitude, pitch period, etc.) in the $i$th cycle of the waveform. Then the steady value of this parameter over a span of $N$ cycles can be estimated from its arithmetic mean (see Eq 4.2):

$$\bar{\alpha} = \frac{1}{N}\sum_{i=1}^{N}\alpha_i \qquad \text{Eq 4.2}$$

The zeroth-order perturbation function is then defined as the arithmetic difference (see Eq 4.3):

$$P_i^0 = \alpha_i - \bar{\alpha}, \quad i = 1, \ldots, N \qquad \text{Eq 4.3}$$

where the superscript gives the order of the perturbation function. Higher-order perturbation functions can be obtained by alternately taking backward and forward differences of lower order functions. We will consider the first-order perturbation function (see Eq 4.4):

$$P_i^1 = P_i^0 - P_{i-1}^0 = \alpha_i - \alpha_{i-1}, \quad i = 2, \ldots, N \qquad \text{Eq 4.4}$$

The first-order perturbation function can be used to determine the fundamental frequency perturbation if, in Eq 4.4, $\alpha_i$ is taken to be the fundamental frequency. The fundamental frequency is computed only for the voiced parts of speech. The fundamental frequency perturbation is defined as the average of the absolute values of all these differences, normalized to a percentage:

$$\text{jitter} = \frac{100}{(N-1)\,\bar{\alpha}} \sum_{i=2}^{N} \left|\alpha_i - \alpha_{i-1}\right| \qquad \text{Eq 4.5}$$
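The jitter measure of Eq 4.5 can be sketched directly from a list of per-cycle values. The function name and the example sequences below are hypothetical and serve only to illustrate the formula; in the project the values were extracted by Praat.

```python
def jitter_percent(alpha):
    """Eq 4.5: mean absolute first-order perturbation relative to the mean, in percent.

    alpha: per-cycle values of the cyclic parameter (e.g. fundamental frequency).
    """
    n = len(alpha)
    mean = sum(alpha) / n                                          # Eq 4.2
    first_order = [alpha[i] - alpha[i - 1] for i in range(1, n)]   # Eq 4.4
    return 100.0 * sum(abs(p) for p in first_order) / ((n - 1) * mean)

# Hypothetical per-cycle F0 values in Hz.
steady = [100.0, 100.0, 100.0, 100.0]  # no cycle-to-cycle variation
wobbly = [100.0, 102.0, 100.0, 102.0]  # alternates by 2 Hz every cycle
jitter_steady = jitter_percent(steady)  # 0.0
jitter_wobbly = jitter_percent(wobbly)  # 600 / 303, about 1.98 percent
```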

F0 perturbations have been found to differ between “emotional modes” such as anxiety, fear and anger [23], [24]. In this project, deceptive (stressed) and non-deceptive (natural) segments of speech are analyzed and presented using an average perturbation contour diagram. If the contours of the non-deceptive and deceptive speech segments do not overlap, it can be concluded that jitter analysis can be used in the detection. Like pitch, jitter can also be extracted automatically by Praat. Praat reports five types of jitter: local, local (absolute), rap, ppq5 and ddp. The first, jitter (local), was used here. This is the average absolute difference between consecutive periods, divided by the average period. A value of 1.040% has been given as a threshold for pathology. According to this threshold, a normal speaker's jitter value should never exceed it, and should usually be much smaller. As shown in the Appendices, the jitter values are all around 100 times smaller than 1. All the speakers are normal speakers without any abnormality of the vocal organs, so the extracted jitter values can be considered usable. As with pitch, the extracted jitter values were stored in a single .txt file (see Fig 4.7).


Fig 4.6 Pulses (Jitter)


Fig 4.7 Jitter extraction script for Praat

4.4 Classification and Detection

We now turn to the related problem of classifying deceptive and non-deceptive speech. The task is to formulate an algorithm for the detection of speech spoken under one particular deceptive style versus non-deceptive speech. Here the two terms, classification and detection, can be used interchangeably, since only pairwise classification is considered. Two processing stages are required for deception detection. In the first stage, acoustic features are extracted from an input speech waveform, as described above. The second stage focuses on distinguishing deceptive from non-deceptive speech using one or more available methods. A variety of methods exist for stress detection, which include, but are not limited to, detection-theory based methods, methods based on distance measures, neural network classifiers, and statistical modeling based techniques [8]. In this section, a Bayesian hypothesis-testing framework was employed to detect deceptive (stressed) versus non-deceptive (neutral) speech. The Bayesian hypothesis method is a stress detection technique for determining whether given audio data is neutral speech or stressed speech. It can also classify different types of emotional stress [25]. In the Bayesian hypothesis method, there are two hypotheses, termed H0 and H1. Under H0, the speech is neutral; under H1, the speech is stressed. Putting all the features into a feature vector $x = (x_1, \ldots, x_M)$, where $M$ is the size of the vector, the two conditional probability density functions (PDFs) $p(x|H_0)$ and $p(x|H_1)$ are estimated [16]. With these PDFs, the likelihood ratio $\lambda$ is then defined as:

$$\lambda = \frac{p(x|H_1)}{p(x|H_0)} \qquad \text{Eq 4.6}$$

The likelihood ratio is compared with a pre-defined threshold $\beta$. If $\lambda$ is bigger than $\beta$, the input speech is considered stressed; otherwise it is classified as neutral.

In order to achieve open-set performance, the entire data set was first divided into 22 equal-size groups containing both deceptive and non-deceptive speech. Each group contains the same number of non-deceptive and deceptive data. Then, the subsets were divided again by speaker. For instance, subsets 1 to 5 may belong to one speaker and subsets 6 to 10 to another, which can be set as two different subgroups. It should be noted that, after calculating each speaker's mean and variance of both pitch and jitter, large differences can be found between individuals (see Tab 4.1 and Tab 4.2).

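| Condition | Statistic | A.T | J.E | L.W | L.G | Y.D | Y.P | Y.T |
|---|---|---|---|---|---|---|---|---|
| Non-deception | Mu (mean) | 147.352 | 155.9583 | 153.1072 | 116.3107 | 166.6665 | 134.452 | 135.8364 |
| Non-deception | Sigma (variance) | 29.9724 | 97.3767 | 1489 | 53.5314 | 549.6449 | 31.3271 | 5.8552 |
| Deception | Mu (mean) | 155.0814 | 166.6267 | 168.5697 | 119.3156 | 195.4759 | 144.428 | 144.1616 |
| Deception | Sigma (variance) | 70.5265 | 1073.1 | 2034.5 | 71.3751 | 1879.9 | 82.0586 | 49.7894 |

Tab 4.1 Mean and variance of pitch by speaker

| Condition | Statistic | A.T | J.E | L.W | L.G | Y.D | Y.P | Y.T |
|---|---|---|---|---|---|---|---|---|
| Non-deception | Mu (mean) | 0.0112 | 0.0126 | 0.0152 | 0.0167 | 0.0119 | 0.0143 | 0.0122 |
| Non-deception | Sigma (variance) | 9.16E-06 | 2.14E-05 | 1.92E-05 | 2.40E-05 | 6.63E-06 | 2.21E-05 | 1.61E-05 |
| Deception | Mu (mean) | 0.0119 | 0.0149 | 0.0155 | 0.0158 | 0.0119 | 0.0138 | 0.0137 |
| Deception | Sigma (variance) | 1.23E-05 | 9.47E-06 | 1.50E-05 | 3.27E-05 | 8.11E-06 | 8.67E-06 | 1.54E-05 |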


Tab 4.2 Mean and variance of jitter by speaker

The above tables indicate that there may be a difference between performing the classification speaker independently and speaker dependently. For this reason, the speaker dependent and speaker independent experiments are considered separately. In the speaker dependent experiment, each speaker's speech data were divided into three parts. A quarter of them were used as training data, and half were used for development. In the development stage, the detection threshold value, $\beta$, could be calculated. The test part was the remaining quarter of the speaker's data, which gave the final detection probability for the speaker. On the other hand, the speaker independent experiment does not take differences between speakers into account. This experiment used the 22 subsets mentioned above and set aside the training and final test corpus one speaker at a time. The final detection probability can be found in the following chapter. Each such group belonging to one speaker was set aside to train the system, and the remaining data (groups) were used to test the system, to obtain the overall error rate threshold for the Bayesian hypothesis-testing method and the mean and variance for the distance measure approach. The final error rate is obtained by accumulating all error rates from the 22 open-set tests. Then, the training groups were changed to another speaker and the corresponding remaining groups were set as the test data. In each test set (as described above), the values were stored as a vector to be used in MATLAB. Two vector lengths were considered here as different categories of test set. The full length of each vector is the sum of the number of deceptive and non-deceptive utterances. Then, half of those data were selected to form new vectors. All these vectors were passed to the Bayesian method to calculate the final correctness rate.
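The two partitioning schemes can be sketched as follows. This is a hypothetical illustration, not the MATLAB code used in the project: the function names and sample identifiers are invented, the samples are assumed to be already ordered, and any remainder from the integer division is assigned to the test part.

```python
def split_speaker_data(samples):
    """Speaker dependent split: 1/4 training, 1/2 development, remaining 1/4 test."""
    n = len(samples)
    n_train = n // 4
    n_dev = n // 2
    train = samples[:n_train]
    dev = samples[n_train:n_train + n_dev]
    test = samples[n_train + n_dev:]
    return train, dev, test

def leave_one_speaker_out(data_by_speaker):
    """Speaker independent scheme: set one speaker aside at a time.

    Yields (speaker, that_speaker_samples, all_other_samples); which part is
    used for training and which for testing follows the description above.
    """
    for speaker, samples in data_by_speaker.items():
        rest = [s for other, group in data_by_speaker.items()
                if other != speaker for s in group]
        yield speaker, samples, rest

# Hypothetical sample identifiers for two speakers.
data = {"S1": list(range(8)), "S2": list(range(8, 12))}
train, dev, test = split_speaker_data(data["S1"])  # sizes 2, 4 and 2
folds = list(leave_one_speaker_out(data))          # one fold per speaker
```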


Chapter 5

Results and Discussion

5.1 Probability Density Functions

As mentioned above, both speaker-dependent and speaker-independent classification were considered in this project. The reason is that a rough comparison of the pitch and jitter values suggested that the two methods could give very different results. First, the probability densities for both the speaker-dependent and the speaker-independent case were plotted, as described in the following paragraphs. The following figures show the probability density functions of the subjects' data (see Fig 5.1 to Fig 5.4).

The probability density function (pdf) has a different meaning depending on whether the distribution is discrete or continuous. For a discrete distribution, the pdf is the probability of observing a particular outcome. For a continuous distribution, by contrast, the pdf at a value is not the probability of observing that value; the probability of observing any particular value is zero. To obtain probabilities, the pdf must be integrated over an interval of interest. A pdf has two theoretical properties: it is zero or positive for every possible outcome, and its integral over its entire range of values is one.
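The two pdf properties, and the point that probabilities come from integrating over an interval, can be checked numerically. The following Python sketch is an illustration only: it assumes a Gaussian density with a mean of 150 Hz and a standard deviation of 20 Hz, values chosen for the example rather than taken from the experiment, and the helper names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x (never negative)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def integrate(f, a, b, steps=10000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# Assumed, illustrative parameters: mean 150 Hz, standard deviation 20 Hz.
MEAN_HZ, STD_HZ = 150.0, 20.0


def pitch_pdf(x):
    return gaussian_pdf(x, MEAN_HZ, STD_HZ)


# The area under the whole curve is (numerically) one.
total_area = integrate(pitch_pdf, MEAN_HZ - 10 * STD_HZ, MEAN_HZ + 10 * STD_HZ)

# A probability comes from integrating over an interval, not from one point.
prob_140_to_160 = integrate(pitch_pdf, 140.0, 160.0)

print(f"area under pdf: {total_area:.4f}")
print(f"P(140 Hz <= pitch <= 160 Hz): {prob_140_to_160:.4f}")
```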

Fig 5.1 Speaker independent pitch pdfs


The above graph shows the pitch probability density of all speakers' deceptive and non-deceptive voices. The blue line represents the probability density of the deceptive data and the red line that of the non-deceptive data. The graph shows that the pitch distribution of deceptive voices differs little from that of non-deceptive voices. From the graph alone, it appears that the deceptive and non-deceptive data have much in common and that detection is difficult. However, this is only an assumption made before the classification; the experimental results in the following section confirm it.

Turning to jitter, the following graph (Fig 5.2) shows a clearer difference than the pdf graph of pitch. Nevertheless, the difference is still not distinct enough to tell the two kinds of data apart from their distributions alone: a large part of the two curves still overlaps.

Fig 5.2 Speaker independent jitter pdfs

The speaker-independent pitch and jitter distributions overlap considerably, which makes them appear unsuitable for deception classification. In the speaker-dependent case, however, the picture seems quite different. The following figures show each speaker's pitch and jitter individually (Fig 5.3 and Fig 5.4).


Fig 5.3 PDF of Pitch

The above figures show the pitch probability density of both deceptive and non-deceptive voice data for each subject. In the six graphs above, except for the two graphs in the middle (Subject 2 and 3), the difference in probability density between deceptive and non-deceptive data is prominent. Take the first figure for instance: the mean is about 145 Hz for non-deceptive voice and 155 Hz for deceptive voice, and the shapes of the two distributions are also markedly different. The figure of subject 2 also shows a large difference between deceptive and non-deceptive data. Nevertheless, even in the middle two graphs, a difference between the two curves can still be observed: their variances look similar, but their mean pitch values differ by about 10 Hz. Compared with the speaker-independent figures above, this suggests that speaker-dependent and speaker-independent classification may give quite different results, and that the former seems more feasible for deception detection.

For jitter, however, the results are different. The following figures (Fig 5.4) show the speaker-dependent jitter distributions, which differ considerably from those of pitch.
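To illustrate how per-speaker pitch densities such as these could feed a Bayesian decision, the following Python sketch compares the posterior of the two classes for a single pitch value. It is a simplified illustration under stated assumptions, not the project's MATLAB implementation: the class densities are assumed to be Gaussian, the priors are assumed equal, the means follow the first subject's figure (about 145 Hz and 155 Hz), and the 10 Hz standard deviation is an invented value. The function names are hypothetical.

```python
import math


def log_gaussian(x, mean, std):
    """Log density of a normal distribution at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def classify_pitch(pitch_hz, truthful, deceptive, prior_deceptive=0.5):
    """Return the class with the larger posterior for one pitch value.

    truthful and deceptive are (mean, std) pairs estimated from training data.
    """
    log_post_d = log_gaussian(pitch_hz, *deceptive) + math.log(prior_deceptive)
    log_post_t = log_gaussian(pitch_hz, *truthful) + math.log(1.0 - prior_deceptive)
    return "deceptive" if log_post_d > log_post_t else "non-deceptive"


# Means as read from the first subject's figure; the standard deviation
# of 10 Hz is an assumed value for illustration only.
truthful_model = (145.0, 10.0)
deceptive_model = (155.0, 10.0)

print(classify_pitch(142.0, truthful_model, deceptive_model))
print(classify_pitch(158.0, truthful_model, deceptive_model))
```

With equal variances and equal priors the decision boundary lies midway between the two means, which is why heavily overlapping densities leave little room for reliable detection.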


Fig 5.4 PDF of Jitter

In this combined graph, the two probability density curves largely overlap. It is therefore hard to believe that speaker-dependent classification on jitter can yield a significant detection probability for each speaker.

In general, each speaker has a different fundamental frequency, and even the same speaker may behave differently within one game or across several games. For example, subjects may become more and more nervous as they advance through the rounds. It is assumed that the final round of the card game is the most intense one, since its winner finally wins the 30 pounds prize. It is hard to detect deception without considering these differences between speakers. Moreover, jitter, which was described in the literature review above, was expected to be usable for deception detection. However, the probability density graphs here indicate that jitter may not be feasible for deception detection.


5.2 Final Detection probability

As described above, after comparing the probability densities of the deceptive and non-deceptive data, speaker-dependent classification seems better suited to this project. Nevertheless, both speaker-dependent and speaker-independent classification were carried out. The values are shown in the following table (see Tab 5.1):

Speaker    Speaker Independent      Speaker Dependent
           Pitch      Jitter        Pitch      Jitter
A.T        60.0%      52.7%         87.0%      51.7%
J.E        68.8%      28.3%         76.6%      53.9%
L.W        70.0%      54.4%         64.4%      31.3%
L.G        40.0%      54.0%         75.0%      53.7%
Y.D        25.0%      64.2%         48.1%      23.1%
Average    53.2%      50.7%         70.2%      40.7%

Tab 5.1 Final detection probabilities

As the table shows, there are two categories of values: the speaker-independent detection probabilities and the speaker-dependent detection probabilities. Note that in the speaker-independent category, the name of the speaker indicates the speaker from whom the training and test corpora were selected; the development subsets were the rest of that speaker's corpus plus all the other speakers' corpora. The probability values in the speaker-dependent category, on the other hand, were calculated within that person's data only. In addition, these values are presented as statistical significance levels rather than exact detection ratios. In statistics, "significant" means probably true (not due to chance); in this project, the percentages indicate how much deception can probably be detected by a given method.
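As a quick cross-check of Tab 5.1, the following Python sketch computes the unweighted mean of each column and counts how many speakers score above the 50% chance level. The per-speaker values are copied from the table; the dictionary keys and the use of an unweighted mean are assumptions of this sketch.

```python
# Detection probabilities (%) from Tab 5.1, in the order A.T, J.E, L.W, L.G, Y.D.
results = {
    "independent_pitch": [60.0, 68.8, 70.0, 40.0, 25.0],
    "independent_jitter": [52.7, 28.3, 54.4, 54.0, 64.2],
    "dependent_pitch": [87.0, 76.6, 64.4, 75.0, 48.1],
    "dependent_jitter": [51.7, 53.9, 31.3, 53.7, 23.1],
}

CHANCE = 50.0

summary = {}
for name, values in results.items():
    mean = sum(values) / len(values)              # unweighted mean over speakers
    above = sum(1 for v in values if v > CHANCE)  # speakers above chance level
    summary[name] = (mean, above)
    print(f"{name}: mean {mean:.1f}%, {above} of {len(values)} speakers above chance")
```

The unweighted means agree with the reported averages for speaker-independent jitter (50.7%) and speaker-dependent pitch (70.2%), but come out at about 52.8% and 42.7% for the other two columns, against the reported 53.2% and 40.7%. The reported averages may have been computed differently, for example weighted by the number of samples per speaker; the text does not say.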

5.2.1 Speaker Independent

The table shows a large difference between the speaker-independent and speaker-dependent results. Take pitch for instance: in the speaker-independent experiment, the highest detection probability is 70.0%, two others are 60.0% and 68.8%, and two are below 50%. The average detection probability is 53.2%, which is only slightly higher than chance. A detection probability at this level does not convincingly show that deception can be detected. Although some values are higher than 50% and one even reaches 70%, they fluctuate widely, which also suggests that speaker-independent detection is not very effective. Together with the probability density function graph above (Fig 5.1), it can be concluded that speaker-independent detection on pitch is not feasible.


The detection values for jitter in the speaker-independent category behave differently from those for pitch. The fluctuation is smaller, which suggests that speaker-independent detection on jitter is little affected by differences between speakers. However, the detection probability is very low: it is not significantly higher than 50% (the chance level).

In summary, the experimental results agree with the impression given by the probability density function figures (Fig 5.1 and Fig 5.2): speaker-independent classification does not seem feasible for deception detection. When detecting with pitch, some results reach high levels but some are much lower than the chance level (50%). The final detection probabilities fluctuate greatly, which is inconsistent with a speaker-independent method. Detection on jitter, on the other hand, seems little affected by differences between speakers and generates steadier probability values. Nevertheless, all those probabilities are well below a statistically significant level. Therefore, it can be concluded that speaker-independent deception detection on pitch and jitter may not be effective.

5.2.2 Speaker Dependent

For speaker-dependent detection, the results are quite different. As the values in Tab 5.1 show, the detection probability on pitch is much higher than the chance level, whereas the results for jitter are still not significantly higher than 50%.

For pitch, the detection probabilities differ between speakers. Four of the five detection probabilities on pitch (80%) are higher than 60%: one speaker's detection probability is nearly 90%, two are over 70% and one is 64.4%. The remaining one is lower than 50%. The average result is 70.2%. From a statistical perspective, this probability is significantly higher than chance, which suggests that this method is effective for detecting deception.

The results for jitter are still not high enough to suggest that the detection is effective. Three values are about at the chance level and two are much lower than 50%. The average is only 40.7%. In terms of statistical significance, this result cannot show that speaker-dependent deception detection on jitter is feasible.

Compared with speaker-independent detection, speaker-dependent detection seems more appropriate for deception detection. In this project, pitch is the feature that can be used in the detection. Jitter, however, which was claimed to be able to detect deception, is not the right feature here in either the speaker-dependent or the speaker-independent experiment. The following graph shows the final detection probability values (Fig 5.5).


[Bar chart: Detection Probability of Pitch and Jitter. Vertical axis: 0.0% to 100.0%. Horizontal axis: A.T, J.E, L.W, L.G, Y.D, Average. Series: Speaker Independent Pitch, Speaker Independent Jitter, Speaker Dependent Pitch, Speaker Dependent Jitter.]

Fig 5.5 Final detection probability of pitch and jitter
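The discussion above judges detection rates against the 50% chance level. Whether a rate of about 70% is significantly above chance also depends on how many test samples it was measured on, and the per-speaker test-set sizes are not given in the text. The following Python sketch, which is not part of the original MATLAB experiments, computes a one-sided binomial p-value for hypothetical test-set sizes; the function name and the sizes 10, 20 and 50 are assumptions for illustration.

```python
from math import comb


def p_value_above_chance(correct, total, chance=0.5):
    """One-sided binomial p-value: the probability of at least `correct`
    hits out of `total` if the detector were only guessing."""
    return sum(
        comb(total, k) * chance**k * (1.0 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical test-set sizes; each is scored at a 70% detection rate.
for total in (10, 20, 50):
    correct = round(0.7 * total)
    p = p_value_above_chance(correct, total)
    print(f"{correct}/{total} correct: p = {p:.4f}")
```

The same detection rate gives a much smaller p-value on a larger test set, so the per-speaker percentages are easier to interpret when the number of test samples is reported alongside them.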


Chapter 6

Conclusion and Future Expectation

The issues of stress classification and deception detection are becoming increasingly important for law enforcement and the military in the field. Previous methods for VSA deception detection have focused on micro tremors in the muscles used for voice production. There is evidence which suggests that muscle control within the speech production system could be influenced by the stress experienced by the speaker. Nevertheless, it is still uncertain to what degree and how this change in speech muscle control manifests itself as micro tremors.

In this report, previous studies on speech under stress and deception detection have been considered, and our own results have been obtained from evaluations and experiments using the features pitch and jitter. All of these findings suggest that when speakers are telling a lie, they may come under stress and their voice characteristics change. Changes in pitch vary from speaker to speaker; in other words, they are strongly speaker dependent. Changes in jitter, on the other hand, are small and are not influenced by the individual speaker. However, as is the case with speaker control of pitch, a variety of factors could influence the presence or absence of the micro tremor. It is claimed that humans can control their muscles during speech production [10]. Thus, it seems unlikely that the measurement of micro tremor (jitter) on which the CVSA is based could be successful in deception detection. Nevertheless, it is not impossible that under extreme levels of stress the speaker's muscle control will be affected, so that such speech can be detected as deceptive or as carrying another emotion. In this project, both the speaker-dependent and the speaker-independent results show that jitter is not the proper choice.

To conclude the project, a deceptive/non-deceptive speech database was built by asking subjects to take part in a card game. For the sake of statistical significance, more than 400 speech samples were included, of which around 300 were finally selected for use in the experiment. Two features, pitch and jitter, were extracted from the database. Bayesian hypothesis testing was then employed to classify deceptive and non-deceptive samples statistically. The final experimental results indicate that pitch is a feature that can be used in deception detection, although only speaker-dependent detection is effective. Jitter is not statistically significant in either speaker-dependent or speaker-independent detection.

Although these results suggest that fundamental frequency can detect deception, this still needs further proof. Deceptive stress data play a vitally important role in the detection and classification of deceptive stressed voices. However, a prize of only 50 pounds may not make the subjects feel much jeopardy. The video tapes show that some subjects looked relaxed in the earlier rounds of the game, which may directly affect the final detection results. If a new method could be employed to make subjects feel more jeopardy when they are telling a lie, the results would be more exact and convincing.

Moreover, deceptive stress versus emotional stress or physical stress was not tested in this project. This is also a very important issue in deception detection. Many liars are under an extreme amount of stress when being interrogated. Do these VSA systems actually differentiate between those types of stress? This question remains to be answered in a future extension of the project.


References

[1] BBC News (2003). Lie detectors ‘cut car claims’. Available: http://news.bbc.co.uk/1/hi/uk/3227849.stm

[2] Love Detector. Nemesysco Ltd. Available: http://love-detector.com/

[3] Darren Haddad, Sharon Walter, Roy Ratley, Megan Smith (2002). Investigation and Evaluation of Voice Stress Analysis Technology. The U.S. Department of Justice report (98-LB-VX-A013).

[4] F. Botti, A. Alexander, A. Drygajlo (2004). On compensation of mismatched recording conditions in the Bayesian approach for forensic automatic speaker recognition. Forensic Science International, 146S, S101–S106.

[5] Iain R. Murray, Chris Baber, Allan South (1996). Towards a definition and working model of stress and its effects on speech. Elsevier Science Publishers B. V., Volume 20, Issue 1-2, pp. 3-12.

[6] Harry Hollien, Laura Geison, James W. Hicks, Jr. (1987). Voice stress evaluators and lie detection. Journal of Forensic Science, JFSCA, Vol. 32, No. 2, pp. 405-418.

[7] M. H. Beers and Robert Berkow (1999). The Merck Manual of Diagnosis and Therapy, 17th Edition, John Wiley & Sons.

[8] D. Haddad, et al. (2002). Investigation and Evaluation of Voice Stress Analysis Technology. Final Report for National Institute of Justice, Interagency Agreement 98-LB-R-013. Washington, DC. NCJRS, NCJ 193832.

[9] Clifford S. Hopkins, Daniel S. Benincasa, Roy J. Ratley, John J. Grieco (2005). Evaluation of voice stress analysis technology. Hawaii International Conference on System Sciences (IEEE).

[10] Scherer K.R., Oshinsky J.S. (1977). Cue Utilization in Emotion Attribution from Auditory Stimuli. Motivation and Emotion, Vol. 1, pp. 331-346.

[11] Lina Zhou, Azene Zenebe (2005). Modeling and Handling Uncertainty in Deception Detection. Proceedings of the 38th Hawaii International Conference on System Sciences, 0-7695-2268-8/05/$20.00 (C) 2005 IEEE.

[12] B. M. DePaulo, J. T. Stone, and G. D. Lassiter (1985). Deceiving and detecting deceit. In The Self and Social Life, B. R. Schlenker, Ed. New York: McGraw-Hill, pp. 323-370.

[13] A real case solved by CVSA (2000/2001). National Institute for Truth Verification, Journal of continuing education.

[14] Cestaro, V.L. (1996). A comparison between decision accuracy rates obtained using the polygraph instrument and the Computer Voice Stress Analyzer (CVSA) in the absence of jeopardy. Polygraph, 25(2), 117-127.

[15] Horvath, Frank (2002). Detecting deception: The promise and the reality of voice stress analysis. National Criminal Justice Institute. NCJ Number: 196936. Polygraph Journal, Volume 31, Issue 2.

[16] The CVSA vs Polygraph war-Who’s winning? (2000/2001). National Institute for Truth Verification. Journal of continuing education, p. 12.


[17] L. Zhou and D. Twitchell. An Exploratory Study into Deception Detection in Text-based Computer-Mediated Communication. Presented at Proceedings of Thirty-sixth Hawaii International Conference on Systems Sciences (HICSS'03), Hilton Waikoloa Village, Island of Hawaii (Big Island), 2002.

[18] B. Walsh (2004). Introduction to Bayesian Analysis. Lecture Notes for EEB 581.

[19] Horvath, F. (1980). Detecting Deception: The Promise and Reality of Voice Stress Analysis. Journal of Forensic Science, Vol. 27, No. 1, Jan. 1980, pp. 340-351.

[20] Heisse, J. W. (1976). Audio Stress Analysis: A Validation and Reliability Study of the Psychological Stress Evaluator (PSE). In Proceedings of the Carnahan Conference on Crime Countermeasures, Lexington, KY, pp. 5-18.

[21] Duffy, Dean G. (2003). Advanced engineering mathematics with MATLAB, 2nd ed. Boca Raton, Fla.: Chapman & Hall/CRC.

[22] Darren Haddad, Sharon Walter, Roy Ratley, Megan Smith (2002). Investigation and Evaluation of Voice Stress Analysis Technology. The U.S. Department of Justice report (98-LB-VX-A013).

[23] Paul Boersma (1993). Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound. Proceedings 17, pp. 97-110.

[24] P. Lieberman and S. B. Michaels (1962). Some aspects of fundamental frequency and envelope amplitude as related to the emotional content of speech. J Acoust Soc Am, 32(7):922–927.

[25] C. E. Williams and K. N. Stevens (1972). Emotions and speech: some acoustical correlates. J Acoust Soc Am, 52(4):1238–1250.


Appendices

Appendix I:

Deception Encouraged Card Game Instruction

Players: Two players only.

Object of the game: Try to get rid of all your cards, lying to the opponent where necessary. The player who first runs out of cards is the winner.

Equipment: A deck of 52 playing cards.

Definitions:

1. PAIRS, TREBLES or FULLS: A pair is made of two cards with the same number showing on their face. A treble is made of three cards with the same number showing on their face. A full is made of four cards with the same number showing on their face. Example: Player A has a pair of 3s, that is, two "3" cards. The cards can be in any combination of suits (hearts, diamonds, clubs and spades). (Fig 1)

Fig 1

2. RUNS: A run is made of four or more cards numbered in order. Example: Player B can play a run of four, which could be "3", "4", "5" and "6". The cards can be in any combination of suits. (Fig 2)

Fig 2

3. ALL ONE SUIT: The cards are all of the same suit. Example: Player A may play three cards of one suit, which could be three hearts or three spades, etc. (Fig 3)


Fig 3

4. DEAD CARDS: If a player believes what his/her opponent said about the played cards, the cards are pushed aside and kept face down until the end of the game. These cards are called "Dead Cards".

Play: One player is chosen to be the dealer at the very beginning of the game. The dealer shuffles the deck and deals out all the cards one by one, alternating between the two players. All the dealt cards are kept face down in front of each player. After the deal, each player should hold 26 cards (half of the deck). The game starts with the player who did not deal this turn. The starter can play any single card or any combination of cards (pairs, one suit, runs) at one time. The played card(s) should be placed face down between the two players so that the other player cannot see them. To continue the game, the player who played the cards then has two options for telling the opponent about the played cards, which is the main point of the game:

1. Tell a lie: the player can decide to lie about what he/she has played and wait for the opponent's reply. Example: Player A has played a run, "clubs 3, diamonds 4, hearts 5 and spades 6", and then says: "two pairs, two 3s and two 6s". Player A is telling a lie, because he actually played a run. (Fig 4)


Fig 4

2. Tell the truth: the player can also decide to tell the truth about what he/she has played and wait for the opponent's reply. Example: Player A has played a run, "clubs 3, diamonds 4, hearts 5 and spades 6", and then says: "a run: clubs 3, diamonds 4, hearts 5 and spades 6". Player A is telling the truth. (Fig 5)

Fig 5

Now it is the other player's turn, and this player should reply to what the previous player said. This player also has two options:

1. Believe what the other player said: the cards are face down, so he/she cannot be sure whether or not the previous player told the truth, but he/she can accept what the other player said. If he/she believes the other player's words, the played cards are pushed aside (without being turned over) and become dead cards. The game continues and the players swap roles in the next turn. Example: Player A plays four cards face down and says: "four 8s". Player B believes it and says: "I believe." These cards then become dead cards. (Fig 6)

[Diagram: Player B believes the claim; the unseen cards are pushed aside and become dead cards.]

Fig 6

2. Do not believe what the other player said: the player does not believe what the other player said. He/she should first say: "I do not believe it", and then turn over the cards the previous player played. If he/she is right (the previous player told a lie), all these cards are returned to the previous player who played them. The game continues and the players swap roles in the next turn. Example: Player A played a run, clubs 3, diamonds 4, hearts 5 and spades 6, and said: "two pairs, two 7s and two Jacks". Player B says: "I do not believe it." Player B then turns over these cards and finds he/she is right (Player A told a lie). All four cards are returned to Player A. (Fig 7)


Fig 7

On the other hand, if he/she is wrong (the previous player told the truth), he/she must collect all the cards as a punishment. The game continues and the players swap roles in the next turn. Example: Player A played a run, clubs 3, diamonds 4, hearts 5 and spades 6, and said: "a run: clubs 3, diamonds 4, hearts 5 and spades 6". Player B says: "I do not believe it." Player B then turns over these cards and finds he/she is wrong (Player A told the truth). All four cards now belong to Player B. (Fig 8)

Fig 8

[Diagrams for Fig 7 and Fig 8: Player B does not believe the claim and turns the cards over to check. If Player A lied, the cards are returned to Player A; if Player A told the truth, Player B collects them.]


The Winner: After several rounds, the player who first runs out of cards is declared the winner.

Additional Rules:

1. Rule One: There is no specific restriction on how many "truth" or "lie" statements a player can make. However, telling the truth is always safe for a player. Therefore, if a player is found by his/her opponent to have made three "truth" statements in a row, that player is declared the loser.

2. Rule Two: A player can play at most four cards at one time (e.g. a four-card run, two pairs, four cards all of one suit, a full, or one treble plus a single card).
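The outcome of a single turn and Rule One can be summarised in a short Python sketch. It is an illustration of the rules as written above, not software used in the experiment; the function names and return strings are hypothetical, and the Rule One check assumes that every statement in the list has been revealed to the opponent.

```python
def resolve_turn(claim_is_true, opponent_believes):
    """Return what happens to the played cards after the opponent replies.

    claim_is_true: whether the player described the played cards truthfully.
    opponent_believes: whether the opponent accepts the claim without checking.
    """
    if opponent_believes:
        return "dead cards"          # pushed aside face down, never checked
    if claim_is_true:
        return "opponent collects"   # wrong challenge: the challenger takes the cards
    return "returned to player"      # lie exposed: the player takes the cards back


def breaks_truth_rule(statements, limit=3):
    """True if the last `limit` statements were all truthful (Rule One)."""
    recent = statements[-limit:]
    return len(recent) == limit and all(recent)


print(resolve_turn(claim_is_true=False, opponent_believes=False))
print(breaks_truth_rule([False, True, True, True]))
```

Because a believed claim is never checked, a lie only costs the player anything when the opponent challenges it, which is what gives players an incentive to deceive.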


Appendix II

Extracted Pitch Value for Non-deceptive Voices

128.09827993605720 118.82025793920715 118.36345753410458 126.04486108321123 117.06480864262316

117.13998195132154 105.66292535725498 115.68389095397200 109.33741730417672 105.15556372785949

150.04536144545861 255.31140725308424 156.78170011673640 169.35161309890498 148.25932742057225

142.86173347436414 146.53924893328642 154.10036184950832 145.76864351762782 149.37900155994689

144.50009412585621 158.98384365354167 169.97785652395112 147.56718099257364 129.49389137693618

110.24915254577955 129.18784741144296 147.33263100060381 141.35354627864024 206.23229278092586

133.64154616480025 276.36196791510992 111.57168564820746 137.86079830816632 132.57624522583771

129.65026539027838 136.59200922969990 172.15426918317317 171.79901117138439 232.24379352890190

215.86441616436517 171.39538458657662 180.71345179660156 139.90380839668063 124.52821135963218

168.93623659826687 137.99929399146325 131.29455828762588 136.49161314639349 159.18804011290021

150.92715760272989 165.07465687451304 166.39340054878008 155.23337094912267 147.76767983726722

159.52205920227138 158.62420998673900 188.65636427267464 129.15277953595611 156.13994085507198

165.17514287451897 173.77921079189528 160.71899642964965 149.66810766682880 140.42903127078421

166.33624948709746 152.66365653548090 146.57943541732669 131.69853295578480 230.19451372455995

132.17670667424915 143.62085955132949 124.68025369322488 154.95748178591492 110.93523056849458

144.22565907554747 246.93732652512671 143.30992712717969 111.03997338537835 192.63990110493469

138.39900176351705 147.56551596852430 127.93780613631078 135.53235283480180 136.83658019038202

235.70341111617813 145.19943479975652 232.64263947070015 192.82350594402186 163.49940460050223

134.59344263040094 133.41619177397226 169.72618359199282 135.06641221888634 136.37963682771502

159.82757729270855 177.38543620053443 154.36054099419312 134.06368796956923 187.44804330812937

161.66550443364807 176.20352613953662 154.44021937912285 214.03150258081772 210.49848788422057

234.15691046288850 158.82834200924168 147.84238346937889 203.26907917657274 203.86242940932391

144.01341366004061 147.97811161492586 150.20780094618650 168.05900440332883 142.44696166958710

153.07557529906299 143.26950342593602 152.49401810875685 160.68975550390096 149.51834463462168

154.64171066098103 142.30217360153537 152.66188879746358 142.29638492580909 276.16774547122981

143.82168475796416 149.05317469273871 154.49670157616043 147.38263675721618 155.90874982926587

155.95329992143817 194.45540385772293 166.98404220833342 168.06814297669183 176.73745374670730

169.66969782315215 169.22049871195520 169.81134241654007 172.49729472532968 169.56145039213197

174.05653148285978 127.18017143941134 165.89361525079906 168.43870852101111 164.83404983138502

172.85521914126579 159.41779667422267 144.19842785221761 152.08838197240283 163.89624869782784

162.12925603249772 156.58683004013005 146.77687026774703 168.25898161978225 172.57136022174495

253.12954270971699 288.40518496241191 163.27280664670809 175.35373201953286 274.07471223119251

157.36048582514746 142.78829363210079 142.58841876734593 158.01683996266459 152.74023433740319

156.23242175736991 145.97223515020866 149.45978941244695 147.41748428539216 147.39786684603484

145.89537404126824 152.61728590075884 149.00051552350504 139.18524708565900 141.61260098865574

148.87491500561043 147.99073384668085 152.40524751151818 118.04664667543052

Appendix III

Extracted Pitch Values for Deceptive Voices

124.51886994534780 146.24090465736722 157.81670104588369 154.04687346328154 172.36876031702894

150.99806283176522 157.06170664309914 157.66544784172950 150.04937337681778 144.41076543052017

120.71132481070872 121.42176146497435 110.97021358944944 116.96530277454313 119.07202517693973

117.86311554169824 243.90242788673416 122.52324246557809 151.70634898133719 138.10288558900726

157.44946484640840 144.76555332211527 135.62892522726466 142.11053676946011 149.11839957962710

142.03154223032970 138.38796825488447 161.21138482070810 134.42226550991509 138.11588600288869

171.31695196260807 134.31764016520520 141.31366150787162 140.00139060449610 275.05592995383421

130.55120050686321 130.64904100050001 127.22010802561051 136.40754863771801 276.07973660942366

137.23107483383160 136.18832682317137 211.54877017320086 127.78076647741396 265.25610095205246

156.76639276358588 137.93235282244638 134.76505446380841 192.79432083263626 124.10787442109483

193.72614620202114 154.54009110081000 207.82745725488041 178.23832458039163 150.64656058708204

148.03893752699372 162.47236008998126 233.43448607574300 154.65694486894259 154.49786707029176

123.99455145292255 154.95219686243817 149.97669313049664 160.97713541399395 170.00925476167853

142.71562289534097 140.59999959168309 147.04056224300527 137.45871276777660 120.61702748519949

134.38373989446734 145.56350503981994 168.30647690379360 157.56355366813693 127.58719122046762

148.88599288109222 121.20501646675467 127.43184218402409 229.32819366711826 167.53902374301319

291.54108439040419 137.52103851350714 139.65262217676133 148.29164967200430 139.38056242542069

129.33270345605891 136.56326949104275 131.23725289211379 133.05885868787840 139.02912475240800

140.00837424935975 129.47776803724292 140.53997292194691 146.59949269582839 153.34185125364732

239.48501033579544 157.33301397792431 148.01768800166340 151.79600065085273 179.22235511254866

137.10214728613400 144.91610273233999 120.65512730497741 188.58573050528221 145.74208591184248

256.24022656184877 186.73240054806286 173.21878648943715 154.68997592080723 144.80844376060335

155.45648872052678 154.86385462221804 159.30148505853032 317.98335147130496 154.77884650063848

157.32013197359470 147.15599540238091 163.90038159769594 162.71116430235551 167.36147935396204

163.11755391142449 129.37808720319148 172.98931583295743 133.89767210312334 174.90859670556921

173.98374801918419 169.04147033351498 273.30286501317926 169.26730074366370 253.24967509388125

160.94977303197444 154.60282098175844 153.22047488192914 175.66924128454127 262.70927786573270

178.43526636221955 156.82221242970448 134.29123823832950 131.08867000560116 156.08179737113340

136.94983603806369 127.57352531133353 140.49179540302163

Appendix IV

Extracted Jitter Values for Non-deceptive Voices

0.01239462825660 0.01000702722166 0.01266375966048 0.01334736925665 0.00779339198290

0.01216485576507 0.00862246377159 0.01115835097402 0.00926547963643 0.01234321087564

0.01854170865751 0.01077473187173 0.01116209080919 0.00805527653438 0.01307082511835

0.01099453683372 0.01146491653180 0.01098675666646 0.01207414039318 0.01192982444861

0.00946559084699 0.00649010653825 0.00972001190024 0.01426666640101 0.01836401277029

0.00847673641362 0.01236163337366 0.01356306930215 0.01340673901064 0.00968561846436

0.01453598572587 0.01658757460967 0.00787552980439 0.01400378328405 0.01707194653935

0.00894960363954 0.00904925744897 0.00661962555991 0.01148100178208 0.01186491832382

0.00976638593004 0.00447661067584 0.00760291267184 0.01547749091230 0.00888190653264

0.01163734519193 0.01048881878832 0.01156198067217 0.00696421863224 0.01353430082863

0.01643569970913 0.01428364215000 0.01070750094086 0.01558903349567 0.01794358655044

0.01506883359076 0.00672267826889 0.00935224810751 0.01171864812113 0.01545544822086

0.01194542638103 0.00373619849872 0.00591843525006 0.01663062732701 0.01319543019752

0.01349456992223 0.01744700882721 0.02000336958128 0.01094204116342 0.01094861864912

0.01237936813375 0.01564580079921 0.01421575802084 0.01137662314152 0.00716387136147

0.01656314291732 0.00966301834003 0.01815214661982 0.01006691622868 0.00948686942559

0.01979025642212 0.02127332533199 0.01702133649171 0.01249043674660 0.01233634504138

0.00318277917993 0.00498592455215 0.01200422723932 0.00334260373099 0.01971377779441

0.01381102608594 0.01320726653873 0.02256883161386 0.01834798520344 0.02051119281917

0.01418427440875 0.01143075227049 0.00971964976656 0.02531380311633 0.01543568267123

0.01344476247792 0.02132094393134 0.01091202607353 0.02180197710025 0.01754875600208

0.01663963619570 0.00982515301287 0.01246802148900 0.00717772682266 0.01791612533761

0.01500900151985 0.01625351847804 0.02243002225291 0.01236120300799 0.01452907736980

0.01531448639875 0.01147669689308 0.00821293311644 0.02277641199108 0.01163758841771

0.01347253976443 0.01750291424368 0.01113541088445 0.01204087490573 0.01397748475953

0.02167114142947 0.01818827920535 0.01587154872317 0.01625255407196 0.01647694010607

0.02164266901926 0.01079042780861 0.01435833396333 0.01206218822395 0.01188074381608

0.02790210351452 0.01749659029427 0.01461653507471 0.01091510350809 0.01518714065254

0.01489219874102 0.01426523492847 0.01885546071331 0.00873523815840 0.00664892529305

0.00960550576217 0.00875029965932 0.01007219363494 0.00879606962362 0.01134471254695

0.01244777693575 0.01370578499724 0.01386760295615 0.01034275341499 0.01206361469831

0.01209047720719 0.01382408694175 0.01017342288595 0.00809040295530 0.01366229162381

0.01133127881525 0.01203291025330 0.00904987541104 0.01183661082119 0.01327476241290

0.01480829006322 0.01565647769769 0.01829048286172 0.01562692686423 0.00937787898377

0.01223393659934 0.01301762551055 0.01162886717988 0.01317842507893 0.02054140079680

0.01867361089426 0.01611153016019 0.00907169019984 0.01827292361060 0.00859657411659

0.01357300176124 0.00761712110582 0.01242236790967 0.00933599913791

Appendix V

Extracted Jitter Values for Deceptive Voices

0.01492581246668 0.01262286447068 0.01874067855117 0.01537220717245 0.01118101775695

0.00984389283145 0.01369850325308 0.01261734141761 0.00845970793719 0.00831335092537

0.01175373319154 0.01482756999096 0.01216468357403 0.00868064751174 0.00708471889028

0.01170249719809 0.01189765451018 0.01159251459575 0.02269994823543 0.01158084701383

0.01679892557882 0.00777683089468 0.01190577057462 0.01592793546229 0.00691971108044

0.01510127659945 0.00828882182444 0.00663099325906 0.01513987449389 0.01463832753507

0.00834784454056 0.01577657075003 0.01092470096475 0.00934116614735 0.01328239729091

0.01379144929662 0.01279489064647 0.01366326954624 0.01168495800325 0.01352830070041

0.01305397703912 0.01435992377204 0.01072889450297 0.00655978366228 0.01469473756940

0.01903410377829 0.01793129476423 0.01259658789575 0.01154625107952 0.01076186884375

0.01223102171688 0.01192524802549 0.00725580041376 0.00648324094082 0.01689405000316

0.01472635751643 0.01464900923126 0.01634978975942 0.01269614783643 0.01098368353274

0.00751458841628 0.01311443338811 0.00960973369465 0.01146371132469 0.01251060668573

0.01444496380038 0.00935302198365 0.01536675588171 0.01946558264594 0.01631779013586

0.01158455965225 0.02238431387682 0.01640408767927 0.01394289034357 0.01872830083158

0.01764682331536 0.01471534368684 0.01940597137752 0.00692366545307 0.01663961661292

0.01498901446446 0.02053933554115 0.01242754133274 0.01344964774929 0.01821179746597

0.01388282012566 0.01297329536826 0.01468964356737 0.01018880190150 0.01896503856515

0.02452619860987 0.01155871149902 0.01052989364111 0.01773334877610 0.01594558491363

0.01363078062008 0.01615266002204 0.01622072641366 0.01257043830390 0.00677534255393

0.01476277032853 0.01802621379240 0.02067115197647 0.01321431092560 0.01776134054226

0.01697177362866 0.01231212634696 0.00814677651027 0.01796597318077 0.00840377774634

0.02284619564532 0.01359245043438 0.02424875030457 0.01034309957330 0.01504583358920

0.01541582705262 0.01608652609875 0.01605916963648 0.02114266980824 0.01251155799993

0.01560761282559 0.01672278531928 0.01387941217318 0.01711050431144 0.01582167540874

0.01561153247806 0.01756047727338 0.01558871622650 0.01544089128958 0.02089663240153

0.01391139866312 0.01489339101058 0.01046141747288 0.01044023304028 0.01806792522159

0.01046439056603 0.01619849360436 0.01939510199233 0.01194054681956 0.01362025187616

0.01554442924674 0.01450886162700 0.00796269328134
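The values in these appendices are whitespace-separated decimals, so they can be loaded and summarised with a few lines of code. The following is a minimal sketch in Python, not the procedure used in this dissertation. The function names `parse_values` and `summarize` are hypothetical, and only the first row of each jitter listing is pasted in as sample input, so the printed figures describe those five values and not the full data sets.

```python
from statistics import mean, stdev


def parse_values(block):
    """Parse whitespace-separated decimals; blank lines are ignored."""
    return [float(token) for token in block.split()]


def summarize(values):
    """Return the count, mean and sample standard deviation of the values."""
    return {"n": len(values), "mean": mean(values), "stdev": stdev(values)}


# Sample input only: the first row of the non-deceptive jitter listing.
non_deceptive_jitter = parse_values("""
0.01239462825660 0.01000702722166 0.01266375966048 0.01334736925665 0.00779339198290
""")

# Sample input only: the first row of the deceptive jitter listing.
deceptive_jitter = parse_values("""
0.01492581246668 0.01262286447068 0.01874067855117 0.01537220717245 0.01118101775695
""")

non_deceptive_summary = summarize(non_deceptive_jitter)
deceptive_summary = summarize(deceptive_jitter)

print("non-deceptive:", non_deceptive_summary)
print("deceptive:", deceptive_summary)
```

To summarise a complete listing, paste all of its rows into the corresponding string; the same functions apply unchanged to the pitch listings.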