Accurate and Privacy Preserving Cough Sensing from a Low Cost Microphone

cough

UbiComp Lab, Design Use Build, University of Washington

Laboratory of Ubiquitous Computing, University of Washington

Seattle Children’s Hospital

Accurate and Privacy Preserving Cough Sensing using a Low-cost Microphone

Eric Larson, TienJui Lee, Sean Liu, Margaret Rosenfeld, Shwetak Patel

most common symptom in the world

40% of people have or will experience chronic cough

cough

fear of illness

loss of appetite

loss of sleep

hurts to breathe

no motivation to leave home

and life

change in lifestyle, depression

self conscious

broken rib

cough

lung cancer

common cold

GERD

COPD

pneumonia

tuberculosis

transplant

chemotherapy

treatment and diagnosis of:

pulmonary embolism

post nasal drip

asthma

cystic fibrosis

heart failure

croup

bronchitis

vocal cord palsy

smoking

allergies

obstruction

chronic cough

... and more

infection

ACE inhibitors

psychological

cough used as tool

our contributions in cough sensing

1. accurate cough detection

2. generalizes across subjects

3. reconstructable cough audio

4. privacy of speech

5. leverages existing mobile phone

history of cough sensing

Woolf & Rosenberg


-1994: asthma

-1997: cold

-2002: cystic fibrosis

-2003: pneumonia

-2004: COPD

-2008: GERD

-2009: transplants


-custom software

-new sensors: Holter, throat mic


-classification on sensor streams

-LifeShirt

-VitaloJAK

-EMG


-2006: HACC (semi)

-2008: LCM (semi)

New guidelines: *

-unobtrusive -mobile -private -processing -24 hour -specific -automatic

*Decalmer et al. 2007; Morice et al. 2008; Dicpinigaitis 2011

audio in

classify sound

is cough?

save audio

yes

no

human listener

is cough?

save to database

yes

saved audio

user examples

cough database

no

existing audio cough sensing

70-85% true positive, 2-3% false positive

0-0.5% false positive

~3-5 min/hr


privacy

calibration

review time

accuracy

audio in

classify sound

is cough?

save features

yes

no

human listener

is cough?

save to database

yes

saved features

cough features database

no

transformation

reconstruction

more private

generalizes

more accurate

our approach


data collection

coughing? come to office

go back to daily routine for 3-7 hours

pay attention to your cough frequency

come back and self-report cough frequency

data annotation

one-week pilot to set up guidelines and a shared wiki

4 weeks, 6 linguistics students

annotate each sound type: cough, speech, laughter, breath, sneeze, wind, sniff, noise, throat-clearing, others’ cough


17 participants

sound type        events   duration
cough             2558     12.2 min
speech            5404     15.8 hr
laughter           819     14 min
breathing          522     11.2 min
throat clearing   1210     10.2 min
sneezing            53     35 sec
noise             7296     28.5 hr
sniffing          1289     9.2 min
bystander cough    901     5.7 min
total                      72 hrs

difference from self-report for 17 participants: 6-139 coughs/hr, 0.02 corr


transformation

[spectrograms, frequency vs. time: sniff, noise, throat-clearing, cough, speech, speech + noise]

• five-step process:

• initial deep inspiration

• glottal closure

• contraction of muscles against the glottis

• sudden glottis opening, explosive expiration

• wheeze or “personal” sound

transformation: principal components analysis

[figure: spectrograms, frequency (kHz) vs. time (s), decomposed into component 1, component 2, component 3, component 4, ..., component N, each with a component weight]


classify sound: data folding

P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15 P16 P17

[figure: participants P1-P17 are shuffled into folds; each fold is labeled either training group or test]

classify sound

PCA

random forest

event extraction

decision

for each group of training folds:

algorithm

10 weights
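The data folding step can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes folds are formed by participant, so that no person's audio appears in both the training group and the test group, and it assumes five folds. The function names `participant_folds` and `train_test_splits` and the fixed random seed are hypothetical.

```python
import random


def participant_folds(participants, n_folds, seed=0):
    """Shuffle participants and deal them round-robin into n_folds groups."""
    shuffled = list(participants)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def train_test_splits(folds):
    """Yield (training participants, test participants), holding out each fold once."""
    for held_out, test in enumerate(folds):
        train = [p for i, fold in enumerate(folds) if i != held_out for p in fold]
        yield train, test


participants = [f"P{i}" for i in range(1, 18)]       # P1..P17, as on the slide
folds = participant_folds(participants, n_folds=5)   # five folds is an assumption
splits = list(train_test_splits(folds))

for train, test in splits:
    print("test:", test, "| training size:", len(train))
```

Splitting by participant rather than by sound event is what lets a result speak to generalization across subjects: the classifier is always scored on people it has never heard.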

classify sound: results

[figure: true positive rate (%) vs. false positive rate (%) curves for fold 1 through fold 5]

[figure: bar charts for folds A-E and the mean, values in extracted order. True positive rate: 92%, 99%, 96%, 91%, 89%, 85%. False positive rate: 0.5%, 0.8%, 0.5%, 0.3%, 0.6%, 0.3%. False alarms per hour: 17, 12, 29, 17, 20, 7.]
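The three quantities reported for the classifier (true positive rate, false positive rate, false alarms per hour) can be computed from event counts as in the sketch below. The function name `detection_rates` and the counts in the example are hypothetical, and the sketch assumes rates are computed per candidate sound event, which the slides do not state.

```python
def detection_rates(true_pos, false_neg, false_pos, true_neg, hours):
    """Return (true positive rate %, false positive rate %, false alarms per hour)."""
    tpr = 100.0 * true_pos / (true_pos + false_neg)    # share of real coughs detected
    fpr = 100.0 * false_pos / (false_pos + true_neg)   # share of non-coughs flagged
    alarms_per_hour = false_pos / hours                # absolute false alarm load
    return tpr, fpr, alarms_per_hour


# Hypothetical counts for a 2-hour recording, for illustration only.
tpr, fpr, alarms = detection_rates(
    true_pos=45, false_neg=5, false_pos=12, true_neg=2388, hours=2
)
print(tpr, fpr, alarms)
```

Because non-cough sounds vastly outnumber coughs in day-long recordings, a false positive rate well under 1% can still translate into several false alarms every hour, which is why both figures are worth reporting.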


reconstruction: experimental design

is speech intelligible?

-8 original segments of speech

-four male, four female

-5, 10, 15, 25, 50 components

-48 audio segments

-4 listeners per segment

-task: play audio, enter text

is cough high fidelity?

-12 original cough recordings

-six male, six female

-5, 10, 15, 25, 50 components

-72 audio segments

-13 listeners per segment

-810 subjective ratings

-task: play audio, answer “how do these cough sounds compare?”: same cough sound, very similar, somewhat similar, somewhat different, very different

reconstruction: experimental results

[figure: word error rate (%) and similarity (z-score) vs. number of components (5, 10, 15, 25, 50, baseline); error bars = interquartile range]
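Word error rate, the intelligibility measure in the speech experiment, is conventionally the word-level edit distance between what was said and what a listener typed, divided by the number of words said. The sketch below shows that standard definition; it is an assumption that the study scored transcripts this way, and the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance as a percentage of the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing from transcript
                            curr[j - 1] + 1,      # extra word in transcript
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 100.0 * prev[-1] / len(ref)


print(word_error_rate("please call me back", "please tall me"))
```

For a privacy-preserving representation the desired outcome is inverted from speech recognition: a high word error rate on reconstructed speech means listeners could not recover what was said.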


contributions

1. accurate cough detection

2. generalizes across subjects

3. reconstructable cough audio

4. privacy of speech

5. leverages existing mobile phone

-unobtrusive -mobile -private -processing -24 hour -specific -automatic

future work (FW)

FW: battery life to 24 hours

FW: fidelity of processed features

Eric Larson

@ericcooplarson

ubicomplab.cs.washington.edu

#privacycough

reconstruction

[figure: component weights (by component index) plus phase are reconstructed into a spectrogram (frequency axis)]

predictive cepstral coefficients (LPCC). They applied a neural network classifier and achieved an 80% (55-100%) true positive rate and a 4% (2-8%) false positive rate. However, they recorded audio in an outpatient clinic, a relatively controlled and low-noise environment, for only one hour per person.

Similarly, Matos et al. created a system called the Leicester Cough Monitor (LCM) [26], which uses a lapel microphone with a portable audio recorder. They used MFCCs (with derivatives) as features for a Hidden Markov Model (HMM). Their average true positive rate was 71% (50-99%), with a false alarm rate of 13 cough events per hour (false positive rate not reported). After applying an energy threshold to discard low-intensity coughs, the average true positive rate for LCM could be boosted to 82% and false alarms reduced to 2.5 events per hour. However, the trade-off was to discard on average 29% (6-72%) of the cough events for each subject, and the energy thresholds had to be computed per individual. Recently, LCM has reported a true positive rate of 91% and a false positive rate of ~1% [4]. However, this has drawn criticism from the medical community [28], who point out that the most recent publications do not state whether the true positives are reported with or without the energy threshold, and that the system is only evaluated on a small subset of the audio data. They also point out that, to get such a low false positive rate, the system requires hired annotators to listen to the low-confidence coughs, and the annotators must provide a portion of hand-segmented cough examples in order to prime the algorithm. As such, the system is more accurately described as semi-automated.

Our system uses principal components analysis (PCA) and a random forest classifier. It has comparable accuracy to existing detection algorithms (92% mean true positive rate), but does not require any annotation in order to prime or retrain the models. We note that a direct comparison between our approach and HACC or LCM is impossible. Many of these systems treat their algorithms as proprietary, so there is limited information on many of the actual details. Instead, we must compare algorithms on their published accuracies, albeit on different datasets. Table 2 summarizes and compares the classification rates of ambulatory cough detection algorithms.

Audio Privacy

Prior work in audio privacy has largely dealt with hiding certain cues about the speakers and the conversations around them so that a machine learning algorithm cannot reconstruct valuable information from the feature sets. It is generally accepted that MFCCs are poor features for maintaining privacy, as they reveal not only speech but also inflection and prosody [46]. As such, Wyatt et al. have devised audio features that hide speech intelligibility while still providing cues for prosody and recognition of conversations [46]. Most of the related audio privacy work attempts to preserve certain quantities while providing poor features for modern speech recognizers [32]. Chen et al., on the other hand, use linear prediction to replace vowels in speech, while keeping environmental noises such as cars and running water intelligible to subjects [9]. Our work in this paper, similar to [9], attempts to make speech unintelligible, but also to make it possible to reconstruct cough sounds. Our methodologies, however, are quite different.

Eigenvector Feature Selection

The most common application of eigenvectors in machine learning is principal components analysis (PCA). PCA uses orthogonal components (i.e., eigenvectors) of a particular feature space to reduce dimensionality. Components can be sorted by their corresponding eigenvalue, which ranks the components by how much variation in the data they explain. Traditional PCA is limited by the assumptions that the optimal transformation of the feature space is linear and orthogonal, which is not true in general. Even so, PCA has been successfully applied in many domains, the best known of which are face recognition (i.e., Eigenfaces [42]) and dimensionality reduction for gene mapping [23]. The use of PCA on audio spectrograms is not new. Pinkowski successfully used PCA to develop a model of the spectrogram for different English vowel sounds [35]. Our work also uses PCA on spectrograms, except that our model is made for coughing sounds.
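To make the eigenvector view of PCA concrete, the sketch below finds the leading component of a tiny data set, projects each sample onto it to obtain a weight, and rebuilds the samples from the weights. It is a pure-Python illustration under stated assumptions, not the system described here: power iteration with deflation stands in for a library eigen-solver, the 2-D points are toy data rather than spectrogram frames, and all function names are hypothetical.

```python
import math
import random


def mean_center(rows):
    """Return the per-dimension mean and the mean-subtracted rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    return mean, [[r[k] - mean[k] for k in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def top_components(cov, k, iters=500, seed=0):
    """Return the k largest (eigenvalue, eigenvector) pairs, largest first."""
    d = len(cov)
    cov = [row[:] for row in cov]          # work on a copy; deflation modifies it
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        v = [rng.random() for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:               # no variance left in any direction
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                         for a in range(d))
        pairs.append((eigenvalue, v))
        for a in range(d):                 # deflate: remove the component just found
            for b in range(d):
                cov[a][b] -= eigenvalue * v[a] * v[b]
    return pairs


def project(centered_row, components):
    """Component weights: dot product of the sample with each eigenvector."""
    return [sum(x * c for x, c in zip(centered_row, vec)) for _, vec in components]


def reconstruct(weights, components, mean):
    """Rebuild a sample as the mean plus the weighted sum of eigenvectors."""
    return [mean[k] + sum(w * vec[k] for w, (_, vec) in zip(weights, components))
            for k in range(len(mean))]


rows = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0], [3.0, 7.0]]   # toy points on y = 2x + 1
mean, centered = mean_center(rows)
components = top_components(covariance(centered), k=1)
weights = [project(r, components) for r in centered]
rebuilt = [reconstruct(w, components, mean) for w in weights]
```

Because the toy points lie on a single line, one component captures all of the variance and the reconstruction is essentially exact; with real spectrograms, the number of retained components controls how much detail survives.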

PHYSIOLOGY OF COUGHING

This section provides background on the physiology behind the cough reflex and the generation of cough sounds. We also discuss how coughs manifest in an audio stream using spectrograms, motivating the design of our model.

Algorithm (Author) | Sensing | Subjects | Recording Environment | Automation | Initial Calibration? | Mean True Positive Rate | Mean False Positive Rate | Mean False Alarms / Hr
LifeShirt | Throat Mic. + sensor array | N=8 | Lab, 24 hours | Automatic | Yes | 78% | 0.4% | Not reported
VitaloJak | Piezo Sensor | N=10 | Lab, 24 hours | Automatic | Yes | 97.5% | 2.3% | Not reported
HACC | Lapel Mic. | N=15 | Clinic, 1 hour | Semi | Yes | 80% | 4% | Not reported
LCM (Matos) | Lapel Mic. | N=19 | In Wild, 6 hours | Semi | Yes | 71-82% | Not reported | 13
LCM (Birring) | Lapel Mic. | N=19 | In Wild, 2-6 hours | Semi | Yes | 91% * | <1% | 2.5ǂ
Our algorithm | Phone Mic. on necklace | N=17 | In Wild, 2-6 hours | Automatic | No | 92% | 0.5% | 17

Table 2. Summary of related work in audio-based cough detection. *It is not clear if these rates are reported with or without a 95% energy threshold. ǂThese rates are reported after review by an annotator.
