cough
UbiComp Lab (Design Use Build), University of Washington
Laboratory of Ubiquitous Computing, University of Washington
Seattle Children’s Hospital
Accurate and Privacy Preserving Cough Sensing using a Low-cost Microphone
Eric Larson, TienJui Lee, Sean Liu, Margaret Rosenfeld, Shwetak Patel
most common symptom in the world
40% of people have or will experience chronic cough
cough
fear of illness
loss of appetite
loss of sleep
hurts to breathe
no motivation to leave home
and life
change in lifestyle, depression
self conscious
broken rib
cough
lung cancer
common cold
GERD
COPD
pneumonia
tuberculosis
transplant
chemotherapy
treatment and diagnosis of:
pulmonary embolism
post nasal drip
asthma
cystic fibrosis
heart failure
croup
bronchitis
vocal cord palsy
smoking
allergies
obstruction
chronic cough
... and more
infection
ACE inhibitors
psychological
cough used as tool
our contributions in cough sensing
1. accurate cough detection
2. generalizes across subjects
3. reconstructable cough audio
4. privacy of speech
5. leverages existing mobile phone
history of cough sensing
Woolf & Rosenberg
-1994: asthma
-1997: cold
-2002: cystic fibrosis
-2003: pneumonia
-2004: COPD
-2008: GERD
-2009: transplants
-custom software
-new sensors: holter, throat mic
-classification on sensor streams
-lifeshirt
-vitaloJAK
-EMG
-2006: HACC (semi)
-2008: LCM (semi)
New guidelines: *
-unobtrusive
-mobile
-private
-processing
-24 hour
-specific
-automatic
*Decalmer et al. 2007, Morice et al. 2008, Dicpinigaitis 2011
audio in
classify sound
is cough?
save audio
yes
no
human listener
is cough?
save to database
yes
saved audio
user examples
cough database
no
existing audio cough sensing
70-85% true positive, 2-3% false positive
0-0.5% false positive
~3-5 min/hr
privacy
calibration
review time
accuracy
audio in
classify sound
is cough?
save features
yes
no
human listener
is cough?
save to database
yes
saved features
cough features database
no
transformation
reconstruction
more private
generalizes
more accurate
our approach
transformation
classify sound
cough features database
reconstruction
data collection
go back to daily routine for 3-7 hours
pay attention to your cough frequency
come back and self-report cough frequency
data annotation
one week pilot; set up guidelines and a shared wiki
4 weeks, 6 linguistics students
annotate each sound type:
cough, speech, laughter, breath, sneeze, wind, sniff, noise, throat-clearing, others’ cough
come to office
coughing?
cough features database
17 participants
sound type        events   duration
cough             2558     12.2 min
speech            5404     15.8 hr
laughter          819      14 min
breathing         522      11.2 min
throat clearing   1210     10.2 min
sneezing          53       35 sec
noise             7296     28.5 hr
sniffing          1289     9.2 min
bystander cough   901      5.7 min
total                      72 hrs
difference from self report for 17 participants: 6-139 coughs/hr, 0.02 corr
transformation
[figure: spectrograms (time vs. frequency) of sniff, noise, throat-clearing, cough, speech, and speech + noise]
• five-step process:
  • initial deep inspiration
  • glottal closure
  • contraction of muscles against the glottis
  • sudden glottis opening, explosive expiration
  • wheeze or “personal” sound
transformation: principal components analysis
[figure: spectrogram panels, Time (s) vs. Frequency (kHz), decomposed into component 1, component 2, component 3, component 4, ..., component N]
component weights
classify sound: data folding
participants P1-P17 are grouped into folds
each group of folds is split into training and test
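The folding step can be sketched as follows. This is a minimal illustration, assuming five folds (the results slides label fold 1 through fold 5) and a round-robin assignment of participants to folds; the slides do not say how participants were actually assigned, and the function names `subject_folds` and `train_test_splits` are hypothetical.

```python
def subject_folds(participants, n_folds):
    """Assign every participant to exactly one fold (round-robin, an assumption)."""
    folds = [[] for _ in range(n_folds)]
    for i, participant in enumerate(participants):
        folds[i % n_folds].append(participant)
    return folds


def train_test_splits(folds):
    """Yield (training participants, test participants), holding out one fold at a time."""
    for k, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != k for p in fold]
        yield train, test


participants = [f"P{i}" for i in range(1, 18)]   # P1..P17, as on the slide
folds = subject_folds(participants, n_folds=5)   # fold count taken from the results slides
splits = list(train_test_splits(folds))

for train, test in splits:
    print("test:", test, "| training participants:", len(train))
```

Because whole participants are held out, no test participant contributes audio to training, which is what lets the results speak to generalization across subjects.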
classify sound: algorithm
for each group of training folds:
PCA (10 weights)
random forest
event extraction
decision
classify sound results
[figure: ROC curves for folds 1-5; true positive rate (%) vs. false positive rate (%)]
[figure: bar charts for folds A-E and mean]
true positive rate: mean 92% (per fold: 99%, 96%, 91%, 89%, 85%)
false positive rate: mean 0.5% (per fold: 0.8%, 0.5%, 0.3%, 0.6%, 0.3%)
false alarms per hour: mean 17 (per fold: 12, 29, 17, 20, 7)
reconstruction: experimental design
is speech intelligible?
-8 original segments of speech
-four male, four female
-5, 10, 15, 25, 50 components
-48 audio segments
-4 listeners per segment
-listener task: play audio, enter text
is cough high fidelity?
-12 original cough recordings
-six male, six female
-5, 10, 15, 25, 50 components
-72 audio segments
-13 listeners per segment
-810 subjective ratings
-listener task: play audio, answer "how do these cough sounds compare?" (same cough sound, very similar, somewhat similar, somewhat different, very different)
experimental results
[figure: word error rate (%), 0-100%, vs. number of components (5, 10, 15, 25, 50, baseline)]
[figure: similarity (z-score), -1 to 1, vs. number of components (5, 10, 15, 25, 50, baseline)]
error bars = interquartile range
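For readers unfamiliar with the intelligibility metric, word error rate can be sketched as below. This assumes the standard definition (word-level edit distance divided by the number of reference words); the slides do not state exactly how the listeners' typed text was scored, and the example sentences are made up.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: what was said vs. what a listener typed.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A rate near 100% for reconstructed speech would mean listeners recovered almost none of the words, which is the privacy outcome the reconstruction is aiming for.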
contributions
1. accurate cough detection
2. generalizes across subjects
3. reconstructable cough audio
4. privacy of speech
5. leverages existing mobile phone
New guidelines:
-unobtrusive
-mobile
-private
-processing
-24 hour
-specific
-automatic
future work
-battery life to 24 hours
-fidelity of processed features
Eric Larson
[email protected]
@ericcooplarson
ubicomplab.cs.washington.edu
#privacycough
reconstruction
spectrogram
weights + phase
reconstruction
[figure axes: frequency, component index, frequency]
predictive cepstral coefficients (LPCC). They applied a neural network classifier and achieved an 80% (55-100%) true positive rate and a 4% (2-8%) false positive rate. However, they recorded audio in an outpatient clinic for only one hour per person, which is a relatively controlled, low-noise environment.
Similarly, Matos et al. created a system called the Leicester Cough Monitor (LCM) [26], which uses a lapel microphone with a portable audio recorder. They used MFCCs (with derivatives) as features for a hidden Markov model (HMM). Their average true positive rate was 71% (50-99%), with a false alarm rate of 13 cough events per hour (false positive rate not reported). After applying an energy threshold to discard low-intensity coughs, the average true positive rate for LCM rose to 82% and false alarms fell to 2.5 events per hour. The tradeoff, however, was discarding on average 29% (6-72%) of the cough events for each subject, and the energy thresholds had to be computed per individual. More recently, LCM has reported a true positive rate of 91% and a false positive rate of ~1% [4]. This result has drawn criticism from the medical community [28]: critics point out that the most recent publications do not state whether the true positive rates are reported with or without the energy threshold, and that the system is evaluated on only a small subset of the audio data. They also point out that, to reach such a low false positive rate, the system requires hired annotators to listen to the low-confidence coughs, and the annotators must provide hand-segmented cough examples to prime the algorithm. As such, the system is more accurately described as semi-automated.
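The three metrics mixed together in this comparison (true positive rate, false positive rate, false alarms per hour) can be related with a small sketch. The counts below are hypothetical, chosen only to show how a false positive rate well under 1% can still mean many false alarms per hour when non-cough events vastly outnumber coughs; they are not taken from any of the cited studies, and the function name `detection_rates` is an assumption.

```python
def detection_rates(true_pos, false_neg, false_pos, true_neg, hours):
    """Return (true positive rate, false positive rate, false alarms per hour)."""
    tpr = true_pos / (true_pos + false_neg)      # fraction of real coughs detected
    fpr = false_pos / (false_pos + true_neg)     # fraction of non-cough events flagged
    alarms_per_hour = false_pos / hours          # absolute false alarm load
    return tpr, fpr, alarms_per_hour


# Hypothetical one-hour recording: 100 coughs and 3400 non-cough sound events.
tpr, fpr, alarms = detection_rates(
    true_pos=92, false_neg=8, false_pos=17, true_neg=3383, hours=1.0
)
print(f"TPR {tpr:.1%}, FPR {fpr:.1%}, false alarms/hr {alarms:.0f}")
```

This is why a system that reports only one of the two false-detection metrics is hard to compare against one that reports the other.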
Our system uses principal components analysis (PCA) and a random forest classifier. Its accuracy is comparable to existing detection algorithms (92% mean true positive rate), but it does not require any annotation to prime or retrain the models. We note that a direct comparison between our approach and HACC or LCM is impossible. Many of these groups treat their algorithms as proprietary, so few details of the actual implementations are available. Instead, we compare algorithms on their published accuracies, albeit on different datasets. Table 2 summarizes and compares the classification rates of ambulatory cough detection algorithms.
Audio Privacy
Prior work in audio privacy has largely dealt with hiding certain cues about speakers and the conversations around them so that a machine learning algorithm cannot reconstruct valuable information from the feature sets. It is generally accepted that MFCCs are poor features for maintaining privacy, as they reveal not only speech but also inflection and prosody [46]. As such, Wyatt et al. have devised audio features that hide speech intelligibility while still providing cues for prosody and recognition of conversations [46]. Most of the related audio privacy work attempts to preserve certain quantities while providing poor features for modern speech recognizers [32]. Chen et al., on the other hand, use linear prediction to replace vowels in speech, while keeping environmental noises such as cars and running water intelligible to subjects [9]. Our work, similar to [9], attempts to make speech unintelligible while still making it possible to reconstruct cough sounds. Our methodologies, however, are quite different.
Eigenvector Feature Selection
The most common application of eigenvectors in machine learning is principal components analysis (PCA). PCA uses orthogonal components (i.e., eigenvectors) of a particular feature space to reduce dimensionality. Components can be sorted by their corresponding eigenvalues, which rank the components by how much variation in the data they explain. Traditional PCA is limited by the assumption that the optimal transformation of the feature space is linear and orthogonal, which is not true in general. Even so, PCA has been successfully applied in many domains, the best known of which are face recognition (i.e., Eigenfaces [42]) and dimensionality reduction for gene mapping [23]. The use of PCA on audio spectrograms is not new. Pinkowski successfully used PCA to develop a model of the spectrogram for different English vowel sounds [35]. Our work also uses PCA on spectrograms, except that our model is built for coughing sounds.
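As a rough illustration of the projection and reconstruction idea, the sketch below fits PCA to synthetic "spectrogram frames", keeps a small number of component weights per frame, and reconstructs the frames from those weights. It assumes NumPy is available and uses random data in place of real cough spectrograms; the helper names `fit_pca`, `project`, and `reconstruct` are hypothetical, and the paper's actual preprocessing is not shown here.

```python
import numpy as np


def fit_pca(frames, n_components):
    """frames: (n_frames, n_bins). Returns the mean frame and the top components."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are orthonormal eigenvectors of the covariance matrix,
    # ordered by decreasing singular value (hence decreasing eigenvalue).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(frames, mean, components):
    """Reduce each frame to one weight per component."""
    return (frames - mean) @ components.T


def reconstruct(weights, mean, components):
    """Approximate the original frames from the stored weights."""
    return weights @ components + mean


rng = np.random.default_rng(0)
# Synthetic stand-in for spectrogram frames: 200 frames x 32 frequency bins,
# generated from 3 latent patterns plus a little noise (not real audio).
latent = rng.normal(size=(200, 3))
patterns = rng.normal(size=(3, 32))
frames = latent @ patterns + 0.01 * rng.normal(size=(200, 32))

errors = {}
for k in (1, 3, 10):
    mean, comps = fit_pca(frames, k)
    approx = reconstruct(project(frames, mean, comps), mean, comps)
    errors[k] = float(np.mean((frames - approx) ** 2))
    print(f"{k:2d} components: mean squared reconstruction error {errors[k]:.6f}")
```

The reconstruction error drops sharply once the number of kept components reaches the number of underlying patterns, which is the property that lets a handful of weights stand in for a full spectrogram frame.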
PHYSIOLOGY OF COUGHING
This section provides background on the physiology behind the cough reflex and the generation of cough sounds. We also discuss how coughs manifest in an audio stream using spectrograms, motivating the design of our model.
Algorithm (Author) | Sensing | Subjects | Recording Environment | Automation | Initial Calibration? | Mean True Positive Rate | Mean False Positive Rate | Mean False Alarms / Hr
LifeShirt | Throat Mic. + sensor array | N=8 | Lab, 24 hours | Automatic | Yes | 78% | 0.4% | Not reported
VitaloJak | Piezo Sensor | N=10 | Lab, 24 hours | Automatic | Yes | 97.5% | 2.3% | Not reported
HACC | Lapel Mic. | N=15 | Clinic, 1 hour | Semi | Yes | 80% | 4% | Not reported
LCM (Matos) | Lapel Mic. | N=19 | In Wild, 6 hours | Semi | Yes | 71-82% | Not reported | 13
LCM (Birring) | Lapel Mic. | N=19 | In Wild, 2-6 hours | Semi | Yes | 91%* | <1% | 2.5ǂ
Our algorithm | Phone Mic. on necklace | N=17 | In Wild, 2-6 hours | Automatic | No | 92% | 0.5% | 17
Table 2. Summary of related work in audio-based cough detection. *It is not clear if these rates are reported with or without a 95% energy threshold. ǂThese rates are reported after review by an annotator.