WAVELETS: Data Analytic Perspective
Brani Vidakovic
Georgia Institute of Technology, Atlanta, GA, USA
Seminar Talk at Department of Industrial Engineering, Seoul National University
Vidakovic, B. (GaTech) WAVELETS: Data Analytic View July 11, 2013 1 / 44
Overview
In this talk:
• Four Holy Grails of Wavelets or Why Wavelets
• What are Wavelets?
• Wavelet Shrinkage via Statistical Inference
• BAMS Example
• Wavelets and Scaling
• MATLAB Sessions
MATLAB DEMOS: http://zoe.bme.gatech.edu/~bv20/isye6420/
Time/Frequency or Time/Scale Domains
Echolocation Signal in Time/Scale Domain
Figure: Digitized 2.5 microsecond echolocation pulse emitted by the Large Brown Bat, Eptesicus fuscus, and the time/scale (Wigner-Ville) representation of the pulse. (Left) Echolocation principle; (Middle) Pulse in the time domain; (Right) Wigner-Ville transform (axes: Time, Frequency). Data courtesy of Rich Baraniuk, DSP at Rice University.
Wavelets compress the data
Figure: Left: Normalized wind velocity [60 Hz, Duke Forest] time series (time domain values) and its wavelet transform (Haar wavelet domain values). Right: Corresponding Lorenz curves, L(p) versus p, of "energies" (squared coefficients).
Wavelets whiten the data
Figure: The autocorrelations (ACF, lags 0 to 20) in the wind velocity time series [panel (a), time domain] and its wavelet transform [panel (b), wavelet domain].
Wavelets filter the data
Figure: Two functions with different frequency/scale information separate in the wavelet domain.
Wavelets detect self-similarity in data
Figure: (Left) A path of Brownian motion; (Right) Wavelet-based log-spectrum, log2(average(coefs^2)) versus dyadic level, with fitted slope −1.9839. Regular decay is a signature of monofractality.
What are Wavelets?
• $\mathbb{L}_2$ orthonormal bases of the form

$$\left\{ \psi_{jk}(x) = 2^{j/2}\,\psi(2^j x - k),\ j, k \in \mathbb{Z} \right\}$$

$$f \in \mathbb{L}_2:\quad f(x) \approx \sum_{(j,k)\,\in\,\text{finite set}} d_{jk}\,\psi_{jk}(x).$$

j - index of scale with resolution/size $2^{-j}$. Frequency is $2^j$, the reciprocal of scale.
k - location, shift, translate, "time". The coefficient $d_{jk}$ measures the energy in the neighborhood of $x = k/2^j$.
• $\{\phi_{J_0,k}(x),\ \psi_{jk}(x),\ j \ge J_0,\ k \in \mathbb{Z}\}$
• Multiresolution Analysis (MRA) fully determined by $\phi$
Three Ways To Think About DWT
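$\underset{\sim}{y} = (y_1, \dots, y_N)$ - data. There is an underlying function $f$, so that

$$f(x) = \sum_k y_k\,\phi_{Jk}(x), \qquad J = \log_2(N).$$

1. $f(x) = \sum_{j<J,\,k} d_{jk}\,\psi_{jk}(x)$; $\underset{\sim}{y} \longrightarrow \{d_{jk} = \langle f, \psi_{jk}\rangle\}$ or $\underset{\sim}{d}$.
2. $\underset{\sim}{d} = W\underset{\sim}{y}$ or $\underset{\sim}{y} = W'\underset{\sim}{d}$
3. $\underset{\sim}{d} = (\dots,\ G(H(H(H\underset{\sim}{y}))),\ G(H(H\underset{\sim}{y})),\ G(H\underset{\sim}{y}),\ G\underset{\sim}{y})$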
Second & Third!
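2. $\underset{\sim}{d} = W\underset{\sim}{y}$ (Short signals and images)
3. $\underset{\sim}{h} = (h_1, \dots, h_M)$; $\underset{\sim}{g}$ via $g_n = (-1)^n h_{1-n}$.
Haar: $\underset{\sim}{h} = (1/\sqrt{2},\ 1/\sqrt{2})$, $\underset{\sim}{g} = (1/\sqrt{2},\ -1/\sqrt{2})$
H - filtering with $\underset{\sim}{h}$ (low pass) + decimation (keep every 2nd)
G - filtering with $\underset{\sim}{g}$ (high pass) + decimation

$$\underset{\sim}{d} = \left( H\dots(H(H(H\underset{\sim}{y}))) \ \middle|\ G\dots(H(H(H\underset{\sim}{y}))) \ \middle|\ \dots \ \middle|\ G(H(H\underset{\sim}{y})) \ \middle|\ G(H\underset{\sim}{y}) \ \middle|\ G\underset{\sim}{y} \right)$$

$\underset{\sim}{d}$ = (smooth part | coarsest details | . . . | finest details)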
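The cascade of H and G steps can be sketched for the Haar filter pair as follows. This is a minimal standard-library Python illustration, not the talk's MATLAB code; the function names `haar_step` and `haar_dwt` and the sine test signal are assumptions made for the example.

```python
import math

def haar_step(c):
    """Apply H (low pass) and G (high pass), each followed by decimation."""
    r = 1.0 / math.sqrt(2.0)
    smooth = [r * (c[2 * i] + c[2 * i + 1]) for i in range(len(c) // 2)]
    detail = [r * (c[2 * i] - c[2 * i + 1]) for i in range(len(c) // 2)]
    return smooth, detail

def haar_dwt(y, levels):
    """Return the blocks [smooth, coarsest details, ..., finest details]."""
    blocks = []
    c = list(y)
    for _ in range(levels):
        c, d = haar_step(c)
        blocks.insert(0, d)   # each new, coarser detail block goes in front
    blocks.insert(0, c)       # the smooth part comes first
    return blocks

# Hypothetical test signal of length 1024, decomposed over 5 levels.
y = [math.sin(0.01 * t) for t in range(1024)]
blocks = haar_dwt(y, 5)
sizes = [len(b) for b in blocks]
energy_in = sum(v * v for v in y)
energy_out = sum(v * v for b in blocks for v in b)
print(sizes)
print(energy_in, energy_out)
```

For a signal of length 1024 and 5 levels, the block sizes are 32 (smooth), then 32, 64, 128, 256 and 512 for the detail blocks from coarsest to finest, and the total energy is unchanged up to rounding because the Haar transform is orthonormal.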
Organization of Scales in Wavelet Domain
signal              y                     1024
finest details      Gy                     512
fine details        G(H(y))                256
details             G(H(H(y)))             128
coarse details      G(H(H(H(y))))           64
coarsest details    G(H(H(H(H(y)))))        32
smooth              H(H(H(H(H(y)))))        32
Mallat’s Algorithm
Figure: Decomposition: y --H--> c(1) --H--> c(2) --H--> . . . --H--> c(k−1) --H--> c(k); at each step G splits off the details d(1), d(2), . . . , d(k−1), d(k).
Reconstruction: . . . (c(3), d(3)) --> c(2); (c(2), d(2)) --> c(1); (c(1), d(1)) --> y, where each step adds (⊕) H∗ applied to the smooth part and G∗ applied to the details.
Mallat’s Algorithm
H, G: a single step in the forward DWT (filter + ↓)
H∗, G∗: a single step in the inverse DWT (↑ + filter)

$$H:\quad c_{j-1,l} = \sum_k h_{k-2l}\, c_{j,k}$$

$$G:\quad d_{j-1,l} = \sum_k g_{k-2l}\, c_{j,k}$$

$$H^*, G^*:\quad c_{j,k} = \sum_l c_{j-1,l}\, h_{k-2l} + \sum_l d_{j-1,l}\, g_{k-2l}.$$

dwtr.m: wdata = dwtr(data, L, filterh)
idwtr.m: data1 = idwtr(wdata, L, filterh)
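The three filtering formulas can be checked numerically with a short standard-library Python sketch. It is not the code of dwtr.m or idwtr.m. It assumes periodic boundary handling, a filter indexed from 0, and the high-pass alignment g[k] = (-1)^k h[M-1-k], which is an even shift of the rule g_n = (-1)^n h_{1-n}. The Daubechies 4-tap filter and all function names are choices made for the illustration.

```python
import math

def qmf(h):
    """High-pass filter from h; the alignment g[k] = (-1)^k h[M-1-k] is an assumption."""
    m = len(h)
    return [(-1) ** k * h[m - 1 - k] for k in range(m)]

def forward_step(c, h):
    """H and G: c_{j-1,l} = sum_k h_{k-2l} c_{j,k} and d_{j-1,l} = sum_k g_{k-2l} c_{j,k}."""
    n, g = len(c), qmf(h)
    coarse = [sum(h[i] * c[(2 * l + i) % n] for i in range(len(h))) for l in range(n // 2)]
    detail = [sum(g[i] * c[(2 * l + i) % n] for i in range(len(h))) for l in range(n // 2)]
    return coarse, detail

def inverse_step(coarse, detail, h):
    """H* and G*: c_{j,k} = sum_l c_{j-1,l} h_{k-2l} + sum_l d_{j-1,l} g_{k-2l}."""
    n, g = 2 * len(coarse), qmf(h)
    c = [0.0] * n
    for l in range(len(coarse)):
        for i in range(len(h)):
            c[(2 * l + i) % n] += coarse[l] * h[i] + detail[l] * g[i]
    return c

# Daubechies 4-tap low-pass filter (sums to sqrt(2), unit energy).
s3 = math.sqrt(3.0)
h_d4 = [v / (4.0 * math.sqrt(2.0)) for v in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]

# Hypothetical test signal of even length.
signal = [float((3 * t * t) % 7) for t in range(16)]
coarse, detail = forward_step(signal, h_d4)
recovered = inverse_step(coarse, detail, h_d4)
max_err = max(abs(a - b) for a, b in zip(signal, recovered))
print(max_err)
```

One forward step followed by one inverse step should return the input up to rounding error, and with the Haar filter h = (1/√2, 1/√2) the same `qmf` rule gives g = (1/√2, −1/√2), matching the Haar pair quoted earlier.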
DEMO 0
• Dem0a.m (dwtr.m, idwtr.m)
• Dem0b.m (WavMat.m)
Figure: Output of Dem0a.m. Panels: Doppler; Wavelet Transform of Doppler Signal by Symmlet 8; Doppler Recovered.
From h to φ, ψ
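h → m0 → Φ(ω) → φ(x)
Transfer function and Φ(ω)

$$m_0(\omega) = \frac{1}{\sqrt{2}} \sum_{k \in \mathbb{Z}} h_k\, e^{-ik\omega} \quad \left[= \frac{1}{\sqrt{2}}\, H(\omega)\right].$$

$$\Phi(\omega) = \mathcal{F}(\phi(x)) = \int_{-\infty}^{\infty} \phi(x)\, e^{-i\omega x}\, dx$$

$$\phi(x) = \sum_k h_k \sqrt{2}\, \phi(2x - k)$$

$$\Phi(\omega) = m_0\!\left(\frac{\omega}{2}\right) \Phi\!\left(\frac{\omega}{2}\right) = \cdots = \prod_{n=1}^{\infty} m_0\!\left(\frac{\omega}{2^n}\right), \qquad \phi(x) = \mathcal{F}^{-1}(\Phi(\omega))$$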
DEMO 1
• Dem1a.m (Symmlet 4, via Daubechies-Lagarias Algorithm, uses Phijk.m, Psijk.m)
• h = [-0.07576571479 -0.02963552765 0.49761866763 0.80373875181 0.29785779561 -0.09921954358 -0.01260396726 0.03222310060];
• Dem1b.m (Pollen family: h = [(1 + cos φ − sin φ)/s, (1 + cos φ + sin φ)/s, (1 − cos φ + sin φ)/s, (1 − cos φ − sin φ)/s] for s = 2√2 and φ = π/4)
Figure: Scaling and wavelet functions for Symmlet 4 and Pollen φ = 45° bases.
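The two filters quoted for DEMO 1 can be run through the transfer function m0 as a sanity check. The sketch below is standard-library Python rather than the demo's MATLAB; indexing the filter from k = 0 is an assumption (a shift only changes the phase of m0, not its modulus), and the names `m0` and `pollen` are illustrative.

```python
import cmath
import math

def m0(h, omega):
    """Transfer function m0(omega) = (1/sqrt(2)) * sum_k h_k exp(-i k omega), k from 0."""
    return sum(hk * cmath.exp(-1j * k * omega) for k, hk in enumerate(h)) / math.sqrt(2.0)

# Symmlet 4 low-pass filter as listed for Dem1a.m.
symmlet4 = [-0.07576571479, -0.02963552765, 0.49761866763, 0.80373875181,
            0.29785779561, -0.09921954358, -0.01260396726, 0.03222310060]

def pollen(phi):
    """Four-tap Pollen family filter as listed for Dem1b.m, with s = 2*sqrt(2)."""
    s = 2.0 * math.sqrt(2.0)
    c, sn = math.cos(phi), math.sin(phi)
    return [(1 + c - sn) / s, (1 + c + sn) / s, (1 - c + sn) / s, (1 - c - sn) / s]

for name, h in [("Symmlet 4", symmlet4), ("Pollen pi/4", pollen(math.pi / 4))]:
    low = abs(m0(h, 0.0))          # should be close to 1 (low pass)
    high = abs(m0(h, math.pi))     # should be close to 0
    energy = sum(v * v for v in h) # should be close to 1
    print(name, low, high, energy)
```

For an orthonormal wavelet filter one expects |m0(0)| = 1, m0(π) = 0, unit energy, and |m0(ω)|² + |m0(ω + π)|² = 1 for every ω; the printed values should agree with this to within the rounding of the tabulated coefficients.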
Why Wavelets in Data Analysis?
• WT are local (in time; in scale/frequency)
• WT are orthogonal or near-orthogonal
• WT are applicable to discrete data sets
• WT preserve but unbalance (concentrate) the energy in data
• WT whiten data
• WT are FAST! O(n)
• WT are versatile
• WT are Bayes friendly
• WT are sensitive to self-similar phenomena
• much more...
Where Wavelets in Data Analysis?
• Regression Problems
  ♦ Equi- and Nonequi-spaced Regression
  ♦ Dimension Reduction, Approximate PCA, Denoising
  ♦ Pursuits and Data Driven Selection of Analyzing Tools
• Density Estimation
  ♦ Functionals of a Density, Classification and Discrimination.
• Deconvolutions
• Time Series
  ♦ Approximate K-L Expansions.
  ♦ Nonstationary TS, Wavelet-based Spectral Analysis
• Long-Range Dependence, Self-similarity and Scaling in Data, (Multi-)Fractality.
Where Wavelets in Statistics?
• GLM, GAM, Censored Models, Functional Data Analysis, Change Point Analysis
• Theory of Shapes, Wavelet Based Bookmarks
• Biased Sampling
• Medical Image Enhancement, Mammography, fMRI, CXR, CT
• Bayesian Applications. Bayesian Nonparametrics
• Statistical Calculation, Simulation, Wavestrapping.
Model Based Wavelet Data Processing
DATA --W--> Wavelet Coefficients: "Detail" & "Smooth"
                      ↓
Processed DATA <--W^{-1}-- Process (Detail) Coefficients

Process ≡
• Shrink, Threshold, Split
• Transform
• Simulate New, Construct
• Resample, Permute
• Assess "Energy" and "Fluxes"
Wavelet Shrinkage: Shrinkage Policies
• Hard Thresholding
• Soft Thresholding
• Semi-Soft Thresholding
• Smooth Shrinkage
• Various Variable Selection Methods
Figure: Examples of thresholding rules (shrunk value plotted against the coefficient d).
Universal Threshold
$$\lambda = \sqrt{2 \log n}\ \hat{\sigma}$$

$\hat{\sigma}$ is an estimator of the standard deviation of the noise, $\sigma$. There are many proposals, usually involving only the finest level of detail.
Rationale: If $X_1, \dots, X_n$ are i.i.d. $\mathcal{N}(0, \sigma^2)$, then for large $n$, $\mathbb{E}X_{(n)} = -\mathbb{E}X_{(1)} \approx \sqrt{2 \log n}\ \sigma$.
• Since the estimated expected range of noise is $[-\sqrt{2 \log n}\ \hat{\sigma},\ \sqrt{2 \log n}\ \hat{\sigma}]$, any coefficient with magnitude outside the range is attributed to the signal and thus retained in the model.
• Oversmooths in practice.
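The hard and soft policies and the universal threshold can be sketched in a few lines of standard-library Python. The estimator of σ used here, the median absolute value of the finest details divided by 0.6745, is one common proposal and is an assumption of this example, not necessarily what Dem2a.m or Dem2b.m use; all function names are illustrative.

```python
import math

def hard_threshold(d, lam):
    """Keep the coefficient if its magnitude exceeds lam, otherwise set it to zero."""
    return d if abs(d) > lam else 0.0

def soft_threshold(d, lam):
    """Shrink the magnitude by lam, keeping the sign; small coefficients become zero."""
    return math.copysign(max(abs(d) - lam, 0.0), d)

def mad_sigma(finest_details):
    """Assumed noise-scale estimator: median(|d|) / 0.6745 on the finest detail level."""
    a = sorted(abs(v) for v in finest_details)
    n = len(a)
    med = a[n // 2] if n % 2 else 0.5 * (a[n // 2 - 1] + a[n // 2])
    return med / 0.6745

def universal_threshold(n, sigma_hat):
    """lambda = sqrt(2 log n) * sigma_hat, with n the sample size."""
    return math.sqrt(2.0 * math.log(n)) * sigma_hat

lam = universal_threshold(1024, 1.0)
print(lam)
print(hard_threshold(4.0, lam), soft_threshold(4.0, lam))
print(hard_threshold(1.0, lam), soft_threshold(-4.0, lam))
```

With n = 1024 and unit noise scale the threshold is about 3.72, so a coefficient of 4 survives hard thresholding unchanged, is shrunk to roughly 0.28 by soft thresholding, and a coefficient of 1 is set to zero by both rules.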
DEMO 2
• Dem2a.m, Dem2b.m (Universal Threshold, Hard Thresholding Policy)
Figure: Left: Doppler + Noise (data in red) and Doppler (green). Right: Doppler estimate by thresholding (black).
Shrinkage induced by statistical modeling in the wavelet domain

$$\underset{\sim}{y} = \underset{\sim}{f} + \underset{\sim}{\epsilon} \ \xrightarrow{\ W\ }\ \underset{\sim}{d} = \underset{\sim}{\theta} + \underset{\sim}{\epsilon}$$

Estimate $\theta$ by $\hat{\theta}$ and obtain $\hat{f}$ as $W^{-1}(\hat{\theta})$
• Location model on d, $f(d - \theta \mid \text{parameters})$
• Dimensionality (Do not worry: wavelets decorrelate)
• Accounting for dependence (neighbors, parent-children), Blocking strategies (classical), Many Bayes solutions (MCMC, hidden MC's).
• Model complexity/efficiency compromise. Simple models/Fast shrinkage: Realistic? Complex models: Useful?
BAMS
Compromise between model reality and simplicity
BAMS (Bayesian Adaptive Multiscale Shrinkage/Smoothing)
Model (Likelihood)

$$[d \mid \theta, \sigma^2] \sim \mathcal{N}(\theta, \sigma^2); \qquad \sigma^2 \sim \mathcal{E}(\mu),\ \mu > 0; \qquad \left[\mu e^{-\mu\sigma^2},\ \mu > 0\right]$$

Marginal Likelihood

$$d \mid \theta \sim \mathcal{DE}\!\left(\theta, \frac{1}{\sqrt{2\mu}}\right); \qquad \left[\frac{1}{2}\sqrt{2\mu}\ e^{-\sqrt{2\mu}\,|d - \theta|}\right]$$
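The step from the normal likelihood with an exponential prior on σ² to the double exponential marginal can be checked numerically. The sketch below, in standard-library Python, integrates the normal density against the exponential density with a plain midpoint rule and compares it with the closed form; the integration range (40 prior means), the number of steps, and the function names are choices made for this illustration.

```python
import math

def normal_pdf(x, var):
    """Density of N(0, var) at x, where x stands for d - theta."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def marginal_numeric(x, mu, steps=100000, upper_in_means=40.0):
    """Midpoint rule for the integral over v > 0 of N(x; 0, v) * mu * exp(-mu * v)."""
    upper = upper_in_means / mu      # truncate far in the exponential tail
    h = upper / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * h
        total += normal_pdf(x, v) * mu * math.exp(-mu * v)
    return total * h

def double_exponential_pdf(x, mu):
    """Closed form (1/2) sqrt(2 mu) exp(-sqrt(2 mu) |x|)."""
    b = math.sqrt(2.0 * mu)
    return 0.5 * b * math.exp(-b * abs(x))

for x, mu in [(1.0, 2.0), (-0.5, 1.0)]:
    print(x, mu, marginal_numeric(x, mu), double_exponential_pdf(x, mu))
```

The two columns should agree to several digits for any x away from zero; at x = 0 the integrand has an integrable singularity at v = 0, so a midpoint rule converges slowly there and the check is best run at nonzero x.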
Prior
$$[\theta \mid \epsilon] \sim \epsilon\,\delta_0 + (1 - \epsilon)\,\mathcal{DE}(0, \tau), \qquad \epsilon = \epsilon(\text{multiresolution level})$$

Predictive Distribution – Marginal

$$d \sim m(d) = \epsilon\,\mathcal{DE}\!\left(0, \frac{1}{\sqrt{2\mu}}\right) + (1 - \epsilon)\, \frac{\tau e^{-|d|/\tau} - \frac{1}{\sqrt{2\mu}}\, e^{-\sqrt{2\mu}\,|d|}}{2\tau^2 - 1/\mu}$$
Bayes Rule: BAMS
$$\delta(d) = \frac{(1 - \epsilon)\, m(d)\, \delta^*(d)}{(1 - \epsilon)\, m(d) + \epsilon\,\mathcal{DE}\!\left(0, \frac{1}{\sqrt{2\mu}}\right)},$$

$$\delta^*(d) = \frac{\tau\left(\tau^2 - \frac{1}{2\mu}\right) d\, e^{-|d|/\tau} + \frac{\tau^2}{\mu}\left(e^{-|d|\sqrt{2\mu}} - e^{-|d|/\tau}\right)}{\left(\tau^2 - \frac{1}{2\mu}\right)\left(\tau e^{-|d|/\tau} - \frac{1}{\sqrt{2\mu}}\, e^{-|d|\sqrt{2\mu}}\right)}$$

Figure: Bayes' rule for selected values of τ, µ and ǫ.
BAMS
Specification of Hyperparameters:
• Recall: $\sigma^2 \sim \mathcal{E}(\mu)$, $\mathbb{E}\sigma^2 = 1/\mu \Rightarrow \mu = \frac{1}{\text{pseudos}}$, (Tukey) pseudos $= |Q_1 - Q_3|/C$, where $Q_1$ and $Q_3$ are the first and the third quartile of the finest level of details in the decomposition and $1.3 \le C \le 1.5$.
• $\epsilon(j) = 1 - \dfrac{1}{(j - \text{coarsest} + 1)^{\gamma}}$, $\gamma = \dfrac{3}{2}$.
• $\tau = \max\left\{\sqrt{\sigma_d^2 - \frac{1}{\mu}},\ 0\right\}$. (Information on self-similarity via τ)
Figure: Doppler signal: n = 1024, SNR=7; Noisy version (left) and …
DEMO 3
Dem3a.m (BAMS shrinkage, Enrico Caruso: E lucevan le stelle from Tosca, by G. Puccini, recorded in February 1904).
Figure: Noisy Recording; "Denoised" Sound; Residuals
Scaling: It Started with Hurst and Nile Data
Harold Edwin Hurst was a poor Leicester boy who made good, eventually working his way into Oxford, and later became a British "civil servant" in Cairo in 1906. He got interested in the Nile River.
Hurst spent 62 years in Egypt, mostly working on the design and construction of reservoirs along the Nile.
By inspecting historical data on the Nile flows, Hurst discovered a phenomenon now called the Hurst effect.
Hurst’s Problem
Find the optimal reservoir capacity R to accept the river flow in N units of time, $X_1, X_2, \dots, X_N$, with a constant withdrawal of $\bar{X}$ per unit time. The optimal volume of the reservoir is the adjusted range,

$$R = \max_{1 \le k \le N} (X_1 + \cdots + X_k - k\bar{X}) - \min_{1 \le k \le N} (X_1 + \cdots + X_k - k\bar{X}).$$

In order to compare hundreds of years' worth of data, Hurst standardized the adjusted ranges R with the sample standard deviation

$$S = \sqrt{\frac{1}{N - 1} \sum_{i=1}^{N} (X_i - \bar{X})^2}.$$

The dimensionless ratio R/S is the rescaled adjusted range.
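The rescaled adjusted range follows directly from the two formulas. A minimal standard-library Python sketch, with an illustrative function name and a toy series that is not Nile data:

```python
import math

def rescaled_adjusted_range(x):
    """Return R/S: adjusted range of the centered partial sums over the sample std."""
    n = len(x)
    mean = sum(x) / n
    partial, lo, hi = 0.0, float("inf"), float("-inf")
    for value in x:
        partial += value - mean          # X_1 + ... + X_k - k * mean
        lo, hi = min(lo, partial), max(hi, partial)
    r = hi - lo                          # adjusted range R
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))  # sample std S
    return r / s

# Toy series: centered partial sums are -1.5, -2, -1.5, 0, so R = 2.
rs = rescaled_adjusted_range([1.0, 2.0, 3.0, 4.0])
print(rs)
```

For the toy series the centered partial sums range from −2 to 0, so R = 2, S = √(5/3), and R/S = 2/√(5/3). Estimating H would then amount to computing R/S for several values of N and fitting the slope of log(R/S) against log N.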
River Nile and Scaling
On the basis of more than 800 records, Hurst found that the quantity R/S scales as $N^H$, for H ranging from 0.46 to 0.93, with mean 0.73 and standard deviation 0.09.
Easy: H = 1/2 for iid normal; Feller: any iid with finite variance; Barnard (1956): Markovian dependent variables.
Mandelbrot (1975), Mandelbrot and Van Ness (1968), and Mandelbrot and Wallis (1968) associated the Hurst phenomenon with the presence of long memory ($\sum_i \mathrm{Cov}(X_1, X_i) = \infty$).
River Nile and Scaling
Figure: (a) Nile Yearly Minimum Level Data for n = 512 Consecutive Years from 622 A.D. (Nile river minimum level versus years); (b) Wavelet Log-spectrum (log2 Scale-Averaged Energy versus Dyadic Scale), slope = −0.80 [0.80 = 2H − 1 → H = 0.90]
• But what is Wavelet Log-spectrum?
What is Wavelet (Log)Spectrum?
• y - data, $n \times 1$, $n = 2^J$.
• $d = Wy$ - discrete wavelet transform, $n \times 1$, $n = 2^J$.
• $d = \{c_{J-m}, d_{J-m}, d_{J-m+1}, \dots, d_{J-2}, d_{J-1}\}$.
• Wavelet (Log)Spectrum

$$\left\{ \left( j,\ \log_2\!\left[ \frac{1}{2^j} \sum_k d_{j,k}^2 \right] \right),\ J - m \le j \le J - 1 \right\}$$

• SLOPE is either $-(2H + 1)$ (cumulative) or $-(2H - 1)$ (differenced). For example, Brownian Motion and White Noise both share H = 0.5.
• The MATLAB function waveletspectra.m finds and plots wavelet spectra.
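The definition of the wavelet log-spectrum and the two slope conventions can be illustrated with simulated white noise and its cumulative sum. This standard-library Python sketch is not waveletspectra.m: it uses the Haar wavelet, an ordinary least-squares slope, a fixed random seed, and levels j ≥ 4 only, all of which are assumptions made for the example.

```python
import math
import random

def haar_details_by_level(y):
    """Return {j: Haar detail coefficients at level j}; level j holds 2**j coefficients."""
    c, j, levels = list(y), int(math.log2(len(y))), {}
    r = 1.0 / math.sqrt(2.0)
    while len(c) > 1:
        j -= 1
        levels[j] = [r * (c[2 * i] - c[2 * i + 1]) for i in range(len(c) // 2)]
        c = [r * (c[2 * i] + c[2 * i + 1]) for i in range(len(c) // 2)]
    return levels

def log_spectrum(y, j_min):
    """Pairs (j, log2 of the average squared detail coefficient) for levels j >= j_min."""
    levels = haar_details_by_level(y)
    return [(j, math.log2(sum(v * v for v in d) / len(d)))
            for j, d in sorted(levels.items()) if j >= j_min]

def slope(points):
    """Ordinary least-squares slope of the second coordinate on the first."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    sxx = sum((p[0] - mx) ** 2 for p in points)
    return sxy / sxx

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]   # differenced process, H = 0.5
walk, total = [], 0.0
for e in noise:
    total += e
    walk.append(total)                                    # cumulative process, H = 0.5

slope_noise = slope(log_spectrum(noise, 4))
slope_walk = slope(log_spectrum(walk, 4))
print(slope_noise, slope_walk)
```

With H = 0.5 the white noise slope should be near −(2H − 1) = 0 and the random walk slope near −(2H + 1) = −2, up to sampling variability and the small bias of the Haar wavelet at the finest levels.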
“Ubiquitous” – The Epithet of Scaling
Atmospheric Turbulence
Figure: (a) U Velocity Component (U-component of velocity versus time); (b) Scaling in the Fourier Domain, slope = −5/3; (c) Scaling in the Wavelet Domain (log2 Scale-Averaged Energy versus Dyadic Scales), slope = −5/3. [5/3 = 2H + 1 → H = 1/3]
DNA Scales
A DNA molecule consists of a long complementary double helix of purine nucleotides (denoted as A and G) and pyrimidine nucleotides (denoted as C and T). [A, G → +1; C, T → −1]
Figure: (a) 8196-long DNA Walk for Spider Monkey (DNA RW versus index), from EMBL Nucleotide Sequence Alignment DNA Database; (b) Wavelet Scaling With Slope −2.24. [2.24 = 2H + 1 → H = 0.62]
Money scales
Daily rates of exchange between Korean Won (₩) and Euro (€) as reported by the European Central Bank between 4 January 1999 and 21 June 2013, http://sdw.ecb.europa.eu
Figure: (a) Daily exchange rates of ₩ to € between 1/4/1999 and 6/21/2013 (Source: European Central Bank); (b) Scaling behavior in the "red interval" and (c) in the "green" interval. [2 ≈ 2H + 1 → H = 1/2]
Log-spectrum panels (log spectrum versus dyadic level): 2048 days starting with 1/4/1999, slope −1.99392; 2048 days prior to 6/21/2013, slope −1.99413.
Various measurements scale!
Other Examples
• Various Geophysical High Frequency Measurements.
• Biometric Responses
• Economic Indices
• Internet Measurements.
• Industrial Measurements.
• Astronomy.
• Medicine. Brain and Cancer Research
DEMO 4
• Dem4a.m (Fractional Brownian Motion)
• Dem4b.m (DNA RW)
• Dem4c.m (Scaling of Exchange Rates ₩ vs €)
• Dem4d.m (Gait Data)
• Dem4e.m (Coca Cola Company Stocks)
Dem4b.m Output
Figure: DNA random walk and its log spectrum versus dyadic level, slope −2.24122.
DEMO 4
Dem4d.m Output
Figure: Gait Data: Timing between steps. Slope for the cumulative time is −(2H + 1) = −2.81 → H = 0.905. Log-spectrum panels (log spectrum versus dyadic level) show slopes −1.02486 and −2.88611.
Case Study: Filtering Ambient Ozone
Katul, Ruggeri, and Vidakovic (2005), JSPI
Figure: Raw data from a Gas-Analyzer (21Hz), measurements versus time (min); Wavelet Spectra, log energy versus log scale.
Ozone Case Study
Estimator of O3
Figure: De-whitened Spectra (log energy versus log scale); Estimator of O3 Concentration (estimate of the ozone signal versus time (min)).
Ozone Case Study
Figure: Residuals versus time (min); log f(x) versus x; acf (ozone) versus lag.
Conclusions
• Wavelets are becoming standard tools (like the Fourier transform).
• Wavelet shrinkage and scaling assessment: useful tools in data analysis.
• The goal of the talk was to build intuition about wavelet data processing and demonstrate fundamental operational concepts.
• MATLAB DEMOS: http://zoe.bme.gatech.edu/~bv20/isye6420/supporting.html (under the April 14, 2015 entry).