Spatially clustered processes are pervasive in nature
Can we do more to ensure that our estimates are physically realistic?
How can we incorporate their intermittent structure into ensemble data assimilation?
Forest fire, Colorado
Midwest thunderstorms (2D space, 1D time)
Algae bloom, Washington
Proposal replicates for spatially clustered processes
Rafal Wojcik, Dennis McLaughlin, Hamed Almohammad and Dara Entekhabi, MIT
Rainfall Data Assimilation – Merging Diverse Observations
• Develop Bayesian (ensemble) data assimilation procedures that can efficiently merge remote sensing and ground-based measurements of spatially clustered processes (e.g. rainfall).
• These procedures will be feature-based versions of particle filtering/importance sampling or MCMC.
Bayesian Perspective
Extend Bayesian formalism to accommodate geometric features and to integrate prior information with new measurements:
[Equation: Bayes' rule relating Posterior to Likelihood and Prior; other labels on the slide: C, Feature, Measurement]
Use ensemble representation:
Relationship between true and measured images:
Gives likelihood expression in terms of observation error PDF:
Proposal
Requirements for feature-based Bayesian
Needed for feature-based Bayesian formulation:
1. Generate realistic clustered proposal images
[Figure: four panels of example clustered images; horizontal axis about -99.5 to -97, vertical axis about 36 to 38.5, color scale 0 to 80]
2. Define observation error probability measure over set of possible error images.
Is a relevant measure of similarity between observations and proposal replicates?
How can we define measurement error norm?
• should preserve spatially intermittent features of the real process (e.g. rainfall)
• metrics used to compare replicates and measurements should be sensitive to clustering.
How similar are these images?
Euclidean metric
Euclidean dist = 4
Legend: Rain replicate (=1), Meas rain (=1), No rain (=0)
Euclidean dist = 4
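The point of this slide can be reproduced with a small sketch. The 8x8 masks below are hypothetical and are not the images shown on the slide; they are chosen only so that a replicate with no overlap and a replicate that fully covers the measured storm end up at the same pixel-wise Euclidean distance.

```python
import numpy as np

def euclidean_distance(a, b):
    """Pixel-wise Euclidean distance between two equally sized images."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum(diff ** 2)))

# Hypothetical 8x8 binary rain masks (1 = rain, 0 = no rain).
meas = np.zeros((8, 8), dtype=int)
meas[1:3, 1:5] = 1            # measured storm: 8 rain pixels

rep_far = np.zeros((8, 8), dtype=int)
rep_far[5:7, 3:7] = 1         # same-size storm, no overlap with the measurement

rep_near = np.zeros((8, 8), dtype=int)
rep_near[0:4, 0:6] = 1        # larger storm that fully covers the measurement

d_far = euclidean_distance(meas, rep_far)
d_near = euclidean_distance(meas, rep_near)
print(d_far, d_near)          # both replicates differ from meas in 16 pixels
```

Both replicates differ from the measurement in 16 pixels, so the Euclidean distance cannot tell a displaced storm from an overlapping one.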
Image characterization: cluster based image compression
Initial cluster centers and scattered rain pixels
Neural gas finds “best” locations for cluster centers
Center of rain pixel
Cluster center
(xi, yi)
Image is concisely characterized by cluster centers’ coordinates (xi,yi)
Image characterization: cluster based image compression
NG algorithm identifies 10-D feature vector characterizing each image replicate
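A minimal sketch of the standard neural gas update may help make the compression step concrete. It is not the authors' implementation: the step-size and neighborhood schedules, the iteration count, and the test image are all assumptions chosen for illustration. With five centers in 2D, the flattened result is a 10-D feature vector.

```python
import numpy as np

def neural_gas(points, n_centers=5, n_iter=2000, seed=0,
               eps=(0.5, 0.01), lam=(2.5, 0.1)):
    """Rank-based neural gas; eps and lam are (start, end) schedule values."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Initialize the centers on randomly chosen rain pixels.
    centers = points[rng.choice(len(points), n_centers, replace=False)].copy()
    for t in range(n_iter):
        frac = t / max(n_iter - 1, 1)
        eps_t = eps[0] * (eps[1] / eps[0]) ** frac   # decaying step size
        lam_t = lam[0] * (lam[1] / lam[0]) ** frac   # decaying neighborhood range
        x = points[rng.integers(len(points))]        # one random rain pixel
        dist = np.linalg.norm(centers - x, axis=1)
        rank = np.argsort(np.argsort(dist))          # rank 0 = closest center
        step = eps_t * np.exp(-rank / lam_t)
        centers += step[:, None] * (x - centers)     # move every center toward x
    return centers

# Hypothetical binary rain image with two clusters.
rain = np.zeros((20, 20), dtype=int)
rain[2:6, 3:9] = 1
rain[12:17, 10:16] = 1
rows, cols = np.nonzero(rain)
pixels = np.column_stack([cols, rows])   # (x_i, y_i) of each rain pixel

centers = neural_gas(pixels)
feature_vector = centers.ravel()         # 5 centers x 2 coordinates = 10-D
```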
Image characterization: cluster based image compression
POOR RESULTS:
Numbering of neural gas centers has strong impact on aggregate distance measure.
[Figure: four panels showing five neural gas centers numbered 1 to 5 in different orders]
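The numbering problem can be illustrated with a short sketch. The center coordinates are hypothetical. The brute-force minimum over all numberings is included only as a check that the two sets of centers are identical; it is not something the slides propose.

```python
import numpy as np
from itertools import permutations

# Hypothetical set of five cluster centers (x_i, y_i).
centers = np.array([[1.0, 1.0], [4.0, 2.0], [2.0, 5.0], [6.0, 6.0], [8.0, 3.0]])

# The same five centers, numbered in a different order.
relabelled = centers[[2, 0, 4, 1, 3]]

# Distance between the flattened 10-D feature vectors.
naive = float(np.linalg.norm(centers.ravel() - relabelled.ravel()))

def order_free_distance(c1, c2):
    """Smallest feature-vector distance over all numberings of c2 (brute force)."""
    return min(float(np.linalg.norm(c1 - c2[list(p)]))
               for p in permutations(range(len(c2))))

print(naive)                                     # nonzero, although the sets are identical
print(order_free_distance(centers, relabelled))  # zero once numbering is factored out
```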
Image characterization: Jaccard metric
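For two binary vectors (images) A and B, the Jaccard similarity is defined as:
$$J(A,B) = \frac{|A \cap B|}{|A \cup B|}$$
and the Jaccard metric is defined as:
$$d_J(A,B) = 1 - J(A,B)$$
This can be generalized to real positive vectors using:
$$J(A,B) = \frac{A \cdot B}{A \cdot A + B \cdot B - A \cdot B}$$
[Venn diagram of A and B: overlap AB, remainders AA-AB and BB-AB, union AA+BB-AB]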
Image characterization: Jaccard metric
Jaccard dist = 0.8
Legend: Rain replicate (=1), Meas rain (=1), No rain (=0)
Jaccard dist = 0.7
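A short sketch of the Jaccard distance, written in the dot-product form so that it covers both binary and real positive images. The masks are hypothetical and are not the images on the slide, and returning 0 for two empty images is an assumed convention.

```python
import numpy as np

def jaccard_distance(a, b):
    """1 - A.B / (A.A + B.B - A.B); equals 1 - |A and B| / |A or B| for binary images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    ab = float(a @ b)
    denom = float(a @ a) + float(b @ b) - ab
    if denom == 0.0:
        return 0.0            # both images empty: treated as identical (assumed convention)
    return 1.0 - ab / denom

# Hypothetical 8x8 binary rain masks (1 = rain, 0 = no rain).
meas = np.zeros((8, 8), dtype=int)
meas[1:3, 1:5] = 1            # measured storm: 8 rain pixels

rep_far = np.zeros((8, 8), dtype=int)
rep_far[5:7, 3:7] = 1         # no overlap with the measurement

rep_near = np.zeros((8, 8), dtype=int)
rep_near[0:4, 0:6] = 1        # 24 rain pixels, fully covers the measurement

d_far = jaccard_distance(meas, rep_far)     # no shared rain pixels
d_near = jaccard_distance(meas, rep_near)   # 8 shared pixels out of a union of 24
print(d_far, d_near)
```

These two replicates each differ from the measurement in 16 pixels, yet the Jaccard distance ranks the overlapping replicate as closer than the displaced one.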
Feature Ensembles – Training Images & Priors
Multipoint technique identifies patterns within a moving template that scans training image
Replicate generator
Number of times each template pattern occurs
Pattern probability
Template patterns
Training image
Template
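The pattern-counting step can be sketched as follows. This is only an illustration of scanning a training image with a moving template and turning pattern counts into probabilities; the 2x2 template, the tiny training image, and the function name `pattern_probabilities` are assumptions, and the replicate generator itself is not shown.

```python
from collections import Counter
import numpy as np

def pattern_probabilities(training_image, template_shape=(2, 2)):
    """Count every pattern seen by a moving template and normalize the counts."""
    img = np.asarray(training_image, dtype=int)
    th, tw = template_shape
    counts = Counter()
    for r in range(img.shape[0] - th + 1):
        for c in range(img.shape[1] - tw + 1):
            window = img[r:r + th, c:c + tw]
            counts[tuple(window.ravel().tolist())] += 1   # pattern as a hashable key
    total = sum(counts.values())
    probs = {pattern: n / total for pattern, n in counts.items()}
    return counts, probs

# Hypothetical binary training image (1 = rain, 0 = no rain).
training = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
])

counts, probs = pattern_probabilities(training)
for pattern, p in sorted(probs.items()):
    print(pattern, counts[pattern], round(p, 3))
```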
Replicate generation -- Unconditional simulation
Training image
Measurement
rain/no rain probabilities + cluster size distribution preserved
Replicates
Conditional simulation
Training image
Replicates
Measurement
The conditional ensemble approach is analogous to “nudging” (van Leeuwen, 2010)
Constructing ensembles of proposal replicates for Bayesian estimation
How do we generate a moderate-sized proposal (or prior) that properly represents uncertainty in the measurement while including a reasonable number of replicates that are "close" to the true image?
measurement
truth
Constructing ensembles of proposal replicates for Bayesian estimation
Conditional (5% of pixels)
Conditional (20% of pixels)
500 replicates
Conditional (1% of pixels)
Conditional ensemble (1% of pixels) – sorted using Jaccard metric
Measurement
BEST
WORST
JACCARD DISTANCE
Conditional ensemble (5% of pixels) – sorted using Jaccard metric
Measurement
BEST
WORST
JACCARD DISTANCE
Conditional ensemble (20% of pixels) – sorted using Jaccard metric
Measurement
BEST
WORST
JACCARD DISTANCE
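The sorting shown in these panels can be sketched as follows. The replicates here are hypothetical shifted copies of a small measured storm, standing in for the conditional replicates; the helper name `sort_by_distance` is an assumption.

```python
import numpy as np

def jaccard_distance(a, b):
    """1 - A.B / (A.A + B.B - A.B) for non-negative images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    ab = float(a @ b)
    denom = float(a @ a) + float(b @ b) - ab
    return 0.0 if denom == 0.0 else 1.0 - ab / denom

def sort_by_distance(measurement, replicates):
    """Return replicate indices ordered from best (closest) to worst."""
    d = np.array([jaccard_distance(measurement, r) for r in replicates])
    order = np.argsort(d, kind="stable")
    return order, d[order]

# Hypothetical measurement: a 2x4 storm on a 6x10 grid.
meas = np.zeros((6, 10), dtype=int)
meas[2:4, 2:6] = 1

# Hypothetical replicates: the same storm shifted by 4, 0, 2 and 1 columns.
replicates = []
for shift in (4, 0, 2, 1):
    rep = np.zeros((6, 10), dtype=int)
    rep[2:4, 2 + shift:6 + shift] = 1
    replicates.append(rep)

order, sorted_d = sort_by_distance(meas, replicates)
print(order, sorted_d)   # smaller shifts sort first
```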
Conclusions
Clustered processes require a feature-based approach to Bayesian estimation which does not rely on Gaussian assumptions.
One option is to use importance sampling over the space of possible features. This requires that we 1) generate appropriate proposal images and 2) define an observation error probability measure based on an appropriate norm.
The Jaccard metric is a promising choice for this norm that orders differing images in an intuitive fashion.
Conditional multi-point random field generators can be used to produce realistic clustered proposal replicates.
Future work will combine these ideas to obtain a feature-based procedure for rainfall data assimilation.
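As a rough illustration of how the two ingredients named above could fit together, the sketch below turns distances between replicates and a measurement into normalized importance weights. The slides do not specify the observation error measure, so the exponential form, the scale `sigma`, the assumption that the proposal equals the prior, and the example distances are all hypothetical.

```python
import numpy as np

def importance_weights(distances, sigma=0.2):
    """Normalized weights w_i proportional to exp(-d_i^2 / (2 sigma^2)) (assumed form)."""
    d = np.asarray(distances, dtype=float)
    log_w = -0.5 * (d / sigma) ** 2
    log_w -= log_w.max()          # subtract the maximum for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

# Hypothetical Jaccard distances of four replicates from the measurement.
distances = [0.0, 0.4, 2.0 / 3.0, 1.0]
weights = importance_weights(distances)

# Effective sample size: near 1 means a single replicate dominates.
ess = 1.0 / float(np.sum(weights ** 2))
print(weights, ess)
```

A small effective sample size would indicate that few replicates are close to the measurement, which is the situation the conditional proposal ensembles are meant to avoid.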
Characterizing random fields using multipoint statistics
Proposal Replicates for Spatially Clustered Processes
Rafal Wojcik, Dennis McLaughlin, Hamed Almohammad and Dara Entekhabi, MIT, U.S.
Proposal ensemble generator
Measurements
Microwave LEO satellite (e.g. NOAA, TRMM, SSMI)
Geostationary satellite (e.g. GOES)
Feature preserving data assimilation scheme
Update
MAP estimate
Truth
Radar (e.g. NEXRAD)
Long-term objectives
Rain gage
Short-term objective
• Identify ways to characterize and generate random ensembles of realistic spatially clustered replicates (images) for ensemble-based data assimilation
• These procedures will be feature-based versions of particle filtering/importance sampling or MCMC.
…..
Possible alternatives – summer rain storms
[Figure: four panels, Replicate 1 to Replicate 4; horizontal axis about -99.5 to -97, vertical axis about 36 to 38.5, color scale 0 to 80]
Particle filter
Common assumption in particle filters:
Is a relevant measure of similarity between observations and proposal replicates?
Image characterization
Feature represented as a vector of pixel values
How do we describe a feature? -- Discretize over an n-pixel grid
$$x = [x_1, x_2, \ldots, x_{n-1}, x_n]$$
Feature support
2^n possible features
Feature support + texture
∞ possible features
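The count of possible feature supports can be checked by enumeration on a very small grid. The choice n = 4 is only for illustration; the count doubles with every added pixel, which is why enumeration is not an option for real images.

```python
from itertools import product

n = 4  # hypothetical grid with 4 pixels (e.g. 2x2)

# Every rain/no-rain assignment of the n pixels is one possible feature support.
supports = list(product((0, 1), repeat=n))

print(len(supports))   # equals 2 ** n
```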
Geometric aspects of a typical NEXRAD summer rainstorm
texture (rain intensity) within support
boundary of feature support
no rain
rain
boundary of clouds