Page 1

Spatially clustered processes are pervasive in nature

Can we do more to ensure that our estimates are physically realistic?

How can we incorporate their intermittent structure into ensemble data assimilation?

Forest fire, Colorado

Midwest thunderstorms (2D space, 1D time)

Algae bloom, Washington

Proposal replicates for spatially clustered processes

Rafal Wojcik, Dennis McLaughlin, Hamed Almohammad and Dara Entekhabi, MIT

Page 2

Rainfall Data Assimilation – Merging Diverse Observations

• Develop Bayesian (ensemble) data assimilation procedures that can efficiently merge remote sensing and ground-based measurements of spatially clustered processes (e.g. rainfall).

• These procedures will be feature-based versions of particle filtering/importance sampling or MCMC.

Page 3

Bayesian Perspective

Extend the Bayesian formalism to accommodate geometric features and to integrate prior information with new measurements:

Posterior(feature | measurement) = C × Likelihood(measurement | feature) × Prior(feature)

Use ensemble representation:

Relationship between true and measured images:

Gives likelihood expression in terms of

observation error PDF:

Proposal
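In symbols, the relationships named on this slide can be sketched as follows. The notation is assumed for illustration only, since the slide's own symbols are not recoverable here: $x$ is the true feature (image), $y$ is the measured image, $\varepsilon$ is the observation error with PDF $p_\varepsilon$, and $x^{(i)}$ is the $i$-th of $N$ replicates drawn from a proposal density $q$. The additive error model is also an assumption.

```latex
\begin{align}
p(x \mid y) &= C \, p(y \mid x) \, p(x),
  \qquad C = \left[ \int p(y \mid x) \, p(x) \, dx \right]^{-1} \\
y &= x + \varepsilon
  \quad \Rightarrow \quad p(y \mid x) = p_\varepsilon(y - x) \\
p(x \mid y) &\approx \sum_{i=1}^{N} w_i \, \delta\!\left(x - x^{(i)}\right),
  \qquad w_i \propto \frac{p\!\left(y \mid x^{(i)}\right) p\!\left(x^{(i)}\right)}{q\!\left(x^{(i)}\right)},
  \qquad \sum_{i=1}^{N} w_i = 1
\end{align}
```

When the proposal equals the prior, the weights reduce to the normalized likelihood values, which is why the choice of observation error PDF (and the norm behind it) matters.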

Page 4

Requirements for feature-based Bayesian formulation

Needed for feature-based Bayesian formulation:

1. Generate realistic clustered proposal images

[Figure: four example clustered rainfall images; horizontal axis ticks from -99.5 to -97, vertical axis ticks from 36 to 38.5, color scale 0 to 80]

2. Define observation error probability measure over set of possible error images.

Is a relevant measure of similarity between observations and proposal replicates?

Page 5

How can we define a measurement error norm?

• should preserve spatially intermittent features of the real process (e.g. rainfall)

• metrics used to compare replicates and measurements should be sensitive to clustering.

How similar are these images?

Page 6

Euclidean metric

Legend: rain replicate (= 1), measured rain (= 1), no rain (= 0)

Image pair 1: Euclidean dist = 4

Image pair 2: Euclidean dist = 4
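The insensitivity of the Euclidean metric to clustering can be illustrated with a small sketch. The 12 x 12 masks below are hypothetical and are not the images shown on the slide; they are chosen so that a slightly displaced storm cell and a replicate with no rain at all end up at the same Euclidean distance from the measurement.

```python
import numpy as np

def euclidean_distance(a, b):
    """Euclidean (L2) distance between two images treated as flat pixel vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

# Hypothetical binary rain masks (1 = rain, 0 = no rain).
measurement = np.zeros((12, 12), dtype=int)
measurement[2:6, 2:6] = 1            # one 4 x 4 measured storm cell

shifted = np.zeros((12, 12), dtype=int)
shifted[2:6, 4:8] = 1                # same cell displaced two pixels to the right

no_rain = np.zeros((12, 12), dtype=int)  # replicate with no rain anywhere

d_shifted = euclidean_distance(measurement, shifted)
d_no_rain = euclidean_distance(measurement, no_rain)
print(d_shifted, d_no_rain)  # both 4.0: 16 mismatched pixels in each case
```

Both replicates differ from the measurement in 16 pixels, so the metric cannot tell a nearly correct storm from a missing one.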

Page 7

Image characterization: cluster based image compression

Initial cluster centers and scattered rain pixels

Neural gas finds “best” locations for cluster centers

Legend: center of rain pixel; cluster center (xi, yi)

Image is concisely characterized by cluster centers’ coordinates (xi,yi)

Page 8

Image characterization: cluster based image compression

The neural gas (NG) algorithm identifies a 10-D feature vector characterizing each image replicate (the (xi, yi) coordinates of 5 cluster centers)
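A minimal sketch of how such a feature vector could be computed is given below. It uses the standard rank-based neural gas update; the function names, step-size and neighborhood schedules, iteration count, and test mask are all assumptions for illustration, not the settings used in the talk.

```python
import numpy as np

def neural_gas(points, n_centers=5, n_iter=2000, eps=(0.5, 0.01), lam=(2.5, 0.1), seed=0):
    """Fit cluster centers to 2-D points with the rank-based neural gas update.

    eps and lam are (initial, final) step size and neighborhood range;
    both decay exponentially over the iterations (assumed schedule).
    """
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Initialize centers at randomly chosen data points.
    centers = points[rng.choice(len(points), size=n_centers, replace=False)].copy()
    for t in range(n_iter):
        frac = t / max(n_iter - 1, 1)
        eps_t = eps[0] * (eps[1] / eps[0]) ** frac
        lam_t = lam[0] * (lam[1] / lam[0]) ** frac
        x = points[rng.integers(len(points))]       # one random rain pixel
        dist = np.linalg.norm(centers - x, axis=1)
        ranks = np.argsort(np.argsort(dist))        # rank 0 = closest center
        # Every center moves toward x, with a step that shrinks with its rank.
        centers += (eps_t * np.exp(-ranks / lam_t))[:, None] * (x - centers)
    return centers

def feature_vector(rain_mask, n_centers=5):
    """Flatten the fitted (x_i, y_i) centers into one vector of length 2 * n_centers."""
    rows, cols = np.nonzero(rain_mask)
    points = np.column_stack([cols, rows])          # (x, y) pixel coordinates
    return neural_gas(points, n_centers=n_centers).ravel()

# Hypothetical binary rain mask with two rain clusters.
mask = np.zeros((20, 20), dtype=int)
mask[3:7, 2:8] = 1
mask[12:17, 11:16] = 1

fv = feature_vector(mask)
print(fv.shape)  # (10,)
```

The order in which the centers appear in the vector depends on the random initialization, so two similar images can yield differently ordered vectors.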

Page 9

Image characterization: cluster based image compression

POOR RESULTS:

Numbering of neural gas centers has strong impact on aggregate distance measure.

[Figure: four image panels, each with neural gas centers numbered 1 to 5 in a different order]
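The numbering problem can be reproduced in a few lines. In this sketch the center coordinates are hypothetical; the point is only that the same five centers, numbered differently, produce a nonzero distance between their 10-D feature vectors.

```python
import numpy as np

# Hypothetical cluster centers (x_i, y_i) for one image, numbered 1..5.
centers = np.array([
    [2.0, 3.0],
    [5.0, 4.0],
    [9.0, 9.0],
    [12.0, 6.0],
    [14.0, 13.0],
])

# The same five centers, listed under a different numbering.
relabeled = centers[[4, 2, 0, 3, 1]]

# Distance between the flattened 10-D feature vectors.
d_same_numbering = float(np.linalg.norm(centers.ravel() - centers.ravel()))
d_permuted = float(np.linalg.norm(centers.ravel() - relabeled.ravel()))
print(d_same_numbering, d_permuted)
```

The two vectors describe the identical set of centers, yet only the first comparison returns zero.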

Page 10

Image characterization: Jaccard metric

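For two binary vectors (images) A and B, the Jaccard similarity is defined as:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

and the Jaccard metric is defined as:

$$d_J(A, B) = 1 - J(A, B)$$

This can be generalized for real positive vectors using:

$$J(A, B) = \frac{A \cdot B}{A \cdot A + B \cdot B - A \cdot B}$$

[Figure: Venn diagram of A and B; intersection AB, A only AA - AB, B only BB - AB, union AA + BB - AB]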

Page 11

Image characterization: Jaccard metric

Legend: rain replicate (= 1), measured rain (= 1), no rain (= 0)

Image pair 1: Jaccard dist = 0.8

Image pair 2: Jaccard dist = 0.7
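A short sketch of the Jaccard distance follows, written in the dot-product form so that one function covers both binary and real positive images. The masks are hypothetical and are not the slide's images, so the values differ from those quoted above. Returning 0 for two all-zero images is an assumed convention.

```python
import numpy as np

def jaccard_distance(a, b):
    """Jaccard distance 1 - (a.b) / (a.a + b.b - a.b).

    For binary images this equals 1 - |A and B| / |A or B|.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    ab = float(a @ b)
    denom = float(a @ a + b @ b) - ab
    if denom == 0.0:
        return 0.0  # assumed convention: two all-zero images are identical
    return 1.0 - ab / denom

# Hypothetical binary rain masks (1 = rain, 0 = no rain).
measurement = np.zeros((12, 12), dtype=int)
measurement[2:6, 2:6] = 1            # 4 x 4 measured storm cell

shifted = np.zeros((12, 12), dtype=int)
shifted[2:6, 4:8] = 1                # same cell displaced two pixels

no_rain = np.zeros((12, 12), dtype=int)

d_shifted = jaccard_distance(measurement, shifted)   # 1 - 8/24
d_no_rain = jaccard_distance(measurement, no_rain)   # 1 - 0/16

# Real positive example: identical intensity fields are at distance 0.
intensity = np.array([0.0, 2.0, 4.0])
d_identical = jaccard_distance(intensity, intensity)
print(d_shifted, d_no_rain, d_identical)
```

Unlike the Euclidean case, the displaced storm (about 0.67) is now ranked closer to the measurement than the replicate with no rain (1.0).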

Page 12

Feature Ensembles – Training Images & Priors

Multipoint technique identifies patterns within a moving template that scans training image

Replicate generator

Number of times each template pattern occurs

Pattern probability

Template patterns

Training image Template
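The pattern-counting step can be sketched as below. The 2 x 2 template, the tiny training image, and the function name are assumptions for illustration; a practical multipoint generator would use larger templates and training images.

```python
from collections import Counter

import numpy as np

def pattern_probabilities(training_image, template_shape=(2, 2)):
    """Slide a template over a binary training image and tabulate the patterns seen.

    Returns (counts, probs): occurrences of each pattern and their relative frequency.
    """
    img = np.asarray(training_image, dtype=int)
    th, tw = template_shape
    counts = Counter()
    for r in range(img.shape[0] - th + 1):
        for c in range(img.shape[1] - tw + 1):
            # Encode the pixels under the template as a hashable tuple.
            pattern = tuple(img[r:r + th, c:c + tw].ravel().tolist())
            counts[pattern] += 1
    total = sum(counts.values())
    probs = {pattern: n / total for pattern, n in counts.items()}
    return counts, probs

# Hypothetical training image with one 2 x 2 rain cluster.
training = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
])

counts, probs = pattern_probabilities(training)
for pattern, n in sorted(counts.items()):
    print(pattern, n, round(probs[pattern], 3))
```

A replicate generator can then draw pixel values so that local patterns appear with approximately these probabilities.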

Page 13

Replicate generation -- Unconditional simulation

Training image

Measurement

rain/no rain probabilities + cluster size distribution preserved

Replicates
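One way to check that a replicate preserves the rain/no rain probabilities and the cluster size distribution is to compute both statistics for the training image and for each replicate. The sketch below assumes 4-connected clusters and uses a hypothetical test image; the function names are illustrative.

```python
from collections import Counter, deque

import numpy as np

def cluster_sizes(rain_mask):
    """Sizes of 4-connected rain clusters, found by breadth-first flood fill."""
    mask = np.asarray(rain_mask).astype(bool)
    seen = np.zeros(mask.shape, dtype=bool)
    n_rows, n_cols = mask.shape
    sizes = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if not mask[r0, c0] or seen[r0, c0]:
                continue
            seen[r0, c0] = True
            queue = deque([(r0, c0)])
            size = 0
            while queue:
                r, c = queue.popleft()
                size += 1
                for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if inside and mask[rr, cc] and not seen[rr, cc]:
                        seen[rr, cc] = True
                        queue.append((rr, cc))
            sizes.append(size)
    return sorted(sizes)

def clustering_summary(rain_mask):
    """Rain probability and histogram of cluster sizes for one binary image."""
    mask = np.asarray(rain_mask).astype(bool)
    return {
        "rain_fraction": float(mask.mean()),
        "cluster_size_counts": dict(Counter(cluster_sizes(mask))),
    }

# Hypothetical image with three separate rain clusters.
image = np.zeros((6, 6), dtype=int)
image[0:2, 0:2] = 1      # cluster of 4 pixels
image[4, 1:4] = 1        # cluster of 3 pixels
image[2, 5] = 1          # isolated rain pixel

stats = clustering_summary(image)
print(stats)
```

Comparing these summaries between the training image and the replicates gives a quick diagnostic of whether the generator reproduces the intended clustering.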

Page 14

Conditional simulation

Training image

Replicates

Measurement

The conditional ensemble approach is analogous to “nudging” (van Leeuwen, 2010)

Page 15

Constructing ensembles of proposal replicates for Bayesian estimation

How do we generate a moderate-sized proposal (or prior) that properly represents uncertainty in the measurement while including a reasonable number of replicates that are "close" to the true image?

measurement

truth

Page 16

Constructing ensembles of proposal replicates for Bayesian estimation

500 replicates

Conditional (1% of pixels)

Conditional (5% of pixels)

Conditional (20% of pixels)

Page 17

Conditional ensemble (1% of pixels) – sorted using Jaccard metric

Measurement

[Figure: replicates sorted from BEST to WORST by Jaccard distance to the measurement]

Page 18

Conditional ensemble (5% of pixels) – sorted using Jaccard metric

Measurement

[Figure: replicates sorted from BEST to WORST by Jaccard distance to the measurement]

Page 19

Conditional ensemble (20% of pixels) – sorted using Jaccard metric

Measurement

[Figure: replicates sorted from BEST to WORST by Jaccard distance to the measurement]
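Sorting an ensemble by Jaccard distance to the measurement, as in the three preceding figures, can be sketched as follows. The three-member ensemble and the function names are hypothetical; the ensembles in the talk contain many more replicates produced by the conditional generator.

```python
import numpy as np

def jaccard_distance(a, b):
    """Jaccard distance 1 - (a.b) / (a.a + b.b - a.b) between two images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    ab = float(a @ b)
    denom = float(a @ a + b @ b) - ab
    return 0.0 if denom == 0.0 else 1.0 - ab / denom

def sort_by_jaccard(replicates, measurement):
    """Return replicate indices ordered best to worst, with their distances."""
    distances = np.array([jaccard_distance(r, measurement) for r in replicates])
    order = np.argsort(distances, kind="stable")
    return order, distances[order]

# Hypothetical measurement and three-member ensemble.
measurement = np.zeros((12, 12), dtype=int)
measurement[2:6, 2:6] = 1

no_rain = np.zeros((12, 12), dtype=int)
shifted = np.zeros((12, 12), dtype=int)
shifted[2:6, 4:8] = 1
exact = measurement.copy()

ensemble = [no_rain, shifted, exact]
order, sorted_distances = sort_by_jaccard(ensemble, measurement)
print(order.tolist())             # [2, 1, 0]: exact, shifted, no_rain
print(sorted_distances.round(3))
```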

Page 20

Conclusions

Clustered processes require a feature-based approach to Bayesian estimation which does not rely on Gaussian assumptions.

One option is to use importance sampling over the space of possible features. This requires that we 1) generate appropriate proposal images and 2) define an observation error probability measure based on an appropriate norm.

The Jaccard metric is a promising choice for this norm that orders differing images in an intuitive fashion.

Conditional multi-point random field generators can be used to produce realistic clustered proposal replicates.

Future work will combine these ideas to obtain a feature-based procedure for rainfall data assimilation.
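As a rough sketch of how these pieces might fit together, the snippet below turns Jaccard distances into importance weights. Two assumptions are made that the slides do not state: the proposal is taken to equal the prior, so the weights are proportional to the likelihood, and the observation error density is taken to be Gaussian-shaped in the Jaccard distance with a hypothetical scale `sigma`. The actual error probability measure is left open in the talk.

```python
import numpy as np

def importance_weights(distances, sigma=0.2):
    """Normalized weights w_i proportional to exp(-d_i^2 / (2 sigma^2)).

    ASSUMPTION: Gaussian-shaped error density in the distance d_i, with the
    proposal equal to the prior. sigma is a hypothetical tuning parameter.
    """
    d = np.asarray(distances, dtype=float)
    log_w = -0.5 * (d / sigma) ** 2
    log_w -= log_w.max()              # stabilize before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

def effective_sample_size(weights):
    """1 / sum(w_i^2): close to N for even weights, close to 1 when one dominates."""
    w = np.asarray(weights, dtype=float)
    return float(1.0 / np.sum(w ** 2))

# Hypothetical Jaccard distances of three replicates from the measurement.
distances = [0.1, 0.5, 0.9]
weights = importance_weights(distances)
ess = effective_sample_size(weights)
print(weights.round(4), round(ess, 3))
```

A small effective sample size signals that few replicates are close to the measurement, which is the motivation for conditioning the proposal on a fraction of the measured pixels.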

Page 21

Characterizing random fields using multipoint statistics

Page 22

Conclusions

Page 23

Conclusions

Page 24

Proposal Replicates for Spatially Clustered Processes

Rafal Wojcik, Dennis McLaughlin, Hamed Almohammad and Dara Entekhabi, MIT, U.S.

Page 25

Long-term objectives

Measurements:

• Microwave LEO satellite (e.g. NOAA, TRMM, SSMI)

• Geostationary satellite (e.g. GOES)

• Radar (e.g. NEXRAD)

• Rain gage

Proposal ensemble generator

Feature preserving data assimilation scheme

Update

MAP estimate

Truth

Page 26

Short-term objective

• Identify ways to characterize and generate random ensembles of realistic spatially clustered replicates (images) for ensemble-based data assimilation

• These procedures will be feature-based versions of particle filtering/importance sampling or MCMC.

Possible alternatives – summer rain storms

[Figure: four rainfall replicates (Replicate 1 to Replicate 4); horizontal axis ticks from -99.5 to -97, vertical axis ticks from 36 to 38.5, color scale 0 to 80]

Page 27

Particle filter

Page 28

Common assumption in particle filters:

Is a relevant measure of similarity between observations and proposal replicates?

Page 29

Image characterization

How do we describe a feature? Discretize over an n-pixel grid.

Feature represented as a vector of pixel values:

$$x = [x_1, x_2, \ldots, x_n]^T$$

Feature support: $2^n$ possible features

Feature support + texture: ∞ possible features

Geometric aspects of a typical NEXRAD summer rainstorm

[Figure labels: texture (rain intensity) within support; boundary of feature support; no rain; rain; boundary of clouds]