Machine Learning for Scientific Applications at the European Space Agency Summer School on Remote Sensing
Machine Learning for Scientific Applications
David Lary, http://davidlary.info
Need: Accounting for complex multi-variate context, which is often not fully described by theory
Monday, August 11, 14
Long Term Data Sets: Uncertainty, Cross-Calibration,
Data Fusion & Machine Learning
Motivated by Data Assimilation
With examples from Land, Atmosphere & Ocean
Bias Detection
“Who may discern his errors, ....” Psalm 19:12
Why is it an issue?
• With fusion of multiple datasets bias is often an issue (very relevant for climate variables).
• Data assimilation is a least-squares method, a Best Linear Unbiased Estimator (BLUE), so it assumes that the observations it combines are unbiased.
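To make the BLUE point concrete, here is a minimal sketch of a scalar analysis step. The state, variances, and bias value are illustrative assumptions, not numbers from the slides. It shows that when an observation carries a bias, most of that bias passes straight into the analysis.

```python
def blue_analysis(x_b, var_b, y, var_o):
    """Scalar BLUE: combine background x_b and observation y by inverse-variance weighting."""
    gain = var_b / (var_b + var_o)   # weight given to the observation
    x_a = x_b + gain * (y - x_b)     # analysis (minimum-variance estimate)
    var_a = (1.0 - gain) * var_b     # analysis error variance
    return x_a, var_a


truth = 300.0              # hypothetical true value
x_b, var_b = 300.0, 4.0    # hypothetical unbiased background and its error variance
var_o = 1.0                # hypothetical observation error variance
bias = 5.0                 # hypothetical instrument bias

unbiased, _ = blue_analysis(x_b, var_b, truth, var_o)
biased, _ = blue_analysis(x_b, var_b, truth + bias, var_o)

print("analysis error with an unbiased observation:", unbiased - truth)
print("analysis error with a biased observation:  ", biased - truth)
```

With these assumed variances the gain is 0.8, so four fifths of the observation bias ends up in the analysis.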
.... runs deeper still
• Instrument teams have a keen sense of faithfully reporting the data as it is, warts and all. They are naturally loath to correct biases empirically; they would like to understand the cause of the bias and data issues theoretically, from first principles.
The Earth System is so complex, with many interacting processes, and the instruments are often also complex, so this is not always possible.
Residual data issues can, and usually do, remain.
• Modelers know that data biases exist, but are very reluctant to make changes to data products.
.... we therefore have a problem of closure.
The problem!
• Biases are ubiquitous, and not all of them can be explained theoretically. Yet we typically need to fuse multiple datasets to construct long-term time series and/or improve global coverage.
• If the biases are not corrected before data fusion we introduce further problems, such as ...
• spurious trends, leading to the possibility of unsuitable policy decisions.
• when assimilation is involved, the suboptimal use of observations, non-physical structures in the analysis, biases in the assimilated fields, and extrapolation of biases due to multivariate background constraints.
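The spurious-trend problem can be sketched with two hypothetical instruments observing a quantity that does not change. The years, values, and the 0.3 offset below are invented for illustration. Fusing the records naively produces a trend; removing the offset estimated from the overlap period removes it.

```python
def ols_slope(t, y):
    """Ordinary least-squares slope of y against t."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den


def fuse(first, second):
    """Naive fusion: use the first record where available, otherwise the second."""
    years = sorted(set(first) | set(second))
    return years, [first.get(yr, second.get(yr)) for yr in years]


# Hypothetical records: the true quantity is constant (no trend).
obs_a = {yr: 3.0 for yr in range(1990, 2001)}        # instrument A, unbiased
obs_b = {yr: 3.0 + 0.3 for yr in range(1998, 2011)}  # instrument B, constant +0.3 bias

years, fused = fuse(obs_a, obs_b)
naive_trend = ols_slope(years, fused)

# Estimate the offset from the overlap period and remove it before fusing.
overlap = sorted(set(obs_a) & set(obs_b))
offset = sum(obs_b[yr] - obs_a[yr] for yr in overlap) / len(overlap)
obs_b_corrected = {yr: value - offset for yr, value in obs_b.items()}
years, fused_corrected = fuse(obs_a, obs_b_corrected)
corrected_trend = ols_slope(years, fused_corrected)

print(f"naive trend:     {naive_trend:.4f} per year")
print(f"corrected trend: {corrected_trend:.4f} per year")
```

A constant offset is the simplest possible bias model; the slides go on to argue that real biases depend on many variables, which is where machine learning comes in.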
A Further Problem
The instruments whose data we would like to fuse are often not making coincident measurements in time or space.
Imperative to inter-compare observations in their appropriate context.
Integrate multiple satellite datasets for applications
The comparison above shows the total ozone column observed by EP TOMS and Aura OMI. The high resolution coverage that Aura OMI provides is clearly seen. In the particular event shown there is a tropopause fold event over Texas.
An Example
Representativeness
[Figure: relative frequency of O3 v.m.r. (0.5 to 4 × 10⁻⁶), all years 01, potential temperature 1900 K to 2300 K, equivalent latitude −90° to −79°. Legend: Aura MLS O3 (23), CLAES v9 O3 (207), ISAMS v10 O3 (19), UARS MLS v5 183 GHz O3 (379), UARS MLS v5 205 GHz O3 (490), SAGE 2 v6.2 O3 (21), SBUV v8 O3 (33).]
Geophysical Insights
Figure 2: N2O equivalent PV latitude - potential temperature cross sections of (a) representativeness uncertainty (v.m.r.), (b) observational uncertainty (v.m.r.), (c) observation (v.m.r.), and (d) analyses uncertainty (v.m.r.). The data used are from the Upper Atmosphere Research Satellite (UARS) Cryogenic Limb Array Etalon Spectrometer (CLAES) version 9 for January 1992.
Bias is Spatially Dependent
[Figure: % Bias (UARS MLS v5 183 GHz O3 − HALOE v19 O3) for January of all years, plotted against Equivalent PV Latitude (−75 to 75) and Potential Temperature (250 K to 2500 K); color scale from −30 to 30.]
So what can we do about this?
.... we do not have a theoretical explanation
Machine Learning
for when our understanding is incomplete
... and that is quite often!
What is Machine Learning?
• Machine learning is a sub-field of artificial intelligence that is concerned with the design and development of algorithms that allow computers to learn the behavior of data sets empirically.
• A major focus of machine-learning research is to produce (induce) empirical models from data automatically.
• This approach is usually used because of the absence of adequate and complete theoretical models that are more desirable conceptually.
What is Machine Learning?
The use of machine learning can actually help us to construct a more complete theoretical model, as it allows us to determine which factors are statistically capable of providing the data mappings we seek, e.g. the multi-variate, non-linear, non-parametric mapping between satellite radiances and a suite of ocean products.
Machine Learning
Is for:
Regression
➡ Multivariate, non-linear, non-parametric
Classification
➡ Supervised and unsupervised
Machine Learning
Comes in Several Flavors, for example:
• Neural Networks
• Support Vector Machines
• Gaussian Process Models
• Decision Trees
• Random Forests
Machine Learning Regression
Training Data: inputs x1, x2, x3, x4, x5, …, xn and output(s) y
y = f(x1, x2, x3, x4, x5, …, xn)
Multivariate, non-linear, non-parametric; n can be very large
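As a minimal sketch of learning y = f(x1, …, xn) from training data without assuming a functional form, here is a k-nearest-neighbour regressor. It is chosen only because it fits in a few lines and is not one of the methods named in these slides. The training function and grid are hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict y at `query` as the mean output of the k nearest training inputs."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical training data: two inputs on a 5 x 5 grid, one non-linear output.
grid = [i / 4 for i in range(5)]
train_x = [(a, b) for a in grid for b in grid]
train_y = [math.sin(math.pi * a) * b ** 2 for a, b in train_x]

estimate = knn_predict(train_x, train_y, (0.5, 1.0), k=1)  # query is a training point
smoothed = knn_predict(train_x, train_y, (0.4, 0.9), k=3)  # query between training points

print(estimate, smoothed)
```

The same call works unchanged for any number of inputs, since `math.dist` accepts points of any dimension.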
Machine Learning Supervised Classification
Training Data: inputs x1, x2, x3, x4, x5, …, xn and output(s) y
Multivariate, non-linear, non-parametric; n can be very large
Machine Learning Unsupervised Classification
Training Data: inputs x1, x2, x3, x4, x5, …, xn only
Multivariate, non-linear, non-parametric; n can be very large
Neural Networks
In a neural network model, simple nodes (neurons) are connected together to form a network of nodes. Its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.
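The idea of altering connection weights to produce a desired output can be sketched with the smallest possible case: a single sigmoid neuron trained by gradient descent. The toy target (logical OR), learning rate, and epoch count are assumptions for illustration; a practical network has many such nodes arranged in layers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, targets, learning_rate=1.0, epochs=2000):
    """Batch gradient descent on the cross-entropy loss of one sigmoid neuron."""
    n_inputs = len(samples[0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        for x, t in zip(samples, targets):
            out = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = out - t  # loss gradient with respect to the pre-activation
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        # Adjust each connection strength against its averaged gradient.
        weights = [w - learning_rate * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(samples)
    return weights, bias


samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 1]  # logical OR, a hypothetical toy target

weights, bias = train_neuron(samples, targets)
predictions = [
    round(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)) for x in samples
]
print(predictions)
```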
Support Vector Machines
Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression.
Intuitively, an SVM model is a representation of the training examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible.
Vladimir Vapnik
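The "gap that is as wide as possible" can be made concrete by measuring the margin of a candidate separating line. This sketch only compares two hand-picked lines on invented points; an actual SVM solver searches for the widest-margin separator automatically.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the line w.x + b = 0.

    The result is positive only if every point lies on the side given by
    its label (+1 or -1).
    """
    norm = math.hypot(w[0], w[1])
    return min(
        label * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, label in zip(points, labels)
    )


# Hypothetical two-class training examples in the plane.
points = [(0, 0), (0, 1), (2, 0), (2, 1)]
labels = [-1, -1, +1, +1]

margin_a = geometric_margin((1, 0), -1.0, points, labels)  # the line x = 1
margin_b = geometric_margin((1, 0), -0.5, points, labels)  # the line x = 0.5

print(margin_a, margin_b)
```

Both lines separate the classes, but the first leaves the wider gap, so it is the one a maximum-margin method would prefer.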
Gaussian Process Models
Gaussian processes (GPs) (Rasmussen and Williams 2006) fit a multivariate Gaussian probability distribution to any set of regressors, allowing for analytic inference. As a principled Bayesian technique, GPs go beyond SVMs by allowing us to supply a full posterior distribution for our regressors, giving us both mean estimates as well as an indication of the uncertainty in them.
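A minimal sketch of how a GP yields both a mean estimate and an uncertainty, assuming a zero-mean prior, a squared-exponential kernel with unit length scale, and three invented training points. The small linear solver is included only to keep the example free of external libraries.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def gp_predict(train_x, train_y, query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at one query point."""
    k_train = [
        [rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)
    ]
    k_query = [rbf(a, query) for a in train_x]
    alpha = solve(k_train, train_y)
    mean = sum(k * a for k, a in zip(k_query, alpha))
    v = solve(k_train, k_query)
    variance = rbf(query, query) - sum(k * vi for k, vi in zip(k_query, v))
    return mean, variance


train_x = [0.0, 1.0, 2.0]  # hypothetical inputs
train_y = [0.0, 1.0, 0.0]  # hypothetical outputs

mean_near, var_near = gp_predict(train_x, train_y, 1.0)  # at a training point
mean_far, var_far = gp_predict(train_x, train_y, 5.0)    # far from all data

print(mean_near, var_near)
print(mean_far, var_far)
```

The variance is close to zero at a training point and returns towards the prior variance far from the data, which is the uncertainty indication the text describes.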
Random Forest
Random forests are an ensemble learning method for classification (and regression) that operate by constructing a multitude of decision trees, hence a forest. The approach was developed by Leo Breiman and Adele Cutler.
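The ensemble idea can be sketched with a deliberately simplified forest. Each "tree" here is a one-split stump, fitted to a bootstrap sample using one randomly chosen feature, and the forest predicts by majority vote. Real random forests grow deeper trees and re-sample the candidate features at every split. The data and parameter values are invented.

```python
import random


def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature: predict 1 when value > threshold."""
    values = sorted(set(row[feature] for row in rows))
    if len(values) == 1 or len(set(labels)) == 1:
        # Degenerate bootstrap sample: fall back to a constant prediction.
        majority = round(sum(labels) / len(labels))
        return feature, (float("-inf") if majority == 1 else float("inf"))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        errors = sum(
            (row[feature] > threshold) != bool(label)
            for row, label in zip(rows, labels)
        )
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return feature, best[1]


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))           # random feature for this tree
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest


def predict(forest, row):
    votes = sum(row[feature] > threshold for feature, threshold in forest)
    return int(votes * 2 > len(forest))  # majority vote


# Hypothetical two-feature data with two well-separated classes.
rows = [(i, i + 1) for i in range(5)] + [(i, i + 1) for i in range(10, 15)]
labels = [0] * 5 + [1] * 5

forest = fit_forest(rows, labels)
print(predict(forest, (2, 3)), predict(forest, (12, 13)))
```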
A key issue is training dataset size: the bigger the better!
..... until we run out of memory
Variations in Stratospheric Cly Between 1991 and the present
David Lary, Anne Douglass, Darryn Waugh, Richard Stolarski, Paul Newman, Hamse Mussa
• Data can be biased, maybe as a function of many parameters.
• May be observing a proxy for what we really want to know.
ozone reductions there (SOCOL and E39C), and the model with the largest cold bias in the Antarctic lower stratosphere in spring (LMDZrepro) simulates very low ozone.
CCMs show a large range of ozone trends over the past 25 years (see left panels in Figure 3-26 of Chapter 3) and large differences from observations. Some of these differences may in part be related to differences in the simulated Cly, e.g., E39C and SOCOL show a trend smaller than observed, whereas AMTRAC and UMETRAC show a trend larger than observed in extrapolar area weighted mean column ozone. However, other factors also contribute, e.g., biases in tropospheric ozone (Austin and Wilson, 2006).
The CCM evaluation discussed above and in Eyring et al. (2006) has guided the level of confidence we place on each model simulation. The CCMs vary in their skill in representing different processes and characteristics of the atmosphere. Because the focus here is on ozone recovery due to declining ODSs, we place importance on the models’ ability to correctly simulate stratospheric Cly as well as the representation of transport characteristics and polar temperatures. Therefore, more credence is given to those models that realistically simulate these processes. Figure 6-7 shows a subset of the diagnostics used to evaluate these processes and CCMs shown with solid curves in Figures 6-7, 6-8, 6-10 and 6-12 to 6-14 are those that are in good agreement with the observations in Figure 6-7. However, these line styles should not be over-interpreted as both the ability of the CCMs to represent these processes as well as the relative importance of Cly, temperature, and transport vary between different regions and altitudes. Also, analyses of model dynamics in the Arctic, and differences in the chlorine budget/partitioning in these models, when available, might change this evaluation for some regions and altitudes.
[Page header of the excerpt: 21st CENTURY OZONE LAYER, 6.26]
Figure 6-8. October zonal mean values of total inorganic chlorine (Cly in ppb) at 50 hPa and 80°S from CCMs. Panel (a) shows Cly and panel (b) difference in Cly from that in 1980. The symbols in (a) show estimates of Cly in the Antarctic lower stratosphere in spring from measurements from the UARS satellite in 1992 and the Aura satellite in 2005, yielding values around 3 ppb (Douglass et al., 1995; Santee et al., 1996) and around 3.3 ppb (see Figure 4-8), respectively.
[Figure 6-8 axes: Cly (ppbv) and Cly − Cly(1980) (ppbv) versus Year; both panels 50 hPa, 80°S, October.]
A large range of Cly in the model simulations
Constrained by a limited number of Cly observations
• We need to know the distribution of inorganic chlorine (Cly) in the stratosphere to:
• Attribute changes in stratospheric ozone to changes in halogens.
• Assess the realism of chemistry-climate models.
Cly = HCl + ClONO2 + ClO + HOCl + 2 Cl2O2 + 2 Cl2
[Slide annotations on data availability: Long time-series; Sporadic; Long time-series; Since 2004]
Estimating Cly is hampered by lack of observations
Estimating Cly is hampered by inter-instrument biases
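The Cly sum above translates directly into code. The species names follow the formula; the mixing ratios are invented values in ppbv, chosen only to exercise the arithmetic.

```python
def total_inorganic_chlorine(vmr):
    """Cly = HCl + ClONO2 + ClO + HOCl + 2 Cl2O2 + 2 Cl2 (volume mixing ratios)."""
    chlorine_atoms = {"HCl": 1, "ClONO2": 1, "ClO": 1, "HOCl": 1, "Cl2O2": 2, "Cl2": 2}
    return sum(n * vmr[species] for species, n in chlorine_atoms.items())


# Hypothetical mixing ratios in ppbv, not measurements.
sample = {"HCl": 2.0, "ClONO2": 0.5, "ClO": 0.25, "HOCl": 0.25, "Cl2O2": 0.125, "Cl2": 0.0}
cly = total_inorganic_chlorine(sample)
print(cly)
```

The factor of 2 on Cl2O2 and Cl2 counts the two chlorine atoms each molecule carries. A missing species raises a `KeyError`, which mirrors the point that Cly cannot be formed when an observation is absent.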
Using PDFs for Bias Detection
[Figure: relative frequency of HCl v.m.r. (0.8 to 2 × 10⁻⁹) for 2005/01, potential temperature 460 K to 590 K, equivalent latitude 49° to 61°. Legend: ACE v2.2 HCl (75), Aura MLS HCl (1544), HALOE v19 HCl (101). Labels: HALOE, Aura, HCl.]
http://www.pdfcentral.info/
If we now repeat this globally for all periods of overlap
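A minimal sketch of the PDF comparison for one bin, assuming two instruments have sampled the same equivalent-latitude and potential-temperature bin in the same month. The sample values are invented. Relative frequencies are used because the instruments contribute different numbers of observations, and the difference of medians is one simple, assumed summary of the offset between the two distributions.

```python
import statistics


def relative_frequency(values, edges):
    """Fraction of values falling in each [edges[i], edges[i + 1]) bin."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


# Hypothetical HCl samples (ppbv) from two instruments in the same bin and month.
instrument_a = [1.20, 1.25, 1.30, 1.30, 1.35, 1.40, 1.45]
instrument_b = [1.35, 1.40, 1.45, 1.45, 1.50, 1.55, 1.60, 1.60]

edges = [1.0 + 0.1 * i for i in range(8)]
pdf_a = relative_frequency(instrument_a, edges)
pdf_b = relative_frequency(instrument_b, edges)
median_offset = statistics.median(instrument_b) - statistics.median(instrument_a)

print(pdf_a)
print(pdf_b)
print(f"median offset: {median_offset:.3f} ppbv")
```

Repeating this over every bin and every period of overlap gives the offset as a function of location and time, rather than a single number.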
HCl Inter-comparisons
[Figures: scatter plots against HALOE HCl (ppbv), each showing the data, the 1:1 line, and a weighted fit.]
ATMOS HCl vs. HALOE HCl: slope = 1.05, intercept = 0.23 ppbv
ACE v2.2 HCl vs. HALOE HCl: slope = 1.18, intercept = −0.050 ppbv
MLS HCl vs. HALOE HCl: slope = 1.09, intercept = 0.070 ppbv
NN adjusted MLS HCl vs. HALOE HCl: slope = 0.995, intercept = 0.0093 ppbv
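The slopes and intercepts quoted for the inter-comparisons come from weighted fits. The slides do not say which weighting scheme was used, so the sketch below assumes the simplest one: a straight line fitted with weights 1/sigma² on the y values only. The data are synthetic, generated from a known line so that the fit can be checked.

```python
def weighted_linear_fit(x, y, sigma):
    """Weighted least-squares line y = slope * x + intercept, weights 1 / sigma**2."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Synthetic comparison (ppbv): a reference instrument x and a second instrument y
# that reads 9% high with a 0.07 ppbv offset. Illustrative values only.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y = [1.09 * xi + 0.07 for xi in x]
sigma = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3]  # hypothetical uncertainties on y

slope, intercept = weighted_linear_fit(x, y, sigma)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f} ppbv")
```

A slope near 1 and an intercept near 0 indicate agreement with the reference, which is what the NN adjusted comparison in the list above shows.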
Neurological algorithms
[Diagram labels: Inputs, Process, Outputs]
An example neural network
[Diagram labels: Inputs, Process, Outputs]
Objective design of neural networks using genetic algorithms
Re-calibration using a Neural Network
[Figure: HCl neural network outputs A versus targets T (both 0.5 to 3.5 × 10⁻⁹), showing the data points, the best linear fit, and the line A = T.]
HCl Training Outputs vs. Targets, R=0.98739; linear fit: A = (0.97)T + (5e−11)
HCl Validation Outputs vs. Targets, R=0.99232; linear fit: A = (0.98)T + (2.9e−11)
Totally independent validation
Long-term continuity
Applied Neural Network Re-calibration to HALOE
[Figure: October monthly average Cly (0 to 4 × 10⁻⁹) versus Year (1995 to 2005), in two panels labeled 2° and 61°. Curves: 800 K, 525 K, and 6, 5, 4, 3, and 2 Year Age.]
Use neural networks to infer Cly from HCl, CH4, ϕpv, and θ.
Long-term continuity for Cly
Other uses of machine learning
• Cross calibration of vegetation indices from AVHRR, MODIS, SPOT and SeaWiFS
• Inferring CO2 fluxes from vegetation indices and surface temperature
• Inferring ocean pigment concentrations and other parameters
• Inferring drought stress and endophyte infection in cacao (cocoa)
• Learning the chaotically tumbling orbit of the Hubble Space Telescope
• Detecting online eBay fraud
• Acceleration of expensive code elements
Another application: dissolved organic carbon
Method used to estimate DOC        R
SeaWiFS bands GP NL                0.99977
MODIS bands GP NL                  0.9997
All bands GP NL                    0.99901
UV & SeaWiFS bands GP NL           0.99899
All bands NN                       0.95859
UV & SeaWiFS bands NN              0.94609
MODIS bands NN                     0.92585
SeaWiFS bands NN                   0.91653
[Figure: Taylor diagram (standard deviation, correlation coefficient, RMSD) with points A to H; the Gaussian Process Models are labeled.]
Relative Importance of the Inputs
Wavelength    Relative Importance
Rrs490        0.00087123    (most important)
Rrs555        0.011976
Rrs670        1.5876
Rrs510        9.8423
Rrs443        13.0898
Rrs412        20.2553       (least important)
The GPM hyper-parameters give an indication of the relative importance of the inputs. For the DOC SeaWiFS bands, the best inputs are those with the smallest values; here they are sorted in order of importance.
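Ranking the inputs is then just a sort on the hyper-parameter values, smallest first, following the rule stated on the slide. The values are copied from the table; the variable names are illustrative.

```python
# Hyper-parameter values copied from the relative-importance table.
hyper_parameters = {
    "Rrs412": 20.2553,
    "Rrs443": 13.0898,
    "Rrs490": 0.00087123,
    "Rrs510": 9.8423,
    "Rrs555": 0.011976,
    "Rrs670": 1.5876,
}

# Smallest value first, i.e. most important input first.
ranked = sorted(hyper_parameters, key=hyper_parameters.get)

for rank, band in enumerate(ranked, start=1):
    print(rank, band, hyper_parameters[band])
```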
[Figure: Salinity (0 to 40) versus a412−a443 (−0.5 to 2). Legend: Data, Polynomial (r2=0.928), NN (r2=0.933), SVM (r2=0.933).]
Visibility
Variable    R
Td          -0.29
q           -0.26
T           -0.19
U           -0.18
RH          0.13
SLP         0.05
High Resolution Identification of Dust Sources Using Machine Learning and Remote Sensing Data
Annette Walker and David J. Lary
A42A-08
NRL High-resolution Dust Source Database
Approach and Methodology
• 10 years of DEP (2 yr MSG/RGB) imagery
• COAMPS 10 m wind overlays
• Surface weather plots
• ENVI (GIS-like software)
• NGDC topographical 10º × 10º tiles
• Overlay 0.25º grid or use Google Earth (GE)
• Dust source area entered into database (cursor location tool = 1 km precision)
• Cross-correlate land and water features using maps, atlases, Landsat images (detailed topographic, geographic, and geomorphic information, GE)
• Technical and governmental reports
[Images: 20030820 NRL DEP (Iran, Pakistan); 20030820 MODIS True Color; 20110630 NRL MSG/RGB (Saudi Arabia)]
NRL High-resolution Dust Source Database
Solid red and purple shapes identify dust source areas located using DEP and MSG.
[Maps: SW Asia DSD (Saudi Arabia); East Asia DSD (Mongolia)]
Self-Organizing Map
Self-organizing maps (SOMs) are a data visualization and unsupervised classification technique invented by Professor Teuvo Kohonen (Kohonen 1982; 1990) that reduces the dimensions of data through the use of self-organizing neural networks.
They help us address the issue that humans simply cannot visualize high dimensional data.
Self-Organizing Map
SOMs reduce dimensionality by producing a map that objectively plots the similarities of the data by grouping similar data items together.
SOMs learn to classify input vectors according to how they are grouped in the input space.
SOMs learn both the distribution and topology of the input vectors they are trained on. This approach allows SOMs to accomplish two things: reduce dimensions and display similarities.
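A minimal sketch of SOM training, assuming a one-dimensional map of three nodes, a Gaussian neighborhood, and fixed learning rate and neighborhood width. Practical SOMs usually use a two-dimensional map and shrink both over time. The four input vectors are invented and form two obvious groups.

```python
import math


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def train_som(data, n_nodes=3, learning_rate=0.5, sigma=0.5, epochs=20):
    """Train a 1-D SOM (a line of nodes) with a Gaussian neighborhood."""
    dim = len(data[0])
    lo = [min(x[d] for x in data) for d in range(dim)]
    hi = [max(x[d] for x in data) for d in range(dim)]
    # Deterministic start: nodes spread evenly between the data extremes.
    weights = [
        [lo[d] + i / (n_nodes - 1) * (hi[d] - lo[d]) for d in range(dim)]
        for i in range(n_nodes)
    ]
    for _ in range(epochs):
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i, w in enumerate(weights):
                # Nodes near the winner on the map move most; distant nodes barely move.
                influence = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    w[d] += learning_rate * influence * (x[d] - w[d])
    return weights


# Hypothetical input vectors forming two groups.
data = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (0.9, 0.9)]
weights = train_som(data)

low_class = best_matching_unit(weights, (0.05, 0.05))
high_class = best_matching_unit(weights, (0.95, 0.95))
print(low_class, high_class)
```

The index of the best matching node acts as the class label: inputs from the same group land on the same node, and the two groups land on opposite ends of the map.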
Detecting Dust Sources
Self Organizing Map Classification
7 Bands, MODIS MCD43C3 bihemispherical reflectance
All 1000-Classes mapped for North Africa
Libyan Dust Event: May 9, 2010 (8Z – 12Z)
Jabal al Akhdar (الجبل الأخضر, Al Ǧabal al 'Aḫḍar, English: Green Mountains)
A coastal mountain range with height 1.0-1.5 km.
Libyan Dust Event: May 9, 2010 (6Z – 8Z)
Plumes originate on the leeward side of Al Jabal al Akhdar, where drainage occurs along slopes.
Corresponding SOM-Classes: 49, 93, 94
Chad: Bodélé Depression Dust Event: March 16, 2010 (7Z -12Z)
The Bodélé Depression, located at the southern edge of the Sahara Desert in north central Africa, is the lowest point in Chad. Dust storms from the Bodélé Depression occur on average about 100 days per year. The Bodélé Depression is a single spot in the Sahara that provides most of the mineral dust to the Amazon forest.
Selected SOM Classes
Chad: Bodélé Depression
NRL MSG-RGB 20110109
Source area is not designated in first pass of MODIS reflectance and land surface classification.
1000 SOM Classes
Selected Classes with Class 137
Chad: Bodélé Depression
NRL MSG-RGB 20110109
Class 137 maps diatom sediment in depression.
1000 SOM Classes
Selected Classes Without Class 137
Chad: Bodélé Depression
NRL MSG-RGB 20110109
1000 SOM Classes
Class 137 maps diatom sediment in depression.
Solid black circles/ovals show plume source
Corresponding SOM Classes within open circles/ovals
Northern Sahara: 36, 40, 63, 100
Sahel: 147, 229, 230, 405
West Africa: Feb 2, 2011 13Z
Selected Classes for North Africa (This involves 40 distinct classes)
Jan 1, 2006 True Color
Jan 1, 2006 NRL DEP
Sources along New Mexico/Texas border
The North American sources have a different spectral signature than those we saw in SW Asia
Agricultural on high plains
Blue desert areas
Sources in Arizona and Colorado
Apr 17, 2006 NRL DEP
Apr 17, 2006 True color
Selected Classes for North America (n=64)
All 1000-Classes mapped for South America
Blue colored SOM-Classes are concentrated in Atacama and Salar de Uyuni deserts
White areas are salt flats
South America: Bolivia and Chile
July 18, 2010 MODIS Terra True Color
Selected SOM-Classes in 200s, 300s, and 400s
• SOMs provide an effective mechanism for automating the identification of dust sources.
• Using SOMs lets us map dust sources globally at high resolution (1-10 km).
• This saved time in finding dust sources when comparing against satellite imagery.
• This can be done in real time, so that the dust sources can change dynamically.
[Diagram labels: Model, Existing, New]
• Personalized Health Care
• Proactive Health Care System
• Business Analytics
• Smart Logistics
• Disaster Response
• Fraud Detection
http://holistics3.com
[Diagram labels: Visualization, Decision Support, Machine Learning, Insight & Discovery, GigaPop Pipe, TACC]
Existing
• Social Media
• Socioeconomic, Census
• News feeds
• Environmental
• Weather
• Satellite
• Sensors
• Health
• Economic
New
• Business Analytics 2.0
• UAVs
• Hyper-spectral Imaging
• Smart Dust
• Wearable Sensors
• Autonomous Cars
Simulation
• Global Weather Models
• Economic Models
• Earthquake Models