Modeling Evolution in Spatial Datasets
Paul Amalaman, 2/17/2012
Data Mining and Machine Learning Lab Team Members
Dr Eick Christoph, Nouhad Rizk, Zechun Cao, Sujing Wang, Anirup Dutta, Swati Goyal, Tarikul Islam, Paul Amalaman
I-Background
Machine learning techniques are mostly used where
• modeling implicit trends is possible (regression)
• stable patterns exist in the dataset (classification)
Simulation systems are used when
• a model is hard to establish
• there is a great degree of randomness in the attribute values
• there are many interactions between objects
• attributes have to be predicted recursively over many steps
Example applications of simulation systems: traffic modeling, weather forecasting, social networks, urban modeling
I-Background continued(3)
Spatial Simulation Systems
• Cellular Automata (CA): cell-centered approach
• Continuous Agent Space or Multi-Agent System (MAS): agent-centered approach
ABM
I-Background continued(3) Modeling with Cellular Automata
Concept of neighborhood
• Moore neighborhood (http://en.wikipedia.org/wiki/Moore_neighborhood)
• Von Neumann neighborhood (http://en.wikipedia.org/wiki/Von_Neumann_neighborhood)
Moore neighborhood of P(x,y):
D(x-1,y-1) D(x,y-1) D(x+1,y-1)
D(x-1,y) P(x,y) D(x+1,y)
D(x-1,y+1) D(x,y+1) D(x+1,y+1)
Von Neumann neighborhood of P(x,y):
D(x,y-1)
D(x-1,y) P(x,y) D(x+1,y)
D(x,y+1)
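The two neighborhoods can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the coordinate convention (y growing downward, as in the grids on this slide) is an assumption.

```python
def moore_neighborhood(x, y):
    """Return the 8 cells surrounding (x, y), diagonals included."""
    return [(x + dx, y + dy)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
            if (dx, dy) != (0, 0)]


def von_neumann_neighborhood(x, y):
    """Return the 4 cells that share an edge with (x, y)."""
    return [(x, y - 1), (x - 1, y), (x + 1, y), (x, y + 1)]
```

Every Von Neumann neighbor is also a Moore neighbor; the Moore neighborhood adds the four diagonal cells.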
I-Background continued(4) Modeling with Cellular Automata
Cellular automata
• give the programmer a cell-centered programming style in which the cells act as regularly organized computing units
• run efficiently on parallel architectures
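As an illustration of the cell-centered style, here is a minimal sketch of one update step of a cellular automaton. The rule is Conway's Game of Life, chosen only because it is a familiar example and not because the slides use it; the function name and the dead-border convention are assumptions.

```python
def life_step(grid):
    """Apply one synchronous Game of Life update to a 2D list of 0/1 cells."""
    rows, cols = len(grid), len(grid[0])

    def live_neighbors(r, c):
        # Count live cells in the Moore neighborhood; cells outside the grid count as dead.
        total = 0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) == (0, 0):
                    continue
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += grid[rr][cc]
        return total

    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            n = live_neighbors(r, c)
            # Each cell reads only the old grid, so the cell updates are independent.
            if grid[r][c] == 1 and n in (2, 3):
                new_grid[r][c] = 1
            elif grid[r][c] == 0 and n == 3:
                new_grid[r][c] = 1
    return new_grid
```

Because every cell computes its next state from the previous grid alone, the two inner loops could be distributed across processors without changing the result, which is why the approach suits parallel architectures.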
II-Research Goals
Using Data Mining and Machine Learning Techniques to Enhance Simulation Systems
New approach = Machine Learning Techniques + Spatial Simulation Systems
Goal 1: Grid-based Models for Progression in Spatial Datasets
Goal 2: Development of Cluster-based Bias Removal Methods
II-Research Goal continued (1) Goal 1: Grid-based Models for Progression in Spatial Datasets
Given that at t we know all the attribute values, including the output variable Y, can we predict all attribute values at t+1?
[Figure: attribute values X1(t), X2(t), ..., Xn(t), Y(t) known at t; X1(t+Δt), X2(t+Δt), ..., Xn(t+Δt), Y(t+Δt) unknown at t+1]
$y_{i,j,t+1} = f_{ij}(x_{1,1,1,t}, \ldots, x_{1,n,n,t}, \ldots, x_{m,1,1,t}, \ldots, x_{m,n,n,t}, y_{1,1,t}, \ldots, y_{n,n,t})$
Challenges:
1. Many target variables to predict; different variables have to be predicted at different locations
2. Target variables are not independent of each other (e.g. some are auto-correlated)
3. Models have to be used over multiple steps
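The third challenge, using a model over multiple steps, can be sketched as a recursive rollout in which each prediction becomes the input to the next step. This is not the method of the slides: `predict_cell` is a hypothetical stand-in for the per-cell function f_ij (a plain neighbor average), and the sketch tracks only the output grid y while ignoring the x attributes.

```python
def predict_cell(y, i, j):
    """Hypothetical stand-in for f_ij: average of the cell and its edge neighbors."""
    n = len(y)
    values = [y[i][j]]
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ii, jj = i + di, j + dj
        if 0 <= ii < n and 0 <= jj < n:
            values.append(y[ii][jj])
    return sum(values) / len(values)


def step(y):
    """Predict the whole n-by-n grid at t+1 from the grid at t."""
    n = len(y)
    return [[predict_cell(y, i, j) for j in range(n)] for i in range(n)]


def rollout(y0, steps):
    """Apply the one-step model recursively, feeding predictions back as inputs."""
    history = [y0]
    for _ in range(steps):
        history.append(step(history[-1]))
    return history
```

Since each step consumes the previous step's predictions rather than observations, any error made by the one-step model is carried forward, which is what makes multi-step use hard.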
II-Research Goal continued (2) Goal 2: Development of Cluster-based Bias Removal Methods
EPA prediction models are meteorological and chemical transport models. Those models are derived from solving differential equations. Over time, the model bias grows larger.
http://www.epa.gov/AMD/CMAQ/ch06.pdf
[Figure: Input x → Model → Output + bias b(x)]
[Figure: Input x → Model → Output Correction (bias removal), which uses b(x) and group(x) from weather pattern recognition → Output h(b(x), group(x))]
Bias removal is based on weather pattern recognition. Our model h learns group(x) and b(x) and makes better predictions.
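One simple way to picture cluster-based bias removal is to estimate an average bias per group and subtract it from the model output. The sketch below is an assumption-laden illustration, not the method of the slides: the function names are hypothetical, the `group` function is passed in rather than learned, and the bias within a group is taken to be a constant mean.

```python
from collections import defaultdict


def fit_group_bias(inputs, model_outputs, observations, group):
    """Estimate the mean bias (model output minus observation) for each group."""
    errors = defaultdict(list)
    for x, out, obs in zip(inputs, model_outputs, observations):
        errors[group(x)].append(out - obs)
    return {g: sum(e) / len(e) for g, e in errors.items()}


def correct(x, model_output, group, bias_by_group):
    """Subtract the bias estimated for x's group; unseen groups are left uncorrected."""
    return model_output - bias_by_group.get(group(x), 0.0)
```

In practice `group` would come from a clustering or weather pattern recognition step, and the per-group bias could itself be a learned function of x rather than a single number.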
III-Case Study
Improving Ozone Forecasting for the Houston-Galveston Area
Goal 1: Development of a Grid-based Prediction Framework
Goal 2: Development of Cluster-based Bias Removal Methods
In collaboration with UH-IMAQS, the Institute for Multidimensional Air Quality Studies (UH Department of Earth and Atmospheric Science):
- Dr Rappenglueck, Bernhard
- Dr Li, Xiangshang
III-Case Study Continued(1)
Ozone Prediction
Goal 1: Improving Prediction for Spatial Progression
Given what happened at t, can we predict what happens at t+Δ, t+2Δ, ...?
III-Case Study Continued(2)
Status of Dissertation
• Methods to collect ozone data and to capture it in a relational database have been developed.
• The necessary knowledge of simulation-based prediction systems in general, and of ozone prediction in particular, has been obtained.
• Work has started on different modeling approaches for grid-based prediction.