
Geographic Information System - York University notes/lecture9.pdf · 2010-03-14 · 1 GEOG4340 Q. Cheng Geographic Information System Lecture Nine Data Integration Spatial Decision



GEOG4340, Q. Cheng

Geographic Information System

Lecture Nine: Data Integration

Spatial Decision Support System (SDSS)

Mapping areas for drilling in mining industry

Multivariate Logistic Regression


3D Simulation and Decision Making

Various Types of Data Integration

Combining multi-layers of geoinformation for decision making

•Multi-sources

•Multi-scales

•Multi-formats

•Multi-owners

•Multi-temporal captures

Spatial Decision Support System: Multiple Map Modeling

Decision Theory is concerned with the logic by which one arrives at a choice between alternatives.

•Alternative actions

•Alternative hypotheses

•Alternative objects

•and so on

Potential Applications

•Site Selection

•Suitability Assessment

•Favorability Assessment

•Probability Assessment

Spatial Decision Support System (SDSS): GIS Data Integration for Prediction

[Diagram labels: GIS Data Sources (Remote Sensing, Geological, Geochemical, Geophysical, Geographical, ...), DBMS, Processing (Data Preprocessing, Interpreting, Information Extraction), Evidential Layers (X), Integration, Modeling (F), Output Data (S), Potential]

Suitability Map for Planning School


A General Spatial Modeling Process

•Stating the problem

•Breaking the problem down

•Exploring input datasets

•Determining analysis processes

•Verifying the model’s result

•Implementing the result and reporting


Model of Processes for finding Distance from rec. facilities

[Diagram elements: Recreational Site, Buffer, Distance to Rec. Site, Reclassify, Rec. Site Buffer]

Model of Processes for finding Distance beyond existing schools

[Diagram elements: Schools, Buffer, Distance to School, Reclassify, School Buffer]

Model of Processes for finding Relative flat area

[Diagram elements: Elevation, Slope, Slope Map, Reclassify, Slope Classes]

Model of Processes for finding Suitable landuse type

[Diagram elements: Landuse Map, Reclassify, Landuse Classes]

Model Constraints

Normalization:

1. Convert maps into comparable units

$x_i = \begin{cases} 1, & \text{yes, present, suitable} \\ 0, & \text{no, absent, not suitable} \end{cases}$ or $x_i = 1, 2, \ldots, 10$

2. Assign weights to each map as %

$w_1 + w_2 + \cdots + w_n = 1, \quad 0 \le w_i \le 1$

Model of Processes for combining Diverse maps

[Diagram elements: Landuse, Slope, Dist. Rec. Site, Dist. School, Calculator, Suitability Map]


Combine Maps

Grid calculator with equation:

S = 0.50 rec_site + 0.25 dist_school + 0.125 rec_landuse + 0.125 rec_slope

Map S has values between 1 and 10
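The grid-calculator equation can be sketched outside a GIS with NumPy arrays. This is only an illustration: the four 2x2 rasters below are made-up values on the 1 to 10 scale, and the helper name `combine` is hypothetical. Only the weights and layer names come from the slide.

```python
import numpy as np

# Hypothetical reclassified rasters, each scored from 1 (worst) to 10 (best).
layers = {
    "rec_site": np.array([[10, 8], [4, 1]], dtype=float),
    "dist_school": np.array([[2, 6], [10, 9]], dtype=float),
    "rec_landuse": np.array([[5, 5], [7, 3]], dtype=float),
    "rec_slope": np.array([[9, 1], [6, 10]], dtype=float),
}

# Weights taken from the slide's equation; they sum to 1.
weights = {
    "rec_site": 0.50,
    "dist_school": 0.25,
    "rec_landuse": 0.125,
    "rec_slope": 0.125,
}


def combine(layers, weights):
    """Cell-by-cell weighted sum of the layers (the simple linear model)."""
    if not np.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * layers[name] for name in weights)


suitability = combine(layers, weights)
print(suitability)
```

Because every input lies between 1 and 10 and the weights sum to 1, each cell of the result also lies between 1 and 10.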

Model Validation

Are the criteria reasonable?

Is the model valid?

Does the result meet the requirement?

Are there errors related to the result?

Are all data used necessary?

General Data Integration Model for SDSS

$S = F(x_1, x_2, \ldots, x_n)$

S: index map showing suitability or probability

x_i: maps or evidences

w_i: weights

Simple Linear Model

$S = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n$

Model Constraints

Normalization:

1. Convert maps into comparable units

$x_i = \begin{cases} 1, & \text{yes} \\ 0, & \text{no} \end{cases}$ or $x_i = \begin{cases} 1, & \text{yes} \\ 0.5, & \text{unknown} \\ 0, & \text{no} \end{cases}$ or $x_i = 1, 2, \ldots, 10$

2. Weights showing relative importance of maps

$w_1 + w_2 + \cdots + w_n = 1, \quad 0 \le w_i \le 1$
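As a rough sketch of the two constraints, the snippet below codes evidence as 1 (yes), 0 (no) or 0.5 (unknown) and rescales percentage weights so that they sum to 1. The function names and the use of NaN to mark unknown cells are assumptions made for this illustration, not part of the lecture.

```python
import numpy as np


def ternary_code(layer):
    """Return the layer with unknown cells (NaN) coded as 0.5; 1 and 0 are kept."""
    layer = np.asarray(layer, dtype=float)
    return np.where(np.isnan(layer), 0.5, layer)


def normalise_weights(raw):
    """Rescale non-negative weights (e.g. percentages) so that they sum to 1."""
    raw = np.asarray(raw, dtype=float)
    if np.any(raw < 0):
        raise ValueError("weights must be non-negative")
    return raw / raw.sum()


# Two hypothetical evidence maps with three cells each; NaN means unknown.
x1 = ternary_code([1, 0, np.nan])
x2 = ternary_code([1, 1, 0])

w = normalise_weights([60, 40])  # 60 % and 40 %
S = w[0] * x1 + w[1] * x2        # simple linear model
print(S)
```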


Methods for Calculating Weights for Data Integration

Data-driven methods:

•Weights of evidence

•Logistic regression

•Artificial neural network

Knowledge-driven methods:

•Fuzzy logic

Hybrid methods:

•Fuzzy weights of evidence

Model Types

1. Probabilistic: S is a random variable showing probability, 0 ≤ S ≤ 1, with uncertainty

2. Deterministic: S is a score, 0 ≤ S ≤ 1

Relationships Between Different Models

Simple Overlay Model (Union, Intersect, Identity)

Linear Model (adding weights)

Logistic Model (Weights of Evidence, Logistic Regression)

Fuzzy Logic model (various operators)

Spatial Data Modeler Extension: Arc-SDM

Weights of Evidence

Logistic Regression

Fuzzy Logic

Neural Network


Concept of Prior probability and Posterior probability

ID | PolyA | PolyB | points/area | Area | Points
1 | A | B | 0.57 | 7 | 4
2 | A | not B | 0.15 | 13 | 2
3 | not A | not B | 0.02 | 35 | 1
4 | not A | B | 0.06 | 45 | 3

[Figure: overlay of patterns A and B showing the point density of each class: A B 0.57, A not B 0.15, not A not B 0.02, not A B 0.06]

Prior probability: total number of points / total area = 10/100 = 0.10 (10%)

Posterior probability: number of points / pattern area (density of points per area), P(D|AB)

Percentage of points: number of points on pattern / total number of points, P(AB|D)
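The prior, the posterior density and the percentage of points can be recomputed from the area and point counts of the four overlay classes. The sketch below is plain Python; the dictionary layout and variable names are assumptions made for illustration.

```python
# (PolyA, PolyB) -> (area, number of points), as listed for the four classes.
classes = {
    ("A", "B"): (7, 4),
    ("A", "not B"): (13, 2),
    ("not A", "not B"): (35, 1),
    ("not A", "B"): (45, 3),
}

total_area = sum(area for area, _ in classes.values())
total_points = sum(points for _, points in classes.values())

# Prior probability: total number of points / total area.
prior = total_points / total_area

# Posterior probability of each class: points on the class / area of the class.
posterior = {key: points / area for key, (area, points) in classes.items()}

# Percentage of points of each class: points on the class / total points.
share_of_points = {key: points / total_points for key, (_, points) in classes.items()}

print(f"prior = {prior:.2f}")
for key in classes:
    print(key, f"posterior = {posterior[key]:.3f}", f"share = {share_of_points[key]:.1f}")
```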


Three patterns: trees, lake and road buffer

[Figure: overlay of three patterns A, B and C with the eight classes A B C, A not B C, not A B C, not A not B C, A B not C, A not B not C, not A B not C, not A not B not C]

5.0)|( =DABP

Percentage of points

 | A | not A | Total
B | 0.4 (0.42) | 0.3 (0.28) | 0.7
not B | 0.2 (0.18) | 0.1 (0.12) | 0.3
Total | 0.6 | 0.4 | 1

Each value in parentheses is the product of the corresponding row and column totals.

 | A | not A | marginal probability
B | P(AB|D) | P(notA B|D) | P(B|D)
not B | P(A notB|D) | P(notA notB|D) | P(notB|D)
marginal probability | P(A|D) | P(notA|D) |

The cell entries are joint probabilities.

Percentage of points of independent events

P(AB|D) = P(A|D) P(B|D)

Percentage of points on AB = % points on A * % points on B
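A short check of the independence relation for the percentage of points: the sketch multiplies the marginal probabilities and compares the products with the observed joint percentages. The variable names are illustrative assumptions.

```python
# Observed joint percentages of points, P(. | D), for the four classes.
observed = {
    ("A", "B"): 0.4,
    ("A", "not B"): 0.2,
    ("not A", "B"): 0.3,
    ("not A", "not B"): 0.1,
}

# Marginal probabilities obtained by summing the joint table.
p_a = {"A": observed[("A", "B")] + observed[("A", "not B")]}
p_a["not A"] = 1.0 - p_a["A"]
p_b = {"B": observed[("A", "B")] + observed[("not A", "B")]}
p_b["not B"] = 1.0 - p_b["B"]

# Joint probabilities expected if A and B are independent given D.
expected = {(a, b): p_a[a] * p_b[b] for (a, b) in observed}

for key in observed:
    print(key, f"observed = {observed[key]:.2f}", f"expected = {expected[key]:.2f}")
```

The closer the observed values are to the products, the more reasonable the conditional independence assumption is for these two patterns.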


5.0)|( =DABP

Percentage of areas

 | A | not A | Total
B | 0.07 (0.10) | 0.13 (0.10) | 0.20
not B | 0.45 (0.42) | 0.35 (0.38) | 0.80
Total | 0.52 | 0.48 | 1

Each value in parentheses is the product of the corresponding row and column totals.

 | A | not A | marginal probability
B | P(AB) | P(notA B) | P(B)
not B | P(A notB) | P(notA notB) | P(notB)
marginal probability | P(A) | P(notA) |

The cell entries are joint probabilities.

Percentage of areas of independent events

P(AB) = P(A) P(B)

% Area of AB = % Area of A * % Area of B

Bayes's Rule:

Probability map

P(D|A) = P(D) P(A|D)/P(A)

P(D|notA) = P(D) P(notA|D)/P(notA)

P(D|AB) = P(D) [P(A|D)/P(A)] [P(B|D)/P(B)]

P(D|A notB) = P(D) [P(A|D)/P(A)] [P(notB|D)/P(notB)]

P(D|notA B) = P(D) [P(notA|D)/P(notA)] [P(B|D)/P(B)]

P(D|notA notB) = P(D) [P(notA|D)/P(notA)] [P(notB|D)/P(notB)]
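To make the probability-map formulas concrete, the sketch below evaluates the four two-pattern posteriors. The prior P(D) = 0.10 and the marginal values are the ones read from the percentage-of-points and percentage-of-areas tables of this lecture; the function name `posterior` and the dictionary layout are assumptions for illustration.

```python
P_D = 0.10  # prior probability

# Marginal percentages of points, P(. | D), and of areas, P(.).
points = {"A": 0.6, "not A": 0.4, "B": 0.7, "not B": 0.3}
areas = {"A": 0.52, "not A": 0.48, "B": 0.20, "not B": 0.80}


def posterior(state_a, state_b):
    """P(D | state_a, state_b) assuming the two patterns are independent."""
    ratio_a = points[state_a] / areas[state_a]
    ratio_b = points[state_b] / areas[state_b]
    return P_D * ratio_a * ratio_b


for a in ("A", "not A"):
    for b in ("B", "not B"):
        print(f"P(D | {a}, {b}) = {posterior(a, b):.4f}")
```

A ratio above 1 raises the posterior above the prior and a ratio below 1 lowers it, which is what the weights in the logarithmic form express.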


Bayes's Rule:

Log (Probability)

log[P(D|A)] = log[P(D)] + log[P(A|D)/P(A)] = log[P(D)] + WA+

log[P(D|notA)] = log[P(D)] + log[P(notA|D)/P(notA)] = log[P(D)] + WA-

where WA+ = log[P(A|D)/P(A)] and WA- = log[P(notA|D)/P(notA)]

If A, B, C are conditionally independent, then

log[P(D|AB)] = log[P(D)] + WA+ + WB+

log[P(D|A notB)] = log[P(D)] + WA+ + WB-

log[P(D|notA B)] = log[P(D)] + WA- + WB+

log[P(D|notA notB)] = log[P(D)] + WA- + WB-

WA+ = log[P(A|D)/P(A)] = log[(% points on A) / (% area of A)]

WA+ > 0 positive correlation between A and points

WA+ = 0 no correlation between A and points

WA+ < 0 negative correlation between A and points

Spatial Association Index

Contrast

C = WA+ - WA-

(1) -∞ < C < ∞

(2) C = 0 A and D are independent

(3) C > 0 positive correlation between D and A

(4) C < 0 negative correlation between D and A


Logistic Model for SDSS

$\mathrm{Logit}\{D \mid AB\ldots\} = W_0 + W_A + W_B + \cdots$

Prior probability: P(D)

Posterior probability: P(D|A), P(D|B), P(D|AB...)

#(D) = 20, Area(T) = 7780, P(D) = 0.0026

Area(A) = 3065, #(D|A) = 15, P(D|A) = 0.0049

Area(B) = 4175, #(D|B) = 19, P(D|B) = 0.0045


Area(AB) = 1624.27, #(D|AB) = 13, P(D|AB) = 0.008
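The counts above are enough to recompute the prior, the posteriors, the weights and the contrast. The sketch below follows the lecture's definitions WA+ = log[P(A|D)/P(A)] and WA- = log[P(notA|D)/P(notA)]; the use of the natural logarithm and the function name `weights` are assumptions, since the slides do not state the base.

```python
import math

TOTAL_POINTS = 20     # #(D)
TOTAL_AREA = 7780.0   # Area(T)


def weights(points_on, area_on):
    """Return (W+, W-, contrast) for a pattern, using natural logarithms."""
    p_pattern_given_d = points_on / TOTAL_POINTS  # % points on the pattern
    p_pattern = area_on / TOTAL_AREA              # % area of the pattern
    w_plus = math.log(p_pattern_given_d / p_pattern)
    w_minus = math.log((1 - p_pattern_given_d) / (1 - p_pattern))
    return w_plus, w_minus, w_plus - w_minus


prior = TOTAL_POINTS / TOTAL_AREA
post_a = 15 / 3065.0      # P(D|A)
post_b = 19 / 4175.0      # P(D|B)
post_ab = 13 / 1624.27    # P(D|AB)

wa_plus, wa_minus, contrast_a = weights(15, 3065.0)
wb_plus, wb_minus, contrast_b = weights(19, 4175.0)

print(f"P(D) = {prior:.4f}, P(D|A) = {post_a:.4f}, "
      f"P(D|B) = {post_b:.4f}, P(D|AB) = {post_ab:.4f}")
print(f"A: W+ = {wa_plus:.3f}, W- = {wa_minus:.3f}, C = {contrast_a:.3f}")
print(f"B: W+ = {wb_plus:.3f}, W- = {wb_minus:.3f}, C = {contrast_b:.3f}")
```

With these definitions, P(D) multiplied by exp(WA+) reproduces P(D|A), which is the single-pattern form of Bayes's rule.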


Au, W, As,

Au- Sn- W- As Multiple Elements



Logistic Model for SDSS

$\mathrm{Logit}\{D \mid AB\ldots\} = W_0 + W_A + W_B + \cdots$

Posterior probability: P(D|AB...)

Posterior Probability Prediction of Potential Flowing Wells in the ORM

Flowing Wells and Springs

[Figure: Flowing Wells vs. Distance from ORM. Axes: Spatial Correlation (ticks -8 to 12) and Distance (ticks 0 to 15000)]

[Figure: Flowing Wells vs. Distance From High Slope Zone. Axes: Spatial Correlation (ticks -8 to 8) and Distance (ticks 0 to 8000)]


Flowing Wells vs. Thickness of Drift

[Figure: Flowing vs. Distance from Thick Drift. Axes: Spatial Correlation (ticks -4 to 8) and Distance (ticks 0 to 20000)]

Potential Locations of Flowing Wells by SDSS: Weights of Evidence (Cheng, 2001)

Theme | Area % | Points % | Contrast | t-value | LR Coeff. | LR Std
Buffer zone (1~1km) around steep slope | 63 | 80 | 0.89 | 6.54 | 0.88 | 0.14
Buffer zone (1~5km) around the ORM | 40.3 | 55.3 | 0.62 | 5.68 | 0.28 | 0.12
Buffer zone around steep slope of lower sand / gravel top | 10.1 | 52.7 | 0.62 | 5.68 | 1.77 | 0.12
Ratio of sand/gravel unit cumulative thickness in well depth (6~25%) | 43.5 | 65.1 | 0.90 | 7.88 | 0.49 | 0.12
Buffer zone (1~2.5km) of thick drift area | 16.2 | 29.3 | 0.79 | 6.56 | 0.45 | 0.13
Elevation of the upper confined aquifers at 356~375 (m a.s.l.) | 16.8 | 38.6 | 1.18 | 10.44 | 0.47 | 0.13
Elevation of the lower confined aquifers at 311~347 (m a.s.l.) | 40.5 | 58.2 | 0.73 | 6.60 | 0.28 | 0.13
Steep slope of confined aquifer surface | 22.9 | 54.7 | 1.45 | 13.19 | 0.97 | 0.12
Buffer zone (0~2km) around the small ponds | 61.1 | 74.9 | 0.67 | 5.31 | 0.43 | 0.13
Intercept constant | | | | | -11.62 | 0.52

Results obtained by Weights of Evidence and Logistic Regression Methods
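One way to read the LR coefficients is to plug them into the logistic model Logit{D | AB...} = W0 + WA + WB + ... and convert the logit to a probability. The sketch below assumes each theme is a binary evidence layer (1 where present, 0 where absent) and uses short, hypothetical keys for the nine themes; the coefficients and the intercept are the ones in the table.

```python
import math

INTERCEPT = -11.62

# LR coefficients from the table; the keys are shorthand invented for this sketch.
COEFFS = {
    "steep_slope_buffer": 0.88,
    "orm_buffer": 0.28,
    "lower_sand_gravel_slope_buffer": 1.77,
    "sand_gravel_ratio": 0.49,
    "thick_drift_buffer": 0.45,
    "upper_aquifer_elevation": 0.47,
    "lower_aquifer_elevation": 0.28,
    "confined_aquifer_steep_slope": 0.97,
    "small_ponds_buffer": 0.43,
}


def logit(present):
    """Intercept plus the coefficients of the themes present at a location."""
    return INTERCEPT + sum(COEFFS[name] for name in present)


def posterior(present):
    """Posterior probability obtained from the logit by the logistic function."""
    return 1.0 / (1.0 + math.exp(-logit(present)))


none_present = posterior([])
all_present = posterior(list(COEFFS))
print(f"no evidence present: {none_present:.2e}")
print(f"all evidence present: {all_present:.2e}")
```

Since every coefficient in the table is positive, each additional theme present at a location raises the predicted probability.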


Multivariate Logistic Regression