Applying Geostatistical Methods to Lattice Data: An Initial Examination of U.S. Presidential Elections in Iowa

A.C. Thomas

Statistics 225

December 14, 2004

Sources/Guides

• Main source: “Hierarchical Models”, chapters 2 and 3 (geostatistical and spatial data)

• Data sources: http://www.sos.state.ia.us/elections/results/ (1996/2000); http://www.cnn.com/ (2004)

• Special thanks: Brad Carlin (UMN), Andy Gelman (Columbia), Paul Edlefsen (Harvard)

• GeoR: P.J. Ribeiro and P.J. Diggle

Motivation

• In this course, we have learned three different methods of examining spatial data; which one applies depends on the data, but they are to some extent interchangeable

• Often, we may not have the tools to examine a data set using one method (e.g. the shortcomings of R in manipulating lattice data)

• In this case, we will compare and contrast the effectiveness of a geostatistical method used on lattice data to a lattice method through self cross-validation

Interrelationship

• Geostats and kriging: using variograms and distance relationships to predict quantities across distances

• Lattices: using neighbour relationships to predict quantities across distances

• Direct similarities: some weighting schemes across distances directly resemble covariograms

Why election data?

• Why not?

• Spatial organization is well understood, constant in time (county borders have not changed across data sets) and built into R (maps library)

• While specific challengers change over time, parties are relatively constant, as are other control variables

• Ramifications are germane to the functioning of society (and the insatiable appetite of news junkies and policy wonks)

Questions:

• For this data set, does a geostatistical approximation produce a result comparable in error to a lattice model?

• If so, can we use fitted information from one election to predict the complete results of the next one? (And how much are we off?)

Chosen model: Iowa

Why Iowa?

• 99 counties which have roughly equal area, removing a possible nuisance (and are rectilinear, so easier to draw)

• Swing state, with a rough vote balance over time

• Not too big, not too small in either population or size

Simplification: No third parties

• For now, considering only the votes for Democrat and Republican candidates in presidential elections from 1996-2004

• Not so bad in 2000/2004, when independent vote was about 3% of total

• Worse in 1996 (Perot’s successful campaign drew a lot), up to 10% of total votes

Iowa in 1996 (Dole, Clinton)

Iowa in 2000 (Bush, Gore)

Iowa in 2004 (Bush, Kerry)

Initial impressions

• There seems to be a tendency to vote more Republican the further west we look

• (Observation, courtesy Matt Anthony: as we go east, we hit Illinois, a Democratic core.)

• What is the population distribution by county over time?

Iowa’s total voters, 1996

Iowa’s total voters, 2000

Iowa’s total voters, 2004

Quick-and-dirty non-spatial analysis

• Question: how does population size correlate with the Democratic vote?

• Correlation between blue vote and “total” vote:

• 1996: $\rho$ = 0.18

• 2000: $\rho$ = 0.30

• 2004: $\rho$ = 0.29

• So population would appear to be an important covariate.
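A minimal sketch of how such a correlation could be computed, assuming the "blue vote" is the Democratic share of the vote in each county and the "total" vote is the county's turnout. The function name and the five data points are hypothetical and are not the Iowa figures.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical county-level figures (NOT the Iowa data).
total_votes = [4200, 9800, 15500, 52000, 120000]
dem_share = [0.44, 0.47, 0.46, 0.52, 0.55]

r = pearson(total_votes, dem_share)
print(f"correlation between turnout and Democratic share: {r:.2f}")
```

With 99 counties the same call would take two lists of length 99, one entry per county.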

Geostatistical analysis

• Locations: centroids of each county (obtained through centroid.polygon function in maps library of R)

• Data: Republican percentage of vote (arbitrarily chosen, not necessarily personal political affiliation)
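For illustration, the centroid of a county outline can be sketched with the standard area-weighted (shoelace) formula. This is an assumption about the kind of calculation involved, not a statement of what the R function does internally; the function name and the rectangle are hypothetical, and longitude and latitude are treated as planar coordinates.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    twice_area = 0.0  # twice the signed area (shoelace sum)
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon")
    # Centroid = sum / (6 * area) = sum / (3 * twice_area)
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)

# Hypothetical rectilinear "county" outline.
county = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
print(polygon_centroid(county))
```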

Initial data plots: Unaltered

Initial fitting

• Semivariogram appears to increase without bound, suggesting nonstationarity

• Plan: use Universal Kriging with this semivariogram

• Problem: Trend appears to be power law, with power greater than 2 (impossible to fit with conventional definitions, since a valid power variogram requires an exponent below 2)

• Possible solutions: a) remove trend from data. b) don’t care.

Plan A: Remove trend from data

• What it does: lets us remove known spatial dependence, look at other trends

• Initial look: major discrepancies.
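One way to remove a spatial trend is to fit a second-degree polynomial surface in the centroid coordinates by least squares and keep the residuals. The sketch below assumes that form of trend; all function names, the grid of points and the synthetic surface are hypothetical.

```python
def design_row(x, y):
    """Second-degree polynomial terms in the coordinates."""
    return [1.0, x, y, x * x, x * y, y * y]

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z

def fit_quadratic_trend(points, values):
    """Least-squares coefficients of the quadratic surface (normal equations)."""
    rows = [design_row(x, y) for x, y in points]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve(xtx, xty)

def trend_at(beta, x, y):
    """Evaluate the fitted surface at one location."""
    return sum(b * t for b, t in zip(beta, design_row(x, y)))

# Hypothetical 4x4 grid of centroids and a synthetic quadratic surface.
points = [(float(i), float(j)) for i in range(4) for j in range(4)]
values = [0.5 + 0.02 * x - 0.01 * y + 0.003 * x * x for x, y in points]

beta = fit_quadratic_trend(points, values)
residuals = [v - trend_at(beta, x, y) for (x, y), v in zip(points, values)]
```

The residuals are what would be passed on to the variogram fit; adding `trend_at` back to a prediction returns it to the original scale.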

Plan B: Don’t care.

• The goodness of fit only tails off at the end

• Preliminary results show the other option to be extremely inaccurate due to noise levels in residual data

Second trend removed, data centered

Exploratory Kriging

Meaningful Kriging

• Since we want to test the predictive power of this method, we should test it on our current data through cross-validation

• Key: remove one point, fit the semivariogram to the remaining points, and interpolate the value at the removed centroid; repeat for each centroid

• Then, return trend to data and compare with original values

• Use universal kriging with second-degree trend
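The leave-one-out loop itself can be sketched independently of the predictor. In the sketch below an inverse-distance-weighted mean stands in for the universal kriging predictor, which it does not reproduce; the function names, centroids and Republican shares are all hypothetical.

```python
from math import hypot

def idw_predict(train_pts, train_vals, target, power=2.0):
    """Inverse-distance-weighted mean; a stand-in for the kriging predictor."""
    num = den = 0.0
    for (x, y), v in zip(train_pts, train_vals):
        d = hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a training site
        w = d ** -power
        num += w * v
        den += w
    return num / den

def leave_one_out(points, values, predict):
    """Predict each site from all the others; return the list of predictions."""
    preds = []
    for i, target in enumerate(points):
        rest_pts = points[:i] + points[i + 1:]
        rest_vals = values[:i] + values[i + 1:]
        preds.append(predict(rest_pts, rest_vals, target))
    return preds

# Hypothetical centroids and Republican shares (NOT the Iowa data).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
values = [0.58, 0.54, 0.49, 0.57, 0.52, 0.47]

preds = leave_one_out(points, values, idw_predict)
errors = [p - v for p, v in zip(preds, values)]
```

Because `predict` is passed in as an argument, a kriging routine with the same three-argument signature could be swapped in without changing the loop.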

1996 Redux – Predicted Values

• In total, the prediction gives Dole 9,726 more votes than he actually received.

• Absolute error: 43,526

• Total 2-party votes: 1,112,902
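A sketch of how the two error figures could be formed from county-level predictions, assuming the first is the signed sum of vote differences over counties and the "absolute error" is the sum of their absolute values. The function name and the three-county example are hypothetical.

```python
def vote_errors(pred_share, actual_share, two_party_votes):
    """Return (net, absolute) error in Republican votes summed over counties."""
    diffs = [
        (p - a) * n
        for p, a, n in zip(pred_share, actual_share, two_party_votes)
    ]
    net = sum(diffs)                        # over- and under-predictions cancel
    absolute = sum(abs(d) for d in diffs)   # no cancellation
    return net, absolute

# Hypothetical three-county example (NOT the Iowa data).
pred = [0.55, 0.48, 0.51]
actual = [0.50, 0.50, 0.50]
votes = [10000, 20000, 5000]

net, absolute = vote_errors(pred, actual, votes)
print(f"net error: {net:.0f} votes, absolute error: {absolute:.0f} votes")
```

The net figure can be small even when individual counties are badly predicted, which is why both numbers are worth reporting.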

Fitting variograms between models

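• For all, power model was appropriate choice: $\gamma(t) = \tau^2 + \sigma^2 t^{\lambda}$

• 1996: $\sigma^2$ = 9.24e-4, $\lambda$ = 1.98, $\tau^2$ = 0.031

• 2000: $\sigma^2$ = 9.93e-4, $\lambda$ = 2.00, $\tau^2$ = 0

• 2004: $\sigma^2$ = 1.16e-3, $\lambda$ = 2.00, $\tau^2$ = 0.025

• All roughly identical, even with different total averages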
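To see how close the three fits are, the power model can be evaluated directly. The sketch below assumes the usual reading of the fitted values: a nugget (`tau2`), a scale (`sigma2`) multiplying the distance term, and an exponent (`lam`), listed on the slide in the order scale, exponent, nugget. That mapping and the function name are assumptions.

```python
def power_variogram(t, tau2, sigma2, lam):
    """Semivariance at distance t under the power model; zero at t = 0."""
    if t < 0:
        raise ValueError("distance must be non-negative")
    if t == 0:
        return 0.0
    return tau2 + sigma2 * t ** lam

# Fitted values quoted on the slide, as (tau2, sigma2, lam);
# the assignment of numbers to parameters is an assumption.
fits = {
    1996: (0.031, 9.24e-4, 1.98),
    2000: (0.0, 9.93e-4, 2.00),
    2004: (0.025, 1.16e-3, 2.00),
}

for year, params in sorted(fits.items()):
    curve = [power_variogram(t, *params) for t in (0.5, 1.0, 2.0, 4.0)]
    print(year, [round(g, 4) for g in curve])
```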

2000 Predicted

• Prediction: Bush gets 26,000 more votes

• Absolute error: 181,880

• Total Bush/Gore votes: 1,272,890

2004 Prediction

• Prediction: Bush gets 32,094 more votes

• Absolute difference: 74,458

• Total votes: 1,479,702

“Naïve Neighbour”

• For a baseline comparison, take the simplest (stupidest) lattice cross-validation test – “ask your neighbour”, trivial SAR weights

• Predicted value at a square is simply the mean of border-sharing neighbours (data is Republican percentage of vote)
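The "ask your neighbour" rule can be sketched with a plain adjacency map. The four-county layout, county names and shares below are hypothetical.

```python
def neighbour_mean(values, neighbours):
    """Predict each county as the mean of its border-sharing neighbours."""
    return {
        county: sum(values[n] for n in nbrs) / len(nbrs)
        for county, nbrs in neighbours.items()
        if nbrs  # skip counties with no neighbours
    }

# Hypothetical 2x2 block of counties:  A B
#                                      C D
neighbours = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
rep_share = {"A": 0.60, "B": 0.50, "C": 0.56, "D": 0.44}

predicted = neighbour_mean(rep_share, neighbours)
errors = {c: predicted[c] - rep_share[c] for c in predicted}
```

Each county's own value is never used in its own prediction, so this is already a leave-one-out test.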

“NN” 1996

• Dole: 10,819 more predicted

• Total deviation: 40,923

“NN” 2000

• Bush gets 28,535 extra in prediction

• Total deviation: 59,670

“NN” 2004

• Bush gets 37,175 more

• Total deviation: 76,926

Cross-validation summary

Year   Geostat error   NN error   Geostat total error   NN total error   Voting pop.

1996           9,726     10,819                43,526           40,923     1,112,902

2000          26,000     28,535                61,485           59,670     1,272,890

2004          32,094     37,175                74,458           76,926     1,479,702
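Dividing the total errors by the voting population puts the two methods on a common scale. The sketch below uses the figures from the summary table; the variable names are hypothetical.

```python
# Total (absolute) errors and two-party vote counts from the summary table.
summary = {
    1996: {"geostat": 43526, "nn": 40923, "votes": 1112902},
    2000: {"geostat": 61485, "nn": 59670, "votes": 1272890},
    2004: {"geostat": 74458, "nn": 76926, "votes": 1479702},
}

# Error as a fraction of the two-party vote, per method and year.
rates = {
    year: {m: row[m] / row["votes"] for m in ("geostat", "nn")}
    for year, row in summary.items()
}

for year in sorted(rates):
    g, n = rates[year]["geostat"], rates[year]["nn"]
    print(f"{year}: geostat {g:.1%}, naive neighbour {n:.1%}")
```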

Conclusions

• Data is definitely not stationary, even after removing trends

• Good kriging is about as effective as “naïve neighbour”, both without covariates

• Prediction with these tools at this simple level is not yet accurate enough: in the summary table, total errors run to roughly 4 to 5% of the two-party vote

• Each method overpredicts the Republican vote

• Fitting information for each year is very close

Future Developments and Unanswered Questions – New!

• I’ve since introduced universal co-kriging with population, past voting behavior and second-degree spatial dependences using the gstat package.

• Needed: data from the last 4 elections, conveniently packaged, and other predictions made using spatial methods.
