Imputating snag data to forest inventory for wildlife habitat modeling Kevin Ceder College of Forest Resources University of Washington GMUG – 11 February 2008

Page 1: Imputating snag data to forest inventory for wildlife habitat modeling

Imputating snag data to forest inventory for wildlife habitat modeling

Kevin Ceder
College of Forest Resources
University of Washington

GMUG – 11 February 2008

Page 2: Imputating snag data to forest inventory for wildlife habitat modeling

Why impute snag data?

• Snags are an important habitat element and needed for habitat assessments.

• These data are often not collected in forest inventory

• The Large-Landscape Wildlife Assessment models will need these data

Page 3: Imputating snag data to forest inventory for wildlife habitat modeling

Why use Nearest-Neighbor?

• Non-parametric, requiring no assumptions about the underlying functional form

• Retains the variance/covariance structure of the input data in the output data

Page 4: Imputating snag data to forest inventory for wildlife habitat modeling

The Questions

1) Can snag data be imputed using kNN techniques with stand and site data?

2) How well do the results fit observed data?

3) Which distance measure performs best?

4) What is the effect of increasing neighborhood size?

5) How do the results compare with random sampling?

Page 5: Imputating snag data to forest inventory for wildlife habitat modeling

The Process

• The database
  – FIA integrated database version 2.1
  – Data for private forests in western Washington (1510 plots)
  – Both tree and snag data collected between 1989 and 1991
  – Representative of the forest targeted for the LLWA project

Page 6: Imputating snag data to forest inventory for wildlife habitat modeling

The Process

• The tool
  – The yaImpute package for kNN imputation
    • Raw, Euclidean, Mahalanobis, MSN, MSN2, ICA, and randomForest distance measures
    • k = 1, 2, 3, 4, 5, 10
    • For k > 1 imputed data are distance-weighted means of neighbors
  – 9999 permutations of the data for comparisons with random sampling
    • k = 1, 2, 3, 4, 5, 10
    • For k > 1 imputed data are distance-weighted means of neighbors using Euclidean distance
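The imputation step described above can be sketched in a few lines. This is a minimal illustration only, not the yaImpute implementation: it assumes Euclidean distance on standardized x variables and neighbor weights of 1/(1 + d), and the three reference plots are hypothetical values chosen for the example.

```python
import math

def standardize(rows):
    """Scale each column to mean 0 and unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns so we never divide by zero.
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1)) or 1.0
           for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]
    return scaled, means, sds

def knn_impute(x_ref, y_ref, x_target, k=3):
    """Impute y for one target plot as a distance-weighted mean of the
    k nearest reference plots (weights 1/(1 + d) are an assumption)."""
    scaled, means, sds = standardize(x_ref)
    target = [(v - m) / s for v, m, s in zip(x_target, means, sds)]
    nearest = sorted((math.dist(target, r), i) for i, r in enumerate(scaled))[:k]
    weights = [1.0 / (1.0 + d) for d, _ in nearest]
    total = sum(weights)
    return [sum(w * y_ref[i][j] for w, (_, i) in zip(weights, nearest)) / total
            for j in range(len(y_ref[0]))]

# Hypothetical reference plots: x = (trees per acre, stand age),
# y = (snags per acre, snag basal area).
x_ref = [[100, 10], [400, 30], [900, 50]]
y_ref = [[0.0, 0.0], [4.0, 2.0], [10.0, 5.0]]
print(knn_impute(x_ref, y_ref, [450, 32], k=2))
```

With k = 1 the function simply copies the snag values of the single nearest reference plot, which is why k = 1 keeps the full range of observed values while larger k averages toward the middle.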

Page 7: Imputating snag data to forest inventory for wildlife habitat modeling

The Statistics

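• Goodness of fit

$$\mathrm{RMSD} = \sqrt{\frac{\sum (y_i - y_o)^2}{N - 1}}$$

$$\mathrm{bias} = \frac{\sum (y_i - y_o)}{N - 1}$$

$$\mathrm{MAD} = \frac{\sum \lvert y_i - y_o \rvert}{N - 1}$$

• Comparison with random

$$p = \frac{m + 1}{M + 1}$$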
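A minimal sketch of these statistics, under stated assumptions: imputed minus observed differences, an N − 1 denominator, and p = (m + 1)/(M + 1), where m is taken here to be the number of random permutations that fit at least as well as the actual imputation. The slides do not spell out how the random comparison was drawn, so shuffling the imputed values is a stand-in for illustration, and the function names are hypothetical.

```python
import math
import random

def rmsd(imputed, observed):
    """Root mean squared difference with an N - 1 denominator."""
    n = len(observed)
    return math.sqrt(sum((i - o) ** 2 for i, o in zip(imputed, observed)) / (n - 1))

def bias(imputed, observed):
    """Mean signed difference; negative values mean under-prediction."""
    n = len(observed)
    return sum(i - o for i, o in zip(imputed, observed)) / (n - 1)

def mad(imputed, observed):
    """Mean absolute difference with an N - 1 denominator."""
    n = len(observed)
    return sum(abs(i - o) for i, o in zip(imputed, observed)) / (n - 1)

def permutation_p(stat, imputed, observed, n_perm=9999, seed=0):
    """p = (m + 1) / (M + 1); m counts shuffles whose error is no larger
    than the actual error (an assumed definition of m)."""
    rng = random.Random(seed)
    actual = stat(imputed, observed)
    shuffled = list(imputed)
    m = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(shuffled, observed) <= actual:
            m += 1
    return (m + 1) / (n_perm + 1)

# Hypothetical snags-per-acre values for five plots.
observed = [0.0, 2.0, 5.0, 9.0, 20.0]
imputed = [1.0, 2.5, 4.0, 7.0, 15.0]
print(rmsd(imputed, observed), bias(imputed, observed), mad(imputed, observed))
print(permutation_p(rmsd, imputed, observed, n_perm=999))
```

A small p-value under this scheme says that few random assignments matched the observed data as closely as the imputation did.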

Page 8: Imputating snag data to forest inventory for wildlife habitat modeling

The Input Data – Tree and site data (xData)

N = 1510                                        Min     Max    Mean
Trees per Acre (TOT_TPA)                        6.7  2920.5   475.8
Basal Area per Acre (TOT_BA, sqft/ac)           0.0   397.8   119.8
Quadratic Mean Diameter (QMD, in)               0.1    28.5     7.6
Mean Height (MEAN_HT, ft)                       1.0   147.9    43.7
Stand Age (AGE, yr)                               5     215      37
Site Index (SITE_INDEX_FIA, feet @ 50 yr)        44     180     112
Slope (SLOPE, %)                                  0      99      24
Aspect (ASPECT_DEG, deg)                          0     130     155
Elevation (ELEV_FT, ft)                           3    4724     869

Page 9: Imputating snag data to forest inventory for wildlife habitat modeling

The Input Data – Snag data (yData)

N = 1510                                     Min     Max    Mean
Snags per Acre (SNAG_TPA_TOTAL)              0.0    96.8     4.8
Basal Area (SNAG_BA, sqft/ac)                0.0     9.7     0.5
Quadratic Mean Diameter (SNAG_QMD, in)       0.0    10.5     2.7
Mean Height (SNAG_MEAN_HT, ft)               0.0   161.0    14.9

• 695 of 1510 plots did not have snags present

Page 10: Imputating snag data to forest inventory for wildlife habitat modeling

Results

1) Can snag data be imputed using kNN techniques with stand and site data?

Yes!

Page 11: Imputating snag data to forest inventory for wildlife habitat modeling

Results

2) How well do the results fit observed data?

RMSD      SPA      BA     QMD  Mean Ht
Min       7.0     0.6     2.5     18.9
Max      11.2     1.0     3.7     28.4
Mean      9.0     0.8     3.0     23.4

Page 12: Imputating snag data to forest inventory for wildlife habitat modeling

Results

2) How well do the results fit observed data?

BIAS      SPA      BA     QMD  Mean Ht
Min      -1.7    -0.2    -0.6     -4.0
Max      -0.1     0.0     0.1      0.3
Mean     -0.5     0.0    -0.1     -0.8

Page 13: Imputating snag data to forest inventory for wildlife habitat modeling

Results

2) How well do the results fit observed data?

MAD       SPA      BA     QMD  Mean Ht
Min       3.5     0.3     1.8     11.3
Max       6.2     0.6     2.7     18.1
Mean      5.1     0.5     2.3     15.1

Page 14: Imputating snag data to forest inventory for wildlife habitat modeling
Page 15: Imputating snag data to forest inventory for wildlife habitat modeling

Results

2) How well do the results fit observed data?

Marginally…

• High RMSD and MAD relative to mean snag measures in the data

• Observed vs imputed plots show poor patterning
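To put "high relative to mean snag measures" in numbers, the short calculation below divides the mean RMSD and mean MAD reported on the results slides by the mean snag values from the yData slide. It is only a worked arithmetic check on the figures already given; the dictionary and function names are hypothetical.

```python
# Mean observed snag values (yData slide) and mean errors across
# all distance measures and k values (RMSD and MAD results slides).
snag_means = {"SPA": 4.8, "BA": 0.5, "QMD": 2.7, "Mean Ht": 14.9}
mean_rmsd = {"SPA": 9.0, "BA": 0.8, "QMD": 3.0, "Mean Ht": 23.4}
mean_mad = {"SPA": 5.1, "BA": 0.5, "QMD": 2.3, "Mean Ht": 15.1}

def relative_error(errors, means):
    """Express each error as a multiple of the variable's observed mean."""
    return {name: errors[name] / means[name] for name in errors}

rel_rmsd = relative_error(mean_rmsd, snag_means)
rel_mad = relative_error(mean_mad, snag_means)

for name in snag_means:
    print(f"{name:8s} RMSD/mean = {rel_rmsd[name]:.2f}  MAD/mean = {rel_mad[name]:.2f}")
```

Mean RMSD exceeds the observed mean for every snag variable (close to twice the mean for snags per acre), and mean MAD is about the same size as the mean itself.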

Page 16: Imputating snag data to forest inventory for wildlife habitat modeling

Results

3) Which distance measure performs best?

4) What is the effect of increasing neighborhood size?

Page 17: Imputating snag data to forest inventory for wildlife habitat modeling
Page 18: Imputating snag data to forest inventory for wildlife habitat modeling
Page 19: Imputating snag data to forest inventory for wildlife habitat modeling
Page 20: Imputating snag data to forest inventory for wildlife habitat modeling
Page 21: Imputating snag data to forest inventory for wildlife habitat modeling

Results

3) Which distance measure performs best?

• All are generally similar
• randomForest imputations provide lower RMSD and MAD but under-predict more than others

4) What is the effect of increasing neighborhood size?

• Increasing k reduces RMSD and MAD
• Little effect on bias
• Slightly decreased range in imputed values with k = 10

Page 22: Imputating snag data to forest inventory for wildlife habitat modeling

Results

5) How do the results compare with random sampling?

RMSD            k          1       2       3       4       5      10
SNAG_TPA_TOTAL  NN     11.24    9.96    9.15    8.70    8.62    8.35
                p     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
SNAG_BA_TOTAL   NN      0.95    0.84    0.77    0.73    0.73    0.71
                p     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
SNAG_QMD        NN      3.55    3.10    2.93    2.85    2.79    2.69
                p     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
SNAG_MEAN_HT    NN     27.89   24.38   22.78   22.02   21.69   20.87
                p     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001

Page 23: Imputating snag data to forest inventory for wildlife habitat modeling

Results

5) How do the results compare with random sampling?

MAD             k          1       2       3       4       5      10
SNAG_TPA_TOTAL  NN      6.03    5.59    5.27    5.16    5.13    4.97
                p     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
SNAG_BA_TOTAL   NN      0.53    0.49    0.46    0.45    0.44    0.43
                p     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
SNAG_QMD        NN      2.51    2.38    2.30    2.26    2.23    2.23
                p     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
SNAG_MEAN_HT    NN     17.46   15.86   14.96   14.67   14.51   14.13
                p     0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
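The size of the k effect can be read off the NN rows of the RMSD and MAD tables. The sketch below just computes the percent drop from k = 1 to k = 10 using the tabled values; the helper name is hypothetical.

```python
# NN rows from the RMSD and MAD tables, for k = 1, 2, 3, 4, 5, 10.
rmsd_by_k = {
    "SNAG_TPA_TOTAL": [11.24, 9.96, 9.15, 8.70, 8.62, 8.35],
    "SNAG_BA_TOTAL": [0.95, 0.84, 0.77, 0.73, 0.73, 0.71],
    "SNAG_QMD": [3.55, 3.10, 2.93, 2.85, 2.79, 2.69],
    "SNAG_MEAN_HT": [27.89, 24.38, 22.78, 22.02, 21.69, 20.87],
}
mad_by_k = {
    "SNAG_TPA_TOTAL": [6.03, 5.59, 5.27, 5.16, 5.13, 4.97],
    "SNAG_BA_TOTAL": [0.53, 0.49, 0.46, 0.45, 0.44, 0.43],
    "SNAG_QMD": [2.51, 2.38, 2.30, 2.26, 2.23, 2.23],
    "SNAG_MEAN_HT": [17.46, 15.86, 14.96, 14.67, 14.51, 14.13],
}

def percent_reduction(series):
    """Percent drop from the first value (k = 1) to the last (k = 10)."""
    first, last = series[0], series[-1]
    return 100.0 * (first - last) / first

for name in rmsd_by_k:
    print(f"{name:15s} RMSD -{percent_reduction(rmsd_by_k[name]):.1f}%"
          f"  MAD -{percent_reduction(mad_by_k[name]):.1f}%")
```

Going from k = 1 to k = 10 lowers RMSD by roughly a quarter for every variable, while MAD falls by a smaller share, and most of the gain is already in hand by k = 3 or 4.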

Page 24: Imputating snag data to forest inventory for wildlife habitat modeling

Results

5) How do the results compare with random sampling?

• p-values of 0.0001 suggest that there is some underlying, though very weak, relationship between snags and overstory

• Imputation is better than just randomly assigning snags to stands
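As a quick check on the tabled p-values, assume the p-value is computed as (m + 1)/(M + 1) with M = 9999 permutations, where m is the number of random permutations that fit at least as well as the imputation (this reading of m is an assumption, since the slides do not define it). The smallest attainable value then occurs at m = 0:

$$p_{\min} = \frac{0 + 1}{9999 + 1} = 0.0001$$

Under that assumption, every NN row in the RMSD and MAD tables sits at this floor, meaning none of the 9999 random permutations matched the observed data as closely as the imputation did.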

Page 25: Imputating snag data to forest inventory for wildlife habitat modeling

Why didn’t it work better?

• Very weak correlations between overstory and snags
  – Snags are from prior stand
    • Many of the snags in the FIA database have advanced decay classes
    • Often snags are larger than QMD
  – Management history
    • Snags were removed at harvest
    • Thinning captures mortality

Page 26: Imputating snag data to forest inventory for wildlife habitat modeling

Future Direction

• Assessing the effects of imputed data on habitat model outputs
  – If there are big differences then what?