
Generic Approaches to Model Validation

Presented at the Growth Model User’s Group, August 10, 2005

David K. Walters


Phases in a Modeling Project

Software Implementation

Arithmetic Specification

Topology

Algebraic Specification

Data Collection

Model Identification

Component Model – Equation Forms

Model Fitting with Data


Where does validation fit in?

Ideal Case – as an integrated component of the model development process, acting as a feedback mechanism

Reality –

Best Case – done once by the modeler using a subset of the modeling data (or other, related techniques); after that it is up to the user. Feedback depends on the persistence of the user and the receptiveness of the modeler. Not integrated…

Probable Case – done once by the modeler using a subset of the modeling data (or other, related techniques); after that it is up to the user. The modeler takes a new job and moves on…


Of what benefit is validation?

Increase Comfort

The user better understands the situations in which the model can be reliably applied and those situations in which it cannot.

Model Improvements – facilitates calibration

To make a model applicable to a new situation:
different treatments/regions/situations
different “scale”

Over/under runs – utilization issues

To weight model output with other data for the purpose of decision making (weighting usually requires some estimate of variability)
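The weighting point usually amounts to some form of inverse-variance weighting. The sketch below is a minimal, hypothetical illustration (the numbers, variable names, and the weighting scheme are assumptions, not from the deck): a model prediction and an independent estimate are combined in proportion to the inverse of their variances, which is why each source needs an estimate of its variability.

```python
import math

# Hypothetical numbers: combine a model's per-acre volume prediction with an
# independent inventory estimate using inverse-variance weights.
model_pred, model_se = 4200.0, 350.0        # ft3/ac and its standard error
inventory_est, inventory_se = 3900.0, 200.0

w_model = 1.0 / model_se ** 2
w_inv = 1.0 / inventory_se ** 2

combined = (w_model * model_pred + w_inv * inventory_est) / (w_model + w_inv)
combined_se = math.sqrt(1.0 / (w_model + w_inv))

print(f"weighted estimate: {combined:.0f} +/- {combined_se:.0f} ft3/ac")
```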


Validating the overall appropriateness

Is the model flexible enough to reproduce desired management alternatives?

Does it provide sufficient detail for decision-making?

How efficient is the model in meeting these goals?

“Everything should be made as simple as possible, but not simpler.” – Albert Einstein

Model Type & Resolution

Whole Stand Models

Individual Tree Models

Process Models

Distance-Dependent vs. Distance-Independent


Validating a Model – Check the Data

Evaluation Data
Application Data
Modeling Data

Differences in Data Populations:

Spatially

Temporally

Culturally

Some research data may be collected with such a high degree of caution that the resulting models will tend to overestimate growth and yield.


Validating the Component Models

Model Component Specification

Equation “forms” – reasonable, consistent with established theory and/or the user’s expectations
Statistical, or other, “fitting” of the component equations


Validating the Implementation – the computer software

Software Implementation

Bugs
Adequacy of outputs / interface
Efficiency


A couple of random thoughts – what else might make a model invalid?

Homogeneity – very few models operationally project plots. Most project stands. Stands are assumed to be homogeneous with respect to exogenous or predictor variables.

Strategic Plan 1/99 Resources
Harvest Schedule – Stand

Year  Age   TPA      Ht
1998   16   345    29.76    0.00   $    -
2000   18   338    36.24    0.00   $    -
2005   23   319    50.80    0.00   $    -
2010   28   299    63.86    0.53   $  (33)
2015   33   280    75.92   15.50   $   67
2020   38   261    86.34   17.05   $   56
2025   43   243    95.78   36.67   $  122
2030   48   227   104.86   41.33   $   98
2035   53   213   112.79   44.64   $   76

But… they are really full of holes.


…misuse

Matching Data Inputs with Model Specifications

Site Index
DBH Thresholds – all versus “merchantable” or other subsets of trees
Others
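As a hypothetical illustration of the DBH-threshold point (the 5-inch limit, tree list, and expansion factor below are assumptions, not from the deck), summarizing all trees versus only merchantable trees gives very different inputs for the same plot:

```python
# Hypothetical illustration: the model is assumed to expect "merchantable"
# trees only (DBH >= 5.0 in); feeding it all-tree summaries would mismatch
# its specification.
MERCH_DBH_IN = 5.0                 # assumed threshold; check the model's docs
tree_dbh = [2.1, 3.4, 4.8, 5.6, 7.2, 9.0, 11.3]   # inches, one plot
expansion = 50.0                   # trees per acre represented by each record

all_tpa = len(tree_dbh) * expansion
merch_tpa = sum(1 for d in tree_dbh if d >= MERCH_DBH_IN) * expansion

print(f"TPA (all trees): {all_tpa:.0f}   TPA (merchantable only): {merch_tpa:.0f}")
```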


Statistics

So, we get to the point of wishing to conduct a data-based validation of some kind.

What do we compare? Real Data vs. Predicted Data
Tree Variables – DBH, Height, Crown, Volume
Stand Variables – QMD, TPA, BA, Volume

If using volume when comparing multiple models, make sure the volume equations are identical.
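When comparing stand variables, the observed tree list has to be aggregated the same way the model aggregates its own output. Below is a minimal sketch (the tree list is hypothetical; the constants are the usual English-unit conventions, with basal area per tree = 0.005454154 × DBH² and QMD the quadratic mean diameter):

```python
import math

# Hypothetical plot tree list: (DBH in inches, trees/acre each record represents)
trees = [(6.2, 20.0), (8.5, 20.0), (11.1, 20.0), (13.4, 20.0)]

BA_CONST = 0.005454154   # ft2 of basal area per in2 of DBH

tpa = sum(expf for _, expf in trees)
ba = sum(BA_CONST * dbh ** 2 * expf for dbh, expf in trees)          # ft2/ac
qmd = math.sqrt(sum(dbh ** 2 * expf for dbh, expf in trees) / tpa)   # inches

print(f"TPA = {tpa:.0f}, BA = {ba:.1f} ft2/ac, QMD = {qmd:.1f} in")
```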


What statistics to use?

bias = (1/n) Σ dᵢ
…measures the expected error when a collection of observations is averaged or totaled

accuracy = (1/n) Σ |dᵢ|
…measures the average error associated with the prediction of any one observation

rmse = sqrt( (1/n) Σ dᵢ² )
…an alternative to |dᵢ|, but less robust to outliers

where dᵢ is the difference between the predicted and observed values for observation i
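A minimal sketch of these three statistics in code (the sign convention dᵢ = predicted − observed and the example numbers are assumptions, not from the deck):

```python
import math

def validation_stats(predicted, observed):
    """Bias, accuracy (mean absolute error), and RMSE as defined above."""
    d = [p - o for p, o in zip(predicted, observed)]  # d_i = predicted - observed
    n = len(d)
    return {
        "bias": sum(d) / n,                            # expected error of an average/total
        "accuracy": sum(abs(x) for x in d) / n,        # average error for one observation
        "rmse": math.sqrt(sum(x * x for x in d) / n),  # like accuracy, less robust to outliers
    }

# Hypothetical predicted vs. observed basal areas (ft2/ac)
print(validation_stats([112.0, 98.5, 130.2], [108.0, 101.0, 124.5]))
```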


The Overall Project – Two Approaches

Case 1 – We have repeat measurements (growth data)

Using the observed inputs, run the real data through the model. Look at time 2 (or 3, etc.) predicted versus real. Calculate statistics.

Case 2 – No repeat measurements

Simulation study:

Identify a matrix of input variables (site, density, stocking, etc.) that covers the range of interest.

Run the model for each row of the input matrix.
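A sketch of how such an input matrix might be built (the variables, their ranges, and the run_growth_model placeholder are assumptions, not a real model interface):

```python
from itertools import product

# Hypothetical design for a Case 2 simulation study.
site_index = [60, 70, 80, 90]        # base-age site index, feet
planting_tpa = [300, 450, 600]       # trees per acre
treatment = ["none", "thin"]

input_matrix = [
    {"site": s, "tpa": t, "treatment": tr}
    for s, t, tr in product(site_index, planting_tpa, treatment)
]

for row in input_matrix:
    # projection = run_growth_model(**row, horizon=40)   # placeholder model call
    pass

print(f"{len(input_matrix)} combinations cover the range of interest")
```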


Validation – Patterns and Trends

In either case, you will want to look for trends in how the predictions (or residuals, if you have real data) change. Examples:

MAI over time
TPA over time
Results vs. predictor variables (site, treatment, density)
How do long-term predictions compare to “laws” (self-thinning, etc.)?
How do the model predictions compare to other models?
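One simple way to look for such trends is to regress the residuals on a predictor; a slope clearly different from zero flags a systematic pattern. A hypothetical sketch (the values and the choice of site index as the predictor are assumptions, not from the deck):

```python
# Residual trend check: observed - predicted regressed against a predictor
# such as site index (hypothetical values).
site_index = [55.0, 62.0, 68.0, 74.0, 81.0, 88.0]
residual = [3.1, 1.8, 0.4, -0.9, -2.6, -4.0]     # e.g., BA residuals, ft2/ac

n = len(site_index)
mean_x = sum(site_index) / n
mean_y = sum(residual) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(site_index, residual)) \
        / sum((x - mean_x) ** 2 for x in site_index)

print(f"residual trend: {slope:.3f} per unit of site index")
```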


In Summary,

Identify the alternative “models”; establish a frame of reference.

Examine the big picture. Look at the sample used in the model calibration, the presumed population, and a sample of “your” population.

Identify the key component models. Compare predictions with data – bias and accuracy. Examine these for trends against appropriate factors.

Look at the overall model output…the computer code. Are there errors? Evaluate output with data (volume per acre – aggregated variables).


Final Points

Remember, there is always an alternative model. When evaluating a model, give careful thought to the alternative. How well a model performs in relation to the alternative is generally the most relevant question.

Validity is relative, as are other things.


All you need in this life is ignorance and confidence -- and then success is sure.

--Mark Twain
