
A Core Course on Modeling

The modeling process

[Figure: the modeling process. Phases: define, conceptualize, formalize, execute, conclude. Steps: formulate purpose, identify entities, choose relations, obtain values, formalize relations, operate model, obtain result, present result, interpret result.]

Week 1- No Model Without a Purpose

Right problem?

Right concepts?

Right model?

Right outcome?

Right answer?


Contents

• What do we mean by Confidence?
  • Validation and Verification, Accuracy and Precision
  • Distributions to Indicate Uncertainty
• Distance and Similarity
• Confidence in Black Box models
  • Features from Data Sets
  • Example of the Value of a Black Box Model
  • Validating a Black Box Model
• Confidence in Glass Box Models
  • Structural Validity Assessment
  • Quantitative Validity Assessment
• Summary
• References to lecture notes + book
• References to quiz-questions and homework assignments (lecture notes)

Week 6-Models and Confidence


What do we mean by Confidence?

‘96% of the contents of the universe is unknown dark matter + energy’

so: ‘we can’t have confidence in cosmological models’

blueberry marmalade?


Not quite:

confidence is only assessable when

• modeled system

• model

• modeling purpose

are all known

[Diagram relating model, modeled system, and purpose, with edge labels ‘represented by’, ‘should fulfill’, and ‘with respect to’; confidence ‘needs’ each of the three.]

example 1:

elegant and simple model (elementary secondary school physics, say mechanics of levers and slides)

modeled system: not explicitly defined

purpose: to pass one’s exam


example 1:

elegant and simple model (elementary secondary school physics, say mechanics of levers and slides)

modeled system: ship yard

purpose: to secure safe launch


example 1:

elegant and simple model (elementary secondary school physics, say mechanics of levers and slides)

modeled system: ship yard

purpose: to find direction of moving ship (uphill or downhill?)


example 2:

model: full event log

modeled system: Internet traffic

purpose: diagnose performance bottlenecks


example 2:

model: full event log

modeled system: Internet traffic

purpose: document for archiving


example 2:

model: aggregated data

modeled system: Internet traffic

purpose: document for archiving


example 2:

model: aggregated data

modeled system: Internet traffic

purpose: analyse performance bottlenecks

Validation and Verification, Accuracy and Precision

Terms in the literature to discuss confidence:

‘Validus’: strong

Validation: is it the right model?

• consistency model - modeled system
  • e.g. are cat.-III values correct?
  • does the model behave intuitively?
• consistency model - purpose
  • e.g. are cat.-II values conclusive?

‘Veritas’: truth

Verification: is the model right?

• consistency conceptual - formal model
  • e.g. are dimensions correct?
  • is the graph acyclic?
  • are values within admitted bounds cf. types?

[Diagram: the model, modeled system, and purpose diagram again, now with the model annotated ‘conceptual & formal’.]

Terms in the literature to discuss confidence: validation, verification, accuracy, precision.

[Figure: panels illustrating accuracy versus precision.]

• high accuracy, low precision: low bias (offset, systematic error), large spreading
• low accuracy, high precision: low spreading (noise, randomness), large bias
• low accuracy, low precision: large spreading, large bias
• high accuracy, high precision: low spreading, low bias
• outlier (freak accident, miracle, …): a single result gives no information; look at ensembles

Accuracy can only be assessed with ground truth. Assessing precision needs no ground truth (reproducibility).
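The difference between bias and spreading can be made concrete on an ensemble of repeated results. The following is a minimal sketch, not part of the slides: the function name `bias_and_spread`, the ensembles, and the ground truth of 10.0 are all hypothetical. It shows that spreading (precision) can be computed from the ensemble alone, whereas bias (accuracy) needs a ground truth.

```python
import statistics

def bias_and_spread(results, ground_truth=None):
    """Return (bias, spread) for an ensemble of results.

    spread (precision) needs no ground truth; bias (accuracy) does,
    so bias is None when no ground truth is supplied.
    """
    spread = statistics.pstdev(results)
    if ground_truth is None:
        return None, spread
    return statistics.fmean(results) - ground_truth, spread

# Hypothetical ensembles around an assumed ground truth of 10.0
accurate_imprecise = [8.0, 12.0, 9.0, 11.0]    # low bias, large spreading
precise_inaccurate = [12.9, 13.0, 13.1, 13.0]  # large bias, low spreading

print(bias_and_spread(accurate_imprecise, 10.0))
print(bias_and_spread(precise_inaccurate, 10.0))
print(bias_and_spread(precise_inaccurate))     # precision only, no ground truth
```

A single result would give a spread of zero and say nothing about reproducibility, which is why the ensemble is the unit of analysis here.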

Distributions to Indicate Uncertainty

Validation, verification, accuracy and precision all lead to uncertainty, represented as a distribution giving the chance (density) of a particular but uncertain outcome, with some average and some spreading.

Gaussian (normal) distribution: the sum of sufficiently many uncorrelated numbers with average μ and spreading σ has approximately a normal distribution. E.g.: the weight distribution of 18-year-old Americans.

Uniform distribution: all outcomes in an interval between μ-σ and μ+σ (average μ, half-width σ) have equal probability (e.g., dice: μ=3.5, σ=2.5).

Distributions can be continuous (measuring) or discrete (counting, e.g. dice)
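Both distributions can be illustrated with dice. The sketch below is an illustration only: the sample sizes, the seed, and the helper names `roll_sum` and `fraction_within` are assumptions. A single die follows a discrete uniform distribution, while the sum of many dice is approximately normal.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def roll_sum(n_dice):
    """Sum of n_dice independent fair dice (each uniform on 1..6)."""
    return sum(random.randint(1, 6) for _ in range(n_dice))

single = [roll_sum(1) for _ in range(20000)]
sums = [roll_sum(30) for _ in range(20000)]

# One die: discrete uniform distribution, theoretical average 3.5
print(statistics.fmean(single))
# Sum of 30 dice: approximately normal, theoretical average 30 * 3.5 = 105
print(statistics.fmean(sums), statistics.pstdev(sums))

def fraction_within(samples, k):
    """Fraction of samples within k standard deviations of the mean."""
    m, s = statistics.fmean(samples), statistics.pstdev(samples)
    return sum(abs(x - m) <= k * s for x in samples) / len(samples)

# For a normal distribution roughly 68% lies within one standard deviation
print(fraction_within(sums, 1))
```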

Uncertain model outcome and purpose:

Example 1.

Model used for decision making (e.g., diagnosis; classification ‘good’ or ‘bad’).

Confidence for diagnosis support: compare the model outcome against a threshold. Confidence is lower if the areas left and right of the threshold are less different.

[Figure: three distributions against a threshold, captioned high confidence, medium confidence, low confidence.]

Validation: is the threshold at the right place? Does checking with this threshold mean anything w.r.t. the purpose?

Verification (for glass box): do we calculate the distribution correctly?

Accuracy: are we sure there is no bias?

Precision: can we obtain narrower distributions?
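The comparison of the areas on either side of the threshold can be sketched numerically. This is an illustration under assumptions that are not in the slides: the outcome is taken to be normally distributed, and the score `decision_confidence` (the absolute difference between the two areas) is a hypothetical way to express how different they are.

```python
import math

def normal_cdf(x, mean, spread):
    """Area of a normal distribution (mean, spread) to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (spread * math.sqrt(2.0))))

def threshold_areas(mean, spread, threshold):
    """Areas of the outcome distribution left and right of the threshold."""
    left = normal_cdf(threshold, mean, spread)
    return left, 1.0 - left

def decision_confidence(mean, spread, threshold):
    """Hypothetical score: 0 when both areas are equal, near 1 when one dominates."""
    left, right = threshold_areas(mean, spread, threshold)
    return abs(left - right)

# Assumed threshold of 5.0 and spreading of 1.0; only the average moves
for mean in (8.0, 5.5, 5.0):
    print(mean, threshold_areas(mean, 1.0, 5.0), decision_confidence(mean, 1.0, 5.0))
```

A narrower distribution (better precision) raises the score for the same average, which is why the precision question on the slide matters for the decision.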

Uncertain model outcome and purpose:

Example 2.

Model used in design: computed uncertainty intervals should be small enough to assess if A or B is better.

Confidence for design decision support: compare one model outcome against a second model outcome. Confidence is lower if the areas of the two distributions have a larger overlap.

[Figure: pairs of distributions A and B with increasing overlap, captioned high confidence, medium confidence, low confidence.]
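One way to quantify how well A and B can be told apart is the probability that A really exceeds B. This is a swapped-in measure rather than the overlap area drawn on the slide, and it assumes independent, normally distributed outcomes; the function name and the numbers are hypothetical.

```python
import math

def prob_a_exceeds_b(mean_a, spread_a, mean_b, spread_b):
    """P(A > B) for independent, normally distributed outcomes A and B."""
    # A - B is normal with mean (mean_a - mean_b)
    # and variance spread_a**2 + spread_b**2
    z = (mean_a - mean_b) / math.hypot(spread_a, spread_b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(prob_a_exceeds_b(10.0, 1.0, 5.0, 1.0))   # little overlap: close to 1
print(prob_a_exceeds_b(10.0, 3.0, 9.0, 3.0))   # large overlap: not far from 0.5
print(prob_a_exceeds_b(10.0, 3.0, 10.0, 3.0))  # identical distributions: 0.5
```

Values near 0.5 mean the design decision is essentially a coin flip; values near 0 or 1 mean the uncertainty intervals are small enough for the purpose.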

Confidence in black box models

The black box in aircraft, although colored orange for easier retrieval, is very much a black box model, in the sense that it only takes in data. Confidence in black boxes is essential, e.g. to reconstruct or diagnose the occurrences during an incident.

Black box models have empirical data as input.

Quantities try to capture essential behavior of this data.

Quantities typically involve aggregation.

Most common aggregations:

• average, standard deviation (univariate: every item is a single quantity)
• correlation, fit (bivariate: every item is a pair of quantities)

Features from Data Sets

Average:

What is the central tendency in a set?

(mathematical details: see data modeling or statistics courses)

‘Averages’ can be computed for all sorts of sets – provided that the properties of the elements allow averaging. The ‘average’ face is an important concept in automated face recognition.

Standard deviation (σ; variance is σ²):

How closely packed is a set?

(mathematical details: see data modeling or statistics courses)

Standard deviation is a measure of the amount of variation in a set of values.

Correlation (ρ):

What is the agreement between two sets (= a measure for similarity)?

(mathematical details: see data modeling or statistics courses)

‘Correlation’ is a form of similarity. An interesting case is self-similarity: sometimes an object is similar to a scaled and perhaps transformed copy of itself. Mathematical objects called fractals are self-similar, but some natural objects (e.g. Romanesco broccoli) also classify as (nearly) self-similar.
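The three aggregations (average, standard deviation, correlation) can be written out directly. The sketch below uses the population definitions (dividing by the number of items) and the Pearson form of correlation; both choices, and the sample data, are assumptions, since the slides defer the mathematical details to other courses.

```python
import math

def average(values):
    """Central tendency of a univariate set."""
    return sum(values) / len(values)

def standard_deviation(values):
    """Population standard deviation: how closely packed is the set?"""
    m = average(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def correlation(xs, ys):
    """Pearson correlation of a bivariate set given as two equal-length lists."""
    mx, my = average(xs), average(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return covariance / (standard_deviation(xs) * standard_deviation(ys))

data = [2, 4, 4, 4, 5, 5, 7, 9]            # hypothetical univariate set
print(average(data), standard_deviation(data))
print(correlation([1, 2, 3], [2, 4, 6]))   # perfectly aligned pairs
print(correlation([1, 2, 3], [6, 4, 2]))   # perfectly opposed pairs
```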

Example of the Value of a Black Box Model

Fit: an example of extracting a meaningful pattern from data.

Example: data set $(x_i, y_i)$, assume a linear dependency $y=f(x)$.

Intuition: find a line $y=ax+b$ such that the sum of squares of the vertical differences is minimal (mathematical details: see data modeling or statistics courses).

Patterns in data are often more valuable than the unprocessed data. Hence the name ‘data mining’ for extracting this value.

[Figure: successive candidate lines through the data, captioned ‘very bad’, ‘still not good’, ‘try again’, ‘good (best?)’.]
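The line that minimizes the sum of squared vertical differences has a closed-form solution, so no trial and error is needed. A minimal sketch, assuming the standard least squares formulas for slope and intercept; the function names and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least squares line y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def sum_of_squares(xs, ys, a, b):
    """Sum of squared vertical differences between the data and y = a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, roughly following y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

a, b = fit_line(xs, ys)
print(a, b, sum_of_squares(xs, ys, a, b))
# Any other line gives a larger sum of squares
print(sum_of_squares(xs, ys, a + 0.1, b), sum_of_squares(xs, ys, a, b + 0.5))
```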

Validating a Black Box Model

A black box model should explain the essence of a body of data.

Subtracting the explained part of the data should leave little of the initial data.

For data $(x_i, y_i)$, ‘explained’ by a model $y=f(x)$, the part left over is

$\sum_i (y_i - f(x_i))^2$.

This should be small compared to

$\sum_i (y_i - \bar{y})^2$, where $\bar{y}$ is the average of the $y_i$ (= what you would get assuming no functional dependency).

Therefore: confidence is high iff

$\sum_i (y_i - f(x_i))^2 \,/\, \sum_i (y_i - \bar{y})^2 \ll 1$.

Residue literally means ‘left over’. To assess confidence of a black box model, one should check if there is not too much unexplained information left in the initial data.
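The residue check is a ratio of two sums of squares: what the model leaves unexplained, divided by what is left when no functional dependency is assumed. A minimal sketch with hypothetical data and a hypothetical model `f(x) = 2x + 1`:

```python
def residual_ratio(xs, ys, f):
    """Unexplained sum of squares divided by the sum of squares around the average of y."""
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - f(x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return unexplained / total

# Hypothetical data, roughly following y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

linear_ratio = residual_ratio(xs, ys, lambda x: 2.0 * x + 1.0)
constant_ratio = residual_ratio(xs, ys, lambda x: sum(ys) / len(ys))

print(linear_ratio)    # much smaller than 1: the line explains most of the data
print(constant_ratio)  # about 1: a constant explains nothing beyond the average
```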

A black box model should be distinctive, that is: it should make it possible to distinguish input sets that intuitively are distinct.

Average, variance and least squares may not be as distinctive as you would like.

Anscombe (1973) constructed 4 very distinct data sets with equal average, variance and least squares fits.

Early conclusion: ‘these sets are similar’.
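The lack of distinctiveness can be checked numerically. The sketch below uses the values commonly tabulated for Anscombe's four data sets, reproduced here from memory, so they should be checked against the 1973 paper before being relied on. The helper `summarize` is hypothetical and uses the sample variance.

```python
import statistics

X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def summarize(xs, ys):
    """Average and sample variance of y, plus the least squares line through (x, y)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return {
        "mean_y": mean_y,
        "var_y": statistics.variance(ys),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }

for name, (xs, ys) in QUARTET.items():
    print(name, {key: round(value, 2) for key, value in summarize(xs, ys).items()})
```

The four summaries agree to about two decimals, although a plot of each set looks entirely different; the aggregations alone cannot tell the sets apart.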

1. Raw data is reasonably well explained by a linear least squares fit (low residue). So what?

2. Challenge the hypothesis that the raw data stems from one set. Cluster analysis reveals two sets. Get even lower residuals with 4 clusters, taking ‘Jamaica or not Jamaica’ into account. Should Olympic Games have Jamaican athletes in a separate category or not? What are the criteria for justifiable segregation? (categories in Paralympics!)

3. Conclusion 1: women will overtake men in 2050? What are the assumptions on which this conclusion is based? Seek an argument from probabilities, calculating error distributions of the coordinates of the intersection point.

4. Conclusion 2: men will break the 0 second record around 2200? This is impossible for physical reasons. But not all black box models involve physics.


Confidence in Glass Box Models

Glass box models compute values for output quantities in dependence on input quantities.

Claim: for every purpose, defined in terms of output quantities, fulfilling the purpose amounts to the uncertainty distribution of the output quantities being sufficiently narrow.

We have seen an example on this sheet.

The value, produced by a glass box (model), can be assessed via its output quantities: these should have sufficiently narrow uncertainty intervals (given the purpose!).

Structural Validity Assessment

Qualitative validation (structural confidence)

1: examine dependencies in the functional network

Select any pair of quantities and graphically compare their dependency with what you expect. This tests the dependencies in between, even if they involve multiple parallel dependency routes; and if there is no dependency, there is no graph.

[Figure: plots of output against input, calculated versus expected.]

2: examine if long range behavior is right

Asymptotic behavior is often simpler to predict: a glass box model at least should behave right in the extremes.

3: examine if singular behavior in isolated points is right

Singular behavior of a model means: the behavior in exceptional conditions (e.g., something is 0, two values are equal …).

4: examine if things that should converge, have converged

Many mathematical results cannot be calculated in closed form, but require the contribution of many terms. This can only be approximated, but we must make certain that we include at least enough terms.
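A convergence check can be built into the computation itself: keep adding terms until the next term no longer matters, and fail loudly otherwise. A minimal sketch, using the Taylor series of the exponential function as a stand-in example (the choice of series, the tolerance, and the term limit are assumptions, not from the slides):

```python
import math

def exp_series(x, tolerance=1e-12, max_terms=200):
    """Approximate exp(x) by its Taylor series.

    Returns (value, number_of_terms_used); raises if the terms have not
    become negligible within max_terms.
    """
    total, term = 0.0, 1.0
    for n in range(max_terms):
        total += term
        term *= x / (n + 1)  # next term: x**(n+1) / (n+1)!
        if abs(term) < tolerance * abs(total):
            return total, n + 1
    raise ArithmeticError("series did not converge within max_terms")

value, terms = exp_series(1.0)
print(value, math.e, terms)

value10, terms10 = exp_series(10.0)
print(value10, math.exp(10.0), terms10)  # larger argument needs more terms
```

Reporting the number of terms alongside the value makes it visible when an input pushes the model into a regime where the default truncation would have been too short.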

Quantitative Validity Assessment

Quantitative validation:

A glass box as an input-output function may amplify or dampen uncertainties in its input.

Sensitivity: a function can be said to ‘react’ to changes in its input. If a function is very sensitive, uncertainties in the input will amplify to larger uncertainties in the output. The opposite is when the function hardly reacts to any changes in the input.

[Figure: two plots of output uncertainty against input uncertainty, one for a sensitive and one for an insensitive function.]

For $y=f(x)$, spreading in $x$ causes spreading in $y$.

For small $\Delta x$, we have

$\Delta y = (\Delta y / \Delta x)\,\Delta x \approx (dy/dx)\,\Delta x = f'(x)\,\Delta x$

So for relative spreading $\Delta y/y$ and $\Delta x/x$ (expressed in %), we have

$(\Delta y/y) / (\Delta x/x) = f'(x)\,x/y := c(x)$ (condition number).

$c(x)=1$: 5% spread in $x$ causes 5% spread in $y$. Large $c(x)$: unstable!

Condition number is the ratio in relative spreading between output and input: the propagation of uncertainty.
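The condition number can be estimated numerically for any function of one argument. A minimal sketch, assuming a central difference with a hypothetical step size `h` to approximate the derivative; the example functions are illustrations only.

```python
import math

def condition_number(f, x, h=1e-6):
    """Estimate c(x) = f'(x) * x / f(x) using a central difference."""
    derivative = (f(x + h) - f(x - h)) / (2.0 * h)
    return derivative * x / f(x)

def cube(x):
    return x ** 3

def near_cancellation(x):
    return x - 1.0  # subtracting nearly equal numbers close to x = 1

print(condition_number(cube, 2.0))                # analytically 3
print(condition_number(math.sqrt, 4.0))           # analytically 0.5
print(condition_number(near_cancellation, 1.01))  # analytically 101: unstable

# Relative spreading in y for an assumed 5% relative spreading in x
print(abs(condition_number(cube, 2.0)) * 0.05)
```

The last example shows why a harmless-looking subtraction can dominate the uncertainty of a model: a 1% spread in the input becomes roughly a 100% spread in the output.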

For $y=f(x)$, we have

$(\Delta y/y) = c(x)\,(\Delta x/x)$

What about $y=f(x_1,x_2,x_3,\dots)$?

First try:

$(\Delta y/y) = \sum_i |c(x_i)|\,(\Delta x_i/x_i)$.

This is too pessimistic: if the $x_i$ are independent, they will not all be extreme at once. A better formula is:

$(\Delta y/y)^2 = \sum_i c^2(x_i)\,(\Delta x_i/x_i)^2$.

Most glass box models are functions with several arguments. The uncertainties mix, by adding their spreadings squared.
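Adding the squared relative spreadings, each weighted by its squared condition number, can be sketched for a function of several arguments. Assumptions in this sketch: the arguments are independent and non-zero, the partial derivatives are estimated with central differences, and the names `relative_spread` and `pessimistic_spread` are hypothetical.

```python
import math

def condition_numbers(f, xs, h=1e-6):
    """Estimate c(x_i) = (df/dx_i) * x_i / y for every argument (all x_i non-zero)."""
    y = f(*xs)
    result = []
    for i, x in enumerate(xs):
        step = h * abs(x)
        up, down = list(xs), list(xs)
        up[i], down[i] = x + step, x - step
        partial = (f(*up) - f(*down)) / (2.0 * step)
        result.append(partial * x / y)
    return result

def relative_spread(f, xs, rel_spreads):
    """Relative spreading of y: square root of the sum of (c_i * relative spread_i)**2."""
    cs = condition_numbers(f, xs)
    return math.sqrt(sum((c * r) ** 2 for c, r in zip(cs, rel_spreads)))

def pessimistic_spread(f, xs, rel_spreads):
    """First try from the slide: sum of |c_i| * relative spread_i."""
    cs = condition_numbers(f, xs)
    return sum(abs(c) * r for c, r in zip(cs, rel_spreads))

def product(a, b):
    return a * b

# Assumed 3% and 4% relative spreading in the two arguments
print(relative_spread(product, [2.0, 5.0], [0.03, 0.04]))     # about 0.05
print(pessimistic_spread(product, [2.0, 5.0], [0.03, 0.04]))  # about 0.07
```

Listing the individual products of condition number and relative spreading is a direct way to see which argument dominates the output uncertainty.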

Properties of $(\Delta y/y)^2 = \sum_i c^2(x_i)\,(\Delta x_i/x_i)^2$:

• All $\Delta x_i$ occur squared. Therefore, spreading is proportional to $\sqrt{n}$ rather than $n$ for $n$ arguments.
• All $c_i = c(x_i)$ occur squared. So even if $\partial f/\partial x_i < 0$: no compensation with ‘negative contributions’.
• One rotten apple …
• To seek room for improvement, search for $x_i$ with large spreading $\Delta x_i$ and large $c_i$.

Room for improvement: sensitivity analysis helps to assess if adding a functional expression will improve the glass box model.

Summary

• Modeling involves uncertainty because of different causes:
  • Differences between accuracy and precision;
  • Uncertainty distributions of values rather than a single value (normal, uniform);
  • The notions of distance and similarity;
• Confidence for black box models:
  • Common features of aggregation: average, standard deviation and correlation;
  • Validation of a black box model:
    • Residual error: how much of the behavior of the data is captured in the model?
    • Distinctiveness: how well can the model distinguish between different modeled systems?
    • Common sense: how plausible are conclusions, drawn from a black box model?
• Confidence for glass box models:
  • Structural validity: do we believe the behavior of the mechanism inside the glass box?
  • Quantitative validity: what is the numerical uncertainty of the model outcome?
  • Sensitivity analysis and the propagation of uncertainty in input data;
  • Sensitivity analysis to decide if a model should be improved.