
NASSP Masters 5003F - Computational Astronomy - 2010

Lecture 7 – chi squared and all that

• Testing for goodness-of-fit continued.

• Uncertainties in the fitted parameters.

• Confidence intervals.

• The Null Hypothesis.

Hypothesis testing continued.

• Procedure:

1. “Suppose the model is a perfect fit.”

2. Calculate the survival function (SF) for the $\chi^2$ of pure noise with $N-M$ degrees of freedom.

3. Draw a vertical line at the measured value of $\chi^2$.

4. The y value at which this vertical intersects the SF is the probability that a perfect model would give a $\chi^2$ at least this large by random fluctuation.
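The procedure above can be sketched numerically. This is a minimal illustration, assuming a straight-line model and made-up data; the names `x`, `y`, `sigma` and the seed are hypothetical, and `scipy.stats.chi2.sf` stands in for reading the value off a plotted survival function.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)          # hypothetical seed, for repeatability

# Made-up data: a straight line plus Gaussian noise of known sigma.
x = np.linspace(0.0, 10.0, 20)
sigma = np.full(x.size, 0.5)
y = 1.0 + 2.0 * x + rng.normal(0.0, sigma)

# Weighted least-squares fit of M = 2 parameters (intercept, slope).
A = np.vstack([np.ones_like(x), x]).T
theta_hat, *_ = np.linalg.lstsq(A / sigma[:, None], y / sigma, rcond=None)

# Measured chi squared and the number of degrees of freedom N - M.
chi2_meas = np.sum(((y - A @ theta_hat) / sigma) ** 2)
dof = x.size - theta_hat.size

# Survival function evaluated at the measured chi squared.
p_value = stats.chi2.sf(chi2_meas, dof)
print(f"chi2 = {chi2_meas:.2f}, dof = {dof}, P(>= chi2) = {p_value:.3f}")
```

A small `p_value` says that a perfect model would rarely give a $\chi^2$ this large by chance.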


[Figure: plot with vertical axis labelled "Survival function".]

Questions answered so far:

• In fitting a model, we want:

1. The best-fit values of the parameters;

2. Then we want to know if these values are good enough, i.e. whether the model is a good fit to the data.

3. If the model passes, we want uncertainties in the best-fit parameters.

• Number 1 is accomplished. ✓

• Number 2 is accomplished. ✓


Uncertainties in the best-fit parameters

• Usually what one gets is a covariance matrix (mentioned in lecture 4):

$$E = \begin{pmatrix} \sigma_1^2 & \sigma_{12}^2 & \cdots \\ \sigma_{21}^2 & \sigma_2^2 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}$$

• This is a symmetric matrix: $\sigma_{ij}^2=\sigma_{ji}^2$ for all $i,j$.

• For $U=\chi^2$, $E=2(H_{\mathrm{bestfit}})^{-1}$, where $H_{\mathrm{bestfit}}$ is the Hessian, evaluated at the best-fit values of the $\theta_i$.

• For $U=-L$, $E=F^{-1}$, where $F$ is the “Fisher Information Matrix”:

$$F_{i,j} = -\left.\frac{\partial^2 L}{\partial\theta_i\,\partial\theta_j}\right|_{\mathrm{bestfit}}$$

• These definitions are equivalent!

– For Gaussian data, identical.
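As a sketch of how $E=2H^{-1}$ might be evaluated in practice, the following estimates the Hessian of $\chi^2$ by finite differences. The straight-line model, the data, and the helper `numerical_hessian` are all assumptions made for illustration, not part of the lecture.

```python
import numpy as np

def chi2(theta, x, y, sigma):
    """Chi squared for the illustrative model y = theta[0] + theta[1] * x."""
    model = theta[0] + theta[1] * x
    return np.sum(((y - model) / sigma) ** 2)

def numerical_hessian(f, theta, step=1e-3):
    """Central finite-difference estimate of the matrix of second derivatives."""
    theta = np.asarray(theta, dtype=float)
    m = theta.size
    hess = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            ei = np.zeros(m)
            ej = np.zeros(m)
            ei[i] = step
            ej[j] = step
            hess[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                          - f(theta - ei + ej) + f(theta - ei - ej)) / (4 * step**2)
    return hess

rng = np.random.default_rng(2)           # hypothetical seed
x = np.linspace(0.0, 10.0, 25)
sigma = np.full(x.size, 0.3)
y = 0.5 + 1.5 * x + rng.normal(0.0, sigma)

# Best fit by weighted linear least squares (valid because the model is linear).
A = np.vstack([np.ones_like(x), x]).T
theta_hat, *_ = np.linalg.lstsq(A / sigma[:, None], y / sigma, rcond=None)

H = numerical_hessian(lambda t: chi2(t, x, y, sigma), theta_hat)
E = 2.0 * np.linalg.inv(H)               # covariance matrix, E = 2 H^-1
print("parameter uncertainties:", np.sqrt(np.diag(E)))
```

For a model that is linear in its parameters $\chi^2$ is exactly quadratic, so the finite-difference Hessian agrees with the analytic one up to rounding.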

The Hessian or curvature matrix

• The contours are ellipses in the limit as the minimum is approached.

– Ellipsoidal hypercontours in the general case that $M>2$.

• Semiaxes aligned with the eigenvectors of $H$.

• Small semiaxis: large curvature; small uncertainty in that direction.

[Figure: contours of $U$; arrows show the eigenvectors.]

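1-parameter example

1) Gaussian data, $U=\chi^2$.

$$U = \sum_{i=1}^{N} \frac{(y_i-\theta)^2}{\sigma_i^2}$$

– For this simple model, we can find the best-fit $\theta$ without numerical minimization:

$$\frac{\partial U}{\partial\theta} = -2\sum_{i=1}^{N} \frac{y_i-\theta}{\sigma_i^2}$$

– Setting this to zero gives:

$$\hat\theta = \frac{\sum_{i=1}^{N} y_i/\sigma_i^2}{\sum_{i=1}^{N} 1/\sigma_i^2}.$$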

Sidebar – optimum weighted average

• A weighted average is:

$$\hat\mu = \frac{\sum_i w_i y_i}{\sum_i w_i}$$

• Since the $y_i$ are random variables, so is $\hat\mu$.

• Therefore it will have a PDF and an uncertainty $\sigma_\mu$.

• The smallest uncertainty is given for

$$w_i = \frac{1}{\sigma_i^2}$$

– Exactly what we have from the $\chi^2$ fit.
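A quick numerical check of the sidebar, assuming uncorrelated $y_i$ so that the variance of the weighted average is $\sum_i w_i^2\sigma_i^2/(\sum_i w_i)^2$. The uncertainties below are made up and the helper name is hypothetical.

```python
import numpy as np

def weighted_mean_variance(w, sigma):
    """Variance of sum(w*y)/sum(w) for uncorrelated y_i with std devs sigma_i."""
    w = np.asarray(w, dtype=float)
    return np.sum(w**2 * sigma**2) / np.sum(w) ** 2

sigma = np.array([0.5, 1.0, 2.0, 4.0])   # made-up uncertainties

var_equal = weighted_mean_variance(np.ones_like(sigma), sigma)
var_optimal = weighted_mean_variance(1.0 / sigma**2, sigma)

print("equal weights:           ", var_equal)
print("inverse-variance weights:", var_optimal)
print("1 / sum(1/sigma^2):      ", 1.0 / np.sum(1.0 / sigma**2))
```

The inverse-variance weights give the smaller variance, and that variance equals $1/\sum_i \sigma_i^{-2}$.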

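Back to the 1-parameter example.

– Again, because this model is so simple, we can calculate $\sigma_{\hat\theta}$ by direct propagation of uncertainties.

• $\hat\theta$ is a function of $N$ uncorrelated random variables $y_i$, so

$$\sigma_{\hat\theta}^2 = \sum_{i=1}^{N} \left(\frac{\partial\hat\theta}{\partial y_i}\right)^2 \sigma_i^2$$

• It is fairly easy to show that:

$$\frac{1}{\sigma_{\hat\theta}^2} = \sum_{i=1}^{N} \frac{1}{\sigma_i^2}$$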

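What does the standard approach give?

• Hessian is a 1-element matrix:

$$H_{1,1} = \left.\frac{\partial^2 U}{\partial\theta^2}\right|_{\mathrm{bestfit}} = 2\sum_{i=1}^{N} \frac{1}{\sigma_i^2}$$

• Hence

$$\sigma_{\hat\theta}^2 = 2H_{1,1}^{-1} = \left[\sum_{i=1}^{N} \frac{1}{\sigma_i^2}\right]^{-1}$$

• QED.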
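The agreement between the two routes can be checked numerically. This is a sketch with made-up data for the constant model $y_i=\theta$ plus noise; the second derivative is taken by finite differences rather than analytically, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)                  # hypothetical seed
theta_true = 10.0                               # made-up constant level
sigma = rng.uniform(0.5, 2.0, size=50)          # made-up per-point uncertainties
y = theta_true + rng.normal(0.0, sigma)

# Best fit from setting dU/dtheta = 0: the inverse-variance weighted mean.
w = 1.0 / sigma**2
theta_hat = np.sum(w * y) / np.sum(w)

def U(theta):
    """Chi squared for the constant model."""
    return np.sum((y - theta) ** 2 / sigma**2)

# Uncertainty from the 1-element Hessian: sigma^2 = 2 / H_11.
h = 1e-3
H11 = (U(theta_hat + h) - 2.0 * U(theta_hat) + U(theta_hat - h)) / h**2
sigma_theta_hessian = np.sqrt(2.0 / H11)

# Uncertainty from direct propagation of uncertainties.
sigma_theta_direct = 1.0 / np.sqrt(np.sum(w))

print(theta_hat, sigma_theta_hessian, sigma_theta_direct)
```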

1-parameter example continued

2) Poisson data, $U=-L$. (No point in using $-L$ for Gaussian data; it is then mathematically the same as chi squared.)

$$U = -\sum_{i=1}^{N} \left[y_i\ln\theta - \theta - \ln y_i!\right]$$

– Again it is simple to calculate the position of the minimum directly:

$$\frac{\partial U}{\partial\theta} = N - \frac{1}{\theta}\sum_{i=1}^{N} y_i$$

– Setting this to zero gives

$$\hat\theta = \frac{1}{N}\sum_{i=1}^{N} y_i$$

I.e., the average of the $y$s.

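Uncertainties in the Poisson/$L$ case.

– With our present simple model it is very easy by propagation of uncertainties to show that

$$\hat\sigma_{\hat\theta}^2 = \frac{\hat\theta}{N}$$

– Following the formal procedure for comparison:

$$F_{1,1} = -\left.\frac{\partial^2 L}{\partial\theta^2}\right|_{\mathrm{bestfit}} = \frac{1}{\hat\theta^2}\sum_{i=1}^{N} y_i = \frac{N}{\hat\theta}$$

– Inverting this gives the same result.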
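The Poisson case can be checked the same way. A minimal sketch, assuming simulated counts for the constant model; `scipy.special.gammaln(y + 1)` is used for $\ln y_i!$, and the second derivative of $L$ is again a finite-difference estimate.

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(4)        # hypothetical seed
theta_true = 3.0                      # made-up expectation value
y = rng.poisson(theta_true, size=400)
N = y.size

def log_likelihood(theta):
    """Poisson log-likelihood L for the constant model m_i = theta."""
    return np.sum(y * np.log(theta) - theta - gammaln(y + 1.0))

theta_hat = y.mean()                  # position of the minimum of U = -L

# Fisher information F_11 = -d2L/dtheta2 at the best fit, by finite differences.
h = 1e-3
d2L = (log_likelihood(theta_hat + h) - 2.0 * log_likelihood(theta_hat)
       + log_likelihood(theta_hat - h)) / h**2
F11 = -d2L

sigma_fisher = np.sqrt(1.0 / F11)     # from E = F^-1
sigma_direct = np.sqrt(theta_hat / N) # from propagation of uncertainties
print(theta_hat, sigma_fisher, sigma_direct)
```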

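1-parameter example continued

3) Poisson data, $U=$ “chi squared”.

– There are two flavours of “chi squared” for Poisson data!

$$U_{\mathrm{Pearson}} = \sum_{i=1}^{N} \frac{(y_i-\theta)^2}{\theta}$$

$$U_{\mathrm{Mighell}} = \sum_{i=1}^{N} \frac{\left[y_i+\min(y_i,1)-\theta\right]^2}{y_i+1}$$

– Note that the following is simply incorrect:

$$U = \sum_{i=1}^{N} \frac{(y_i-\theta)^2}{y_i}$$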

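Don’t use Pearson’s for fitting.

• It is not hard to prove it is biased.

– E.g., keeping our simple model,

$$\frac{\partial U}{\partial\theta} = N - \frac{1}{\theta^2}\sum_{i=1}^{N} y_i^2$$

so that

$$\hat\theta_{\mathrm{Pearson}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} y_i^2} \neq \frac{1}{N}\sum_{i=1}^{N} y_i$$

– In his paper, Mighell calculates the limiting value of $\hat\theta_{\mathrm{Pearson}}$ as $N\to\infty$ and shows it is not $\theta$.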

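The Mighell formula is unbiased.

– For this statistic,

$$\frac{\partial U}{\partial\theta} = 2\left[\theta\sum_{i=1}^{N} \frac{1}{y_i+1} - \sum_{i=1}^{N} \frac{y_i+\min(y_i,1)}{y_i+1}\right]$$

so that

$$\hat\theta_{\mathrm{Mighell}} = \frac{\sum_{i=1}^{N} \left[y_i+\min(y_i,1)\right]/(y_i+1)}{\sum_{i=1}^{N} 1/(y_i+1)}$$

– Some not-too-hairy algebra shows that the limiting value of $\hat\theta_{\mathrm{Mighell}}$ as $N\to\infty$ is equal to $\theta$.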
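The two limiting values can be illustrated with a large simulated sample. This sketch assumes the constant model and a made-up $\theta=2$; since $\langle y^2\rangle=\theta^2+\theta$ for Poisson data, the Pearson estimate should tend to $\sqrt{\theta^2+\theta}$ rather than $\theta$.

```python
import numpy as np

rng = np.random.default_rng(5)                 # hypothetical seed
theta_true = 2.0                               # made-up expectation value
y = rng.poisson(theta_true, size=200_000)

# Sample mean: the maximum-likelihood estimate, shown for reference.
theta_mean = y.mean()

# Pearson: root mean square of the counts.
theta_pearson = np.sqrt(np.mean(y.astype(float) ** 2))

# Mighell: ratio of the two sums in the closed-form minimum.
num = np.sum((y + np.minimum(y, 1)) / (y + 1.0))
den = np.sum(1.0 / (y + 1.0))
theta_mighell = num / den

print("mean:   ", theta_mean)
print("Pearson:", theta_pearson, " expected limit:", np.sqrt(theta_true**2 + theta_true))
print("Mighell:", theta_mighell, " expected limit:", theta_true)
```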

Goodness-of-fit:

1. The Gaussian/$\chi^2$ case has been covered already.

2. The Poisson/$L$ case is a problem, because no general PDF for $L$ is known for this noise distribution.

– If we insist on using this, we have to estimate the SF via a Monte Carlo. Messy, time-consuming.

3. For the Poisson/“chi squared” case, where we have 2 competing formulae, we should:

– Use Mighell to fit;

– Use Mighell for uncertainties;

– But use Pearson (with the best-fit values of $\theta_i$) for goodness-of-fit hypothesis testing.

• Because it has the same PDF (thus also SF) as $\chi^2$.
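For case 2, a Monte Carlo estimate of the SF might look like the following. It is only a sketch: the constant model, the simulated stand-in for real data, and the number of simulations `n_sim` are all assumptions, and the best-fit value is treated as the truth when generating the fake data sets.

```python
import numpy as np
from scipy.special import gammaln

def neg_log_like(y, theta):
    """U = -L for Poisson data y and the constant model m_i = theta."""
    return -np.sum(y * np.log(theta) - theta - gammaln(y + 1.0), axis=-1)

rng = np.random.default_rng(6)                  # hypothetical seed
y_obs = rng.poisson(4.0, size=30)               # stand-in for the real data
theta_hat = y_obs.mean()
U_meas = neg_log_like(y_obs, theta_hat)

# Simulate many data sets from the best-fit model, refit each, record U.
n_sim = 5000
y_sim = rng.poisson(theta_hat, size=(n_sim, y_obs.size))
theta_sim = y_sim.mean(axis=1)
U_sim = neg_log_like(y_sim, theta_sim[:, None])

# Monte Carlo survival function evaluated at the measured U.
p_mc = np.mean(U_sim >= U_meas)
print(f"U_meas = {U_meas:.2f}, P(U >= U_meas) ~ {p_mc:.3f}")
```

The cost grows with `n_sim` times the cost of one fit, which is why this route is messy for models that need numerical minimization.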


Confidence intervals

• There is a hidden assumption behind frequentist model fitting: namely that it is meaningful to talk about $p(\hat\theta_i)$.

[Figure: $p(\hat\theta_i)$ plotted against $\hat\theta_i$.]

• We already have some hints about its shape… and a Monte Carlo seems to offer a way to map it as accurately as we want.

Confidence intervals

[Figure: $p(\hat\theta_i)$ plotted against $\hat\theta_i$, with $\theta_{\mathrm{bestfit},i}$ marked.]

Bayesians think this is nonsense.

• Such a MC is like pretending that $\hat\theta$ is the ‘true’ value, and then generating lots of hypothetical experimental data.

• But all we really know is the single set of data which we measure in the real experiment.

– Plus possibly some ‘prior knowledge’.

• We don’t want $p(\hat\theta)$, we want $p(\theta)$.

• But we’ll continue with the frequentist way for the time being.


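Confidence intervals

• We also assume that $p(\theta_i)$ is approximately Gaussian (which may be entirely unwarranted!!)

$$\frac{1}{\sigma\sqrt{2\pi}}\int_{-\sigma}^{\sigma} dx\,\exp\left(-\frac{x^2}{2\sigma^2}\right) \approx 0.68$$

– We interpret this to mean that there is a 68% chance that the interval

$$\left[\hat\theta-\hat\sigma,\ \hat\theta+\hat\sigma\right]$$

contains the true value $\theta$.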

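Confidence intervals

• Note that this is not the only interval which contains 68% of the probability. We can move the interval up and down the $\theta$ axis as we please. The $-\sigma$ to $+\sigma$ version is just a convention.

• FYI

$$\frac{1}{\sigma\sqrt{2\pi}}\int_{0}^{a} dx\,\exp\left(-\frac{x^2}{2\sigma^2}\right) = \frac{1}{2}\,\mathrm{erf}\left(\frac{a}{\sigma\sqrt{2}}\right)$$

erf() is called the error function.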
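Both points can be checked with `scipy.special.erf` and `scipy.stats.norm`. This is a sketch for a unit Gaussian; the shifted lower limit of $-0.5\sigma$ is an arbitrary choice made for illustration.

```python
import numpy as np
from scipy.special import erf
from scipy.stats import norm

sigma = 1.0

# Probability inside the conventional symmetric interval [-sigma, +sigma]:
# twice the integral from 0 to sigma, i.e. 2 * (1/2) * erf(1/sqrt(2)).
p_symmetric = 2.0 * 0.5 * erf(sigma / (sigma * np.sqrt(2.0)))

# A shifted interval holding the same probability: fix the lower end,
# then solve for the upper end with the inverse CDF.
lower = -0.5 * sigma
upper = norm.ppf(norm.cdf(lower, scale=sigma) + p_symmetric, scale=sigma)
p_shifted = norm.cdf(upper, scale=sigma) - norm.cdf(lower, scale=sigma)

print("symmetric:", p_symmetric, "width", 2.0 * sigma)
print("shifted:  ", p_shifted, "width", upper - lower)
```

The shifted interval contains the same probability but is wider, which is one practical argument for the symmetric convention.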

Confidence intervals

• For more than 1 parameter the q% confidence interval is the (hyper)contour within which the probability of the true value occurring is q%.

• Again, by convention, symmetrical contours are used.


When $m=s+b$ (which is not always appropriate)

• It is of interest to ask (probably before we attempt to fit the parameters of $s$!):

– Is there any signal present at all?

• In frequentist statistics this is again done via hypothesis testing. The hypothesis now is called the null hypothesis (‘null’ from Latin for ‘nothing’):

– “Suppose there is no signal at all.”

– and test what follows from this.


Testing the Null Hypothesis - details

1) Gaussian data, $U=\chi^2$:

– Construct the survival function (SF).

• Degrees of freedom?

– Depends whether we fit the background or not.

– Suppose we have $M_b$ background parameters and $M_s$ signal parameters.

– If the background is fitted, $\nu=N-M_b$.

– If not (in this case we need to know the background from other information), $\nu=N$.

– From the set of measurements $y_i$, calculate

$$U_{\mathrm{meas}} = \sum_{i=1}^{N} \frac{(y_i-b_i)^2}{\sigma_i^2}$$

Note: ONLY include the background!

– From the SF read off the value of probability which corresponds to $U_{\mathrm{meas}}$.

• That is the probability that background alone would generate a value $\geq U_{\mathrm{meas}}$.
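A minimal sketch of this test, assuming a background known from other information (so $\nu=N$), a made-up Gaussian-shaped signal, and a hypothetical cutoff `P_cut`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)                       # hypothetical seed
N = 40
x = np.arange(N)
sigma = np.full(N, 1.0)                              # made-up noise level
b = np.full(N, 5.0)                                  # background, known, not fitted
s = 8.0 * np.exp(-0.5 * ((x - 20.0) / 2.0) ** 2)     # made-up signal
y = b + s + rng.normal(0.0, sigma)

# Only the background enters the statistic.
U_meas = np.sum(((y - b) / sigma) ** 2)
dof = N                                              # background not fitted

# Probability that background alone would generate >= U_meas.
p_null = stats.chi2.sf(U_meas, dof)

P_cut = 0.01                                         # hypothetical cutoff
signal_present = p_null < P_cut
print(f"U_meas = {U_meas:.1f}, P = {p_null:.3g}, signal present: {signal_present}")
```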

Testing the Null Hypothesis – details cont.

2) Poisson data, $U=$ “$\chi^2$”:

– The PDF, and therefore the SF, are not known for the Mighell statistic.

– However the PDF and SF for the Pearson statistic are identical to those of $\chi^2$.

Use the Pearson statistic for Poisson hypothesis testing.

3) Poisson data, $U=-L$:

– PDF and SF not known.

– But one can compare two models via the Cash statistic (Cash W, Ap J 228, 939 (1979)).


The Cash statistic

$$C = 2\left(L_{\mathrm{bestfit}} - L_{\mathrm{null}}\right)$$

• This is only valid providing the null model can be obtained by some combination of signal parameters.

– This implies that one of the signal parameters will be an amplitude (i.e., a scalar multiplying the whole signal function).

– It also ensures that

$$L_{\mathrm{bestfit}} \geq L_{\mathrm{null}} \quad\text{hence}\quad C \geq 0$$

The Cash statistic

• Cash showed that the PDF of $C$ was the same shape as that of $\chi^2$, but with $\nu=M_{\mathrm{fitted}}$.

• Note that this is rather different from the usual $p(\chi^2)$, for which $\nu$ is approximately equal to the number of data values $N$.
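A sketch of the Cash test for the simplest case: a known background plus a signal of fixed shape whose amplitude is the single fitted parameter, so $\nu=1$. The profile, the counts, and the amplitude bounds are made up; setting the amplitude to zero recovers the null model, as the statistic requires.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln
from scipy.stats import chi2

rng = np.random.default_rng(8)                    # hypothetical seed
x = np.arange(50)
b = np.full(x.size, 5.0)                          # known background (counts per bin)
shape = np.exp(-0.5 * ((x - 25.0) / 3.0) ** 2)    # made-up signal profile
y = rng.poisson(b + 30.0 * shape)

def log_like(amp):
    """Poisson log-likelihood L for the model m = b + amp * shape."""
    m = b + amp * shape
    return np.sum(y * np.log(m) - m - gammaln(y + 1.0))

# Fit the one signal parameter by minimizing -L over amp >= 0.
res = minimize_scalar(lambda a: -log_like(a), bounds=(0.0, 200.0), method="bounded")
L_best = -res.fun
L_null = log_like(0.0)                            # amplitude 0 is the null model

C = 2.0 * (L_best - L_null)
p_null = chi2.sf(C, df=1)                         # one fitted parameter
print(f"amp = {res.x:.1f}, C = {C:.1f}, P = {p_null:.3g}")
```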


Incomplete gamma functions - advice

• Recall the survival function for $\chi^2$ is

$$P(U,\nu) = 1 - \frac{\Gamma(\nu/2,\,U/2)}{\Gamma(\nu/2)}$$

– The incomplete gamma function can be calculated via scipy.special.gammainc.

• It is very small values of $P$ that we are interested in however – i.e. where $\Gamma(\nu/2,U/2)/\Gamma(\nu/2)$ becomes close to 1.

• In this regime it is better to use the complementary (meaning 1 minus) incomplete gamma function:

– scipy.special.gammaincc <– note 2 cs.

– But NOTE the definition carefully.
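The loss of precision is easy to demonstrate. In SciPy both `gammainc` and `gammaincc` are regularized, i.e. already divided by $\Gamma(a)$, so no separate division is needed. The values of $\nu$ and $U$ below are made up to represent a very poor fit.

```python
import numpy as np
from scipy.special import gammainc, gammaincc
from scipy.stats import chi2

nu = 10          # degrees of freedom (made-up)
U = 200.0        # a very poor fit, so the SF is tiny

# 1 minus the lower function: the lower function rounds to 1, precision is lost.
sf_naive = 1.0 - gammainc(nu / 2.0, U / 2.0)

# The complementary (upper) function computes the small tail directly.
sf_direct = gammaincc(nu / 2.0, U / 2.0)

# Cross-check against the chi-squared survival function.
sf_scipy = chi2.sf(U, nu)

print("1 - gammainc:", sf_naive)
print("gammaincc:   ", sf_direct)
print("chi2.sf:     ", sf_scipy)
```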


General problems with fitting:

• When some of the $\theta$s are ‘near degenerate’.

– Solution: avoid this.

• When several different models fit equally well (or poorly).

– Solution: F-test (sometimes). Supposedly restricted to the case in which 2 models differ by an additive component.


Degenerate $\theta$s

[Figure: plot of the data.]

Model: two close Gaussians – 2 parameters: the amplitude of each Gaussian.

Valley in $U$ is long and narrow. Many combinations of $\theta_1$ and $\theta_2$ give about as good a fit; parameters strongly correlated.
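The long, narrow valley can be quantified from the Hessian. This sketch assumes two Gaussian profiles of equal, known width and made-up separation, with equal noise on every point; because the model is linear in the two amplitudes, the Hessian of $\chi^2$ is exact and needs no data values.

```python
import numpy as np

x = np.linspace(-5.0, 5.0, 101)
sigma_noise = 0.1                      # made-up, equal for every point
width, sep = 1.0, 0.5                  # made-up profile width and separation

# Model is linear in the two amplitudes: m = theta1 * g1 + theta2 * g2.
g1 = np.exp(-0.5 * ((x + sep / 2) / width) ** 2)
g2 = np.exp(-0.5 * ((x - sep / 2) / width) ** 2)
A = np.vstack([g1, g2]).T

H = 2.0 * A.T @ A / sigma_noise**2     # Hessian of chi squared
E = 2.0 * np.linalg.inv(H)             # covariance matrix, E = 2 H^-1

# Correlation coefficient between the two fitted amplitudes.
corr = E[0, 1] / np.sqrt(E[0, 0] * E[1, 1])

# Ratio of the long to the short semiaxis of the contour ellipse.
eigvals, eigvecs = np.linalg.eigh(H)   # eigenvalues in ascending order
axis_ratio = np.sqrt(eigvals[-1] / eigvals[0])

print("correlation:", corr)
print("semiaxis ratio:", axis_ratio)
```

A correlation close to $-1$ means that raising one amplitude while lowering the other changes $U$ very little, which is the valley described above.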

Overview of the grand plan

Frequentist:

|        | Gaussian ($\chi^2$, $-L$) | Poisson $\chi^2$ | Poisson $-L$ |
|--------|---------------------------|------------------|--------------|
| Fit    | Minimize $\chi^2$ | Minimize $U_{\mathrm{Mighell}}$ | Minimize $-L$ |
| Uncert | $E=2H^{-1}$ | $E=2H^{-1}$ | $E=F^{-1}$ |
| GOF    | $\chi^2$, $\nu=N-M$ | $U_{\mathrm{Pearson}}$, $\nu=N-M$ | No formula (MC) |
| Null H | $\chi^2$, $\nu=N$ | $U_{\mathrm{Pearson}}$, $\nu=N$ | Cash, $\nu=M$ |

Bayesian: TBD…

Flowchart to disentangle the uses of $\chi^2$:

Is there any signal at all? Test the Null Hypothesis:

1. Decide on a cutoff probability $P_{\mathrm{cut}}$.

2. Calculate $\chi^2$ for $\theta=$ bkg values.

3. Compare to the theoretical $\chi^2$ survival function (num deg free = $N$).

4. $P<P_{\mathrm{cut}}$? Yes – there is a signal. No – no signal.

Minimize $\chi^2$ to get best-fit $\theta$.

Is the model an accurate description? Test the hypothesis that it is:

1. Decide on a cutoff probability $P_{\mathrm{cut}}$.

2. Calculate $\chi^2$ for the best-fit $\theta$.

3. Compare to the theoretical $\chi^2$ survival function (num deg free = $N-M$).

4. $P<P_{\mathrm{cut}}$? No – model is good. Yes – model is bad.
