Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

Embed Size (px)

Citation preview

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    1/47

    De gree s of Free dom : A

    Correc tion to Chi S quare

    For P hysical Hypothese s

    by J ohn Micha el Williams

    2010-08-15

    jmm will@comcast .ne t

    Copyright (c) 2010, by J ohn Micha el William s

    All Rights Rese rved

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    2/47

    J . M. Williams Chi Squar e df v. 1.23 1

    Abstract

    In common practice, degrees of freedom (df) may be corr ected for t he n um ber of

    th eoretical free para meter s as though para meter s were th e same as data cat egories.However, a free physical pa ra meter generally is n ot equivalent to a dat a cat egory in

    terms of goodness of the fit.

    Her e we use syn th etic, nonr an dom da ta to show th e effect of choice of

    categorization and dfon goodness of fit. We th en explain th e origin of th e df

    problem an d show how to avoid it in a th ree-step pr ocess:

    First , the t heoretical curve is fit t o the dat a t o remove its

    varian ce, leaving what, u nder th e nu ll hypoth esis, should be

    structureless residuals.

    Second, the r esidua ls ar e fit by a set of ort hogona l polynomials u p

    to th e degree, should it occur , at which significan t va ria nce was

    removed.

    Third, th e nu mber of nonsignifican t polynomial term s in th e

    origina l + ort hogona l set become t he dfin a sta ndard chi square

    test.

    This process reduces a general dfpr oblem to one of polynomial dfand allows

    goodness of a fit t o be deter min ed by dat a cat egorizat ion a nd s ignifican ce level

    alone. An example is given of an evalu at ion of physical dat a on neu tr ino

    oscillation.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    3/47

    J . M. Williams Chi Squar e df v. 1.23 2

    Table of Conten ts

    Abstract ______________________________________________________________________ 1

    Table of Contents ______________________________________________________________2

    I. Introduction ________________________________________________________________3

    Parametric Statistics_______________________________________________________________ 3

    Statistical Tests in Physics __________________________________________________________ 4

    Doing Physics by Category__________________________________________________________ 4

    Categories in NonRandom Chi Square ________________________________________________ 6

    II. Meaning of Chi-Square _____________________________________________________16

    Formalisms______________________________________________________________________ 16

    Categories in Random-Sample Chi Square ___________________________________________ 21

    Meaning of df____________________________________________________________________ 21

    Need for a Correction to df_________________________________________________________ 24

    III. The Parameter Correction __________________________________________________ 26

    Definition of Correction ___________________________________________________________ 26

    Application in Synthetic Examples __________________________________________________ 30

    Application in Neutrino Oscillation Theory___________________________________________ 42

    IV. Conclusion _______________________________________________________________46

    References___________________________________________________________________46

    Acknowledgements ____________________________________________________________46

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    4/47

    J . M. Williams Chi Squar e df v. 1.23 3

    I. Introduction

    Pa r a m et r ic S t a t i st i cs

    Hypoth eses ar e theoretical assu mpt ions. Scientific hypoth eses ma y be tested

    against empirical dat a. It is funda ment al to all science, that no hypoth esis ever can

    be proven by experimental dat a; an hypoth esis only may be r endered more or less

    likely, or disproven.

    The m ost fam ous m odern u se of forma l sta tist ics to test scient ific hypotheses

    dat es back to R. A. Fish er [1], who, in th e early twent ieth centu ry, published a

    meth od for an alysis of th e var iance in experimenta l data . Fisher developed

    sensit ive ways of detectin g differen ces a mong da ta as a fun ction of category of

    tr eatm ent. The treat ment s in his favorite exam ples were on agricultu ra l variables

    such as crop fert ilizat ion, light, water ing, an d so fort h. Fish er's meth ods werebroad in scope--the nonparametric Fisher Exact Testis na med for h im--but his

    an alysis of varian ce was based on th e apparen tly narr ow assumpt ion th at t he data

    were sampled independently an d would be normally distr ibuted with const an t

    variance.

    Defining a r an dom var iableXwith n orm ally distr ibuted probability density

    ( )N X; , 2 by

    ( )( )

    P x e

    x

    X = =

    1

    2 22

    2

    2

    , (1)

    it easily ma y be seen th at th e two par am eters of such a distribution are th e mean

    and t he variance 2 . The mean (expected value) of dat a ma y be ta ken to define th e

    effect of an experimenta l treat ment . A sam ple ta ken of dat a depending on a

    mixtu re of differen t effects will have a bigger varia nce tha n a sa mple dependin g on

    one; so, the var ian ce ma y be an alyzed to show differen ces in th e un derlying mea ns.

    If we use m ean values to represent cat egories of data, t hen hypoth esis testing by

    an alysis of var ian ce ma y be used to test differen ces am ong the cat egories. Fish er's

    app roach to ana lysis of var ian ce cam e to be kn own a s param etric statis tics .

    Fisher's a na lysis of var iance rat iona le has been su pplemented by Bayesianmet hods [2, 4]. Pr oblems not fitt ing Fish er's par adigm ha ve extended th e scope of

    hypothesis-testing statistics into the nonparametric realm [3, Ch. 7; 6, Sect. 9.2 &

    Ch. 12]. Nonpara metr ic sta tistics att empt to resolve questions for dat a in very

    sma ll samples, for mismat ched var iances, an d for experiment al r esults sometimes

    more qualitative than numerical.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    5/47

    J . M. Williams Chi Squar e df v. 1.23 4

    Sta t i s t i ca l Tes t s in Physics

    A problem with hypoth esis testing in physics is th at mu ch of th e subject m at ter is

    mult iply par am eterized. In physics, a good hypoth esis ra rely sta tes anyth ing aboutcat egories; instea d, it provides an explan at ion of a set of one or more ph enomena

    which may var y continu ously. The dat a typically involve broad ra nges of ener gy or

    moment um , or avera ges of ma ny car efully winnowed events . Quest ions ar ise of, Is

    th e spin 1/2? Does th e field var y as 1/r? The conn ection of an y physical hypothesis

    with data typically is through multidimensional continua of complexity.

    Rath er t ha n an alyzing experimenta l variat ions t o discover differences in m eans

    on na rr ow doma ins of one or m ore cont inuu a, th e preferr ed way to demonst ra te

    supp ort for a physical hypoth esis is to show tha t it m ay be used u nequ ivocally to

    cont rol some ph ysical pr ocess in su ch a way a s no oth er kn own h ypoth esis might

    explain.

    For exam ple, a demonst ra tion of th e Ha ll effect would involve an unequivocal

    excess of cha rge along one side of a flat condu ctin g str ip. To achieve this

    confirmation of the predicted effect, superior instrumentation or measurement of

    previously ignored qua nt ities would be adopted. The gra dient of th e cha rge, th e

    effect's time-course of development, and other factors would be examined for

    consist ency with th e hypoth esis. Rar ely would physical meas ur emen t be on such a

    na rr ow ran ge of values tha t t he var iance would be const an t, independent of th e

    mea n. Also, if not all evidence su pported th e hypothes is, alt ern at ives would be

    explored inst ead. Merely showing th at one side of a condu ctin g str ip seemed more

    likely to have a char ge excess th an th e other, a test between two cat egories on one

    variable, would be a n un sat isfying r esult.

    Doing Ph ysics by Ca t egor y

    Let's consider t he special case of curve-fitt ing hypoth eses. The problem is to

    distinguish among various hypoth eses of th e shape of th e cur ves. We might sta rt

    by ar ra nging th e data a long one or more cont inua . However, on a cont inuu m, two

    data (probably) never coincide; this means we have slim bases for estimating the

    mean , var iance, or a ny oth er par am eter of th e assum ed underlying random

    variable.

    So, to use par am etr ic st at istics, we define cat egory boun dar ies (bins). By ma kingth e bins wide enough, t he n umber of dat a a vailable in each will be large enough t o

    provide a good estimat e of th e bin mean . We ma y assu me th e var iance in a bin to

    be const an t, but only if th e bins a re na rr ow enough. Sometimes riskily, we may

    assu me th e variance over all bins t o be the sa me, const an t value.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    6/47

    J . M. Williams Chi Squar e df v. 1.23 5

    A simple example will illust ra te th is approach. Ima gine th at t wo different

    physical theories ar e to be test ed against a certa in data set. Perh aps, the th eories

    might r elate t o field int ensity Ias a fun ction of dista ncex.

    Hypothesis I: The dat a follow a quadr at ic fun ction. This would st ren gth en beliefin Theory I.

    Hypothesis II: The data follow a cubic function. This would support Theory II.

    Here a re th e two hypoth etical predictions on a da ta set which h as been generat ed

    purely artificially, with no randomization:

    Figure 1. Simulated data f i t by eye . Deviations from the data see m

    greater for the cubic than for the quadrat ic . The curves represent

    ( )I x= 7 1 22

    for the quadratic, and ( )I x= 25 1 23

    for the cu bic.

    Clear ly, visua l inspection tells us th at t he quadr at ic fit is better. So, we

    imm ediat ely should drop Theory II in favor of Theory I. Or, should we? How can wetest how well Theory I ha s been confirm ed? Is it possible Theory II might be more

    elegant to derive, simpler, or oth erwise more a tt ra ctive tha n Theory I, even th ough

    th is data set was not fit so well?

    We shall discuss t he special case in which we wish t o test Theory I again st th e

    dat a, leaving out Th eory II entirely. This kind of sta tistical test, which mea sur es

    th e fit r at her th an rejecting a competing fit, is called a goodness of fittest.

    Distance x (arbitra ry units)

    Field

    Intensity

    I

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    1.2

    0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

    Quadratic Fit

    Cubic Fit

    Data

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    7/47

    J . M. Williams Chi Squar e df v. 1.23 6

    The m ost popular goodness of fit t est is by ch i sq u a r e , a sta tistic first derived by

    Helmert [2, Sect. 3.4.3] and later independently derived and popularized by Karl

    Pear son; it is na med by Pearson after th e Greek letter chi (). Chi squa re simply

    is th e distribut ion of the su m of squar es of a set of stan dar dized, norma lly

    distributed data .

    Categor ies in NonRand om Chi Squa re

    Before pr esent ing form alities a bout chi squ ar e, we first develop its a pplicat ion t o

    th e cur ve-fit exam ple in F igure 1 above. We ignore th e cubic cur ve for a wh ile.

    One Cat egory

    Suppose we comput ed th e mean values of the dat a a nd of th e quadr at ic cur ve on

    th e interval of interest , 01 0 9. . x . We would get, usin g all 14 ava ilable dat a

    points sh own in F igur e 1, a sta tistically invalid ana lysis but a n inst ru ctive exam ple

    as sh own in Ta ble 1.

    Table 1. Comparison of data vs quadratic means on one

    category (1 degre e of freedo m)

    Distance x Data

    of I

    Quadratic

    0.151 0.73

    0.201 0.59

    0.251 0.44

    0.301 0.300.351 0.18

    0.401 0.08

    0.451 0.02

    0.501 0.00

    0.551 0.02

    0.601 0.08

    0.651 0.18

    0.701 0.31

    0.751 0.45

    0.801 0.60

    mean = 0.2855 I = 0.3733

    sa mple sd ofI= 0.2349 I = 0.3339

    sa mple sd of m ea n 0.0628 I

    = 0.0892

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    8/47

    J . M. Williams Chi Squar e df v. 1.23 7

    The mea n of th e qua dra tic was compu ted a s follows, using th e expected valu e

    operator,E,

    ( )( ) ( )

    I E Id x I x

    dx

    dx x

    x= =

    =

    0 1

    0 9

    0 1

    0 9

    2

    0 1

    0 9

    0 1

    0 9

    7 1 2

    03733.

    .

    .

    .

    .

    .

    .

    . . . (2)

    The operator ( )E means t he same a s the familiar ;E is used her e for symm etry

    with th e var iance operat or, ( )V .

    The standard deviation of the quadratic was computed as follows,

    ( ) ( )( ) ( ) ( )[ ]( ) ( )

    I V I E I E I E I E I sqrt d x I x

    dx

    d x I x

    dx

    = = = =

    2 2 2

    2

    0 1

    0 9

    0 1

    0 9

    0 1

    0 9

    0 1

    0 9

    2

    .

    .

    .

    .

    .

    .

    .

    .

    ( )( )[ ]I sqrt

    dx x

    dx

    =

    7 1 2

    0 373 0 3339

    2 2

    0 1

    0 9

    0 1

    0 9

    2.

    .

    .

    . . . . (3)

    The sta nda rd deviations of th e mean s were comput ed by dividing the sta nda rd

    deviations of the variables (data or quadratic) by the square root of the sample size,

    N = 14 3742. .

    Now, inst ead of just noticing in Table 1 tha t t he mea ns a re about 0.09 apa rt , and

    th at t he sta nda rd deviat ions of th e means a re about th e same size, suggesting no

    significan t difference between t he t heory a nd th e dat a, let's look a t th e chi squa re

    statistic. A unit-normal distribution is defined as one with m ean 0 an d varia nce 1,

    we describe such a distribution as ( )N 0 1, , a notat ion which may be relat ed to

    equa tion (1) in a n obvious way.We follow Brownlee [3, Sect. 1.27], an d other s in defining chi squ ar e as a sum of

    squar ed un it-normal deviat es: Given a nu mber NC of data cat egories,

    ( )Data ii

    N

    NC

    2 2

    1

    0 1==

    , , (4)

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    9/47

    J . M. Williams Chi Squar e df v. 1.23 8

    with comput ed or ta bulated cum ulat ive probability P at significan ce level xp ,

    ( )[ ]P df x pp2 , represented as ( )p df2 .

    In comm on usa ge, which we shall quest ion lat er, th e degrees of freedom var iable

    dfis the n umber of sta tistically independent cat egories of th e data , and t he un it-normal deviates are differences between theoretical (or otherwise anticipated)

    values an d data values, one for each cat egory. Becau se the differen ces are defined

    sta nda rdized, their mea ns a ll will be expected t o be 0 under t he nu ll hypoth esis tha t

    the fit is good.

    Usua lly, the var iance in ea ch dfcat egory would be th e varian ce un der t he nu ll

    hypoth esis; in our int roductory exam ples, th is is the var iance under th e assu mpt ion

    th at th e data were distr ibuted ran domly in each cat egory as determin ed by th e

    cur ve being fit. In t his intr oduction, we sha ll be looking at t he cur ve fit to dat a

    generat ed under a th eory an d exam ining the adequa cy of th e representa tion of th at

    th eory by th e fitt ed cur ve. So, we are looking at two differen t t heories, in a sen se,one a simplificat ion of th e other . In our int roductory examples, ther e is no ra ndom

    component other th an rounding error of th e nu merical values.

    In common u sage, we would estima te t he var iance in each cat egory u sing th e

    dat a, th us losing a degree of freedom [4, Sect. 10.11]. In t hese int roductory

    examples, we assume the appar atu s retur ns good data , and th at we are measuring

    th e lawful realization of a physical principle. We assu me we know the mean s and

    th e cha nge in th em within a cat egory; so, we need not use da ta to estimat e

    an yth ing; we mer ely ar e mea sur ing th e fit of a cur ve. So, in the first few examples

    following, we sha ll use th e varia nce of th e qua dra tic cur ve to sta nda rdize in each

    cat egory, rath er th an t he varian ce of th e equally th eoretically corr ect dat a. Theactua l difference in t he var iances is not lar ge anyway, as ma y be seen in th e

    tabulations below.

    So, to evalua te chi squa re, we ma y compu te t he following sum :

    ( )

    ( )

    ( )[ ]( )

    Datai i

    ii

    N

    i

    ii

    NX E X

    V X

    X E X

    V X

    C C

    2

    2

    1

    2

    1

    =

    =

    = =

    , (5)

    in which, in each category, the t heoret ically expected va lue ( )E X is subtracted from

    th e value of the observed datu m X. In genera l, each i-th cat egory r epresents a

    mea n of some observat ions, from which observed mea n will be subt ra cted t he i-th

    expected mean . Under t he nu ll hypoth esis th at t he fit is good, the difference in the

    nu mer at or of th e sum in (5) will, of cour se, be zero. Dividing each differen ce by the

    expected standard deviation ofX sta nda rdizes the var iance of each of th e i elements

    in the sum.

    We notice here, immediat ely, th at our chi squa re ignores t he t heoretically

    expected shape of th e cur ve: It just sums squar ed differences. This mean s tha t chi

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    10/47

    J . M. Williams Chi Squar e df v. 1.23 9

    squar e is not tr uly a para metr ic sta tistic. Ea ch cat egory is sta nda rdized

    individually by its own expected va ria nce; th erefore, chi squar e does not r equire a

    const an t varian ce over all cat egories tested. This is a great advant age in physics.

    The sha pe of the t heoretical curve enter s only locally, at th e gran ular ity of the

    individua l cat egories, when a differen ce ( X Ei i ) is taken an d a variance estima tedperha ps on th e assum ption t ha t t he varian ce will be const an t.

    Anyway, let's decide to reject t he fit at th e p = .001 level. Keep in min d th at we

    do not have a valid ran dom sam ple an d tha t t his is mean t as a th ought -provoking

    int roduction to a problem of sta tist ical inference in physics. The ta bles of chi

    square tell us tha t ( ).0012 1 11 . We tr ivially comput e chi squar e from the Table 1

    means t his way:

    ( ) ( )( )

    DataX E

    V2 1 1

    2

    1

    2

    2

    2855 3733

    089210=

    =

    . .

    .. ; (6)

    an d, confirmin g our visual impr ession, th e resu lt is not significan t, so th e fit ma y be

    considered good to th e extent t he overall averages of the dat a a nd qu adr at ic

    hypothesis in Table 1 are not shown differen t. And we ar e confident th e fit would

    have been shown good with likelihood of error p = .001, ha d th e stat istical

    requirements been fulfilled.

    We should ment ion h ere th at th e difference in m eans in Table 1 also might h ave

    been tested by oth er sta tistics, such a s Stu dent's t. Above, we have accepted t he

    mean an d varia nce of the qua dra tic fun ction a s th eoretically errorless, and ha ve

    accepted the quadratic's computed ( )N . ,.3733 0892 distribution a s th at of the data ,

    un der the nu ll hypoth esis. In cont exts oth er tha n th e present int roduction, this

    would be qu ite wrong sta tistically, becau se, visibly, the mean an d t he varian ce

    cha nge systemat ically over t he domain of interest (sta nda rd deviation is a measu re

    of slope)--an d we ha ve a qu adr at ic cur ve, not a ra ndom norma l variat e.

    As a check, ignoring t hese pr oblems, at p = .001 (two-ta iled), a norm ally

    distributed da ta mean would ha ve to lie about 3.3 stan dar d deviat ions of th e mean

    awa y from th e qua dra tic mea n to reject th e goodness of th e fit. This would be about

    0.30 distance units, clearly about 3 x greater than the 0.09 difference in Table 1.

    Usin g the n orm al significan ce value in a second check, if we subst itu te 0.30 in th e

    chi squ ar e nu mer at or of (6) above, we get 0 30 0 09 1112 2

    . . . , closely mat chin g th e chisqua re significan ce level of 11. This is a t au tology, confirm ing our ar ith met ic,

    because by definition ( ) ( )[ ]. . ,0012 0012

    1 0 1 N .

    Two Ca tegories

    Now, let us double the nu mber of cat egories an d th e sam ple size, too. We

    increase t he density of the dat a point s but keep the sa me qua dra tic fitting function:

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    11/47

    J . M. Williams Chi Squar e df v. 1.23 10

    Distance x

    Field

    Intensity

    I

    -0.1

    0.1

    0.3

    0.5

    0.7

    0.9

    1.1

    0.2 0.3 0.4 0.5 0.6 0.7 0.8

    x1x2

    Figure 2. Two-category fi t by eye . More of the same arti f icial data as

    in Fig. 1 , for the sam e qu adratic f i t ( ) ( )I a x b x= = 22

    7 1 2 .

    This time, we part ition t he domain of an alysis x int o two regions, or bins, called

    x1 and x2 , as shown in Figure 2. The horizont al lines show th e different m eans ofI

    in the two bins. We ta bulate th e data shown, an d recalculat e the para meter s of th e

    qua dra tic separ at ely for th e two ha lves, as was done above. We cha nge the problem

    slightly here, by integrat ing th is time on t he sm aller, more pr ecise doma in of

    ( )x . ,.15 85 instead of ( )x . ,.10 90 , mak ing the domain of evaluat ion of th e quadr at ic

    ma tch the doma in of th e sampled dat a more closely. Ea ch ha lf now ha s 14 dat a

    points. The result is in Table 2.

    Looking at Figure 2, with two categories, the obviously-changing data at least are

    more or less monotonic in ea ch bin . As confirmed in Ta ble 2, th e two bins, dividing

    th e data exactly in h alf at a point of symmetr y, mak e th e two quadr at ic cat egories

    almost ident ical. The quadra tic sta tistics all ar e lower in Table 2 th an in Table 1

    becau se of th e sma ll reduction in t he domain of int egration, which elimina ted t he

    largest values ofdI dx . The sam ple sta nda rd deviat ions and sta nda rd deviat ions of

    the sample means in Table 2 agree reasonably closely with those in Table 1, because

    we ha ve essent ially th e sam e functiona l cha nge in each "data " bin in both cases,

    ma king slopes all about t he same. However, the quadra tic cur ve-fit stan dar d

    deviat ions reflect th e nar rower doma in of integrat ion. If we calculat ed a stan dar d

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    12/47

    J . M. Williams Chi Squar e df v. 1.23 11

    deviation of th e gran d mea n in Table 2, using both bins, we would find it 2

    sma ller th an in Table 1, becau se of th e doubling of th e dat a (sam ple size).

    Table 2. Comparison of data vs quadratic mean s on two

    categories (2 degrees of free dom)

    x1: Low Half x2: High Half

    x Data Qu ad x Data Qu ad

    1 0.151 0.7338 0.501 0.0000

    2 0.175 0.6677 0.525 0.0053

    3 0.201 0.5926 0.551 0.0220

    4 0.225 0.5216 0.575 0.0471

    5 0.251 0.4445 0.601 0.0844

    6 0.275 0.3747 0.625 0.1273

    7 0.301 0.3023 0.651 0.18228 0.325 0.2396 0.675 0.2396

    9 0.351 0.1777 0.701 0.3077

    10 0.375 0.1273 0.725 0.3747

    11 0.401 0.0811 0.751 0.4504

    12 0.425 0.0471 0.775 0.5216

    13 0.451 0.0203 0.801 0.5985

    14 0.475 0.0053 0.825 0.6677

    mean (n=14) 0.3097 I = 0.2858 mean 0.2592 I = 0.2867

    sam ple sd ofI 0.2494 I = 0.2557 sd I 0.2304 I = 0.2556

    sa m ple sd of m ea n 0.0667 I

    = 0.0683 sd m 0.0616 I

    = 0.0683

    As before, ima gining th e two cat egories t o be sta tist ically indepen dent , we ma y

    look for a rejection of goodness of fit in Figure 2 at the p = .001 level at ( ). .0012 2 138 .

    We ma y compu te t he va lue of chi squ ar e from Ta ble 2 as follows:

    ( ) ( ) ( )( )

    ( )

    ( )Data

    X E

    V

    X E

    V2 1 1

    2

    1

    2 2

    2

    2

    2

    2

    2

    2

    3097 2858

    0683

    2592 2867

    06830 28=

    +

    =

    +

    . .

    .

    . .

    .. ; (7)

    Again, we get nowhere nea r a value a llowing us to reject goodness of fit; in fact,

    we are far th er away, becau se the slight adjustm ent of th e doma in away from th e

    largest values ofI (above) ha s r emoved th e worst deviations from a per fect fit.

    Two Cat egories with More Da ta

    Now, supp ose we complete our experimen ta l work an d ha ve a set of some 1,000

    dat a a t equa l 0.001 int ervals on t he doma in ( )x 0 1, . As in Figur e 2 an d (7) above,

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    13/47

    J . M. Williams Chi Squar e df v. 1.23 12

    we ha ve decided to keep just ( )x . ,.15 85 . This mean s we will keep just 700 usefuldata .

    With out ta bulating da ta , the r esult of a chi squar e test on two cat egories with 350

    dat a ea ch is shown in Table 2a:

    Table 2a. Comparison of num erous data vs quadratic

    means on tw o categories (2 degrees of freedom)

    x1: Low Half x2: Hig h Half

    Data Qu ad Data Qu ad

    mean (n=350) 0.2831 I = 0.2858 mean 0.28520 I = 0.2867

    sam ple sd ofI 0.2321 I = 0.2557 sd I 0.23290 I = 0.2556

    sample sd of mean 0.01241 I

    = 0.01367 sd m 0.01245 I

    = 0.01367

    Of cour se, the qua dra tic means an d var iances have not chan ged; however, we now

    must sta nda rdize th e cat egory means by the much lar ger sample size. The result is,

    ( ) ( ) ( )( )

    ( )

    ( )Data

    X E

    V

    X E

    V2 1 1

    2

    1

    2 2

    2

    2

    2

    2

    2

    2

    2831 2858

    01367

    2852 2867

    013670 05=

    +

    =

    +

    . .

    .

    . .

    .. ; (7a)

    an d th is is even fart her from the significan ce level of 13.8. We th us might conclude

    from chi squa re on two cat egories t ha t t he fit of th e qua dra tic was good.

    Ten Ca tegories

    As a bove, we ha ve 700 usa ble, equa lly-spaced dat a on ( )x . ,.15 85 . Let us test the

    fit of th e quad ra tic fun ction aga in, th is time dividing our x doma in int o 10

    cat egories. Using "box plot" repr esent at ion, th e result , aga in a fit by eye, is in

    Figur e 3. Ea ch of th e 10 bins ha s been processed as we did in th e preceding

    an alyses. Again, we ha ve such pr ecision of measu remen t of th e dat a in t his

    ar tificial exam ple, th at th ere is no legitima te r an domn ess.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    14/47

    J . M. Williams Chi Squar e df v. 1.23 13

    Distance x

    Field

    Intensity

    I

    0.0

    0.2

    0.4

    0.6

    0.8

    0.15 0.45 0.85

    Mean sd of I

    Mean sd of Imean

    Figure 3. Ten-catego ry f i t by eye. Sum marized set of 700 of the same

    arti ficial data as in Figs . 1 and 2, for approximately the s ame quadratic

    fit, ( )I x= 7 1 22

    .

    Notice in Figure 3 t ha t t he var iance in ea ch bin clear ly increases with t he local

    slope of th e cur ve. Ea ch mea n repr esents 70 data ; th e equa l-spacing of th e data

    project on the local slopes to determine the standard deviations shown.

    Also notice th at th e large num ber of 70 data in each bin, tr eated a s th ough

    sam pled ran domly and independently, yield such a sma ll stan dar d deviat ion of th e

    bin mean , tha t for only one of th e dat a, th e second from th e left, does th e quadr at ic

    actua lly fall with in a sta nda rd deviation of th e mean .

    To prepa re our chi squa re t est of goodness of fit of th e qua dra tic, we sum ma rize in

    Table 3 the sta tistics plott ed in Figure 3:

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    15/47

    J . M. Williams Chi Squar e df v. 1.23 14

    Table 3. Comparison of data vs quadratic means on ten

    catego ries (10 degrees of freed om)

    Bin #(70 data)

    Binmean

    Bin sd sd o f Bin

    mean

    Quadmean

    Quad sd sd of Quad

    mean

    2

    e lement

    1 0.63680 0.05840 0.006976 0.695150 0.087730 0.010486 30.97

    2 0.43190 0.05945 0.007106 0.421240 0.068241 0.008156 1.71

    3 0.24020 0.05077 0.006069 0.215930 0.048726 0.005824 17.37

    4 0.09301 0.03404 0.004068 0.079220 0.029227 0.003493 15.58

    5 0.01348 0.01220 0.001459 0.011110 0.009933 0.001187 3.99

    6 0.01407 0.01253 0.001498 0.011599 0.010206 0.001220 4.10

    7 0.09468 0.03432 0.004102 0.080689 0.029524 0.003529 15.728 0.24270 0.05096 0.006091 0.218379 0.049003 0.005857 17.24

    9 0.43480 0.05901 0.007113 0.424669 0.068529 0.008191 1.53

    10 0.63970 0.05828 0.006966 0.699560 0.088010 0.010519 32.38

    Value of ( )Data2 10 141

    But, ( ).0012 10 is about 29.5; so, now, with a lar ger sam ple and df= 10, we easily

    reject th e nu ll hypoth esis and conclude t ha t t he fit by th e quadr at ic is not good to

    explain t he dat a.

    Fifty Categories

    Before leaving th is intr oduction, we look at our nonr an dom dat a once more to see

    wha t would ha ppen if we tried a chi squa re goodness of fit t est a s above, but with

    th e dat a subdivided into more nu merous categories, say, 50 of them.

    A priori , if we had some ra ndomn ess, we might hope tha t 50 categories would

    ma ke for fewer dat a per bin an d th us lar ger varian ces of th e mean in each bin; so,

    with higher dfat th e significan ce point , th e cha nce for a good fit m ight be im proved

    over what it was with 10.

    The r esult is plott ed in F igur e 4 a nd dr am at ically shows th e effect of systema tic

    versus ran dom variat ion. With 50 cat egories and 700 data, the real shape of th edat a cur ve becomes evident ; and, quit e cont ra ry to our conclusion ba sed on Figur e 1

    or 2 a bove, an d ma ybe to our first impression of Figure 3, we see tha t t he qua dra tic

    isn't even close to a good fit to the da ta . Perh aps a slightly bett er qua dra tic might

    ha ve been chosen by eye, but th e sha pe of th e dat a cur ve obviously prevent s chi

    squa re ever from acceptin g th e qua dra tic on 50 degrees of freedom.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    16/47

    J . M. Williams Chi Squar e df v. 1.23 15

    Data: I = 1 - cos(x-.5) + cos( 3(x-.5) ) - cos( 5(x-.5) )

    Distance x

    Field

    Intensity

    I

    0.0

    0.2

    0.4

    0.6

    0.8

    0.15 0.50 0.85

    Mean sd ofI

    Mean sd ofImean

    Figure 4. Fifty-catego ry f i t by eye. Sum marized set of 700 arti ficial

    data for the qua dratic f i t , ( )I x=

    7 1 2

    2

    . The funct ion used to generatethe data throughout the Introduct ion i s g iven in the graph t i t le .

    Tabulation of the Figure 4 statistics has been omitted; however, the significance

    point is ( ).0012 50 87 ; the obtained value is comput ed at Data

    2 304= ; so, again , th e chi

    squar e test r eassur es us th at we should reject t he fit as n ot good.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    17/47

    J . M. Williams Chi Squar e df v. 1.23 16

    II. Me an in g of Chi-Squ are

    At t his point, we leave th e intr oductory examples for some form alism a bout th e

    sta tist ics. Gener alizat ion of th ese forma lisms to th e mu ltidimen siona l case [4, 5] isnontrivial but rea sona bly str aightforwar d.

    For m a li sm s

    Relation to Gam ma Distribut ion

    The factorial of an int eger n , defined a s ( ) ( )Fact ! ...n n n n nii

    n

    = = =

    1 11

    , appears in

    combinat orial an alysis everywhere. The factorial ma y be expressed as a special

    case of the cont inu ous gamm a fun ction . The gamma fun ction is discussed in th e

    cont ext of Bayesian inferen ce in [6, Sect. 7.3 ff.]; the der ivat ions below follow th oseof Feller [7 vol II, Ch. 2]. It isn't mu ch of an overst at ement t o say tha t t he gamm a

    fun ction is as funda ment al t o stat istical inference as th e exponent ial fun ction is to

    th e solut ions of differen tia l equat ions.

    The gamm a function is defined as,

    gamma(x) ( ) =

    x d ex 1

    0

    . (8)

    Integrating (8) formally by parts,

    ( ) ( ) xx

    d ex

    xx= = +

    1 1

    10

    ; th erefore,

    ( ) ( ) x x x+ =1 ; an d so, (9)

    ( ) ( ) ( ) ( ) x x x x= 1 2 2 , (10)

    which clearly shows t he r ecur sive relat ion ma king ( ) ( ) x x= 1 !, for th e special case

    ofx an integer.

    Using th e gamma fun ction in (8), we ma y define t he gam m a probability density

    of random variable Xon ( )0, by,

    ( ) ( )( )

    P X x x e x= = = gamma ,

    1

    1

    . (11)

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    18/47

    J . M. Williams Chi Squar e df v. 1.23 17

    Looking at (11), we notice that the familiar exponential density then may be

    defined a s ( )gamma , 1 ; and s o, by ana logy, in (11) we will ha ve th e mea n ( )E X =

    and t he variance ( )V X = 2 .

    In a sma ll digression, recall th e combin at oria l express ion for t he n um ber of ways(un order ed) of ta king a sam ple ofj objects from a discret e populat ion ofm of th em:

    ( )( )( )

    ( )( ) ( )

    ( ) ( )m

    j

    m

    j m j

    j m j

    j m j

    j n

    j n

    j n

    j n

    =

    =

    +

    =

    +=

    + +

    + +

    !

    ! !

    !

    ! !

    !

    ! !

    1

    1 1. (12)

    The discrete case ma y be genera lized to the beta integral, in t erms of th e gam ma

    fun ction, sta rt ing this way:

    ( )( )

    ( ) ( )

    B ,

    =+

    1

    . (13)

    Oth er u ses of th e gamm a fun ction ar e discuss ed in [4, Ch. 4] an d [7 vol II, Ch. 2].

    Retur ning to th e subject, we may r edefine th e un it norm al density as in (1) above,

    in th e following way: Fr om (8) an d (11), settin g to 1 2 and to 1 2 yields,

    ( )gamma x ed e

    xe

    xx1

    2

    1

    2

    1

    1

    2

    1 21 1

    2

    112

    1

    2

    1

    21

    2

    0

    2,

    =

    =

    . (14)

    The definite integra l ( )1 2 evaluates t o ; so,

    ( )

    gammax

    e

    x1

    2

    1

    2

    1 1

    2 12

    1

    2

    0

    1

    2

    2

    ,

    =

    . (15)

    In th e brackets in (15), we have x in a unit norma l distribution. Becau se

    ( )1

    0 1x

    N X; , is the density of th e squar e of th e unit norma l ran dom var iableX [7

    vol II, p.47], we ma y say t ha t

    ( )[ ]gamma N 1

    2

    1

    20 1

    2

    , ,

    . (16)

    So, (16) mea ns t ha t ( )gamma ,1 2 1 2 represent s th e probability density of th e squar eof a ra ndom variable Xdistributed a s ( )N 0 1, . It is easy to see th e genera lizat ion of

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    19/47

    J . M. Williams Chi Squar e df v. 1.23 18

    (16) to a sum ofn variat es, each distr ibuted as ( )N 0 1, : Therefore, reca lling (4)above, we ma y write

    ( )gamman

    n1

    2 2

    2,

    ; an d, (17)

    th is is just th e definition of th e distr ibut ion of chi squar e.

    Thus, we have used the gamma density to show th at a sum ofn squared unit

    normal varia tes will be distributed a s ( )2 n . By an alogy to (11) above, th e mean of

    ( )2 n will be n , and th e varian ce will be 2n .

    Moment Generat ing Fun ction

    Not ma ny rea ders would see th e relat ion of (15) to (16) above. An ea sier way to

    show the r elation bet ween th e normal a nd chi squar e probability densities is bycompar ison of th eir moment genera tin g fun ctions (mgfs).

    Th e m om ent generating fun ction of a r an dom varia ble is a s eries expan sion of its

    densit y which, t erm by term , yields as coefficient s th e n -th moments of th e ra ndom

    variable about t he origin. The first moment is the mean , the second t he variance,

    an d so fort h. What is import an t for us here, is tha t two ra ndom var iables, un der

    very genera l conditions, ha ve the sa me density if and only if all th eir moments ar e

    identical [6, 6.7.8; cf. 6.3.3]. And, if th ey both h ave the sam e mgf, th eir moment s all

    will be identical.

    The mgf of a r an dom variable X is defined [6, Sect . 7.1] for cont inu ous

    distributions on ( ) , as

    ( ) ( ) ( )mgf ; f t X E e dx e X xtX tx= = =

    , (18)

    in wh ich ( )f X x= is th e probability density function ofX. The momen ts will be

    obta ined by a power series expan sion on t; the int erval of series convergen ce does

    not concern us here.

    The rest is very simple: Fr om (14),

    ( )( )f X x

    xe

    x

    2 1

    21

    2

    1

    2

    1

    2

    1=

    =

    gamma , ; so, from (18),

    ( )( ) ( )mgf ;t dx e xx t

    2 1 21

    211

    2=

    . (19)

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    20/47

    J . M. Williams Chi Squar e df v. 1.23 19

    Cha nging var iables in (19) with y x= 1 2 an d fixing t he lower limit of int egrat ion

    immedia tely yields, for chi squa re,

    ( )

    ( )

    ( )

    mgf ;t dy et

    y

    2

    1 2

    2

    0

    11

    2

    2

    =

    . (20)

    On th e oth er ha nd, the un it normal density (1) ma y be writt en,

    ( ) ( )f ,N

    x

    x e0 1

    21

    2

    2

    X = =

    . (21)

    So, th e mgf for X2 will be,

    ( )( )( )

    mgf ; ,t N dx e dx etx xt

    x2 2

    0

    1 2

    2

    0

    0 11

    2

    1

    2

    2 22

    = =

    , (22)

    which clear ly is th e sam e as for t he chi squa re in (20).

    While on m gfs, it sh ould be mentioned tha t t he mgf also is perhaps t he easiest

    way to derive the var ian ce of th e Poisson dist ribu tion [7 vol I, Sect.6.5 - 6.7] . The

    Poisson distr ibut ion, wh ich gives t he (discret e) probability ofk events each with

    small probability , has probability function [6, 6.3.7],

    ( )P k ek

    k

    X = =

    ;!

    , (23)

    an d may be shown to have mean = var iance = . For n repeat ed Poisson t rials, the

    mean will be ( )nE X n= , and th e varian ce will be ( )nV X n= . For n repeated

    tr ials, the sta nda rd deviation of th e Poisson m ean will be ( )nV X n = n n

    = ; so, by the centr al limit t heorem, we would expect a Poisson dist ribu tion sum

    ofn repeated trials with mean to approach a norma l distr ibution with m ean n

    an d stan dar d deviation of th e mean n (= stan dar d error).

    Sufficiency

    A sample of a ra ndom variable may be seen a s a measu re of th e inform at ionknown about t he un derlying physical process. The sample size and th e var iance of

    th e measu remen t determine the inform at ion in th e sam ple: Var iance adds unu sable

    inform at ion (possible inter pret at ions of th e physical pr ocess) and so redu ces th e

    precision of th e sam ple in represen tin g th e physics. Sam ple size decrea ses the

    var ian ce of th e mean of th e sam ple and so increases t he pr ecision of th e sam ple in

    repr esent ing the mean of th e physics. Becau se physical processes are of little value

    un less repeat able, an d becau se the mea n ma y be used as an expecta tion of fut ur e

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    21/47

    J . M. Williams Chi Squar e df v. 1.23 20

    repetitions, size of the sample and variance of the measure work inversely in

    determining th e inform at ion in a r an dom sam ple.

    Given a variety of sta tistics describing a ra ndom sam ple, a sufficientstatistic [6,

    Sect. 10.4] is one wh ich describes t he da ta with no loss of inform at ion, so th at

    inform at ion beyond th at of the su fficient sta tist ic does n ot im prove prediction of

    fut ur e samples. Given th e same ph ysical process, a set of measu rement s of sma ll

    var ian ce genera lly will be sufficientto predict a set with greater var iance: The

    latt er set m ay be viewed as th e form er set plus some extr a r an dom variat ion.

    Gener ally, sufficiency requires choice of a sta tist ic good en ough to repr esent th e

    un derlying ph ysical p rocess; oth erwise, err or in choice of sta tist ic will add

    system at ic var ian ce which will redu ce predictability. An example of th is was in the

    int roductory example a bove, in which t he wr ong choice of cur ve (qua dra tic) add ed

    systematic error--clearly systematic, because there was no random error.

    In t he present cont ext, we may see chi squa re a s a wa y to represent th e goodness

    of choice of cur ve, such goodness being an un avoidable pr erequ isite t o exam ina tion

    of th e ran dom err or in th e data . This goodness must be relat ive to th e cat egories of

    th e data , not t o compet ing cur ves. But , in an y case, we can not look at s ufficiency

    without goodness of fit.

    So, how can we know tha t chi squar e might be u seful generally as a test , when,

    perha ps, the chi-squar e assu mpt ion of a su m of squar es of unit n orm al deviat es

    might itself be bad?

    Centr al Limit Theorem

    The goodness of chi squ ar e as a test st at istic for goodness of fit of rea sonably

    large samples generally need not be questioned. The reason is th e central limit

    theorem [7 vol II, Sect. 8.4 & Ch . 14], which h olds u nder alm ost all condit ions in

    which st able appar at us produces consistent dat a:

    T he sum X of any set of ind epend ent ran dom variates, regardless

    of how distributed, will approach a n orm al distribution

    ( )N X X ,2 as the sam ple size approaches infinity.

    Of cour se, that being tr ue, the sa mple mean will represent ( )E X X and the

    sam ple varian ce will represent ( )V X X 2 .

    As app lied to th e chi squa re quest ion, if th e dat a in ea ch cat egory fulfil therequiremen ts of th e centr al limit t heorem, the mean in th at cat egory always may be

    sta nda rdized for u se as a valid element of a chi squar e sum.

    Fu rt herm ore, if errors of measu rement cau sed by instr um enta tion ar e themselves

    subject t o th e centr al limit t heorem, an d th erefore a re n orma lly distributed, it can

    be proven [7 vol II, Sect. 2.2] th at th e convolut ion of th e instr um ent at ion er ror with

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    22/47

    J . M. Williams Chi Squar e df v. 1.23 21

    th e physical process also will be subject t o th e cent ra l limit t heorem, as sum ing a

    reasonably consistent , repeat ably measu ra ble process.

    Categor ies in Ra nd om-Sa mp le Chi Squa re

    Nu mber of Cat egories

    Above, it was st at ed th at a su fficient st at istic will have less variance tha n one

    which is not, for a good fit an d the sam e data . Fr om (17) an d comm ent s

    imm ediat ely following, as t he dfof chi squ ar e increases, its va ria nce increa ses, too.

    However, t o increa se df, we cha nge th e experiment by cha nging the way th e

    cat egories for avera ging the dat a ar e defined. Chi squa re represents th e cat egories,

    not the data .

    In t erms of the dat a, a set of mean s on ma ny cat egories (in th e limit, one cat egory

    per dat um ) always may be combined to ma ke fewer categories. In th is sense, a chi

    squar e of great er dfalways will be sufficient relative to one of lesser df, for a given

    set of dat a. Looked at in ter ms of individua l cat egories, a wider cat egory mean s

    great er likelihood of a cha nge in t he ph ysics, ma king for a bigger var ian ce, given th e

    sam e set of dat a being cat egorized. So, in chi squa re test ing, more num erous,

    na rr ower cat egories (alwa ys fulfilling n orm ality) will be sufficient rela tive t o fewer,

    wider ones. The more th e df, th e closer to sufficiency.

    Number of Par ameters

    For a given da ta set, it seems obvious t ha t a dding free th eoretical para meter s will

    ma ke th e fit of th e theory bett er. In t he limit, one always may fit N dat a cat egories

    by an ( N 1)-th degree polynomia l, result ing in zero var ian ce in each cat egory a ndth erefore a perfectly good chi squa re fit. However, this would be very unint erest ing

    physics--to claim, in t he limit, th at everyth ing was a certa in n um ber of term s of its

    own, u nique polynomial.

    Clearly, a physical goodness of fit somehow should involve the complexity of the

    hypothesis, as well as t he cat egorizat ion of th e dat a. The issue her e is, how should

    we view th e degrees of freedom in a goodness of fit t est wh en we a re free t o var y

    both t he dat a categories and th e number of th eoretical par am eters? Is every theory

    with n um erous par am eters an d wide-sweeping flexibility bett er t ha n one which

    leaves some variance unaccounted?

    Mea n ing of d f

    Bock [5, Sect. 2.6.3] defines degrees of freedom in ter ms of the d imen sion of a

    subspace on which th e dat a ar e projected.

    Brownlee [3, Sect. 8.1] defines d egrees of freedom a s "th e n um ber of var iables

    minu s th e num ber of independent linear r elations or const ra ints between th em [an d

    equa l to th e degrees of freedom of th eir su m of squa res]."

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    23/47

    J . M. Williams Chi Squar e df v. 1.23 22

    In describing th e chi squa re a pproximat ion t o the mu ltinomial distr ibution,

    Brownlee coun ts th e degrees of freedom a s t he n um ber of degrees of freedom for t he

    chi squar e minus t he nu mber of par am eters estima ted in order t o perform a cur ve

    fit. In an exam ple [3, Sect. 5.2], estimat ion in the sam ple ma kes the dfof th e chi

    squar e equal to th e number of cat egories minus 1. He assum es a Poissondistribution for which he su btr acts a nother 1 from df, because the Poisson density

    ha s one free par am eter. Thus, Brownlee's final dfis equa l to the nu mber of

    categories minus 2.

    If Brownlee's appr oach wer e a pplied to our problem of coun tin g df, we would u se

    th e form ula,

    ( )df N N C P= 1 , (24)

    in wh ich NC is nu mber of cat egories an d NP is nu mber of free par am eters. At least

    one popula r st at istical softwa re pa cka ge uses form ula (24) for chi squ ar e goodness offit.

    For exam ple, consider th e ar tificial dat a in F igure 4b, an d th e effect of sequen tia lly

    fitt ing th em with a line of slope b an d then with a const an t function a , each t ime

    subtr acting away the fit from the dat a:

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    24/47

    J . M. Williams Chi Squar e df v. 1.23 23

    (a) Data( x )

    x

    0.00

    0.25

    0.50

    0.75

    1.00

    0 2 4 6 8 10 12 14

    (b) Data(x ) and fit of slope b = 0.058

    x

    0.00

    0.25

    0.50

    0.75

    1.00

    0 2 4 6 8 10 12 14

    f1

    (x) = 0.058 x

    (c) Data( x ) - f1(x) and fit of a = 0.112

    x

    0.00

    0.25

    0.50

    0.75

    1.00

    0 2 4 6 8 10 12 14

    f2

    (x ) = 0.112

    (d) Data(x) - f1

    (x) - f2

    (x )

    x

    -0.25

    0.00

    0.25

    0.50

    0.75

    0 2 4 6 8 10 12 14

    Figure 4b. Effect of l inear parame ters on residuals . The data we re

    fit by f(x) = a +bx = 0.112 + 0.058x before the success ive su btract ions .

    Using these two para meters a and b in a cur ve fit in some sens e does redu ce th eort hogona lity of th e remainder (residuals). The smaller vertical ra nge means t he

    varian ce was reduced, too. With 12 data an d 2 par am eters t o fit, it would seem

    reasonable in th is exam ple to use the r ule above to test goodness u sing ( )p2 9 , with

    dfcomputed as 9 = (12 - 1) - 2. For exam ple, in Figur e 4b, if theory forced a = 0.10,

    th en only b would be fit, a llowing a test on ( )p2 10 , and possibly ma king a ccepta nce

    of th e fit more likely for th e same dat a. Clearly, in such an exam ple, th e accepta nce

    would be m ore likely for a th eoret ical valu e ofa equal t o its value best fit to th e

    dat a. So, a "str onger" th eory which predicted par am eters a ccur at ely without dat a

    would get preferen ce in goodness. The Galahad Effect, one m ight sa y, in wh ich a

    cur ve gains st rength becau se its th eory is pure.

    The problem with th e appr oach of simply subtra cting t he n um ber of free

    par am eters is th at it tr eats everything as linear a nd ort hogona l. It gives no more

    import an ce to a pa ra meter of the curve tha n t o just one of th e possibly num erous

    cat egories. This appr oach seems unqu estionably valid when the cur ve to be fit is

    linear. In other cases, although the cat egories might be linearly independent a nd

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    25/47

    J . M. Williams Chi Squar e df v. 1.23 24

    cont ribute equa lly to the r an dom var iance removed by th e fit, const ra ints on t he

    par am eters in genera l will int roduce nonlinear const ra ints on th e cat egory mean s.

    Anoth er, en tir ely differen t problem, is th e efficiency of the estim at or us ed for a

    par am eter to be fit. Sur ely, we should subtr act a dfwhen t esting a polynomial

    hypoth esis, if we ar e fitting an intercept para meter u sing th e data mea n. But, what

    if we ar e using t he less efficient, nonpar am etric stat istic, th e median, inst ead of th e

    mean ? The median only accoun ts for th e ran k order of th e data a nd is invar iant

    un der an y strictly monotonic tr an sform of th e data . Should we subtr act 1 2 dfif we

    use th e median of th e data to estima te th e hypoth etical intercept?

    For a t heorist interest ed in an objective test of a h ypoth esis with a small nu mber

    rof free pa ra met ers, t estin g by (24) above on a lar ge nu mber of cat egories, sa y 10r

    or more, essentially ignores th e nu mber of par am eters. Yet, under the null

    hypoth esis that th e data ar e well fit by the cur ve, th e data in some sense should be

    expected t o be interdependent in a way similar to the wa y they respond to small

    chan ges in th e para meters.

    Each parameter is critical, to an h ypoth esis with two par am eters to fit. With 100

    cat egories of dat a, th ough , dropping ten categories might n ot ma ke mu ch differen ce

    in t he conclusion.

    Need for a Cor r ect ion t o d f

    The pr acticality of the pr oblem ma y be seen by compar ing two hypoth etical

    cur ves as shown h ere, one a sine

    cur ve with 3 free par am eters

    (vertical displacement, phase, andfrequency) and t he other a 2-term

    polynomial (cons ta nt + coefficient

    of square).

    Sup pose we were a llowed to

    adjust the param eters shown. If

    th e data ra nged only over a 1,

    th e sine fit might be bett er th an

    th at of the polynomial; with th e

    dat a a s shown, a good sine fit

    might be hopeless. By th e ru leabove, we would u se a more

    str ingent test (more likely to reject goodness) for t he sin e cur ve tha n for t he

    polynomial.

    We note t ha t a dding a free para meter to the sine (say, amplitude) or a couple of

    more ter ms to the polynomia l, might m ak e the fit good in both cases. Alas! In th e

    cont ext of a th eory, man y par am eters will be const ra ined by the physics a nd

    x

    -1.00

    -0.50

    0.00

    0.50

    1.00

    1.50

    2.00

    2.50

    0.0 0.3 0.5 0.8 1.0

    Data(x)

    a + cx2

    a + sin( bx + c )

    Figure 4a. Two il l-f it curve s.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    26/47

    J . M. Williams Chi Squar e df v. 1.23 25

    un available for ar bitra ry fitting. The problem of th e present pa per is th at we

    can not know in advan ce what th e relation will be, for t hose tha t r emain free to fit.

    Clear ly, ignoring th e nu mber of par am eters in a goodness of fit t est will make t he

    test value of chi squa re such t ha t df df test categor ies= , so th at , for t he sa me curve fit an d

    dat a, a ccepta nce of the n ull hypoth esis (th at th e fit is good) will be favored.

    However, as above, merely subtra cting th e nu mber of par am eters from ( Ncategories 1)

    would seem likely t o under corr ect dfexcept in th e special case of linear regress ion.

    We seek a wa y of corr ectin g th e dfin a wa y equally fair , in some sen se, to all

    curves to be fit.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    27/47

    J . M. Williams Chi Squar e df v. 1.23 26

    III. The P aram ete r Correc tion

    Defin i t ion of Cor r ect ion

    Development and Rationale

    Before st ar ting, we emphasize tha t we do not addr ess th e question of

    cat egorization of th e data : Pr esuma bly, an optimum categorization a lready has

    been achieved, or perh aps th e problem may not be flexible in these ter ms. We

    accept t he categories of the da ta set a s a given.

    We also accept a s fixed a priori, the level of significance p to be requ ired for a

    rejection of goodness of fit.

    A clue to th e solution su ggested her e ma y be seen in th e example in Figure 4babove: In th at example, th e data residua ls were just as ran dom as th e origina l dat a

    un less fit by a stra ight line. If th e residuals were fit by a stra ight line, the slope

    would be 0 and t he int ercept would be 0, becau se th ese ort hogona l componen ts wer e

    th e ones subt ra cted down to 0. But, th ey were the only ones subt ra cted away, and

    no oth er stru ctu re in the dat a was affected. If we wished to test a linear fit to the

    residuals by chi squa re, we would subt ra ct a way df= 2; however, we assu me h ere

    th at to test a tr igonometric fit, we would subt ra ct a way nothing from th e df. Even

    th ough t he varian ce of th e residuals in Figure 4b was less tha n t ha t of th e origina l

    dat a, th e residuals rema in just a s dimensionful as th e data , except as projected on

    th e intercept or t he slope of a linear basis set.

    Given cat egories C, NC in n um ber, we consider a set of ar bitrar y cur ves K to fit,

    each cur ve itself with a n a rbitra ry an d probably not u nique set of par am eters NP in

    nu mber . All such cur ves ar e ass um ed single-valued on th e doma in of th e cat egories

    of dat a. The quest ion is how to coun t th e dfto be used in a chi squa re test. This

    quest ion is equivalent to th e one of coun tin g how ma ny dfshould be subtracted from

    th e number of dat a categories. We know th at for rea listic problems we mu st ha ve

    N df NP C< < .

    The a nswer we propose her e is to accoun t for t he diversit y of possible par am eter s

    by ignorin g th eir num ber an d consider ing only their effect. We th erefore propose to

    countth e degrees of freedom in th e dat a by measu ring th e ort hogona lity of th e dat aafter t he curve has been fit.

    Here is how we prepare th is quant ificat ion: For data on ra ndom var iableXwith

    domain { }x , described categorized by some function, { }Data x , we can compu te t he

    total varian ce in the da ta cat egories, ( )V Datacat . We choose th e curve K to be fit, fit

    it, subtra ct it, and th en exam ine the residual var iance, ( )V Data K res , .

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    28/47

    J . M. Williams Chi Squar e df v. 1.23 27

    Under th e null hypoth esis tha t t he fit was good, the residua ls should represent

    solely ra ndom var iat ion; th ey of cour se ma y not be as sum ed constr ain ed by K, K

    ha ving been removed. Under th e nu ll hypoth esis, we assu me here th at th e

    ort hogona lity of th ese residu als m ay be seen as being equa l to th e degrees of

    freedom of th e cur ve fit t o th e origina l dat a, except wh en consider ed in r egard t o afit by K itself.

    Coun tin g Pr ocess

    So, having r emoved t he effect of th e curve K, we propose to count degrees of

    freedom by r emoving furth er var iance by fitting t he r esiduals with th e sum of a set

    ( ){ }L L x of linear ly ort hogona l fun ctions , leaving us with a n ew residua l var ian ce,

    ( )V Data K Lres , , . Sta tist ical significan ce of th e sum will be test ed after a dding each

    term ofL to it; any term removing statistically significant variance will be omitted

    from L .

    The size of th e set L th en will define th e dfof the chi square t o be used in t esting

    goodness of fit. We see no reas on n ot t o use a n ort hogona l set of polynomial

    functions; because we will enumerate terms in L beginning with L0 , we call t he s ize,

    n df+ =1 , the nomial of th e fit:

    ( ) ( ) ( ) ( )df Data K L n L n R Data K L L xcorrected jj

    n

    = + ==

    nomial , , , , ,1 00

    s. t. es , (25)

    in which t he j ta ke on values of th e degree ofL not foun d sta tist ically significan t.

    To see th e logic of this, if only a few terms in L were required to reduce theresidual varian ce ( )V Data K Lres , , below some th resh old , then th e residuals must

    ha ve had low-order str uctur e which was m issed by K; so, th e dat a cat egories were

    not orth ogona l in view of th e th eory an d indeed sh ould be test ed at lower df. On

    th e oth er ha nd, if nu merous term s were required in L to reduce the r esidua ls below

    , the r esiduals must ha ve had enough ort hogona lity to be tested at high df.

    St opping Cr iterion

    The one rema ining pr oblem is the st opping crit erion: How good a fit should we

    requ ire of th e ort hogona l fun ctions ( ){ }L x , none of which being significant?

    We find an an swer by again referring to the n ull hypoth esis tha t t he fit is good:

    Under th is hypoth esis, we assume t ha t t he initial cur ve-fit by Kmu st ha ve removed

    a st at istically significan t a moun t of var ian ce; if so, ( )V Datacat > ( )V Data K res , ; in fact,we may write this in para metric terms a s,

    ( ) ( )V Data F V Data K cat p res , , (26)

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    29/47

    J . M. Williams Chi Squar e df v. 1.23 28

    in wh ich Fp is variance rat io at th e one-tailed p-level for t he F isher an alysis of

    variance F-test. So, to calculat e where to term inat e th e nomial, we suggest a pplying

    th is criterion:

    ( ) ( ){ } ( ) ( )nomial , , , , , ,maxData K L Min n L V Data K L F V Data K Lres p res n 1 0s. t. . (27)

    This just sets a t hr eshold which pr events th e creat ion of nonexistent dfby the

    ort hogona l fun ction fitt ing process: If th e n -th polynomial leaves th e n -th residuals

    with significan tly less variance tha n t he original 0-th residuals, th en a ll degrees of

    freedom (beyond th e st ru ctu re evident ly overlooked by K in the dat a an d now foun d

    in th e original r esidua ls) ha ve been discovered.

    A similar rationale is used sometimes in factor analysis (as distinguished from

    principle components an alysis). It should be ment ioned tha t th e dffor t he

    significance level of F is not a difficult issue h ere, becau se we ar e looking a t

    residuals a fter our problematical cur ve Kha s been fit and gone. The df for

    computing ( )V Data K Lres n, , from a sum of squa res of deviations will be just t he dfofthe data , ( NC 1), minu s the polynomial const ra ints ( n + 1 ). This is becau se, in both

    cases, the r esidual variance will have the dimensiona lity of th e dat a u nder t he nu ll

    hypoth esis tha t t he fit ofKwas good an d th erefore t ha t L will be in effective in

    explaining the data.

    So, we ha ve descr ibed a pr ocess to redu ce an ar b i t ra ry problem in

    coun t in g p h ysica l p a r a m et er s a s d f, t o one of cou n t in g l inea r ly in d ep end en t

    p olynom ia l p a r a m et er s a s d f.

    The Corr ection Pr ocedur e

    In the following, ( )Vres refers to a variance between categories and is computed

    from a sum of squar es by dividing by a dfdependin g on t he nu mber of cat egories

    an d their linear in dependence. The sample var iance in the origina l dat a, as

    categorized, is ( )V Datacat an d is a varian ce am ong t he category m eans; in cont ext of

    an alysis of varian ce, this is called th e "between tr eatm ents" varian ce.

    We resta te t he pr oblem: To perform a chi squa re goodness of fit test in th e

    cont ext of a ph ysics experiment, a ssum ing predeterm ined dat a cat egories Cand

    significance levelp , the degrees of freedom dfma y have to be adjusted for t he free

    par am eters in the cur ve to be fit. The proposed an swer is:

    1. Fit th e cur ve K to th e cat egorized dat a. Record th e sum of squa res of th e

    residu als (differen ces or r ema inder s) after t he fit, stan dar dizing each differen ce by

    th e var ian ce in its cat egory. The resu lt will be th e valu e of chi squa re for th e fit:

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    30/47

    J . M. Williams Chi Squar e df v. 1.23 29

    ( )Data j

    j

    Nj j

    jj

    Nj j

    jj

    N

    SSX K

    V

    X K

    sd

    C C C

    2

    1

    2

    1

    2

    1

    = =

    =

    = = =

    . (28)

    The stan dar d deviation sd here is the st an dar d deviation of th e j-th cat egory mea n,also called the st an dar d err or, which sh ould be estimat ed from th e origina l data

    before th e fit by K. Note th at t he cat egory variance corr esponding to chi squar e,

    which will not be u sed in t he goodness of fit t est, is obtain ed by dividing chi squa re

    by th e df: ( )V Data K df res Data, = 2 , in which we a pproximat e th e dfconventionally by

    N NC P 1 .

    IfK is a polynomial, count the number NP of free pa ra meters in t he polynomial

    an d comput e df= ( NC 1) NP : This immediat ely is th e dfsought. IfK includes

    polynomial free parameters (a constant offset, for example), subtract the number of

    th em before per form ing th e goodness of fit t est.Also, esta blish a st opping criterion by looking up or calculat ing th e Fish er

    varian ce ra tio value Ffor one-ta iled rejection of a n ull hypoth esis tes t ofdfof

    ( )V Datacat vs. dfof ( )V Data K res , at significan ce levelp . This $Fp ma y be foun d in

    sta tist ics references in th e cont ext of an alysis of var ian ce. The significan ce test will

    not a ctu ally be perform ed; however it s dfwill set $Fp for th e stopping crit erion. One

    dfwill be th at of th e data , NC 1; th e oth er will be th at of th e data with const ra ints

    of th e curve fit Kuncorrected, N NC P 1 . So, find,

    ( ) ( )( )F df df F p V Data V Data K p,

    $

    , . (29)

    Note th at in general Kwill ha ve removed consider able var ian ce from th e dat a, so

    an immediate Frat io test with,

    ( )

    ( )F

    V Data

    V Data K

    cat

    res

    =,

    , (30)

    easily might exceed $Fp . This will be irr elevant , becau se (a) we ar e not merely

    testing whether Ksays someth ing about t he dat a; and, (b) we can not assum e tha t

    th e origina l var ian ce over th e whole set of cat egories was const an t.

    2. One sum a t a t ime, progressively fit th e residual data with th e sum of th e

    ter ms from an ort hogona l polynomia l set ( ){ }L x an d comput e the new residual

    varian ce from tha t fit, ( )V Data K Lres , , . When th e nomial equals NC 1, st op. Or,stop when t he st opping crit erion is invoked: This will be when t he ra tio of var ian ces

    am ong elements ofL no longer is below $Fp , na mely, when

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    31/47

    J . M. Williams Chi Squar e df v. 1.23 30

    ( )( )

    FV Data K L

    V Data K LF

    res

    res n

    p= , ,

    , ,$

    0. (31)

    For everyL yielding a statistically significant fit, there was a linear dependence, sosubt ra ct 1 from t he n omia l for ea ch su ch fit.

    To compu te t he first an d all subsequent values of ( )V Data K Lres , , , compu te t he onlyvarian ce present, na mely that between t he cat egories:

    ( )V Data K Lres n, , =

    1

    1dfresj

    j

    NC

    S S , (32)

    with ( )df N nres C = +1 1 , for Ln the highest-degree term in a polynomial ofn -th

    degree.

    3. Fina lly, to test t he goodness of th e fit, perform a st an dar d chi squar e test with

    dfequal to th e nomial. In cases not requiring a corr ection, this dfwill equa l NC 1 .

    Step 2 m ay be perform ed expeditiously by doing a pa ra met ric multiple regress ion

    of the residuals ( )V Data K res , on a convenien t set of ort hogona l polynomials. AChebyshev set of coefficient s, or a ny of severa l other s, ma y be foun d in [8] and used

    with P C softwar e.

    Ap p l ica t ion in S yn th et ic Exa m p les

    The value of using synthet ic examples is th at th e un derlying, corr ect fun ctiona lform of th e data will be known. Thu s, we can pr edict how a good chi squa re

    goodness of fit t est should beh ave.

    To insist on st at ing th e obvious, in th e following examples, compa red with th e

    int roductory ones above, we ha ve a qua litat ively differen t set of dat a. The next

    dat a below ar e simulat ed experiment al data , with real varian ce. Uniformly

    distributed ra ndom numbers were used to add th e var iance.

    In th e following exam ples, we sha ll assum e th e centr al limit t heorem so as t o be

    able to tr eat each bin mean as norma lly distribut ed. In th e first example below, we

    sha ll be comput ing th e sta nda rd deviation in each bin u sing 14 -1 degrees of

    freedom; one degree is lost becau se th e bin mea n was estima ted from t he sa me 14

    dat a. Knowing each bin mean and sta nda rd deviation, we sha ll comput e th e

    sta nda rd deviat ion of each data bin mean (stan dar d error of th e mean) and u se it,

    not t he corr esponding th eoretical fit pa ra meter, t o stan dar dize each element of th e

    chi squar e sum.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    32/47

    J . M. Williams Chi Squar e df v. 1.23 31

    Fifty Cat egories in Ran dom Ch i Squar e

    Quadr at ic Fit

    Recall from F igure 4 th at th e qua dra tic fit with 50 cat egories visibly was notgood. Let's look at t he same dat a cat egories, but with th e data r an domized by

    add ition of some va ria nce.

    Distance x

    Field

    IntensityI

    -0.4

    -0.2

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    1.2

    0.15 0.50 0.85

    Mean sd of I

    Mean sd of Imean

    Bad Line:

    I = 7(x -1/2)

    Cubic Fit:

    I = 25| x -1/2|3

    Quadrat ic Fit:

    I = 7(x -1/2)2

    Quartic Fit:

    I = 70(x -1/2)4

    Figu re 5. Fifty-catego ry fits by eye. The 700 artificial data of Fig. 4

    here are added som e random error and again f it wi th the same

    categorized quadratic, ( ) ( )I a x b x= = 22

    7 1 2 . A purpose ly bad l inear

    fi t also is show n, as is the old cubic f i t by eye from Figure 1 above an d a

    new quart ic f i t .

    Computing Data2 as in (5) or (28) above yields a valu e of about 35.8 for t he s olid

    line in F igur e 5, which r epresents t he qua dra tic cur ve. We know from published

    ta bles or calculat ion th at ( ).0012 13 = 34.5 an d ( ).001

    2 14 = 36.1. We find in th e tables,

    also, th at our st opping crit erion will be ( )F. , .001 49 47 250= . The quadr at ic ha s just two

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    33/47

    J . M. Williams Chi Squar e df v. 1.23 32

    polynomial par am eter s, so our corr ection should allow us imm ediat ely to assign a

    corrected dfof ( NC 1) - 2 = 47; however, we proceed with t he complete correction

    an alysis, to show how it work s.

    If we compu ted our correction t o df= 14 or a bove, and if no ter m wa s significan t,

    an d no residual reached the L Ln0 var ian ce ra tio stopping crit erion of 2.50, we

    would have included sufficient dfto know we could a ccept t he n ull hypothesis t ha t

    th e fit was good.

    The resu lts of th ese compu ta tions ar e in Table 5. To redu ce possible err ors on

    th is first real t rial of th e corr ection, th e ta ble was const ru cted by repeat edly fitting

    nonorth ogona l polynomials with t erm s of th e form, a xnn , as described for t he

    correction above. The resu lt was verified by repea tedly run nin g mu ltiple regression

    on a n in crea sing nu mber of ter ms of th e Chebyshev ort hogona l polynomia ls of th e

    first k ind. In pr actice, of cour se, only the regr ession would be necessar y.

    The low, relatively unchanging $F ra tios for t he lowest-order nomials in Ta ble 5ar e a good indicat ion t ha t we should test a t h igh df.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    34/47

    J . M. Williams Chi Squar e df v. 1.23 33

    Table 5. Correcte d d fcalculations for the 50-category qu adratic in

    Fig. 5 . All but the rightm ost two columns we re done by curve-fi t with

    nonorthogonal a xnn polyn omia ls, term by term . The "MR"column s

    we re done by mult iple regress ion on Chebyshev polynomials .

    Ra w

    Nom-

    ia l

    DescriptionPoly

    Residual

    SS

    Poly

    Residual

    Variance

    $F

    MR

    Residual

    SS

    MR

    Residual

    Variance

    - Data2

    = 35.82 - - - -

    0 ( )V Data K Lres , , 0 0.4725 0.00964 - 0.4725 0.00964

    1 ( )V Data K Lres , , 1 0.4695 0.00978 0.985 0.4695 0.00978

    2 ( )V Data K Lres , , 2 0.4130 0.00879 1.097 0.4128 0.00878

    3 ( )V Data K Lres , , 3 0.4127 0.00897 1.075 0.4127 0.00897

    4 ( )V Data K Lres , , 4 0.4082 0.00907 1.063 0.4079 0.00906

    8 ( )V Data K Lres , , 8 0.3643 0.00888 1.085 0.3613 0.00881

    20 ( )V Data K Lres , , 20 0.3597 0.01240 0.778 (n ot don e) (n ot don e)

    None of th e fits wa s significan t, a nd a ll compu ted va lues of $F sta yed well below

    2.50, so we ar e justified in t estin g at ( )p2 20 or perh aps higher df. We kn ow

    alrea dy that we can not reject t he nu ll hypoth esis at df = 14, so an y higher dfsurely

    will yield a sta tist ic sufficient t o th e sam e conclusion . Fu rt her a na lysis is not

    requ ired, an d accordin g to our corr ected chi squa re t est, we accept t he fit of th e

    quadratic as good.

    It sh ould be ment ioned her e th at a least -squar es or other similar fit of a 20-term

    polynomial techn ically is not tr ivial, even on a fast compu ter . It would be ra th er

    amazing that such a polynomial really could be fit optimally to a set of data small

    enough to appr oach the num ber of term s being fit. But, we assu me the best and

    look t o th e fut ur e for better .

    Linear Fit

    Table 5a sh ows th e sam e test as Ta ble 5, but for t he obviously bad fit of a st ra ight

    line passing on a st eep slope thr ough t he middle of th e data (Figure 5). Again, the

    two polynomial parameters should allow immediate correction to df= 47; we

    cont inue an yway, for demonst ra tion pur poses.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    35/47

    J . M. Williams Chi Squar e df v. 1.23 34

    Table 5a. Computation s as in Table 5 for corrected ch i square

    goodne ss of fi t test of the bad ( ) ( )I a x b x= = 7 1 2 l ine in Fig. 5. Thestop on the third row makes the n omial = 1; how ever , the remainder i s

    supplied to show the result of a bad f i t . Terms s ignificant in regression

    are shown in parenthe ses .

    Ra w

    Nomial

    Descr ipt ion Residua l

    SS

    Residual Variance$F

    - Data2

    = 106.85 - -

    0 ( )V Data K Lres , , 0 101.58 2.073 -

    1 ( )V Data K Lres , , 1 2.8708 (*L 0,L 1) 0.05981 34.7

    2 ( )V Data K Lres , , 2 0.4128 (*L0-L 2) 0.00878 236.1

    3 ( )V Data K Lres , , 3 0.4127 (*L 0,L 1) 0.00897 231.1

    4 ( )V Data K Lres , , 4 0.4079 0.009064 228.7

    8 ( )V Data K Lres , , 8 0.3616 0.008819 235.1

    12 ( )V Data K Lres , , 12 0.3607 0.009749 212.1

    As before, ( )F.

    , .001

    49 47 250= . But, as obvious in Table 5a, the bad linear fit has

    left t oo mu ch st ru ctu re un accoun ted an d push es $F immediately out to 34.7, mak ing

    th e nomial 1; this would imply testing a gainst ( ). .0012 1 108= . In an y case, becau se

    ( ).0012 49 is only about 85, th e Data value of 106.85 quickly allows us to reject the fit

    of the bad line as not good no mat ter what th e df.

    Cubic Fit

    Once more, we look at a differen t curve fit to the da ta in Figur e 5: This tim e, it is

    th e cubic from Figur e 1, replotted in F igure 5 an d origina lly rejected a s a fit to the

    dat a before ran dom variat ion ha d been added. We again proceed with the ana lysis,

    alth ough t wo polynomial para meters permit an immediate corr ection t o df= 47.

    The raw Data2 for t he cub ic fit is foun d t o be 64.25; because ( ).001

    2 49 is about 85.3,

    th e dat a va lue from th e cubic might a llow th e fit t o be rejected, if our corr ection

    were to reduce th e dfby enough. Fr om published ta bles or calculat ion, we find th at

    ( ). .0012 33 63 9= and ( ). .001

    2 34 65 2= ; so, if we ret ain a corr ected dfof less t ha n 34, we

    will reject the fit of the cub ic as not good.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    36/47

    J . M. Williams Chi Squar e df v. 1.23 35

    As before, the correction stopping criterion will be ( )F. , .001 49 47 250= . The resultsof th e an alysis ar e in Table 5b.

    Table 5b. Computation s as in Table 5 for corrected ch i square

    goodn ess of f it test of the cubic ( )I a x b x= = 33

    25 1 2 in Fig . 5.

    Terms s igni ficant in regress ion are show n in parentheses .

    Ra w

    Nomial

    Descr ipt ion Residua l

    SS

    Residual Variance$F

    - Data2

    = 64.25 - -

    0 ( )V Data K Lres , , 0 0.834828 0.017037 -

    1 ( )V Data K Lres , , 1 0.780223 0.016255 1.0482 ( )V Data K Lres , , 2 0.479143 (*L 0-L 2) 0.010195 1.671

    3 ( )V Data K Lres , , 3 0.476474 0.010358 1.645

    4 ( )V Data K Lres , , 4 0.395849 0.008797 1.937

    5 ( )V Data K Lres , , 5 0.394304 0.008961 1.901

    7 ( )V Data K Lres , , 7 0.364182 0.008671 1.965

    8 ( )V Data K Lres , , 8 0.360949 0.008804 1.935

    12 ( )V Data K Lres , , 12 0.360192 0.009735 1.750

    20 ( )V Data K Lres , , 20 0.352694 0.012162 1.400

    Alth ough Ta ble 5b omit s some result s, th e value of $F wandered up to the

    ma ximu m sh own a t a nomial of 7, th en declined gra dua lly to the n omial 20 value

    shown; this was the limit of th e aut hor's PC softwar e. There is no sign th e stopping

    crit erion would be invoked, an d only one n omia l ter m wa s dr opped becau se of

    significan ce, so the corr ected chi squa re dfwould be NC =1 1 2 46 , the fina lredu ction by two being becau se the fit was a t wo-par am eter polynomia l. We repea t

    that the immediate correction, because of the two polynomial terms, by subtraction

    of 2 from 49, would leave a proper corrected dfof 47; in effect, we pr eten ded t her e

    was at least one nonpolynomial para meter t o test t he procedure. But, 46 34>> still

    mean s th at th e cubic fit to th e Figure 5 da ta would be accepted as good.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    37/47

    J . M. Williams Chi Squar e df v. 1.23 36

    Quar tic Fit

    Let's look once again a t a cur ve fit to th e data in Figure 5: This time, it is the

    qua rt ic, which wa s fit by eye.

    The raw Data2 for th e quar tic fit is foun d to be 83.01. Fr om published ta bles or

    calculat ion, we find t ha t ( ). .0012 47 82 7= and ( ). .001

    2 48 84 04= ; so, if we ret ain a

    corrected dfof less tha n 48, we will reject t he fit of th e quar tic as n ot good. The

    common practice in (24) above of setting dfat N NC P 1 , or our str ictly app lied

    correction procedure, would assign df= 47, cau sing an imm ediat e rejection of th e fit.

    We pretend th ere is at least one nonpolynomial par am eter a nd pur sue our corr ection

    here t o see how it perform s in t his int eresting ma rginal case.

    As before, the correction stopping criterion will be ( )F. , .001 49 47 250= . The resultsof th e corr ection a na lysis is in Table 5c.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    38/47

    J . M. Williams Chi Squar e df v. 1.23 37

    Table 5c. Computation s for corrected chi square goodne ss of f i t test

    of the quartic ( ) ( )I a x b x= = 44

    70 1 2 in Fig. 5 . Terms s ignif icant in

    regress ion are show n in parentheses .

    Ra w

    Nomial

    Descr ipt ion Residua l

    SS

    Residual Variance$F

    - Data2

    = 83.01 - -

    0 ( )V Data K Lres , , 0 0.824809 (*L 0) 0.016833 -

    1 ( )V Data K Lres , , 1 0.766592 0.015971 1.054

    2

    ( )V Data K L

    res

    , ,2

    0.658320 0.014007 1.202

    3 ( )V Data K Lres , , 3 0.650998 0.014152 1.189

    4 ( )V Data K Lres , , 4 0.407871 (*L 0-L 4) 0.009064 1.857

    5 ( )V Data K Lres , , 5 0.406808 0.009246 1.821

    6 ( )V Data K Lres , , 6 0.383786 0.008925 1.886

    7 ( )V Data K Lres , , 7 0.366298 0.008721 1.930

    8 ( )V Data K Lres , , 8 0.361569 0.008819 1.909

    12 ( )V Data K Lres , , 12 0.360525 0.009744 1.728

    20 ( )V Data K Lres , , 20 0.352887 0.012169 1.383

    As sh own in Table 5c, th e value of $F wandered up to a ma ximum again at a

    nomial of 7, but t he st opping crit erion was not invoked. However, th e low-degree

    an alysis was enough to reveal two terms of significan t residuals; the pr etend-

    corr ected chi squa re dfwould be NC =1 2 2 45 , and t he quar tic fit to the Figure

    5 dat a is r ejected a s not good. Actu ally, we of cour se would h ave rejected t he fit at49 - 2 = 47 df, becau se th ere were just t wo polynomial para meters an d no oth ers.

    Fina lly, before leaving th is exam ple, we should rema rk t ha t t he sam e quadr at ic

    function wh ich failed misera bly in F igure 4 ea sily was s hown a good fit in Figur e 5.

    The dispersion in the Figur e 5 dat a clearly weakened the test. In fact, even th e

    cubic of Figur e 1 was a ccepted a s a good fit. When t he ra ndomn ess is grea t enough,

    alm ost a ny fit will be good en ough --but , not m uch confidence in t he h ypoth esis will

    be added by not rejectin g it, eith er.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    39/47

    J . M. Williams Chi Squar e df v. 1.23 38

    We consider next a more realistic simulated example.

    Applicat ion in a Twenty Cat egory Decay Simu lat ion

    Assume t here is a t heory which predicts t ha t a certa in ma teria l, initially

    tr an slucent, will fluoresce if illumina ted with light at a certa in, short wa velength.Let's consider a h ypoth etical experim ent in which some of th is mat erial is enclosed

    in a spherical detector, a glowball, and irradiated with a pulse of short-wavelength

    light . We wish to predict the tim e cour se of th e glow inside th e sphere.

    The hypoth esis is tha t a 4 Pla nckian detector will respond a s th e convolut ion of

    th e Gaussian light exciting impulse with a decaying negative exponent ial of the

    form,

    ( )( ) ( )

    E t w a I I d e e

    tw t a; , | 0 0

    0

    1

    2

    2

    =

    , (33)

    which h as t wo free par am eters, pulse width w an d time-const an t a, the initial

    intensity I0 being fixed by theory.

    The resu lt of th e simulat ed experiment is given in Figure 6.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    40/47

    J . M. Williams Chi Squar e df v. 1.23 39

    time t

    Energy

    E

    0

    5

    10

    15

    20

    25

    30

    0 5 10 15 20

    Mean sd of E

    Mean sd of Emean

    Figure 6. Glowb all decay expe riment. Time t and energy E are in

    arbitrary un its . The sol id line represe nts a fi t of formula (33) above,

    wi th I0 = 8 f ixed by the ory and w = 2.9 an d a = 17 f i t by eye to the 20 bin

    means s hown.

    Fr om published ta bles or calculat ion, we find th at ( ). .0012 19 438= and

    ( )F. , .001 19 17 4 81= . Computing Data2 , we get over 75. Ther efore, th e null hypoth esis

    of a good fit is r ejected at once.

    The analysis, although unnecessary for this particular decision, is given in

    Table 6. Looking at t he several low-degree valu es ofL in t he t able, it is obvious

    th ey are chan ging slowly, an d, clear ly, after compu tin g L2 , it would be very

    reasonable to conclude t ha t a test at df= NC 1 1 was just ified.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    41/47

    J . M. Williams Chi Squar e df v. 1.23 40

    Table 6. Glow ball decay expe riment. Computation s for correcte d

    chi square goodne ss of fi t o f the c urve in Fig . 6, which w as text equat ion

    (33) f i t by eye . Two free paramete rs , w an d a , were avai lable to f i t .

    Terms in parenthes i s w ere s igni f icant in regress ion.

    Ra w

    NomialDescription

    Residual

    SS

    Residual

    Variance$F

    - Data2

    = 75.7 - -

    0 ( )V Data K Lres , , 0 9.551 (*L 0) 0.503 -

    1 ( )V Data K Lres , , 1 9.551 0.531 0.947

    2 ( )V Data K Lres , , 2 7.631 0.449 1.120

    3 ( )V Data K Lres , , 3 6.493 0.406 1.239

    4 ( )V Data K Lres , , 4 6.226 0.415 1.212

    8 ( )V Data K Lres , , 8 5.432 0.494 1.018

    12 ( )V Data K Lres , , 12 5.559 0.794 0.634

    None of th e compu ted va lues of $F exceeded even 2, alt hough t he L0 term would

    not be coun ted; so, if the initia l value of 75.7 for chi squa re ha d n ot been so obvious,we would have performed the corrected test against ( ) ( ) . .001

    2

    001

    21 1 18NC = .

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    42/47

    J . M. Williams Chi Squar e df v. 1.23 41

    But wha t if th eory allowed us t o adjust t he th ird para meter , I0 , for a bett er fit?

    The result of a t hr ee-par am eter fit is shown in Figure 7 an d is ana lyzed in Ta ble 7:

    t ime t

    Energy

    E

    0

    5

    10

    15

    20

    25

    30

    0 5 10 15 20

    Mean sd ofE

    Mean sd ofEmean

    Figure 7. Glow ball decay expe rimen t. Analysis w ith three freeparameters . Time t and energy E are in arbitrary units . The sol id l ine

    represe nts a f i t of formula (33) above, with I0 = 8.2, w = 2.95, an d a = 1 7

    as f i t by eye a gain to the sa me 20 bin mea ns as in Fig. 6 .

    As sh own below in Table 7, Data2 , with an additiona l free par am eter, now is just

    34.9, which is below ( ). .0012 19 438= . Ther efore, assum ing a corr ection for free

    par am eter s, no immedia te decision can be ma de. Becau se th e lowest dfallowing

    accepta nce of the fit is 14 ( ( ). .0012 14 361= ), we seek a decision a s to wheth er t he df

    might be 14 or more.

    The corr ection an alysis is done in Table 7. In t ha t Table, th e wandering of th e

    residual va rian ce an d t he lack of consistent increase of $F make it quite certain t hat

    th e critical Fvalue of 4.81 never will be reached. Ther efore, th e fit of th e theory

    using th ree free par am eters m ay be accepted as good.

    Compar e the a ppeara nce of th e two fits in Figures 6 and 7: The eye can ha rdly

    decide, whereas th e sta tistics a re u nequivocal, given t he category set an d th e

    significance level.

  • 8/9/2019 Degrees of Freedom: A Correction to Chi-Square for Physical Hypotheses

    43/47

    J . M. Williams Chi Squar e df v. 1.23 42

    Table 7. Glow ball decay data as in Table 6. Computations for

    corrected ch i square goodness of f i t of the curve in Fig. 7 . Three

    parameters ,I0, w , and a , were free to f i t .

    Ra w

    Nomial

    Descr ipt ion Residua l

    SS

    Residual

    Variance$F

    - Data2

    = 34.97 - -

    0 ( )V Data K Lres , , 0 9.179 0.4831 -

    1 ( )V Data K Lres , , 1 9.173 0.5096 0.948

    2 ( )V Data K Lres , , 2 8.868 0.5216 0.926

    3 ( )V Data K Lres , , 3 6.622 0.4139 1.167