ASTR633 Astrophysical Techniques

STATISTICS PRIMER
(original notes by Pat Henry, edited by Mike Liu and Jonathan Williams)

Astronomers cannot avoid statistics:

1. We always deal with probabilities: observing time is limited, so we want to observe just long enough that there is a high probability we have seen what we sought. Sample sizes are inevitably finite. What is the probability that a particular interesting effect is real?
2. No experimentally determined quantity is of use unless it has an error associated with it. We need statistics to calculate errors.

Here are some of the most common situations where astronomers use statistics:

1. Detection of a signal: is the gamma ray burst visible in the optical? Have I detected an emission line?
2. Are two quantities correlated? How significantly?
3. Estimate the parameters of a model. What are the errors on the parameters? Was the model reasonable in the first place?
4. Comparison of samples with
   (a) the predictions of a model (do they agree?)
   (b) each other (are they from the same population?)

It generally comes down to common sense.

1. If it doesn't look right, it probably isn't.
2. There are lots of ways to screw up, but only one way to be right.
3. Most results are not revolutionary. Before you start drafting a press release, make sure you haven't made a mistake...

1. Sample and Parent Population

Suppose we make N measurements, $x_i$, of a quantity x (e.g., multiple measurements of a stellar magnitude, the declinations of N stars in the galactic plane, etc.). These N measurements are called a sample. The parent population is a hypothetical infinite set of measurements of which our original N is assumed to be a random subset. The parent population is the "truth", which we can never obtain. The fundamental task of statistics is to infer the properties of the parent population from the sample. (Note: this is the so-called "frequentist" interpretation of statistics. We'll discuss the alternative Bayesian point of view later.)


1.1 Probability Density Function

The parent population leads to the probability density function, $p(x)$, where $p(x)\,dx$ is the probability that an $x_i$ will be in the range $[x, x+dx]$:

$$p(x)\,dx = \frac{A(x)}{A_{\rm tot}}$$

(here $A(x)$ is the area under the curve between $x$ and $x+dx$, and $A_{\rm tot}$ is the total area).

Clearly, the total area is unity probability, i.e.

$$\int_{-\infty}^{\infty} p(x)\,dx = 1$$

Probability that $a < x < b$:

$$P(a < x < b) = \int_a^b p(x)\,dx$$

Probability is positive-definite: $p(x) \ge 0$.

All quantities of interest are obtained from integrals of $p(x)$. If x is a discrete variable, then the integrals turn into sums.

1.2 Properties of the Parent Population

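Location

Mean:

$$\mu \equiv \int_{-\infty}^{\infty} x\,p(x)\,dx = \frac{1}{N}\sum_{i=1}^{N} x_i$$

Median: $\mu_{1/2}$, such that

$$\frac{1}{2} = \int_{-\infty}^{\mu_{1/2}} p(x)\,dx = \int_{\mu_{1/2}}^{\infty} p(x)\,dx$$

or, for a discrete variable,

$$\sum_{i=1}^{\mu_{1/2}} p(x_i) = \sum_{i=\mu_{1/2}}^{N} p(x_i)$$

Mode: $\mu_{\max}$, such that $P(\mu_{\max}) \ge P(x \ne \mu_{\max})$.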

For a symmetrical PDF, these 3 are usually equal. For an asymmetrical PDF, the median is more stable than the mean. A single bad point can bias the mean by a large amount but hardly change the median. (This is an example of a "robust" statistic.)

However, the median is slightly noisier (or "less efficient") than the mean, with a variance that is π/2 = 1.57× larger (for large N).

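Width

Variance:

$$\sigma^2 \equiv \int_{-\infty}^{\infty} (x-\mu)^2\,p(x)\,dx = \left[\int_{-\infty}^{\infty} x^2\,p(x)\,dx\right] - \mu^2$$

$$\sigma^2 \equiv \frac{1}{N}\sum_{i=1}^{N} (x_i-\mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$$

Standard deviation: $\sigma \equiv (\text{variance})^{1/2}$

Moments

$$\mu_n \equiv \int_{-\infty}^{\infty} (x-\mu)^n\,p(x)\,dx = \frac{1}{N}\sum_{i=1}^{N} (x_i-\mu)^n$$

$\mu_0 = 1$, $\mu_1 = 0$, $\mu_2 = \sigma^2$

$$\text{skewness} \equiv \beta_1 \equiv \frac{\mu_3}{\mu_2^{3/2}}$$

Deviation from symmetry: = 0 for symmetric, > 0 for a tail extending to positive values, and vice versa.

$$\text{kurtosis} \equiv \beta_2 \equiv \frac{\mu_4}{\mu_2^{2}} - 3$$

Degree of "peakiness", where the -3 makes the kurtosis of a Gaussian = 0.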

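1.3 Estimating Properties of the Parent Population from a Sample

Sample Mean

$$\bar{x} \equiv \frac{1}{N_s}\sum_{i=1}^{N_s} x_i$$

where $N_s$ is the number of measurements in the sample.

  • Is $\bar{x}$ a good estimator of μ? What is its average value?

$$\langle \bar{x} \rangle = \int_{-\infty}^{\infty} \bar{x}\,p(x)\,dx = \int_{-\infty}^{\infty} \left[\frac{1}{N}\sum_{i=1}^{N} x_i\right] p(x)\,dx = \frac{1}{N}\sum_{i=1}^{N} \int_{-\infty}^{\infty} x_i\,p(x)\,dx = \frac{1}{N}\sum_{i=1}^{N} \mu = \mu$$

Thus $\bar{x}$ is an unbiased estimator of μ. On average, it gets the right answer.

What is the variance of $\bar{x}$?

$$Var(\bar{x}) = Var\left(\frac{1}{N_s}\sum_{i=1}^{N_s} x_i\right) = \frac{1}{N_s^2}\sum_{i=1}^{N_s} Var(x_i) = \frac{1}{N_s^2}\sum_{i=1}^{N_s} \sigma^2 = \frac{\sigma^2}{N_s}$$

Its square root, $\sigma/\sqrt{N_s}$, is known as the "standard deviation of the mean" or the "standard error".

Sample Variance

$$s^2 \equiv \frac{1}{N-1}\sum_{i=1}^{N} (x_i-\bar{x})^2 = \frac{1}{N-1}\sum_{i=1}^{N} x_i^2 - \frac{N}{N-1}\,\bar{x}^2$$

Note the 1/(N-1) instead of 1/N. With this normalization the expected value of $s^2$ is $\sigma^2$, so $s^2$ is an unbiased estimator of the parent variance (left as an exercise for the reader...).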
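The claims above about the sample mean and sample variance can be checked numerically. The sketch below is only an illustration: the Gaussian parent population, its parameters, the sample size and the seed are all assumed values, not taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(42)            # fixed seed so the sketch is repeatable
mu_true, sigma_true, n_s = 10.0, 2.0, 25   # hypothetical parent population and sample size

# Draw many independent samples of size n_s from the same parent population.
samples = rng.normal(mu_true, sigma_true, size=(20000, n_s))

means = samples.mean(axis=1)               # one sample mean per row
s2_unbiased = samples.var(axis=1, ddof=1)  # 1/(N-1) normalization
s2_biased = samples.var(axis=1, ddof=0)    # 1/N normalization

print("average of sample means :", means.mean())
print("scatter of sample means :", means.std(),
      "vs sigma/sqrt(N_s) =", sigma_true / np.sqrt(n_s))
print("average s^2, 1/(N-1)    :", s2_unbiased.mean())
print("average s^2, 1/N        :", s2_biased.mean())
```

The 1/N version should come out low by roughly the factor (N-1)/N, which is the bias the 1/(N-1) normalization removes.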

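2. Errors

No measurement of $x_i$ is infinitely precise. It has an error associated with it. What are the errors on various computed quantities?

2.1 Analytical considerations

Consider some function of u and v, $f(u,v)$. The propagation of errors can be determined as follows:

$$f(u,v) = f(0,0) + u\,\frac{\partial f}{\partial u} + v\,\frac{\partial f}{\partial v} + \text{higher order terms}$$

$$f - \bar{f} = (u-\bar{u})\,\frac{\partial f}{\partial u} + (v-\bar{v})\,\frac{\partial f}{\partial v} + \text{higher order terms}, \quad \text{where } \bar{f} = f(\bar{u},\bar{v})$$

$$\sigma_f^2 = \lim_{N\to\infty} \frac{1}{N}\sum (f_i-\bar{f})^2 \approx \lim_{N\to\infty} \frac{1}{N}\sum \left[(u_i-\bar{u})\,\frac{\partial f}{\partial u} + (v_i-\bar{v})\,\frac{\partial f}{\partial v}\right]^2$$

$$\sigma_f^2 = \lim_{N\to\infty} \frac{1}{N}\sum \left[(u_i-\bar{u})^2\left(\frac{\partial f}{\partial u}\right)^2 + (v_i-\bar{v})^2\left(\frac{\partial f}{\partial v}\right)^2 + 2\left(\frac{\partial f}{\partial u}\right)\left(\frac{\partial f}{\partial v}\right)(u_i-\bar{u})(v_i-\bar{v})\right]$$

$$\sigma_f^2 = \left(\frac{\partial f}{\partial u}\right)^2 \sigma_u^2 + \left(\frac{\partial f}{\partial v}\right)^2 \sigma_v^2 + 2\left(\frac{\partial f}{\partial u}\right)\left(\frac{\partial f}{\partial v}\right)\sigma_{uv}$$

where we have defined the covariance

$$\sigma_{uv} = \lim_{N\to\infty} \frac{1}{N}\sum (u_i-\bar{u})(v_i-\bar{v})$$

Note:
- Expansion to first order only, so only true for "small" errors (e.g. $\sigma_u/u \sim \sigma_v/v \sim 10\%$), i.e. in the regime of a 1st order Taylor series.
- Equations are similar if more than 2 variables are involved.
- The first two terms dominate since they are positive definite, while the 3rd term (covariance) can have some cancellation, as it can be negative. It is zero for uncorrelated u & v (which is often the case).

See Problem Set #2 for examples.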
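As a quick illustration of the first-order formula, the sketch below applies it to the product f = u v with uncorrelated errors and compares the result with the scatter of simulated values. The central values and errors are assumed numbers chosen so that the relative errors are about 5%.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical measurements with small, uncorrelated Gaussian errors.
u0, sigma_u = 10.0, 0.5
v0, sigma_v = 4.0, 0.2

# For f = u * v: df/du = v and df/dv = u, and the covariance term is zero.
sigma_f_analytic = np.sqrt((v0 * sigma_u) ** 2 + (u0 * sigma_v) ** 2)

# Empirical check: simulate many (u, v) pairs and measure the scatter of f.
u = rng.normal(u0, sigma_u, size=200_000)
v = rng.normal(v0, sigma_v, size=200_000)
sigma_f_mc = np.std(u * v)

print("first-order propagation:", sigma_f_analytic)
print("simulated scatter      :", sigma_f_mc)
```

The two numbers agree closely here because the errors are small; for larger relative errors the neglected higher order terms start to matter.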

2.2 Monte-Carlo error propagation

Empirically determine errors by creating fake datasets. Do this if you wish to avoid making any assumptions about the underlying distribution.

a) If errors are well characterized, jiggle each data point using Gaussian random numbers

b) Bootstrap a new sample by picking at random with replacement
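Options (a) and (b) can be sketched in a few lines. This is a minimal illustration under assumed inputs: the data, the per-point error `sigma_i` and the choice of the median as the "fitted" quantity are all hypothetical stand-ins for a real dataset and model fit.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(5.0, 1.0, size=50)   # hypothetical measurements
sigma_i = 1.0                          # assumed known error on each point
n_fake = 2000                          # number of fake datasets

jiggled = np.empty(n_fake)
medians = np.empty(n_fake)
for k in range(n_fake):
    # (a) jiggle each point by a Gaussian random number scaled to its error
    fake_a = data + rng.normal(0.0, sigma_i, size=data.size)
    jiggled[k] = np.median(fake_a)
    # (b) bootstrap: draw N points at random, with replacement
    fake_b = rng.choice(data, size=data.size, replace=True)
    medians[k] = np.median(fake_b)

print("median of the data        :", np.median(data))
print("scatter from jiggling (a) :", jiggled.std())
print("scatter from bootstrap (b):", medians.std())
```

In a real application the call to `np.median` would be replaced by the model fit, and the histogram of the returned parameters gives their uncertainties.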

Fit the model to the fake dataset. Do this many times and make a histogram of the best-fit parameters. Compute the mean (or median) and standard deviation of the parameters. This can readily be extended to any function of the parameters.

3. Commonly used Probability Density Functions (PDF)

3.1 Uniform Distribution

$$p(x; a, b) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b \\ 0, & x < a,\ x > b \end{cases}$$

$$\mu = \frac{b+a}{2}, \qquad \sigma^2 = \frac{(b-a)^2}{12}$$

This simple distribution is used as a tool in studies of general continuous distributions and is particularly valuable in non-parametric statistics, e.g., generating random values from a specific PDF, as explained in the following.

For any given function, $y = F(x)$, the PDFs of x and y are related by

$$|p(x)\,dx| = |p(y)\,dy| \qquad \text{(fundamental transformation law of probability)}$$

Now consider the specific case:

$$p(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & x < 0,\ x > 1 \end{cases}$$

then (for 0 < x < 1),

$$x = \int_0^x p(x')\,dx' = \int_{F(0)}^{F(x)} p(y')\,dy'$$

This states that the cumulative distribution of $p(y)$ is uniformly distributed. The utility is best shown by example (see problem set):

Consider a stellar IMF, $\xi(M) \sim M^{-2.35}$ between 1 and 100 Msun. Calculate the cumulative PDF,

$$CDF(M) = \frac{\int_1^M \xi(M')\,dM'}{\int_1^{100} \xi(M')\,dM'} = C\,(1 - M^{-1.35})$$

where $C = (1 - 100^{-1.35})^{-1}$ is a constant that normalizes the CDF to unity at 100 Msun. This has a uniform distribution, so we generate a random set of uniformly distributed numbers $\{x_0, x_1, x_2, \ldots\}$ and invert to a mass distribution,

$$\{M_i\} = (1 - \{x_i\}/C)^{-1/1.35}$$
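The inversion above translates directly into code. The following is a sketch of the recipe for the same power-law IMF between 1 and 100 Msun; the sample size, seed and variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

alpha = 2.35                 # IMF slope, xi(M) ~ M^-alpha
m_hi = 100.0                 # upper mass limit in Msun (lower limit is 1 Msun)

# Normalization so that CDF(m_hi) = 1, with CDF(M) = C * (1 - M^-(alpha-1)).
C = 1.0 / (1.0 - m_hi ** (-(alpha - 1.0)))

# Uniform random numbers play the role of the CDF values ...
x = rng.uniform(0.0, 1.0, size=100_000)

# ... and inverting the CDF turns them into masses.
masses = (1.0 - x / C) ** (-1.0 / (alpha - 1.0))

print("min, max mass:", masses.min(), masses.max())
print("median mass  :", np.median(masses))
```

The median mass should sit where the CDF equals 0.5, i.e. at $(1 - 0.5/C)^{-1/1.35}$, which is a convenient sanity check on the inversion.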

3.2 Binomial Distribution

Recall the number of different ways n items can be taken x at a time is ("n choose x"):

$$\binom{n}{x} \equiv \frac{n!}{x!\,(n-x)!}$$

where n! = n(n-1)(n-2)…1 and 0! = 1.

Consider an observation with only 2 possible outcomes (e.g. red galaxies or blue galaxies; planet detection or non-detection). Let the probability of obtaining one outcome (red galaxy, planet detection = "success") be p, and the probability of obtaining the other outcome (blue galaxy, planet non-detection = "failure") be q = 1 - p. The probability of obtaining x successes in n observations is = (# of ways to get to x successes) × (probability of one such set of x successes):

$$f(x; n, p, q) \equiv \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$$

This does indeed add up to 1, as it should:

$$\sum_{x=0}^{n} \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x} = (p+q)^n = 1^n = 1$$

Mean

$$\mu \equiv \int_{-\infty}^{\infty} x\,f(x)\,dx \;\longrightarrow\; \sum_{x=0}^{n} \left[x\,\frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}\right]$$

$$\mu = \sum_{x=0}^{n} \frac{n!}{x!\,(n-x)!}\left[p\,\frac{\partial}{\partial p}\,p^x\right] q^{n-x} = p\,\frac{\partial}{\partial p}\left\{\sum_{x=0}^{n}\left[\frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}\right]\right\}$$

$$\mu = p\,\frac{\partial}{\partial p}\,(p+q)^n = p\,n\,(p+q)^{n-1} = np$$

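Variance

$$\sigma^2 \equiv \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx \;\longrightarrow\; \sum_{x=0}^{n}\left[x^2\,\frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}\right] - \mu^2$$

$$\longrightarrow\; \left(p\,\frac{\partial}{\partial p}\right)\left(p\,\frac{\partial}{\partial p}\right)\sum_{x=0}^{n}\left[\frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}\right] - \mu^2$$

$$\longrightarrow\; \left(p\,\frac{\partial}{\partial p}\right)\left[p\,n\,(p+q)^{n-1}\right] - \mu^2$$

$$\longrightarrow\; pn + p^2 n(n-1) - (np)^2 = np(1-p) = npq$$

E.g., suppose we roll ten dice. What is the probability that x dice land with the 1 up? If we throw one die, the probability of landing with 1 up is p = 1/6. If we throw 10 dice, the probability for x of them landing with 1 up is given by the binomial distribution with n = 10 and p = 1/6:

$$p\left(x; n=10, p=\tfrac{1}{6}, q=\tfrac{5}{6}\right) = \frac{10!}{x!\,(10-x)!}\left(\frac{1}{6}\right)^x\left(\frac{5}{6}\right)^{10-x}$$

$$\mu = np = 10 \times \tfrac{1}{6} = 1.67, \qquad \sigma = \sqrt{npq} = \sqrt{10 \times \tfrac{1}{6} \times \tfrac{5}{6}} = 1.18$$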
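The dice example can be evaluated with `scipy.stats.binom`. This is just a sketch of one way to do it; the variable names are illustrative.

```python
from scipy.stats import binom

n, p = 10, 1.0 / 6.0          # ten dice, probability of a "1" on each
dist = binom(n, p)            # frozen binomial distribution

# Probability that exactly x of the ten dice show a 1.
for x in range(5):
    print(x, dist.pmf(x))

total = sum(dist.pmf(x) for x in range(n + 1))   # should add up to 1
print("sum of probabilities:", total)
print("mean  = n p        :", dist.mean())
print("sigma = sqrt(n p q):", dist.std())
```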

3.3 Poisson Distribution

The Binomial distribution gets hard to evaluate for large n (because of the factorial) and often in such experiments, neither the number of possible events n nor the probability p is known. We need an expression that tells us about the statistics of having detected an average number of events per time interval (μ = np).

Example: you are using a Geiger counter to measure the emissions from a block of radioactive material. You don't know the total number of atoms (= n, the number of trials) or the decay probability (= p), but you do measure the mean count rate μ. You want to know the probability distribution associated with μ.

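Let the average count rate at which photons arrive be λ per second. Let $P(x, \lambda t)$ be the probability of x photons arriving during an interval t. Then the probability of 1 photon arriving in dt is

$$P(1, \lambda\,dt) = \lambda\,dt$$

for very small dt. The probability of ≥ 2 arriving is negligibly small if dt is small enough. So:

$$P(0, \lambda\,dt) = 1 - P(1, \lambda\,dt) - P(2, \lambda\,dt) - P(3, \lambda\,dt) - \ldots = 1 - \lambda\,dt$$

Now consider an arbitrary number of counts in time interval (t + dt), which can be written as 2 terms, based on what happened as the "last" event (photon arrived or no photon arrived):

$$P(x, \lambda(t+dt)) = P(x-1, \lambda t)\,P(1, \lambda\,dt) + P(x, \lambda t)\,P(0, \lambda\,dt)$$

$$P(x, \lambda t) + \frac{dP(x, \lambda t)}{dt}\,dt = P(x-1, \lambda t)\,\lambda\,dt + P(x, \lambda t)\,(1 - \lambda\,dt)$$

$$\frac{dP(x, \lambda t)}{dt} = \lambda\,P(x-1, \lambda t) - \lambda\,P(x, \lambda t)$$

The solution to this differential equation is

$$P(x, \lambda t) = \frac{(\lambda t)^x}{x!}\,e^{-\lambda t}$$

Setting μ = λt gives us the Poisson distribution

$$p(x; \mu) = \frac{\mu^x}{x!}\,e^{-\mu}$$

where x = # of events (integer number) and μ = λt is the mean number of events expected in the interval.

Let's check that this is properly normalized:

$$\sum_{x=0}^{\infty} f(x; \mu) = \sum_{x=0}^{\infty} \frac{\mu^x}{x!}\,e^{-\mu} = e^{-\mu}\sum_{x=0}^{\infty} \frac{\mu^x}{x!} = e^{-\mu} e^{\mu} = 1$$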

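Mean

$$\mu \equiv \int_{-\infty}^{\infty} x\,f(x)\,dx \;\longrightarrow\; \sum_{x=0}^{\infty}\left[x\,\frac{\mu^x}{x!}\,e^{-\mu}\right] = e^{-\mu}\sum_{x=0}^{\infty}\left[x\,\frac{\mu^x}{x!}\right]$$

$$= e^{-\mu}\left\{0 + \sum_{x=1}^{\infty}\left[\frac{\mu^x}{(x-1)!}\right]\right\} = e^{-\mu}\left\{\mu\sum_{x=1}^{\infty}\left[\frac{\mu^{x-1}}{(x-1)!}\right]\right\} = e^{-\mu}\,\mu\,e^{\mu} = \mu$$

Variance

$$\sigma^2 \equiv \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx \;\longrightarrow\; \sum_{x=0}^{\infty}\left[x^2\,\frac{\mu^x}{x!}\,e^{-\mu}\right] - \mu^2$$

$$\longrightarrow\; e^{-\mu}\,\mu\,(\mu e^{\mu} + e^{\mu}) - \mu^2 = \mu$$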
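The normalization, mean and variance derived above can be verified numerically by summing the Poisson probabilities directly. The value of `mu` below is an arbitrary assumed example, and the sum is truncated at a point where the remaining terms are negligible.

```python
import numpy as np
from scipy.stats import poisson

mu = 3.7                      # hypothetical mean number of counts
x = np.arange(0, 100)         # terms beyond x ~ 100 are negligible for this mu
pmf = poisson.pmf(x, mu)

total = pmf.sum()                        # normalization
mean = (x * pmf).sum()                   # sum of x p(x)
var = (x**2 * pmf).sum() - mean**2       # sum of x^2 p(x) minus mu^2

print("sum of p(x):", total)
print("mean       :", mean)
print("variance   :", var)
```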

Famous result that $\sigma = \sqrt{\mu}$, e.g., if we detect N photons, then the error is $\pm\sqrt{N}$.

Note that some care is required for very low count rates, where N = 0 can occur commonly. When N = 0, it would be silly to say the uncertainty is also 0. The uncertainty in N counts is the square root of the expected number of counts, $\sigma(N) = \sqrt{\mu}$ where $\mu = \langle N \rangle$.

3.4 Gaussian (or Normal) Distribution

The Normal distribution is an approximation to the binomial distribution for the limiting case where the number of possible different outcomes is large and the probability of success for each is finitely large, so np >> 1. It is also the limiting case for the Poisson distribution when μ becomes large.

$$p(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

It is the most important distribution in statistics!

It is left as an exercise to the reader to show that the expression is indeed properly normalized, that the mean is μ and the variance is $\sigma^2$.

Binomial and Poisson PDFs tend toward Gaussians as their mean increases (μ ≥ 20).

You'll see references to the "log-normal distribution": this is when the log of a variable has a Gaussian (aka normal) distribution.

3.4.1 Central Limit Theorem

Suppose that n independent random variables, $x_i$, of unknown probability density function are identically distributed with the same mean μ and variance $\sigma^2$ (both finite). As n becomes large, the distribution of $\bar{x} = \frac{1}{n}\sum x_i$ tends to a Gaussian distribution with mean μ and variance $\sigma^2/n$. Closely related to the law of large numbers (which states only that $\bar{x}$ converges to μ), it allows quantitative probabilities to be estimated in experimental situations involving an average.

3.4.2 Confidence Limits of the Gaussian Distribution

The probability that a measurement will fall within ±nσ of the mean is

$$P(n, \sigma) = \int_{\mu-n\sigma}^{\mu+n\sigma} p(x; \mu, \sigma)\,dx$$

    n    P(μ-nσ < x < μ+nσ)
    1    68.27%
    2    95.45%
    3    99.73%
    4    99.9937%
    5    99.999943%
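The entries in this table follow from the Gaussian cumulative distribution. A small sketch using `scipy.stats.norm` (the dictionary names are illustrative) reproduces them, along with the one-tailed probabilities of exceeding +nσ:

```python
from scipy.stats import norm

two_tailed = {}   # probability of landing within +/- n sigma of the mean
one_tailed = {}   # probability of landing more than n sigma above the mean

for n in range(1, 6):
    two_tailed[n] = norm.cdf(n) - norm.cdf(-n)
    one_tailed[n] = norm.sf(n)            # sf = 1 - cdf, accurate in the far tail
    print(n, f"{100 * two_tailed[n]:.6f}%", f"{one_tailed[n]:.3e}")
```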

In other words, for a Gaussian, 100 ± 20 means:
- There is a 68% probability that 80 ≤ μ ≤ 120 (with a 16% chance that μ is larger and a 16% chance that μ is smaller)
- There is a 95.5% chance that 60 ≤ μ ≤ 140
- There is a 99.7% chance that 40 ≤ μ ≤ 160

Note that the probability table is for a "2-tailed" probability, e.g. if we want to know the probability of another trial landing within a given range. We also care about the "1-tailed" probability.

For example: what is the chance that the 100 ± 20 result is consistent with 0? The separation from 0 is (5 × standard deviation), so the chance is

$$\frac{1 - P(5\sigma)}{2} = \frac{1 - 0.99999943}{2} = \frac{5.7 \times 10^{-7}}{2}$$

We say that 100 ± 20 is a "5-sigma measurement."

100 ± 50 is a 2σ measurement and it is consistent with 0. The 2σ (95.5% confidence) upper limit is 100 + (2 × 50) = 200. This means that the measurement is ≤ 200 at the 97.7% confidence level.

By convention (though this is field-dependent), measurements which are < 5σ are treated with caution, and those which are < 3σ are seen as consistent with zero. Why such stringent limits as 3-5σ? Because any observer

  • Is biased, e.g. terminated the observation when the expected result was found.
  • Can't estimate σ, e.g. chose a "nice quiet piece" of the data or a "source-free region" to get the background.

You can similarly calculate confidence intervals for different distributions such as Binomial and Poisson.

Example: Detection of a source and measurement of its brightness.

We measure 101 counts in a 1'' radius aperture centered on the source and 1800 background counts in an annulus ranging from 1'' to 5'' radius.

(a) How confident are we that there is a source there?

We need to assess how different the source is from the background, i.e. is the source statistically distinct from just fluctuations in the background counts?

$$\text{Expected background in } 1'' \text{ aperture} = 1800 \times \frac{\pi (1'')^2}{\pi[(5'')^2 - (1'')^2]} = \frac{1800}{24} = 75 \text{ counts}$$

$\sigma_{bg}(1'') = \sqrt{75} = 8.66$ from Poisson statistics. (We can use the Gaussian limit of Poisson here because we have > 20 counts.)

Net counts above background = 101 - 75 = 26

Significance of detection above background = 26/8.66 = 3.0σ

Confidence = (99.73 + 0.27/2) = 99.87% (the extra 0.135 is because it's single-tailed)

(b) How bright is it?

Total flux = $101 \pm \sqrt{101}$

$$\text{Background in } 1'' = (1800 \pm \sqrt{1800}) \times \frac{\pi (1'')^2}{\pi[(5'')^2 - (1'')^2]} = \frac{1800 \pm 42.4}{24} = 75 \pm 1.8 \text{ cts}$$

Net flux = 101 - 75 = 26

Uncertainty on net flux = $\sqrt{101 + 1.8^2} = 10.2$ (only marginally larger than $\sqrt{101}$ because of the uncertainty in the sky background)

Significance of measurement = 26/10.2 = 2.5σ
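The arithmetic of this example is easy to script. The sketch below simply repeats the steps of parts (a) and (b) with the numbers given above; the variable names are illustrative.

```python
import math

src_counts = 101.0          # counts in the 1'' radius aperture
bkg_counts = 1800.0         # counts in the 1''-5'' annulus
r_ap, r_in, r_out = 1.0, 1.0, 5.0   # radii in arcsec

# Scale the annulus counts to the aperture area (the factors of pi cancel).
area_ratio = r_ap**2 / (r_out**2 - r_in**2)          # = 1/24
bkg_in_ap = bkg_counts * area_ratio                  # expected background: 75
net = src_counts - bkg_in_ap                         # 26

# (a) Detection: compare the excess with fluctuations of the background alone.
det_sig = net / math.sqrt(bkg_in_ap)

# (b) Flux: include the Poisson error on the total counts and on the
#     scaled background estimate.
bkg_err = math.sqrt(bkg_counts) * area_ratio
net_err = math.sqrt(src_counts + bkg_err**2)
flux_sig = net / net_err

print(f"background in aperture = {bkg_in_ap:.1f} +/- {bkg_err:.1f}")
print(f"detection significance = {det_sig:.2f} sigma")
print(f"net flux = {net:.0f} +/- {net_err:.1f} ({flux_sig:.2f} sigma)")
```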

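It is more difficult to measure the flux than to determine existence. E.g. measuring existence is only a 1-bit measurement (yes vs no), whereas to measure the flux you need more information.

3.4.3 Bivariate Gaussian Distribution

As its name suggests, it's the joint Gaussian distribution of two variables.

$$p(x, y; \mu_x, \mu_y, \sigma_x, \sigma_y, \rho) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}\,\exp\left[-\frac{z^2}{2(1-\rho^2)}\right]$$

where

$$z^2 = \frac{(x-\mu_x)^2}{\sigma_x^2} + \frac{(y-\mu_y)^2}{\sigma_y^2} - \frac{2\rho\,(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y}$$

and the Pearson correlation

$$\rho = Cor(x, y) = \frac{Var(x, y)}{\sigma_x\sigma_y}$$

with the covariance

$$Var(x, y) = \sigma_{xy}^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i-\mu_x)(y_i-\mu_y)$$

In matrix notation,

$$p(x, y; \mu_x, \mu_y, \sigma_x, \sigma_y, \rho) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}\,\exp\left(-\frac{1}{2}\,D^T C^{-1} D\right)$$

where

$$D = \begin{pmatrix} x-\mu_x \\ y-\mu_y \end{pmatrix}, \qquad C = \begin{pmatrix} \sigma_x^2 & \sigma_{xy}^2 \\ \sigma_{xy}^2 & \sigma_y^2 \end{pmatrix} = \text{"covariance matrix"}$$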

Shape looks more circular for small ρ, more elliptical for large ρ.

[Figure (textbook excerpt, Figure 4.11): contours of the bivariate normal PDF in the x-y plane. Left panel: positive correlation coefficient ρ. Right panel: negative correlation coefficient ρ.]

Can readily generalize to > 2 variables.

4. Maximum Likelihood of Gaussian Variables

Suppose we have N data points $y_i$ with errors $\sigma_i$ that are independent of each other and Gaussian distributed about $\bar{y}$, which you do not know but want to find out. The probability of any $y_i$ is

$$P(y_i)\,\Delta y = \frac{1}{\sigma_i\sqrt{2\pi}}\,e^{-\frac{(y_i-\bar{y})^2}{2\sigma_i^2}}\,\Delta y$$

The probability of all N $y_i$'s is just the product of the individual probabilities

$$P = \prod_{i=1}^{N} P(y_i)\,\Delta y = \prod_{i=1}^{N} \frac{1}{\sigma_i\sqrt{2\pi}}\,e^{-\frac{(y_i-\bar{y})^2}{2\sigma_i^2}}\,\Delta y$$

Define the likelihood as $\mathcal{L} \equiv -2\ln P$, which for our case is:

$$\mathcal{L} = \sum_{i=1}^{N} \frac{(y_i-\bar{y})^2}{\sigma_i^2} + 2\sum_{i=1}^{N} \ln\left(\sigma_i\sqrt{2\pi}\right) - 2N\ln\Delta y$$

The principle of maximum likelihood is that the most probable estimate of $\bar{y}$ occurs when $\mathcal{L}$ is minimized.

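4.1 Weighted Mean

The most likely value of $\bar{y}$ is at the minimum value of $\mathcal{L}$:

$$\frac{\partial\mathcal{L}}{\partial\bar{y}} = \sum_{i=1}^{N} \frac{\partial}{\partial\bar{y}}\left[\frac{(y_i-\bar{y})^2}{\sigma_i^2}\right] = -2\sum_{i=1}^{N} \frac{(y_i-\bar{y})}{\sigma_i^2} = 0$$

$$\Rightarrow \sum_{i=1}^{N} \frac{y_i}{\sigma_i^2} = \bar{y}\sum_{i=1}^{N} \frac{1}{\sigma_i^2}$$

$$\Rightarrow \bar{y} = \frac{\sum_{i=1}^{N} w_i y_i}{\sum_{i=1}^{N} w_i} \quad \text{where the weights } w_i = 1/\sigma_i^2$$

aka "inverse-variance weighting". Note that for equal weights, $\sigma_i = \sigma$,

$$\bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i \quad \text{as before}$$

The error on the weighted mean comes from our previous result on error propagation.

$$\sigma_{\bar{y}}^2 = \sum_{i=1}^{N}\left(\frac{\partial\bar{y}}{\partial y_i}\right)^2 \sigma_i^2 = \sum_{i=1}^{N} \sigma_i^2\left[\frac{\partial}{\partial y_i}\left(\frac{\sum_{j=1}^{N} w_j y_j}{\sum_{j=1}^{N} w_j}\right)\right]^2 = \sum_{i=1}^{N} \frac{\sigma_i^2}{\left(\sum_{j=1}^{N} w_j\right)^2}\left[\frac{\partial}{\partial y_i}\left(\sum_{j=1}^{N} w_j y_j\right)\right]^2$$

$$= \sum_{i=1}^{N} \frac{\sigma_i^2 w_i^2}{\left(\sum_{j=1}^{N} w_j\right)^2} = \frac{\sum_{i=1}^{N} w_i}{\left(\sum_{j=1}^{N} w_j\right)^2} = \frac{1}{\sum_{i=1}^{N} w_i}$$

In other words,

$$\frac{1}{\sigma_{\bar{y}}^2} = \sum_{i=1}^{N} \frac{1}{\sigma_i^2}$$

If $\sigma_i = \sigma$ (data all have the same uncertainties), then $\sigma_{\bar{y}} = \sigma/\sqrt{N}$ as before.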
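In code, inverse-variance weighting is a two-line calculation. The measurements and errors below are made-up numbers used only to illustrate the formulas for $\bar{y}$ and $\sigma_{\bar{y}}$.

```python
import numpy as np

# Hypothetical repeated measurements of the same quantity and their errors.
y = np.array([10.2, 9.8, 10.5, 9.9])
sigma = np.array([0.2, 0.1, 0.4, 0.2])

w = 1.0 / sigma**2                       # inverse-variance weights
ybar = np.sum(w * y) / np.sum(w)         # weighted mean
ybar_err = 1.0 / np.sqrt(np.sum(w))      # error on the weighted mean

print(f"weighted mean = {ybar:.3f} +/- {ybar_err:.3f}")
print("numpy check   =", np.average(y, weights=w))
```

Note that the error on the weighted mean is always smaller than the smallest individual error, since every additional measurement adds to the sum of weights.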

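4.2 Linear Regression or Weighted Least Squares

Now suppose we have measured 2 quantities in pairs, $\{x_i, y_i\}$. We think that $y = a + bx$. How do we derive a and b from our data?

Note: this form is not as restrictive as it seems. We can transform lots of relations into this form, e.g. $y = Ax^{\alpha} \Rightarrow \log y = \log A + \alpha\log x$, and then perform the analysis on log y and log x.

$$\mathcal{L} = \sum_{i=1}^{N} \frac{(y_i - a - b x_i)^2}{\sigma_i^2} + 2\sum_{i=1}^{N} \ln\left(\sigma_i\sqrt{2\pi}\right) - 2N\ln\Delta y$$

$$\frac{\partial\mathcal{L}}{\partial a} = 0 \;\Longrightarrow\; \sum\frac{y_i}{\sigma_i^2} = a\sum\frac{1}{\sigma_i^2} + b\sum\frac{x_i}{\sigma_i^2}$$

$$\frac{\partial\mathcal{L}}{\partial b} = 0 \;\Longrightarrow\; \sum\frac{x_i y_i}{\sigma_i^2} = a\sum\frac{x_i}{\sigma_i^2} + b\sum\frac{x_i^2}{\sigma_i^2}$$

Two linear equations in two unknowns, a and b, with solution

$$a = \frac{1}{\Delta}\left(\sum\frac{x_i^2}{\sigma_i^2}\sum\frac{y_i}{\sigma_i^2} - \sum\frac{x_i}{\sigma_i^2}\sum\frac{x_i y_i}{\sigma_i^2}\right), \qquad b = \frac{1}{\Delta}\left(\sum\frac{1}{\sigma_i^2}\sum\frac{x_i y_i}{\sigma_i^2} - \sum\frac{x_i}{\sigma_i^2}\sum\frac{y_i}{\sigma_i^2}\right)$$

where the denominator,

$$\Delta = \sum\frac{1}{\sigma_i^2}\sum\frac{x_i^2}{\sigma_i^2} - \left(\sum\frac{x_i}{\sigma_i^2}\right)^2$$

The uncertainties in a and b follow from error propagation,

$$\sigma_a^2 = \sum\left(\frac{\partial a}{\partial y_i}\right)^2\sigma_i^2 = \frac{1}{\Delta}\sum\frac{x_i^2}{\sigma_i^2}, \qquad \sigma_b^2 = \sum\left(\frac{\partial b}{\partial y_i}\right)^2\sigma_i^2 = \frac{1}{\Delta}\sum\frac{1}{\sigma_i^2}$$

Note 1: numpy.polyfit with deg=1 does this (pass w = 1/σ_i for the weights).

Note 2: don't throw away upper limits; they have information. We're getting ahead of ourselves, but modern folks use a Bayesian approach to line fitting that fits "censored" data: Kelly 2007, ApJ, 665, 1489.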
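The closed-form solution can be written out and compared with `numpy.polyfit`. The sketch below uses synthetic data generated from an assumed line (intercept 1, slope 2) with assumed per-point errors; only the formulas for a, b, Δ and their uncertainties come from the derivation above.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic data: a hypothetical straight line with heteroscedastic errors.
x = np.linspace(0.0, 10.0, 20)
sigma = rng.uniform(0.3, 1.0, size=x.size)
y = 1.0 + 2.0 * x + rng.normal(0.0, sigma)

# Weighted sums that appear in the normal equations.
w = 1.0 / sigma**2
S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()

delta = S * Sxx - Sx**2
a = (Sxx * Sy - Sx * Sxy) / delta        # intercept
b = (S * Sxy - Sx * Sy) / delta          # slope
sigma_a = np.sqrt(Sxx / delta)
sigma_b = np.sqrt(S / delta)

# Cross-check: polyfit expects weights 1/sigma (not 1/sigma^2) and
# returns the highest power first, i.e. (slope, intercept).
b_np, a_np = np.polyfit(x, y, deg=1, w=1.0 / sigma)

print(f"a = {a:.3f} +/- {sigma_a:.3f}   (polyfit: {a_np:.3f})")
print(f"b = {b:.3f} +/- {sigma_b:.3f}   (polyfit: {b_np:.3f})")
```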

4.3 Linear Correlation

In the absence of any hypothesis, any knowledge, or anything better to do, we often correlate $y_i$ against $x_i$ in the hope of discovering some New and Universal Truth.

Before doing so, you should ask:
- Does the eye see anything? If not, stop unless trying to disprove some hypothesis.
- Is the apparent correlation due to a selection effect? Common mistakes are flux limits.
- Apply the Rule of Thumb. Does the correlation go away if you place your thumb over some of the data?

Use the Pearson correlation, defined earlier for bivariate Gaussian distributions,

$$\rho \text{ or } r = \frac{\sigma_{xy}^2}{\sigma_x\sigma_y}$$

r = -1 or +1 means the data perfectly fit a straight line.

  • "r" measures the degree of linear correlation between 2 variables without knowledge of the errors. It should not be used for a non-linear relationship. (Check for this by plotting the data and taking a look!)
  • No distinction between x & y in the formula. Either can be the dependent/independent variable.
  • Assumes variables are approximately normally distributed.
  • Tells nothing about the slope of the best fitting line, only the degree of correlation.

For finite-sized data sets, one can have a finite r even for uncorrelated data, and vice-versa, simply due to the uncertainty in "r". One can show that

$$\sigma_r = \frac{1 - r^2}{\sqrt{N-1}}$$

Note that $\sigma_r$ CANNOT be used directly to indicate the significance of a correlation and/or whether one observed correlation is significantly stronger than another. It only tells the error on the measurement of r in this sample.

4.3.1 Student's t distribution

In the case where x & y form a 2-dimensional Gaussian about their mean values, we can test whether the observed "r" is consistent with a parent population with no correlation (r = 0). To do so, use the quantity

$$t \equiv r\,\sqrt{\frac{N-2}{1-r^2}}$$

which is distributed in the no-correlation case as the Student's t-distribution with $\nu = N - 2$ degrees of freedom:

$$f(t; \nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}$$

where the gamma function is defined as

$$\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt$$

and is just the factorial function extended to non-integers. For integer x, Γ(x+1) = x!

$f(t; \nu)$ has μ = 0, $\sigma^2 = \nu/(\nu-2)$ for ν > 2.

In this case, we want the probability for the 2-tailed distribution, $\int_{-t}^{t} f(t'; \nu)\,dt'$. If we already knew the sign of the correlation, we would use the 1-tail $\int_{-\infty}^{t} f(t'; \nu)\,dt'$.

Figure from Wikipedia: https://en.wikipedia.org/wiki/Student%27s_t-distribution

Python has this coded up of course…

Example: suppose we find r = 0.5 for N = 10 data points.

$$t = 0.5\,\sqrt{\frac{8}{0.75}} = 1.63$$

    from scipy.stats import t
    c = t.cdf(1.63, 8)               # -> 0.93 => 7% chance that t would be higher in a one-sided test
    sig = 100 * (1 - 2 * (1 - c))    # -> 86% significance in a two-tailed test

Student was the pseudonym of W. S. Gosset (1876-1937). He was a chemist who worked for the Guinness Brewery in Dublin, Ireland, and developed the t-test to study the quality of brewing ingredients. Guinness did not allow their chemists to publish their findings (to keep their competitors from learning they were employing statisticians), hence the pseudonym.

4.3.2 Fisher z-transformation

With the same assumption that x & y are drawn from a two-dimensional Gaussian distribution, we can test whether the difference of two nonzero "r" values is significant for data sets with N > 10, e.g. whether a change in some control variable significantly alters an existing correlation between two variables. Use Fisher's z-transformation:

$$z = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right) = \text{arctanh}(r)$$

This converts the Pearson coefficient to an approximately normally distributed variable, z, with a mean value of

$$\bar{z} = \frac{1}{2}\left[\ln\left(\frac{1+\rho}{1-\rho}\right) + \frac{\rho}{N-1}\right]$$

and a standard deviation of

$$\sigma_z \approx \frac{1}{\sqrt{N-3}}$$

where ρ is the true value of the correlation coefficient for the parent population (to be tested against the measured "r"). We can use the Gaussian probability tables for this and also to test the significance of a difference in two values of "r". You can also assume a ρ, create a Gaussian z, and invert (r = tanh z) to create confidence intervals for r for large samples.

4.4 Chi-squared

We have discussed probability distributions and their statistics. Now we discuss whether a particular distribution and model actually fit the data. Define a "badness of fit" metric:

$$\chi^2 = \sum_{i=1}^{N}\left[\frac{y_i - y(x_i, a_j)}{\sigma_i}\right]^2$$

where $y(x_i, a_j)$ is a general function of $x_i$ with model parameters $a_j$.

Note that for Poisson variables, $\sigma_i = \sqrt{y(x_i, a_j)}$, but in order to get confidences from $\chi^2$, we need $y(x_i, a_j) > 20$ for all $x_i$. If that is not possible, then you need to bin up the data (sum the $y_i$) over enough $x_i$ to make it so.

The procedure then is to adjust the model parameters, $a_j$, until the $\chi^2$ is minimized (i.e. least bad) → $\chi^2_{\min}$. This then maximizes the likelihood.

# of degrees of freedom = ν = (N - M), where

    N = # of data points
    M = # of model parameters

4.4.1 Distribution of chi-squared

$$p(\chi^2; \nu) = \frac{(\chi^2)^{\frac{\nu-2}{2}}}{2^{\nu/2}\,\Gamma\left(\frac{\nu}{2}\right)}\,e^{-\chi^2/2}$$

where $\mu = \nu$, $\sigma^2 = 2\nu$, $\nu = N - M$. The distribution goes to a Gaussian as N → ∞.

4.4.2 Goodness of fit

For the best fit, the expected value is

$$\chi^2_{\min} = (N-M) \pm \sqrt{2(N-M)} = \nu \pm \sqrt{2\nu}$$

So if it isn't, we know we have a bad fit. This can come from

  • Wrong model: $\chi^2 \gg \nu$
  • Wrong measurement errors: $\chi^2 \ll \nu$ (too pessimistic), $\gg \nu$ (too optimistic)
  • Data are not normally distributed: $\chi^2 \gg \nu$

This is often written as the "reduced $\chi^2$" = $\tilde{\chi}^2_{\min} = \chi^2_{\min,\nu} = \chi^2_{\min}/\nu$

$$\tilde{\chi}^2_{\min} = 1 \pm \sqrt{\frac{2}{N-M}}$$

so the best fit occurs around $\tilde{\chi}^2_{\min} \approx 1$, though the scatter around this depends on the DOF.

4.4.3 Confidence Limits for Goodness of Fit

What we need is the probability of observing $\chi^2 > \chi^2_{\min}$:

$$P(>\chi^2_{\min}; \nu) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\int_{\chi^2_{\min}}^{\infty} (\chi^2)^{(\nu-2)/2}\,e^{-\chi^2/2}\,d(\chi^2)$$

See scipy.stats.chi2 and use it in the same way as for the Student-t distribution above.
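A minimal sketch of that calculation follows. The values of `chi2_min` and `nu` are assumed example numbers, not results from the notes.

```python
import numpy as np
from scipy.stats import chi2

chi2_min = 30.0     # hypothetical best-fit chi-squared
nu = 20             # degrees of freedom, N - M

# Probability of a chi-squared at least this large if the model is correct.
# sf (survival function) is 1 - cdf, i.e. the integral from chi2_min to infinity.
p_exceed = chi2.sf(chi2_min, nu)

print("reduced chi-squared     :", chi2_min / nu)
print("expected scatter about 1:", np.sqrt(2.0 / nu))
print("P(> chi2_min; nu)       :", p_exceed)
```

A small `p_exceed` points to a poor fit (or underestimated errors), while a value very close to 1 suggests overestimated errors.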

4.4.4 Confidence Limits for the Fitted Parameters (OBSOLETE)

We typically want to identify a region in the M-dimensional space of the parameters $\{a_j\}$ about the best fit that contains a given percentage of the total probability distribution, e.g. "there is a 99% chance that the true parameter values are in this region." The observer gets to pick the confidence value, but certain ones are common, e.g., 68% (1-sigma), 95% (2-sigma), etc. Note that these values are a convention tied to a Gaussian distribution, even though the parameters are often not normally distributed.

Conventionally, one chooses $\{a_j\}$ such that $\chi^2 = \chi^2_{\min} + \Delta\chi^2(CL, \#\text{parameters})$. $\Delta\chi^2$ is distributed as $\chi^2$ with $\nu = M$ (= number of parameters) degrees of freedom, NOT with $\nu = N - M$ (= the actual degrees of freedom of the fit).

You can marginalize this to a subset $\{a_j\}$ of M' parameters. Then $\Delta\chi^2$ is distributed as $\chi^2$ with $\nu = I$ (= number of interesting parameters) degrees of freedom.

Example: M = 1 (i.e. only fitting 1 parameter).

In this case, the chi-square distribution for ν = 1 is the same as the distribution of the square of a single normally distributed (i.e. Gaussian) variable.

    Δχ² < 1 occurs 68.3% of the time (1 sigma for normal distribution)
    Δχ² < 4 occurs 95.5% of the time (2 sigma)
    Δχ² < 9 occurs 99.7% of the time (3 sigma)

Example: M = 2

1-sigma CL:

    for a1 is [X1, X2]
    for a2 is [Y1, Y2]

The 1-sigma CL on both parameters is the purple ellipse, i.e. 68.3% of all determinations of a1 and a2 should lie in the ellipse.
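The Δχ² thresholds for a given confidence level and number of parameters come from inverting the χ² cumulative distribution. The sketch below tabulates them for one and two parameters using `scipy.stats.chi2.ppf`; the dictionary name is illustrative.

```python
from scipy.stats import chi2

# Confidence levels corresponding to 1, 2 and 3 sigma for a Gaussian.
levels = [0.6827, 0.9545, 0.9973]

delta_chi2 = {}
for n_params in (1, 2):
    # ppf is the inverse of the cdf: the Delta chi^2 enclosing each level.
    delta_chi2[n_params] = [float(chi2.ppf(cl, n_params)) for cl in levels]
    print(n_params, [round(v, 2) for v in delta_chi2[n_params]])
```

For one parameter this recovers Δχ² = 1, 4, 9; for two parameters jointly the 68.3% contour sits at Δχ² ≈ 2.3, which is why a joint confidence ellipse is larger than the individual 1-sigma ranges.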

CAUTION: very limited application in real astrophysical situations. See "Dos and Don'ts of Reduced Chi-Squared" by Andrae et al. (arXiv:1012.3754). Use Bayesian methods instead.

4.4.5 Significance of Adding Another Parameter

We often want to know if it is necessary to add another parameter to a model we have been fitting to our data. E.g., there might be reddening both in the Milky Way and in the host galaxy of a supernova. $\chi^2$ will of course be lower because the additional freedom enables us to get the model curve closer to the data. But is the decrease significant?

Suppose we have two models with $\chi_n^2$ and $\chi_m^2$ and n and m DOF, respectively. There are two useful statistics to test if the $\chi^2$ values between the 2 models are significantly different.

(1) The F statistic:

$$F_{n,m} \equiv \frac{\chi_n^2/n}{\chi_m^2/m}$$

follows the F probability distribution:

$$p_{n,m}(x) = \frac{\Gamma\left(\frac{n+m}{2}\right) n^{n/2}\,m^{m/2}}{\Gamma(n/2)\,\Gamma(m/2)}\;\frac{x^{\frac{n}{2}-1}}{(m + nx)^{(n+m)/2}}$$

which has mean and variance

$$\mu = \frac{m}{m-2}, \qquad \sigma^2 = \frac{2m^2(m+n-2)}{n(m-2)^2(m-4)}$$

We want the probability of observing such a large F value:

$$P(>F; n, m) = \int_F^{\infty} p_{n,m}(x)\,dx$$

Caution: the definition of the statistic does not distinguish between experiment "1" and "2". So we can form 2 statistics, one the reciprocal of the other ($F_{12}$ and $F_{21}$). Both are distributed according to the F distribution. Typically, we test both, checking that $F_{12}$ is not too large and $F_{21}$ is not too small.

As usual, python is your friend: scipy.stats.f
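A short sketch of the F calculation with `scipy.stats.f` is given below. The two χ² values and their degrees of freedom are assumed example numbers; the last lines illustrate the point about the reciprocal statistic.

```python
from scipy.stats import f

# Hypothetical fits: chi-squared and degrees of freedom for the two models.
chi2_n, n = 45.0, 20
chi2_m, m = 25.0, 19

F = (chi2_n / n) / (chi2_m / m)

# Probability of an F value at least this large (upper-tail integral).
p_large = f.sf(F, n, m)

# The reciprocal statistic 1/F follows the F distribution with the degrees
# of freedom swapped, so "F too large" is the same event as "1/F too small".
p_small_recip = f.cdf(1.0 / F, m, n)

print("F              :", F)
print("P(> F; n, m)   :", p_large)
print("P(< 1/F; m, n) :", p_small_recip)
print("mean of F dist :", f.mean(n, m), "vs m/(m-2) =", m / (m - 2))
```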

CAVEAT: Two important conditions must be satisfied to use this stuff (Protassov et al. 2002, ApJ, 571, 545; read the summary, section 6):

1. The two models you are comparing must be "nested", i.e. the allowed parameter values of one model must be a subset of those of the other model. (See also Freeman et al. 1999, ApJ, 524, 53.) E.g. you cannot compare a blackbody model with a synchrotron emission model, but you can compare the goodness of fit for each with $\chi^2$ fitting.

2. Zero values of the additional parameters must not be on the boundary of possible parameters. E.g. you cannot compare 2 point sources with 1, nor test the detection of an emission line in a spectrum. Neither case allows for negative flux.

Legitimate uses:
- broken power law vs single power law
- non-solar vs solar abundance
- comparing variances of 2 samples

4.4.6 Pros and Cons of Chi-squared

Pros:

  • Most people have heard of this, some even accept the results :)
  • Since additive by definition, different samples can be tested all at once.
  • Automatically gives an estimate of whether the model is acceptable.

Cons:

  • Data must usually be binned → loss of information.
  • Data must be normally distributed.
  • If data do not agree with the model, cannot tell which direction is off.
  • Cannot be used with small samples (<~20).
  • See Andrae et al. 2010 (arXiv:1012.3754)

5. Rank Tests

A general set of tests comes from replacing the values of N pairs of measurements $(x_i, y_i)$ with their ranks $(R_i, S_i)$, e.g. if we have 4 pairs

    (x_i, y_i) = (1,3) (5,0) (3,2) (4,1)

then we have

    (R_i, S_i) = (1,4) (4,1) (2,3) (3,2)

For simplicity, assume no ties. (See Numerical Recipes for the more general case.)

Why do this? Because ranks are drawn from a uniform distribution between 1 and N, with each rank occurring only once (assuming no ties). From the uniform distribution: $\bar{R} = \bar{S} = (N+1)/2$, $\sigma_R^2 = \sigma_S^2 = (N-1)^2/12$, etc. (The variance here uses the continuous uniform formula from section 3.1; the exact value for the integer ranks 1…N is $(N^2-1)/12$.)

We have therefore transformed from variables with unknown distributions to ones with known distributions. There is some loss of information in replacing the original numbers by their ranks, but not much. And the statistics of ranks are more robust than statistics of the original variables, just as the median is more robust than the mean (and slightly noisier).

5.2.1 Spearman Rank Correlation Coefficient

Defined to be the linear correlation coefficient of the ranks:

$$r_S = \frac{\sum_i (R_i - \bar{R})(S_i - \bar{S})}{\sqrt{\sum_i (R_i - \bar{R})^2}\,\sqrt{\sum_i (S_i - \bar{S})^2}} = 1 - 6\,\frac{\sum_i (R_i - S_i)^2}{N^3 - N}$$

(the last step holds if there are no ties, because the values of $R_i$ and $S_i$ are then known)
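Both forms of $r_S$ can be checked on the four pairs used as the ranking example in section 5. The sketch below computes the ranks with `scipy.stats.rankdata`, evaluates the two expressions, and compares them with `scipy.stats.spearmanr`; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import rankdata, spearmanr

# The four (x, y) pairs from the ranking example.
x = np.array([1, 5, 3, 4])
y = np.array([3, 0, 2, 1])

R = rankdata(x)     # ranks of x: 1, 4, 2, 3
S = rankdata(y)     # ranks of y: 4, 1, 3, 2
N = len(x)

# Definition: linear correlation coefficient of the ranks.
dR, dS = R - R.mean(), S - S.mean()
r_s_def = np.sum(dR * dS) / np.sqrt(np.sum(dR**2) * np.sum(dS**2))

# Shortcut valid when there are no ties.
r_s_short = 1.0 - 6.0 * np.sum((R - S) ** 2) / (N**3 - N)

# Library value (the second return value is the p-value).
r_s_scipy, p_value = spearmanr(x, y)

print(R, S)
print(r_s_def, r_s_short, r_s_scipy)
```

For these data y decreases every time x increases, so all three values come out at -1.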

Obviously when x & y are correlated, R & S will be too. As before, $-1 \le r_S \le +1$. A high value indicates significant correlation. To test the level of significance, one can actually calculate the distribution explicitly for small N when R & S are uncorrelated. If N > 50, one can compute

$$t \equiv r_S\,\sqrt{\frac{N-2}{1-r_S^2}}$$

which is distributed according to Student's t statistic, with N - 2 degrees of freedom. See scipy.stats.spearmanr

5.2.2 Kolmogorov-Smirnov (K-S) Test

Used for unbinned data that are a continuous function of a single variable, i.e. a list of values, e.g. the distribution of protoplanetary disk masses. Determines whether a sample agrees with a function ("one-sided/sample") or whether 2 samples come from the same parent population ("two-sided/sample").

Pros:
-- no loss of information
-- can be used for very small samples

Cons:
-- cannot be used for parameter estimation

The test compares the cumulative distribution of ranks:

  • Rank your sample in ascending order of x.
  • Calculate $S_N(x)$, where

$$S_N(x) = \begin{cases} 0, & x < x_1 \\ r/N, & x_r \le x < x_{r+1} \\ 1, & x \ge x_N \end{cases}$$

    and N is the size of the sample. $S_N(x)$ is the fraction of the data with values not exceeding x.
  • If you have two samples, then calculate $S_{N_1}(x)$ and $S_{N_2}(x)$.
  • If you have a function, then calculate $F(x) \equiv \int_{-\infty}^{x} f(y)\,dy$.
  • The K-S statistic is

$$D_1 = \max_{-\infty < x < \infty} |S_N(x) - F(x)| \quad \text{(one sample)}, \qquad D_2 = \max_{-\infty < x < \infty} |S_{N_1}(x) - S_{N_2}(x)| \quad \text{(two samples)}$$

What makes the K-S statistic useful is that its distribution in the case of data drawn from the same (unknown) distribution can be calculated. And it's all here, ready to plug n' play: scipy.stats.kstest
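As an illustration of the one- and two-sample cases, the sketch below draws two synthetic Gaussian samples (assumed parameters and seed), runs `scipy.stats.kstest` and `scipy.stats.ks_2samp`, and also builds the one-sample statistic by hand from the definition of $S_N(x)$.

```python
import numpy as np
from scipy.stats import kstest, ks_2samp, norm

rng = np.random.default_rng(7)
sample1 = rng.normal(0.0, 1.0, size=200)   # hypothetical sample, standard normal
sample2 = rng.normal(0.5, 1.0, size=200)   # hypothetical sample, shifted mean

# One sample vs a function: is sample1 consistent with a standard normal?
d1, p1 = kstest(sample1, "norm")

# Two samples: are sample1 and sample2 from the same parent population?
d2, p2 = ks_2samp(sample1, sample2)

# Manual one-sample statistic. S_N jumps at each data point, so the largest
# gap must be checked just after (r/N) and just before ((r-1)/N) each jump.
xs = np.sort(sample1)
n = xs.size
F = norm.cdf(xs)
d_manual = max(np.max(np.arange(1, n + 1) / n - F),
               np.max(F - np.arange(0, n) / n))

print("one-sample D, p:", d1, p1, "(manual D:", d_manual, ")")
print("two-sample D, p:", d2, p2)
```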

6. Parting Comments

  • Don't hide data.
  • Try to use distribution-free tests.
  • Lots of tests to choose from, but not all are equally powerful for a given application.
  • Don't get too enamored/lost in the world of statistics. You are budding astronomers, not budding statisticians.
  • USE COMMON SENSE.