Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST. PHYSTAT2003. S. Digel (Stanford Univ./HEPL). 10 September 2003

Page 1: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST
PHYSTAT2003
S. Digel (Stanford Univ./HEPL)
10 September 2003

Page 2: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Outline

Introduction
Gamma-ray astro- & astroparticle physics
Important points about gamma-ray astronomy
GLAST mission
LAT instrument design and nature of the data
LAT in perspective
Analysis needs from low to high level
Statistical issues
Some current approaches

Page 3: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Motivation: Wealth of astro- and astroparticle physics

Extragalactic
  Blazars – most of their luminosity is in gamma rays
  Other active galaxies – Centaurus A
  Galaxy clusters?
  Isotropic emission?
  Gamma-ray bursts
In the Milky Way
  Pulsars, binary pulsars, millisecond pulsars, plerions
  Microquasars, microblazars
  Supernova remnants, OB/WR associations, black holes?
  Diffuse – cosmic rays interacting with interstellar gas and photons
  WIMP annihilation?
Solar flares

Common theme (except for WIMPs): Nonthermal emission, particle acceleration in jets and shocks

Crab pulsar & nebula (CXC)

M87 jet (STScI)

Page 4: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Some important points about gamma-ray astronomy

In the range up to ~50 GeV, the detector must be in space
  In terms of the particle background, mass & power limitations, cost, review committees, etc., space is the last place you want to put it
  Among other compromises, the collecting area and data rate are limited

You need one of these

Page 5: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Important points (2)

The angular response is really bad (for physics reasons)
On the other hand the field of view is truly enormous (the detector is not really a telescope)
Celestial fluxes are low (except for impulsive GRBs)
  Photon number fluxes typically ~E^-2
The Milky Way is a bright, pervasive foreground
  ~10% of flux at low latitudes is from point sources

PSF: Chandra ~1"; LAT (100 MeV) 12000"; LAT (10 GeV) 360"
FOV: Chandra 8 × 10^-5 sr; LAT 2.2 sr
γ-ray rates in LAT: bright point source 1/minute; entire FOV 2 Hz
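
As a rough illustration of what an E^-2 photon number spectrum implies, the worked integral below assumes a pure power law with an arbitrary normalization K (a symbol introduced here) and ignores the energy dependence of the instrument response:

```latex
\frac{dN}{dE} = K E^{-2}
\quad\Rightarrow\quad
N(>E_0) = \int_{E_0}^{\infty} K E^{-2}\, dE = \frac{K}{E_0},
\qquad
\frac{N(>10\ \mathrm{GeV})}{N(>100\ \mathrm{MeV})}
  = \frac{0.1\ \mathrm{GeV}}{10\ \mathrm{GeV}} = 10^{-2}.
```

Under these assumptions only about 1% of the photons above 100 MeV lie above 10 GeV, which is where the PSF is narrowest.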

Page 6: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Large Area Telescope on GLAST

20 MeV to >300 GeV
Launch in late 2006
5-year design life (10-year goal)

Spectrum Astro

Page 7: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Design of the LAT for gamma-ray detection

Tracker: 18 XY tracking planes with interleaved W conversion foils. Single-sided silicon strip detectors (228 μm pitch). Measure the photon direction; gamma ID.
Calorimeter: 1536 CsI(Tl) crystals in 8 layers; PIN photodiode readouts. Image the shower to measure the photon energy.
Anticoincidence Detector (ACD): 89 plastic scintillator tiles. Reject background of charged cosmic rays; segmentation limits self-veto at high energy.
Electronics System: Includes flexible, robust hardware trigger and software filters.

Diagram labels: e+ e–, Tracker, Calorimeter, ACD

Page 8: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

LAT in perspective

Within its first few weeks, the LAT will double the number of celestial gamma rays ever detected

Instrument   Years     Ang. Res.    Ang. Res.   Energy Range   Aeff Ω      # Gamma Rays
                       (100 MeV)    (10 GeV)    (GeV)          (cm² sr)
OSO-3        1967–68   18°          –           >0.05          1.9         621
SAS-2        1972–73   7°           –           0.03–10        40          ~10,000
COS-B        1975–82   7°           –           0.03–10        40          ~2 × 10^5
EGRET        1991–00   5.8°         0.5°        0.03–10        750         1.4 × 10^6
AGILE        2005–     4.7°         0.2°        0.03–50        1500        4 × 10^6/yr
GLAST LAT    2006–     3.5°         0.1°        0.02–300       25,000      1 × 10^8/yr

Page 9: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

The Gamma-Ray Sky

Panels: EGRET (>100 MeV); Simulated LAT (>100 MeV, 1 yr); Simulated LAT (>1 GeV, 1 yr)

Page 10: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Nature of the LAT Data

Events are readouts of TKR hits, TOT, ACD tiles, and CAL crystal energy depositions, along with time, position, and orientation of the LAT
Intense charged particle background & limited bandwidth for telemetry → data are extremely filtered
  ~3 kHz trigger rate
  30 Hz filtered event rate, ~3 Gbyte/day raw data, ~2 × 10^5 gamma rays/day

T. Usher
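
The quoted rates can be cross-checked with a few lines of arithmetic. The sketch below only combines the numbers on this slide; the constant and function names are made up for illustration, and 1 Gbyte is assumed to mean 10^9 bytes.

```python
# Rates quoted on the slide.
TRIGGER_RATE_HZ = 3_000.0   # ~3 kHz hardware trigger rate
FILTERED_RATE_HZ = 30.0     # filtered (downlinked) event rate
RAW_BYTES_PER_DAY = 3e9     # ~3 Gbyte/day, assuming 1 Gbyte = 1e9 bytes
GAMMAS_PER_DAY = 2e5        # ~2 x 10^5 gamma rays/day
SECONDS_PER_DAY = 86_400.0


def data_budget():
    """Return derived quantities implied by the quoted rates."""
    events_per_day = FILTERED_RATE_HZ * SECONDS_PER_DAY
    return {
        "filter_rejection_factor": TRIGGER_RATE_HZ / FILTERED_RATE_HZ,
        "events_per_day": events_per_day,
        "bytes_per_event": RAW_BYTES_PER_DAY / events_per_day,
        "gamma_rate_hz": GAMMAS_PER_DAY / SECONDS_PER_DAY,
        "gamma_fraction_of_downlink": GAMMAS_PER_DAY / events_per_day,
    }


if __name__ == "__main__":
    for name, value in data_budget().items():
        print(f"{name}: {value:.4g}")
```

With these inputs the average gamma-ray rate comes out a little above 2 Hz, in line with the "entire FOV 2 Hz" figure quoted earlier, and gamma rays make up under a tenth of the downlinked events.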

Page 11: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Analysis needs

Reconstruction and classification of events
  Charged particles vs. gamma-rays
  Quality of reconstruction of energy, direction
Detection and characterization of celestial sources of gamma rays
  Locations, spectra, variability & transient alerts, angular extents
Identification of sources & population studies
  Counterparts and correlations

Arrow label: Increasing level

Page 12: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Reconstruction of events

Pattern recognition
  Starting with clusters of hits in TKR, find straightest, longest e± tracks using a combinatorial (brute force) approach
Track fitting via Kalman filtering
  Multiple scattering is not Gaussian
  Iterative with energy reconstruction from CAL
Vertexing
  Find the conversion point (for gamma rays) and energy/direction
Issues: I’d guess they are in hand. Much experience with track finding algorithms in the collaboration. Would like better energy estimates from scattering.

Jones & Tompkins (1998)
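
To make the Kalman-filter idea concrete, here is a minimal sketch of a two-parameter (position, slope) filter for a straight track in one projection. It is not the LAT reconstruction code: the function name, the vague prior, and the per-plane `scatter_var` are assumptions, and multiple scattering is treated as Gaussian even though the slide notes that it is not.

```python
def kalman_line_fit(z_planes, hits, meas_var, scatter_var):
    """Filter hits on successive planes into a (position, slope) estimate.

    z_planes    : plane coordinates along the track axis, in increasing order
    hits        : measured transverse coordinate at each plane
    meas_var    : variance of a single hit measurement
    scatter_var : variance added to the slope at each plane (Gaussian
                  stand-in for multiple scattering; hypothetical value)
    """
    # Vague prior: state (y, slope) = (0, 0) with a very large covariance.
    y, s = 0.0, 0.0
    p00, p01, p11 = 1e6, 0.0, 1e6

    prev_z = z_planes[0]
    for z, m in zip(z_planes, hits):
        d = z - prev_z
        # Predict: straight-line transport, P -> F P F^T with F = [[1, d], [0, 1]].
        y = y + d * s
        p00 = p00 + 2.0 * d * p01 + d * d * p11
        p01 = p01 + d * p11
        if d > 0.0:
            p11 = p11 + scatter_var  # scattering kick at the arriving plane
        # Update with the position measurement (H = [1, 0]).
        innov_var = p00 + meas_var
        k0, k1 = p00 / innov_var, p01 / innov_var
        resid = m - y
        y += k0 * resid
        s += k1 * resid
        # P -> (I - K H) P; the right-hand side uses the pre-update values.
        p00, p01, p11 = (1.0 - k0) * p00, (1.0 - k0) * p01, p11 - k1 * p01
        prev_z = z
    return y, s, (p00, p01, p11)


if __name__ == "__main__":
    planes = [0.0, 1.0, 2.0, 3.0, 4.0]
    hits = [1.0 + 0.5 * z for z in planes]  # noise-free toy track
    print(kalman_line_fit(planes, hits, meas_var=0.01, scatter_var=0.0))
```

A larger `scatter_var` makes the filter trust recent hits more, which is how the energy dependence of scattering would enter; that coupling is why the slide describes the fit as iterative with the calorimeter energy.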

Page 13: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Classification of events

Classification trees for PSF & energy ‘pruning’ and charged particle rejection (W. Atwood)
  Trained with Monte Carlo data
  Must provide useful inputs; can’t make the tree do all the work
    In LAT case, this has meant ‘flattening’ inputs to factor out general trends with energy and inclination angle.
  Outputs are probabilities, e.g., of good energy measurements
Issues (general): Exploring relevant inputs, optimizing classification without tuning to the training data sets

Figure: A ‘tree’ represented in Insightful Miner (W. Atwood)

Page 14: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Higher-level analysis: source detection and characterization

Low fluxes, pervasive celestial diffuse emission, and limited angular resolution drive the analysis to model fitting
The detector is characterized by its response functions
  PSF, energy resolution, and effective collecting area
  They depend on incident direction, energy, plane of conversion, etc.
  Derived from beam tests and the detailed instrument simulation

Page 15: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Detection & characterization (2)

The Milky Way is the strongest ‘source’.
Many point sources are transient and detected over a few weeks only

Figures: 3EG catalog (Hartman et al. 1999); EGRET (>100 MeV)

Page 16: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Detection and characterization (3)

Models are straightforward to define – radiative transfer is simple

$$I(x, y, E) = I_{MW}(x, y, E) + \sum_i F_i(E)\,\delta(x - x_i,\, y - y_i)$$

Data-space version not as simple, but manageable
Likelihood analysis is widely used in γ-ray astronomy & we plan to use it for the standard high-level analysis tool for LAT data
  Introduced by Pollock et al. (1981) for analysis of COS-B data, also used extensively for EGRET analysis.
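
For orientation, the unbinned Poisson log-likelihood usually written for photon-counting data is sketched below. The symbols are assumptions introduced here rather than taken from the slide: $M(x, y, E; \theta)$ is the sky model folded through the response functions and exposure, $\theta$ stands for the model parameters, $(x_j, y_j, E_j)$ are the measured photons, and $N_{\rm pred}$ is the total number of predicted counts.

```latex
\ln L(\theta) = \sum_{j=1}^{N_{\rm obs}} \ln M(x_j, y_j, E_j; \theta) - N_{\rm pred}(\theta),
\qquad
N_{\rm pred}(\theta) = \int M(x, y, E; \theta)\, dx\, dy\, dE .
```

The integral is the normalization term, which is where the exposure calculation enters.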

Page 17: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

…Specific issues for the LAT analysis

Computation of the likelihood function
  Sensible level of detail in the high-level response functions – not too much and not too little
  Binned vs. unbinned analysis, multidimensional normalization (aka exposure)
Practical optimization of multiparameter models & likelihood analysis tool for general use
Scanning observations – smearing of sources
Albedo cuts, residual charged-particle background
Systematic errors?

Figure (Carnahan): Earth is not small and we can’t see through it

Page 18: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

…Specific issues for the LAT analysis (2)

Interpretation of the unbinned likelihood function in likelihood ratio tests
  Protassov et al. (2002) reminder about LRTs not being valid for determining the number of components in a finite mixture model, i.e., evaluating whether a source is present
  Can we cover ourselves using simulations?

Page 19: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Example of why it matters: Galactic Center (3EG J1746-2851)

Recent re-analysis of EGRET data
  Unbinned to use detailed response functions
If the new analysis is actually better, the source is now not coincident with the Galactic center itself
Many plausible candidates exist, even if dark matter annihilation may no longer be one of them

Figure labels: Hooper & Dingus (2002); >5 GeV γ-rays; Yusef-Zadeh (2002); 20 cm radio continuum; Sgr A*; CS (2-1) line; 3EG source confidence region; Arches cluster (~150 O stars); Sgr A East

Page 20: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Higher-level characterization: Variability

A common characteristic, especially in extragalactic sources, but has been hard to study
  LAT will at least have much better sampling in time and inflight monitoring of calibration should be much easier

Figure: McLaughlin et al. (1996), EGRET >100 MeV fluxes; panels: Pulsar, Blazar, Unident. (variable), Unident. (steady)

Page 21: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Variability (2)

At least 3 statistics have been published; interpretations are not always consistent
  Differences in how to incorporate upper limits
  see Nolan et al. (2003)
Issues: Variability index useful for classification, a useful ‘trigger’ for issuing alerts

Figure: Variability measures for unid. sources compared, Reimer (2001)
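
A common starting point for a variability measure is the chi-square of a set of flux measurements against a constant flux. The sketch below shows only that ingredient; it is not any one of the published statistics, it ignores upper limits entirely (the very point of disagreement noted above), and the function name and sample fluxes are hypothetical.

```python
def constant_flux_chi2(fluxes, errors):
    """Chi-square of flux measurements against their weighted-mean (constant) flux.

    Returns (weighted_mean, chi2, degrees_of_freedom).
    """
    if len(fluxes) != len(errors) or len(fluxes) < 2:
        raise ValueError("need at least two fluxes with matching errors")
    weights = [1.0 / (e * e) for e in errors]
    mean = sum(w * f for w, f in zip(weights, fluxes)) / sum(weights)
    chi2 = sum(w * (f - mean) ** 2 for w, f in zip(weights, fluxes))
    return mean, chi2, len(fluxes) - 1


if __name__ == "__main__":
    # Hypothetical fluxes and 1-sigma errors in arbitrary units.
    fluxes = [4.1, 3.8, 9.5, 4.4, 3.9]
    errors = [0.6, 0.7, 1.1, 0.5, 0.8]
    mean, chi2, dof = constant_flux_chi2(fluxes, errors)
    print(f"weighted mean = {mean:.2f}, chi2 = {chi2:.1f} for {dof} d.o.f.")
```

A chi-square much larger than the number of degrees of freedom suggests the constant-flux hypothesis is a poor description, which is the kind of signal a variability index or alert trigger would build on.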

Page 22: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Variability (3)

Issues for Gamma-Ray Bursts
  Analysis issues – extensively explored – are for time series, e.g., pulse decomposition
  LAT analysis will not be BATSE-like involving count rates and background subtraction
  Deadtime will be an issue for the most interesting bursts

Figure (Norris & Bonnell): Distribution of times between gamma rays for the 20th brightest GRB per year, with the LAT limit marked

Page 23: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Variability: Periodic sources

Rotation-powered pulsars
  Established methods exist to find upper limits on pulsation (with and without ephemerides)
    Some implicitly assume a profile shape
  Blind searches (some pulsars are radio quiet)
    Problem: no template for pulse profile
    Various statistical methods have been developed: Epoch folding, FFT, Gregory & Loredo (1992-96) Bayesian
    Also need to search position, period, period derivative and hope for no glitches
Issues: Probably in good hands
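
Of the methods listed, epoch folding is the simplest to show in a few lines: fold the photon arrival times at a trial period, histogram the phases, and test the histogram against a flat profile. The sketch below uses a toy data generator with a made-up pulse shape and ignores the period derivative and position searches mentioned above; all names and numbers are hypothetical.

```python
import random


def epoch_fold_chi2(times, period, n_bins=10):
    """Pearson chi-square of the folded phase histogram against a flat profile."""
    counts = [0] * n_bins
    for t in times:
        phase = (t / period) % 1.0
        counts[min(int(phase * n_bins), n_bins - 1)] += 1
    expected = len(times) / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)


def simulate_pulsed_times(period, n_events, pulsed_fraction, seed=0):
    """Hypothetical toy data: a Gaussian pulse at phase 0.3 on a flat background."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_events):
        if rng.random() < pulsed_fraction:
            phase = rng.gauss(0.3, 0.05) % 1.0
        else:
            phase = rng.random()
        cycle = rng.randrange(10_000)  # sparse photons spread over many rotations
        times.append((cycle + phase) * period)
    return times


if __name__ == "__main__":
    true_period = 0.1  # seconds, arbitrary
    times = simulate_pulsed_times(true_period, n_events=2000, pulsed_fraction=0.5)
    for trial in (true_period, 1.037 * true_period):
        print(f"trial period {trial:.4f} s: chi2 = {epoch_fold_chi2(times, trial):.1f}")
```

In a blind search the chi-square would be evaluated over a grid of trial periods, so the number of trials has to be accounted for when quoting a significance.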

Page 24: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Beyond model fitting: Non-parametric analysis

Likelihood will answer only the questions that you ask
An ideal nonparametric analysis method would
  Characterize extended sources & obviate need for a detailed model of the Milky Way
  Do it quickly
Many methods are in use in astronomy
  Wavelet (platelet, wedgelet) approaches for image analysis or time series (‘denoising’, source detection – including extended sources)
  Multiscale analyses (wavelet transform or platelet image decomposition), with a prescription for deciding what terms are worth keeping (e.g., Willett & Nowak 2002 define the ‘penalized likelihood function’); ICA?
Issues: Interpretation of results (statistical significances); incorporating the detailed response functions

Figure: Terrier (2002); panels: EGRET (>100 MeV), Prototype CWT Analysis

Page 25: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Construction of the LAT source catalog

Issues: Criteria for inclusion, spurious sources
  For EGRET catalog criteria were conservative to cover estimated systematic uncertainties (>5σ for |b| < 10°)
  Spurious source rate
    Mattox et al. (1996) – simulation of distribution of likelihood test statistic – effective beam size. ‘Trials factor’
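
The ‘trials factor’ can be illustrated with the standard relation between a single-trial p-value and the chance of at least one false positive in N independent trials. This is a sketch under the assumption of independent trials; the function names are invented here, and the number of independent trials on the sky (which depends on the effective beam size) is a hypothetical input, not a value from the slide.

```python
import math


def one_sided_p_value(n_sigma):
    """Gaussian one-sided tail probability for a significance of n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def post_trials_p_value(p_single, n_trials):
    """Probability of at least one false positive in n_trials independent tests."""
    # 1 - (1 - p)^N, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n_trials * math.log1p(-p_single))


def expected_spurious(p_single, n_trials):
    """Expected number of spurious detections among n_trials independent tests."""
    return p_single * n_trials


if __name__ == "__main__":
    n_trials = 10_000  # hypothetical number of independent sky positions
    for n_sigma in (4.0, 5.0):
        p = one_sided_p_value(n_sigma)
        print(f"{n_sigma} sigma: single-trial p = {p:.3g}, "
              f"post-trials p = {post_trials_p_value(p, n_trials):.3g}, "
              f"expected spurious = {expected_spurious(p, n_trials):.3g}")
```

The same arithmetic shows why a catalog threshold has to be set well above the significance one would accept for a single pre-specified position.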

Page 26: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Source identification

Positional coincidence is not nearly good enough
  Source localization is poor (~1° for EGRET, ~several arcmins for LAT)
  Counterpart densities are high
Ideally, for an established population of sources, other information can be used (e.g., spectral hardness or correlated variability of the potential counterparts)
Issue: Quantitative assignments of confidence levels of association of sources, how to establish a new source class

Figures: Hartman et al. (1999); Sowards-Emmerd et al. (2003)
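
Why positional coincidence alone is weak can be seen from the chance probability of finding an unrelated object inside the error circle. The sketch below assumes candidate counterparts are scattered uniformly and independently (Poisson) on the sky and that the error region is a small flat circle; the surface density and the 3-arcminute LAT radius are hypothetical stand-ins, not values from the slide.

```python
import math


def chance_coincidence_probability(error_radius_deg, density_per_sq_deg):
    """P(at least one unrelated counterpart inside the error circle).

    Assumes candidates are Poisson-distributed on the sky with uniform surface
    density and that the error region is a small flat circle.
    """
    expected = math.pi * error_radius_deg ** 2 * density_per_sq_deg
    return -math.expm1(-expected)


if __name__ == "__main__":
    density = 0.1  # hypothetical candidates per square degree
    for label, radius_deg in (("EGRET-like, 1 deg", 1.0),
                              ("LAT-like, 3 arcmin", 0.05)):
        p = chance_coincidence_probability(radius_deg, density)
        print(f"{label}: chance coincidence probability = {p:.3g}")
```

Since the probability scales with the area of the error region for small values, shrinking the localization radius helps quadratically, but dense counterpart populations can still leave the association ambiguous.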

Page 27: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Population correlations

Weaker than finding counterparts
Correlations of gamma-ray sources with SN/OB associations were noted already in COS-B era (e.g., Montmerle 1979)
Recent work on correlations of unidentified EGRET sources
  Supernova remnants, OB associations, WR stars, pulsars (e.g., Romero et al. 1999)
  Galaxy clusters (e.g., Colafrancesco 2002 + Kawasaki & Totani 2002; Scharf & Mukherjee 2002 correlate clusters with EGRET data directly; Reimer et al. 2003 ‘stack’ the obs.)
Issues: Characterization of populations to enable useful correlations, validation via simulation

Figure: Reimer et al. (2003)

Page 28: Statistical Issues in High-Energy Gamma-Ray Astronomy for GLAST

Conclusions

Great advances in gamma-ray astronomy can be expected with GLAST
Maximizing the scientific return will require addressing the statistical issues at every level in the data analysis

Figures: EGRET, Phases 1-5, >100 MeV; LAT simulation