Real-time Classification for The Palomar Transient Factory Joseph Richards UC Berkeley Department of Astronomy Department of Statistics j[email protected] Mini-Workshop on Computational AstroStatistics

Real-time Classification for The Palomar Transient Factory · 2010. 8. 24. · Experiment Science Goals Setup 5-day SNe Ia, CC SNe, AGNs, 60 sec ... The Berkeley Transient Classi


Real-time Classification for The Palomar Transient Factory

Joseph Richards

UC Berkeley

Department of Astronomy

Department of Statistics

[email protected]

Mini-Workshop on Computational AstroStatistics


Transient Classification Project (TCP)

J. Richards Classification for PTF 2


Outline

1 Palomar Transient Factory (PTF)

2 Transient Classification Project (TCP)
  • DotAstro.org
  • Features for Classification

3 Light Curve Classification using Diffusion Map
  • Supernova Classification Challenge
  • Diffusion Map Classifier for PTF?


Palomar Transient Factory (PTF)

• Fully-automated survey to explore the optical transient sky

• Palomar 48” telescope (P48): 8.1 square-degree FoV, 101 megapixel CCD array, 2 filters

Experiment | Science Goals | Setup
5-day Cadence | SNe Ia, CC SNe, AGNs, QSOs, Novae, etc. | 60 sec. exposures; 8000 deg² / year
Dynamic Cadence | Unexplored (fast) regimes; RR Lyr, CVs, flare stars, SNe, etc. | 60 sec. exposures; 1 min. - 5 day cadences
Orion Field | transiting planets | single field
Hα | Hα sky survey | narrow-band on/off Hα

• Follow-up of detected transients by Palomar 60” (P60) and telescopes at other facilities

• Follow-up decisions based on source classifications


Palomar Transient Factory (PTF)

Law et al. (2009, PASP, 121, 1395)


Transient Classification Project (TCP)

• For optimal allocation of follow-up resources, need accurate, real-time classifications of transient events

• Some challenges:

  1 Need labeled data to train & validate our classifier


TCP: DotAstro.org

www.DotAstro.org: 100,000 sources, 150 classes

Astronomer-classified objects (from over 100 papers)


Transient Classification Project (TCP)

• For optimal allocation of follow-up resources, need accurate, real-time classifications of transient events

• Some challenges:

  1 Need labeled data to train & validate our classifier
  2 Highly multi-class problem with nested classifications


TCP: DotAstro.org Classification Taxonomy


Transient Classification Project (TCP)

• For optimal allocation of follow-up resources, need accurate, real-time classifications of transient events

• Some challenges:

  1 Need labeled data to train & validate our classifier
  2 Highly multi-class problem with nested classifications
  3 Light curves are noisy, irregularly sampled, and may include non-detections

Question: What information should we extract from the light curves to train our classifier?


TCP: Features for Classification

TCP currently uses:

1 Features derived from time-series light curves
  • flux quantiles, skewness, slope
  • period, amplitude of largest peaks of Lomb-Scargle periodogram
  • light-curve fitting parameters
  • color information

2 “Context” information
  • galactic latitude & longitude
  • distance from ecliptic plane
  • distance to nearest galaxy, properties of that galaxy

Many possible features, lots of missing data, high levels of noise on some features!
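
As a rough illustration of the time-series features listed on this slide, the sketch below computes flux quantiles, skewness, a linear slope, and the period and amplitude of the highest Lomb-Scargle peak for one irregularly sampled light curve. The function names `lomb_scargle` and `light_curve_features`, the frequency grid, and the synthetic data are assumptions made for this example; it is not the TCP feature code.

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Classical Lomb-Scargle power and best-fit sinusoid amplitude per frequency."""
    yc = y - y.mean()
    omega = 2.0 * np.pi * freqs[:, None]            # shape (n_freq, 1)
    # The phase offset tau makes the sine and cosine terms orthogonal.
    tau = np.arctan2(np.sin(2 * omega * t).sum(axis=1),
                     np.cos(2 * omega * t).sum(axis=1))[:, None] / (2 * omega)
    c = np.cos(omega * (t - tau))
    s = np.sin(omega * (t - tau))
    a = (c @ yc) / (c * c).sum(axis=1)              # cosine coefficient
    b = (s @ yc) / (s * s).sum(axis=1)              # sine coefficient
    power = 0.5 * ((c @ yc) * a + (s @ yc) * b) / yc.var()
    return power, np.hypot(a, b)

def light_curve_features(t, flux, freqs):
    """Hypothetical feature vector for a single light curve."""
    q10, q50, q90 = np.percentile(flux, [10, 50, 90])
    z = (flux - flux.mean()) / flux.std()
    power, amp = lomb_scargle(t, flux, freqs)
    best = np.argmax(power)                          # highest periodogram peak
    return {
        "q10": q10, "median": q50, "q90": q90,
        "skewness": np.mean(z ** 3),
        "slope": np.polyfit(t, flux, 1)[0],
        "period": 1.0 / freqs[best],
        "amplitude": amp[best],
    }

# Synthetic, irregularly sampled sinusoid (assumed period 2.5 days, amplitude 0.8).
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 200))
flux = 10.0 + 0.8 * np.sin(2 * np.pi * t / 2.5) + rng.normal(0.0, 0.05, t.size)
features = light_curve_features(t, flux, np.linspace(0.01, 2.0, 4000))
print(features)
```

Non-detections and missing context features are not handled here; a real pipeline would need an explicit policy for both.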


Light Curve Classification using Diffusion Map


Light Curve Classification using Diffusion Map

• Idea: First map each light curve, $x$, into $m$-dimensional diffusion space: $x \mapsto \{\psi_1(x), \ldots, \psi_m(x)\}$

• Use the diffusion map coordinates as features for a classifier

• Advantages:
  • All we need is distance between each pair of light curves, $s(x_i, x_j)$
  • Exploits the underlying sparse structure of the data set
  • Can avoid estimating physical parameters of each light curve
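
The mapping from pairwise distances to diffusion coordinates can be sketched as follows. This is a minimal, textbook-style diffusion map under stated assumptions: a Gaussian kernel with a hand-picked bandwidth `eps`, a single diffusion step `t`, and a toy 1-D data set. The function name `diffusion_map` and all parameter values are hypothetical, not taken from the talk.

```python
import numpy as np

def diffusion_map(dist, eps, m=2, t=1):
    """Return the first m nontrivial diffusion coordinates from a distance matrix."""
    K = np.exp(-dist ** 2 / eps)                 # Gaussian affinity between objects
    d = K.sum(axis=1)                            # degree of each object
    A = K / np.sqrt(np.outer(d, d))              # symmetric conjugate of the Markov matrix
    evals, evecs = np.linalg.eigh(A)
    order = np.argsort(evals)[::-1]              # sort eigenpairs, largest first
    evals, evecs = evals[order], evecs[:, order]
    psi = evecs / evecs[:, [0]]                  # right eigenvectors of the Markov matrix
    return (evals[1:m + 1] ** t) * psi[:, 1:m + 1]

# Toy example: two well-separated groups of points on a line.
x = np.concatenate([np.linspace(-0.5, 0.5, 20), np.linspace(2.5, 3.5, 20)])
dist = np.abs(x[:, None] - x[None, :])           # stands in for s(x_i, x_j)
coords = diffusion_map(dist, eps=1.0, m=2)
print(coords[:3, 0], coords[-3:, 0])             # the two groups take opposite signs
```

Only `dist` depends on the data type, so the same function applies to any objects for which a pairwise distance can be defined.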


Supernova Classification Challenge

• DES SN Photometric Classification Challenge to test supernova light curve classification methods (Kessler et al. 2010, arXiv:1008.1024)

• griz SN light curves simulated from templates

• 1300 labeled SNe (Ia / II / Ibc), used to classify 18,000 SNe

• Our entry: Find sparse structure in the SN database, exploit low-dimensional representation to build classifier

• InCA Group team: JWR, D. Homrighausen, C. Schafer, and P. Freeman


SN Photometric Classification: Methods

Data: Noisy realizations from SN LC templates in each of 4 filters

Our Approach:

1 Fit regression spline in each filter of each SN

2 Distances between SNe: weighted $\ell_2$ distance between normalized spline fits

3 Apply standard classification method using diffusion coords.
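
Steps 1 and 2 of the approach can be sketched roughly as below. Everything specific here is an assumption for illustration: a cubic truncated-power regression spline with fixed knots, normalization by the peak fitted flux across filters, uniform per-filter weights, and toy Gaussian-shaped light curves. The slides do not specify these choices.

```python
import numpy as np

def spline_basis(t, knots):
    """Cubic truncated-power basis evaluated at times t."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t, t ** 2, t ** 3]
    cols += [np.clip(t - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_regression_spline(t, flux, knots, grid):
    """Least-squares spline fit to one filter, evaluated on a common time grid."""
    coef, *_ = np.linalg.lstsq(spline_basis(t, knots), flux, rcond=None)
    return spline_basis(grid, knots) @ coef

def normalized_fits(light_curve, knots, grid):
    """light_curve maps filter name -> (times, fluxes); fits are scaled by the peak."""
    fits = {band: fit_regression_spline(t, f, knots, grid)
            for band, (t, f) in light_curve.items()}
    peak = max(np.max(np.abs(fit)) for fit in fits.values())
    return {band: fit / peak for band, fit in fits.items()}

def weighted_l2(fits_a, fits_b, weights):
    """Weighted l2 distance between two sets of normalized spline fits."""
    total = sum(w * np.mean((fits_a[band] - fits_b[band]) ** 2)
                for band, w in weights.items())
    return np.sqrt(total)

def toy_sn(width, scale, seed=0):
    """Hypothetical light curve: the same Gaussian bump in each of 4 filters."""
    rng = np.random.default_rng(seed)
    curve = {}
    for band in "griz":
        t = np.sort(rng.uniform(0.0, 60.0, 25))
        curve[band] = (t, scale * np.exp(-0.5 * ((t - 25.0) / width) ** 2))
    return curve

knots = [15.0, 30.0, 45.0]
grid = np.linspace(0.0, 60.0, 50)
weights = {band: 1.0 for band in "griz"}
fits = [normalized_fits(toy_sn(w, s), knots, grid)
        for w, s in [(8.0, 1.0), (8.0, 4.0), (15.0, 1.0)]]
d_same_shape = weighted_l2(fits[0], fits[1], weights)   # differs only in brightness
d_diff_shape = weighted_l2(fits[0], fits[2], weights)   # differs in width
print(d_same_shape, d_diff_shape)
```

Because of the normalization, two objects that differ only in overall brightness end up at essentially zero distance, while a change in light-curve shape gives a clearly nonzero distance.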


SN Photometric Classification: Results

Two-dimensional diffusion map representations of 18,000 SNe

Richards et al. (2010), in preparation


SN Photometric Classification: Results

SN Photometric Classification Challenge judged on both efficiency and purity of type Ia classifications.

$$\mathrm{FoM} = \frac{N_{\mathrm{Ia}}^{\mathrm{true}}}{N_{\mathrm{Ia}}^{\mathrm{Total}}} \times \frac{N_{\mathrm{Ia}}^{\mathrm{true}}}{N_{\mathrm{Ia}}^{\mathrm{true}} + 3\,N_{\mathrm{Ia}}^{\mathrm{false}}}$$

Submission | Description | Training FoM | Test FoM
this work | Diffusion Map with Random Forest Classifier | 0.810 | 0.236
Rodney | Template Fitting | 0.488 | 0.319
Gonzalez | Template Fitting | 0.528 | 0.169
SNANA cuts | Template Fitting | 0.582 | 0.221
Sako | Template Fitting | 0.802 | 0.506
MGU+DU-1 | Difference Boosting Neural Network on Light Curve Slopes | 0.385 | 0.079
MGU+DU-2 | Random Forest on Light Curve Slopes | 0.635 | 0.148
JEDI KDE | Kernel Density Estimation with 21 Parameters* | 0.946 | 0.371
JEDI Boost | Boosted Decision Trees with 21 Parameters* | 0.961 | 0.204
All Ia | Classify all SNe as Type Ia | 0.385 | 0.107

*light curve for each filter is fit to a modified Γ-distribution function with five parameters
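
The figure of merit is the product of an efficiency term (the fraction of all true Ia that are recovered) and a purity-like term in which each false Ia counts three times. A small sketch of the arithmetic follows; the function name `figure_of_merit` and the counts in the example are hypothetical and are not challenge results.

```python
def figure_of_merit(n_true, n_false, n_total, false_weight=3.0):
    """FoM = efficiency x pseudo-purity for type Ia classification.

    n_true:  objects classified as Ia that really are Ia
    n_false: objects classified as Ia that are not Ia
    n_total: all true Ia in the sample
    """
    denom = n_true + false_weight * n_false
    if n_total == 0 or denom == 0:
        return 0.0                                 # nothing to recover or nothing tagged
    efficiency = n_true / n_total
    pseudo_purity = n_true / denom
    return efficiency * pseudo_purity

# Hypothetical counts: 80 of 100 true Ia recovered, with 10 contaminants.
fom = figure_of_merit(n_true=80, n_false=10, n_total=100)
print(round(fom, 3))
```

With these made-up counts, the efficiency is 0.8 and the penalized purity is 80 / 110, so ten contaminants already pull the score well below the raw efficiency.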


SN Photometric Classification: Redshift


SN Photometric Classification: Predicted Class


SN Photometric Classification: Training vs. Test


SN Photometric Classification: SNR


SN Photometric Classification: No. of Obs.


SN Photometric Classification: Lessons Learned

• Diffusion map is a competitive method for SN classification

• Latent variables (e.g. redshift) are captured by the diffusion coordinates.
  • No time dilation or K-corrections!
  • Can be used as photo-z estimator

• Need more SN training data at high-z!
  • If spectral coverage is not possible, can we model the physics?

• Much care is needed in choosing s(·, ·)


Diffusion Map for PTF Classifier: Challenges

• Must deal with periodic (e.g. variable stars) and non-periodic (e.g. explosive events) light curves

• How do we construct a general distance measure, s(·, ·)?
  • Flux normalization?
  • Time zero-point?
  • Period folding?
  • Fourier space?


Summary

• Accurate, real-time transient classification is imperative for the PTF (and LSST)

• DotAstro.org contains expert-classified light curves for 100,000 sources

• Classification problem is highly multi-class and data are noisy and heterogeneous

• What are the optimal features for classification? Can we incorporate measurement errors?

• Diffusion map can be used to uncover simple structure in sets of light curves

• Application to SN Classification Challenge yielded impressive results

• Can we adapt this methodology to build a general transient classifier?


TCP Publications

Bloom, J. S., Starr, D. L., Butler, N. R., Nugent, P., Rischard, M., Eads, D., Poznanski, D. Towards a real-time transient classification engine (2008, AN, 329, 284)

Starr, D. L., Bloom, J. S., Butler, N. R. Real-time Transient Classification Pipeline (2008, AIPC, 1000, 635)

Bloom, J. S., Starr, D. L., Butler, N. R., Poznanski, D., Rischard, M., Kennedy, R., Brewer, J. Rapid and Automated Classification of Events from the Palomar Transient Factory (2009, AAS, 41, 419)

Poznanski, D. et al. The Standard Candle Method For Type II-P Supernovae - New Sample And PTF Perspective (2009, AAS, 41, 419)

Brewer, J. M., Bloom, J. S., Kennedy, R., Starr, D. L. A Web-Based Framework For a Time-Domain Warehouse (2009, ASPC, 411, 357)

Starr, D. L., Bloom, J. S., Brewer, J. M., Butler, N. R., Poznanski, D., Rischard, M., Klein, C. The Berkeley Transient Classification Pipeline: Deriving Real-time Knowledge from Time-domain Surveys (2009, ASPC, 411, 493)

Butler, N. R., Bloom, J. S. Optimal Time-Series Selection of Quasars (2010, arXiv:1008.3143)
