
Quality Assurance Benchmark Datasets and Processing

David Nidever
SQuaRE Science Lead


Science Quality Assurance

• Need to verify the LSST pipeline software (“the stack”) produces data products that are good enough to do the science that LSST has promised.

• This is one of the essential goals of the Science Quality and Reliability Engineering (SQuaRE) team in Tucson, led by Frossie Economou and myself.


QA Approaches

Several approaches or aspects of this:

• Run quick tests on small datasets and produce metrics on a regular basis to track the performance of the stack (i.e. regression testing); see the sketch after this list.

• Run large precursor datasets through the stack “end-to-end” and manually inspect the results. Treat them as if we were going to do science on them.
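
As a concrete illustration of the regression-testing idea, here is a minimal sketch of a metric check. The metric names, baseline values, tolerance, and JSON file format are illustrative assumptions, not the actual SQuaRE tooling.

```python
# Minimal sketch of a regression-style metric check. Metric names, baseline
# values, tolerance, and file format are illustrative assumptions.
import json

# Reference values from a previous "known good" run of the stack on a small dataset.
BASELINE = {
    "astrometric_rms_mas": 25.0,    # median astrometric scatter (milliarcsec)
    "photometric_rms_mmag": 12.0,   # bright-star photometric repeatability (millimag)
}

TOLERANCE = 0.10  # allowed fractional regression before flagging a failure


def check_metrics(current_path):
    """Compare metrics from the latest stack run against the baseline values."""
    with open(current_path) as f:
        current = json.load(f)

    failures = []
    for name, baseline in BASELINE.items():
        value = current[name]
        if value > baseline * (1.0 + TOLERANCE):
            failures.append(f"{name}: {value:.2f} vs baseline {baseline:.2f}")
    return failures


if __name__ == "__main__":
    problems = check_metrics("latest_metrics.json")
    if problems:
        print("Regression detected:\n  " + "\n  ".join(problems))
    else:
        print("All metrics within tolerance.")
```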


“End-to-End” Processing

• Want "end-to-end" as much as possible to fully test the software; starting from raw data through to calibrated source catalogs and alerts.

• If end-to-end can't be done, then at least we'll be able to figure out which portions of the stack are currently missing or need work.

• In some ways this is a “data challenge” for ourselves.


Benchmark Datasets

• Want datasets to test the stack in regimes of four primary science drivers:
  – Dark Energy / Dark Matter
  – Solar System
  – Transients
  – Milky Way

• Want datasets to test two main categories of data products:
  – Level 1, Alert Production (difference imaging, transients)
  – Level 2, Data Release Production (deep stacks, "static" sky)


Benchmark Datasets

• Want datasets taken in different environments (e.g. crowded/sparse regions, good/bad seeing, etc.).
• We need multiple datasets since no existing survey spans all these regimes the way LSST will.

• A group of DM scientists sat down on Monday and came up with an initial list of benchmark datasets that satisfy most of these needs.

• Focused on DECam, CFHT and HSC


Benchmark Datasets: DECam

• COSMOS field, PI: Dey
  – Deep (~26 AB mag), 3 sq. deg., ugrizY
  – ~1500 images
  – All public now
  – Lots of existing datasets and catalogs for reference
• Bulge survey, PI: Saha
  – ugriz, 6 fields
  – ~3500 shallow frames (~700 per field), ~3500 deep frames (~7000 per field)
  – Crowded field, looking for variables
  – All public now


Benchmark Datasets

• Solar system objects, PI: Allen
  – Covering a large area, looking for solar system objects
  – ~3900 60s r-band images public so far
• HiTS survey, PI: Forster
  – Difference imaging, transients, SNe
  – ~2800 images of 40 fields going public in 2015B (almost all g) and more in 2016B
  – ~100-150s exposures, most g (~7000) and r (~1000)
  – ~50 fields, ~30-40 epochs per field


Benchmark Datasets

• SMASH survey, PI: Nidever
  – Deep, ~24.5 mag, ugriz
  – 180 fields, ~30 frames per field
  – Magellanic Clouds and periphery
  – Crowded and sparse fields
  – Stripe82 data, a useful reference field with good astrometry and photometry from SDSS
  – Eventually want to run the full dataset through the stack

• The stack currently runs on calibrated DECam images, not yet on raw images. We need to implement DECam Instrumental Signature Removal (ISR); Slater, Chiang, and Nidever are working on this. A sketch of the basic ISR arithmetic is shown below.
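
For context, the core of ISR is removing the additive and multiplicative instrumental signatures from each raw frame; a real DECam ISR chain also handles overscan, crosstalk, nonlinearity, and fringing. Below is a hedged sketch of the basic arithmetic using astropy; the file names are hypothetical and this is not the LSST stack's ISR code.

```python
# Illustrative sketch of basic ISR arithmetic (master-bias subtraction and
# flat-field correction) for a single CCD image. File names are hypothetical
# and this is not the LSST stack's DECam ISR implementation.
import numpy as np
from astropy.io import fits


def basic_isr(raw_file, bias_file, flat_file, out_file):
    """Apply a master bias and a master flat to one raw CCD frame."""
    raw = fits.getdata(raw_file).astype(np.float64)
    bias = fits.getdata(bias_file).astype(np.float64)
    flat = fits.getdata(flat_file).astype(np.float64)

    # Remove the additive instrumental signature first...
    debiased = raw - bias
    # ...then the multiplicative one, normalizing the flat to preserve the flux scale.
    flat /= np.median(flat)
    calibrated = debiased / flat

    fits.writeto(out_file, calibrated, overwrite=True)
    return calibrated
```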


Benchmark Datasets

• CFHT Lensing Survey
  – Deep, ~25 mag, ugriz, 154 sq. deg.
  – Good for weak lensing
  – To be run by our French colleagues at IN2P3 (Dominique Boutigny et al.)
  – The stack is already set up to run on CFHT data
• HSC COSMOS data
  – Very good seeing, very deep (deeper than 10-yr LSST)
  – Telescope/camera similar to LSST
  – Won't be publicly available until March 2016
  – The stack is already set up to run on HSC data


Benchmark Datasets

• Simulated LSST data
  – Still need to figure out what we want/need
  – But important to test that the stack is ready for LSST data

• This is a first draft of the benchmark datasets and it will likely grow/change.

• Want to keep it to a relatively small number of datasets.


Software Cross-Comparison

• Almost all of these datasets will have separate reductions by the PI teams using non-LSST software.

• This allows for comparison of the LSST stack with other software suites (e.g. DAOPHOT, DoPHOT, SExtractor, etc.); a sketch of such a comparison is shown below.
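
A sketch of what such a cross-comparison could look like: match the stack catalog against a PI-team reference catalog by sky position and summarize the photometric offsets. The column names ("ra", "dec", "mag"), file formats, and match radius are assumptions for illustration, not the SQuaRE comparison code.

```python
# Hedged sketch of a catalog cross-comparison (e.g. LSST stack vs. a DAOPHOT or
# SExtractor reduction). Column names and the match radius are assumptions.
import numpy as np
from astropy import units as u
from astropy.coordinates import SkyCoord
from astropy.table import Table


def compare_catalogs(stack_file, reference_file, match_radius_arcsec=1.0):
    """Match a stack catalog to a reference catalog and summarize magnitude offsets."""
    stack = Table.read(stack_file)
    ref = Table.read(reference_file)

    stack_coords = SkyCoord(stack["ra"] * u.deg, stack["dec"] * u.deg)
    ref_coords = SkyCoord(ref["ra"] * u.deg, ref["dec"] * u.deg)

    # Nearest-neighbor match on the sky, keeping pairs within the match radius.
    idx, sep, _ = stack_coords.match_to_catalog_sky(ref_coords)
    matched = sep < match_radius_arcsec * u.arcsec

    dmag = np.asarray(stack["mag"][matched]) - np.asarray(ref["mag"][idx[matched]])
    return np.median(dmag), np.std(dmag), int(matched.sum())
```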


Benchmark Data Processing

• Goal is to run a "significant" fraction of these data through the stack in the near future.

• Still need to converge on the number of frames for each dataset, but these will be large runs.


Post-Processing Analysis

• After processing, we will inspect the outputs manually and perform quality checks and compute metrics.

• Will look to turn the manual inspection and checks into automated ones (see the sketch after this list).

• Would like to make the results of the runs available to the community.
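
One example of how a manual inspection could become an automated check: flag processed visits whose measured image quality is a strong outlier relative to the rest of the run. The summary-table column names ("visit", "seeing_fwhm") and the 3-sigma cut below are illustrative assumptions, not an existing SQuaRE tool.

```python
# Sketch of an automated sanity check: flag visits whose seeing is a strong
# outlier relative to the run median. Column names and the cut are assumptions.
import numpy as np
from astropy.table import Table


def flag_outlier_visits(summary_file, n_sigma=3.0):
    """Return visit IDs whose measured seeing deviates strongly from the run median."""
    summary = Table.read(summary_file)             # one row per processed visit
    seeing = np.asarray(summary["seeing_fwhm"])    # arcsec, measured from PSF stars

    median = np.median(seeing)
    # Robust scatter estimate: MAD scaled to an equivalent Gaussian sigma.
    mad = 1.4826 * np.median(np.abs(seeing - median))

    outliers = np.abs(seeing - median) > n_sigma * mad
    return list(summary["visit"][outliers])
```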


The End
