
Design Considerations for Catalogs Joseph A. Hourclé 2008-02-20 NSO-Tucson



About Me

Types of Catalogs

Data Catalogs
  Used to track all data available
  May be ‘observation’ centric or ‘file’ centric
  Typically maintained by the mission or PI
Event / Feature Catalogs
  Added science input
  Typically a byproduct of other research

Why catalogs?

Too much data
  SDO : ~100k discrete observations per day
Why repeat the work?
  New science builds on previous work
  Draws on experience from domain experts
Adds value to the data
  Annotates the data
  Makes data of interest more ‘findable’

Common Catalog Problems

What are we cataloging?

May need a common concept of ‘record’ to create meaningful unions
‘Observations’ vs. ‘Files’
  May have multiple files that contain a given observation
  Browse products, different processing, different file formats
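
One way to picture the ‘observation’ vs. ‘file’ distinction is a small sketch that regroups a file-centric listing into observation-centric records. Everything here is an assumption made for illustration: the class names, the field names, the identifiers such as `obs-001`, and the file names do not come from any real catalog.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileRecord:
    """One stored file; several files may contain the same observation."""
    path: str          # hypothetical file name
    file_format: str   # e.g. "FITS" or "JPEG"
    processing: str    # e.g. "level0", "level1", "browse"


@dataclass
class Observation:
    """One observation, identified independently of how it is stored."""
    obs_id: str
    start_time: str    # ISO 8601 string, kept simple for the sketch
    files: List[FileRecord] = field(default_factory=list)


def group_by_observation(rows):
    """Turn file-centric rows into observation-centric records.

    Each row is (obs_id, start_time, path, file_format, processing).
    """
    observations = {}
    for obs_id, start_time, path, file_format, processing in rows:
        if obs_id not in observations:
            observations[obs_id] = Observation(obs_id, start_time)
        observations[obs_id].files.append(
            FileRecord(path, file_format, processing)
        )
    return list(observations.values())


# Hypothetical file-centric listing: three files for one observation.
file_rows = [
    ("obs-001", "2008-02-20T12:00:00", "obs-001_lev0.fits", "FITS", "level0"),
    ("obs-001", "2008-02-20T12:00:00", "obs-001_lev1.fits", "FITS", "level1"),
    ("obs-001", "2008-02-20T12:00:00", "obs-001_browse.jpg", "JPEG", "browse"),
    ("obs-002", "2008-02-20T12:10:00", "obs-002_lev0.fits", "FITS", "level0"),
]

catalog = group_by_observation(file_rows)
for obs in catalog:
    print(obs.obs_id, len(obs.files))
```

Two catalogs that agree on the observation as the ‘record’ can be merged on `obs_id`; merging on file names would treat the browse image and the level-1 file as unrelated entries.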

Data Permutations

Lack of Documentation

What does ‘red’ mean?
  ‘LightRed’ vs ‘DarkRed’
How do we translate the PI’s terms to discipline concepts?
  Does the catalog use an unconventional definition of a term?
  Has usage of the term changed since the catalog was started?
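
A lightweight guard against undocumented terms is to ship a data dictionary with the catalog and check the column list against it. The sketch below is only an illustration: the column names (`start_time`, `color`, `quality`), the dictionary layout, and the placeholder definitions are all hypothetical.

```python
# Hypothetical data dictionary: every column maps to a written definition.
DATA_DICTIONARY = {
    "start_time": {
        "definition": "Start of the exposure, UTC, ISO 8601",
    },
    "color": {
        "definition": "Filter name as labelled by the PI team",
        # Placeholders: the real meaning has to come from the catalog owner.
        "values": {
            "LightRed": "placeholder: document the passband here",
            "DarkRed": "placeholder: document the passband here",
        },
    },
}


def undocumented(columns, dictionary=DATA_DICTIONARY):
    """Return the columns that have no non-empty definition."""
    return [c for c in columns if not dictionary.get(c, {}).get("definition")]


print(undocumented(["start_time", "color", "quality"]))
```

Here `quality` is reported because nothing in the dictionary says what it means. A check like this could run whenever a column is added to the catalog.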

Catalogs Have Intended Purposes / Users

Catalogs may have to be manipulated to get the information you want
A Solar Physics catalog may not be useful to someone in Heliospheric Physics
  Trying to answer different questions

Camera  Filter Position  Polarization Position
0       2                1
1       1                0
2       0                1

No Entry ≠ No Event

Lack of a record may be from lack of data
  LASCO CME catalog
    What about periods when LASCO wasn’t observing?
  Event catalogs show when we know something existed, not when we know it didn’t exist.
Event catalogs need to disclose gaps in the data they’re based on
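
The difference between ‘no event’ and ‘no data’ can be made explicit if the event catalog is published together with its observing coverage. The following is a minimal sketch under stated assumptions: the coverage intervals, the event times, and the function name `events_status` are invented for illustration and do not describe any real catalog.

```python
from datetime import datetime


def parse(ts):
    """Parse an ISO 8601 timestamp string."""
    return datetime.fromisoformat(ts)


# Hypothetical intervals during which the instrument was observing.
COVERAGE = [
    (parse("2008-01-01T00:00:00"), parse("2008-01-10T00:00:00")),
    (parse("2008-01-15T00:00:00"), parse("2008-02-01T00:00:00")),
]

# Hypothetical event catalog (event start times only).
EVENTS = [parse("2008-01-03T06:30:00"), parse("2008-01-20T14:00:00")]


def events_status(start, end, coverage=COVERAGE, events=EVENTS):
    """Classify the window [start, end) as 'event', 'no event' or 'unknown'."""
    if any(start <= e < end for e in events):
        return "event"
    # Only claim 'no event' when the whole window was observed.
    # Simplification: the window must fit inside a single coverage interval.
    if any(c_start <= start and end <= c_end for c_start, c_end in coverage):
        return "no event"
    return "unknown"


print(events_status(parse("2008-01-05T00:00:00"), parse("2008-01-06T00:00:00")))
print(events_status(parse("2008-01-11T00:00:00"), parse("2008-01-12T00:00:00")))
```

The first window lies inside an observed interval with no recorded event, so the sketch answers ‘no event’. The second falls in a gap, so the only honest answer is ‘unknown’.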

Catalogs are not just flat files

Catalogs can be databases
  Don’t need to be 2 dimensional
    • Relational databases
  Can have multiple views into the data
    • Each may have a different purpose
    • Each may have a different audience
Storage of catalogs
  Beware of proprietary formats
    • May not be useful to all audiences
    • May not be available in the future
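
As a rough illustration of a relational catalog with more than one view, the sketch below uses Python's standard `sqlite3` module. The table names, column names, view names, and rows are assumptions chosen for the example; a real catalog would have its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE observation (
    obs_id     TEXT PRIMARY KEY,
    start_time TEXT NOT NULL,
    instrument TEXT NOT NULL
);
CREATE TABLE data_file (
    path        TEXT PRIMARY KEY,
    obs_id      TEXT NOT NULL REFERENCES observation(obs_id),
    file_format TEXT NOT NULL,
    processing  TEXT NOT NULL
);
-- View for people who think in observations.
CREATE VIEW observation_summary AS
    SELECT o.obs_id, o.start_time, o.instrument,
           COUNT(f.path) AS n_files
    FROM observation o
    LEFT JOIN data_file f ON f.obs_id = o.obs_id
    GROUP BY o.obs_id;
-- View for people who think in files.
CREATE VIEW file_listing AS
    SELECT f.path, f.file_format, f.processing, o.start_time
    FROM data_file f
    JOIN observation o ON o.obs_id = f.obs_id;
""")

# Hypothetical rows, for illustration only.
conn.executemany(
    "INSERT INTO observation VALUES (?, ?, ?)",
    [
        ("obs-001", "2008-02-20T12:00:00", "imager"),
        ("obs-002", "2008-02-20T12:10:00", "imager"),
    ],
)
conn.executemany(
    "INSERT INTO data_file VALUES (?, ?, ?, ?)",
    [
        ("obs-001_lev1.fits", "obs-001", "FITS", "level1"),
        ("obs-001_browse.jpg", "obs-001", "JPEG", "browse"),
    ],
)

summary = conn.execute(
    "SELECT obs_id, n_files FROM observation_summary ORDER BY obs_id"
).fetchall()
listing = conn.execute(
    "SELECT path, processing FROM file_listing ORDER BY path"
).fetchall()
print(summary)
print(listing)
```

Both views read from the same two tables, so each audience gets its own layout without a second copy of the catalog to keep in sync. SQLite is used here only because it is an openly documented format that ships with Python; the same idea applies to any relational database.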

Added Value

Catalogs can link to other catalogs
  Or browse products
  Or applications
    • Search Systems
    • Visualization Tools

Other Types of Catalogs

Link Publications to Data
  Where can I find the data from this article?
  Has this data been written about?
Catalogs of Services / Applications
  What data is out there? (VxOs)
  Translators / reprocessors? (SSW, CoSEC)
  Visualization tools? (SSW, CoSEC?)
Annotation Services
Observing / Campaign Catalogs

There is help

Database Administrators
  Data Modeling
Library and Information Science
  Cataloging / Library Theory
  Information Architecture
Terminology varies
  AGU : Science Informatics
  NSF : Cyber Infrastructure
  ARL : E-science

Summary

Documentation!
  Define terms / columns / rows
  Note the unknown periods
Consult experts
  Database admins / Librarians / etc
Consider non-flat structures
  Multiple views into the data
Consider accessibility
  Can people read/understand this format?


Sunspot on 15 July 2002 from the Swedish 1-m Solar Telescope on La Palma

http://virtualsolar.org/ [email protected]@nasa.gov

Functional Requirements for Bibliographic Records

And it mostly works