Page 1: Realizing LIGO Virtual Data

GriPhyN/LIGO prototype 10/2001 Ewa Deelman, ISI

Realizing LIGO Virtual Data

Caltech: Roy Williams, Albert Lazzarini, Phil Ehrens, Kent Blackburn

ISI: Laura Pearlman, Leila Meshkat, Gaurang Mehta, Carl Kesselman, Ewa Deelman

UWM: Scott Koranda, Bruce Allen

Page 2: Realizing LIGO Virtual Data

Outline

• LIGO experiment and LDAS (LIGO Data Analysis System)
• General and LIGO-specific Virtual Data scenarios
• Prototype Overview
  – User interface
  – Request interpreter
  – Request planner (Replica Catalog, G-DAG)
  – Security model
  – Request execution (Condor-G/DAGMan)
• Future plans

Page 3: Realizing LIGO Virtual Data

The Virtual Data Grid (VDG) Model

• Data suppliers publish data to the Grid
• Users request raw or derived data from the Grid, without needing to know
  – Where data is located
  – Whether data is stored or computed
• Users can easily determine
  – What it will cost to obtain data
  – Quality of derived data
• VDG serves requests efficiently, subject to global and local policy constraints

Page 4: Realizing LIGO Virtual Data

LIGO Experiment (Laser Interferometer Gravitational-Wave Observatory)

• Aims to detect gravitational waves predicted by Einstein’s theory of relativity.
• Can be used to detect
  – binary pulsars
  – mergers of black holes
  – “starquakes” in neutron stars
• Two installations: in Louisiana (Livingston) and Washington State (Hanford)
  – Other projects: Virgo (Italy), GEO (Germany), Tama (Japan)
• Besides the gravitational sensors, many other types
• Data collected during experiments is a collection of time series (multi-channel)
• Analysis is performed in time and Fourier domains

Page 5: Realizing LIGO Virtual Data

LIGO Experiment (Laser Interferometer Gravitational-Wave Observatory)

[Figure: LIGO data flow. Recoverable labels: interferometer; raw channels; archive; store; long time frames; short time frames; single frame (axes: Hz, time); clean; transpose; time-frequency image; find candidate; event DB.]

Page 6: Realizing LIGO Virtual Data

LDAS (LIGO Data Analysis System)

[Figure: LDAS architecture diagram. Recoverable component labels:]

• User APIs (Tcl commands)
• managerAPI (with assistant managers; Job/ID)
• frameAPI: down-select channels and concatenate series
• dataConditionAPI: statistical summary, line removal, bandwidth reduction, regression analysis
• wrapperAPI: parallel event detection analysis
• mpiAPI: start/monitor wrapperAPI, load balance multiple jobs
• metaDataAPI: insertion/querying of statistical and relational results into LIGO Database Tables
• controlMonitorAPI: monitor all LDAS APIs' jobs and log files, start/stop LDAS APIs, control and monitor parallel processes
• eventMonitorAPI: event summary analysis, detection confidence, post-detection processing
• lightWeightAPI: read and write LIGO_LW documents
• Frame Files (3-5 MB/sec); Beowulf Cluster
• Anonymous FTP, Web Server, E-Mail
  – LIGO_LW Files: light-weight (XML based)
  – E-Mail: job status, location of results
• Connector labels: frames, ilwd, messages, LIGO_LW, email
• Callout: LDAS components used by MPI Mock Data Challenge (frameAPI, dataConditionAPI, managerAPI with assistant managers, Anonymous FTP/Web Server/E-Mail)

Page 7: Realizing LIGO Virtual Data

Virtual Data Scenario

• (LIGO) “Conduct a pulsar search on the data collected from Oct 16 2000 to Jan 1 2001”

[Figure: a LIGO data specification (XML) enters the GriPhyN box, which returns a LIGO data product (XML).]

Page 8: Realizing LIGO Virtual Data

The “GriPhyN box” functionality

• For each requested data value, need to
  – Understand the request
  – Determine if it is instantiated; if so, where; if not, how to compute it
  – Plan data movements and computations required to obtain all results
  – Execute this plan
  – Monitor progress
  – Make requested value available
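The steps above can be sketched as a small recursive planner. This is a minimal illustration under stated assumptions, not the prototype's code: the `catalog` and `recipes` dictionaries, the data names, and the step tuples are all hypothetical.

```python
def plan(name, catalog, recipes, steps=None):
    """Return the ordered steps needed to make the data value `name` available.

    catalog: maps instantiated data names to the site holding them (hypothetical).
    recipes: maps derivable data names to (transformation, [input names]) (hypothetical).
    """
    if steps is None:
        steps = []
    if name in catalog:
        # Instantiated: we only need to know where it is.
        steps.append(("fetch", name, catalog[name]))
        return steps
    if name not in recipes:
        raise KeyError(f"{name} is neither stored nor derivable")
    # Not instantiated: look up how to compute it, then plan its inputs first.
    transformation, inputs = recipes[name]
    for input_name in inputs:
        plan(input_name, catalog, recipes, steps)
    steps.append(("compute", name, transformation))
    return steps


catalog = {"frame_1": "site_a", "frame_2": "site_b"}
recipes = {
    "frame_1_2": ("Concatenate", ["frame_1", "frame_2"]),
    "decimated_1_2": ("Decimate", ["frame_1_2"]),
}
for step in plan("decimated_1_2", catalog, recipes):
    print(step)
```

Execution and monitoring are left out; the sketch only covers the "is it instantiated, and if not, how to compute it" decision and the resulting ordering.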

Page 9: Realizing LIGO Virtual Data

LIGO’s virtual data (in prototype)

• Virtual Data Products
  – Full frame for a specific time interval
  – Individual channel for a specific time interval
  – Decimated channel for a specific time interval
• Transformations
  – Extract(channelname, frame)
  – Concatenate(frame1, frame2)
  – Decimate(frame)
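To make the three transformations concrete, here is a toy model in Python. It is only a sketch: the `Frame` class, its fields, and the `factor` parameter are assumptions made for illustration, and the decimation simply keeps every n-th sample, whereas a real decimation would normally low-pass filter first.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    start: int       # GPS start time in seconds
    duration: int    # length in seconds
    channels: dict   # channel name -> list of samples


def extract(channelname, frame):
    """Keep a single channel of a frame."""
    return Frame(frame.start, frame.duration,
                 {channelname: list(frame.channels[channelname])})


def concatenate(frame1, frame2):
    """Join two frames that are adjacent in time and carry the same channels."""
    if frame1.start + frame1.duration != frame2.start:
        raise ValueError("frames are not contiguous in time")
    if frame1.channels.keys() != frame2.channels.keys():
        raise ValueError("frames carry different channels")
    joined = {name: frame1.channels[name] + frame2.channels[name]
              for name in frame1.channels}
    return Frame(frame1.start, frame1.duration + frame2.duration, joined)


def decimate(frame, factor=2):
    """Reduce the sample rate by keeping every `factor`-th sample (no filtering)."""
    reduced = {name: samples[::factor] for name, samples in frame.channels.items()}
    return Frame(frame.start, frame.duration, reduced)


f1 = Frame(100, 1, {"A": [1, 2, 3, 4], "B": [0, 0, 0, 0]})
f2 = Frame(101, 1, {"A": [5, 6, 7, 8], "B": [1, 1, 1, 1]})
channel_a = concatenate(extract("A", f1), extract("A", f2))
print(decimate(channel_a).channels)
```

Each virtual data product listed above corresponds to some composition of these calls over a time interval.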

Page 10: Realizing LIGO Virtual Data

GriPhyN/LIGO box functionality

• Understand an XML-specified request
• Acquire user’s proxy credentials
• Consult replica catalog for available data
• Construct a plan to produce data not available
• Execute the plan
• Return requested data in Frame or XML format

[Figure: a LIGO-specific data specification (XML) enters the GriPhyN/LIGO box, which returns a LIGO data product (XML).]

Page 11: Realizing LIGO Virtual Data

[Figure: prototype architecture. Recoverable labels:]

• HTTP frontend (CGI interface, xml)
• MyProxy server
• Planner, with Replica Catalog and Transformation Catalog
• G-DAG (DAGMan)
• Executor: Condor-G/DAGMan
• Monitoring: MDS
• GridFTP; GRAM/LDAS; LDAS
• Storage Resource (GridFTP)
• Compute Resource (GRAM)
• GridCVS
• Logs

Page 12: Realizing LIGO Virtual Data

Page 13: Realizing LIGO Virtual Data

Request Interpreter

<?xml version="1.0"?>
<!DOCTYPE LIGO_LW SYSTEM "LIGO-LW.dtd">
<LIGO_LW Name="banana" Type="VirtualDataRequest">
  <Time Name="StartTime" Type="GPS">65800000</Time>
  <Time Name="EndTime" Type="GPS">65800010</Time>
  <LIGO_LW Type="ChannelSpecification">
    <Param Name="Detector">LHO,LLO</Param>
    <Param Name="RegEx">H:LSC-AS_Q</Param>
  </LIGO_LW>
  <Param Name="ResponseFormat">LIGO-LW</Param>
  <Param Name="ResponseDomain">dc-user.isi.edu</Param>
  <Param Name="ResponseLocation">//dc-n1.isi.edu/scratch/myfile.xml</Param>
</LIGO_LW>
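A request of this shape can be read with Python's standard `xml.etree.ElementTree`. The sketch below is an illustration only, not the prototype's interpreter: the function name and the keys of the returned dictionary are assumptions, and the DOCTYPE line is left out of the embedded sample for brevity.

```python
import xml.etree.ElementTree as ET

REQUEST = """<?xml version="1.0"?>
<LIGO_LW Name="banana" Type="VirtualDataRequest">
  <Time Name="StartTime" Type="GPS">65800000</Time>
  <Time Name="EndTime" Type="GPS">65800010</Time>
  <LIGO_LW Type="ChannelSpecification">
    <Param Name="Detector">LHO,LLO</Param>
    <Param Name="RegEx">H:LSC-AS_Q</Param>
  </LIGO_LW>
  <Param Name="ResponseFormat">LIGO-LW</Param>
  <Param Name="ResponseDomain">dc-user.isi.edu</Param>
  <Param Name="ResponseLocation">//dc-n1.isi.edu/scratch/myfile.xml</Param>
</LIGO_LW>"""


def interpret_request(xml_text):
    """Turn a VirtualDataRequest document into a plain dictionary."""
    root = ET.fromstring(xml_text)
    # GPS start and end times are direct <Time> children of the request.
    times = {t.get("Name"): int(t.text) for t in root.findall("Time")}
    # The channel specification is a nested LIGO_LW element.
    spec = root.find("LIGO_LW[@Type='ChannelSpecification']")
    channel = {p.get("Name"): p.text.strip() for p in spec.findall("Param")}
    # Remaining top-level <Param> elements describe where the answer goes.
    response = {p.get("Name"): p.text.strip() for p in root.findall("Param")}
    return {
        "start": times["StartTime"],
        "end": times["EndTime"],
        "detectors": channel["Detector"].split(","),
        "channel_regex": channel["RegEx"],
        "response": response,
    }


request = interpret_request(REQUEST)
print(request["start"], request["end"], request["channel_regex"])
```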

Page 14: Realizing LIGO Virtual Data

Request Planning (C_A_100, C_A_101)

• Optimize planning decisions with respect to final data destination (final_dest = UWM)
• Consult the Replica Catalog for data existence
  – Determines which data needs to be produced (C_A_100 at ISI) (C_A_101 not present)
• Map request into a DAG of grid operations that need to be performed
  – Template instantiation
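The planning step for this example might look like the following sketch. It is a simplification under stated assumptions: the replica catalog is a plain dictionary, the operation tuples are hypothetical, and "produce at the final destination" stands in for whatever optimization the real planner performs.

```python
def plan_request(requested, replica_catalog, final_dest):
    """List the grid operations needed to deliver `requested` data to `final_dest`.

    replica_catalog maps a logical name to the sites holding a copy (hypothetical layout).
    """
    operations = []
    for name in requested:
        sites = replica_catalog.get(name, [])
        if final_dest in sites:
            continue  # already where the user wants it
        if sites:
            # Data exists somewhere: move it to the destination.
            operations.append(("transfer", name, sites[0], final_dest))
        else:
            # Data does not exist: it has to be produced (assumed: at the destination).
            operations.append(("produce", name, final_dest))
        # Either way, record the new copy so later requests can reuse it.
        operations.append(("register", name, final_dest))
    return operations


operations = plan_request(["C_A_100", "C_A_101"], {"C_A_100": ["ISI"]}, "UWM")
for operation in operations:
    print(operation)
```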

Page 15: Realizing LIGO Virtual Data

Components

• Replica Catalog
• Template Instantiation
• Execution Environment
• Security

Page 16: Realizing LIGO Virtual Data

LIGO Replica Catalog Structure

[Figure: replica catalog hierarchy. Recoverable content:]

Replica Catalog
  Logical Collection: for times 638834000-638834500
    Filename: H-638834071.T
    Filename: H-638834271.T
    …
    Location: dataserver.uwm.edu
      Filename: H-638834071.T
      Filename: H-638834271.T
      Filename: …
      Protocol: gridftp
      UrlConstructor: gsiftp://dataserver.phys.uwm.edu/griphyn_test
    Location: dc-n1.isi.edu
      Filename: H-638834071.T
      …
      Protocol: gridftp
      UrlConstructor: gsiftp://dc-n1.isi.edu/pub/ligo2
    Logical File Parent
      Logical File: H-638834071.T
      Logical File: H-638847271.T (Size: 506214400)
  Logical Collection: 638834500-638835000
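One way to picture how such a catalog resolves a logical file name to physical URLs is the sketch below. The dictionary layout and the function name are assumptions for illustration, and only the filenames actually legible on the slide are included (the elided entries are not filled in).

```python
CATALOG = {
    "collection": "638834000-638834500",
    "locations": {
        "dataserver.uwm.edu": {
            "protocol": "gridftp",
            "url_constructor": "gsiftp://dataserver.phys.uwm.edu/griphyn_test",
            "filenames": {"H-638834071.T", "H-638834271.T"},
        },
        "dc-n1.isi.edu": {
            "protocol": "gridftp",
            "url_constructor": "gsiftp://dc-n1.isi.edu/pub/ligo2",
            "filenames": {"H-638834071.T"},
        },
    },
}


def physical_urls(catalog, logical_name):
    """Return every physical URL at which `logical_name` is registered."""
    return sorted(
        f"{location['url_constructor'].rstrip('/')}/{logical_name}"
        for location in catalog["locations"].values()
        if logical_name in location["filenames"]
    )


for url in physical_urls(CATALOG, "H-638834071.T"):
    print(url)
```

A file registered at no location yields an empty list, which is the planner's cue that the data has to be produced.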

Page 17: Realizing LIGO Virtual Data

Template instantiation: Abstract G-DAG to Concrete G-DAG (DAGMan)

Input: C_A_100 in dc.isi.edu/frames; output location: host.uwm.edu/myframes

Abstract G-DAG
• globus_url_copy X from a to b
• Register X in RC with location b

Concrete G-DAG (DAGMan)
• globus_url_copy C_A_100 from dc.isi.edu/frames to host.uwm.edu/myframes
• Register C_A_100 in RC with location host.uwm.edu/myframes
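Template instantiation of this kind can be sketched with Python's `string.Template`. The code is an assumption-laden illustration: the `gsiftp://` scheme, the job and submit-file names, and the textual `register` step are hypothetical. The emitted `JOB` and `PARENT ... CHILD` lines follow the DAGMan input file syntax.

```python
from string import Template

# Abstract G-DAG: each node is a command template with placeholders X, a, b.
ABSTRACT_DAG = [
    ("transfer", Template("globus-url-copy gsiftp://$a/$X gsiftp://$b/$X")),
    ("register", Template("register $X in RC with location $b")),  # placeholder step
]


def instantiate(abstract_dag, job_id, **bindings):
    """Bind placeholders and emit (concrete jobs, DAGMan description text)."""
    jobs = [(f"{name}_{job_id}", template.substitute(bindings))
            for name, template in abstract_dag]
    lines = [f"JOB {name} {name}.sub" for name, _ in jobs]
    # This simple template is a chain: each node depends on the previous one.
    lines += [f"PARENT {parent[0]} CHILD {child[0]}"
              for parent, child in zip(jobs, jobs[1:])]
    return jobs, "\n".join(lines)


jobs, dag_text = instantiate(
    ABSTRACT_DAG, 1,
    X="C_A_100", a="dc.isi.edu/frames", b="host.uwm.edu/myframes",
)
print(dag_text)
```

The concrete commands themselves would go into the per-job submit files, which are not generated here.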

Page 18: Realizing LIGO Virtual Data

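Templates

[Figure: four DAG templates. Steps as recovered from the diagram:]

• Template 1 (transfer)
  – globus_url_copy X from a to b
• Template 2 (LDAS_concat)
  – globus_url_copy X from a to b
  – globus_url_copy Y from c to b
  – Execute LDAS_concat X, Y at b
  – globus_url_copy Z from b to d
  – Register Z in RC with location d
• Template 3 (decimate)
  – globus_url_copy X from a to b
  – Execute decimate on X at b
• Template 4 (LDAS_extract)
  – globus_url_copy X from a to b
  – Execute LDAS_extract C_Y from X at b
  – globus_url_copy C_Y from b to c
  – Register C_Y in RC with location c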
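The ordering constraints inside a template can be expressed as a dependency graph and sorted topologically. The sketch below uses Python's standard `graphlib` (Python 3.9 or later) on the concatenation template; the node names and the dependency edges are one reading of the diagram, not something the slide states explicitly.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes that must finish before it can start.
CONCAT_TEMPLATE = {
    "copy_X_a_to_b": set(),
    "copy_Y_c_to_b": set(),
    "LDAS_concat_X_Y_at_b": {"copy_X_a_to_b", "copy_Y_c_to_b"},
    "copy_Z_b_to_d": {"LDAS_concat_X_Y_at_b"},
    "register_Z_at_d": {"copy_Z_b_to_d"},
}


def execution_order(dag):
    """Return one valid order in which the DAG nodes can run."""
    return list(TopologicalSorter(dag).static_order())


def first_wave(dag):
    """Return the nodes that can start immediately (and in parallel)."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    return set(sorter.get_ready())


order = execution_order(CONCAT_TEMPLATE)
print(order)
print(first_wave(CONCAT_TEMPLATE))
```

The two input transfers have no dependency on each other, so an executor such as DAGMan is free to run them concurrently before the concatenation starts.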

Page 19: Realizing LIGO Virtual Data

Execution Environment

[Figure: execution environment. Recoverable labels: DAGMan and Condor-G; GRAM, GridFTP, LDAS, and compute resources; sites ISI, UWM, and Caltech.]

Page 20: Realizing LIGO Virtual Data

Security Model

[Figure: security model between the user, the prototype, and the MyProxy server. Recoverable annotations:]

• User needs to register a proxy certificate with the MyProxy server
• User to prototype: user’s username and password (SSL connection)
• Authentication between prototype and MyProxy (prototype: userid and certificate)
• Prototype to MyProxy: user’s user name and password; MyProxy to prototype: user’s proxy credential
• $X509_USER_PROXY = user’s proxy credential
• GridFTP, Condor-G: GSI authentication
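Once a proxy credential has been retrieved, GSI-enabled tools locate it through the `X509_USER_PROXY` environment variable, which holds the path of the proxy file. A minimal sketch of that hand-off, assuming the credential is already available as bytes (retrieval from MyProxy is not shown, and the function name is hypothetical):

```python
import os
import stat
import tempfile


def proxy_environment(credential: bytes, directory: str) -> dict:
    """Store the proxy credential privately and return an environment pointing at it."""
    fd, path = tempfile.mkstemp(prefix="x509up_", dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(credential)
    # Proxy files must be readable by the owner only.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    env = dict(os.environ)
    env["X509_USER_PROXY"] = path
    return env


workdir = tempfile.mkdtemp()
env = proxy_environment(b"placeholder credential", workdir)
print(env["X509_USER_PROXY"])
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run` when launching transfer or job-submission commands on the user's behalf.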

Page 21: Realizing LIGO Virtual Data

LDAS/Globus Interface

[Figure: LDAS fronted by Globus services. Recoverable labels:]

• LDAS: User APIs (Tcl), managerAPI with assistant managers (Job/ID)
• Anonymous FTP, Web Server, E-Mail
  – LIGO_LW Files: light-weight (XML based)
  – E-Mail: job status, location of results
• Globus
  – GRAM: Condor-G RSL-specified job; LDAS commands
  – GridFTP: data in LDAS space (gsiftp://)
• Secure, GSI-enabled interface to LDAS

Page 22: Realizing LIGO Virtual Data

The JOB ID = 11397
the Desired channel name is H2:LSC-AS_Q with timestamp 65800000
writing sub file transfer_a2b_1.11397.sub
WILL TRANSFER H-65800000.F
from gsiftp://dc-n1.isi.edu/ligodata/frames/H-65800000.F
to gsiftp://dataserver.phys.uwm.edu/grid_incoming/H-65800000.F
Writing Sub File transform.11397.sub
my infile is H-65800000.F,
my outfile is H-H2:LSC-AS_Q-65800000.xml11397
Will apply Transformation
WILL GET RESULT H-H2:LSC-AS_Q-65800000.xml
from gsiftp://dataserver.phys.uwm.edu/grid_outgoing/H-H2:LSC-AS_Q-65800000.xml11397
to gsiftp://dc-n1.isi.edu/OUTPUT/H-H2:LSC-AS_Q-65800000.xml
the data will be available at
http://dc-n1.isi.edu/OUTPUT/H-H2:LSC-AS_Q-65800000.xml
**********JOB SUBMITTED************
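The file names in this log follow a visible pattern: the input frame is named from the GPS timestamp, and the result from the channel and timestamp, with the job ID appended while it sits in the outgoing area. The sketch below reproduces that pattern; the function and parameter names are hypothetical, and the pattern is inferred from this single log rather than from any specification.

```python
def staging_plan(job_id, channel, gps, source, worker, output):
    """Derive stage-in and stage-out URL pairs for one extraction job."""
    frame = f"H-{gps}.F"
    result = f"H-{channel}-{gps}.xml"
    return {
        # Frame file moves from the data source to the worker's incoming area.
        "stage_in": (f"{source}/{frame}", f"{worker}/grid_incoming/{frame}"),
        # Result carries the job ID on the worker and loses it at the destination.
        "stage_out": (f"{worker}/grid_outgoing/{result}{job_id}", f"{output}/{result}"),
    }


plan_11397 = staging_plan(
    job_id=11397,
    channel="H2:LSC-AS_Q",
    gps=65800000,
    source="gsiftp://dc-n1.isi.edu/ligodata/frames",
    worker="gsiftp://dataserver.phys.uwm.edu",
    output="gsiftp://dc-n1.isi.edu/OUTPUT",
)
for step, (src, dst) in plan_11397.items():
    print(step, src, "->", dst)
```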

Page 23: Realizing LIGO Virtual Data

Outlook

Short term (SC’2001)
  – Integrate with RC
  – Integrate with MyProxy
  – Better job monitoring

Longer term
• Support a richer set of Virtual Data products
• Implement and incorporate the Transformation Catalog
• Investigate Derived Data Catalog (abstract DAGs)
• Investigate more complex planning techniques
  – Query estimation
• Execution
  – Error handling and recovery
  – Alternative strategies

Page 24: Realizing LIGO Virtual Data

Prototype to Tool

• Enable a close collaboration between LIGO participants and other gravitational wave communities such as Virgo and GEO
• Replica Catalogs to provide information about data existence (register new selected data in the catalog to make it available)
• Including Materialized Data (no need to recompute)
• Provide access to data analysis software and systems (Transformation Catalog)
• Use the prototype as the basis for a data exchange and replication system (security)
• Provide access to world-wide computing resources

Page 25: Realizing LIGO Virtual Data

DEMO TONIGHT