GriPhyN/LIGO prototype 10/2001 Ewa Deelman, ISI
Realizing LIGO Virtual Data
Caltech: Roy Williams, Albert Lazzarini, Phil Ehrens, Kent Blackburn
ISI: Laura Pearlman, Leila Meshkat, Gaurang Mehta, Carl Kesselman, Ewa Deelman
UWM: Scott Koranda, Bruce Allen

Outline
• LIGO experiment and LDAS (LIGO Data Analysis System)
• General and LIGO-specific Virtual Data scenarios
• Prototype Overview
  – User interface
  – Request interpreter
  – Request planner (Replica Catalog, G-DAG)
  – Security model
  – Request execution (Condor-G/DAGMan)
• Future plans

The Virtual Data Grid (VDG) Model
• Data suppliers publish data to the Grid
• Users request raw or derived data from the Grid, without needing to know
  – Where data is located
  – Whether data is stored or computed
• Users can easily determine
  – What it will cost to obtain data
  – Quality of derived data
• VDG serves requests efficiently, subject to global and local policy constraints

LIGO Experiment (Laser Interferometer Gravitational-Wave Observatory)
• Aims to detect gravitational waves predicted by Einstein’s general theory of relativity
• Can be used to detect
  – binary pulsars
  – mergers of black holes
  – “starquakes” in neutron stars
• Two installations: in Louisiana (Livingston) and Washington State (Hanford)
  – Other projects: Virgo (Italy), GEO (Germany), Tama (Japan)
• Besides the gravitational sensors, many other types
• Data collected during experiments is a collection of time series (multi-channel)
• Analysis is performed in time and Fourier domains

LIGO Experiment (Laser Interferometer Gravitational-Wave Observatory)
[Figure: interferometer data flow. Labels: interferometer, raw channels, long time frames, store, archive, short time frames, single frame (axes: Hz, Time), clean, transpose, time-frequency image, find candidate, event DB]

LDAS (LIGO Data Analysis System)
[Figure: LDAS architecture. Recoverable labels:]
• UserAPIs: Tcl commands, Job/ID
• managerAPI (with assistant managers)
• frameAPI: down-select channels & concatenate series
• dataConditionAPI: statistical summary, line removal, bandwidth reduction, regression analysis
• wrapperAPI: parallel event detection analysis
• mpiAPI: start/monitor wrapperAPI, load balance multiple jobs
• eventMonitorAPI: event summary analysis, detection confidence, post-detection processing
• metaDataAPI: insertion/querying of statistical and relational results into LIGO Database Tables
• lightWeightAPI: read & write LIGO_LW documents
• controlMonitorAPI: monitor all LDAS APIs' jobs and log files, start/stop LDAS APIs, control and monitor parallel processes
• Beowulf cluster
• Frame files: 3-5 MB/sec
• Data passed between components: frames, ilwd, messages, LIGO_LW
• Outputs: anonymous FTP, Web server, e-mail
  – LIGO_LW files, light-weight (XML based)
  – E-mail: job status, location of results

LDAS components used by MPI Mock Data Challenge
• UserAPIs: Tcl commands
• managerAPI (with assistant managers)
• frameAPI: down-select channels & concatenate series
• dataConditionAPI: statistical summary, line removal, bandwidth reduction, regression analysis
• Data passed between components: ilwd, e-mail, LIGO_LW
• Outputs: anonymous FTP, Web server, e-mail
  – LIGO_LW files, light-weight (XML based)
  – E-mail: job status, location of results

Virtual Data Scenario
• (LIGO) “Conduct a pulsar search on the data collected from Oct 16 2000 to Jan 1 2001”
[Figure: the GriPhyN box takes a LIGO Data Specification (XML) and returns a LIGO Data Product (XML)]

The “GriPhyN box” functionality
• For each requested data value, need to
  – Understand the request
  – Determine if it is instantiated; if so, where; if not, how to compute it
  – Plan data movements and computations required to obtain all results
  – Execute this plan
  – Monitor progress
  – Make requested value available

LIGO’s virtual data (in prototype)
• Virtual Data Products
  – Full frame for a specific time interval
  – Individual channel for a specific time interval
  – Decimated channel for a specific time interval
• Transformations
  – Extract(channelname, frame)
  – Concatenate(frame1, frame2)
  – Decimate(frame)
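
The three transformations can be illustrated with a minimal Python sketch. It is not the LDAS implementation: the `Frame` class, its fields, and the `factor` argument of `decimate` are assumptions made for illustration, and the decimation simply drops samples without any anti-alias filtering.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for a LIGO frame: a time interval plus named channels."""
    start: int        # GPS start time, seconds
    duration: int     # length of the interval, seconds
    channels: dict    # channel name -> list of samples


def extract(channel_name, frame):
    """Extract(channelname, frame): keep one channel over the same interval."""
    return Frame(frame.start, frame.duration,
                 {channel_name: list(frame.channels[channel_name])})


def concatenate(frame1, frame2):
    """Concatenate(frame1, frame2): join two frames that are adjacent in time."""
    if frame1.start + frame1.duration != frame2.start:
        raise ValueError("frames are not contiguous in time")
    shared = sorted(frame1.channels.keys() & frame2.channels.keys())
    joined = {name: frame1.channels[name] + frame2.channels[name] for name in shared}
    return Frame(frame1.start, frame1.duration + frame2.duration, joined)


def decimate(frame, factor=2):
    """Decimate(frame): keep every `factor`-th sample (assumed factor, no filtering)."""
    reduced = {name: samples[::factor] for name, samples in frame.channels.items()}
    return Frame(frame.start, frame.duration, reduced)


# A derived product is a composition of transformations over stored frames.
f1 = Frame(100, 1, {"A": [1, 2, 3, 4], "B": [9, 9, 9, 9]})
f2 = Frame(101, 1, {"A": [5, 6, 7, 8], "B": [0, 0, 0, 0]})
product = decimate(extract("A", concatenate(f1, f2)))
print(product)
```

Each virtual data product in the list above corresponds to one such composition, which is why a product can be either fetched if it already exists or recomputed from frames.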

GriPhyN/LIGO box functionality
• Understand an XML-specified request
• Acquire user’s proxy credentials
• Consult replica catalog for available data
• Construct a plan to produce data not available
• Execute the plan
• Return requested data in Frame or XML format
[Figure: the GriPhyN/LIGO box takes a LIGO-specific Data Specification (XML) and returns a LIGO Data Product (XML)]

[Figure: prototype architecture. Labels: HTTP frontend (CGI interface, xml), MyProxy server, Replica Catalog, Planner, Transformation Catalog, Monitoring (MDS), Executor (Condor-G/DAGMan), G-DAG (DAGMan), GridFTP, GRAM/LDAS, LDAS, GridCVS, Logs, Storage Resource (GridFTP), Compute Resource (GRAM)]

Request Interpreter
<?xml version="1.0"?>
<!DOCTYPE LIGO-LW SYSTEM "LIGO-LW.dtd">
<LIGO_LW Name="banana" Type="VirtualDataRequest">
  <Time Name="StartTime" Type="GPS">65800000</Time>
  <Time Name="EndTime" Type="GPS">65800010</Time>
  <LIGO_LW Type="ChannelSpecification">
    <Param Name="Detector">LHO,LLO</Param>
    <Param Name="RegEx">H:LSC-AS_Q</Param>
  </LIGO_LW>
  <Param Name="ResponseFormat">LIGO-LW</Param>
  <Param Name="ResponseDomain">dc-user.isi.edu</Param>
  <Param Name="ResponseLocation">//dc-n1.isi.edu/scratch/myfile.xml</Param>
</LIGO_LW>
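
As an illustration of what a request interpreter has to pull out of such a document, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. It is not the prototype's interpreter: the function name `parse_request` and the keys of the returned dictionary are made up for this example, and the embedded request omits the XML declaration and DOCTYPE and closes the root element as `</LIGO_LW>`.

```python
import xml.etree.ElementTree as ET

REQUEST = """<LIGO_LW Name="banana" Type="VirtualDataRequest">
  <Time Name="StartTime" Type="GPS">65800000</Time>
  <Time Name="EndTime" Type="GPS">65800010</Time>
  <LIGO_LW Type="ChannelSpecification">
    <Param Name="Detector">LHO,LLO</Param>
    <Param Name="RegEx">H:LSC-AS_Q</Param>
  </LIGO_LW>
  <Param Name="ResponseFormat"> LIGO-LW </Param>
  <Param Name="ResponseDomain"> dc-user.isi.edu</Param>
  <Param Name="ResponseLocation"> //dc-n1.isi.edu/scratch/myfile.xml </Param>
</LIGO_LW>"""


def parse_request(text):
    """Turn a virtual data request into a plain dictionary (hypothetical layout)."""
    root = ET.fromstring(text)
    if root.get("Type") != "VirtualDataRequest":
        raise ValueError("not a virtual data request")
    # Start and end times are direct children of the root element.
    times = {t.get("Name"): int(t.text) for t in root.findall("Time")}
    # The channel specification is a nested LIGO_LW element.
    spec = root.find("LIGO_LW[@Type='ChannelSpecification']")
    channel = {p.get("Name"): p.text.strip() for p in spec.findall("Param")}
    # Response parameters are the Param elements directly under the root.
    response = {p.get("Name"): p.text.strip() for p in root.findall("Param")}
    return {
        "start": times["StartTime"],
        "end": times["EndTime"],
        "detectors": channel["Detector"].split(","),
        "channel_regex": channel["RegEx"],
        "response": response,
    }


request = parse_request(REQUEST)
print(request["start"], request["end"], request["detectors"], request["channel_regex"])
```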

Request Planning (C_A_100, C_A_101)
• Optimize planning decisions with respect to final data destination (final_dest = UWM)
• Consult the Replica Catalog for data existence
  – Determines which data needs to be produced (C_A_100 at ISI) (C_A_101 not present)
• Map request into a DAG of grid operations that need to be performed
  – Template instantiation
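
The planning step for this example request can be sketched in a few lines of Python. This is a simplification under stated assumptions, not the prototype's planner: the replica catalog is reduced to a dictionary, the `plan` function and node naming are invented here, and "produce" stands in for whatever transformation chain would compute a missing product.

```python
def plan(requested, replica_catalog, final_dest):
    """Return DAG nodes as (name, operation, parent_names) tuples."""
    nodes = []
    for lfn in requested:
        locations = replica_catalog.get(lfn, [])
        if final_dest in locations:
            continue  # already at the destination, nothing to do
        if locations:
            # Data exists somewhere else: move it to the destination.
            first = (f"copy_{lfn}",
                     f"copy {lfn} from {locations[0]} to {final_dest}", [])
        else:
            # Data is not instantiated anywhere: it has to be produced.
            first = (f"produce_{lfn}", f"produce {lfn} at {final_dest}", [])
        # Either way, the new replica is registered once the first step is done.
        register = (f"register_{lfn}",
                    f"register {lfn} in RC with location {final_dest}", [first[0]])
        nodes.extend([first, register])
    return nodes


catalog = {"C_A_100": ["ISI"]}  # C_A_101 is not present in the catalog
dag = plan(["C_A_100", "C_A_101"], catalog, final_dest="UWM")
for node in dag:
    print(node)
```

Producing missing data directly at the final destination is one way to read "optimize with respect to final data destination"; a fuller planner would also weigh where the inputs and compute resources are.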

Components
• Replica Catalog
• Template Instantiation
• Execution Environment
• Security

LIGO Replica Catalog Structure
Replica Catalog
• Logical Collection: for times 638834000-638834500
  – Location dataserver.uwm.edu
    Filename: H-638834071.T
    Filename: H-638834271.T
    Filename: …
    Protocol: gridftp
    UrlConstructor: gsiftp://dataserver.phys.uwm.edu/griphyn_test
  – Location dc-n1.isi.edu
    Filename: H-638834071.T
    Filename: …
    Protocol: gridftp
    UrlConstructor: gsiftp://dc-n1.isi.edu/pub/ligo2
  – Logical File Parent
  – Logical File H-638834071.T
  – Logical File H-638847271.T (Size: 506214400)
• Logical Collection: 638834500-638835000
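
To make the structure concrete, the following Python sketch models one logical collection as nested dictionaries and resolves a logical file name to physical URLs by joining each location's UrlConstructor with the filename. The dictionary layout, the `physical_urls` function, and the half-open treatment of the time range are assumptions for illustration; the actual catalog is a separate service, and only the filenames visible on the slide are listed.

```python
CATALOG = [
    {
        "times": (638834000, 638834500),  # logical collection time range
        "locations": [
            {"name": "dataserver.uwm.edu",
             "url_constructor": "gsiftp://dataserver.phys.uwm.edu/griphyn_test",
             "filenames": {"H-638834071.T", "H-638834271.T"}},
            {"name": "dc-n1.isi.edu",
             "url_constructor": "gsiftp://dc-n1.isi.edu/pub/ligo2",
             "filenames": {"H-638834071.T"}},
        ],
    },
]


def physical_urls(gps_time, filename, catalog=CATALOG):
    """List every physical URL at which `filename` is registered."""
    urls = []
    for collection in catalog:
        start, end = collection["times"]
        if not (start <= gps_time < end):  # assumed half-open interval
            continue
        for location in collection["locations"]:
            if filename in location["filenames"]:
                urls.append(f'{location["url_constructor"]}/{filename}')
    return urls


print(physical_urls(638834071, "H-638834071.T"))
print(physical_urls(638834271, "H-638834271.T"))
```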

Template instantiation
Abstract G-DAG
  – globus_url_copy X from a to b
  – Register X in RC with location b
Bindings
  – C_A_100 in dc.isi.edu/frames
  – Output location: host.uwm.edu/myframes
Concrete G-DAG (DAGMan)
  – globus_url_copy C_A_100 from dc.isi.edu/frames to host.uwm.edu/myframes
  – Register C_A_100 in RC with location host.uwm.edu/myframes
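
Template instantiation amounts to substituting concrete names for the placeholders X, a and b. A minimal Python sketch using the standard library's `string.Template` follows; the step strings mirror the slide's wording and are treated as node labels, not as real command lines, and the names `TRANSFER_TEMPLATE` and `instantiate` are invented for this example.

```python
from string import Template

# Abstract G-DAG: steps with placeholders X (file), a (source), b (destination).
TRANSFER_TEMPLATE = [
    Template("globus_url_copy $X from $a to $b"),
    Template("register $X in RC with location $b"),
]


def instantiate(template, bindings):
    """Fill every placeholder; raises KeyError if a binding is missing."""
    return [step.substitute(bindings) for step in template]


concrete = instantiate(TRANSFER_TEMPLATE, {
    "X": "C_A_100",
    "a": "dc.isi.edu/frames",
    "b": "host.uwm.edu/myframes",
})
for step in concrete:
    print(step)
```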

Templates
• Transfer
  – globus_url_copy X from a to b
• Concatenate
  – globus_url_copy X from a to b
  – globus_url_copy Y from c to b
  – Execute LDAS_concat X, Y at b
  – globus_url_copy Z from b to d
  – Register Z in RC with location d
• Decimate
  – globus_url_copy X from a to b
  – Execute decimate on X at b
• Extract
  – globus_url_copy X from a to b
  – Execute LDAS_extract C_Y from X at b
  – globus_url_copy C_Y from b to c
  – Register C_Y in RC with location c
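
Once a template is instantiated, its steps have to be handed to DAGMan as a DAG description. The sketch below writes the `JOB` and `PARENT ... CHILD` lines of a DAGMan input file for a linear chain such as the extract template. The node names and submit file names are hypothetical, and the contents of the submit files are not shown.

```python
def dagman_text(jobs):
    """jobs: ordered (node_name, submit_file) pairs; each node depends on the previous."""
    lines = [f"JOB {name} {submit_file}" for name, submit_file in jobs]
    # Chain consecutive nodes so that they run strictly in order.
    for (parent, _), (child, _) in zip(jobs, jobs[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


# Extract template: copy X in, run the extraction, copy C_Y out, register it.
EXTRACT_CHAIN = [
    ("copy_X", "copy_X.sub"),
    ("extract", "extract.sub"),
    ("copy_C_Y", "copy_C_Y.sub"),
    ("register_C_Y", "register_C_Y.sub"),
]

dag_text = dagman_text(EXTRACT_CHAIN)
print(dag_text, end="")
```

The concatenate template is not a simple chain, since both input copies must finish before LDAS_concat runs; it would need two `PARENT` lines pointing at the same child.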

Execution Environment
[Figure: execution environment. Labels: DAGMan, Condor-G, GRAM, GridFTP, GRAM/LDAS with GridFTP, compute resources; sites: ISI, UWM, Caltech]

Security Model
• User needs to register a proxy certificate with the MyProxy server
• User to Prototype: user’s username and password (SSL connection)
• Prototype to MyProxy server: authentication between Prototype and MyProxy (Prototype: userid and certificate), then the user’s user name and password
• MyProxy server to Prototype: user’s proxy credential
• $X509_USER_PROXY = user’s proxy credential
• GridFTP, Condor-G: GSI authentication
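
The last step, pointing the grid tools at the retrieved credential, can be sketched as follows. This covers only the environment handling: retrieval from the MyProxy server is not shown, and the function name and the proxy file path are hypothetical.

```python
import os


def environment_with_proxy(proxy_path, base_env=None):
    """Copy an environment and set X509_USER_PROXY to the user's proxy file."""
    env = dict(os.environ if base_env is None else base_env)
    env["X509_USER_PROXY"] = proxy_path
    return env


base = {"PATH": "/usr/bin"}
env = environment_with_proxy("/tmp/x509up_example", base_env=base)
print(env["X509_USER_PROXY"])
```

Working on a copy keeps one user's credential from leaking into the environment used for another user's request; the resulting dictionary can be passed to a child process, for example through the `env` argument of `subprocess.run`.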

LDAS/Globus Interface
• Secure, GSI-enabled interface to LDAS
[Figure: LDAS/Globus interface. Globus labels: Condor-G, RSL-specified job, GRAM, LDAS commands, GridFTP (gsiftp://), data in LDAS space. LDAS labels: UserAPIs (Tcl), managerAPI with assistant managers, Job/ID; anonymous FTP, Web server, e-mail; LIGO_LW files, light-weight (XML based); e-mail with job status and location of results]

The JOB ID = 11397
the Desired channel name is H2:LSC-AS_Q with timestamp 65800000
writing sub file transfer_a2b_1.11397.sub
WILL TRANSFER H-65800000.F
from gsiftp://dc-n1.isi.edu/ligodata/frames/H-65800000.F to
gsiftp://dataserver.phys.uwm.edu/grid_incoming/H-65800000.F
Writing Sub File transform.11397.sub
my infile is H-65800000.F,
my outfile is H-H2:LSC-AS_Q-65800000.xml11397
Will apply Transformation
WILL GET RESULT H-H2:LSC-AS_Q-65800000.xml
from
gsiftp://dataserver.phys.uwm.edu/grid_outgoing/H-H2:LSC-AS_Q-65800000.xml11397 to
gsiftp://dc-n1.isi.edu/OUTPUT/H-H2:LSC-AS_Q-65800000.xml
the data will be available at
http://dc-n1.isi.edu/OUTPUT/H-H2:LSC-AS_Q-65800000.xml
**********JOB SUBMITTED************

Outlook
Short term (SC’2001)
  – Integrate with RC
  – Integrate with MyProxy
  – Better job monitoring
Longer term
• Support a richer set of Virtual Data products
• Implement and incorporate the Transformation Catalog
• Investigate Derived Data Catalog (abstract DAGs)
• Investigate more complex planning techniques
  – Query estimation
• Execution
  – Error handling and recovery
  – Alternative strategies

Prototype to Tool
• Enable a close collaboration between LIGO participants and other gravitational wave communities such as Virgo and GEO
• Replica Catalogs to provide information about data existence (register newly selected data in the catalog to make it available)
• Including Materialized Data (no need to recompute)
• Provide access to data analysis software and systems (Transformation Catalog)
• Use the prototype as the basis for a data exchange and replication system (security)
• Provide access to world-wide computing resources
DEMO TONIGHT