http://www.csd.abdn.ac.uk/teaching/levelfive/CS5547 1
CS5547 e-Science & Grid Computing - introduction -
What is e-Science? What is the Grid?
Grid middleware
Virtual Organisations - some issues
Data access & integration
Metadata
MSc in e-Science Technology at-a-glance
CS5547 Some definitions
e-Science: “The large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet. Typically, a feature of such collaborative scientific enterprises is that they will require access to very large data collections, very large scale computing resources and high performance visualisation back to the individual user scientists.”
[nesc.ac.uk]
Grid: “An infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources.”
[Foster & Kesselman, globus.org]
CS5547 The Global Grid
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
CS5547 UK SuperJANET 4/5
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
(Links up to 2.5 Gbit/s)
CS5547 Scale, distribution, complexity
Multiscale modelling of the heart
Cell
Person
Multiscale modelling of cancer
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
CS5547 Large Hadron Collider (LHC)
http://gridportal.hep.ph.ic.ac.uk/rtm/
http://www.nesc.ac.uk/events/ahm2004/presentations/BobJones.ppt
CS5547 e-Science & engineering
Engine flight data
Airline office
Maintenance Centre
European data center
London Airport
New York Airport
American data center
Grid
Diagnostics Centre
“A significant factor in the success of the Rolls-Royce campaign to power the Boeing 7E7 with the Trent 1000 was the emphasis on the new aftermarket support service for the engines provided via DS&S. Boeing personnel were shown DAME as an example of the new ways of gathering and processing the large amounts of data that could be retrieved from an advanced aircraft such as the 7E7, and they were very impressed.”
[DS&S, 2004]
XTO
Engine Model
Case Based Reasoning
Signal Data Explorer
Companies: Rolls-Royce, DS&S, Cybula
Universities: York, Leeds, Sheffield, Oxford
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
CS5547 e-Science workflows
[Figure: workflow diagram in three parts, A, B and C]
A: Identification of overlapping sequence
B: Characterisation of nucleotide sequence
C: Characterisation of protein sequence
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
CS5547 Grid middleware: Globus toolkit (GT)
The Anatomy of the Grid: Enabling Scalable Virtual Organizations. I. Foster, C. Kesselman, S. Tuecke. International J. Supercomputer Applications, 15(3), 2001.
The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. I. Foster, C. Kesselman, J. Nick, S. Tuecke. Open Grid Service Infrastructure WG, Global Grid Forum, 2002.
http://www.globus.org
CS5547 Grid & Web Services convergence
The definition of WSRF means that the Grid and Web services communities can move forward on a common base.
http://www.globus.org
CS5547 Web & Grid Services
‘WS-I+’ profile
• WS-I: standards that have broad industry support and multiple interoperable implementations
• Specifications that are emerging from a standardisation process and are recognised as being ‘useful’
• Specifications that have entered or will enter a standardisation process but are not stable and are still experimental
http://www.globus.org
CS5547 UK National Grid Service
Projects
• e-Minerals
• e-Materials
• Orbital Dynamics of Galaxies
• Bioinformatics (using BLAST)
• GEODISE project
• UKQCD Singlet meson project
• Census data analysis
• MIAKT project
• e-HTPX project
• RealityGrid (chemistry)
Users: Leeds, Oxford, UCL, Cardiff, Southampton, Imperial, Liverpool, Sheffield, Cambridge, Edinburgh, QUB, BBSRC, CCLRC
Interfaces: OGSI::Lite
http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt
CS5547 Grid Virtual Organisations - some issues
Forming a VO dynamically
• partner identification
• Service Level Agreements (SLAs)
• QoS, trust, reputation
Operating a VO
• monitoring QoS
• perturbation: coping with failures - and new opportunities!
• policing: what went wrong? who’s to blame?
www.conoise.org
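To make "monitoring QoS" against an SLA concrete, here is a minimal sketch in Python. It is not taken from the CONOISE project: the `SLA` and `Observation` types, their fields, and the two metrics checked (availability and mean latency) are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLA:
    """Hypothetical agreement between a VO and one provider."""
    provider: str
    max_latency_ms: float    # agreed upper bound on mean response time
    min_availability: float  # agreed fraction of successful requests (0..1)


@dataclass(frozen=True)
class Observation:
    """One monitored request to a provider (hypothetical record)."""
    provider: str
    latency_ms: float
    succeeded: bool


def violations(sla, observations):
    """Describe each way the provider's observations breach the SLA."""
    mine = [o for o in observations if o.provider == sla.provider]
    if not mine:
        return []  # nothing observed, so nothing to report
    problems = []
    ok = [o for o in mine if o.succeeded]
    availability = len(ok) / len(mine)
    if availability < sla.min_availability:
        problems.append(
            f"availability {availability:.2f} below {sla.min_availability:.2f}")
    if ok:
        mean_latency = sum(o.latency_ms for o in ok) / len(ok)
        if mean_latency > sla.max_latency_ms:
            problems.append(
                f"mean latency {mean_latency:.0f} ms above {sla.max_latency_ms:.0f} ms")
    return problems
```

A non-empty result is the kind of signal that could feed the "perturbation" and "policing" steps, for example by triggering a search for a replacement partner.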
CS5547 Grid Data Service
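[Figure: Grid Data Service architecture. Labelled components: The Engine, running QueryActivity, TransformActivity and DeliveryActivity; Role Mapper; Data Resource Implementation. Labelled flows: perform document (a sequence of elements) in, response document out; credentials mapped to a role; role and credentials used to open a connection; query and data passed over the connection.]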
http://www.ogsadai.org.uk/
CS5547 GDS - pipeline example
Pipeline: SqlQueryStatement, then DeliverToURL

<sqlQueryStatement name="statement">
  <expression>
    select * from myTable where id=10
  </expression>
  <resultSetStream name="MyOutput"/>
</sqlQueryStatement>

<deliverToURL name="deliverOutput">
  <fromLocal from="MyOutput"/>
  <toURL> ftp://anon:[email protected]/home </toURL>
</deliverToURL>
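The two activities are linked only by the stream name: the query writes to "MyOutput" and the delivery reads from it. As a rough illustration, the Python sketch below builds the same two elements with the standard library and checks that every `fromLocal` input refers to a stream some activity produces. The `perform` wrapper element, the helper names and the example URL are assumptions made for this sketch, not part of the OGSA-DAI API.

```python
import xml.etree.ElementTree as ET


def build_perform_document(sql, url, stream="MyOutput"):
    """Build a two-activity pipeline: SQL query, then delivery to a URL."""
    # "perform" is a hypothetical wrapper name used only for this sketch.
    root = ET.Element("perform")

    query = ET.SubElement(root, "sqlQueryStatement", name="statement")
    ET.SubElement(query, "expression").text = sql
    ET.SubElement(query, "resultSetStream", name=stream)

    deliver = ET.SubElement(root, "deliverToURL", name="deliverOutput")
    # "from" is a Python keyword, so the attribute is passed as a dict.
    ET.SubElement(deliver, "fromLocal", {"from": stream})
    ET.SubElement(deliver, "toURL").text = url
    return root


def unconnected_inputs(root):
    """Return the names of input streams that no activity produces."""
    produced = {e.get("name") for e in root.iter("resultSetStream")}
    return [e.get("from") for e in root.iter("fromLocal")
            if e.get("from") not in produced]


if __name__ == "__main__":
    doc = build_perform_document(
        "select * from myTable where id=10",
        "ftp://example.org/home",  # placeholder URL, not from the slide
    )
    print(ET.tostring(doc, encoding="unicode"))
    print("unconnected inputs:", unconnected_inputs(doc))
```

Checking the stream names before submission catches the most likely pipeline mistake, a mistyped `from` attribute, without needing a running service.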
http://www.ogsadai.org.uk/
CS5547 Grid data access & integration
Solutions in place to handle
• heterogeneous data storage
• pipelines / dataflows
• access control
• … within the Grid service architecture
Not specific to e-Science! e.g. see FirstDIG project
Major issues remain, including
• provenance - where did it come from, who did what to it?
• data quality - living with variable-quality data (www.qurator.org)
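One simple way to picture the provenance questions ("where did it come from, who did what to it?") is as a chain of records, each naming the agent, the action and the item it was derived from. The Python sketch below is illustrative only: the `ProvenanceRecord` type and its fields are assumptions, and real provenance models are considerably richer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record: who did what to produce a data item."""
    item: str
    agent: str
    action: str
    derived_from: Optional[str] = None  # None marks an original source


def lineage(item, records):
    """Walk back from an item to its original source, newest first."""
    by_item = {r.item: r for r in records}
    chain = []
    seen = set()  # guards against accidental cycles in the records
    current = item
    while current in by_item and current not in seen:
        seen.add(current)
        record = by_item[current]
        chain.append(record)
        current = record.derived_from
    return chain
```

Calling `lineage("plot", records)` on records for a plot derived from cleaned data derived from raw data would return the three records in that order, answering both questions for each step.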
http://www.ogsadai.org.uk/
CS5547 Metadata in e-Science
Publications
• formal/reviewed
• “grey”
• associated artefacts
People
• expert directories
• communities of practice
Projects
• formal/funded
• working groups
Experiment datasets
• formally curated
• raw/pre-processed
• in vivo / in vitro / in silico
Scientific method
• experiment workflow
• knowledge roles: hypotheses, observations, predictions, deductions, …
• Discourse & natural arguments: proof, refutation, agreement, …
CS5547 Managing scientific metadata
e-Science metadata management platform
[Figure: metadata graph with Hypothesis, Publication, Experiment and Evidence nodes, linked by the relations Agrees With, Disagrees With and Described In.]
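The node and relation labels above suggest a graph of subject-relation-object statements. The following Python sketch stores such statements as triples and queries them. Which node types each relation connects is an assumption here, and the identifiers (`pub:1`, `hyp:1`, and so on) are invented for the example.

```python
class MetadataGraph:
    """Tiny in-memory triple store for scientific metadata (illustrative)."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, relation, obj):
        """Record one statement, e.g. a publication agreeing with a hypothesis."""
        self._triples.add((subject, relation, obj))

    def subjects(self, relation, obj):
        """Return, sorted, every subject linked to obj by the given relation."""
        return sorted(s for s, r, o in self._triples
                      if r == relation and o == obj)


if __name__ == "__main__":
    g = MetadataGraph()
    # Assumed reading of the figure: publications take a stance on hypotheses,
    # and experiments are described in publications.
    g.add("pub:1", "agreesWith", "hyp:1")
    g.add("pub:2", "disagreesWith", "hyp:1")
    g.add("exp:1", "describedIn", "pub:1")
    print("supporting:", g.subjects("agreesWith", "hyp:1"))
    print("opposing:", g.subjects("disagreesWith", "hyp:1"))
```

Keeping the statements as plain triples mirrors how ontology-based metadata is usually stored, so the same queries could later be moved to a proper RDF store.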
CS5547 Fearlus-G pilot project
desktop client
metadata schema (ontology)
metadata client
Globus client
CS5547 MSc e-Science Technologies: next…
CS5547 e-Science & Grid Computing
• Grid middleware, e-Science workflow, metadata
CS5553 Intelligent Architectures
• technologies for Virtual Organisations
CS5545 Data Interpretation & Communication
• technologies at the data/user-scientist interface
CS5544 E-Technology Workshop
• group project, with an e-Science application
CS5945 MSc Project in E-Technology
• potential to do a project with user-scientists