The Grid Needs You. Enlist Now!

Professor Carole Goble
University of Manchester, UK, carole@cs.man.ac.uk

Co-director e-Science North West UK regional centre
Director myGrid UK e-Science pilot project
Co-chair Global Grid Forum Semantic Grid Research Group


Page 2

The Grid Needs You. Enlist Now!

The what and why of the Grid.

Services, data and semantics and the Grid.

Getting involved – a call to arms.

Page 3

The take home

“The Grid is the next big thing” – and it isn’t just big computers and fat pipes.

The Grid is actually the latest attempt at distributed computing

If you aren’t involved yet, maybe it’s because you don’t think it’s relevant, or it’s done already, or you haven’t anything to offer

You are most likely wrong

If you are already into the Grid, this is a “ra ra” exercise

Page 4

Origins of the Grid

The Grid: Blueprint for a New Computing Infrastructure

Edited by Ian Foster and Carl Kesselman

July 1998, 701 pages.

A proposed distributed computing infrastructure for advanced science and engineering

Pervasive and dependable

Page 5

What is the Grid?

• Computational power as a utility
• Securely and transparently sharing supercomputing resources on demand
• Fast pig iron with fat pipes for cycle intensive scientific problems
• Large scale data access and transportation
• Making the most of what you have got

Page 6

Why do it now?

• Enormous quantities of data: Petabytes
    For an increasing number of communities, the gating step is not collection but analysis
• Ubiquitous Internet: 100+ million hosts
    Collaboration & resource sharing the norm
• Ultra-high-speed networks: 10+ Gb/s
    Global optical networks
• Huge quantities of computing: 100+ Top/s
    Moore’s law gives us all supercomputers

114 genomes, 735 in progress

Page 7

Isn’t this just high performance computing for high energy physicists?

Page 8

What is the Grid for?

Global e-Science

Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed.

The motivation for “Grids” is to facilitate the routine interactions of these resources in order to support large-scale science and engineering.

KEYWORDS: Collaboration, Democratization, Speculation

Bill Johnston, NASA July 01

Page 9

Global Collaborative Knowledge Communities

Slide courtesy of Ian Foster

Page 10

Global Knowledge Communities

• Teams organised around common goals
    Communities: “Virtual organisations”
• Overlapping memberships, resources and activities
• Essential diversity is a strength & challenge
    membership & capabilities
• Geographic and political distribution
    No location/organisation/country possesses all required skills and resources
• Dynamic: adapt as a function of their situation
    Adjust membership, reallocate responsibilities, renegotiate resources

Slide derived from Ian Foster’s SSDBM 03 keynote

Page 11

The Grid Opportunity

“flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources - what we refer to as virtual organizations.”

The Anatomy of the Grid: Enabling Scalable Virtual Organizations
Foster, Kesselman, Tuecke

KEYWORD: VIRTUALISATION

Page 12

Why Grids?

• A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour;
• A biologist combines a range of diverse and distributed resources (databases, tools, instruments) to answer complex questions;
• 1,000 physicists worldwide pool resources for petaop analyses of petabytes of data
• Civil engineers collaborate to design, execute, & analyze shake table experiments
• Climate scientists visualize, annotate, & analyze terabyte simulation datasets
• An emergency response team couples real time data, weather model, population data
• A multidisciplinary analysis in aerospace couples code and data in four companies

Slide courtesy of Steve Tuecke

Page 13

Telemicroscopy

Sharing of UHVEM (Ultra High Voltage Electron Microscopy) in Osaka University with NCMIR (National Center for Microscopy and Imaging Research)

3 million electron volts; the most powerful microscopy facility

KEYWORDS: SHARING SCARCE RESOURCES ON DEMAND

[Network diagram labels: UHVEM (Osaka, Japan); Osaka University; JGN; Tokyo XP; TransPAC; APAN; STAR TAP (Chicago); vBNS; SDSC (UC San Diego); NCMIR (San Diego)]

Page 14

Smallpox Grid

United Devices, IBM, Oxford University, Accelrys

Analysis of 35 million drug compounds against nine smallpox proteins to try to find a way to stop the replication of the virus.

Volunteers from over 190 countries donated their spare CPU power at www.grid.org, the world's largest public computing resource

Contributed over 39,000 years of computing time in less than six months.

September 30, 2003: delivered the results of the Smallpox Research Grid project to representatives from the United States Department of Defense in an event hosted by the British Embassy in Washington, D.C.
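
As a rough sanity check on the scale of that donation, the figures above can be turned into an average number of busy processors. This is only a back-of-the-envelope sketch: it treats "less than six months" as an upper bound of half a year, so the result is a lower bound, and the variable names are illustrative.

```python
# Back-of-the-envelope: average number of CPUs kept busy by the donation.
cpu_years_donated = 39_000     # figure quoted on the slide
elapsed_years_upper = 0.5      # "less than six months", taken as an upper bound

# A shorter elapsed time would only raise this average, so it is a lower bound.
min_average_busy_cpus = cpu_years_donated / elapsed_years_upper
print(f"at least {min_average_busy_cpus:,.0f} CPUs busy on average")
```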

Page 15

230 - Radiologists (Double Reading)
50% - Workload Increase

2,000,000 - Screened every Year
120,000 - Recalled for Assessment
10,000 - Cancers
1,250 - Lives Saved

Digital

Page 16

RealityGrid http://www.realitygrid.org

Closely coupling computation and experiment to speed up scientific discovery.

Simulation, visualization and data gathering coupled

X-ray microtomography produces 3D X-ray attenuation maps of specimens at a microscopic level

Scientist remotely steers calculation from laptop

Visualization and computation use supercomputers accessed via Grid.

Page 17

Collaboration

Interactive environments and virtual presence integrated with Grid middleware

SARS Combat Grid, Taiwan

Emergency Access Grids

Integration of patient data and models of dissemination

http://www.accessgrid.org

Page 18

Access Grid

Page 19

Foundation for e-Science

[Diagram labels: Grid; sensor nets; shared data archives; computers; software; colleagues; instruments]

Diagram derived from Ian Foster’s slide

Page 20

Butterfly.net

Fully-distributed server technology pioneering the use of open grid computing protocols in large-scale immersive game networks that support unlimited numbers of players and require the most demanding levels of service.

Page 21

More commercial examples…

Novartis Pharmaceuticals: accelerate lead identification and profiling to increase relevant targets in drug discovery; screening applications that were previously considered CPU constrained.

Nippon Life Insurance: improve the performance of financial risk management applications; a customer project applying Grid technology to this application. Reduced processing time for financial risk calculation from around 10 hours to about 49 minutes, a 12-fold increase in speed. Can run more complex scenarios to reduce risk exposure.
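
The quoted 12-fold figure can be checked directly from the two run times. The sketch below assumes the rounded values on the slide ("around 10 hours", "about 49 minutes"), so the result is only approximate.

```python
# Speed-up implied by the run times quoted on the slide.
before_minutes = 10 * 60   # "around 10 hours"
after_minutes = 49         # "about 49 minutes"

speedup = before_minutes / after_minutes
print(f"speed-up: {speedup:.1f}x")
```

The ratio comes out a little above 12, consistent with the "12-fold" claim.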

Page 22

Global Grid Forum
http://www.ggf.org

Standards body for Grid Computing

• Over 2000 members
• All the vendors
• 44 WGs and RGs
• Three meetings per annum
• ~1000 attendees at plenary meetings
• ~400 at “working” meetings
• GGF10 Frankfurt, March 2004

Page 23

Investment

• UK Government invested £240 million into e-Science and Grid related research
• EU invested ~€351 million in FP5 and FP6
• USA invested – lots!
• IBM invested ~10-20% R&D budget in Grid Computing
    $1.5 million per annum on GridFTP alone
• Japan and China invested in Grids
• Practically every EU member has a Grid programme.

Page 24

The Grid means what I say it means

• The Grid – the vision of forming federations
• A Grid – a virtual organisation of resources
    Machines – computational grid
    Geography – a UK Grid
    A field – Mouse Genome Grid
    A (temporary) problem – protein folding simulation
• No one grid – lots of interoperating Grids
• Grid middleware infrastructure specification
    Services stacks, policies, protocols, standards, APIs
• Reference implementations
    Globus, Condor, Unicore, Sun Grid Engine, Avaki, United Devices...
• Grid tools
    Portals, heartbeat monitors etc
• E-Science: application of all the above for the benefit of Science

Page 25

The Grid is forming federations…

• Infrastructure middleware for establishing, managing, and evolving multi-organizational federations
    Dynamic, autonomous, domain independent
    On-demand, ubiquitous access to computing, data, and services
• Mechanisms for resource virtualization & workflow management within federations
• New capabilities constructed dynamically and transparently from distributed services
    Service-oriented, virtualization

Page 26

…when the federations are…

• Dynamic and volatile. A consortium of services (databases, sensors, compute servers) participating in a complex analysis may be switched in and out as they become available or cease to be available;
• Ad-hoc. Service consortia have no central location, no central control, and no existing trust relationships;
• Large. Hundreds of services could be orchestrated at any time;
• Potentially long-lived. A simulation could take weeks.

HOLD THESE THOUGHTS!

Page 27

myGrid http://www.mygrid.org.uk

Knowledge-driven middleware for data intensive ad hoc in silico experiments in biology

Straightforward discovery, interoperation, deployment & sharing of services

Service-oriented architecture

Semantic based discovery of workflows and workflow composition

Integration and Information

Workflow & Distributed DB Queries

Experimentation

Provenance, propagating change, personalisation

Page 30

Three legacy views

• Grid middleware is a bag of low level protocols
• The Grid is about compute cycle stealing
• The Grid is about plumbing and has nothing to do with semantics

This was once true. Some still hold this view (notably US programme managers)

It is not the view of the Grid visionaries or the Grid policy makers outside the US.

The Open Grid Service Architecture

Data Grids

Semantic Grids

Page 31

Grid Evolution: 1st generation

[Chart: increased functionality and standardization (vertical axis) against time (horizontal axis). Labels: Custom solutions; X.509, LDAP, FTP, …; Globus Toolkit; de facto standards; GGF: GridFTP, GSI]

• Computationally intensive
• File access/transfer
• Bag of various heterogeneous protocols & toolkits
• Monolithic design
• Recognises internet, ignores Web
• Academic teams

Legion, Condor, Unicore …

(based on Foster GGF7 Plenary)

Page 32

Grid Evolution: 2nd Generation

[Chart: increased functionality and standardization (vertical axis) against time (horizontal axis). Labels: Custom solutions; X.509, LDAP, FTP, …; Globus Toolkit; de facto standards; GGF: GridFTP, GSI; Web services; Open Grid Services Arch; GGF: OGSI, … (+ OASIS, W3C); multiple implementations, including Globus Toolkit 3; App-specific Services]

• Data intensive -> knowledge intensive
• Open services-based architecture
• Recognises Web services
• Global Grid Forum
• Industry participation

(based on Foster GGF7 Plenary)

Page 33

Open Grid Services Architecture
ongoing since early 2002

Standard mechanisms for describing and invoking services: WSDL, SOAP, WS-Security etc

Standard interfaces and behaviours for distributed systems: naming, service state, lifetime management, notification

Standard services: agreement, data access and integration, workflow, security, policy…

Specific services: drug discovery pipeline

[Layer diagram labels: Grid Applications; OGSA; OGSI; Web Services]

(Graphic courtesy of Savas Parastatidis)

Page 34

OGSI: Standard Web Services Interfaces & Behaviours

• Naming and bindings (basis for virtualization)
    Every service instance has a unique name (Grid Service Handle), from which one can discover the supported bindings, which are volatile (Grid Service Reference)
    Two-tiered naming scheme to cope with service migration and failover
• Lifecycle (basis for fault resilient state management)
    Service instances created by factories
    Destroyed explicitly or via soft state
• Information model (basis for monitoring & discovery)
    Service data (attributes) associated with GS instances (SDEs)
    Operations for querying (introspecting) and setting this info
    Asynchronous notification of changes to service data
• Service Groups (basis for registries & collective services)
    Group membership rules & membership management
• Base Fault type

All sound kind of familiar?
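
The lifecycle bullet (created by factories, destroyed explicitly or via soft state) can be illustrated with a small sketch. This is not the OGSI interface or any toolkit's API: the class names, the `handle:N` naming and the explicit `now` argument are all assumptions made so the example is deterministic and self-contained.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ServiceInstance:
    handle: str                  # stands in for a Grid Service Handle
    termination_time: float      # soft-state deadline, in seconds
    service_data: dict = field(default_factory=dict)  # stands in for SDEs


class Factory:
    """Creates service instances and reaps those whose lifetime has lapsed."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.instances = {}

    def create(self, now, lifetime):
        handle = f"handle:{next(self._ids)}"   # made-up handle format
        self.instances[handle] = ServiceInstance(handle, now + lifetime)
        return handle

    def keep_alive(self, handle, now, lifetime):
        # A client that still needs the instance pushes the deadline out.
        self.instances[handle].termination_time = now + lifetime

    def destroy(self, handle):
        # Explicit destruction.
        del self.instances[handle]

    def sweep(self, now):
        # Soft-state destruction: drop every instance whose deadline has passed.
        expired = [h for h, s in self.instances.items()
                   if s.termination_time <= now]
        for handle in expired:
            del self.instances[handle]
        return expired


factory = Factory()
a = factory.create(now=0, lifetime=10)
b = factory.create(now=0, lifetime=10)
factory.keep_alive(a, now=8, lifetime=10)   # a is renewed, b is forgotten
reaped = factory.sweep(now=12)
print(reaped, sorted(factory.instances))
```

The point of soft state is that a crashed or departed client needs no cleanup protocol: an instance nobody renews simply expires.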

Page 35

OGSI

[Diagram labels: Client; Grid Service Handle; handle resolution; Grid Service Reference; Implementation; Service data element (x3); GridService (required); Other standard interfaces: factory, notification, collections; Hosting environment/runtime (“C”, J2EE, .NET, …)]

Data access

Lifetime management
• Explicit destruction
• Soft-state lifetime

Introspection:
• What port types?
• What policy?
• What state?

(Slide courtesy of Ian Foster)

Page 36

Service registry

Service requestor (e.g. user application)

Service factory

Create Service

Grid Service Handle

Resource allocation

Service instances

Register Service

Service discovery

Interactions standardized using WSDL

Service data

Keep-alives

Notifications

Service invocation

Authentication & authorization are applied to all requests

OGSI

(Slide courtesy of Ian Foster)

Page 37

Service Migration

[Diagram labels: Hosting Environment A; Hosting Environment B; Service; Requester; HandleResolver; Service Locator; GSH hdl:1.2/abc...; GSR <wsdl>...]

1. Service Migration
2. new network endpoint (GSR) registration for same GSH
3. failed access with old network endpoint info (old GSR)
4. findByHandle(GSH)
5. new GSR with new network endpoint
6. successful access to moved service through new GSR

(Slide courtesy of Ian Foster)
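
The six migration steps amount to "try the cached reference, and on failure re-resolve the handle". A minimal sketch of that pattern follows. The `Network` and `HandleResolver` classes, the `invoke` helper and the `example` URLs are hypothetical stand-ins, not Globus or OGSI APIs; only the handle string `hdl:1.2/abc` and the name `findByHandle` come from the slide.

```python
class StaleReference(Exception):
    """Raised when a reference points at an endpoint the service has left."""


class HandleResolver:
    def __init__(self):
        self._table = {}                 # GSH -> current GSR

    def register(self, gsh, gsr):
        self._table[gsh] = gsr

    def find_by_handle(self, gsh):       # plays the role of findByHandle(GSH)
        return self._table[gsh]


class Network:
    """Toy network: maps endpoint addresses to callables."""

    def __init__(self):
        self.endpoints = {}

    def call(self, gsr, request):
        if gsr not in self.endpoints:
            raise StaleReference(gsr)
        return self.endpoints[gsr](request)


def invoke(network, resolver, gsh, cached_gsr, request):
    """Try the cached reference first; on failure re-resolve the handle once."""
    try:
        return network.call(cached_gsr, request), cached_gsr
    except StaleReference:                                   # step 3
        fresh_gsr = resolver.find_by_handle(gsh)             # steps 4 and 5
        return network.call(fresh_gsr, request), fresh_gsr   # step 6


def service(request):
    return f"handled {request}"


network, resolver = Network(), HandleResolver()
gsh = "hdl:1.2/abc"
old_gsr = "http://host-a.example/service"
new_gsr = "http://host-b.example/service"

network.endpoints[old_gsr] = service
resolver.register(gsh, old_gsr)
cached = resolver.find_by_handle(gsh)

# Step 1: the service migrates from A to B.
del network.endpoints[old_gsr]
network.endpoints[new_gsr] = service
# Step 2: the new endpoint is registered under the same handle.
resolver.register(gsh, new_gsr)

result, cached = invoke(network, resolver, gsh, cached, "ping")
print(result, cached)
```

The requester only ever stores the stable handle long term; references are treated as a cache that may go stale.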

Page 38

Sound familiar?

Layering a component-based distributed object model over a web service framework

Early OGSI implementations: Globus Toolkit 3, OGSI.NET, OGSI::Lite, Unicore

Web Services: loosely coupled, stateless, persistent

CORBA: tightly coupled, naming, stateful, lifetime management

Grid Services: robust naming, stateful, lifetime management

Page 39

OGSI Status and Issues

• OGSI version 1.0 in GGF proposed recommendation
• Issue: compliance to Web Service Standards
    GWSDL changes WSDL 1.1 by extending portType syntax to define a Service Data Element.
• Why not use WS standards for state management idioms: e.g. WS-Context/Coordination?
• By eliminating a new mandatory infrastructure (OGSI), can use conventional tooling.
• But it needs to meet the requirements of Grid

[Diagram labels: Service Implementation; OGSA Operations; Service specific operations; WS-Context and/or other WS-*]

https://forge.gridforum.org/projects/ogsi-wg

(Graphic courtesy of Savas Parastatidis)

Page 40

300 pound gorillas

If you want to use standards then you have to use them or work with them

W3C and OASIS are big gorillas

E.g. GSH/GSR, Handle.net, Life Science Identifier and WS-Addressing

Page 41

Grid Applications On The Move
The rise of the Information Grid

High Energy Physics
• Large scale data
• Large number of machines
• Computationally intensive
• Simple semantics
• Small homogeneous communities

Functional Genomics, Oceanography, Biodiversity, Earth Science, Neuroscience …
• Smaller scale data
• Data intensive
• Complex heterogeneous applications
• Complex semantics
• Large diverse communities

Page 42

Data-intensive integration: what the e-scientist REALLY wants

• Scientists do data integration
    Actually they do application and model integration too!
• Cooperative information systems
• Workflows
• Data virtualisation

Page 43

Integrating Across Biological Systems

Page 44

… & Types of Information

ID   MURA_BACSU STANDARD; PRT; 429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
GN   MURA OR MURZ.
OS   BACILLUS SUBTILIS.
OC   BACTERIA; FIRMICUTES; BACILLUS/CLOSTRIDIUM GROUP; BACILLACEAE;
OC   BACILLUS.
KW   PEPTIDOGLYCAN SYNTHESIS; CELL WALL; TRANSFERASE.
FT   ACT_SITE 116 116 BINDS PEP (BY SIMILARITY).
FT   CONFLICT 374 374 S -> A (IN REF. 3).
SQ   SEQUENCE 429 AA; 46016 MW; 02018C5C CRC32;
     MEKLNIAGGD SLNGTVHISG AKNSAVALIP ATILANSEVT IEGLPEISDI ETLRDLLKEI
     GGNVHFENGE MVVDPTSMIS MPLPNGKVKK LRASYYLMGA MLGRFKQAVI GLPGGCHLGP
     RPIDQHIKGF EALGAEVTNE QGAIYLRAER LRGARIYLDV VSVGATINIM LAAVLAEGKT
     IIENAAKEPE IIDVATLLTS MGAKIKGAGT NVIRIDGVKE LHGCKHTIIP DRIEAGTFMI
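
A record like this is line-oriented: each line begins with a two-letter code (ID, DE, OS, KW, ...) followed by its content, and a field that does not fit on one line repeats its code. The sketch below shows one simple way to read such a record into a dictionary. It is an illustration only, not a full parser for the format, and `parse_record` is a made-up helper; the sample lines are copied from the record above.

```python
from collections import defaultdict

RECORD = """\
ID   MURA_BACSU STANDARD; PRT; 429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
GN   MURA OR MURZ.
OS   BACILLUS SUBTILIS.
KW   PEPTIDOGLYCAN SYNTHESIS; CELL WALL; TRANSFERASE.
"""


def parse_record(text):
    """Group the text of each line under its two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        code, _, value = line.partition(" ")
        fields[code].append(value.strip())
    # Continuation lines that share a code are joined with single spaces.
    return {code: " ".join(parts) for code, parts in fields.items()}


entry = parse_record(RECORD)
entry_name = entry["ID"].split()[0]
keywords = [k.strip() for k in entry["KW"].rstrip(".").split(";")]
print(entry_name, entry["OS"], keywords)
```

Even this small example hints at the integration problem the slide is making: the meaning of each field lives in the conventions of the format, not in the data itself.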

Page 45

Data on the Grid: pre-OGSA

Chiefly files!

LDAP as a query language

No RDBMS access from Globus 1.1

MDS and MCAT catalogs

Honorable exception

Storage Resource Broker

“Support data-intensive applications that manipulate very large data sets by building upon object-relational database technology and archival storage technology”

Page 46

OGSA-Data Access and Integration
GGF OGSA-DAIS WG

• Data Grid applications benefit from many lower level services:
    Data movement.
    Data replication.
    Data virtualisation.
    Database access and integration.
• Work underway on designing, developing and standardising many core Grid Data Management services.
• Designing services in a dynamic and heterogeneous environment is non-trivial.
• Plenty to be done!!

[Layer diagram labels: OGSA-DAI Basic Services; OGSA-DAI Distributed Query; Database, Communication, OS… Technology; Resource Grid Infrastructure – OGSA…; Data Grid Infrastructure – Location, Delivery, Replication…; annotation: “Clever semantic integration stuff here”]

Page 47

Infrastructure Architecture

OGSA

OGSI: Interface to Grid Infrastructure

Data Intensive Applications for X-ology Research

Compute, Data & Storage Resources

Distributed

Simulation, Analysis & Integration Technology for X-ology

Data Intensive X-ology Researchers

Virtual Integration Architecture

Generic Virtual Data Access and Integration Layer

Structured Data Integration

Structured Data Access

Structured Data Relational XML Semi-structured-

Transformation

Registry

Job Submission

Data Transport Resource Usage

Banking

Brokering Workflow

Authorisation

(Slide Courtesy Malcolm Atkinson, UK National e-Science Centre)

Page 48

OGSA-DAI, OGSA-DAIS, OGSA-DAIT

DB2

Oracle 10g

Part of Globus Toolkit 3

Data can be XML, RDBMS and ODBMS

UK dominance

Page 49

Data Access & Integration Services

1a. Request to Registry for sources of data about “x”
1b. Registry responds with Factory handle
2a. Request to Factory for access to database
2b. Factory creates GridDataService to manage access
2c. Factory returns handle of GDS to client
3a. Client queries GDS with XPath, SQL, etc
3b. GDS interacts with database
3c. Results of query returned to client as XML

[Diagram labels: Client; Registry; Factory; Grid Data Service; XML / Relational database. Legend: SOAP/HTTP; service creation; API interactions]

Slide Courtesy Malcolm Atkinson, UK eScience Center
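
The registry, factory and data service interaction can be mimicked in a few lines. The sketch below is not the OGSA-DAI API: plain in-process Python objects stand in for the SOAP/HTTP services, an in-memory SQLite database stands in for the relational store, and the class names, method names and `protein` table are assumptions chosen for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


class GridDataService:
    """Manages access to one database and returns query results as XML."""

    def __init__(self, connection):
        self._connection = connection

    def query(self, sql, params=()):
        cursor = self._connection.execute(sql, params)       # 3b
        columns = [c[0] for c in cursor.description]
        root = ET.Element("results")
        for row in cursor:
            row_el = ET.SubElement(root, "row")
            for name, value in zip(columns, row):
                ET.SubElement(row_el, name).text = str(value)
        return ET.tostring(root, encoding="unicode")          # 3c


class Factory:
    def __init__(self, connection):
        self._connection = connection

    def create_data_service(self):
        return GridDataService(self._connection)              # 2b, 2c


class Registry:
    def __init__(self):
        self._factories = {}

    def publish(self, topic, factory):
        self._factories[topic] = factory

    def find(self, topic):
        return self._factories[topic]                         # 1a, 1b


# Set-up: a toy database published under the topic "protein".
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE protein (id TEXT, organism TEXT)")
db.execute("INSERT INTO protein VALUES ('MURA_BACSU', 'Bacillus subtilis')")
registry = Registry()
registry.publish("protein", Factory(db))

# Client side, following the numbered steps.
factory = registry.find("protein")                            # 1a, 1b
gds = factory.create_data_service()                           # 2a to 2c
result_xml = gds.query(
    "SELECT id, organism FROM protein WHERE id = ?", ("MURA_BACSU",)
)                                                             # 3a to 3c
print(result_xml)
```

The client never touches the database directly; it only learns where a factory is, asks for a data service, and receives XML back.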

Page 50

Virtual Data Concept

• Capture and manage information about relationships among
    Data (of widely varying representations)
    Programs (& their execution needs)
    Computations (& execution environments)
• Apply this information to, e.g.
    Discovery: Data and program discovery
    Workflow: for organizing, locating, specifying, & requesting data
    Explanation: provenance
    Planning and scheduling

[Diagram: derivation graph whose nodes are the following parameter sets]
mass = 200
mass = 200, decay = WW
mass = 200, decay = ZZ
mass = 200, decay = bb
mass = 200, plot = 1
mass = 200, event = 8
mass = 200, decay = WW, stability = 1
mass = 200, decay = WW, stability = 3
mass = 200, decay = WW, plot = 1
mass = 200, decay = WW, event = 8
mass = 200, decay = WW, stability = 1, event = 8
mass = 200, decay = WW, stability = 1, plot = 1
mass = 200, decay = WW, stability = 1, LowPt = 20, HighPt = 10000

Search for WW decays of the Higgs Boson for which only stable, final state particles are recorded?

Workflow by Rick Cavanaugh and Dimitri Bourilkov, University of Florida
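
One way to picture the virtual data idea is a catalogue that records which parameter sets have already been materialised and which earlier product each was derived from, so that repeated requests are served without recomputation and provenance can be replayed. The sketch below is an assumption-laden toy, not the actual virtual data system: `VirtualDataCatalog`, its methods and the `derive` placeholder are all invented for illustration, with parameter names borrowed from the diagram.

```python
class VirtualDataCatalog:
    """Records which parameter sets have been materialised and how."""

    def __init__(self):
        self.products = {}      # frozenset of (name, value) pairs -> data
        self.provenance = {}    # same key -> key of the product it came from
        self.computations = 0

    @staticmethod
    def key(params):
        return frozenset(params.items())

    def request(self, params, derive):
        """Return the product for params, deriving it only if it is missing."""
        k = self.key(params)
        if k in self.products:
            return self.products[k]
        parent = self._closest_existing(k)
        source = self.products[parent] if parent is not None else None
        self.products[k] = derive(source, params)
        self.provenance[k] = parent
        self.computations += 1
        return self.products[k]

    def _closest_existing(self, k):
        # The largest stored parameter set that is a proper subset of k.
        candidates = [p for p in self.products if p < k]
        return max(candidates, key=len, default=None)

    def lineage(self, params):
        """Walk the provenance links back to the root product."""
        k, chain = self.key(params), []
        while k is not None:
            chain.append(dict(sorted(k)))
            k = self.provenance.get(k)
        return chain


def derive(source, params):
    # Placeholder for running a real analysis program on the source product.
    return f"dataset{sorted(params.items())}"


catalog = VirtualDataCatalog()
catalog.request({"mass": 200}, derive)
catalog.request({"mass": 200, "decay": "WW"}, derive)
catalog.request({"mass": 200, "decay": "WW", "stability": 1}, derive)
catalog.request({"mass": 200, "decay": "WW"}, derive)   # already materialised
history = catalog.lineage({"mass": 200, "decay": "WW", "stability": 1})
print(catalog.computations, history)
```

The fourth request is answered from the catalogue, so only three computations run, and `lineage` gives the "explanation" use of the same information.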

Page 51

Grid intelligence: semantics

• A gap between grid computing endeavours and the vision of Grid computing
• To support the full richness of the grid computing vision we need to explicitly assert & explicitly use semantics (knowledge) throughout the Grid software stack
• The Grid has always had lots of semantics embedded in Schema and Directory services, and used by schedulers and brokers
    Globus MDS2 -> Globus Information Service
    Condor ClassAds
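
To make the point about semantics used by schedulers concrete, here is a sketch in the spirit of advertisement matchmaking: jobs and machines each publish attributes plus a requirements expression, and a match needs both sides to accept each other. It deliberately does not use the real ClassAd language or any Condor API; the dictionaries, lambdas and attribute names are assumptions for illustration.

```python
def matches(job, machine):
    """Symmetric match: each advertisement must accept the other."""
    return job["requirements"](machine) and machine["requirements"](job)


def best_match(job, machines):
    candidates = [m for m in machines if matches(job, m)]
    # The job's rank expression orders the acceptable machines.
    return max(candidates, key=job["rank"], default=None)


machines = [
    {"name": "node1", "os": "LINUX", "memory_mb": 512,
     "requirements": lambda job: job["owner"] != "guest"},
    {"name": "node2", "os": "LINUX", "memory_mb": 2048,
     "requirements": lambda job: True},
    {"name": "node3", "os": "SOLARIS", "memory_mb": 4096,
     "requirements": lambda job: True},
]

job = {
    "owner": "alice",
    "requirements": lambda m: m["os"] == "LINUX" and m["memory_mb"] >= 256,
    "rank": lambda m: m["memory_mb"],
}

chosen = best_match(job, machines)
print(chosen["name"])
```

Here the meaning of `os` or `memory_mb` is agreed only by convention between advertisers, which is exactly the kind of buried semantics the slide refers to.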

Page 52

Semantic Grid http://www.semanticgrid.org

Semantic Web Services -> Semantic Grid Services

GGF SEM-GRD RG bringing semantic web technologies and techniques to the Grid

Ontologies & RDF

Knowledge Services

Semantic Information Services

Base Services: Data/computation Services

e-Scientist environment

Problem Solving Environments

Application Portals

Collaboratories

Page 53

Grids are driven by metadata

The semantics might be buried but they are there nonetheless!

Grid Applications

• Operational know-how of the domain: a query or workflow; the annotation of results, parameters, personal notes, provenance data describing sources and derivation paths of information, etc
• Knowledge about the domain: its data and its processes

Page 54

A Multi-Hierarchical Rock Classification Ontology (GSC)

Composition

Genesis

Fabric

Texture

Slide courtesy of Bertram Ludascher

Page 55

Grids are driven by metadata
the semantics might be buried but they are there nonetheless!

Grid infrastructure
the classification of computational and data resources, performance metrics, job control; schema integration, workflow descriptions, resource brokering, resource scheduling, service state, event notification topics, typing service inputs and outputs, provenance trails; access rights to databases, personal profiles and security groupings; charging infrastructure …

problem solving selection and intelligent portals;

Managing and operating a Grid intelligently requires the interpretation of knowledge about the state and properties of Grid components, and their configurations for solving problems

Knowledge permeates the Grid
• Data elements
• Service descriptions (service data elements)
• Protocols (e.g. policy, provisioning)

Page 56

Semantics in myGrid http://www.mygrid.org.uk

Service discovery

Workflow construction

Workflow discovery

Semantic mark up of results and logs

Page 57

Pegasus planning environment for LIGO Pulsar search

[Diagram labels: Grid; workflow executor (DAGman); Execution; Workflow Planning; Globus Replica Location Service; Globus Monitoring and Discovery Service; Information and Models; Metadata Catalog Service; Resource Models; detector; Raw data; Concrete Workflow; High-level specs of desired results and intermediate data products; Dynamic information; Request Manager; Current State Generator; Submission and Monitoring System; AI-based Planner]

Slide courtesy of Jim Blyth

Page 58

[Figure: Diagram of Broker Architecture. Grid Interoperability Project: Interoperable Resource Broker. Diagram labels: NJS; Broker; Unicore Broker; Globus Broker; IDB; Translator; Filter; Ontology engine; Resource Discovery Service; Other Brokers; Hierarchical Grid Search; Nodal Grid Search. Arrow labels: Delegates resource check; Lookup resources; Delegates translation; Uses to drive MDS search.]

Slide courtesy of John Brooke

Page 59

Semantics for integration and scientific workflows

“Semantic registration” of data sets;

How to employ semantic information in data discovery, workflow discovery, service discovery, data binding, query and workflow planning and execution;

Semantic matchmaking of grid resources to satisfy requirements of application components in workflows, and indeed substituting whole workflows;

Intelligent reasoners for grid computing (semantic matchmakers, planners, resource brokers, etc.) that exploit knowledge of scientific applications as well as grid resources;

Scientific workflow design and execution;

Scientific workflow lifecycle & methodology (authoring, publishing, discovering, personalising, enacting, validating, modifying of workflows)

The list goes on….
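To make "semantic matchmaking of grid resources" concrete, here is a minimal sketch in Python. It is an illustration only, not how myGrid, Pegasus or any broker mentioned in this talk works. The toy class hierarchy, the resource names and the functions `is_a`, `matches` and `matchmake` are all hypothetical. The idea shown: a resource satisfies a requirement if its type is the required type or a subclass of it, and it also meets a simple capacity constraint.

```python
# Hypothetical toy ontology: each class maps to its direct parent (None for the root).
SUBCLASS_OF = {
    "Resource": None,
    "ComputeResource": "Resource",
    "Cluster": "ComputeResource",
    "LinuxCluster": "Cluster",
    "StorageResource": "Resource",
}


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or is a descendant of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)  # walk up to the parent class
    return False


def matches(resource, requirement):
    """A resource matches if its type is subsumed by the required type
    and it offers at least the requested number of CPUs."""
    return (
        is_a(resource["type"], requirement["type"])
        and resource.get("cpus", 0) >= requirement.get("min_cpus", 0)
    )


def matchmake(resources, requirement):
    """Return the names of all resources that satisfy the requirement."""
    return [r["name"] for r in resources if matches(r, requirement)]


# Hypothetical resource descriptions (metadata about Grid components).
RESOURCES = [
    {"name": "man-cluster", "type": "LinuxCluster", "cpus": 64},
    {"name": "small-box", "type": "ComputeResource", "cpus": 2},
    {"name": "tape-store", "type": "StorageResource"},
]

if __name__ == "__main__":
    request = {"type": "ComputeResource", "min_cpus": 16}
    print(matchmake(RESOURCES, request))
```

A request for any "ComputeResource" is met by a "LinuxCluster" even though the request never names that class. A purely syntactic match on type strings would miss it, which is the point of putting the semantics in.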

Page 60

Semantic Grid

[Figure: diagram relating the Semantic Grid to its neighbours. Diagram labels: Web Services; Grid; Semantic Web; Semantic Grid; Grid services; Semantic Web Services; Semantics for the Grid; Grid-ware Semantic Services.]

Page 61

[Figure: An attempt at a context picture. Axes: Data and Semantics complexity; Computational complexity. Regions: Classical Web; Classical Computational Grid; Semantic Web; Semantic Grid. Other labels: Dynamic Web; Info/Data Web; Web Services; Grid Services.]

Page 62

Reality Check!

Official production request of the CMS collaboration of 1,200,000 Monte Carlo simulation data with Grid resources.

“We encountered many problems during the run, and fixed many of them, including integration issues arising from the integration of legacy CMS software tools with Grid tools, bottlenecks arising from operating system limitations, and bugs in both the grid middleware and application software.

Every component of the software contributed to the overall "problem count" in some way. However, we found that with the current level of functionality, we were able to operate the US-CMS Grid with 1.0 FTE effort during quiescent times over and above normal system administration and up to 2.5 FTE during crises.”

“The Grid in Action: Notes from the Front”, G. Graham, R. Cavanaugh, P. Couvares, A. DeSmet, M. Livny, 2003

Page 63

Slide courtesy of Miron Livny

[Figure: chart with axes labelled Benefits and Effort. Labels: Goal; IntraGrids; One of a kind; “You are here”; “… or here”.]

Page 64

Ok, what’s the reality?

The Grid is in the same state as the Web was 10 years ago

Few production grids and not many killer demos - something you couldn’t have done before.

Middleware hard to use and incomplete (and certainly not invisible!)

OGSA in its infancy.

Varying degrees of maturity, but people use it anyway!

Deployment, research, development, applications and standardisation all happening together

Danger of half-baked solutions, premature standardisation, a Grid Winter

Pioneering spirit! It’s the Wild West!!

It’s all very exciting and rather daunting

Page 65

Are you involved in Grid?

There is hardly a paper at OTM that isn’t relevant.

But participation in Grid is largely from the “Grid Community”

When the database people came to town they rocked it!

But there are not so many that take part, and it’s the vendors that dominate though there are many research problems to overcome.

Reinvention, muddle, confusion ensues.

Why aren’t you involved?

Page 66

Why you should be involved in Grid

Established communities can be hard work to get involved in the latest thing

DCOM, CORBA, WS…we have seen it all before! So your history is valuable.

And it’s not just rehashing your history either (crossing out agents and crayoning in grid ain’t gonna work!)

An amazing, open and active community.

With tons of real applications and users who really need this stuff. GridPP had better work!!

Some substantial industry and government backing.

Page 67

Between community travellers

Pioneers on tour!

The Web

The Grid

The Semantic Web

WWW2002 Waikiki, Hawaii

SSDBM2003 ISWC2002 WWW2002 VLDB2003 OTM2003 AIMA2003

Page 68

Grid Middleware On The Move

Open Service Architecture

Data and Information Grids

Semantic Grids

Second Generation Grid Computing

Page 69

The Grid Needs You! Enlist Now!

http://www.ggf.org

The Grid

Now with added services architecture, data management and semantics!!