Introduction to Grids and GridPP. Steve Lloyd, Queen Mary, University of London. London Tier-2 Workshop, 16 Apr 2007

Slide 1

Introduction to Grids and GridPP

Steve Lloyd

Queen Mary, University of London

London Tier-2 Workshop, April 2007

Slide 2

Web: Information Sharing

• Invented at CERN by Tim Berners-Lee

• Agreed protocols: HTTP, HTML, URLs

• Anyone can access information and post their own

• Quickly crossed over into public use

[Figure: No. of Internet hosts (millions) by Year]

Slide 3

Distributed Computing

Distributed Resource Sharing: @Home Projects

• Uses home PCs to run numerous calculations with dozens of variables.

• Distributed computing project, not a Grid

• Some @home projects:

– BBC Climate Change Experiment

– SETI@home

– FightAIDS@home

Distributed File Sharing: Peer-to-Peer Networks

[Figure: Peer-to-peer network]

• No centralised database of files

• Legal problems with sharing copyrighted material

• Security problems

Slide 4

SETI@home

• A distributed computing project - not really a Grid project

• You pull the data from them rather than they submit the job to you

Arecibo telescope in Puerto Rico

Users - 5,240,038

Results received – 1,632,106,991

Years of CPU Time – 2,121,057

Extraterrestrials found – 0
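The "pull" model described on this slide can be sketched in a few lines. This is an illustration only, not SETI@home's actual software: the class `WorkServer`, its methods and `volunteer_loop` are hypothetical names invented for the example. The point is that the volunteer PC asks for work, whereas on a Grid a job is submitted to the resource.

```python
from collections import deque


class WorkServer:
    """Hypothetical project server that holds work units until clients ask for them."""

    def __init__(self, work_units):
        self.pending = deque(work_units)
        self.results = {}

    def fetch_work(self):
        """Called by a volunteer PC: hand out the next unit, or None if none remain."""
        return self.pending.popleft() if self.pending else None

    def return_result(self, unit, result):
        """Called by a volunteer PC when it has finished a unit."""
        self.results[unit] = result


def volunteer_loop(server, analyse):
    """A home PC repeatedly pulls a unit, processes it and sends the result back."""
    done = 0
    while (unit := server.fetch_work()) is not None:
        server.return_result(unit, analyse(unit))
        done += 1
    return done
```

The server never contacts a volunteer; it only answers requests, which is why home PCs behind firewalls can take part.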

Slide 5

The Grid

Ian Foster / Carl Kesselman:

"A computational Grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities."

'Grid' means different things to different people

All agree it's a funding opportunity!

1999 – The “Grid”

Slide 6

Electricity Grid

Analogy with the Electricity Power Grid

'Standard Interface'

Power Stations

Distribution Infrastructure

Slide 7

Computing Grid

Computing and Data Centres

Fibre Optics of the Internet

Slide 8

UK e-Science

"e-Science will change the dynamic of the way Science is undertaken"

"Science increasingly done through distributed global collaborations enabled by the internet using very large data collections, terascale computing resources and high performance visualisation"

Dr John Taylor - Director General of Research Councils:

"e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it"

2001 – Establishment of UK e-Science Programme

Slide 9

UK e-Science

Major Activities:

• GridPP and AstroGrid (PPARC)

• Core e-Science Programme (EPSRC)

• Other projects (other RCs)

The first phase of the Core e-Science Programme:

• A National e-Science Centre linked to a network of Regional Grid Centres

• Generic Grid Middleware and Demonstrator Projects

• Grid 'IRC' Research Projects

• Support for e-Science Pilot Projects

• Participation in International Grid Projects and Activities

• Establishment of a Grid Network Team

The second phase of the Core e-Science Programme:

• A National e-Science Centre linked to a network of Regional Grid Centres

• Support activities for the UK e-Science Community - National Grid Service (NGS)

• An Open Middleware Infrastructure Institute (OMII)

• A Digital Curation Centre (DCC)

• New Exemplars for e-Science

• Participation in International Grid Projects and Activities

Slide 10

e-Science Centres

Core Sites

• White Rose (Compute)

• Oxford (Compute)

• RAL (Data)

• Manchester (Data)

Partner Sites

• Belfast

• Bristol

• Cardiff

• Lancaster

• Westminster

Affiliates

• Edinburgh (NeSC)

National HPC Facilities

• HPCx

Slide 11

Gartner Hype Cycle

Slide 12

Who are GridPP?

19 UK Universities, CERN, RAL & Daresbury

Funded by PPARC/STFC:

GridPP1 2001-2004 “From Web to Grid”

GridPP2 2004-2008 “From Prototype to Production”

GridPP3 2008-2011 “From Production to Exploitation”

Developed a working, highly functional Grid

Slide 13

The CERN Large Hadron Collider - LHC

4 Large Experiments

The world’s most powerful particle accelerator – Starting 2007/8

• ~100,000,000 electronic channels

• 800,000,000 proton-proton interactions per second

• 0.0002 Higgs per second

• 10 PBytes of data a year (10 Million GBytes = 14 Million CDs)

Why?
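The figures on this slide can be cross-checked with a little arithmetic. The sketch below is not from the talk. It assumes decimal units (1 PByte = 1,000,000 GBytes, as the slide does) and a 700 MB CD; the CD capacity is an assumption, chosen because it is consistent with the slide's figure of 14 million CDs.

```python
# Back-of-envelope checks of the rates and volumes quoted on the slide.
INTERACTIONS_PER_SECOND = 800_000_000  # proton-proton interactions (from the slide)
HIGGS_PER_SECOND = 0.0002              # from the slide
DATA_PER_YEAR_PB = 10                  # from the slide
GB_PER_PB = 1_000_000                  # decimal units, as used on the slide
CD_CAPACITY_GB = 0.7                   # ASSUMPTION: a 700 MB CD


def interactions_per_higgs() -> float:
    """How many interactions occur, on average, for each Higgs produced."""
    return INTERACTIONS_PER_SECOND / HIGGS_PER_SECOND


def seconds_between_higgs() -> float:
    """Average waiting time between Higgs bosons at the quoted rate."""
    return 1 / HIGGS_PER_SECOND


def cds_per_year() -> float:
    """Number of CDs needed to hold one year of data."""
    return DATA_PER_YEAR_PB * GB_PER_PB / CD_CAPACITY_GB


if __name__ == "__main__":
    print(f"Interactions per Higgs: {interactions_per_higgs():.1e}")
    print(f"Seconds between Higgs:  {seconds_between_higgs():.0f}")
    print(f"CDs per year (millions): {cds_per_year() / 1e6:.1f}")
```

At these rates a Higgs appears roughly once in every few trillion interactions, which is one way to read the slide's "Why?": finding it means sifting a very large volume of data.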

Slide 14

International Context

LHC Computing Grid (LCG): Grid Deployment Project for the Large Hadron Collider (LHC)

EU Enabling Grids for E-SciencE (EGEE) 2004-2008: Grid Deployment Project for all disciplines

GridPP is part of EGEE and LCG (currently the largest Grid in the world)

UK National Grid Service: UK’s core production computational and data Grid

Open Science Grid (USA): Science applications from HEP to biochemistry

Nordugrid (Scandinavia): Grid Research and Development collaboration

[Diagram: GridPP, LCG and EGEE overlap; GridPP is the UK part of LCG and the PP part of EGEE]

Slide 15

What is (gLite) Middleware?

Middleware is the Operating System of a distributed computing system

[Diagram, Single PC: PROGRAMS (Word/Excel, Email/Web, Your Program, Games) run on the OPERATING SYSTEM, which runs on the hardware (Disks, CPU etc)]

[Diagram, Grid: Your Program runs on MIDDLEWARE (User Interface Machine, Resource Broker, Information Service, Replica Catalogue, Bookkeeping Service), which runs on CPU Clusters and a Disk Server]

Slide 16

Software Stacks

Hardware Resources

Experiment/User Application Software

Grid Middleware

Application Middleware

Integration

GridPP

NGS

Small amount of common middleware

Large amount of accessible hardware

Many diverse user communities

A number of different software layers

Slide 17

GridPP Middleware Development

Workload Management

Storage Interfaces

Network Monitoring

Security

Information Services

Grid Data Management

GridPP Middleware

Slide 18

The EGEE Grid Status

Worldwide

237 Sites

50 Countries

35,716 CPUs

21.3 PB Disk

10,579 Years of CPU time

UK

21 Sites

8,089 CPUs

876 TB Disk

3,361 Years of CPU time

Slide 19

Tier Structure

Tier 0: CERN computer centre (Online system, Offline farm)

Tier 1: National centres (RAL, UK; France; Italy; Germany; USA)

Tier 2: Regional groups (ScotGrid, NorthGrid, SouthGrid, London)

Institutes: Brunel, Imperial, QMUL, RHUL, UCL

Workstations

Useful model for Particle Physics but not necessary for others

Slide 20

UK Tier-2 Centres

ScotGrid: Durham, Edinburgh, Glasgow

NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield

SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD

London: Brunel, Imperial, QMUL, RHUL, UCL

Mostly funded by HEFCE
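The Tier-2 groupings on this slide map naturally onto a small lookup table. The sketch below is illustrative only: the site names are copied from the slide, while the dictionary layout and the function name `tier2_for` are assumptions made for the example.

```python
from typing import Optional

# Tier-2 regional groups and their member institutes, as listed on the slide.
UK_TIER2 = {
    "ScotGrid": ["Durham", "Edinburgh", "Glasgow"],
    "NorthGrid": ["Daresbury", "Lancaster", "Liverpool", "Manchester", "Sheffield"],
    "SouthGrid": ["Birmingham", "Bristol", "Cambridge", "Oxford", "RAL PPD"],
    "London": ["Brunel", "Imperial", "QMUL", "RHUL", "UCL"],
}


def tier2_for(institute: str) -> Optional[str]:
    """Return the Tier-2 group an institute belongs to, or None if it is not listed."""
    for tier2, institutes in UK_TIER2.items():
        if institute in institutes:
            return tier2
    return None


if __name__ == "__main__":
    for name in ("QMUL", "Glasgow", "CERN"):
        print(name, "->", tier2_for(name))
```

Here `tier2_for("QMUL")` gives `"London"`, and a site that is not a UK Tier-2 member, such as CERN (Tier 0), gives `None`.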

Slide 21

What you need to use the Grid

1. Get a digital certificate (UK Certificate Authority). Authentication – who you are

2. Join a Virtual Organisation (VO). Authorisation – what you are allowed to do

3. Get access to a local User Interface Machine (UI) and copy your files and certificate there

4. Write some Job Description Language (JDL) and scripts to wrap your programs

############# HelloWorld.jdl #################
Executable = "/bin/echo";
Arguments = "Hello welcome to the Grid ";
StdOutput = "hello.out";
StdError = "hello.err";
OutputSandbox = {"hello.out","hello.err"};
#########################################

Using The Grid
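When many similar jobs are needed, the JDL text is often generated by a script rather than typed by hand. The sketch below is one possible way to do that and is not part of the talk or of any Grid toolkit: `jdl_value` and `make_jdl` are hypothetical helpers, and they cover only the quoted strings and brace-delimited lists that appear in HelloWorld.jdl.

```python
def jdl_value(value) -> str:
    """Render a Python value in the style used by the HelloWorld.jdl example.

    Lists become {"a","b"}; everything else becomes a double-quoted string.
    ASSUMPTION: values contain no double quotes, so no escaping is attempted.
    """
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(jdl_value(v) for v in value) + "}"
    return '"' + str(value) + '"'


def make_jdl(attributes: dict) -> str:
    """Build JDL text with one 'Name = value;' line per attribute."""
    return "\n".join(
        f"{name} = {jdl_value(value)};" for name, value in attributes.items()
    )


# The attributes of the slide's HelloWorld job.
hello = {
    "Executable": "/bin/echo",
    "Arguments": "Hello welcome to the Grid ",
    "StdOutput": "hello.out",
    "StdError": "hello.err",
    "OutputSandbox": ["hello.out", "hello.err"],
}

if __name__ == "__main__":
    print(make_jdl(hello))
```

Running the script prints the five attribute lines of the example job. Writing the text to a file on the UI machine and submitting it is left to the site's own tools.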

Slide 22

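How it works

[Diagram: on the UI Machine, the Job Description, the Input Sandbox (script you want to run; other files such as Job Options, Source...) and a Proxy Certificate make up the Job, which goes to the Resource Broker and on to a Compute Element on the Grid. Input Data is read from a Storage Element and Output Data is written to a Storage Element. The Output Sandbox (output files such as Plots, Logs...) comes back to the UI Machine.]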

Slide 23

What is it good for?

Problems that are highly parallelizable

[Diagram: Problem, Grid, Solution]

Input data is independent, e.g. images. These pieces may be independent.

Simulation using different parameters: (A=2, B=3), (A=3, B=3), (A=2, B=4)

Not so good for closely coupled problems. These pieces will have to interact.
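The parameter-sweep case on this slide can be sketched as follows. The example runs on a single machine with a thread pool standing in for the Grid, and `simulate` is a made-up placeholder formula, not a real physics simulation. The point is only that each parameter set is processed without reference to any other, so the work can be split into separate jobs.

```python
from concurrent.futures import ThreadPoolExecutor

# The (A, B) parameter sets shown on the slide.
PARAMETER_SETS = [(2, 3), (3, 3), (2, 4)]


def simulate(params):
    """Placeholder for a real simulation; the formula is purely illustrative."""
    a, b = params
    return a * a + b


def run_sweep(parameter_sets, workers=3):
    """Run every parameter set independently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, parameter_sets))


if __name__ == "__main__":
    for params, result in zip(PARAMETER_SETS, run_sweep(PARAMETER_SETS)):
        print(params, "->", result)
```

A closely coupled problem would not fit this pattern, because each call to `simulate` would need intermediate values from the others while running.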

Slide 24

Other Uses for a Grid

• Astronomy

• Healthcare

• Bioinformatics

• Gaming

• Engineering

• Commerce

Slide 25

"GridPP has been developed to help answer questions about the conditions in the Universe just after the Big Bang," said Professor Keith Mason, head of the Particle Physics and Astronomy Research Council (PPARC). "But the same resources and techniques can be exploited by other sciences with a more direct benefit to society."

Avian Flu Studies

Slide 26

Cambridge Ontology

Cambridge Ontology – Startup looking at content based image retrieval. Search picture content without using meta-data or image annotations.

[Diagram labels: Ontological Query Language, Query, Retrieval Requirements, Retrieval Results, Evaluation, Relevance Assessment, Semantic Gap]

Slide 27

Total

Marine Experiment

Results of marine experiments

Modelled results based on bore-hole data and wave equations

Use the Grid with modelled data to validate results from marine experiments. Other potential areas to port: seismic processing, interpretation of subsurface structures, reservoir / field modelling.

Not real data

Total Exploration & Production

Slide 28

Further Information …

http://www.gridpp.ac.uk

RSS News feed