Grids for the LHC. Paula Eerola, Lund University, Sweden. Four Seas Conference, Istanbul, 5-10 September 2004. Acknowledgement: much of the material is from Ian Bird, Lepton-Photon Symposium 2003, Fermilab.


Page 1: Grids for the LHC

Grids for the LHC

Paula Eerola, Lund University, Sweden

Four Seas Conference, Istanbul, 5-10 September 2004

Acknowledgement: much of the material is from Ian Bird, Lepton-Photon Symposium 2003, Fermilab.

Page 2: Grids for the LHC


Outline

• Introduction
 – What is a Grid?
 – Grids and high-energy physics?
• Grid projects
 – EGEE
 – NorduGrid
• LHC Computing Grid project
 – Using grid technology to access and analyze LHC data
• Outlook

Page 3: Grids for the LHC


Introduction

What is a Grid?

Page 4: Grids for the LHC


About the Grid

• WEB: get information on any computer in the world
• GRID: get CPU-resources, disk-resources, tape-resources on any computer in the world
• Grid needs advanced software, middleware, which connects the computers together
• Grid is the future infrastructure of computing and data management

Page 5: Grids for the LHC


Short history

• 1996: Start of the Globus project for connecting US supercomputers together (funded by US Defence Advanced Research Projects Agency...)
• 1998: early Grid testbeds in the USA - supercomputing centers connected together
• 1998: Ian Foster, Carl Kesselman: GRID: Blueprint for a new Computing Infrastructure
• 2000—: PC capacity increases, prices drop, supercomputers become obsolete; Grid focus moves from supercomputers to PC-clusters
• 1990’s – WEB, 2000’s – GRID? Huge commercial interests: IBM, HP, Intel, …

Page 6: Grids for the LHC


Grid prerequisites

• Powerful PCs are cheap
• PC-clusters are everywhere
• Networks are improving even faster than CPUs
• Network & Storage & Computing exponentials:
 – CPU performance (# transistors) doubles every 18 months
 – Data storage (bits per area) doubles every 12 months
 – Network capacity (bits per sec) doubles every 9 months
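A quick way to see what these doubling times mean in practice is to compound them over a fixed horizon. A minimal sketch (the 5-year horizon is an illustrative choice, not a figure from the slides):

```python
# Compound the doubling times quoted above over a 5-year horizon.
doubling_months = {"CPU": 18, "storage": 12, "network": 9}

years = 5
for name, months in doubling_months.items():
    factor = 2 ** (12 * years / months)   # number of doublings in `years`
    print(f"{name}: grows ~{factor:.0f}x in {years} years")
```

At these rates network capacity grows roughly tenfold more than CPU performance over five years, which is the point of "Networks are improving even faster than CPUs".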

Page 7: Grids for the LHC


Grids and high-energy physics?

• The Large Hadron Collider, LHC, start 2007
• 4 experiments, ATLAS, CMS, ALICE, LHCb, with physicists from all over the world
• LHC computing = data processing, data storage, production of simulated data
• LHC computing is of unprecedented scale
• Massive data flow: the 4 experiments are accumulating 5-8 PetaBytes of data/year

Page 8: Grids for the LHC


Needed capacity: Storage – 10 PetaBytes of disk and tape; Processing – 100,000 of today’s fastest PCs. World-wide data analysis: physicists are located in all the continents.

Computing must be distributed for many reasons:
• Not feasible to put all the capacity in one place
• Political, economic, staffing: easier to get funding for resources at home country
• Faster access to data for all physicists around the world
• Better sharing of computing resources required by physicists
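To put the data-flow figures in perspective, a back-of-envelope sketch (assuming decimal petabytes and a 365-day year, both my assumptions) converts 5-8 PB accumulated per year into a sustained rate:

```python
# Sustained data rate implied by accumulating 5-8 PB per year.
PB = 1e15                          # decimal petabyte, in bytes (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600

for pb_per_year in (5, 8):
    rate_mb_s = pb_per_year * PB / SECONDS_PER_YEAR / 1e6
    print(f"{pb_per_year} PB/year ~ {rate_mb_s:.0f} MB/s sustained")
```

That is a few hundred MB/s around the clock, before any reprocessing or replication.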

Page 9: Grids for the LHC


LHC Computing Hierarchy

Tier 0 = CERN. Tier 0 receives raw data from the Experiments and records them on permanent mass storage. First-pass reconstruction of the data, producing summary data.

Tier 1 Centres = large computer centers (about 10), e.g. the FNAL, IN2P3, INFN and RAL centers. Tier 1’s provide permanent storage and management of raw, summary and other data needed during the analysis process.

Tier 2 Centres = smaller computer centers (several 10’s). Tier 2 Centres provide disk storage and concentrate on simulation and end-user analysis.

[Diagram: Experiment → Tier 0 (CERN Center, PBs of disk, tape robot) at ~100-1500 MBytes/s → Tier 1 centres → Tier 2 centres → institutes, with workstations and physics data caches]
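The tiered division of labour can be sketched as a small lookup. This is a toy model: the task names and the `tier_for` helper are illustrative, not part of any LCG interface.

```python
# Toy model of the Tier 0/1/2 division of labour described above.
TIER_ROLES = {
    0: "CERN: record raw data, first-pass reconstruction",
    1: "~10 large centers: permanent storage of raw/summary data",
    2: "several tens of smaller centers: simulation, end-user analysis",
}

def tier_for(task: str) -> int:
    """Illustrative routing of a task type to the tier that handles it."""
    if task in ("raw-recording", "first-pass-reconstruction"):
        return 0
    if task in ("raw-storage", "summary-management"):
        return 1
    return 2  # simulation and end-user analysis

for task in ("first-pass-reconstruction", "summary-management", "simulation"):
    print(f"{task} -> Tier {tier_for(task)}: {TIER_ROLES[tier_for(task)]}")
```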

Page 10: Grids for the LHC


Grid technology as a solution

• Grid technology can provide optimized access to and use of the computing and storage resources
• Several HEP experiments currently running (BaBar, CDF/D0, STAR/PHENIX), with significant data and computing requirements, have already started to deploy grid-based solutions
• Grid technology is not yet an off-the-shelf product: it requires development of middleware, protocols, services, …
• Grid development and engineering projects: EDG, EGEE, NorduGrid, Grid3, ….

Page 11: Grids for the LHC

Grid projects

Page 12: Grids for the LHC


US, Asia, Australia

USA: NASA Information Power Grid, DOE Science Grid, NSF National Virtual Observatory, NSF GriPhyN, DOE Particle Physics Data Grid, NSF TeraGrid, DOE ASCI Grid, DOE Earth Systems Grid, DARPA CoABS Grid, NEESGrid, DOH BIRN, NSF iVDGL, …

Asia, Australia:
• Australia: ECOGRID, GRIDBUS, …
• Japan: BIOGRID, NAREGI, …
• South Korea: National Grid Basic Plan, Grid Forum Korea, …
• …

Page 13: Grids for the LHC


Europe

EGEE, NorduGrid, EDG, LCG, UK GridPP, INFN Grid (Italy), and cross-grid projects in order to link together Grid projects.

Many Grid projects have particle physics as the initiator. Other fields are joining in: healthcare, bioinformatics, … They address different aspects of grids: middleware, infrastructure, networking, cross-Atlantic interoperation.

Page 14: Grids for the LHC

PARTNERS

70 partners organized in nine regional federations

Coordinating and Lead Partner: CERN

CENTRAL EUROPE – FRANCE - GERMANY & SWITZERLAND – ITALY - IRELAND & UK - NORTHERN EUROPE - SOUTH-EAST EUROPE - SOUTH-WEST EUROPE – RUSSIA - USA

STRATEGY

• Leverage current and planned national and regional Grid programmes
• Build on existing investments in Grid Technology by EU and US
• Exploit the international dimensions of the HEP-LCG programme
• Make the most of planned collaboration with NSF CyberInfrastructure initiative

A seamless international Grid infrastructure to provide researchers in academia and industry with a distributed computing facility.

ACTIVITY AREAS

SERVICES: Deliver “production level” grid services (manageable, robust, resilient to failure). Ensure security and scalability.

MIDDLEWARE: Professional Grid middleware re-engineering activity in support of the production services.

NETWORKING: Proactively market Grid services to new research communities in academia and industry. Provide necessary education.

Page 15: Grids for the LHC


EGEE: goals and partners

• Create a European-wide Grid Infrastructure for the support of research in all scientific areas, on top of the EU Research Network infrastructure (GÉANT)
• Integrate regional grid efforts

9 regional federations covering 70 partners in 26 countries. http://public.eu-egee.org/

Page 16: Grids for the LHC


EGEE project

Project funded by EU FP6, 32 MEuro for 2 years. Project start 1 April 2004. Activities:
• Grid Infrastructure: Provide a Grid service for science research
• Next generation of Grid middleware, gLite
• Dissemination, Training and Applications (initially HEP & Bio)

Page 17: Grids for the LHC


EGEE: timeline

Page 18: Grids for the LHC


Grid in Scandinavia: the NorduGrid Project

Nordic Testbed for Wide Area Computing and Data Handling

www.nordugrid.org

Page 19: Grids for the LHC


NorduGrid: original objectives and current status

Goals 2001 (project start):
• Introduce the Grid to Scandinavia
• Create a Grid infrastructure in Nordic countries
• Apply available Grid technologies/middleware
• Operate a functional Testbed
• Expose the infrastructure to end-users of different scientific communities

Status 2004:
• The project has grown world-wide: nodes in Germany, Slovenia, Australia, ...
• 39 nodes, 3500 CPUs
• Created own NorduGrid Middleware, ARC (Advanced Resource Connector), which is operating in a stable way
• Applications: massive production of ATLAS simulation and reconstruction
• Other applications: AMANDA simulation, genomics, bioinformatics, visualization (for meteorological data), multimedia applications, ...

Page 20: Grids for the LHC


Current NorduGrid status

Page 21: Grids for the LHC

The LHC Computing Grid, LCG

The distributed computing environment to analyse the LHC data

lcg.web.cern.ch

Page 22: Grids for the LHC


LCG - goals

Goal: prepare and deploy the computing environment that will be used to analyse the LHC data.

Phase 1: 2003 – 2005
• Build a service prototype
• Gain experience in running a production grid service

Phase 2: 2006 – 2008
• Build and commission the initial LHC computing environment

[Timeline figure, 2003-2006: event simulation productions; LCG service opens; LCG full multi-tier prototype batch+interactive service; Technical Design Report for Phase 2; LCG with upgraded m/w, management etc.]

Page 23: Grids for the LHC


LCG composition and tasks

The LCG Project is a collaboration of
 – The LHC experiments
 – The Regional Computing Centres
 – Physics institutes

Development and operation of a distributed computing service
 – computing and storage resources in computing centres, physics institutes and universities around the world
 – reliable, coherent environment for the experiments

Support for applications
 – provision of common tools, frameworks, environment, data persistency

Page 24: Grids for the LHC


Resource targets ’04

             CPU (kSI2K)   Disk (TB)   Support (FTE)   Tape (TB)
CERN              700         160          10.0           1000
Czech Rep.         60           5           2.5              5
France            420          81          10.2            540
Germany           207          40           9.0             62
Holland           124           3           4.0             12
Italy             507          60          16.0            100
Japan             220          45           5.0            100
Poland             86           9           5.0             28
Russia            120          30          10.0             40
Taiwan            220          30           4.0            120
Spain             150          30           4.0            100
Sweden            179          40           2.0             40
Switzerland        26           5           2.0             40
UK               1656         226          17.3            295
USA               801         176          15.5           1741
Total            5600        1169         120.0           4223

Page 25: Grids for the LHC

LCG status Sept ’04

Tier 0: CERN

Tier 1 Centres: Brookhaven, CNAF Bologna, PIC Barcelona, Fermilab, FZK Karlsruhe, IN2P3 Lyon, Rutherford (UK), Univ. of Tokyo, CERN

Tier 2 centers: South-East Europe (HellasGrid, AUTH, Tel-Aviv, Weizmann), Budapest, Prague, Krakow, Warsaw, Moscow region, Italy, ……

Page 26: Grids for the LHC


LCG status Sept ’04

• First production service for LHC experiments operational
 – Over 70 centers, over 6000 CPUs, although many of these sites are small and cannot run big simulations
 – LCG-2 middleware – testing, certification, packaging, configuration, distribution and site validation
• Grid operations centers in RAL and Taipei (+US) – performance monitoring, problem solving – 24x7 globally
• Grid call centers in FZK Karlsruhe and Taipei
• Progress towards inter-operation between LCG, NorduGrid, Grid3 (US)

Page 27: Grids for the LHC

Outlook

EU vision of e-infrastructure in Europe

Page 28: Grids for the LHC


Moving towards an e-infrastructure

[Figure: GÉANT networking (IPv6), Grids, and Grids middleware]

Page 29: Grids for the LHC


Moving towards an e-infrastructure

[Figure: Grid-empowered e-infrastructure – “all in one”: networks plus Grids middleware forming a single e-Infrastructure]

Page 30: Grids for the LHC


Summary

• Huge investment in e-science and Grids in Europe
 – regional, national, cross-national, EU
• Emerging vision of European-wide e-science infrastructure for research
• High Energy Physics is a major application that needs this infrastructure today and is pushing the limits of the technology