The Worldwide LHC Computing Grid
Nils Høimyr, IT Department
LHC accelerator and detectors
Ultra high vacuum, colder than outer space
LHC ring: 27 km circumference
CMS
ALICE
LHCb
ATLAS
The ATLAS experiment
7000 tons, 150 million sensors generating data 40 million times per second
A collision at LHC
Collisions produce ~1 PB/s
Pick the interesting events
• 40 million per second
  – Fast, simple information
  – Hardware trigger in a few microseconds
• 100 thousand per second
  – Fast algorithms in a local computer farm
  – Software trigger in <1 second
• A few hundred per second
  – Recorded for study
(Event display: muon tracks and energy deposits)
~1 Petabyte per second?
• Cannot afford to store it: a year's worth of LHC data at 1 PB/s would cost a few hundred trillion dollars/euros
• Have to filter in real time to keep only "interesting" data
• We keep about 1 event in a million – yes, 99.9999% is thrown away
• What survives still amounts to more than 6 gigabytes per second written to storage
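To make the numbers above concrete, here is a minimal back-of-the-envelope sketch in Python; the ~1 kHz software-trigger output rate, ~1.5 MB event size and four-experiment sum are assumptions chosen for illustration, not official figures.

```python
# Back-of-the-envelope sketch of the trigger chain above.
# All numbers are illustrative assumptions, not official experiment figures.
sw_trigger_out_hz = 1e3       # events written out by the software trigger (assumed ~1 kHz)
event_size_bytes = 1.5e6      # assumed ~1.5 MB per stored event
n_experiments = 4             # ALICE, ATLAS, CMS, LHCb

per_experiment = sw_trigger_out_hz * event_size_bytes
total = n_experiments * per_experiment

print(f"stored per experiment: ~{per_experiment / 1e9:.1f} GB/s")
print(f"all experiments combined: ~{total / 1e9:.1f} GB/s")  # of order the >6 GB/s quoted above
```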
CERN Data Centre
• Built in the 70s on the CERN site (Meyrin-Geneva), 3.5 MW for equipment
• Extension located at Wigner (Budapest), 2.7 MW for equipment
• Connected to the Geneva data centre with 3x100 Gb/s links (24 ms RTT)
• Hardware generally based on commodity components
• 15,000 servers, providing 190,000 processor cores
• 80,000 disk drives, providing 250 PB of disk space
• 104 tape drives, providing 140 PB of tape storage
Worldwide computing – 2017:
- 63 MoUs
- 167 sites in 42 countries
When we started LHC computing (~2001)
- There was no cloud computing – Google was just a search engine, and Amazon's cloud services did not yet exist
- We had to invent all of the tools from scratch: at CERN we had no tools to manage a data centre at the scale we thought was needed (no commercial or open-source tools existed)
- Initial tools were developed through the EU DataGrid project – Open Source from the beginning
- Grid ideas from computer science did not work in the real world at any reasonable scale; we (the EU, US and LHC grid projects) had to make them work at scale
- We had to invent trust networks to convince funding agencies to open their resources to federated users
- Our users were not convinced that any of this was needed ;-)
Evolution of Grids
(Timeline 1994–2008: EU DataGrid; GriPhyN, iVDGL, PPDG; LCG 1 and LCG 2; GRID 3 and OSG; EGEE 1, 2 and 3; leading to WLCG – punctuated by data challenges, service challenges, cosmics runs and first physics. Slide: Ian Bird, CERN, June 2009)
Public national and international grid infrastructures
WLCG for LHC
“Grid”: federated, distributed computing infrastructure
Data - 2018
2018: 88 PB total (incl. parked B-physics data)
- ATLAS: 24.7 PB
- CMS: 43.6 PB
- LHCb: 7.3 PB
- ALICE: 12.4 PB
Data transfers: 14 PB in August (the chart also marks the heavy-ion run)
(Chart: CPU delivered, in billion HS06-hours per month, from January 2010 to March 2019, broken down by ALICE, ATLAS, CMS and LHCb)
Equivalent to ~860k cores in continuous use
HPC use is challenging: HEP is engaging with DOE & NSF in the USA and (together with SKA) with PRACE and EuroHPC in Europe, and participating in the BDEC2 workshops.
Heterogeneous computing
- The majority of today's HEP processing is performed on dedicated clusters of commodity processors ("x86")
- Recently: opportunistic use of many types of compute, in particular HPC systems and HLT farms
- In future, this heterogeneity will expand; we must be able to make use of all types: non-x86 (especially GPUs), HPC, clouds, HLT farms (incl. FPGAs?)
Heterogeneous compute requires:
- Common provisioning mechanisms, transparent to users
- Facilities able to control access (cost), appropriate use, etc.
- HPC, clouds and HLT farms will not have an (affordable) local storage service in the way we assume today; we must be able to deliver data to them while they are in active use
Deployed in a hybrid cloud mode:
• Procurers' data centres
• Commercial cloud service providers
• GEANT network and EduGAIN Federated Identity Management
LHC Schedule
Run 3: ALICE, LHCb upgrades
Run 4: ATLAS, CMS upgrades
The HL-LHC computing challenge
HL-LHC needs for ATLAS and CMS are well above what the expected hardware technology evolution (15% to 20% per year) and flat funding can deliver.
The main challenge is storage, but computing requirements also grow by 20-50x.
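A rough sketch of why this is a problem, assuming a ~7-year horizon and the 15-20% per year technology gain quoted above (both numbers are assumptions for illustration):

```python
# Illustrative sketch of the HL-LHC resource gap under flat funding.
# The growth rates and the 7-year horizon are assumptions for this example.
years = 7
for annual_tech_gain in (0.15, 0.20):
    capacity_factor = (1 + annual_tech_gain) ** years  # what a flat budget buys over time
    print(f"{annual_tech_gain:.0%}/year over {years} years -> ~{capacity_factor:.1f}x capacity")

# Compare with the 20-50x growth in requirements quoted above:
# the remaining factor has to come from better software and new approaches.
```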
ATLAS and CMS had to cope with monster pile-up
With L = 1.5 × 10^34 cm⁻² s⁻¹ and the 8b4e bunch structure → pile-up of ~60 events per crossing (note: ATLAS and CMS were designed for ~20 events per crossing)
CMS: event from 2017 with 78 reconstructed vertices
ATLAS: simulation for HL-LHC with 200 vertices
Yearly data volumes in context
- Google searches: 98 PB
- Facebook uploads: 180 PB
- Google Internet archive: ~15 EB
- LHC (2016): 50 PB raw data
- LHC science data: ~200 PB
- SKA Phase 1 (2023): ~300 PB/year of science data
- SKA Phase 2 (mid-2020s): ~1 EB of science data
- HL-LHC (2026): ~600 PB raw data, ~1 EB physics data
Many software challenges
● Improved algorithms, Machine Learning (ML)
  – ML in the form of neural networks has been used for more than 20 years in HEP
  – A lot of development in the IT industry in this area; scope for re-use and improvements (see the sketch after this list)
● Vectorisation, GPUs, FPGAs, other processor architectures
● Data analysis model and software changes
● Visualisation
● Storage and preservation
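As a minimal illustration of the ML point above, the toy sketch below trains a small neural-network classifier to separate "signal" from "background" events. The two features and the synthetic data are invented for the example and stand in for real reconstructed quantities.

```python
# Minimal, illustrative sketch of ML-based event classification (signal vs background).
# The features and data are synthetic; real HEP analyses use far richer inputs and tooling.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 10_000
# Assumed toy features, e.g. missing transverse energy and leading-jet pT (arbitrary units)
background = rng.normal(loc=[1.0, 1.0], scale=0.5, size=(n, 2))
signal = rng.normal(loc=[2.0, 2.5], scale=0.5, size=(n, 2))

X = np.vstack([background, signal])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 0 = background, 1 = signal
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print(f"test accuracy on toy data: {clf.score(X_test, y_test):.2f}")
```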
Grid vs Cloud
• “Cloud computing” everywhere
  – Web-based solutions (HTTP/HTTPS and REST)
  – Virtualisation, containers...
• The Grid has mainly a scientific user base
  – Complex applications running across multiple sites, but it works like a cluster batch system for the end user (see the sketch after this list)
  – Mainly suitable for parallel computing and massive data processing
• Technologies converging
  – “Internal Cloud” at CERN – OpenStack
  – Xbatch – extending to external cloud providers
  – CernVM – virtual machine running e.g. at Amazon
  – Google Cloud Kubernetes Higgs Challenge
  – “Volunteer Cloud” – LHC@home 2.0
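To illustrate the "looks like a cluster batch system" point above: CERN's batch service is based on HTCondor, and a minimal job submission might look like the sketch below. The executable name, file names and job count are placeholders, and Grid-wide submission in practice goes through experiment workload-management frameworks layered on top of such batch systems.

```python
# Hedged sketch: from the user's point of view, running on the Grid feels like batch.
# Writes a minimal HTCondor submit description and hands it to condor_submit.
# The executable, file names, resource request and job count are made-up placeholders.
import subprocess
from pathlib import Path

submit_description = """\
executable   = run_analysis.sh
arguments    = $(ProcId)
output       = out.$(ProcId).txt
error        = err.$(ProcId).txt
log          = jobs.log
request_cpus = 1
queue 20
"""

Path("analysis.sub").write_text(submit_description)
subprocess.run(["condor_submit", "analysis.sub"], check=True)
```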
Volunteer grid - LHC@home
• LHC volunteer computing
  – Allows us to get additional computing resources for e.g. accelerator physics and theory simulations
• Based on BOINC – “Berkeley Open Infrastructure for Network Computing”
  – Software platform for distributed computing using volunteered computer resources
  – Uses a volunteer PC's unused CPU cycles to analyse scientific data
  – Virtualisation support – CernVM
  – Other well-known projects: SETI@Home, Climateprediction.net, Einstein@Home
You can help us!
• As a volunteer, you can help us by donating CPU when your computer is idle
• Connect with us on: – http://cern.ch/lhcathome
The Balance between Academic Freedom, Operations & Computer Security
http://cern.ch/security
Open Data
http://opendata.cern.ch
http://zenodo.org
http://cds.cern.ch
Open Data – Open Knowledge
CERN & the LHC experiments have made the first steps towards Open Data (http://opendata.cern.ch/)
- Key drivers: Educational Outreach & Reproducibility
- Increasingly required by Funding Agencies
- Paving the way for Open Knowledge as envisioned by DPHEP (http://dphep.org)
- ICFA Study Group on Data Preservation and Long Term Analysis in High Energy Physics
CERN has released Zenodo, a platform for Open Data as a Service (http://zenodo.org)1
• Building on experience of Digital Libraries & Extreme scale data management
• Targeted at the long tail of science
• Citable through DOIs, including the associated software
• Generated significant interest from open data publishers such as Wiley, Ubiquity, F1000, eLife, PLOS
¹ Initially co-funded by the EC FP7 OpenAire series of projects
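To show what "Open Data as a Service, citable through DOIs" looks like in practice, here is a hedged sketch against Zenodo's public REST API; the access token, file name and metadata are placeholders, and the exact endpoints should be checked against https://developers.zenodo.org.

```python
# Hedged sketch: depositing a dataset on Zenodo so it becomes citable via a DOI.
# Based on Zenodo's public REST API; token, metadata and file name are placeholders.
import requests

TOKEN = "YOUR-ZENODO-ACCESS-TOKEN"  # placeholder
base = "https://zenodo.org/api"

# 1. Create an empty deposition
r = requests.post(f"{base}/deposit/depositions",
                  params={"access_token": TOKEN}, json={})
r.raise_for_status()
dep = r.json()

# 2. Attach a data file (placeholder file name)
with open("analysis_results.csv", "rb") as fh:
    requests.post(f"{base}/deposit/depositions/{dep['id']}/files",
                  params={"access_token": TOKEN},
                  data={"name": "analysis_results.csv"},
                  files={"file": fh}).raise_for_status()

# 3. Add minimal metadata and publish; Zenodo then mints the DOI
meta = {"metadata": {"title": "Example open dataset", "upload_type": "dataset",
                     "description": "Illustrative deposit",
                     "creators": [{"name": "Doe, Jane"}]}}
requests.put(f"{base}/deposit/depositions/{dep['id']}",
             params={"access_token": TOKEN}, json=meta).raise_for_status()
requests.post(f"{base}/deposit/depositions/{dep['id']}/actions/publish",
              params={"access_token": TOKEN}).raise_for_status()
```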
Training
CERN School of Computing
• A science – industry partnership to drive R&D and innovation with over a decade of success
• Evaluate state-of-the-art technologies in a challenging environment and improve them
• Test in a research environment today what will be used in many business sectors tomorrow
• Train next generation of engineers/employees
• Disseminate results and outreach to new audiences
IT at CERN – more than the Grid
• Physics computing – Grids (this talk!)
• Administrative information systems
  – Financial and administrative management systems, e-business...
• Desktop and office computing
  – Windows, Linux and Web infrastructure for day-to-day use
• Engineering applications and databases
  – CAD/CAM/CAE (Autocad, Catia, Cadence, Ansys etc.)
  – A number of technical information systems based on Oracle, MySQL
• Controls systems
  – Process control of accelerators, experiments and infrastructure
• Networks and telecom
  – European IP hub, security, telephony software...
More information: http://cern.ch/it
Thank You!
Accelerating Science and Innovation
CERN/IT – Nils Høimyr