LISHEP, Rio de Janeiro, 20 February 2004
Russia in LHC DCs and EDG/LCG/EGEE
V.A. Ilyin, Moscow State University
LHC Computing GRID: the "cloud" view (MONARC project; regional groups)
[Diagram: CERN Tier0/Tier1 linked to national Tier1 centres (Germany, UK, France, Italy, USA); Tier2 centres (Uni a…n, Lab b…m) feeding Tier3 physics-department clusters and desktops; Russia shown as a Tier2 cloud. The opportunity of Grid technology.]
Russian Tier2 Cluster: Russian regional center for LHC computing (RRC-LHC)
Cluster of institutional computing centers with Tier2 functionality and summary resources at 50-70% level of the canonical Tier1 center for each experiment (ALICE, ATLAS, CMS, LHCb):
analysis; simulations; users data support.
Participating institutes:
Moscow: ITEP, RRC KI, MSU, LPI, MEPhI, …
Moscow region: JINR, IHEP, INR RAS
St. Petersburg: PNPI RAS, …
Novosibirsk: BINP SB RAS
Coherent use of distributed resources by means of LCG (EDG+VDT, …) technologies.
Active participation in the LCG Phase1 Prototyping and Data Challenges (at 5% level).
              2002   2003   2004   Q4 2007
CPU (kSI95)      5     10   25-35      410
Disk (TB)        7     12     30       850
Tape (TB)        -      6     30      1250
International connectivity to CERN (Mbps):
                20    155  155/…   Gbps/…
Russia country map. Three regions where the HEP centers are located are indicated on the map: Moscow, St. Petersburg and Novosibirsk.
Site (Centre) | HEP Acc./Coll. Facilities | Other Exp's | Participation in major HEP Int. Collaborations
BINP SB RAS (Novosibirsk)
http://www.inp.nsk.su
VEPP-2M (e+e- collider at 1.4 GeV), VEPP-4 (e+e- collider up to 6 GeV)
Non-Acc. HEP Exp's (Neutrino Phys., etc.), Synchrotron Rad. Facility
CERN: ATLAS, LHC-acc, CLIC; FNAL: Tevatron-acc; DESY: TESLA; KEK: BELLE; SLAC: BaBar
IHEP (Protvino, Moscow Region)
http://www.ihep.su
U-70 (fixed target, 70 GeV proton beam)
Medical Exp's
BNL: PHENIX, STAR; CERN: ALICE, ATLAS, CMS, LHCb; DESY: ZEUS, HERA-B, TESLA; FNAL: D0, E-781 (Selex)
ITEP (Moscow)
http://www.itep.ru
U-10 (fixed target, 10 GeV proton beam)
Non-Acc. HEP Exp's (Neutrino Phys., etc.)
CERN: ALICE, ATLAS, CMS, LHCb, AMS; DESY: H1, HERMES, HERA-B, TESLA; FNAL: D0, CDF, E-781 (Selex); KEK: BELLE; DAFNE: KLOE
JINR (Dubna, Moscow Region)
http://www.jinr.ru
Nuclotron (heavy-ion collisions at 6 GeV/n)
Low Energy Acc., Nuclear Reactor, Synchrotron Rad. Facility, Non-Acc. HEP Exp's: Neutrino Phys., Medical Exp's, Heavy-ion Physics
BNL: PHENIX, STAR; CERN: ALICE, ATLAS, CMS, NA48, COMPASS, CLIC, DIRAC; DESY: H1, HERA-B, HERMES, TESLA; FNAL: D0, CDF; KEK: E391a
INR RAS (Troitsk, Moscow Region, Research Centre)
http://www.inr.ac.ru
Low Energy Acc., Non-Acc. HEP Exp's (Neutrino Phys.)
CERN: ALICE, CMS, LHCb; KEK: E-246; TRIUMF: E-497
RRC KI (Moscow, Research Centre)
http://www.kiae.ru
Low Energy Acc., Nuclear Reactors, Synchrotron Rad. Facility
BNL: PHENIX; CERN: ALICE, AMS
MEPhI (Moscow, University)
http://www.mephi.ru
Low Energy Acc., Nuclear Reactor
BNL: STAR; CERN: ATLAS; DESY: ZEUS, HERA-B, TESLA
PNPI RAS (Gatchina, St. Petersburg Region, Research Centre)
http://www.pnpi.spb.ru
Mid/Low Energy Acc., Nuclear Reactor
BNL: PHENIX; CERN: ALICE, ATLAS, CMS, LHCb; DESY: HERMES; FNAL: D0, E-781 (Selex)
SINP MSU (Moscow, University)
http://www.sinp.msu.ru
Low Energy Acc., Non-Acc. HEP Exp. (EAS-1000)
CERN: ATLAS, CMS, AMS, CLIC; DESY: ZEUS, TESLA; FNAL: D0, E-781 (Selex)
Goals of the Russian (distributed) Tier2:
• to provide full-scale participation of Russian physicists in the analysis; only in this case will the Russian investments in the LHC lead to the final goal of obtaining new fundamental knowledge on the structure of matter;
• to open wide possibilities for participation of students and young scientists in research at the LHC;
• to support and improve the high level of scientific schools in Russia;
• participation in the creation of the international LHC Computing GRID will give Russia access to new advanced computing techniques.
Functions of Russian (distributed) Tier2
• physics analysis of AOD (Analysis Object Data);
• access to (external) ESD/RAW and SIM databases for preparing the necessary (local) AOD sets;
• replication of AOD sets from the Tier1/Tier2 grid (cloud);
• event simulation at the level of 5-10% of the whole SIM database for each experiment;
• replication and storage of 5-10% of the ESD, required for testing the procedures of AOD creation;
• storage of data produced by users.
Participation in distributed storage of the full ESD data (a Tier1 function)…?
Architecture of Russian (distributed) Tier2
RRC-LHC will be a cluster of institutional centers with Tier2 functionality: a distributed system, a DataGrid cloud of Tier2(/Tier3) centers.
Coherent interaction of the computing centers of the participating Institutes:
each Institute knows its resources but can get significantly more if the others agree.
For each Collaboration the summary resources (of about 4-5 basic institutional centers) will reach the level of 50-70% of a canonical Tier1 center:
each Collaboration knows its summary resources but can get significantly more if the other Collaborations agree.
RRC-LHC will be connected to the Tier1 at CERN and/or to other Tier1(s) in the context of a global grid for data storage and access:
each Institute and each Collaboration can get significantly more if other regional centers agree.
Russian Regional Center: the DataGrid cloud
[Diagram: the RRC-LHC Tier2 cluster (SINP MSU, ITEP, RRC KI, JINR, IHEP, PNPI and other collaborative centers) with GRID access to the LCG Tier1/Tier2 cloud (CERN, FZK, …) over Gbit/s links. The opportunity of Grid technology.]
Regional connectivity: cloud backbone – Gbit/s; to labs – 100-1000 Mbit/s.
"Users"-"Tasks" and resources (analysis from 2001; needs to be updated to the Tier2 conception)
The number of active users is the main parameter for estimating the resources needed. We made some estimates, in particular based on extrapolation of Tevatron analysis tasks performed by our physicists (single top production at D0, …). Thus, in "averaged" figures:
a "user task" is the analysis of 10^7 events per day (8 hours) by one physicist.
Active users: ALICE 40, ATLAS 60, CMS 60, LHCb 40.
In the following we estimate the RRC resources (Phase 1) under the assumption that our participation in SIM database production is at the 5% level for each experiment.
There is still very poor understanding of this key (for a Tier2) characteristic!
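As a rough illustration of how the "user task" figure could translate into CPU, the sketch below multiplies out the numbers quoted above. The per-event analysis cost in SI95-seconds is not given in the source; the value used here is a purely illustrative assumption.

```python
# Back-of-envelope sizing from the "user task" definition above.
# Given in the source: 10^7 events per user per 8-hour day, and the
# numbers of active users per experiment. Assumed (NOT from the source):
# the per-event analysis cost in SI95-seconds.

WORKING_DAY_S = 8 * 3600           # one "day" = 8 working hours
EVENTS_PER_USER_TASK = 10**7       # events analysed per user per day

active_users = {"ALICE": 40, "ATLAS": 60, "CMS": 60, "LHCb": 40}

# Event rate implied for a single user (about 347 events/s):
rate = EVENTS_PER_USER_TASK / WORKING_DAY_S

def cpu_ksi95(users, si95_sec_per_event):
    """CPU power (kSI95) needed for `users` physicists to each analyse
    10^7 events in one working day at the assumed per-event cost."""
    total_si95_sec = users * EVENTS_PER_USER_TASK * si95_sec_per_event
    return total_si95_sec / WORKING_DAY_S / 1000.0

for exp, users in active_users.items():
    # 0.25 SI95*s per event is an illustrative guess only
    print(f"{exp}: {cpu_ksi95(users, 0.25):.1f} kSI95 for analysis")
```

The analysis piece is of course only part of the totals in the resource table below, which also cover simulation and data handling.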
Resources required by 2008:
              ALICE  ATLAS  RDMS CMS  LHCb  Total
CPU (kSI95)     100    120      120     70    410
Disk (TB)       200    250      250    150    850
Tape (TB)       300    400      400    150   1250
We suppose that each active user will create local AOD sets ~10 times per year and keep these sets on disk during the year, and that the general AOD sets will be replicated from the Tier1 cloud ~10 times per year, with previous sets stored on tape.
The disk space will be partitioned as: 15% general AOD+TAG sets; 15% local AOD+TAG sets; 15% user data; 15% current sets of sim. data (SIM-AOD, partially SIM-ESD); 30-35% the 10% portion of the ESD; 5-10% cache.
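Applying that partitioning to the 850 TB disk total from the resource table gives the budget below. Mid-points of the ranged shares (30-35% and 5-10%) are used so the fractions sum to one; this is a sketch, not an official allocation.

```python
# Disk budget implied by the partitioning above, applied to the 850 TB
# total from the resource table. Mid-points are used for the ranged
# items (30-35% -> 32.5%, 5-10% -> 7.5%) so the shares sum to 100%.

TOTAL_DISK_TB = 850

shares = {
    "general AOD+TAG sets":   0.15,
    "local AOD+TAG sets":     0.15,
    "user data":              0.15,
    "current sim. data":      0.15,
    "10% portion of ESD":     0.325,   # mid-point of 30-35%
    "cache":                  0.075,   # mid-point of 5-10%
}

# Sanity check: the chosen mid-points make the shares sum to exactly 1.
assert abs(sum(shares.values()) - 1.0) < 1e-9

for purpose, frac in shares.items():
    print(f"{purpose:22s} {frac * TOTAL_DISK_TB:6.1f} TB")
```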
Construction timeline
Timeline for the RRC-LHC resources in the construction phase:
2006: 15%   2007: 30%   2008: 55%
After 2008, investments will be needed to support the computing and storage facilities and to increase the CPU power and storage space: annually, about 30% of the 2008 expenses.
Every following year: renewal of 1/3 of the CPUs, a 50% increase of the disk space, and a 100% increase of the tape storage space.
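The renewal rule can be sketched numerically. Starting values are the 2008 targets from the resource table; the projection tracks capacity only, and replaced CPUs are assumed to be swapped at equal nominal power, since the source gives no technology factor.

```python
# Post-2008 maintenance model sketched from the rules above: each year
# renew 1/3 of the CPUs, grow disk by 50% and tape by 100%.
# Starting point: the 2008 targets (410 kSI95, 850 TB disk, 1250 TB tape).

cpu_ksi95, disk_tb, tape_tb = 410.0, 850.0, 1250.0

def one_year(cpu, disk, tape):
    """Apply one year of the renewal rule; returns the new capacities
    and the amount of CPU renewed (equal nominal capacity assumed)."""
    return cpu, disk * 1.5, tape * 2.0, cpu / 3

for year in (2009, 2010, 2011):
    cpu_ksi95, disk_tb, tape_tb, renewed = one_year(cpu_ksi95, disk_tb, tape_tb)
    print(f"{year}: disk {disk_tb:.0f} TB, tape {tape_tb:.0f} TB, "
          f"CPU renewed {renewed:.0f} kSI95")
```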
Financial aspects
Phase 1 (2001-2005): 2.5 MCHF equipment, 3.5 MCHF network, plus initial investments in some regional networks.
Construction phase (2006-2008): 10 MCHF equipment, 3 MCHF network.
In total (2001-2008): 19 MCHF.
2009 – 200x: 2 MCHF/year.
December 2003: a new Protocol was signed by Russia and CERN on the framework for Russian participation in the LHC project in the period from 2007, including: 1) M&O, 2) computing in the experiments, 3) RRC-LHC and LCG.
LHCb DC03 Resource Usage
(c.f. DC02: 3.3M events, 49 days)
CERN 44%, Bologna 30%, Lyon 18%, RAL 3.9%, Cambridge 1.1%, Moscow 0.8%, Amsterdam 0.7%, Rio 0.7%, Oxford 0.7%
Russian sites: ITEP Moscow, IHEP Protvino, JINR Dubna, SINP MSU
CMS Productions (2001)
[Table: per-site status of Simulation, Digitization (no PU / PU), GDMP and the common production tools (IMPALA). Sites: CERN (fully operational), FNAL, Moscow (first!), INFN, Caltech, UCSD, UFL, Bristol, Wisconsin, IN2P3 (not operational), Helsinki (not operational).]
Manpower for CMS Computing in Russia (Sept. 2003), in FTE:
Institute           farm admin.  install. & prod. running  prod. tools  PRS SW code ORCA  Physics generators  Total
SINP MSU & RCC          1.0            1.5                      -             3.5               4.0           10.0
JINR                    0.4            0.4                      -             2.3               2.8            5.9
ITEP                    0.8            0.2                     1.0            2.0                -             4.0
IHEP                    0.3            0.6                     0.2            0.4               1.3            2.8
Kharkov (Ukraine)       1.0            0.6                      -              -                 -             1.6
LPI                     0.6            0.4                      -              -                 -             1.0
Total                   4.1            3.7                     1.2            8.2               8.1
In total: 25.3 FTE
IMPALA/BOSS integration with GRID
[Diagram: CMKIN/IMPALA job submission from the UI through the Resource Broker and Job Executer to the Gatekeeper/Batch Manager of a CE; BOSS and Dolly run the jobs on worker nodes WN1…WNn sharing NFS; job bookkeeping in a MySQL DB; environment taken from the CERN RefDB.]
SINP MSU (Moscow) – JINR (Dubna) – INFN (Padova), 2002
Russia in LCG
• We started our activity in LCG in autumn 2002.
• Russia joined the LCG-1 infrastructure (CERN press release, 29.09.2003): first SINP MSU, soon RRC KI, JINR, ITEP and IHEP (already in LCG-2). http://goc.grid-support.ac.uk/gridsite/gocmain/monitoring/
• Manpower contribution to LCG (started in May 2003): the Agreement is being signed by CERN, Russia and JINR officials, with 3 tasks under our responsibility: 1) testing new GRID middleware to be used in LCG; 2) evaluation of new-on-the-market GRID middleware (first task: evaluation of OGSA/GT3); 3) common solutions for event generators (event databases).
• Twice per year (spring and autumn): meetings of the Russia-CERN Joint Working Group on Computing. Next meeting on 19 March at CERN.
Resource Publication for Q2 2004
[Bar chart: CPU (kSI2000), disk space (TB), tape space (TB) and FTE published per site, scale 0-1000: CERN, Switzerland, Taipei, Japan, UK, Russia, France, Italy, Poland, Germany, Czech Republic, United States, Netherlands.]
Information System testing for LCG-1
Elena SlabospitskayaInstitute for High Energy Physics, Protvino, Russia
18.07.2003
[Diagram: the schema of job submission via the RB and directly to the CE via Globus GRAM. edg-job-submit goes from the UI to the RB (Network Server, Workload Manager, CondorG), which submits through Condor-G and Globus/EDG to the Gatekeeper of the CE; the local batch system (PBS, LSF, …) dispatches the jobs to the WNs; globusrun allows direct submission to the CE.]
An OGSA/GT3 testbed (named 'Beryllium') was designed and deployed on PCs located at CERN and SINP MSU, modelling a GT3-based Grid system. http://lcg.web.cern.ch/LCG/PEB/GTA/LCG_GTA_OGSA.htm
Software was created for the common library of MC generators, GENSER, http://lcgapp.cern.ch/project/simu/generator/ A new project, MCDB (Monte Carlo Data Base), is proposed for LCG AA under Russian responsibility, as a common solution for storing samples of events at the partonic level and providing access to them across the LCG sites.
The simplified schema of Beryllium testbed (CERN-SINP)
● The Resource Broker plays a central role:
– accepts requests from the user;
– selects suitable Computing Elements using the Information Service data;
– reserves the selected Computing Element;
– communicates to the user a "ticket" that allows job submission;
– maintains a list of all running jobs and receives confirmation messages about the ongoing processing from the CEs;
– at job end, updates the table of running jobs / CE status.
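The broker behaviour listed above can be sketched as a toy model. This is illustrative Python, not EDG code; the class and field names are invented, and two hostnames from the deployment example elsewhere in these slides are borrowed purely for the demo.

```python
# Toy model of the Resource Broker flow described above: take a job
# request, pick a suitable CE from Information Service data, reserve it,
# hand the user a ticket, and track the job until the CE confirms
# completion. All names here are illustrative, not EDG APIs.

import itertools

class ResourceBroker:
    def __init__(self, info_service):
        # info_service: {ce_name: {"free_cpus": int, "vo": set of VOs}}
        self.info = info_service
        self.running = {}                    # ticket -> (ce_name, cpus)
        self._tickets = itertools.count(1)

    def submit(self, vo, cpus_needed):
        """Select and reserve a matching CE; return a ticket or None."""
        for ce, state in self.info.items():
            if vo in state["vo"] and state["free_cpus"] >= cpus_needed:
                state["free_cpus"] -= cpus_needed      # reserve the CE
                ticket = next(self._tickets)
                self.running[ticket] = (ce, cpus_needed)
                return ticket
        return None                                    # no suitable CE

    def job_done(self, ticket):
        """CE confirmation: release the reservation, update job table."""
        ce, cpus = self.running.pop(ticket)
        self.info[ce]["free_cpus"] += cpus

info = {"lhc01.sinp.msu.ru": {"free_cpus": 4, "vo": {"cms"}},
        "grid011.pd.infn.it": {"free_cpus": 2, "vo": {"cms", "atlas"}}}
rb = ResourceBroker(info)
t = rb.submit("atlas", 2)       # matched and reserved on the second CE
rb.job_done(t)                  # reservation released again
```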
Externally Funded LCG Personnel at CERN
Source of Funding   FTE 2002→2005                FTE-years  Head-years  Cost (kCHF)  Diff. FTE (C-RRB Apr-03)
EU DataGrid         7.2 / 6.7 / 1.0                 14.8       15.3        1775          0.7
LCG Assoc.1         2.2 / 3.0 / 2.3 / 3.0           10.5       11.4        1320         -0.1
France              2.8 / 6.1 / 6.3 / 4.8           20.0       40.0        2230          4.9
Germany2            0.3 / 2.3 / 4.0 / 3.1            9.6        9.6        1150         -6.4
Hungary             0.4 / 1.6 / 0.7 / 0.7            3.3        4.5         295         -0.3
Israel              2.0 / 2.2 / 2.1 / 2.0            8.3        7.8         595          0.0
Italy-INFN2         1.4 / 6.8 / 11.2 / 10.0         29.3       31.2        3495         -1.6
Portugal            2.1 / 2.1 / 1.1                  5.3        6.1         530          1.7
Russia+JINR         2.2 / 3.0 / 3.0                  8.2        8.3         985          1.9
Spain               1.6 / 2.5 / 0.8                  4.9        7.4         490         -3.5
Sweden              0.6 / 1.1 / 1.2 / 1.2            4.1        3.4         263         -0.1
Switzerland         0.5 / 1.8 / 1.6 / 1.2            5.1        6.0         470         -0.9
Taipei              0.3 / 2.6 / 1.3                  4.2        4.2         500          2.2
UK-EPSRC            0.8 / 1.0 / 0.2                  2.0        2.0         226          0.0
UK-PPARC            10.8 / 24.1 / 21.8 / 12.3       69.0       68.5        9650         -3.1
UK-Expts            1.7 / 1.7 / 1.7 / 1.7            6.8        8.0         720          6.8
USA                 6.0 / 6.1 / 6.1 / 6.1           24.4       59.9        2930          9.9
SUM                 38.5 / 73.6 / 67.3 / 50.2      229.6      293.4       27624         12.0

Additional personnel in FTE:
India3              0.6 / 8.0 / 10.0 / 10.0         28.6
Russia+JINR4        3.0 / 6.0 / 6.0                 15.0
EU DataGrid
Russian institutes participated in the EU DataGrid project: WP6 (Testbed and Demonstration), WP8 (HEP Applications).
2001:
• Grid information service (GRIS-GIIS);
• DataGrid Certificate Authority (CA) and Registration Authority (RA);
• WP6 Testbed0 (spring-summer 2001): 2 sites;
• WP6 Testbed1 (autumn 2001): 4 active sites (SINP MSU, ITEP, JINR, IHEP), significant resources (160 CPUs, 7.5 TB disk).
2002:
• new active Testbed1 site: PNPI;
• Testbed1 Virtual Organizations (VO): WP6, ITeam;
• WP8: CMS VO, ATLAS and ALICE VOs;
• WP8 CMS MC run (spring): ~1 TByte of data transferred to CERN and FNAL;
• Resource Broker (RB): SINP MSU + CERN + INFN experiment;
• Metadispatcher (MD): collaboration with the Keldysh Inst. Appl. Math. (Moscow) on algorithms for dispatching (scheduling) jobs in the DataGrid environment.
EDG software deployment at SINP MSU (example: CMS VO, 7 June 2002)
[Diagram: SINP MSU site with CE + WN (lhc01.sinp.msu.ru, lhc02.sinp.msu.ru), SE (lhc03.sinp.msu.ru), RB + Information Index (lhc20.sinp.msu.ru) and User Interface node (lhc04.sinp.msu.ru), linked to CERN (lxshare0220.cern.ch) and Padova (grid011.pd.infn.it).]
EGEE
Enabling Grids for e-Science in Europe – EGEE
• An EU project approved to provide partial funding for the operation of a general e-Science grid in Europe, including the supply of suitable middleware. EGEE is funded by the European Union under contract IST-2003-508833. Budget: about 32 MEuro for 2004-2005.
• EGEE provides funding for 70 partners, the large majority of which have strong HEP ties.
• Russia: 8 institutes (SINP MSU, JINR, ITEP, IHEP, RRC KI, PNPI, KIAM RAS, IMPB RAS); budget: 1 MEuro for 2004-2005.
• The Russian matching of the EC budget is in good shape (!)
EGEE Partner Federations
• Integrate regional Grid efforts
EGEE Timeline
Distribution of Service Activities over Europe:
• Operations Management at CERN;
• Core Infrastructure Centres in the UK, France, Italy, Russia (PM12) and at CERN, responsible for managing the overall Grid infrastructure;
• Regional Operations Centres, responsible for coordinating regional resources, regional deployment and support of services.
Russia: CIC – SINP MSU, RRC KI; ROC – IHEP, PNPI, IMPB RAS; Dissemination & Outreach – JINR.
ICFA SCIC, Feb 2004: S.E. Europe, Russia: catching up; Latin America, Middle East, China: keeping up; India, Africa: falling behind.
LHC Data Challenges: a typical example: transferring 100 GByte of data from Moscow to CERN within one working day requires ~50 Mbps of bandwidth!
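The arithmetic behind that figure, as we read it: the raw sustained rate for 100 GByte in 8 hours comes out near 28 Mbps, and the quoted ~50 Mbps is consistent with assuming roughly 55% effective link utilisation (protocol overhead, sharing). The utilisation factor is our assumption; the source states only the final number.

```python
# Bandwidth check for the Data Challenge example above: 100 GByte moved
# from Moscow to CERN within one 8-hour working day.

DATA_BYTES = 100e9                 # 100 GByte
WORKING_DAY_S = 8 * 3600           # one working day = 8 hours

# Raw sustained rate, ignoring all overheads:
raw_mbps = DATA_BYTES * 8 / WORKING_DAY_S / 1e6

# Provisioned link, assuming ~55% effective utilisation (our guess):
provisioned_mbps = raw_mbps / 0.55

print(f"sustained: {raw_mbps:.1f} Mbps, provisioned: {provisioned_mbps:.0f} Mbps")
```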
GLORIAD 10 Gbps
REGIONAL CONNECTIVITY for RUSSIA HEP
Moscow 1 Gbps
IHEP: 8 Mbps (microwave); 100 Mbps fiber-optic link under construction (Q1-Q2 2004?)
JINR: 45 Mbps; 100-155 Mbps (Q1-Q2 2004); Gbps (2004-2005)
INR RAS: 2 Mbps + 2x4 Mbps (microwave)
BINP: 1 Mbps; 45 Mbps (2004?); … GLORIAD
PNPI: 512 Kbps (commodity Internet); 34 Mbps fiber-optic available, but (!) the budget covers only 2 Mbps
INTERNATIONAL CONNECTIVITY for RUSSIA HEP
USA: NaukaNET, 155 Mbps
GEANT: 155 Mbps basic link, plus a 155 Mbps additional link for GRID projects
Japan: through the USA by FastNET, 512 Kbps, Novosibirsk (BINP) – KEK (Belle)