SJ – Mar 2003 1
The “opencluster” in
“openlab”
A technical overview
Sverre Jarp, IT Division, CERN
SJ – Mar 2003 2
Definitions
The “CERN openlab for DataGrid applications” is a framework for evaluating and integrating cutting-edge technologies or services in partnership with industry, focusing on potential solutions for the LCG.
The openlab invites members of industry to join and contribute systems, resources or services, and to carry out, together with CERN, large-scale, high-performance evaluations of their solutions in an advanced, integrated environment.
The “opencluster” project
The openlab is constructing a pilot ‘compute and storage farm’ called the opencluster, based on HP's dual-processor servers, Intel's Itanium Processor Family (IPF) processors, Enterasys's 10-Gbps switches and, at a later stage, a high-capacity storage system.
SJ – Mar 2003 3
LHC Computing Grid
[Timeline figure, 2002-2009]
LCG: project selection; integration (technology) and deployment (full scale); full production service; worldwide scope
SJ – Mar 2003 4
CERN openlab
[Timeline figure, 2002-2008, shown alongside the LCG]
CERN openlab: framework for collaboration; evaluation, integration and validation of cutting-edge technologies; 3-year lifetime
SJ – Mar 2003 5
Technology onslaught
Large amounts of new technology will become available between now and LHC start-up. A few HW examples:
Processors: SMT (Simultaneous Multi-Threading), CMP (Chip Multiprocessor), ubiquitous 64-bit computing (even in laptops)
Memory: DDR-II 400 (fast), servers with 1 TB (large)
Interconnects: PCI-X, PCI-X 2.0, PCI Express (serial), InfiniBand
Computer architecture: chipsets on steroids, modular computers (see the ISC2003 keynote presentation “Building Efficient HPC Systems from Catalog Components”, Justin Rattner, Intel Corp., Santa Clara, USA)
Disks: Serial ATA
Ethernet: 10 GbE (NICs and switches), 1 Terabit backplanes
Not all, but some of this will definitely be used by the LHC
SJ – Mar 2003 6
Vision: A fully functional GRID cluster node
[Diagram: CPU servers on a multi-gigabit LAN, connected to a storage system, and reached from a remote fabric over a gigabit long-haul WAN link]
SJ – Mar 2003 7
opencluster strategy
Demonstrate promising technologies for LCG and LHC on-line
Deploy the technologies well beyond the opencluster itself:
10 GbE interconnect in the LHC test-bed
Act as a 64-bit porting centre (CMS and ALICE already active; ATLAS is interested)
CASTOR 64-bit reference platform
Storage subsystem as a CERN-wide pilot
Focal point for vendor collaborations: for instance, in the “10 GbE Challenge” everybody must collaborate in order to be successful
Channel for providing information to vendors: thematic workshops
SJ – Mar 2003 8
The opencluster today
Three industrial partners: Enterasys, HP, and Intel
A fourth partner is expected to join with a data storage subsystem, which will “fulfill the vision”
Technology aimed at the LHC era: network switches at 10 gigabits, rack-mounted HP servers, 64-bit Itanium processors
Cluster evolution:
2002: cluster of 32 systems (64 processors)
2003: 64 systems (“Madison” processors)
2004/05: possibly 128 systems (“Montecito” processors)
SJ – Mar 2003 9
Activity overview
Over the last few months:
Cluster installation, middleware
Application porting, compiler installations, benchmarking
Initialization of “Challenges”
Planned first thematic workshop
Future:
Porting of grid middleware
Grid integration and benchmarking
Storage partnership
Cluster upgrades/expansion
New-generation network switches
SJ – Mar 2003 10
opencluster in detail
Integration of the cluster:
Fully automated network installations
32 nodes + development nodes
Red Hat Advanced Workstation 2.1
OpenAFS, LSF
GNU, Intel, and ORC compilers (64-bit)
ORC (Open Research Compiler, formerly an SGI project)
CERN middleware: CASTOR data management
CERN applications:
Porting, benchmarking, performance improvements
CLHEP, GEANT4, ROOT, Sixtrack, CERNLIB, etc.
Database software (MySQL, Oracle?)
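Since the cluster also serves as a 64-bit porting centre, a common first step when moving IA-32 code to Itanium is flushing out assumptions about type sizes and pointer/integer conversions. Below is a minimal, hypothetical C++ check of the kind one might add to a porting test suite; the file name and checks are illustrative, not part of any CERN package:

```cpp
// port_check.cpp - toy sanity checks for an LP64 platform such as Itanium Linux.
// Illustrative only: real porting work also covers alignment, printf formats,
// and the implicit int/pointer conversions flagged by the compilers.
#include <cstdio>
#include <climits>
#include <cstdint>

int main() {
    // On IA-32 (ILP32) long and pointers are 32 bits; on Itanium Linux (LP64)
    // they are 64 bits, so code that stores pointers in int/long breaks.
    std::printf("sizeof(int)    = %zu\n", sizeof(int));
    std::printf("sizeof(long)   = %zu\n", sizeof(long));
    std::printf("sizeof(void*)  = %zu\n", sizeof(void*));
    std::printf("sizeof(size_t) = %zu\n", sizeof(std::size_t));

    // A classic 32-bit assumption: squeezing a pointer through an int.
    int dummy = 0;
    std::intptr_t as_int = reinterpret_cast<std::intptr_t>(&dummy);
    std::printf("pointer fits in int: %s\n",
                (as_int >= INT_MIN && as_int <= INT_MAX) ? "yes" : "no (64-bit)");
    return 0;
}
```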
SJ – Mar 2003 11
The compute nodes: HP rx2600
Rack-mounted (2U) systems
Two Itanium 2 processors:
900 or 1000 MHz, field-upgradeable to the next generation
2 or 4 GB memory (max 12 GB)
3 hot-pluggable SCSI disks (36 or 73 GB)
On-board 100 Mbit and 1 Gbit Ethernet
4 PCI-X slots:
full-size 133 MHz/64-bit slot(s)
Built-in management processor:
Accessible via serial port or Ethernet interface
SJ – Mar 2003 12
Opencluster - phase 1
Perform cluster benchmarks:
Parallel ROOT queries (via PROOF)
Observed excellent scaling from 2 to 64 CPUs (2, 4, 8, 16, 32, 64)
To be reported at CHEP2003
“1 GB/s to tape” challenge:
Network interconnect via 10 GbE switches
opencluster may act as CPU servers
50 StorageTek tape drives in parallel
“10 Gbit/s network challenge”: brings together all openlab partners
Enterasys switch, HP servers, Intel processors and network cards, CERN Linux and network expertise
[Chart: aggregate rate in MB/s (0 to 800) versus number of CPUs (0 to 70)]
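For context, a parallel ROOT query of the kind benchmarked above is typically expressed as a TChain processed through a TSelector; with PROOF enabled, the same query fans out over the worker nodes and the partial results are merged on the master. The sketch below is a minimal, hypothetical ROOT macro using the PROOF interface (TProof::Open, TChain::SetProof); the master URL, tree name, file list and selector are placeholders, not the actual benchmark code:

```cpp
// proof_query.C - toy example of a parallel ROOT query via PROOF.
// Run inside ROOT:  root -l proof_query.C
void proof_query()
{
   // Connect to a PROOF master; workers are started on the cluster nodes.
   TProof::Open("proofmaster.example.ch");

   // Build the dataset: one TChain entry per input file.
   TChain chain("Events");                        // hypothetical tree name
   chain.Add("root://server.example.ch//data/run1.root");
   chain.Add("root://server.example.ch//data/run2.root");

   // Route Process() calls through PROOF instead of running locally.
   chain.SetProof();

   // Each worker processes a slice of the events with the same selector;
   // partial results (histograms, etc.) are merged on the master.
   chain.Process("MySelector.C+");                // hypothetical TSelector
}
```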
SJ – Mar 2003 13
Enterasys extension 1Q2003
[Network diagram: Enterasys E1 OAS switches linking the 32-node Itanium cluster, a 200+ node Pentium cluster and the disk servers, using Gigabit copper, Gigabit fiber and 10 Gig links, plus 96 Fast Ethernet ports]
SJ – Mar 2003 14
Why a 10 GbE Challenge?
Demonstrate LHC-era technology: all the necessary components are available inside the opencluster
Identify bottlenecks, and see if we can improve
We know that Ethernet is here to stay: 4 years from now, 10 Gbit/s should be commonly available
Backbone technology, cluster interconnect, possibly also for iSCSI and RDMA traffic
We want to advance the state of the art!
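To make “identify bottlenecks” concrete: the usual first measurement on a new NIC/driver/kernel combination is a memory-to-memory TCP stream between two hosts, so that disks are out of the picture. Standard tools such as netperf or iperf do this; the stripped-down, hypothetical C++ sketch of the sending side below only shows the idea (host, port, buffer size and duration are arbitrary choices, not the challenge setup):

```cpp
// tcp_stream.cpp - toy memory-to-memory TCP throughput sender (POSIX sockets).
// The peer only needs to accept the connection and read/discard the data.
// Build: g++ -O2 -o tcp_stream tcp_stream.cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>
#include <ctime>
#include <vector>

int main(int argc, char** argv)
{
    const char*  host     = (argc > 1) ? argv[1] : "127.0.0.1"; // receiver host
    const char*  port     = "5001";                             // arbitrary port
    const size_t kBuf     = 1 << 20;                            // 1 MiB per send()
    const double kSeconds = 10.0;                               // test duration

    // Resolve the receiver and open a TCP connection.
    addrinfo hints{}, *res = nullptr;
    hints.ai_family   = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0) {
        std::fprintf(stderr, "cannot resolve %s\n", host);
        return 1;
    }
    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    if (fd < 0 || connect(fd, res->ai_addr, res->ai_addrlen) != 0) {
        std::perror("connect");
        return 1;
    }
    freeaddrinfo(res);

    // Stream a fixed pattern for kSeconds and report the average send rate.
    std::vector<char> buf(kBuf, 'x');
    size_t total = 0;
    timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    double elapsed = 0.0;
    while (elapsed < kSeconds) {
        ssize_t n = send(fd, buf.data(), buf.size(), 0);
        if (n <= 0) { std::perror("send"); break; }
        total += static_cast<size_t>(n);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        elapsed = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }
    close(fd);
    if (elapsed > 0)
        std::printf("%.1f MB/s over %.1f s\n", total / 1e6 / elapsed, elapsed);
    return 0;
}
```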
SJ – Mar 2003 15
Demonstration of openlab partnership
Everybody contributes:
Enterasys: 10 Gbit switches
Hewlett-Packard: servers with their PCI-X slots and memory bus
Intel: 10 Gbit NICs plus driver; processors (i.e. code optimization)
CERN: Linux kernel expertise, network expertise, project management, IA-32 expertise, CPU clusters and disk servers on a multi-gigabit infrastructure
Stop press: we are up and running with back-to-back connections
SJ – Mar 2003 16
Opencluster time line
[Timeline figure, Jan 03 to Jan 06; “opencluster integration” and “EDG and LCG interoperability” run as continuous activities]
Install 32 nodes
Start phase 1 - systems expertise in place
Complete phase 1
Order/install G-2 upgrades and 32 more nodes
Start phase 2
Complete phase 2
Order/install G-3 upgrades; add nodes
Start phase 3
SJ – Mar 2003 17
Opencluster - future
Port and validation of EDG 2.0 software: joint project with CMS
Integrate opencluster alongside the EDG testbed: porting, verification
Relevant software packages (hundreds of RPMs)
Understand the chain of prerequisites (see the ordering sketch after this list)
Exploit the possibility to leave the control node as IA-32
Interoperability with EDG testbeds, and later with LCG-1
Integration into the existing authentication scheme
Grid benchmarks:
To be defined later
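On “understand the chain of prerequisites”: installing hundreds of RPMs on a new platform is essentially a dependency-ordering problem, where each package must be installed after everything it requires. The toy C++ sketch below shows that ordering step as a topological sort over a hand-written requires-map; the package names are invented for illustration and a real exercise would extract the dependencies from the RPM headers:

```cpp
// dep_order.cpp - toy topological ordering of package prerequisites.
#include <cstdio>
#include <map>
#include <string>
#include <vector>

using Requires = std::map<std::string, std::vector<std::string>>;

// Depth-first visit: emit all prerequisites of 'pkg' before 'pkg' itself.
void visit(const std::string& pkg, const Requires& req,
           std::map<std::string, bool>& done, std::vector<std::string>& order)
{
    if (done[pkg]) return;
    done[pkg] = true;                        // marks "visited" (cycles are ignored here)
    auto it = req.find(pkg);
    if (it != req.end())
        for (const auto& dep : it->second)
            visit(dep, req, done, order);
    order.push_back(pkg);
}

int main()
{
    // Invented example: a grid job manager needing security and I/O layers.
    Requires req = {
        {"edg-workload", {"globus-gram", "edg-security"}},
        {"globus-gram",  {"globus-io"}},
        {"edg-security", {"openssl"}},
        {"globus-io",    {"openssl"}},
        {"openssl",      {}},
    };

    std::map<std::string, bool> done;
    std::vector<std::string> order;
    visit("edg-workload", req, done, order);

    std::printf("install order:\n");
    for (const auto& p : order) std::printf("  %s\n", p.c_str());
    return 0;
}
```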
SJ – Mar 2003 18
Recap: opencluster strategy
Demonstrate promising IT technologies (file system technology to come)
Deploy the technologies well beyond the opencluster itself
Focal point for vendor collaborations
Channel for providing information to vendors
SJ – Mar 2003 19
The Workshop
SJ – Mar 2003 20
IT Division: 250 people
About 200 are at engineering level
10 groups:
Advanced Projects’ Group (DI)
(Farm) Architecture and Data Challenges (ADC)
Communications Services (CS)
Fabric Infrastructure and Operations (FIO)
Grid Deployment (GD)
Databases (DB)
Internet (and Windows) Services (IS)
User Services (US)
Product Support (PS)
Controls (CO)
Groups have both a development and a service responsibility
Most of today’s speakers are from ADC and DB
SJ – Mar 2003 21
High Energy Physics Computing Characteristics
Independent events (collisions): trivially parallel processing
Bulk of the data is read-only: new versions rather than updates
Metadata in databases, linking to files
Chaotic workload: research environment, with physics extracted by iterative analysis by collaborating groups of physicists
Unpredictable, unlimited demand
Very large aggregate requirements: computation, data, I/O
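Because events are independent, the standard pattern is to split a job's event range across many workers and merge the outputs afterwards; no communication is needed during processing. The minimal, hypothetical C++ sketch below uses threads as stand-in workers purely to illustrate the pattern (a real farm splits the work into separate batch jobs over file ranges):

```cpp
// event_split.cpp - toy illustration of trivially parallel event processing:
// each worker handles a disjoint range of events and keeps its own partial
// result, and the partial results are merged at the end.
#include <cstdio>
#include <thread>
#include <vector>

// Stand-in for real reconstruction/analysis of event 'i'.
double process_event(long i) { return (i % 100) * 0.01; }

int main()
{
    const long nEvents  = 1000000;
    const int  nWorkers = 8;                       // e.g. one per CPU
    std::vector<double> partialSum(nWorkers, 0.0);

    // Give each worker a contiguous, disjoint slice of the event range.
    std::vector<std::thread> workers;
    for (int w = 0; w < nWorkers; ++w) {
        long first = w * nEvents / nWorkers;
        long last  = (w + 1) * nEvents / nWorkers;
        workers.emplace_back([=, &partialSum] {
            double sum = 0.0;
            for (long i = first; i < last; ++i) sum += process_event(i);
            partialSum[w] = sum;                   // no sharing during the loop
        });
    }
    for (auto& t : workers) t.join();

    // Merge step: combine the per-worker results.
    double total = 0.0;
    for (double s : partialSum) total += s;
    std::printf("processed %ld events, merged result = %.1f\n", nEvents, total);
    return 0;
}
```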
SJ – Mar 2003 22
SHIFT architecture
Three tiers, interconnected via Ethernet:
CPU servers (no permanent data)
Disk servers (cached data)
Tape robots: StorageTek Powderhorn, 6,000 half-inch tape cartridges
SJ – Mar 2003
Data Handling and Computation for Physics Analysis
[Data-flow diagram: the detector feeds the event filter (selection & reconstruction); reconstruction, event reprocessing and event simulation produce raw data, event summary data and processed data; batch physics analysis extracts analysis objects (by physics topic), which feed interactive physics analysis]
(les.robertson@cern.ch, CERN)
SJ – Mar 2003 24
Backup