
Report about HEPIX, Roma, April 3-7

SFT Group Meeting, May 5, 2006

René Brun, CERN

http://hepix.caspur.it/spring2006/


HEPIX Spring 2006 in Rome

• The meeting was held at the Italian National Research Council (CNR), in a very comfortable auditorium, although the networking was less stable than it should have been all week. Initially there were hardware problems, but by mid-week the instability was caused by locally broadcasting nodes in the room, which could not be traced.

• Alongside the traditional HEPiX sessions there were a number of special meetings, such as the LCG GDB, the OPN working group and others, so the total registration count was over 120, although not everyone was present all week.

• Unlike previous meetings, this one was mostly separated into topics with a convener appointed for each topic. Also, as in the past two HEPiX meetings in Europe, this meeting attracted a noticeable number of representatives of LCG Tier 2 sites, from across Europe especially.


Highlights

• Computer room cooling and air conditioning systems were mentioned in a majority of site reports. Several sites are having to build or equip new computer rooms to get round capacity restrictions in existing facilities.

• As usual at recent HEPiX meetings, a number of benchmark results were presented, with very detailed overheads well worth a look if you are interested in performance or costs.

• New format for HEPiX, with half-day sessions on dedicated topics; some of these, such as networking, performance optimization and databases, were new to HEPiX, with corresponding invited speakers.

• Collaboration on and re-use of HEP-developed tools was not particularly emphasized. On the other hand, there were, as is often the case, a few examples of wheels being re-invented for no obvious reason.

• Also, some tools CERN/IT might want to look at: Imperia for web page content management (PSI site report); Subversion, mentioned several times by the DES group as a possible replacement for CVS for code management, seems to have arrived on at least a couple of HEP sites.

• Virtualisation, virtualisation, virtualisation

• What to do about bird flu, by Bob Cowles (security talk)


Site Reports

• TRIUMF, CASPUR, RAL, CERN, DESY, FZK, CNAF, JLAB, LAL, NIKHEF, PSI, RZG, SLAC, BNL

• Nearly all sites are installing thousands of Opteron machines.


Plenary talks

• LCG status (Les)
• CPU technologies (Bernd Panzer)
• Power consumption issues (Yannick Perret, IN2P3)
• Dual-core batch nodes (Manfred Alef, FZK)
• Benchmarking AMD64 and EM64T (Ian Fisk)
• Networking technologies


INTEL and AMD roadmaps

INTEL has now moved to 65 nm fabrication

a new micro-architecture based on mobile processor development: the Merom design (Israel)

Woodcrest (Q3) claims +80% performance compared with the 2.8 GHz part, with a 35% decrease in power; some focus on SSE improvements (included in the 80%)

AMD will move to 65nm fabrication only next year

focus on virtualization and security integration

need to catch up in the mobile processor area

currently AMD processors are about 25% more power efficient

INTEL and AMD offer a wide variety of processor types; it is hard to keep track of the new code names


Multi core developments

dual-core dual-CPU systems are available right now

quad-core dual-CPU systems expected at the beginning of 2007

8-core CPU systems are under development, but not expected to come to market before 2009

Need to cope with the change in programming paradigm: multi-threading, parallel programming (see http://www.multicore-association.org/)

Heterogeneous and dedicated multi-core systems:
• Cell processor system (PowerPC + 8 DSP cores)
• Vega 2 from Azul Systems (24/48 cores, for Java and .NET)
• CSX600 from ClearSpeed (PCI-X, 96 cores, 25 GFLOPS, 10 W)

Rumor: AMD is in negotiations with ClearSpeed to use their processor board, a revival of the co-processor!?


Game machines

Microsoft Xbox 360 (available, ~450 CHF): PowerPC based, 3 cores (3.2 GHz each), 2 hardware threads per core, 512 MB memory, peak performance ~1000 GFLOPS

Sony Playstation 3 (Nov 2006): Cell processor (PowerPC + 8 DSP cores), 512 MB memory, peak performance ~1800 GFLOPS

Problems for High Energy Physics: Linux on the Xbox; the focus is on floating-point calculations and graphics manipulation; limited memory, no upgrades possible

INTEL P4 3.0 GHz = ~12 GFLOPS; ATI X1800XT graphics card = ~120 GFLOPS. Use the GPU as a co-processor (32-node cluster at Stony Brook): CPU for task parallelism, GPU for data parallelism (see the sketch below).

A compiler exists and quite some code has already been ported (www.gpgpu.org).
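To make the task-parallelism versus data-parallelism distinction concrete, here is a minimal, hypothetical C++ sketch. It uses plain CPU threads via std::thread rather than the GPU toolkits referenced on the slide, and the array, thread count and "event reading" task are invented for illustration only.

```cpp
// Hypothetical sketch of the two parallelism styles named above; not code
// from the Stony Brook cluster or gpgpu.org.
#include <cstdio>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Data parallelism: the same operation applied to disjoint chunks of one
// array -- the style that maps naturally onto a GPU used as a co-processor.
void scale_chunk(std::vector<float>& v, std::size_t lo, std::size_t hi, float f) {
    for (std::size_t i = lo; i < hi; ++i) v[i] *= f;
}

int main() {
    std::vector<float> data(1000000, 1.0f);

    const unsigned nthreads = 4;
    const std::size_t chunk = data.size() / nthreads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t)
        workers.emplace_back(scale_chunk, std::ref(data), t * chunk,
                             (t + 1 == nthreads) ? data.size() : (t + 1) * chunk,
                             2.0f);
    for (auto& w : workers) w.join();

    // Task parallelism: unrelated activities running concurrently, e.g. the
    // CPU doing I/O and bookkeeping while the number crunching happens elsewhere.
    std::thread io_task([] { std::puts("reading next event..."); });
    std::thread sum_task([&data] {
        double s = std::accumulate(data.begin(), data.end(), 0.0);
        std::printf("checksum = %.0f\n", s);
    });
    io_task.join();
    sum_task.join();
    return 0;
}
```

The data-parallel loop is the kind of work a GPU co-processor handles well; the task-parallel part is the kind of coordination work that stays on the CPU.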


Market trends

The combined market share of AMD + INTEL in the desktop PC, notebook PC and server markets is about 98% (21% + 77%)

On the desktop the relative share is INTEL = 18%, AMD = 82% (this is the inverse ratio of their respective total revenues)

In the notebook area INTEL leads with 63%

The market share in the server market is growing for AMD, 14% currently

Largest growth capacity is in the notebook (mobile) market


Batch Systems

• ATLAS (Laura Perini)
• CMS (Stefano Belforte)
• LHCb (Andrei Tsaregorodtsev)
• ALICE (Federico Carminati)


Databases (convener: Dirk)

• Introduction: Dirk described how LCG databases are kept up to date via asynchronous replication via Streams. He compared the concerns of local and central site managers and how these must be reconciled to provide an overall reliable service.

• Database Service for Physics at CERN (Luca Canali)
• Database Deployment at CNAF (Barbara Martelli)
• Database Deployment at RAL (Gordon Brown)


Optimisation and Bottlenecks (convener: Wojciech Wojcik)

• Performance and Bottleneck Analysis (Sverre Jarp): this is work done in the framework of the CERN openlab collaboration with industry. One of the first choices to make is which compiler gets the best performance from your chip, and then which compiler parameters have which effect. Having explained the methodology and emphasized the importance of selecting good tools, of knowing the chip architecture and of understanding how your algorithm maps onto it, he then presented some results obtained from the openlab collaboration with Intel.

• Code/Compiler Problems (René Brun): threading, and the importance of making programs thread-safe in order to take full advantage of multi-core chips (a small illustrative sketch follows this list).

• Controlling Bottlenecks with BQS (Julien Deveny)

• Optimising dCache and the DPM (Greg Cowan): each Tier 2 site has unique policies and constraints, which leads to various combinations of middleware components. The University of Edinburgh chose dCache and the LCG DPM (Disk Pool Manager). Using XFS showed noticeably better performance in the DPM tests, but not in the dCache tests.
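To illustrate the thread-safety point from the code/compiler talk, here is a small, hypothetical C++ sketch (not code from the talk; the hit counter and the workload are invented). The classic obstacle on multi-core chips is shared mutable state; the usual fix is to accumulate per-thread results and touch the shared variable only briefly, under a lock.

```cpp
// Hypothetical sketch of the thread-safety issue: a global, mutable counter
// (common in old single-threaded code) breaks as soon as several cores run
// the same routine concurrently.
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

long hits = 0;          // shared, mutable state
std::mutex hits_mutex;  // protects 'hits'

// NOT thread-safe: two threads doing '++hits' concurrently is a data race.
void count_unsafe(int n) { for (int i = 0; i < n; ++i) ++hits; }

// Thread-safe: accumulate locally, then update the shared counter once, under a lock.
void count_safe(int n) {
    long local = 0;
    for (int i = 0; i < n; ++i) ++local;
    std::lock_guard<std::mutex> lock(hits_mutex);
    hits += local;
}

int main() {
    std::vector<std::thread> pool;
    for (int t = 0; t < 4; ++t) pool.emplace_back(count_safe, 1000000);
    for (auto& th : pool) th.join();
    std::printf("hits = %ld (expected 4000000)\n", hits);
    return 0;
}
```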


Storage day (1)

• Tape Technology (Don Petravick): at Fermilab, tape capacity doubles every 18-24 months; LTO-3 drives currently store 400 GB, and there is no inherent tape density limit as there is for disc technology. In summary, he claims tape offers high-quality retention technology and simple, reliable units of expansion, but it does complicate Hierarchical Storage Management data handling and it requires specialised skills to manage and operate. The future roadmap appears to face no fundamental engineering limitations.

• Disc Technology (Martin Gasthuber): he presented various disc configurations such as FC SAN, SCSI FC and others. Important components are not only the discs themselves but also the interconnects and the disc and network controllers. Expected performance is 40 MB/s of throughput per TB of storage. He listed issues to consider when acquiring discs: discs are getting just too slow and price per GB is flattening out. He offered some predictions: no further increase in FC use, but rather Serial Attached SCSI (SAS), which will come with smaller form factors; SATA will be around for a while but with no real improvement in performance. He ended by describing Object Storage Devices (OSD), which he believes will come in the coming years: storage in a box, offering multiple protocols.


Storage day (2)

• Hardware Potpourri (Andrei Maslennikov): Andrei described what he called a fat disc server contender. He compared what CERN requires for CASTOR performance with what his configuration can achieve, and he believes it could satisfy the needs of CASTOR at a cheaper price.

• GPFS and StoRM (Luca dell'Agnello)

• Local File Systems (Peter Kelemen): a comparison of XFS and ext3.

• AFS/OSD Project (Ludovico Giammarino): this is being developed at CASPUR in conjunction with CERN and FZK. The principal goal is to improve AFS performance and scalability.

• WAN Access to a Distributed File System (Hartmut Reuter).

• Disk to Tape Migration Introduction (Michael Ernst)

• CASTOR 2 (Sebastien Ponce): a quick overview of CASTOR 2 and how it has changed from version 1.

• dCache (Patrick Fuhrmann)

• HPSS (Andrei Moskalenko)


Virtual Servers for Windows (Alberto Pace)

• Alberto started with a demo of creating a couple of virtual systems on his desktop (one Windows, one Linux using SLC) and, while they were being created, he started the presentation with a history of how virtual computers have long been a dream of computer scientists.

• As the Intel X86 architecture is becoming by far the most commonly-found system in our environments, running virtual X86 systems on real X86 systems is more attractive than previous implementations of virtual computers.

• At CERN there is an ever-increasing number of requests for dedicated servers running individual applications or services, but limitations of space, the management overhead and the often under-used CPUs on many of these servers make virtualisation an interesting option.

• The CERN team has built a number of different configurations of Windows 2003-based servers and Linux (both SLC3 and SLC4) virtual systems which can be called up on demand. The scheme uses the Microsoft Virtual Hosting Server. The user can configure the hardware down to the size of memory, the presence of a floppy or CD/DVD, the number of discs, etc. He or she can request use of the server for a finite time or long-term and more options will be offered in the future.


Why Virtual servers

• More and more requests for dedicated servers in the CERN computer centre
• Excellent network connectivity, to the internet and to the CERN backbone (10 Gbit/s)
• Uninterruptible power supply
• 24x365 monitoring with operator presence
• Daily backup with fast tape drives
• Hardware maintenance, transparent for the “customer”
• Operating system maintenance, patches, security scans
• The “customer” focuses only on “his application”
• Customer not willing to share his server with others, but ready to pay a lot of $$, €€, CHF
• Frame for this server hosting service: http://cern.ch/Win/Help/?kbid=251010


However, after an inside look …

• Installing and maintaining custom servers is time consuming …
• Lot of management overhead

• Space in the computer centre is a scarce resource

• Several of these servers are underused
• Hardly more than 2-3 % CPU usage

• Excellent candidate for virtualization


Goal of virtualization

• Clear separation of hardware management from server (software) management
  • Could even be done by independent teams
• Hardware management
  • Ensures enough server hardware is globally available to satisfy the global CPU + storage demand
  • Manages a large pool of identical machines
  • Hardware maintenance
• Server (software) management
  • Manages server configuration
  • Allocates server images to machines in the pool (a toy allocation sketch follows this slide)
  • Plenty of optimization possible, e.g. automatic reallocation to different hardware according to past performance
• Little overhead
  • Emulation of a PC on a real PC is very efficient
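As a toy illustration of the "allocate server images to machines in the pool" idea above, here is a hypothetical C++ sketch; the host names, the load metric and the least-loaded placement policy are invented for illustration and are not CERN's actual tool.

```cpp
// Hypothetical sketch of placing server images on pool machines; names,
// load metric and policy are invented for illustration only.
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

struct Host {
    std::string name;
    double avg_cpu_load;               // past utilisation of this machine
    std::vector<std::string> images;   // server images currently hosted here
};

// Place the image on the host with the lowest recent load.
Host* allocate(std::vector<Host>& pool, const std::string& image) {
    auto it = std::min_element(pool.begin(), pool.end(),
        [](const Host& a, const Host& b) { return a.avg_cpu_load < b.avg_cpu_load; });
    if (it == pool.end()) return nullptr;   // empty pool
    it->images.push_back(image);
    return &*it;
}

int main() {
    std::vector<Host> pool = {
        {"vmhost01", 0.12, {}},
        {"vmhost02", 0.03, {}},
        {"vmhost03", 0.40, {}},
    };
    if (Host* h = allocate(pool, "win2003-iis-image"))
        std::printf("image placed on %s\n", h->name.c_str());
    return 0;
}
```

Automatic reallocation based on past performance, as mentioned on the slide, would simply re-run a policy like this one against updated load measurements.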


Server on Demand

• Choose from a set of “predefined” images
  • Windows Server 2003
  • Windows Server 2003 + IIS + SOAP + streaming
  • Windows Server 2003 + Terminal Server services
  • …
  • Scientific Linux CERN 3 or 4
  • …
• Takes resources from the pool of available hardware
  • Multiple, different OSes can be hosted on the same box
• Available within 10 minutes
  • Before: between one week and one month
• Cost: much cheaper, especially in manpower
• Performance: no noticeable difference


What’s next?

• We can expect requests for more “server types”
  • Various combinations of OS and applications
• We can expect requests for custom server types
  • The user creates and manages his own server images
• Future server on demand:
  • “I need 20 servers with this image for one month”
  • “I need an image for this server replicated 10 times”
  • “I need more CPU / memory for my server”
  • “I do not need my server for 2 months, give me an image I can reuse later”
  • “I need a test environment, OS version n+1, to which I can migrate my current production services”
  • “I need 10 Macintosh instances …”
  • …


Conclusion

• Server virtualization is a strategic direction for (Windows) server management at CERN
• Hardware and software management can be independent
• We can expect consequences also for traditional batch systems
  • Instead of allocating CPU time for jobs submitted to a rigid OS configuration, one could allocate bare “virtual PC time”
  • The user would submit the “PC image hosting the job”: the farm becomes independent of the OS, with fewer security implications (for the farm management) and unprecedented flexibility for users


Scientific Linux

• Status and Plans (Troy Dawson): current usage of SL is at least 16,000 installations (total of SL3 and SL4). Fermilab itself is standardising on SLF 4.2 and trying to phase out all the unsupported distributions (those before SL3). They are gearing up for SL5, although they are bound by the Red Hat release date for RHEL 5 and realise it will not arrive in time to be packaged and deployed before LHC startup. He asked if there is a long-term need for Itanium releases or any other architecture; the answer, at least from this audience, was no.

• SLC (Jarek Polok): 2100 individual SLC3 installations, 3559 centrally-managed installations and 2400 SLC3 installations outside CERN. SLC 4.3 is just coming into use after its official release at the beginning of April. As explained above, the projected release date of RHEL 5 (only next year) means that SLC4 will be the officially-supported release for LHC startup. It is planned to start migrating the central clusters to it in September this year.


Security (Bob Cowles)

• Bob covered a range of topics, starting with the dangers and risks of Skype, especially of becoming a Supernode when connected to a powerful network; apparently this does not happen to systems behind NAT boxes.

• Skype is banned at CERN and monitored at SLAC. Turning to topical matters, service providers should be concerned about the risks of a bird flu epidemic: if people seriously start to get infected and have to stay home, how do you run the operation, and what happens if staff use infected home PCs to log in?

• He displayed the list of some 30 passwords he had sniffed during the week from among the HEPiX attendees.

• He listed 10 tips to improve security (see overheads).


Passwords

• POP3: kastela3, Romania2, ecdMJee4dD, baum2kid, ghbghb, 1@roma06, ubc789, 84relax, 4q63wbg, light2484, tDsfCxJs

• IMAP: Dadoes63, cal1pat0, dnow12i, Bruck5BD, *hoFK87, 1etsg0, 21filipch, ckmckmir, obheyto, authum1808R2gsumb0, rugbybearv3sm9r-EGEE, k7u0naDad123Red345, 123456Tuesday, ippin, nk0

• SMTP: lworib4u, iosara44, tuesday, ha66il33

• ICQ: gg147231, lalamisi, xircom12, power0123stell, B7A8

• FTP: !!


Next Meeting

• Next meeting at Jefferson Lab, 9th October
• Followed by DESY in Spring 2007