Upload
hakiet
View
213
Download
0
Embed Size (px)
Citation preview
SARAComputing & Networking Services
Ronald van der [email protected]
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
Outline
About SARANational and International CollaborationsNational and International CollaborationsOverview of ServicesMain Operational TasksOrganisationOrganisationOperational ProceduresTools Used
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
About SARA
SARA is the Dutch national e-science support center with services in the area of high-performance computing andservices in the area of high performance computing and networking, scientific visualisation, masss data storage and grid servicesNot for profit organisation, based in AmsterdamUsers: Higher Education & Research CommunityFirst supercomputer in The Netherlands at SARA in 1984 (Control Data CYBER 205)One of the European PRACE supernode candidates
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
National Collaborations
BioRange LOFARStichting Nationale Computer
Faciliteiten BioRangewww.nbic.nl
LOFARwww.lofar.nlwww.nwo.nl/ncf
NL-Grid, BIG-Gridwww.nwo.nl/ncfwww.nikhef.nl
Virtual Lab e-Sciencewww.vl-e.nl
SURFnet6 networkwww.surfnet.nlwww.gigaport.nl
www.nbic.nl
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
International Collaborations
Visualization & networkingOptIPuter www.optiputer.netCineGrid www.CineGrid.org
Data storage and processingEGEE gridwww.eu-egee.org
SupercomputingDEISA gridwww.deisa.org
Lambda networkingGLIF, Netherlightwww.glif.is
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
Supercomputing ServicesNational Supercomputer Huygens (capability computing)
65 Tflop/s IBM Power 575 “hydro cluster”2nd half 2008 – end 20112nd half 2008 – end 20113456 processors16 TeraByte memory972 TeraByte directly connected disk spaceWater cooled
National Compute Cluster Lisa (capacity computing)536 nodes536 nodes2 Intel Quad Core Xeon (2.26, 2.33 and 2.5) GHz CPUs per nodeTopspin low-latency high bandwidth Infiniband networkperformance: 19 Tflop/sperformance: 19 Tflop/s48 TB disk space
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
Remote Visualisation Service
Remote Desktop/TPD
RenderVisualizationData
Remote p
Display
TurboVNC
wor
kne
twTF-NOC Preparation Meeting, Copenhagen, 3 May 2010
High Resolution Visualisationg
CosmoGrid: Dutch Computing Challenge Project: DCCP 2008 – 2009 /DEISA Extreme Computing Initiative: DECI 2008 1 1 M core hours / 3 15 M core hours (2 2 / 4 65)
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
Computing Initiative: DECI 2008, 1.1 M core hours / 3.15 M core hours (2.2 / 4.65), Storage: 110 TB, DCCP: Huygens Amsterdam + Cray XT4 Tokyo: coupled via lightpathA cosmological N-body simulation with 8,589,934,592 particles
Main Operational Tasks
Operations & support for the Dutch National Supercomputer Huygens (capability computing)Supercomputer Huygens (capability computing)Operations & support for the Dutch National Cluster Computer Lisa (capacity computing)Mass Storage (LHC TIER-1, LOFAR, BioRange, …)g ( , , g , )Grid & e-science services (EGEE, …)Visualisation services (Render Cluster, Tiled Panels, …)Network infrastructure (IPv4 + IPv6, Ethernet, CWDM)( , , )Operations of SURFnet6 (Dutch NREN network)Operations of NetherLight (Dutch optical exchange point)
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
Organisation
SARA has around 60 employeesOperations User Support and InnovationOperations, User Support and Innovation
Divided in six groupsSupercomputingNetworkinggCluster Computinge-Science SupportMass StorageVisualisation
Operations divided in three areasSupercomputingNetworkingNetworkingGrid & Mass Storage
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
SupercomputingOperations Procedures
B i d t (9 00 17 00)Business day support (9:00-17:00)Incident reports via telephone and emailEach day 1 person is responsible for accepting and dispatching incidentsdispatching incidentsRest of group is actively monitoring systems
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
NetworkingOperations Procedures
24 7 t24x7 supportWorking days from 8:00 to 20:00 (2 shifts)Outside these hours on-call duty engineer
ITIL basedITIL basedIncident reports via telephone and email (8:00 – 20:00)Active monitoring (nagios) outside business hoursOn call duty engineer alerted by beeper via activeOn-call duty engineer alerted by beeper via active monitoring software
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
Grid & Mass StorageOperations Procedures
B i d t (9 00 17 00)Business day support (9:00-17:00)Incident reports via grid ticketing systems (GGUS, etc) and mailing listsEach day 1 person is responsible for accepting andEach day 1 person is responsible for accepting and dispatching incidentsRest of group is actively monitoring systems
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010
Tools Used
NagiosGangliaGangliaCactiPHP-Syslog-NGRancid / CVS for version controlRancid / CVS for version controlcfengineEmail notificationsWiki tracWiki, tracHome built softwareRemedy ARS workflow systemGrid ticketing systems like GGUSGrid ticketing systems like GGUSTicket tool to inform users about networking issues
TF-NOC Preparation Meeting, Copenhagen, 3 May 2010