Prof. Nectarios Koziris, Vice Chairman, GRNET; ICCS, NTUA


Building a nation-wide production Grid infrastructure: the HELLASGRID project

HELLASGRID in a nutshell

• National Grid Initiative led by GRNET (2002), deployed on top of the 2.5 Gbps GRNET network
• HG-01 cluster @ Demokritos: 64 CPUs, 10 TB FC SAN, 12 TB Tape Library, LCG2/EGEE middleware
• HG-02 to HG-06 clusters located at the NDC, IASA, AUTH, FORTH and CTI research centers:
  ~800 CPUs (x86_64, 2 GB RAM, 80 GB HDD, 2x Gbit)
  ~30 TB total raw SAN storage capacity
  ~80 TB Tape Library
  4 Access Grid nodes

[Figure: map of the GRNET network and the HellasGrid Grid nodes]
Link types: leased lambda 2.5 Gbps PoS; leased lambda 1.25 Gbps Gigabit Ethernet; Athens MAN (2.5 Gbps PoS); dark fibre (not yet lit).
Cities shown: Athens, Thessaloniki, Patra, Larissa, Ioannina, Xanthi, Heraclion, Chania, Rethymnon, Syros.
Grid nodes shown: HG-01-GRNET Isabella @ Demokritos, HG-02-ICS-FORTH, HG-03-CTI-CEID, HG-04-IASA, HG-05-NDC, HG-06-AUTH.

http://www.hellasgrid.gr/infrastructure

HG structure

Main site: HG-01-GRNET (Isabella, cslab@ICCS/NTUA)

HG-02…HG-06 sites + operation centers (NDC, IASA, AUTH, FORTH, CTI)

5 smaller sites (AUTH, UoM, FORTH, Demokritos, HEP-NTUA)

HG CA and VOMS (GridAUTH, Dept. of Physics, AUTH)
HG helpdesk (CTI)
Regional monitoring tools (FORTH)
HG user support/apps (Demokritos + all site teams)
4 AccessGRID sites

HG membership: more than 20 Universities + 15 Research Institutes

Feb 06 update: 6+5 infrastructure sites, > 900 CPUs in total

HG-01-GRNET Isabella

HellasGrid Infrastructure, Phase II, NDC (2/2006)

HG-01: operations targets

High node availability through HW and SW redundancy

Security

Timely resolution of problems

Efficient collaboration between team members; ticketing system, interface with EGEE ticketing

Close cooperation with VOs

HW/SW Redundancy

RAID1 on Service Nodes and WNs
Reliable Data Storage Infrastructure
• RAID5 volumes on storage array
• Redundant FC controllers
• Redundant FC links in failover mode for GPFS storage nodes
• Node redundancy at the GPFS level
Redundant GPFS storage nodes
• One primary / one secondary per Network Shared Disk (NSD)
Redundant network service instances
• DNS: two on-site, two off-site servers
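
The primary/secondary arrangement per NSD can be illustrated with a small sketch. This is a toy model and not GPFS code: the `Nsd` class, the node names, and the `is_up` health map are hypothetical, and GPFS performs this selection internally.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Nsd:
    """Toy model of one NSD with a primary and a secondary storage node."""
    name: str
    primary: str
    secondary: str


def pick_server(nsd: Nsd, is_up: Dict[str, bool]) -> Optional[str]:
    """Return the node that should serve the NSD, preferring the primary."""
    for node in (nsd.primary, nsd.secondary):
        # A node missing from the health map is treated as down (assumption).
        if is_up.get(node, False):
            return node
    return None  # both storage nodes are down: the NSD is unavailable


# Hypothetical names, for illustration only.
nsd = Nsd(name="nsd01", primary="storage-a", secondary="storage-b")
print(pick_server(nsd, {"storage-a": True, "storage-b": True}))
print(pick_server(nsd, {"storage-a": False, "storage-b": True}))
```

With both nodes up the primary is chosen; when the primary is down the secondary takes over, which is the behaviour the slide describes at the storage-node level.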

Security: OpenVPN

Management interfaces unreachable from the outside
Secure remote access to management VLAN using the free OpenVPN tool
Certificate-based authentication, SSL-based encryption
Security hierarchy with different levels
• Platinum: Backup server, Remote Console Access
• Gold: Management Server
• Copper: Worker Nodes for the Grid

[Diagram: secure extension of the Management VLAN over OpenVPN]
Labels: Workstation 2 with a virtual Ethernet switch (Ethernet bridge) between eth0 and tap1; encrypted virtual TUN/TAP Ethernet interface; encrypted Ethernet frames encapsulated in UDP/IP or TCP/IP; Internet; aurum.isabella.grnet.gr (assigned to bridge interface) with a virtual Ethernet switch (Ethernet bridge) between tap1 and eth1; Management VLAN; Cisco Router (DHCP Server).

Day-to-day Operations

Operations in shifts, faster response
Use of global EGEE monitoring tools
Local monitoring tools: Ganglia, MRTG
Vendor-specific tools
IBM Cluster Systems Management
• Monitors various node health parameters
• Sends e-mail alerts which are routed to mobiles

Web-based ticketing system, also for archiving

Weekly meetings

Day2Day Collaboration

Request Tracker (RT): Web-based Ticketing System.
E-mails to hg-01-grnet@hellasgrid.gr are automatically added to the Request Tracker.
Used mainly for day2day collaboration and maintenance. Problem reports are rare.
Permanent archive of information on all events during shifts.
Facilitates integration of new team members.
Acts as an HG Knowledge base.

RT ticketing system: the big picture

Introduction of HG-0x sites:

Streamlining of new site installations
• Guide for new HW installations
• Customized instructions for OS deployment
Certification Period
• Certification SFTs run by the HG-01-GRNET team for all yet uncertified sites
• Site enters production when the tests have run without problems for 5 days
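
The 5-day certification rule can be sketched as a simple check over daily SFT outcomes. This is a minimal illustration, not the HG-01-GRNET team's tooling: the function name, the per-day pass/fail dictionary, and the reading of "5 days" as five consecutive calendar days ending today are all assumptions.

```python
from datetime import date, timedelta
from typing import Dict

REQUIRED_CLEAN_DAYS = 5  # from the slide: tests run without problems for 5 days


def ready_for_production(results: Dict[date, bool], today: date,
                         required_days: int = REQUIRED_CLEAN_DAYS) -> bool:
    """True if the SFTs passed on each of the last `required_days` days.

    The window ends at `today` (inclusive). A day with no recorded result
    counts as a failure (assumption).
    """
    for offset in range(required_days):
        day = today - timedelta(days=offset)
        if not results.get(day, False):
            return False
    return True


# Hypothetical data: six clean days, then the same run with one failed day.
today = date(2006, 2, 20)
clean = {today - timedelta(days=i): True for i in range(6)}
broken = dict(clean)
broken[today - timedelta(days=2)] = False

print(ready_for_production(clean, today))
print(ready_for_production(broken, today))
```

Counting a missing day as a failure keeps the check conservative: a site is not promoted on the strength of days for which no test ran.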

HG Local Users distribution per Discipline

CPU Hours per Site

HellasGrid CA statistics:

CPU time: distribution of overall EGEE VOs usage of HG infrastructure

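[Pie chart: Normalised CPU Time per VO. Legend: atlas, biomed, cms, lhcb, see, others. Slice values as extracted: 28%, 10%, 12%, 40%, 8%, 2%; the slice-to-VO mapping is not preserved in the text.]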

Cornerstones to Hellasgrid:

1. Widely available distributed e-Infrastructures (network, storage, computer nodes)

2. GRID aware communities
• GOCs
• Infrastructure integrators
• middleware developers
• end-users

3. Need for GRID enabled Applications!

“glue && cement” for the cornerstones above?

GRNET network

HG international role

Synergies: South East Europe

SEE Regional Operations Centre
Coordination of SEEGRID project by GRNET
EU Research Infrastructures (EGEE-I & EGEE-II)
EU GRID projects: operations and research (EumedGrid, EuChinaGrid, GRIDCC, e-IRGSP)

GRNET NGI role:

HG follows NGI-EGO paradigm
GRNET:
• Provides/deploys/operates GRID research infrastructure (RI)
• Coordinates national GRID efforts/activities
• Has an application neutral role
GRNET acts as an early champion for the SEE area

(See Tuesday’s talk (11:50) by Ognjen Prnjat, GRNET: “Towards Production Grids in Greenfield Regions”)

Conclusions/Experiences

Local “incubators” for GRID technology needed: experts in local sites
Training, Training, Training!
Users+Apps, Users+Apps, Users+Apps!
Pilot Applications: GRID-APP call (GSRT) received 45 proposals!
GOC & NOCs cooperation, NRENs play vital role
International coordination/concertation in middleware, VOs, infrastructures for production-level GRID
New application calls from GSRT should be planned


Welcome to Hellas...GRID!

For more: www.hellasgrid.gr