
WP2/WP7 Demonstration


Page 1: WP2/WP7 Demonstration

EU 2nd Year Review – 04-05 Feb. 2003

WP2/WP7 Demonstration

WP7 High Throughput Data transfers

WP2/WP7 Replica Selection based on Network Cost Functions

WP2 Replica Location Service

Page 2: WP2/WP7 Demonstration


High Throughput Data Transfers

Richard Hughes-Jones

Jules Wolfrat

Page 3: WP2/WP7 Demonstration


Demo Setup

We will show data transfers from the Mass Storage system at CERN to the Mass Storage system at NIKHEF/SARA:

2 systems at CERN, Geneva, with datasets from the LHCb experiment

4 Linux systems at NIKHEF/SARA, Amsterdam, to which the data is transferred; each has a disk sub-system I/O bandwidth of ~70 MB/s

All systems have Gigabit Ethernet connectivity

Use GridFTP and measure disk-to-disk performance

[Diagram: the CERN and NIKHEF sites connected across GÉANT and SURFnet]
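As a sketch of how such a transfer might be scripted, assuming the Globus globus-url-copy client is available (the endpoints, paths, and parameter values below are illustrative, not the demo's actual configuration):

    import subprocess

    # Illustrative endpoints only; not the demo's actual hosts or paths.
    SRC = "gsiftp://gridftp.cern.ch/castor/cern.ch/lhcb/dataset001"
    DST = "gsiftp://gridftp.nikhef.nl/data/lhcb/dataset001"

    def gridftp_copy(src, dst, streams=4, tcp_buf=1 << 20):
        """Disk-to-disk GridFTP transfer with parallel streams and large buffers."""
        subprocess.run(
            ["globus-url-copy",
             "-p", str(streams),       # number of parallel TCP streams
             "-tcp-bs", str(tcp_buf),  # TCP buffer size in bytes
             src, dst],
            check=True,
        )

    gridftp_copy(SRC, DST)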

Page 4: WP2/WP7 Demonstration


Demo Consists of:

[Diagram: GridFTP servers at each end moving data over TCP streams between Raid0 disk arrays, with Dante monitoring, node monitoring, and site monitoring alongside]

Page 5: WP2/WP7 Demonstration


European Topology: NRNs, GÉANT, Sites

[Map: the European research network topology, showing CERN, SARA & NIKHEF (SURFnet), and SuperJANET4 sites interconnected by GÉANT]

Page 6: WP2/WP7 Demonstration


Some Measurements of Throughput CERN - SARA

[Plot: Standard TCP, txlen 100, 25 Jan 03 – interface rate (Mbit/s) and receive rate (Mbit/s) vs. time, with Out Mbit/s and In Mbit/s traces]

[Plot: HiSpeed TCP, txlen 2000, 26 Jan 03 – same axes]

[Plot: Scalable TCP, txlen 2000, 27 Jan 03 – same axes]

Using the GÉANT Backup Link

1 GByte file transfers

Standard TCP: average throughput 167 Mbit/s (users typically see 5 - 50 Mbit/s!)

High-Speed TCP: average throughput 345 Mbit/s

Scalable TCP: average throughput 340 Mbit/s
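The gap between these averages and everyday experience is what window tuning addresses: TCP throughput is bounded by window size over round-trip time. A back-of-the-envelope check, assuming a CERN-Amsterdam RTT of roughly 20 ms (an assumed figure, not one measured in the demo):

    # TCP throughput is bounded by window_size / RTT.
    rtt = 0.020                 # assumed CERN-Amsterdam round-trip time (s)
    default_window = 64 * 1024  # typical default socket buffer of the era (bytes)

    max_rate = default_window / rtt              # bytes/s
    print(f"{max_rate * 8 / 1e6:.0f} Mbit/s")    # ~26 Mbit/s, inside the 5-50 range

    # To fill a 1 Gbit/s path, the window must cover the bandwidth-delay product:
    bdp = (1e9 / 8) * rtt                        # = 2.5 MB
    print(f"window needed: {bdp / 1024:.0f} KB") # ~2441 KB

Large windows plus the more aggressive HighSpeed and Scalable congestion-control variants are what recover the 300+ Mbit/s averages above.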

Page 7: WP2/WP7 Demonstration


WP7 High Throughput Achievements

Close collaboration with Dante

“Low”-layer QoS testing over GÉANT: LBE, IP Premium

iGrid 2002 and ER 2002: UDP with LBE; network performance evaluation

EU Review 2003: application-level transfer with real data between EDG sites; proof of concept

Page 8: WP2/WP7 Demonstration


Conclusions

More research on TCP stacks and their implementations is needed

Continued collaboration with Dante to: understand the behaviour of the GÉANT backbone; learn the benefits of QoS deployment

WP7 is taking the “Computer Science” research and knowledge of the TCP protocol & implementation and applying it to the network for real Grid users

Enabling knowledge transfer to sysadmins and end users: EDG release 1.4.x has configuration scripts for TCP parameters for the SE and CE

Firewall rules recommendations

Network tutorials for end users

Work with users – focus on 1 or 2 sites to try to get improvements
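The EDG configuration scripts themselves are not reproduced here; as an illustration of the kind of tuning they apply, a sketch that derives Linux socket-buffer sysctls from an assumed path bandwidth and RTT:

    def tcp_sysctls(bandwidth_bps=1e9, rtt_s=0.020):
        """Suggest socket-buffer sysctls sized to the bandwidth-delay product."""
        bdp = int(bandwidth_bps / 8 * rtt_s)  # bytes needed to keep the pipe full
        return [
            f"net.core.rmem_max = {2 * bdp}",
            f"net.core.wmem_max = {2 * bdp}",
            f"net.ipv4.tcp_rmem = 4096 87380 {2 * bdp}",
            f"net.ipv4.tcp_wmem = 4096 65536 {2 * bdp}",
        ]

    print("\n".join(tcp_sysctls()))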

Page 9: WP2/WP7 Demonstration


WP2/WP7 Replica Selection based on Network Cost Functions

Franck Bonnassieux (WP7)

Kurt Stockinger (WP2)

Page 10: WP2/WP7 Demonstration


NetworkCost functionality

getNetworkCost
File size = 10 MB; results = time to transfer (sec.)

            CERN     RAL      NIKHEF   IN2P3    CNAF
    CNAF    13,08    4,04     6,53     4,5      –
    IN2P3   7,08     6,24     10,38    –        5,03
    NIKHEF  2,66     11,86    –        3,25     11,13
    RAL     4,35     –        7,12     2,44     7,46
    CERN    –        35,44    44,87    77,78    46,75
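A sketch of how a client might consume this matrix; it assumes the tabulated times scale roughly linearly with file size, and the (source, destination) orientation is illustrative since the slide does not spell it out:

    REF_SIZE_MB = 10.0  # the matrix above is quoted for 10 MB files

    # A few entries from the matrix; orientation assumed, see above.
    cost_10mb = {
        ("CERN", "NIKHEF"): 2.66,
        ("CERN", "RAL"): 4.35,
        ("CERN", "IN2P3"): 7.08,
        ("CERN", "CNAF"): 13.08,
    }

    def network_cost(src, dst, file_size_mb):
        """Estimated transfer time in seconds, scaled from the 10 MB reference."""
        return cost_10mb[(src, dst)] * file_size_mb / REF_SIZE_MB

    def cheapest_destination(src, candidates, file_size_mb):
        return min(candidates, key=lambda d: network_cost(src, d, file_size_mb))

    print(cheapest_destination("CERN", ["NIKHEF", "RAL", "CNAF"], 100))  # NIKHEF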

Page 11: WP2/WP7 Demonstration


NetworkCost Architecture

[Diagram: measurement tools (IPerf, UDPmon, GridFTP, PingER) feed raw results into a Distributed Data Collector; the pipeline runs Measure → Collect and Storage → Processing → Archive, synchronised by PCP, and the resulting NetworkCost is published via R-GMA and Globus MDS]

Page 12: WP2/WP7 Demonstration


NetworkCost model

The current cost model is designed for data-intensive computing, and especially for large file transfers

The most relevant metric for that cost model is available throughput

Implementation:

Iperf measurements (current)

GridFTP logs (future)

Other metrics (future): UDP, RTT, jitter, ...

Synchronisation (PCP)
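A sketch of the throughput-to-cost step, assuming recent iperf samples are smoothed with a simple moving average (the service's actual smoothing policy is not described here):

    def available_throughput(samples_mbps, window=5):
        """Smooth the most recent iperf throughput samples (Mbit/s)."""
        recent = samples_mbps[-window:]
        return sum(recent) / len(recent)

    def transfer_cost(file_size_mb, samples_mbps):
        """Cost = predicted transfer time in seconds for a file of the given size."""
        mbps = available_throughput(samples_mbps)
        return file_size_mb * 8 / mbps  # MB -> Mbit, divided by Mbit/s

    # e.g. 10 MB over a link measured around 80 Mbit/s -> ~1 s
    print(transfer_cost(10, [75.0, 82.0, 78.5, 85.1, 79.3]))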

Page 13: WP2/WP7 Demonstration


Replica Management Services

[Diagram: the Replica Manager Client drives the Replica Management Services (Optimization, Replica Metadata, and the Replica Location Service RLS), which in turn use file transfer via GridFTP, the Information Service, and the VO Membership Service]

Page 14: WP2/WP7 Demonstration


Testbed Sites & Replica Manager Commands

edg-rm copyAndRegisterFile -l lfn:higgs CERN LYON

edg-rm listReplicas -l lfn:higgs

edg-rm replicateFile -l lfn:higgs NIKHEF

edg-rm listBestFile -l lfn:higgs CERN

edg-rm getAccessCost -l lfn:higgs CERN NIKHEF LYON

edg-rm getBestFile -l lfn:higgs CERN

edg-rm deleteFile -l lfn:higgs LYON

edg-rm listBestFile -l lfn:higgs CERN
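The logic behind listBestFile and getBestFile can be pictured as the following sketch; replicas_of and access_cost are hypothetical stand-ins for the RLS lookup and the getNetworkCost call:

    def best_replica(lfn, dest_site, replicas_of, access_cost):
        """Pick the replica whose transfer to dest_site is predicted cheapest.

        replicas_of(lfn)        -> sites holding a copy (an RLS lookup)
        access_cost(src, dest)  -> predicted transfer time (network cost function)
        """
        return min(replicas_of(lfn), key=lambda src: access_cost(src, dest_site))

Each edg-rm call above either changes the replica set or asks this selection question again, which is why listBestFile can return a different site after replicateFile or deleteFile.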

Page 15: WP2/WP7 Demonstration


WP2 Replica Location Service

Peter Kunszt

WP2 – Data Management

Page 16: WP2/WP7 Demonstration


Replica Location Service RLS

Local Replica Catalogs (LRCs) hold the actual name mappings

Replica Location Indices (RLIs) redirect inquiries to the LRCs that actually hold the file

LRCs are configured to send index updates to any number of RLIs

The indexes are Bloom filters

[Diagram: three Replica Location Index nodes, each indexing four Local Replica Catalogs]
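A minimal sketch of the Bloom-filter indexing idea, using double hashing to derive bit positions; the filter size, hash count, and hash scheme are illustrative rather than EDG's actual choices:

    import hashlib

    M = 1 << 20  # filter size in bits (illustrative)
    K = 4        # number of hash functions (illustrative)

    def _positions(name):
        # Derive K bit positions from two independent hashes (double hashing).
        h1 = int.from_bytes(hashlib.md5(name.encode()).digest()[:8], "big")
        h2 = int.from_bytes(hashlib.sha1(name.encode()).digest()[:8], "big")
        return [(h1 + i * h2) % M for i in range(K)]

    def make_filter(lfns):
        """LRC side: summarise all locally catalogued file names as one bit set."""
        bits = set()
        for lfn in lfns:
            bits.update(_positions(lfn))
        return bits

    def may_hold(bits, lfn):
        """RLI side: True means 'ask that LRC'; False is definitive."""
        return all(p in bits for p in _positions(lfn))

A membership test can produce false positives (a redirected query may find nothing) but never false negatives, which is what makes such a compact index safe to use for redirection.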

Page 17: WP2/WP7 Demonstration


RLS Demo at SC2002

[Diagram: the SC2002 deployment of Replica Location Index nodes and Local Replica Catalogs across three regions.
Australia: Melbourne (possum, emu, wombat, koala).
United States: ANL, Chicago (n16, n17, n18, n19); ISI, Los Angeles (dc-n1, dc-n2, dc-n3, dc-n4); SC2002, Baltimore (a33, a34, a35, a36); SLAC, Palo Alto (rls01, rls02).
Europe: CERN, Geneva; Glasgow; INFN, Pisa; INFN, Milan (node names as extracted: 0342, 0343, 0344, 0345, 0346, pcr24, pcr25, grid01, grid03, grid1, grid6, grid7.mi, grid7.pi, grid8).]

Page 18: WP2/WP7 Demonstration


RLS Demo Topology Today

Local Replica Catalogs:
CERN: lxshare0342.cern.ch
Glasgow: grid01.ph.gla.ac.uk
California: dc-n2.isi.edu
Melbourne: koala.unimelb.edu.au

Replica Location Indices:
Glasgow: grid03.ph.gla.ac.uk
California: dc-n4.isi.edu
CERN: lxshare0344.cern.ch
Melbourne: wombat.unimelb.edu.au

Page 19: WP2/WP7 Demonstration


SUMMARY

Replica Optimization

WP7 network cost functions are integrated into the Replica Management functionality, providing essential functionality that was missing until now.

This gives us the necessary framework to start work on high-level optimization algorithms.

Replica Location Service

A scalable distributed catalog, as a much-needed replacement for the current Replica Catalog.

Addresses all issues brought up by the experiments. Tests have been conducted with very large catalogs:

The lookup time for an entry is independent of the number of catalog entries; tested for up to 10^8 entries.

The catalog withstands a sustained load of over 1,000 user queries or inserts per second.