TeraPaths: A QoS Enabled Collaborative Data Sharing Infrastructure for Petascale Computing Research

The TeraPaths Project Team

CHEP 06


The TeraPaths Project Team

Scott Bradley, BNL

Frank Burstein, BNL

Les Cottrell, SLAC

Bruce Gibbard, BNL

Dimitrios Katramatos, BNL

Yee-Ting Li, SLAC

Shawn McKee, U. Michigan

Razvan Popescu, BNL

David Stampf, BNL

Dantong Yu, BNL


Outline

Introduction

The TeraPaths project

The TeraPaths system architecture

Experimental deployment and testing

Future work


Introduction

The problem: support efficient/reliable/predictable petascale data movement in modern high-speed networks

Multiple data flows with varying priority

Default “best effort” network behavior can cause performance and service disruption problems

Solution: enhance network functionality with QoS features to allow prioritization and protection of data flows


e.g. ATLAS Data Distribution

[Diagram: the tiered ATLAS data distribution model, spanning the ATLAS experiment and its online system, the Tier 0+1 center at CERN, a Tier 1 site (BNL), Tier 2 sites, Tier 3 sites (e.g. UMich, muon calibration), and Tier 4 workstations; the indicated link capacities range from ~PBps and ~GBps near the experiment down through ~10-40 Gbps, ~2.5-10 Gbps, ~10 Gbps, and 100-1000 Mbps between the tiers.]


The QoS Arsenal

IntServ: RSVP, end-to-end, individual flow-based QoS

DiffServ: per-packet QoS marking (see the marking sketch below)

IP precedence (6+2 classes of service)

DSCP (64 classes of service)

MPLS/GMPLS: uses RSVP-TE; QoS compatible; virtual tunnels, constraint-based routing, policy-based routing
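
To illustrate what DiffServ marking looks like from the application side, here is a minimal Java sketch that sets the DSCP bits of an outgoing TCP flow via the standard Socket.setTrafficClass() call. The EF code point, host name, and port are illustrative assumptions; the network devices along the path must still be configured to trust and act on the marking.

import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DscpMarkingExample {
    // DSCP "Expedited Forwarding" code point (46); the traffic-class byte
    // carries the DSCP value in its upper six bits, hence the shift by 2.
    private static final int DSCP_EF = 46;

    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket()) {
            // Ask the OS to mark every packet of this flow with EF; the request
            // may be silently ignored if the process lacks the needed privileges.
            socket.setTrafficClass(DSCP_EF << 2);
            socket.connect(new InetSocketAddress("dtn.example.org", 5001)); // hypothetical endpoint
            OutputStream out = socket.getOutputStream();
            out.write("marked transfer".getBytes());
            out.flush();
        }
    }
}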


Prioritized vs. Best Effort Traffic

[Plot: "Network QoS with Three Classes: Best Effort, Class 4 and EF"; utilized bandwidth (Mbit/second, 0-1200) versus time (0-1000 seconds) for Best Effort, Class 4, and Expedited Forwarding traffic, together with the total and the wire speed.]


The TeraPaths Project

The TeraPaths project investigates the integration and use of LAN QoS and MPLS/GMPLS-based differentiated network services in the ATLAS data-intensive distributed computing environment in order to manage the network as a critical resource.

DOE: the collaboration includes BNL and the University of Michigan, as well as OSCARS (ESnet), Lambda Station (FNAL), and DWMI (SLAC).

NSF: BNL participates in UltraLight to provide the network advances required for enabling petabyte-scale analysis of globally distributed data.

NSF: BNL participates in a new network initiative, PLaNetS (Physics Lambda Network System), led by CalTech.


BNL Site Infrastructure

[Diagram: BNL site infrastructure. GridFtp and dCache/SRM storage elements, together with data transfer management and monitoring services, send bandwidth requests and releases to the TeraPaths resource manager, which is governed by the site's network usage policy and grid AAA. Traffic is identified by addresses, port numbers, and DSCP bits; the resource manager configures LAN QoS across the campus LAN/MPLS infrastructure up to the ingress/egress (M10) router, issues MPLS requests to OSCARS for the ESnet WAN, and forwards remote LAN QoS requests to the remote TeraPaths instance.]


Envisioned Overall Architecture

[Diagram: envisioned overall architecture. TeraPaths instances at Sites A, B, C, and D peer with one another and invoke each other's services as well as those of the intervening WAN providers (WAN 1, WAN 2, WAN 3); service invocations follow the control path, while data flows traverse the sites and WANs end to end.]


Automate MPLS/LAN QoS Setup

QoS reservation and network configuration system for data flows

Access to QoS reservations: manually, through an interactive web interface, or from a program, through APIs

Compatible with a variety of networking components

Cooperation with WAN providers and remote LAN sites

Access control and accounting

System monitoring

Design goal: enable the reservation of end-to-end network resources to assure a specified “Quality of Service”

The user requests minimum bandwidth, start time, and duration

The system either grants the request or makes a “counter offer” (see the sketch below)

The network is set up end-to-end with one user request
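
To make the request and “counter offer” dialogue concrete, the following minimal Java sketch shows what a reservation request and response might carry. The class names, fields, and values (ReservationRequest, offeredBandwidthMbps, the example hosts) are illustrative assumptions and not the actual TeraPaths interfaces.

import java.time.Duration;
import java.time.Instant;

// Hypothetical request/response types illustrating the reservation dialogue;
// the actual TeraPaths interfaces may differ.
public class ReservationExample {

    static class ReservationRequest {
        String srcHost, dstHost;
        long minBandwidthMbps;
        Instant start;
        Duration duration;
    }

    static class ReservationResponse {
        boolean granted;
        String reservationId;       // set when the request is granted
        long offeredBandwidthMbps;  // counter offer otherwise
        Instant offeredStart;
    }

    public static void main(String[] args) {
        ReservationRequest req = new ReservationRequest();
        req.srcHost = "se01.bnl.example";          // hypothetical endpoints
        req.dstHost = "se01.umich.example";
        req.minBandwidthMbps = 1000;
        req.start = Instant.parse("2006-02-15T08:00:00Z");
        req.duration = Duration.ofHours(4);

        ReservationResponse resp = submit(req);    // would invoke the TeraPaths web service
        if (resp.granted) {
            System.out.println("Reserved: " + resp.reservationId);
        } else {
            // The system cannot satisfy the request as stated and proposes an
            // alternative ("counter offer") instead of simply failing.
            System.out.println("Counter offer: " + resp.offeredBandwidthMbps
                    + " Mbps starting at " + resp.offeredStart);
        }
    }

    // Stand-in for the remote web-service invocation.
    static ReservationResponse submit(ReservationRequest req) {
        ReservationResponse r = new ReservationResponse();
        r.granted = false;
        r.offeredBandwidthMbps = 800;
        r.offeredStart = req.start.plus(Duration.ofHours(1));
        return r;
    }
}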


TeraPaths System Architecture

[Diagram: TeraPaths system architecture. At Site A (the initiator), QoS requests arrive via a web page, APIs, or the command line and are handled by the site's web services: user manager, scheduler, site monitor, and router manager, layered over hardware drivers. These services interact with the WAN provider's web services and WAN monitoring, and with the corresponding web services at Site B (the remote site), which control that site's hardware through its own drivers.]


TeraPaths Web Services

TeraPaths modules are implemented as web services

Each network device (router/switch) is accessible/programmable from at least one management node

The site management node maintains the reservation and related databases and distributes network programming by invoking web services on subordinate management nodes

Remote requests to/from other sites invoke the corresponding web services (the destination site's TeraPaths or the WAN provider's); an invocation sketch follows below

Web services benefits: a standardized, reliable, and robust environment; implemented in Java and completely portable; accessible via web clients and/or APIs; compatible with, and easily portable into, Grid services and the Web Services Resource Framework (WSRF in GT4)
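
The end-to-end setup driven by one user request could then proceed roughly as in the sketch below: configure the local LAN, reserve a WAN MPLS path through the provider's service, and ask the remote site's TeraPaths instance to configure the far-end LAN. The service and method names (LocalSiteService, reserveMplsPath, and so on) are assumptions for illustration, not the actual web-service interfaces.

// Minimal sketch of the end-to-end setup sequence; the service interfaces and
// method names are assumptions for illustration only.
interface LocalSiteService {
    void configureLanQos(String flowSpec, long bandwidthMbps);
}

interface WanProviderService {            // e.g. an OSCARS-like reservation service
    String reserveMplsPath(String ingress, String egress, long bandwidthMbps);
}

interface RemoteSiteService {             // the destination site's TeraPaths instance
    void configureLanQos(String flowSpec, long bandwidthMbps);
}

public class EndToEndSetup {
    private final LocalSiteService local;
    private final WanProviderService wan;
    private final RemoteSiteService remote;

    public EndToEndSetup(LocalSiteService local, WanProviderService wan,
                         RemoteSiteService remote) {
        this.local = local;
        this.wan = wan;
        this.remote = remote;
    }

    /** One user request fans out into local LAN, WAN, and remote LAN configuration. */
    public void setUp(String flowSpec, long bandwidthMbps,
                      String ingress, String egress) {
        local.configureLanQos(flowSpec, bandwidthMbps);                          // 1. local LAN QoS
        String circuitId = wan.reserveMplsPath(ingress, egress, bandwidthMbps);  // 2. WAN MPLS path
        remote.configureLanQos(flowSpec, bandwidthMbps);                         // 3. remote LAN QoS
        System.out.println("End-to-end path ready, WAN circuit " + circuitId);
    }
}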


TeraPaths Web Services Structure

[Diagram: structure of the TeraPaths web services. Requests enter through the Web Interface, the APIs, or Remote Invocations from peer TeraPaths instances, and are handled by a set of cooperating modules: AAA Module (AAA), Remote Request Module (RRM), Remote Negotiation Module (RNM), Advance Reservation Module (ARM), Network Programming Module (NPM), Network Configuration Module (NCM), DiffServ Module (DSM), MPLS Module (MSM), Route Planning Module (RPM, a future capability), and Hardware Programming Modules (HPM), one per managed device.]
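
As an illustration of the kind of decision the Advance Reservation Module has to make, the sketch below checks whether a new request fits under the link capacity alongside already accepted reservations and, if not, searches for a later start time that could be returned as a counter offer. The data model and the simple search are assumptions for illustration, not the actual ARM algorithm.

import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Optional;

// Illustrative sketch of an advance-reservation admission check; not the
// actual TeraPaths ARM logic.
public class AdmissionCheck {

    static class Reservation {
        final Instant start, end;
        final long bandwidthMbps;
        Reservation(Instant start, Instant end, long bandwidthMbps) {
            this.start = start; this.end = end; this.bandwidthMbps = bandwidthMbps;
        }
    }

    /** Bandwidth already committed during [start, end), summed conservatively
     *  over every reservation that overlaps the window at all. */
    static long committedDuring(List<Reservation> existing, Instant start, Instant end) {
        long committed = 0;
        for (Reservation r : existing) {
            if (r.start.isBefore(end) && start.isBefore(r.end)) {
                committed += r.bandwidthMbps;
            }
        }
        return committed;
    }

    /**
     * Returns the start time at which the request can be admitted: the requested
     * start if it fits, a later time as the basis for a counter offer, or
     * Optional.empty() if no slot is found within the search window.
     */
    static Optional<Instant> admit(List<Reservation> existing, long capacityMbps,
                                   long requestedMbps, Instant start, Duration duration) {
        Instant candidate = start;
        for (int attempt = 0; attempt < 48; attempt++) {
            Instant end = candidate.plus(duration);
            if (committedDuring(existing, candidate, end) + requestedMbps <= capacityMbps) {
                return Optional.of(candidate);
            }
            candidate = candidate.plus(Duration.ofMinutes(30)); // try a later slot
        }
        return Optional.empty();
    }
}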


Site Bandwidth Partitioning Scheme

Minimum Best Effort traffic

Dynamic bandwidth allocation: shared dynamic class(es), dynamic microflow policing

Mark packets within a class using DSCP bits, police at ingress, trust DSCP bits downstream (see the policing sketch below)

Dedicated static classes: aggregate flow policing

Shared static classes: aggregate and microflow policing
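
Policing at the ingress typically means rate-limiting each flow (microflow policing) or each class (aggregate policing) with a token bucket and dropping or re-marking nonconforming packets. The switches do this in hardware; the sketch below only models the logic, with the rate and burst values chosen arbitrarily.

// Illustrative token-bucket policer, a software model of what an ingress
// switch does per flow (microflow policing) or per class (aggregate policing).
public class TokenBucketPolicer {
    private final double rateBytesPerSec;   // committed rate
    private final double burstBytes;        // maximum burst size
    private double tokens;                  // currently available tokens (bytes)
    private long lastRefillNanos;

    public TokenBucketPolicer(double rateBytesPerSec, double burstBytes) {
        this.rateBytesPerSec = rateBytesPerSec;
        this.burstBytes = burstBytes;
        this.tokens = burstBytes;
        this.lastRefillNanos = System.nanoTime();
    }

    /** Returns true if a packet of the given size conforms (forward with its DSCP
     *  marking intact); false if it exceeds the profile (drop or re-mark). */
    public synchronized boolean conforms(int packetBytes) {
        long now = System.nanoTime();
        // Refill tokens in proportion to the elapsed time, capped at the burst size.
        tokens = Math.min(burstBytes, tokens + (now - lastRefillNanos) * 1e-9 * rateBytesPerSec);
        lastRefillNanos = now;
        if (tokens >= packetBytes) {
            tokens -= packetBytes;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Example: police a flow to 100 Mbit/s with a 64 KB burst allowance.
        TokenBucketPolicer policer = new TokenBucketPolicer(100e6 / 8, 64 * 1024);
        System.out.println("1500-byte packet conforms: " + policer.conforms(1500));
    }
}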


Route Planning with MPLS

[Diagram: route planning with MPLS (future capability). The TeraPaths instances at the two end sites combine their site monitoring with WAN monitoring and the WAN provider's web services to plan MPLS routes across the WAN.]


Experimental Setup

Full-featured LAN QoS simulation testbed using a private network environment:

Two Cisco switches (same models as the production hardware) interconnected with a 1 Gb link

Two managing nodes, one per switch

Four host nodes, two per switch; all nodes have dual 1 Gb Ethernet ports, also connected to the BNL campus network

Managing nodes run the web services and database servers and have exclusive access to the switches

A demo of prototype TeraPaths functionality was given at SC'05


Acquired Experience

Enabled, tested, and verified LAN QoS inside the BNL campus network

Tested and verified MPLS paths between BNL and LBL, SLAC (Network Monitoring Project), and FNAL, as well as an MPLS/QoS path between BNL and UM for SC'05

Integrated LAN QoS with MPLS paths reserved with OSCARS

Installed DWMI network monitoring tools

Determined the effectiveness of OSCARS in guaranteeing and policing bandwidth reservations on production ESnet paths, and its effect on improving jitter for applications requiring predictable delays (http://www-iepm.slac.stanford.edu/dwmi/oscars/)

Examined the impact of prioritized traffic on overall network performance, and the effectiveness and efficiency of MPLS/LAN QoS


Simulated (testbed) and Actual Traffic

[Plot: BNL to UMich, two bbcp disk-to-disk (dtd) transfers with iperf background traffic through an ESnet MPLS tunnel]

[Plot: testbed demo, competing iperf streams]


In Progress / Future Work

Develop and deploy remote negotiation/response and related services to fully automate end-to-end QoS establishment across multiple network domains

Dynamically configure and partition QoS-enabled paths to meet time-constrained network requirements

Develop a site-level network resource manager for multiple VOs vying for limited WAN resources

Support dynamic bandwidth/routing adjustments based on resource usage policies and network monitoring data (provided by DWMI)

Integrate with software from other network projects: OSCARS, Lambda Station, and DWMI

Further goal: widen deployment of QoS capabilities to Tier 1 and Tier 2 sites and create services to be honored/adopted by the CERN ATLAS/LHC Tier 0