
Page 1: [.ppt]

1

Networking for the Future of Science

Networking for the Future of DOE Science

William E. Johnston, ESnet Department Head and Senior Scientist

[email protected], www.es.net. This talk is available at www.es.net

Energy Sciences Network, Lawrence Berkeley National Laboratory

Institute of Computer Science (ICS) of the Foundation for Research and Technology – Hellas (FORTH), Feb. 15, 2008

Page 2: [.ppt]

2

DOE’s Office of Science: Enabling Large-Scale Science

• The U.S. Dept. of Energy’s Office of Science (SC) is the single largest supporter of basic research in the physical sciences in the United States, … providing more than 40 percent of total funding … for the Nation’s research programs in high-energy physics, nuclear physics, and fusion energy sciences. (http://www.science.doe.gov) – SC funds 25,000 PhDs and PostDocs

• A primary mission of SC’s National Labs is to build and operate very large scientific instruments - particle accelerators, synchrotron light sources, very large supercomputers - that generate massive amounts of data and involve very large, distributed collaborations

• ESnet - the Energy Sciences Network - is an SC program whose primary mission is to enable the large-scale science of the Office of Science that depends on:

– Sharing of massive amounts of data
– Supporting thousands of collaborators world-wide
– Distributed data processing
– Distributed data management
– Distributed simulation, visualization, and computational steering
– Collaboration with the US and International Research and Education community

Page 3: [.ppt]

3

ESnet Defined

• A national optical circuit infrastructure
– ESnet shares an optical network on a dedicated national fiber infrastructure with Internet2 (the US national research and education (R&E) network)
• ESnet has exclusive use of a group of 10 Gb/s optical channels on this infrastructure

• A large-scale IP network
– A tier 1 Internet Service Provider (ISP) (direct connections with all major commercial network providers)

• A large-scale science data transport network
– With multiple high-speed connections to all major US and international research and education (R&E) networks in order to enable large-scale science
– Providing virtual circuit services specialized to carry the massive science data flows of the National Labs

• An organization of 30 professionals structured for the service
– The ESnet organization designs, builds, and operates the network

• An operating entity with an FY06 budget of about $26.5M

Page 4: [.ppt]

ESnet as a Large-Scale Science Data Transport Network

A few hundred of the 25,000,000 hosts that the Labs connect to every month account for 50% of all ESnet traffic (which is primarily high energy physics, nuclear physics, and climate data at this point).

Most of ESnet's traffic (>85%) goes to and comes from outside of ESnet. This reflects the highly collaborative nature of large-scale science (which is one of the main focuses of DOE's Office of Science).

(Figure: traffic flow map. The marked points are the R&E source or destination of ESnet's top 100 flows (all R&E); the DOE Lab destination or source of each flow is not shown.)

Page 5: [.ppt]

ESnet Provides Global High-Speed Internet Connectivity for DOE Facilities and Collaborators (12/2007)

(Figure: US network map showing ESnet core hubs and ~45 end user sites: Office of Science sponsored (22), NNSA sponsored (13+), joint sponsored (3), laboratory sponsored (6), and other sponsored (NSF LIGO, NOAA). Commercial peering points (PAIX-PA, Equinix, etc.), specific R&E network peers, and other R&E peering points are marked, along with international peerings: CERN (USLHCnet, DOE+CERN funded), GÉANT (France, Germany, Italy, UK, etc.), Japan (SINet), Russia (BINP), GLORIAD (Russia, China), Korea (KREONET2), Canada (CA*net4), Australia (AARNet), Taiwan (TANet2, ASCC via MREN/StarTap), SingAREN, KAREN/REANNZ, ODN Japan Telecom America, NLR PacketNet, Abilene/Internet2, and AMPATH (S. America, NSF/IRNC funded). Link legend: international (1-10 Gb/s), 10 Gb/s SDN core (Internet2, NLR), 10 Gb/s IP core, MAN rings (≥10 Gb/s), lab-supplied links, OC12 / GigEthernet, OC3 (155 Mb/s), 45 Mb/s and less. Geography is only representational; the footprint spans roughly 2700 miles / 4300 km east-west and 1625 miles / 2545 km north-south.)

Page 6: [.ppt]

6

For Scale

2700 miles / 4300 km east-west; 1625 miles / 2545 km north-south

The relatively large geographic scale of ESnet makes it a challenge for a small organization to build, maintain, and operate the network.

Page 7: [.ppt]

7

ESnet History

• ESnet0/MFENet, mid-1970s-1986: 56 Kbps microwave and satellite links

• ESnet1, 1986-1995: ESnet formed to serve the Office of Science – 56 Kbps to 45 Mbps X.25

• ESnet2, 1995-2000: Partnership with Sprint to build the first national footprint ATM network – IP over 155 Mbps ATM

• ESnet3, 2000-2007: Partnership with Qwest to build a national Packet over SONET network and optical channel Metropolitan Area fiber – IP over 2.5 and 10 Gbps SONET

• ESnet4, 2007-2012: Partnership with Internet2 and the US Research & Education community to build a dedicated national optical network – IP and virtual circuits on a configurable optical infrastructure with at least 5-6 optical channels of 10-100 Gbps each

Page 8: [.ppt]

8

Operating Science Mission Critical Infrastructure

• ESnet is a visible and critical piece of DOE science infrastructure
– if ESnet fails, 10s of thousands of DOE and University users know it within minutes if not seconds

• Requires high reliability and high operational security in the systems that are integral to the operation and management of the network
– Secure and redundant mail and Web systems are central to the operation and security of ESnet
• trouble tickets are by email
• engineering communication by email
• engineering database interfaces are via Web
– Secure network access to Hub routers
– Backup secure telephone modem access to Hub equipment
– 24x7 help desk and 24x7 on-call network engineer
[email protected] (end-to-end problem resolution)

Page 9: [.ppt]

9

Disaster Recovery and Stability

• The network must be kept available even if, e.g., the West Coast is disabled by a massive earthquake, etc.

• Duplicate infrastructure: deploying full replication of the NOC databases and servers and Science Services databases in the NYC Qwest carrier hub

• Replicated services: Spectrum (net mgmt system), DNS (name – IP address translation), Eng database, Load database, Config database, public and private Web, e-mail (server and archive), PKI cert. repository and revocation lists, collaboratory authorization service

• Reliable operation of the network involves
– remote Network Operation Centers (4)
– replicated support infrastructure
– engineers, a 24x7 Network Operations Center, and generator-backed UPS power at all critical network and infrastructure locations
– high physical security for all equipment
– a non-interruptible core: the ESnet core operated without interruption through the N. Calif. power blackout of 2000, the 9/11/2001 attacks, and the Sept. 2003 NE States power blackout

(Figure: map of ESnet hubs – SEA, SNV, ALB, ELP, CHI, ATL, NYC, DC – with remote engineers and partial duplicate infrastructure shown at sites including LBNL, PPPL, BNL, AMES, and TWC.)

Page 10: [.ppt]

10

ESnet Availability 8/2006 through 7/2007

ESnet is a Highly Reliable Infrastructure

(Figure: bar chart of per-site outage minutes and availability, Aug 2006 through Jul 2007, grouped into "5 nines" (>99.995%), "4 nines" (>99.95%), and "3 nines" (>99.5%) bands. Availability ranges from 100.000% at ORNL and SLAC down to 99.624% at Lamont; dually connected sites occupy the high end of the range.)

Note: These availability measures are only for ESnet infrastructure; they do not include site-related problems. Some sites, e.g. PNNL and LANL, provide circuits from the site to an ESnet hub, and therefore the ESnet-site demarc is at the ESnet hub (there is no ESnet equipment at the site). In this case, circuit outages between the ESnet equipment and the site are considered site issues and are not included in the ESnet availability metric.
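To make the availability bands concrete, here is a small, illustrative calculation (not part of the original material) converting outage minutes over a 12-month window into the availability percentages used above; the site names and outage totals in the example are hypothetical.

```python
# Illustrative only: converts outage minutes over a 12-month window into an
# availability percentage, to show how the "nines" bands above are derived.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def availability(outage_minutes: float) -> float:
    """Percent of the 12-month window for which the site had ESnet connectivity."""
    return 100.0 * (1.0 - outage_minutes / MINUTES_PER_YEAR)

if __name__ == "__main__":
    # Hypothetical outage totals -- not values read from the chart
    for site, minutes in [("site A", 0), ("site B", 26), ("site C", 2628)]:
        print(f"{site}: {availability(minutes):.3f}% ({minutes} outage minutes)")
    # ~26 outage minutes/year corresponds to "5 nines" (99.995%);
    # ~2628 outage minutes/year corresponds to "3 nines" (99.5%)
```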

Page 11: [.ppt]

11

ESnet is An Organization Structured for the Service

(Organization chart; staffing levels of 5.5, 8.3, 7.4, 3.0, 5.5, and 3.5 FTEs are shown against the functional areas below.)

• Network engineering, routing and network services, WAN security
• Network operations and user support (24x7x365, end-to-end problem resolution)
• Deployment and WAN maintenance
• Applied R&D for new network services (circuit services and end-to-end monitoring)
• Science collaboration services (Public Key Infrastructure certification authorities, AV conferencing, email lists, Web services)
• Internal infrastructure, disaster recovery, security
• Management, accounting, compliance

Page 12: [.ppt]

12

ESnet FY06 Budget is Approximately $26.6M

Approximate Budget Categories

Total funds: $26.6M
• SC funding: $20.1M
• Other DOE: $3.8M
• SC Special Projects: $1.2M
• Carryover: $1M
• SC R&D: $0.5M

Total expenses: $26.6M
• Circuits & hubs: $12.7M
• Internal infrastructure, security, disaster recovery: $3.4M
• Engineering & research: $2.9M
• WAN equipment: $2.0M
• Collaboration services: $1.6M
• Special projects (Chicago and LI MANs): $1.2M
• Operations: $1.1M
• Target carryover: $1.0M
• Management and compliance: $0.7M

Page 13: [.ppt]

13

A Changing Science Environment is the Key Driver of the Next Generation ESnet

• Large-scale collaborative science – big facilities, massive data, thousands of collaborators – is now a significant aspect of the Office of Science (“SC”) program

• SC science community is almost equally split between Labs and universities

– SC facilities have users worldwide

• Very large international (non-US) facilities (e.g. LHC and ITER) and international collaborators are now a key element of SC science

• Distributed systems for data analysis, simulations, instrument operation, etc., are essential and are now common (in fact dominate in the data analysis systems that now generate 50% of all ESnet traffic)

Page 14: [.ppt]

LHC will be the largest scientific experiment and generate the most data the scientific community has ever tried to manage.

The data management model involves a world-wide collection of data centers that store, manage, and analyze the data and that are integrated through network connections with typical speeds in the 10+ Gbps range.

These are closely coordinated and interdependent distributed systems that must have predictable intercommunication for effective functioning.

[ICFA SCIC]

Page 15: [.ppt]

FNAL Traffic is Representative of all CMS Traffic

Accumulated data (Terabytes) received by CMS Data Centers (“tier1” sites) and many analysis centers (“tier2” sites) during the past 12 months

(15 petabytes of data) [LHC/CMS]


Page 16: [.ppt]

FNAL Outbound Traffic

When A Few Large Data Sources/Sinks Dominate Traffic it is Not Surprising that Overall Network Usage Follows the Patterns of the Very Large Users – This Trend Will Reverse in the Next Few Weeks as the Next Round of LHC Data Challenges Kicks Off

Page 17: [.ppt]

17

What Does this Aggregate Traffic Look like at a Site?

FNAL Outbound CMS Traffic

Max= 1064 MBy/s (8.5 Gb/s), Average = 394 MBy/s (3.2 Gb/s)


Page 18: [.ppt]

18

Total ESnet Traffic Exceeds 2 Petabytes/Month

(Figure: monthly ESnet traffic in bytes accepted, growing from 1 PByte in April 2006 to 2.7 PBytes in July 2007.)

ESnet traffic historically has increased 10x every 47 months.

Overall traffic tracks the very large science use of the network (the sketch below projects this trend).
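As a rough illustration of the quoted trend, the following sketch projects traffic volume assuming the stated 10x-per-47-months growth; the only inputs taken from the slide are the 1 PByte/month (April 2006) and 2.7 PBytes/month (July 2007) figures, and the projection itself is not from the original.

```python
def projected_traffic_pb_per_month(start_pb: float, months_elapsed: float,
                                   tenfold_period_months: float = 47.0) -> float:
    """Petabytes/month after `months_elapsed` months, assuming 10x growth per period."""
    return start_pb * 10 ** (months_elapsed / tenfold_period_months)

# Check the trend against the two points on the chart (15 months apart):
# starting from 1 PB/month in April 2006, the trend predicts ~2.1 PB/month
# for July 2007, against the observed 2.7 PB/month.
print(projected_traffic_pb_per_month(1.0, 15))

# Rough projection 48 months beyond July 2007 (illustrative only): ~28 PB/month
print(projected_traffic_pb_per_month(2.7, 48))
```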

Page 19: [.ppt]

19

Large-Scale Data Analysis Systems (Typified by the LHC) Generate Several Types of Requirements

• The systems are data intensive and high-performance, typically moving terabytes a day for months at a time

• The systems are high duty-cycle, operating most of the day for months at a time in order to meet the requirements for data movement

• The systems are widely distributed – typically spread over continental or inter-continental distances

• Such systems depend on network performance and availability, but these characteristics cannot be taken for granted, even in well run networks, when the multi-domain network path is considered

The applications must be able to get guarantees from the network that there is adequate bandwidth to accomplish the task at hand

The applications must be able to get information from the network that allows graceful failure and auto-recovery and adaptation to unexpected network conditions that are short of outright failure

This slide drawn from [ICFA SCIC]

Page 20: [.ppt]

20

Enabling Large-Scale Science

• These requirements hold generally for systems with widely distributed components that must be reliable and consistent in performing the sustained, complex tasks of large-scale science

• To meet these requirements networks must provide

– high capacity

– communication capabilities that are service-oriented: configurable, schedulable, predictable, reliable, and informative

– and the network and its services must be scalable

Page 21: [.ppt]

21

ESnet Response to the Requirements

• For capacity: a new network architecture and implementation strategy
– Provide two networks: an IP network and a circuit-oriented Science Data Network
– Rich and diverse network topology for flexible management and high reliability
– Dual connectivity at every level for all large-scale science sources and sinks
– A partnership with the US research and education community to build a shared, large-scale, R&E-managed optical infrastructure

• For service-oriented capabilities
– Develop and deploy a virtual circuit service
• Develop the service cooperatively with the networks that are intermediate between DOE Labs and major collaborators to ensure end-to-end interoperability
– Develop and deploy service-oriented, user-accessible network monitoring systems
– Provide collaboration support services: Public Key Certificate infrastructure, video/audio/data conferencing

Page 22: [.ppt]

22

Networking for Large-Scale Science

• It is neither financially reasonable, nor necessary, to address this tremendous growth in traffic by growing the IP network
– The routers in the IP network are capable of routing (determining the path to the destination of) 6,000,000 packets a second, and this requires expensive hardware (~$1.5M/router)
– The paths of the large-scale science traffic are quite static – routing is not needed

• Therefore, ESnet is obtaining the required huge increase in network capacity by using a switched / circuit-oriented network (the switches are about 20-25% of the cost of routers; the arithmetic sketch below makes these numbers concrete)
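The back-of-the-envelope numbers behind this argument can be made explicit. The sketch below is illustrative only: the 6,000,000 packets/second, ~$1.5M router cost, and 20-25% cost fraction come from the slides, while the packet sizes are common illustrative values chosen here.

```python
PACKETS_PER_SECOND = 6_000_000          # routing capacity quoted on the slide

def throughput_gbps(packet_bytes: int, pps: int = PACKETS_PER_SECOND) -> float:
    """Aggregate bit rate if every packet is `packet_bytes` long."""
    return pps * packet_bytes * 8 / 1e9

for size in (64, 1500, 9000):           # small packets, typical MTU, jumbo frames
    print(f"{size:5d} byte packets -> {throughput_gbps(size):6.1f} Gb/s")

router_cost = 1.5e6                      # ~$1.5M per core router (from the slide)
switch_fraction = 0.225                  # switches are roughly 20-25% of router cost
print(f"comparable switch ~ ${router_cost * switch_fraction / 1e3:.0f}K")
```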

Page 23: [.ppt]

23

ESnet4 Network Architecture

(Figure: ESnet4 architecture diagram showing ESnet sites, ESnet IP hubs, ESnet SDN hubs, metro area rings (MANs), other IP networks, and circuit connections to other science networks, e.g. USLHCNet.)

• The multiple interconnected rings provide high reliability
• The IP network physical infrastructure is packet over un-switched Ethernet
• The Science Data Network (SDN) physical infrastructure is switched Ethernet circuits
• Core-to-site-to-core dual connection for those sites not on metro area rings

Page 24: [.ppt]

ESnet 4 Backbone Optical Circuit Configuration (December, 2008)

Approach to Satisfying Capacity Requirements – a Scalable Optical Infrastructure

(Figure: US backbone map with 20G optical circuits linking the hubs: Seattle, Portland, Sunnyvale, Los Angeles, San Diego, Boise, Salt Lake City, Denver, Albuquerque, El Paso, Houston, Kansas City, Chicago, Cleveland, Nashville, Atlanta, Boston, New York City, and Washington DC. Legend: 10 Gb/s SDN core (NLR), 10/2.5 Gb/s IP core (Qwest), 10 Gb/s IP core (Level3), 10 Gb/s SDN core (Level3), MAN rings (≥10 Gb/s), lab-supplied links; ESnet hubs (IP and SDN devices) and future ESnet hubs are marked.)

Page 25: [.ppt]

25

Typical ESnet and Internet2 Core Hub

(Figure: hub diagram. An Infinera DTN optical terminal sits on the Internet2/Level3/Infinera national optical infrastructure, connecting the fiber east/west and north/south paths. A Ciena CoreDirector grooming device feeds the Internet2 T640 router; the ESnet M320 IP-core router and SDN-core switch connect through the DTN to the ESnet metro-area networks and, via an optical interface, to R&E regional nets. Both ESnet and Internet2 attach support devices for measurement, out-of-band access, monitoring, and security. A network testbed is implemented as an optical overlay, with various equipment and experimental control-plane management systems and, in the future, dynamically allocated and routed waves.)

Page 26: [.ppt]

26

Conceptual Architecture of the Infinera DTN Digital Terminal

• Band Mux Module (BMM)
– Multiplexes 100 Gb/s bands onto a 400 Gb/s or 800 Gb/s line

• Digital Line Module (DLM)
– 100 Gb/s DWDM line module (10 x 10 Gb/s)
– Integrated digital switch enables add/drop & grooming at 2.5 Gb/s (ODU-1) granularity

• Tributary Adapter Module (TAM)
– Provides customer-facing optical interface
– Supports a variety of data rates and service types
– Up to 5 TAMs per DLM

• Tributary Optical Module (TOM)
– Pluggable client side optical module (XFP, SFP)
– Supports 10GE, OC48 & OC192 SONET

• Optical Supervisory Channel (OSC)
– Management Plane Traffic – includes traffic from the remote management systems to access network elements for the purpose of managing them
– Control Plane Traffic – GMPLS routing and signaling control protocol traffic
– Datawire Traffic – customer management traffic, carried by interconnecting the customer's 10 Mbps Ethernet LAN segments at various sites through AUX port interfaces

Notes:
1) This conceptual architecture is based on W. Johnston's understanding of the Infinera DTN
2) All channels are full duplex, which is not made explicit here
3) The switch fabric operates within a DLM and also interconnects the DLMs
4) The Management Control Module is not shown

(Figure: block diagram. TOM/TAM client interfaces feed the DLMs (10 x 10 Gb/s, OE/EO), which connect through the intra- and inter-DLM switch fabric and the BMM to the 100 Gb/s bands on the WAN side; the OSC and a control processor handle management.)
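As a quick consistency check of the capacities listed above (my reading of the slide text, not vendor documentation), the following sketch relates the per-DLM capacity, the BMM line rates, and the ODU-1 grooming granularity.

```python
DLM_LANES = 10        # each Digital Line Module carries 10 x 10 Gb/s
LANE_GBPS = 10
ODU1_GBPS = 2.5       # grooming granularity (ODU-1)

dlm_gbps = DLM_LANES * LANE_GBPS                  # 100 Gb/s per DLM (one band)
for line_gbps in (400, 800):                      # BMM line rates from the slide
    print(f"{line_gbps} Gb/s line = {line_gbps // dlm_gbps} bands of {dlm_gbps} Gb/s")

print(f"ODU-1 grooming slots per 10 Gb/s lane: {int(LANE_GBPS / ODU1_GBPS)}")
```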

Page 27: [.ppt]

27

ESnet Metropolitan Area Network Ring Architecture for High Reliability Sites

• MAN fiber ring: 2-4 x 10 Gbps channels provisioned initially, with expansion capacity to 16-64

(Figure: MAN ring diagram. A large science site connects its site gateway router, site edge router, and site LAN to two ESnet MAN switches on the ring; the ring connects to the ESnet IP core hub (T320 IP core router) and to the SDN core switches to the west and east, with independent port cards supporting multiple 10 Gb/s line interfaces and connections to USLHCnet switches. The ring carries the ESnet production IP service, ESnet-managed λ / circuit services, SDN circuits to site systems, virtual circuits to the site, and ESnet-managed virtual circuit services tunneled through the IP backbone.)

Page 28: [.ppt]

28

Typical Medium Size ESnet4 Hub

(Photo of a hub equipment rack:)
• DC power controllers
• 10 Gbps network tester
• Cisco 7609, Science Data Network switch (about $365K, list)
• Juniper M320, core IP router (about $1.65M, list)
• Juniper M7i, IP peering router (about $60K, list)
• secure terminal server (telephone modem access)
• local Ethernet

Page 29: [.ppt]

(Figure: ESnet4 topology map showing layer 1 optical nodes at eventual ESnet points of presence, layer 1 optical nodes not currently in ESnet plans, ESnet IP switch-only hubs, ESnet IP switch/router hubs, ESnet SDN switch hubs, lab sites, and Internet2 circuit numbers. Legend: ESnet IP core, ESnet Science Data Network core, ESnet SDN core / NLR links (existing), lab-supplied links, LHC-related links, MAN links, international IP connections. Metro detail: West Chicago MAN (FNAL, ANL, Starlight, USLHCNet, 600 W. Chicago), Long Island MAN (BNL, 32 AoA NYC, USLHCNet), Atlanta MAN (ORNL, 180 Peachtree, 56 Marietta (SOX), Wash. DC, Houston, Nashville), and San Francisco Bay Area MAN (LBNL, SLAC, JGI, LLNL, SNLL, NERSC); JLab, ELITE, ODU, and MATP connect via Wash., DC.)

Note that the major ESnet sites are now directly on the ESnet "core" network; the bandwidth into and out of FNAL is equal to, or greater than, the ESnet core bandwidth.

Page 30: [.ppt]

30

ESnet4 IP + SDN, 2011 Configuration

(Figure: ESnet4 IP + SDN 2011 configuration map, the same topology and legend as the previous slide, with per-link counts of 3, 4, or 5 and Internet2 circuit numbers marked on each core segment.)

Page 31: [.ppt]

ESnet4 Planned Configuration: 40-50 Gbps in 2009-2010, 160-400 Gbps in 2011-2012

(Figure: planned ESnet4 map. IP core hubs, SDN (switch) hubs, primary DOE Labs, possible hubs, and high-speed cross-connects with Internet2/Abilene are marked. The Science Data Network core and IP core rings span Seattle, Sunnyvale, Los Angeles, San Diego, Boise, Salt Lake, Denver, Albuquerque, El Paso, Houston, Tulsa, Kansas City, Chicago, Cleveland, Nashville, Jacksonville, Atlanta, Washington DC, New York, and Boston. International connections include CERN (30 Gbps, two paths), Canada (CANARIE), Europe (GÉANT), Asia-Pacific, GLORIAD (Russia and China), Australia, and South America (AMPATH). Legend: production IP core (10 Gbps), SDN core (20-30-40 Gbps), MANs (20-60 Gbps) or backbone loops for site access, international connections. The core network fiber path is ~14,000 miles / 24,000 km; the map spans roughly 2700 miles / 4300 km east-west and 1625 miles / 2545 km north-south.)

Page 32: [.ppt]

32

Multi-Domain Virtual Circuits as a Service

• Guaranteed bandwidth service
– User-specified bandwidth – requested and managed in a Web Services framework

• Traffic isolation and traffic engineering
– Provides for high-performance, non-standard transport mechanisms that cannot co-exist with commodity TCP-based transport
– Enables the engineering of explicit paths to meet specific requirements
• e.g. bypass congested links, using lower bandwidth, lower latency paths

• Secure connections
– The circuits are "secure" to the edges of the network (the site boundary) because they are managed by the control plane of the network, which is isolated from the general traffic

• End-to-end (cross-domain) connections between Labs and collaborating institutions

• Reduced cost of handling high bandwidth data flows
– Highly capable routers are not necessary when every packet goes to the same place
– Use lower cost (factor of 5x) switches to forward these relatively static flows

Page 33: [.ppt]

33

Virtual Circuit Service Functional Requirements

• Support user/application VC reservation requests (see the sketch after this list)
– Source and destination of the VC
– Bandwidth, latency, start time, and duration of the VC
– Traffic characteristics (e.g. flow specs) to identify traffic designated for the VC

• Manage allocations of scarce, shared resources
– Authentication to prevent unauthorized access to this service
– Authorization to enforce policy on reservation/provisioning
– Gathering of usage data for accounting

• Provide virtual circuit setup and teardown mechanisms and security
– Widely adopted and standard protocols (such as MPLS and GMPLS) are well understood within a single domain
– Cross-domain interoperability is the subject of ongoing, collaborative development
– Secure end-to-end connection setup is provided by the network control plane
– Accommodate heterogeneous circuit abstractions (e.g. MPLS, GMPLS, VLANs, VCAT/LCAS)

• Enable the claiming of reservations
– Traffic destined for the VC must be differentiated from "regular" traffic

• Enforce usage limits
– Per-VC admission control polices usage, which in turn facilitates guaranteed bandwidth
– Consistent per-hop QoS throughout the network for transport predictability
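A minimal sketch of what a reservation request carrying the fields above might look like follows. This is illustrative only and is not the OSCARS or ESnet schema; every class, field, and value name is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class FlowSpec:
    """Identifies the traffic designated for the circuit (a simplified flow spec)."""
    src_prefix: str
    dst_prefix: str
    protocol: str = "tcp"
    dst_port: Optional[int] = None

@dataclass
class CircuitRequest:
    """One user/application VC reservation request."""
    source: str                # edge device/port at the source site
    destination: str           # edge device/port at the destination site
    bandwidth_mbps: int        # requested guaranteed bandwidth
    start_time: datetime       # when the circuit should come up
    duration: timedelta        # how long the reservation lasts
    flow_spec: FlowSpec        # traffic to be claimed onto the VC
    requester: str             # identity used for authentication/authorization

# Example request (all values hypothetical)
request = CircuitRequest(
    source="fnal-edge-1",
    destination="bnl-edge-2",
    bandwidth_mbps=5000,
    start_time=datetime(2008, 3, 1, 0, 0),
    duration=timedelta(days=30),
    flow_spec=FlowSpec(src_prefix="131.225.0.0/16", dst_prefix="130.199.0.0/16"),
    requester="cms-data-ops",
)
print(request.bandwidth_mbps, request.duration.days)
```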

Page 34: [.ppt]

ESnet Virtual Circuit Service: OSCARS (On-demand Secured Circuits and Advanced Reservation System)

Software Architecture (see ref. 9)
• Web-Based User Interface (WBUI) will prompt the user for a username/password (or Grid proxy cert.) and forward it to the AAAS.
• Authentication, Authorization, and Auditing Subsystem (AAAS) will handle access, enforce policy, and generate usage records.
• Bandwidth Scheduler Subsystem (BSS) will track reservations and map the state of the network (present and future).
• Path Setup Subsystem (PSS) will setup and teardown the on-demand paths (LSPs).

(Figure: a human user submits a request via the WBUI and a user application submits a request via the AAAS; the Reservation Manager (WBUI, AAAS, BSS, and PSS) returns feedback to the user and issues instructions to routers and switches to set up / tear down LSPs.)
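The following skeleton (a paraphrase of the slide text, not OSCARS source code) shows how a request might flow through the four subsystems named above; all class and function names, and the stub policy checks, are illustrative.

```python
class AAAS:
    """Authentication, Authorization, and Auditing Subsystem (stub)."""
    def authenticate(self, credential):
        # Stand-in for the username/password or Grid proxy certificate check
        return credential["user"]

    def authorize(self, principal, request):
        # Stand-in for policy enforcement
        return request["bandwidth_mbps"] <= 10_000

    def record_usage(self, principal, reservation):
        print(f"audit: {principal} holds reservation {reservation['id']}")


class BandwidthScheduler:
    """Tracks reservations against the (present and future) state of the network."""
    def __init__(self):
        self.reservations = []

    def reserve(self, request):
        # A real BSS would check per-link capacity over the requested time window
        self.reservations.append(request)
        return {"id": len(self.reservations), **request}


class PathSetup:
    """Sets up and tears down the on-demand paths (LSPs)."""
    def schedule(self, reservation):
        print(f"PSS: will signal the LSP for reservation {reservation['id']}")


def reservation_manager(credential, request, aaas, bss, pss):
    principal = aaas.authenticate(credential)     # WBUI forwards credential to AAAS
    if not aaas.authorize(principal, request):
        raise PermissionError("reservation not permitted by policy")
    reservation = bss.reserve(request)            # BSS records the reservation
    pss.schedule(reservation)                     # PSS drives router/switch config
    aaas.record_usage(principal, reservation)     # auditing / usage records
    return reservation


reservation_manager({"user": "cms-data-ops"},
                    {"bandwidth_mbps": 5000, "src": "FNAL", "dst": "CERN"},
                    AAAS(), BandwidthScheduler(), PathSetup())
```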

Page 35: [.ppt]

35

Environment of Science is Inherently Multi-Domain

• End points will be at independent institutions – campuses or research institutes – that are served by ESnet, Abilene, GÉANT, and their regional networks
– Complex inter-domain issues – a typical circuit will involve five or more domains – of necessity this involves collaboration with other networks
– For example, a connection between FNAL and DESY involves five domains, traverses four countries, and crosses seven time zones:

FNAL (AS3152) [US] – ESnet (AS293) [US] – GEANT (AS20965) [Europe] – DFN (AS680) [Germany] – DESY (AS1754) [Germany]

Page 36: [.ppt]

36

Inter-domain Reservations: A Hard Problem

• Motivation:
– For a virtual circuit service to be successful, it must
• Be end-to-end, potentially crossing several administrative domains
• Have consistent network service guarantees throughout the circuit

• Observation:
– Setting up an intra-domain circuit is easy compared with coordinating an inter-domain circuit

• Issues:
– Cross-domain authentication and authorization
• A mechanism to authenticate and authorize a bandwidth on-demand (BoD) circuit request must be agreed upon in order to automate the process
– Multi-domain Acceptable Use Policies (AUPs)
• Domains may have very specific AUPs dictating what the BoD circuits can be used for and where they can transit/terminate
– Domain-specific service offerings
• Domains must have a way to guarantee a certain level of service for BoD circuits
– Security concerns
• Are there mechanisms for a domain to protect itself (e.g. RSVP filtering)?

Mostly "no" – so there will not be signaling across domain boundaries

Page 37: [.ppt]

37

Environment of Science is Inherently Multi-Domain

• In order to set up end-to-end circuits across multiple domains without violating the security or allocation management policy of any of the domains, the process is handled by a broker that

1) Negotiates for bandwidth that is available for this purpose

2) Requests that the domains each establish the "meet-me" state so that the circuit in one domain can connect to a circuit in another domain (thus maintaining domain control over the circuits)

(Figure: the broker coordinates the local domain circuit managers of FNAL (AS3152) [US], ESnet (AS293) [US], GEANT (AS20965) [Europe], DFN (AS680) [Germany], and DESY (AS1754) [Germany]; a toy sketch of this two-step pattern follows below.)
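A toy sketch of the two-step broker pattern follows. It is not any production broker; the class names, bandwidth figures, and "meet-me" call are illustrative stand-ins for the negotiation and per-domain setup steps described above.

```python
class DomainCircuitManager:
    """Local circuit manager for one domain (illustrative stub)."""
    def __init__(self, name, available_gbps):
        self.name = name
        self.available_gbps = available_gbps

    def can_supply(self, gbps):
        # Step 1: the domain reports whether it can carry the requested bandwidth
        return gbps <= self.available_gbps

    def setup_meet_me(self, neighbor, gbps):
        # Step 2: the domain provisions its own segment, ending at the agreed
        # inter-domain "meet-me" point, so it retains control of its circuits
        print(f"{self.name}: {gbps} Gb/s segment terminating toward {neighbor}")


def broker_end_to_end(path, gbps):
    # Negotiate: every domain on the path must be able to supply the bandwidth
    if not all(domain.can_supply(gbps) for domain in path):
        raise RuntimeError("insufficient bandwidth in at least one domain")
    # Request meet-me state between each pair of adjacent domains
    for here, there in zip(path, path[1:]):
        here.setup_meet_me(there.name, gbps)
        there.setup_meet_me(here.name, gbps)


domains = [DomainCircuitManager(name, available_gbps=10)
           for name in ("FNAL", "ESnet", "GEANT", "DFN", "DESY")]
broker_end_to_end(domains, gbps=5)
```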

Page 38: [.ppt]

38

OSCARS Update

• OSCARS is the ESnet control plane mechanism
– Establishes the circuit path
– Transport is then via MPLS and VLANs

• Completed porting OSCARS from Perl to Java to better support web services
– This is now the common code base for OSCARS and I2's BRUW

• Paper on OSCARS was accepted by the IEEE GridNets

• Collaborative efforts
– Working with I2 and DRAGON to support interoperability between OSCARS/BRUW and DRAGON
• currently in the process of installing an instance of DRAGON in ESnet
– Working with I2, DRAGON, and TeraPaths (Brookhaven Lab) to determine an appropriate interoperable AAI (authentication and authorization infrastructure) framework (this is in conjunction with GEANT2's JRA5)
– Working with the DICE Control Plane group to determine schema and methods of distributing topology and reachability information
• DICE = Internet2, ESnet, GEANT, CANARIE/UCLP; see http://www.garr.it/dice/presentation.htm for presentations from the last meeting
– Working with Tom Lehman (DRAGON), Nagi Rao (USN), and Nasir Ghani (Tennessee Tech) on multi-level, multi-domain hybrid network performance measurements (funded by OASCR)

Page 39: [.ppt]

39

Monitoring as a Service-Oriented Communications Service

• perfSONAR is a community effort to define network management data exchange protocols, and standardized measurement data gathering and archiving

• Path performance monitoring is an example of a perfSONAR application
– provide users/applications with the end-to-end, multi-domain traffic and bandwidth availability
– provide real-time performance such as path utilization and/or packet drop

• Multi-domain path performance monitoring tools are in development based on perfSONAR protocols and infrastructure
– One example – Traceroute Visualizer [TrViz] – has been deployed in about 10 R&E networks in the US and Europe that have deployed at least some of the required perfSONAR measurement archives to support the tool

Page 40: [.ppt]

perfSONAR architecture

(Figure: layered architecture spanning three network domains. At the interface layer, a performance GUI, an event subscription service, and a human user or client (e.g. part of an application system communication service manager) use a service locator and a path monitor. At the service layer, a topology aggregator and measurement archives collect data from each domain's measurement export service. At the measurement-point layer, measurement points m1…m6 feed their domain's measurement export.)

Examples of what the architecture supports:
• real-time end-to-end performance graphs (e.g. bandwidth or packet loss vs. time)
• historical performance data for planning purposes
• an event subscription service (e.g. end-to-end path segment outage)

• The measurement points (m1…m6) are the real-time feeds from the network or local monitoring devices
• The Measurement Export service converts each local measurement to a standard format for that type of measurement (a minimal sketch of this normalization step follows below)
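As a minimal sketch of the measurement-export role described above (not the perfSONAR schema; every field name here is invented for illustration), the following shows a device-specific utilization sample being normalized into a generic record that another domain's archive could interpret.

```python
from datetime import datetime, timezone

def export_utilization(local_sample: dict) -> dict:
    """Normalize a device-specific interface-utilization sample into a
    domain-neutral record for that measurement type."""
    return {
        "type": "interface-utilization",
        "domain": local_sample["network"],
        "interface": local_sample["iface"],
        "timestamp": datetime.fromtimestamp(
            local_sample["ts"], tz=timezone.utc).isoformat(),
        "value_bps": local_sample["octets_per_sec"] * 8,  # bytes/s -> bits/s
    }

# Example: a made-up raw sample as one domain's local monitoring might report it
raw = {"network": "example-domain-1", "iface": "core1:xe-0/0/0",
       "ts": 1203033600, "octets_per_sec": 250_000_000}
print(export_utilization(raw))
```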

Page 41: [.ppt]

41

Traceroute Visualizer

• Forward direction bandwidth utilization on the application path from LBNL to INFN-Frascati (Italy)
– traffic shown as bars on those network device interfaces that have associated MP services (the first 4 graphs are normalized to 2000 Mb/s, the last to 500 Mb/s)

1 ir1000gw (131.243.2.1)
2 er1kgw
3 lbl2-ge-lbnl.es.net
4 slacmr1-sdn-lblmr1.es.net (GRAPH OMITTED)
5 snv2mr1-slacmr1.es.net (GRAPH OMITTED)
6 snv2sdn1-snv2mr1.es.net
7 chislsdn1-oc192-snv2sdn1.es.net (GRAPH OMITTED)
8 chiccr1-chislsdn1.es.net
9 aofacr1-chicsdn1.es.net (GRAPH OMITTED)
10 esnet.rt1.nyc.us.geant2.net (NO DATA)
11 so-7-0-0.rt1.ams.nl.geant2.net (NO DATA)
12 so-6-2-0.rt1.fra.de.geant2.net (NO DATA)
13 so-6-2-0.rt1.gen.ch.geant2.net (NO DATA)
14 so-2-0-0.rt1.mil.it.geant2.net (NO DATA)
15 garr-gw.rt1.mil.it.geant2.net (NO DATA)
16 rt1-mi1-rt-mi2.mi2.garr.net
17 rt-mi2-rt-rm2.rm2.garr.net (GRAPH OMITTED)
18 rt-rm2-rc-fra.fra.garr.net (GRAPH OMITTED)
19 rc-fra-ru-lnf.fra.garr.net (GRAPH OMITTED)
20
21 www6.lnf.infn.it (193.206.84.223) 189.908 ms 189.596 ms 189.684 ms

link capacity is also provided

Page 42: [.ppt]

42

Federated Trust Services

• Remote, multi-institutional, identity authentication is critical for distributed, collaborative science in order to permit sharing widely distributed computing and data resources, and other Grid services

• Public Key Infrastructure (PKI) is used to formalize the existing web of trust within science collaborations and to extend that trust into cyber space
– The function, form, and policy of the ESnet trust services are driven entirely by the requirements of the science community and by direct input from the science community
– International-scope trust agreements that encompass many organizations are crucial for large-scale collaborations

• The service (and community) has matured to the point where it is revisiting old practices and updating and formalizing them

Page 43: [.ppt]

43

Federated Trust Services – Support for Large-Scale Collaboration

• CAs are provided with different policies as required by the science community
o DOEGrids CA has a policy tailored to accommodate international science collaboration
o NERSC CA policy integrates CA and certificate issuance with NIM (NERSC user accounts management services)
o FusionGrid CA supports the FusionGrid roaming authentication and authorization services, providing complete key lifecycle management

(Figure: the ESnet root CA anchors the DOEGrids CA, NERSC CA, FusionGrid CA, and other CAs.)

See www.doegrids.org

Page 44: [.ppt]

44

DOEGrids CA (Active Certificates) Usage Statistics

(Figure: growth of active user certificates, active service certificates, and total active certificates since the production service began in June 2003; the point at which the US LHC ATLAS project adopted the ESnet CA service is marked. Report as of Jan 17, 2008.)

Page 45: [.ppt]

45

DOEGrids CA Usage – Virtual Organization Breakdown

DOEGrids CA Statistics (7547 certificates): FNAL 30.68%, OSG 28.07%, PPDG 17.91%, iVDGL 14.73%, ANL 2.15%, NERSC 1.70%, LCG 1.29%, ESG 0.94%, LBNL 0.84%, ORNL 0.79%, FusionGRID 0.49%, ESnet 0.36%, PNNL 0.02%, Others 0.02%

* DOE-NSF collab. & Auto renewals
** OSG includes BNL, CDF, CIGI, CMS, CompBioGrid, DES, DOSAR, DZero, Engage, Fermilab, fMRI, GADU, geant4, GLOW, GPN, GRASE, GridEx, GROW, GUGrid, i2u2, ILC, iVDGL, JLAB, LIGO, mariachi, MIS, nanoHUB, NWICG, NYGrid, OSG, OSGEDU, SBGrid, SDSS, SLAC, STAR & USATLAS

Page 46: [.ppt]

46

ESnet Conferencing Service (ECS)

• An ESnet Science Service that provides audio, video, and data teleconferencing service to support human collaboration of DOE science
– Seamless voice, video, and data teleconferencing is important for geographically dispersed scientific collaborators
– Provides the central scheduling essential for global collaborations
– ECS serves about 1600 DOE researchers and collaborators worldwide at 260 institutions
• Videoconferences – about 3500 port hours per month
• Audio conferencing – about 2300 port hours per month
• Data conferencing – about 220 port hours per month
– Web-based, automated registration and scheduling for all of these services

Page 47: [.ppt]

47

ESnet Collaboration Services (ECS)

Page 48: [.ppt]

48

ECS Video Collaboration Service

• High Quality videoconferencing over IP and ISDN

• Reliable, appliance based architecture

• Ad-Hoc H.323 and H.320 multipoint meeting creation

• Web streaming options on 3 Codian MCUs using QuickTime or Real

• 3 Codian MCUs with Web Conferencing Options

• 120 total ports of video conferencing (40 ports per MCU)

• 384k access for video conferencing systems using ISDN protocol

• Access to audio portion of video conferences through the Codian ISDN Gateway

Page 49: [.ppt]

49

ECS Voice and Data Collaboration

• 144 usable ports
– Actual conference ports readily available on the system

• 144 overbook ports
– Number of ports reserved to allow for scheduling beyond the number of conference ports readily available on the system

• 108 floater ports
– Designated for unexpected port needs
– Floater ports can float between meetings, taking up the slack when an extra person attends a meeting that is already full and when ports that can be scheduled in advance are not available

• Audio Conferencing and Data Collaboration using Cisco MeetingPlace

• Data Collaboration = WebEx style desktop sharing and remote viewing of content

• Web-based user registration

• Web-based scheduling of audio / data conferences

• Email notifications of conferences and conference changes

• 650+ users registered to schedule meetings (not including guests)

Page 50: [.ppt]

50

Summary

• ESnet is currently satisfying its mission by enabling SC science that is dependent on networking and distributed, large-scale collaboration

• ESnet has put considerable effort into gathering requirements from the DOE science community, and has a forward-looking plan and the expertise to meet the five-year SC requirements
– A DOE-led external review of ESnet (Feb, 2006) strongly endorsed the plan for ESnet4

Page 51: [.ppt]

51

References

1. High Performance Network Planning Workshop, August 2002 – http://www.doecollaboratory.org/meetings/hpnpw

2. Science Case Studies Update, 2006 (contact [email protected])

3. DOE Science Networking Roadmap Meeting, June 2003 – http://www.es.net/hypertext/welcome/pr/Roadmap/index.html

4. DOE Workshop on Ultra High-Speed Transport Protocols and Network Provisioning for Large-Scale Science Applications, April 2003 – http://www.csm.ornl.gov/ghpn/wk2003

5. Science Case for Large Scale Simulation, June 2003 – http://www.pnl.gov/scales/

6. Workshop on the Road Map for the Revitalization of High End Computing, June 2003 – http://www.cra.org/Activities/workshops/nitrd – http://www.sc.doe.gov/ascr/20040510_hecrtf.pdf (public report)

7. ASCR Strategic Planning Workshop, July 2003 – http://www.fp-mcs.anl.gov/ascr-july03spw

8. Planning Workshops – Office of Science Data-Management Strategy, March & May 2004 – http://www-conf.slac.stanford.edu/dmw2004

9. For more information contact Chin Guok ([email protected]). Also see http://www.es.net/oscars

[LHC/CMS] http://cmsdoc.cern.ch/cms/aprom/phedex/prod/Activity::RatePlots?view=global

[ICFA SCIC] "Networking for High Energy Physics." International Committee for Future Accelerators (ICFA), Standing Committee on Inter-Regional Connectivity (SCIC), Professor Harvey Newman, Caltech, Chairperson. http://monalisa.caltech.edu:8080/Slides/ICFASCIC2007/

[E2EMON] GÉANT2 E2E Monitoring System, developed and operated by JRA4/WI3, with implementation done at DFN. http://cnmdev.lrz-muenchen.de/e2e/html/G2_E2E_index.html and http://cnmdev.lrz-muenchen.de/e2e/lhc/G2_E2E_index.html

[TrViz] ESnet PerfSONAR Traceroute Visualizer. https://performance.es.net/cgi-bin/level0/perfsonar-trace.cgi