4/11/2003 Edward Chow Content Switch 1

Introduction to Content Switch

C. Edward Chow
Department of Computer Science
University of Colorado at Colorado Springs
[email protected]

This tutorial is available at http://cs.uccs.edu/~chow/pub/agere/contentswitch.ppt
with agere as login and ag2003ere as password.

Outline of the Talk

• Overview of Content Delivery Network and Linux Virtual Server Technologies
• Overview of Content Switching Concepts
• TCP Delayed Binding and Its Improvements
• Conflict Detection in Content Switching Rule Sets
• Persistence Issues
• Problems Encountered in Content Processing and Their Solutions
• Specific Implementations and Their Performance
• Achieving High Availability with Content Switch

Content Delivery Network (CDN)

[Figure: clients on several networks (MindSpring, PSINet, Sprint, Gloobix, QWest, @Home, UUnet) send huge numbers of requests to a single host server, leading to slow response and server crash.]


Content Delivery Problems

http://www.akamai.com

Use Client Cache/Client Side Cache Server

[Figure: the same networks (MindSpring, PSINet, Sprint, Gloobix, QWest, @Home, UUnet) with a client cache and a client side cache server added near the clients; the host server sees fewer requests and clients get fast response.]

Use Mirror Sites

[Figure: the same networks with mirror sites added; the host server sees fewer requests and clients get fast response.]

• Needs improvement: the selection of mirror servers should be guided by server load/network bandwidth measurement.

Edge Network Cache Servers

[Figure: the same networks with a client cache, a client side cache server, mirror sites, and edge network cache servers placed inside the networks; the host server sees fewer requests and clients get fast response.]

Content Delivery Problem

• Cache Location Problem: Where to put cache servers?
• How many are needed?
• When/where/how to push/deliver the content?
• How about dynamic content?

Akamai Edge Delivery Service

• Peering Bottleneck Problem: Access traffic is evenly spread over 7400+ networks (no one over 5%; most << 1%). Need to put edge servers in many networks.
• 11/2000, 4 billion bits/day for 2800 sites.
• Source: http://www.akamai.com

Date      # of Edge Servers   # of Networks   # of Countries
11/2000   6000                335             54
6/2001    9700                650             56

Caching Dynamic Content at Web Proxies

• Active Cache Project: [PeiCao 98] Univ. Wisconsin
– Cache Java applet to be executed at proxies
– Choice of passing to server, delivering cached copy, or generating dynamically.
• Edge Side Include (ESI):
– XML tag to specify ESI fragment in a web page.
– Each ESI fragment can have different cache/

Edge Side Include Example
http://www.esi.org/

<table><tr><td colspan="2">
<esi:try>
  <esi:attempt>
    <esi:include src="http://www.myxyz.com/news/top.html" onerror="continue" />
  </esi:attempt>
  <esi:except>
    <!--esi
    This spot is reserved for your company's advertising.
    For more info <a href="www.myxyz.com"> click here </a>
    -->
  </esi:except>
</esi:try>
</td></tr></table>

Solution to First Mile Problem

• First Mile Problem: Huge requests at web site of CDN
• High Bandwidth Connection
• Caching
– End System Cache
  • Client Cache
  • Client Site Proxy Cache Server
  • Mirror Site Caches
– Cache Servers in Internet
  • Hierarchical Cache Servers, e.g., Squid/Harvest/Adaptive Web
  • Edge Servers of Akamai
• Faster Server/Server Farm (Server Side Caching+Cluster)
• Layer4 Load balancer+Real Servers
• Content Switch+Real Servers
• Distributed Packet Rewrite

Web Server Cluster

[Figure: a load balancer or content switch in front of a cluster of real servers.]

Load balancer can run at
• Application level: Reverse Proxy
• Kernel level: Linux Virtual Server

Load balancer can distribute requests based on
• Layer 3-4 info: fixed fields/fast hash
• Layer 7 info: variable length/slow parsing
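The difference between the two kinds of request distribution can be sketched as follows. This is only an illustration: the server names, the CRC32 hash, and the URL rules are assumptions made for the example, not taken from any product described in this talk.

```python
import zlib

# Hypothetical real server names, used only for illustration.
REAL_SERVERS = ["RIP1", "RIP2", "RIP3"]


def pick_by_layer4(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Layer 3-4 decision: hash a few fixed-size header fields."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return REAL_SERVERS[zlib.crc32(key) % len(REAL_SERVERS)]


def pick_by_layer7(http_request: bytes) -> str:
    """Layer 7 decision: the variable-length request line must be parsed first."""
    request_line = http_request.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split(" ")
    path = parts[1] if len(parts) >= 2 else "/"
    # Assumed example rules: images on one server, CGI scripts on another.
    if path.startswith("/images/"):
        return "RIP1"
    if path.endswith(".cgi"):
        return "RIP2"
    return "RIP3"
```

The layer 4 choice needs only the packet header, so it can be made on the first SYN packet; the layer 7 choice has to wait until the request data arrives.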

Comparison of Load Balancers

• Reverse Proxy runs as an application process; requires more memory/packet copying.
• Linux Virtual Server runs in kernel; no memory copying.

Name                                  Type   Level           Layer Info
Reverse Proxy/Apache/Tomcat/Servlet   SW     Application     3-7
Linux Virtual Server                  SW     Kernel          3-4
Linux Content Switch                  SW     Kernel/Appl.    3-7
Layer4 Switch (narrow def.)           HW     Embedded OS     3-4
Content/Web Switch                    HW     Embedded OS     3-7

Linux Virtual Server (LVS)

• “Virtual server is a highly scalable and highly available server built on a cluster of real servers. The architecture of the cluster is transparent to end users, and the users see only a single virtual server” with Virtual IP address (VIP).
• http://www.linuxvirtualserver.org/

[Figure: a client (CIP) reaches the load balancer/director (a Linux box) at the VIP through the Internet; the director reaches Real Server1, Real Server2 and Real Server3 (RIP1, RIP2, RIP3) over a WAN/LAN.]

CIP: Client IP Address
VIP: Virtual IP Address
RIP: Real Server IP Address

LVS-NAT Configuration (Network Address Translation)

• All return traffic goes through the Director: slow
• Modify IP addr/port #/checksum at Director
• Director and real servers on the same LAN
• No modification needed on real servers
• Port remapping: real web server can run on 8080

[Figure: client (CIP) reaches the Director at the VIP through the Internet; the Director connects through a switch to Real Server1, Real Server2 and Real Server3 (RIP1, RIP2, RIP3).]

LVS-NAT Configuration Step 2. Director routes Pkt

• Based on CIP, source port #, VIP and dst port #, the director selects one of the real servers.
• Change the dst IP addr or port # of the pkt.

[Figure: 1. request arrives at the Director with (src CIP, dst VIP); 2. scheduling/rewrite packet: the Director forwards it with (src CIP, dst RIP1). The LVS routing/scheduling rules are set with the ipvsadm cmd.]

LVS-NAT Configuration Step 3. Real Server Replies

• Real server retrieves the response.
• All real servers set their default gateway to the Director, like any other NAT or IP masquerade setup.
• Packet will be sent back to the Director.

[Figure: 1. request (src CIP, dst VIP); 2. scheduling/rewrite packet (src CIP, dst RIP1); 3. process request: the real server replies with (src RIP1, dst CIP).]

LVS-NAT Configuration Step 4. Director rewrites reply

• Director changes the src IP addr. (RIP1) of the pkt to VIP.
• Modify port # if needed.
• Modify the checksum; send back pkt.

[Figure: 1. request (src CIP, dst VIP); 2. scheduling/rewrite packet (src CIP, dst RIP1); 3. process request, reply (src RIP1, dst CIP); 4. rewrite reply to (src VIP, dst CIP).]

LVS-NAT Configuration (Network Address Translation)

• All return traffic goes through the Director: slow
• Modify IP addr/port #/checksum at Director.
• Director and real servers on the same LAN

[Figure: complete LVS-NAT flow. 1. request (src CIP, dst VIP); 2. scheduling/rewrite packet (src CIP, dst RIP1); 3. process request, reply (src RIP1, dst CIP); 4. rewrite reply to (src VIP, dst CIP); 5. client receives reply.]
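The two rewrites done by the Director in LVS-NAT can be sketched as below. This is a minimal sketch under stated assumptions: the Packet class and function names are made up for illustration, the addresses are example values, and the checksum update that the Director must also perform is left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    """Only the address fields that LVS-NAT touches (hypothetical model)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


VIP = ("202.103.106.5", 80)     # example virtual service address
RIP1 = ("172.16.0.3", 8000)     # example real server, with port remapping


def rewrite_request(pkt: Packet, real_server: tuple) -> Packet:
    """Step 2: (CIP -> VIP) becomes (CIP -> RIP); only the destination changes."""
    rip, rport = real_server
    return replace(pkt, dst_ip=rip, dst_port=rport)


def rewrite_reply(pkt: Packet, vip: tuple) -> Packet:
    """Step 4: (RIP -> CIP) becomes (VIP -> CIP); only the source changes."""
    ip, port = vip
    return replace(pkt, src_ip=ip, src_port=port)
```

Because the reply must pass through `rewrite_reply`, the real servers have to use the Director as their default gateway, which is why all return traffic goes through it.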

LVS-NAT Setup Commands

# make the director forward the masquerading packets
echo 1 > /proc/sys/net/ipv4/ip_forward
ipchains -A forward -j MASQ -s 172.16.0.0/24 -d 0.0.0.0/0
# Add virtual service and link a scheduler to it
ipvsadm -A -t 202.103.106.5:80 -s wlc    # Weighted Least-Connection scheduling
ipvsadm -A -t 202.103.106.5:21 -s wrr    # Weighted Round Robin scheduling
# Add real servers and select forwarding method and weight
ipvsadm -a -t 202.103.106.5:80 -R 172.16.0.2:80 -m
ipvsadm -a -t 202.103.106.5:80 -R 172.16.0.3:8000 -m -w 2
ipvsadm -a -t 202.103.106.5:21 -R 172.16.0.2:21 -m
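The idea behind the wlc scheduler selected above can be sketched as follows. This is a simplified illustration, not the LVS kernel code: it assumes the load of a server is just its active connection count divided by its weight, and the class and function names are made up.

```python
from dataclasses import dataclass


@dataclass
class RealServer:
    name: str
    weight: int
    active: int = 0   # active connections currently assigned


def pick_wlc(servers: list) -> RealServer:
    """Pick the server with the fewest active connections per unit of weight."""
    candidates = [s for s in servers if s.weight > 0]
    if not candidates:
        raise ValueError("no real server with positive weight")
    chosen = min(candidates, key=lambda s: s.active / s.weight)
    chosen.active += 1   # the new connection now counts against this server
    return chosen
```

With the weights from the commands above (default weight 1 for 172.16.0.2:80 and `-w 2` for 172.16.0.3:8000), the second server ends up with about twice as many connections as the first.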

LVS-Tunnel Configuration (IP Tunneling)

• Real servers need to handle IP over IP packets.
• Real servers can be geographically separated, and return traffic goes through different routes.
• Security implication!

[Figure: 1. request (src CIP, dst VIP) arrives at the load balancer (Linux box, RIP0); 2. scheduling/put packet in IP tunnel: the packet is wrapped as (src RIP0, dst RIP2) around (src CIP, dst VIP) and sent through an IP tunnel to one of Real Server1, Real Server2, Real Server3; 3. real server processes request; 4. client receives reply (src VIP, dst CIP).]
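The IP over IP handling in LVS-Tunnel can be sketched as below. This is an illustrative model only: the IPPacket class and the function names are assumptions, and real packets carry full IP headers rather than these three fields.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IPPacket:
    src: str
    dst: str
    payload: Union["IPPacket", bytes]


def encapsulate(inner: IPPacket, director_ip: str, real_server_ip: str) -> IPPacket:
    """Director: wrap the untouched client packet in an outer IP header."""
    return IPPacket(src=director_ip, dst=real_server_ip, payload=inner)


def decapsulate(outer: IPPacket) -> IPPacket:
    """Real server: strip the outer header to recover (CIP -> VIP)."""
    if not isinstance(outer.payload, IPPacket):
        raise ValueError("not an IP over IP packet")
    return outer.payload


def reply_from_real_server(inner_request: IPPacket, body: bytes) -> IPPacket:
    """Real server answers as the VIP, directly to the client."""
    return IPPacket(src=inner_request.dst, dst=inner_request.src, payload=body)
```

Since the inner packet still carries the VIP as destination, the real server can reply with the VIP as source and the reply does not need to pass through the load balancer.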

LVS-Tunnel Setup Commands

# The load balancer (LinuxDirector), kernel 2.2.14
echo 1 > /proc/sys/net/ipv4/ip_forward
ipvsadm -A -t 172.26.20.110:23 -s wlc
ipvsadm -a -t 172.26.20.110:23 -r 172.26.20.112 -i

# The real server 1, kernel 2.2.14
echo 1 > /proc/sys/net/ipv4/ip_forward
# insert it if it is compiled as module
insmod ipip
ifconfig tunl0 172.26.20.110 netmask 255.255.255.255 broadcast 172.26.20.110 up
route add -host 172.26.20.110 dev tunl0
echo 1 > /proc/sys/net/ipv4/conf/all/hidden
echo 1 > /proc/sys/net/ipv4/conf/tunl0/hidden

LVS-DR Configuration (Direct Routing)

• Real servers need to configure a non-ARP alias interface with the virtual IP address, and that interface must share the same physical segment with the load balancer.
• Only the Director’s interface replies to VIP ARP requests.
• Director only rewrites the server MAC address; IP packet not changed: fast!

[Figure: 1. request arrives through the router/switch as a frame with (src GMAC, dst VMAC) carrying (src CIP, dst VIP); 2. scheduling/rewrite packet: the Director forwards the frame with (src VMAC, dst RMAC3) carrying the same (src CIP, dst VIP). Real Server1, Real Server2 and Real Server3 have MAC addresses RMAC1, RMAC2 and RMAC3.]

GMAC: Gateway MAC address

LVS-DR Configuration Step 3. Process Request

• Real server processes the request and returns the reply.
• Reply goes directly through the switch/router, not the Director.

[Figure: 1. request frame (src GMAC, dst VMAC) carrying (src CIP, dst VIP); 2. scheduling/rewrite packet: frame (src VMAC, dst RMAC3) carrying (src CIP, dst VIP); 3. process request: the real server replies with frame (src RMAC3, dst GMAC) carrying (src VIP, dst CIP); 4. client receives reply (src VIP, dst CIP).]

GMAC: Gateway MAC address
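The frame rewriting in LVS-DR can be sketched as below. The Frame class and function names are assumptions made for illustration; the point is that only the MAC addresses change on the way in, and the reply is built by the real server itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    """Ethernet addresses plus the IP addresses they carry (hypothetical model)."""
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def director_forward(frame: Frame, director_mac: str, real_mac: str) -> Frame:
    """Step 2: rewrite MAC addresses only; the IP packet is not changed."""
    return replace(frame, src_mac=director_mac, dst_mac=real_mac)


def real_server_reply(request: Frame, real_mac: str, gateway_mac: str) -> Frame:
    """Step 3: reply as the VIP, sent straight to the gateway, bypassing the Director."""
    return Frame(src_mac=real_mac, dst_mac=gateway_mac,
                 src_ip=request.dst_ip, dst_ip=request.src_ip)
```

The real server can accept the forwarded frame only because it has the VIP configured on a non-ARP alias interface.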

LVS-DR Setup Commands

# The load balancer (LinuxDirector), kernel 2.2.14 or later
echo 1 > /proc/sys/net/ipv4/ip_forward
ipvsadm -A -t 172.26.20.110:23 -s wlc
ipvsadm -a -t 172.26.20.110:23 -r 172.26.20.112 -g

# The real server 1, 172.26.20.112, kernel 2.2.14 or later
echo 1 > /proc/sys/net/ipv4/ip_forward
ifconfig lo:0 172.26.20.110 netmask 255.255.255.255 broadcast 172.26.20.110 up
route add -host 172.26.20.110 dev lo:0
echo 1 > /proc/sys/net/ipv4/conf/all/hidden
echo 1 > /proc/sys/net/ipv4/conf/lo/hidden

Performance of LVS-based Systems

“We ran a very simple LVS-DR arrangement with one PII-400 (2.2.14 kernel) directing about 20,000 HTTP requests/second to a bank of about 20 Web servers answering with tiny identical dummy responses for a few minutes. Worked just fine.” Jerry Glomph Black, Director, Internet & Technical Operations, RealNetworks.

“I had basically (1024) four class-Cs of virtual servers which were loadbalanced through a LinuxDirector (two, actually -- I used redundant directors) onto four real servers which each had the four different class-Cs aliased on them.” "Ted Pavlic" <[email protected]>

LVS Usage Survey 2/15/2001 Lorn Key

Clusters                   20                      1             2               2                2
Directors per Cluster      2                       2             2               2                2
Total Real Servers         170                     12            4               15               6
Routing Methods            DR/NAT                  DR            NAT             DR               NAT
Schedule Methods           RR/WLC                  WRR           LC              WLC              WLC
Types of Real Servers      RH6.2                   Linux         Win, Linux      Linux, Solaris   RH
Service Offered            WWW                     WWW/other     WWW, DB         WWW, SMTP        WWW
File System Replication    rsync                   rsync         Coda, NFS       Custom           rsync, custom
Monitoring Software        Heartbeat, ldirectord   Nanny/Pulse   Heartbeat, Mon  Nanny, Pulse     Heartbeat

C. Edward Chow
Department of Computer Science

University of Colorado at Colorado Springs

Sponsored by Computer Comm. Lab/ITRI

Content Switch Topics

• What is a Content Switch?
• What Services It Can Provide
• Content Switch Example
• Related Technologies
• Content Switch Architecture and Basic Operations
• TCP Delay Binding and Related Improvement
• Content Switch Rule and Conflict Detection
• Conclusion

Content Switch (CS)

• Route packets based on high layer (Layer 5/7) headers and content.
• Examples:
– Direct Web traffic based on pattern of
  • URLs, cookies: URL Switching
  • XML Tag Value: Web Switching
– Can route incoming email based on email address; connect POP/IMAP based on login
• Web switches and Intel XML Director/accelerator are special cases of content switch.

What Services It Can Provide

• Enabling premium services for e-commerce, ISP, and Web hosting providers
• Load balancing and highly available server clusters: Web, E-commerce, Email, Computing, File, SAN
• Policy-based networking, differentiated/QoS services
• Firewall, strengthening DoS protection, cache/firewall load-balancing
• ‘Flash-crowd’ management
• Email spam protection, virus detection/removal
• Applet authentication/filtering

F5 VRM Solution

[Figure: three sites (Site I newyork.domain.com, Site II losangeles.domain.com, Site III tokyo.domain.com) connected through the Internet, with a router, BIG-IP units and server arrays; a user at london.domain.com with its local DNS; 3-DNS; and GLOBAL-SITE used by the webmaster.]

ServerIron 100 Web Switch

• Integrated Layer 2 through Layer 7 switching
• Support for up to 7,000,000 concurrent sessions, and 20 Gbps of throughput
• High-availability server load balancing with active/active configuration and stateful fail-over
• Industry's most powerful content switching capabilities, including URL, Cookie and SSL Session ID based switching
• Content-aware cache switching
• High performance VPN/Firewall load balancing
• Robust protection against Denial of Service (DoS) attacks
• Most comprehensive global server load balancing with DNS Proxy and client proximity measurements

Cisco CSS11000 Content Service Switch

Comprises four high-speed RISC processors, with 512 MB of memory, and 20.0 Gbps of throughput. Distributed flow forwarding engines feature up to 16 port-level network processors with up to 128 MB of memory for wire-speed delivery of Web content. Support for "sticky" connections based on IP address, Secure Socket Layer (SSL) session ID, and cookies ensures reliability and security for e-commerce transactions. The unique Cisco content replication technology enables dynamic expansion of site capacity in response to sudden "flash crowds" for "hot" content or seasonal peaks in traffic that can overwhelm servers.


Nortel Alteon Web Switch

• Provides wire-speed Layer 2/3 Ethernet switching, plus high-speed processing based on Layer 4 through 7 information (TCP ports, URLs, HTTP headers and cookies, SSL session ID, etc.)

• Processes hundreds of thousands of concurrent sessions each second on eight multi-rate Ethernet ports, (rate selectable per port), with one Gigabit or 100/1000 Mbps Ethernet uplink port

• Performs local and global server load balancing, application redirection, content filtering, streaming media load balancing, wireless Internet load balancing and content-aware Layer 7 switching

• Filters packets based on up to 2048 filtering rules (224 filtering rules for Alteon AD3/180e Web Switches), uniquely definable per switch and per port

• Meters, controls, and accounts for bandwidth use by client, server farm, virtual service, application, user class, content type and other traffic classes, and supports guaranteed minimum, metered available, and maximum burst bandwidth rates

Intel Netstructure XML Director 7280

• Example of Rule:
  Server1: create */order.asp & //Amount[Value >= 10000]

Phobos In-Switch

• Only load balancing switch in a PCI card form factor

• Plugs directly into any server PCI slot

• Supports up to 8,192 servers, ensuring availability and maximum performance

• Six different algorithms are available for optimum performance: Round Robin, Weighted Percentage, Least Connections, Fastest Response Time, Adaptive and Fixed.

• Provides failover to other servers for high-availability of the web site

• U.S. Retail $1995.00

E-Commerce Example: 1. Client

Client submits via HTTP/Post (or SOAP) the following purchase in XML:

<purchase>
  <customerName>CCL</customerName>
  <customerID>111222333</customerID>
  <item>
    <productID>309121544</productID>
    <productName>IBM Thinkpad T21</productName>
    <unitPrice>5000</unitPrice>
    <noOfUnits>10</noOfUnits>
    <subTotal>50000</subTotal>
  </item>
  <item>
    <productID>309121538</productID>
    <productName>Intel wireless LAN PC Card</productName>
    <unitPrice>200</unitPrice>
    <noOfUnits>10</noOfUnits>
    <subTotal>2000</subTotal>
  </item>
  <totalAmount>52000</totalAmount>
</purchase>

E-Commerce Example: 2. Content Switch

• Content switch receives the packet.
• Recognizes it is an HTTP POST request from the HTTP request line
    POST /purchase.cgi HTTP/1.1
• Recognizes it is an XML document from the meta header
    content-type: TEXT/XML
• Parses the XML content
• Extracts values of tag sequences:
    purchase/totalAmount    52000
    purchase/customerName   CCL
• Rule 1 is matched and the packet is routed to one of highSpeedServers.
    Rule 1: if (xml.purchase/totalAmount > 5000) routeTo(highSpeedServers);
    Rule 2: if (xml.purchase/customerName == CCL) routeTo(specialCustomerServers);
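The extraction and rule matching steps for this purchase example can be sketched as follows. It is only a sketch: it assumes first-match-wins rule ordering, uses Python's standard XML parser rather than anything a real content switch runs, and the `defaultServers` group is a made-up fallback that does not appear in the example.

```python
import xml.etree.ElementTree as ET


def extract(xml_text: str, tag_path: str):
    """Return the text of a tag sequence such as 'purchase/totalAmount', or None."""
    root = ET.fromstring(xml_text)
    head, _, rest = tag_path.partition("/")
    if root.tag != head:
        return None
    node = root.find(rest) if rest else root
    if node is None or node.text is None:
        return None
    return node.text.strip()


def route(xml_text: str) -> str:
    """Apply Rule 1, then Rule 2; the first matching rule decides the server group."""
    total = extract(xml_text, "purchase/totalAmount")
    if total is not None and float(total) > 5000:              # Rule 1
        return "highSpeedServers"
    if extract(xml_text, "purchase/customerName") == "CCL":    # Rule 2
        return "specialCustomerServers"
    return "defaultServers"   # hypothetical fallback group
```

For the purchase above both rules match, so the rule order decides the outcome; this kind of overlap is what conflict detection in the rule set has to deal with.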

No Free Lunch: Penalty of Having Content Switch

Increased packet processing time.
• For XML Director/Accelerator, it needs to parse the XML document and match tag sequences: 1-3 orders of magnitude more processing time.

                           Layer 4 Switching     Layer 7 Switching
packet header extraction   fixed short fields    varying length long fields
switch rule matching       hash table look up    pattern matching

Size of XML Document (Bytes)   XML Content Extract Time (ms)
600                            14
7000                           21
67104                          53

Related Technologies

• Application level solution: Proxy server; Apache/Tomcat/Servlet; Microsoft NLB
• Kernel level layer 4 load balancing solution: http://www.linuxvirtualserver.org/
– Joseph Mark’s presentation
– LVS-NAT (Network Address Translation) web page
– LVS-IP Tunnel web page
– LVS-DR (Direct Routing) web page
• Hardware solution: Cisco 11000, F5 (Big IP), Alteon Web Systems, Foundry Networks (ServerIron). Excellent information on: Foundry ServerIron Installation and Configuration Guide, May 2000. http://www.foundrynet.com/services/documentation/siug/

Basic Operations of Content Switching

[Figure: incoming packets go through Header/Content Extraction, Packet Classification (CS Rule Matching Algorithm, using CS Rules maintained with the CS Rule Editor), and Packet Routing (Load Balancing, using Network Path Info and Server Load Status), which forwards packets to servers.]

CS: Content Switching

Content Switch Architecture

Apostolopoulos, Infocom 2000

Content Switch Architecture

[Figure: a client and Real Server1 attached to the content switch; the controller keeps a hash table, and the CS processor keeps the CS rules and pkt modification info.]

Case A: Controller finds there is an entry in its hash table; route the request to the “sticky connection” outgoing port.

Case B:
Step 1. Controller finds there is no entry in the hash table; route the request to the content switch processor.
Step 2. CS processor
  a. Extract content/match CS rules
  b. Route request
  c. Set up sequence # modification on the server side port
Step 3. At the server side port, return pkts are modified (sequence #/IP addr/checksum) and routed back to the client.
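The split between Case A and Case B can be sketched as below. This is a toy model under assumptions: the class name, the use of (client IP, client port) as the hash table key, and the rule representation as predicate functions are all made up for illustration.

```python
class ContentSwitch:
    """Fast path (hash table hit) vs slow path (CS processor rule matching)."""

    def __init__(self, rules, default_server):
        self.rules = rules                  # list of (predicate, server) pairs
        self.default_server = default_server
        self.table = {}                     # connection key -> chosen server
        self.slow_path_hits = 0

    def handle(self, conn_key, request: bytes) -> str:
        server = self.table.get(conn_key)
        if server is not None:              # Case A: sticky connection
            return server
        self.slow_path_hits += 1            # Case B: go to the CS processor
        server = self.default_server
        for predicate, target in self.rules:
            if predicate(request):
                server = target
                break
        self.table[conn_key] = server       # later packets take the fast path
        return server
```

Only the first request of a connection pays for content extraction and rule matching; the rest are switched by a table lookup.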

Efficient Content Switching Architecture

• Tasks: millions of packets with thousands of rules to match and load balancing algorithms to run.
• How to assign tasks to the (network) processors and threads?
– Packet Extraction (understand header formats, XML parsing)
– Content Switching Rule Matching
– Packet Routing (load balancing, bandwidth control)
• How much packet processing should controllers do?
• What can a controller do?
• A typical parallel processing problem?

TCP Delay Binding (Splicing)

[Figure: message sequence chart between client, content switch and server.]

step1   client -> content switch: SYN(CSEQ)
step2   content switch -> client: SYN(DSEQ) ACK(CSEQ+1)
step3   client -> content switch: ACK(DSEQ+1)
step4   client -> content switch: DATA(CSEQ+1) ACK(DSEQ+1)
step5   content switch -> server: SYN(CSEQ)
step6   server -> content switch: SYN(SSEQ) ACK(CSEQ+1)
step7   content switch -> server: ACK(SSEQ+1)
step8   content switch -> server: DATA(CSEQ+1) ACK(SSEQ+1)
step9   server -> content switch: DATA(SSEQ+1) ACK(CSEQ+lenR+1); content switch -> client: DATA(DSEQ+1) ACK(CSEQ+lenR+1)
step10  client -> content switch: ACK(DSEQ+lenD+1); content switch -> server: ACK(SSEQ+lenD+1)
step11  client -> content switch: DATA(?) ACK(?) (2nd request)

lenR: size of http request. lenD: size of return document.
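The sequence number conversion that delayed binding forces on the content switch can be sketched as below. The class and method names are assumptions; DSEQ is the initial sequence number the switch chose toward the client and SSEQ the one the server chose, and all arithmetic is modulo 2^32 as in TCP.

```python
MOD = 2 ** 32   # TCP sequence numbers are 32-bit and wrap around


class SpliceState:
    """Per-connection offset between the switch's DSEQ and the server's SSEQ."""

    def __init__(self, dseq: int, sseq: int):
        self.delta = (dseq - sseq) % MOD

    def server_to_client_seq(self, seq: int) -> int:
        """DATA(SSEQ+n) from the server is sent to the client as DATA(DSEQ+n)."""
        return (seq + self.delta) % MOD

    def client_to_server_ack(self, ack: int) -> int:
        """ACK(DSEQ+n) from the client is sent to the server as ACK(SSEQ+n)."""
        return (ack - self.delta) % MOD
```

The client's own sequence numbers (CSEQ) pass through unchanged because the switch replays the client's SYN(CSEQ) to the server.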

Improve Content Switching

• Set up CS-Real Server connections ahead of time (Persistent HTTP Connections), NetScale: reduces TCP 3-way handshake time.
• Pre-allocate Server Scheme (guess the real server based on the TCP SYN)
• Sequence # modification on every return pkt: need to recompute the checksum also.
• Filter Scheme (offload sequence # modification/rule matching to real servers).
• Buffering/Pipeline (aggregate) Requests
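Recomputing the checksum after a sequence # change does not require summing the whole packet again. The sketch below uses the incremental update formula from RFC 1624, HC' = ~(~HC + ~m + m'), applied to the two 16-bit halves of the sequence number. It is an illustration over plain lists of 16-bit words, with made-up function names, not a full TCP checksum (which also covers a pseudo-header).

```python
def ones_complement_sum(words) -> int:
    """16-bit one's complement sum with end-around carry."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(words) -> int:
    """Full checksum: complement of the one's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF


def update_checksum(old_cksum: int, old_word: int, new_word: int) -> int:
    """RFC 1624 incremental update for one changed 16-bit word."""
    total = ones_complement_sum([~old_cksum & 0xFFFF, ~old_word & 0xFFFF, new_word])
    return ~total & 0xFFFF


def update_for_seq(old_cksum: int, old_seq: int, new_seq: int) -> int:
    """Apply the update to the high and low halves of a 32-bit sequence number."""
    for shift in (16, 0):
        old_cksum = update_checksum(old_cksum,
                                    (old_seq >> shift) & 0xFFFF,
                                    (new_seq >> shift) & 0xFFFF)
    return old_cksum
```

The update costs a few additions per packet regardless of the document size, but it still has to be done on every return packet.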

Pre-Allocate Server Scheme

[Figure: message sequence chart between client, content switch and pre-allocated server; each message is relayed by the content switch without sequence # conversion.]

step1  SYN(CSEQ): client -> content switch -> pre-allocated server
step2  SYN(SSEQ) ACK(CSEQ+1): pre-allocated server -> content switch -> client
step3  ACK(SSEQ+1): client -> content switch -> pre-allocated server
step4  DATA(CSEQ+1) ACK(SSEQ+1): client -> content switch -> pre-allocated server
step5  DATA(SSEQ+1) ACK(CSEQ+lenR+1): pre-allocated server -> content switch -> client
step6  ACK(SSEQ+lenD+1): client -> content switch -> pre-allocated server

• Guess routing decision based on IP/Port#/History
• Advantage:
  • Faster than TCP delay binding.
  • Possible direct route between client and server
  • Reduce session processing overhead: no need to convert server sequence #
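One way to guess the routing decision from history can be sketched as below. This is purely an assumption for illustration: it remembers, per client IP, the server that rule matching last selected, and guesses that server again on the next SYN. The class and method names are made up.

```python
class PreAllocator:
    """Guess the real server at SYN time from the client's previous requests."""

    def __init__(self, default_server: str):
        self.default_server = default_server
        self.history = {}   # client IP -> server chosen by the last rule match

    def guess(self, client_ip: str) -> str:
        """Called on SYN, before any content has been seen."""
        return self.history.get(client_ip, self.default_server)

    def confirm(self, client_ip: str, guessed: str, correct: str) -> bool:
        """Called after rule matching on the request; returns True on a hit."""
        self.history[client_ip] = correct
        return guessed == correct
```

A hit avoids sequence # conversion entirely; a miss falls back to delayed binding with the right server.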

Page 53: 4/11/2003 Edward Chow Content Switch 1 Introduction to Content Switch C. Edward Chow Department of Computer Science University of Colorado at Colorado

4/11/2003 Edward Chow Content Switch 53

Degenerated to TCP Delayed Binding If Guess is Wrong

[Sequence diagram: client, content switch, pre-allocated server, right server. The handshake SYN(CSEQ), SYN(SSEQ)/ACK(CSEQ+1), ACK(SSEQ+1) and the request DATA(CSEQ+1)/ACK(SSEQ+1) pass between the client and the pre-allocated server as in the pre-allocate scheme. The pre-allocated server answers with DATA(SSEQ+1), an HTTP 404, and the diagram shows FIN(CSEQ+lenR+1) on that connection. The content switch then opens a connection to the right server: SYN(CSEQ), SYN(RSEQ)/ACK(CSEQ+1), ACK(RSEQ+1), DATA(CSEQ+1)/ACK(RSEQ+1). The right server's DATA(RSEQ+1)/ACK(CSEQ+lenR+1) reaches the client as DATA(SSEQ+1)/ACK(CSEQ+lenR+1), and the client's ACK(SSEQ+lenD+1) reaches the right server as ACK(RSEQ+lenD+1).]

Sequence number conversion is now needed for the right server.
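
The conversion between SSEQ (what the client has already seen) and RSEQ (what the right server uses) is a constant offset. A minimal sketch, assuming unsigned 32-bit arithmetic so that wrap-around is handled modulo 2^32; the structure and function names are illustrative:

```c
#include <stdint.h>

/* Offset between the sequence space the client sees and the right server's. */
struct seq_map {
    uint32_t delta;   /* SSEQ - RSEQ, modulo 2^32 */
};

static struct seq_map seq_map_init(uint32_t sseq, uint32_t rseq)
{
    struct seq_map m = { sseq - rseq };
    return m;
}

/* Server-to-client direction: rewrite the segment's sequence number. */
static uint32_t to_client_seq(const struct seq_map *m, uint32_t server_seq)
{
    return server_seq + m->delta;
}

/* Client-to-server direction: rewrite the acknowledgment number. */
static uint32_t to_server_ack(const struct seq_map *m, uint32_t client_ack)
{
    return client_ack - m->delta;
}
```

With SSEQ = 1000 and RSEQ = 5000, the right server's DATA(5001) is delivered to the client as DATA(1001), matching DATA(RSEQ+1) becoming DATA(SSEQ+1) in the diagram.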

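Filter Process Scheme

[Sequence diagram: client, content switch, and server; a filter process runs on the server. The content switch completes the handshake with the client itself: SYN(CSEQ), SYN(DSEQ)/ACK(CSEQ+1), ACK(DSEQ+1), and then receives the request DATA(CSEQ+1)/ACK(DSEQ+1). It sends Migrate(Data, CSEQ, DSEQ) to the filter process on the selected server. On the server side the diagram shows SYN(CSEQ), SYN(SSEQ)/ACK(CSEQ+1), ACK(SSEQ+1) and DATA(CSEQ+1)/ACK(SSEQ+1). The server's DATA(SSEQ+1)/ACK(CSEQ+lenR+1) leaves as DATA(DSEQ+1)/ACK(CSEQ+lenR+1), and the client's ACK(DSEQ+lenD+1) is converted to ACK(SSEQ+lenD+1).]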

Pre-allocate performance plot

[Plot of response time vs. document size. x-axis: document size in bytes (0 to 40000). y-axis: response time in microseconds (0 to 500000). Four curves, Series 1 to Series 4, described below.]

Figure 3. Performance of Pre-allocate Server Scheme

Series 1 - Basic scheme with no rule matching module inserted, i.e., using default IPVS.

Series 2 - Basic scheme with the rule matching module inserted.

Series 3 - Pre-allocate scheme with all hits, i.e., where all pre-allocate guesses were correct.

Series 4 - Pre-allocate scheme with all misses, i.e., where all pre-allocate guesses were wrong.

Handling multiple requests in a Keep-Alive connection

• Determine when a new request arrives:
  – Verify that the previous request has been completely received.
  – Request data size is > 0.

• The key assumption is that the client sends only one outstanding request at a time, i.e., requests are not pipelined.

• Reuse connections:
  – Store each connection's control information in a hash table keyed by real server address, once it is established.
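
The connection-reuse table can be sketched as follows. This is an illustration only: it assumes the key is the real server's IPv4 address and port, uses open addressing with linear probing and no deletion, and stores a single field (the server's initial sequence number) as a stand-in for the full connection control information. The names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CONN_BUCKETS 64   /* illustrative table size */

struct server_conn {
    uint32_t server_ip;
    uint16_t server_port;
    uint32_t server_isn;   /* example of stored control information */
    int      in_use;
};

static struct server_conn conn_table[CONN_BUCKETS];

static unsigned conn_hash(uint32_t ip, uint16_t port)
{
    return (ip ^ port) % CONN_BUCKETS;
}

/* Find the established connection to a real server, or NULL if none. */
static struct server_conn *conn_lookup(uint32_t ip, uint16_t port)
{
    unsigned start = conn_hash(ip, port);
    for (unsigned i = 0; i < CONN_BUCKETS; i++) {
        struct server_conn *c = &conn_table[(start + i) % CONN_BUCKETS];
        if (!c->in_use)
            return NULL;   /* empty slot ends the probe sequence */
        if (c->server_ip == ip && c->server_port == port)
            return c;
    }
    return NULL;
}

/* Remember a connection once it is established; NULL if the table is full. */
static struct server_conn *conn_store(uint32_t ip, uint16_t port, uint32_t isn)
{
    unsigned start = conn_hash(ip, port);
    for (unsigned i = 0; i < CONN_BUCKETS; i++) {
        struct server_conn *c = &conn_table[(start + i) % CONN_BUCKETS];
        if (!c->in_use || (c->server_ip == ip && c->server_port == port)) {
            c->server_ip = ip;
            c->server_port = port;
            c->server_isn = isn;
            c->in_use = 1;
            return c;
        }
    }
    return NULL;
}
```

When a later request in the same keep-alive session is routed to a server that is already in the table, the switch can skip the handshake and reuse the stored state.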

Quiz

• The web server keeps the TCP connection alive, expecting the browser to return for images and in-line media files.

• How many keep-alive connections are set up by IE5 and Netscape 4.7 for a web page with many .jpg/.gif images?

• Can these image requests be pipelined from the client browser to the web server?

Multiple HTTP Requests from One TCP Connection

• A keep-alive TCP connection may include multiple HTTP “GET” requests.
• The content switch examines each “GET” request and makes a new routing decision.
• The content switch establishes another connection with a different server based on the routing decision.
• The HTTP responses from different servers need to be interleaved and seen by the user as if they came from the same server.
• Solutions: in-order delivery (buffer requirement); out-of-order delivery (seq# tracking)?
• Problem: should we throw away earlier HTML requests if we receive later requests?

[Diagram, NAT approach: the client requests Index.htm, cs.jpg, rocky.mid and uccs.gif through the content switch, which sits in front of server1, server2, ..., server9.]

Multiple HTTP Requests from One TCP Connection

• Can servers return documents directly to client in keep-alive session case?

• Can equivalent VS-Tunnel or VS-DR be implemented using Content Switch?

[Diagram: client, content switch, and server1, server2, ..., server9, with the documents cs.gif, rocky.mid and uccs.jpg.]

Content Switch Rule Survey

Survey shows that existing switches support:
• rules in basic (condition action) or (action condition) form
• some define the condition as a class, then specify the action in a separate statement or command
• simple single conditional term
• command line interface (to facilitate incremental update?)
• actions can include reject, forward, put in queue (for bandwidth control, scheduling)

Content Switch Rule Design

• Rule syntax is generic, to support all intended features.
• Use simple C if-statement syntax. rule: if (condition) { action }
  – Easy to read
  – Allows optimization using the C compiler
• Condition consists of multiple terms of
  – variable relational_operator value, e.g.
      xml.purchase/totalAmount > 50000
      smtp.to == “[email protected]”
      cookie.name == “servlet1”
      bitmatch(64, 8, 0xff) == 64   # means TTL = 64; idea from the netfilter universal filter
  – suffix(variable, string), e.g. suffix(url, “gif”)
  – regex(variable, pattern), e.g. regex(url, “/purchase”)
• Action consists of reject, forward(server | queue), loadBalance(serverGroup, loadBalancingAlgorithm)
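
Two of the condition helpers above can be sketched in C. The semantics are assumptions inferred from the examples: suffix() tests whether a string ends with another, and bitmatch(offset, nbits, mask) reads nbits bits starting at a bit offset into the IP header and applies the mask (bit offset 64 is byte 8 of the IPv4 header, the TTL field). The version here takes the packet buffer as explicit extra arguments and handles only byte-aligned fields of up to 32 bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* suffix(variable, string): nonzero when `value` ends with `tail`. */
static int suffix(const char *value, const char *tail)
{
    size_t vlen = strlen(value);
    size_t tlen = strlen(tail);
    return tlen <= vlen && memcmp(value + vlen - tlen, tail, tlen) == 0;
}

/* bitmatch: extract a byte-aligned big-endian field and mask it.
 * Returns 0 for unsupported or out-of-range requests (a simplification). */
static uint32_t bitmatch(const uint8_t *pkt, size_t pkt_len,
                         unsigned bit_offset, unsigned nbits, uint32_t mask)
{
    size_t first = bit_offset / 8;
    unsigned nbytes = nbits / 8;
    uint32_t v = 0;

    if (bit_offset % 8 != 0 || nbits % 8 != 0 || nbytes == 0 || nbytes > 4)
        return 0;
    if (first + nbytes > pkt_len)
        return 0;
    for (unsigned i = 0; i < nbytes; i++)
        v = (v << 8) | pkt[first + i];
    return v & mask;
}
```

With these helpers a rule such as if (suffix(url, "gif")) { ... } compiles as ordinary C, which is what lets the C compiler optimize the rule set.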

Efficient CS Rule Matching

• Brute force, strict priority: rules are executed sequentially.

• Efficient rule matching method:
  – Organize rules so that rules can be skipped based on existing content types.
  – Utilize compiler optimization techniques.
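
One way to skip rules based on the content types present in a request is to tag each rule with the content it needs. The sketch below is an assumption about how this could look, not the LCS implementation: the bit flags, the request structure, and the two sample conditions are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Content types a request may carry (illustrative flags). */
enum { CT_URL = 1u << 0, CT_COOKIE = 1u << 1, CT_XML = 1u << 2 };

struct request {
    unsigned    present;     /* OR of CT_* flags found in this request */
    const char *url;
    long        xml_total;   /* stands in for an extracted XML tag value */
};

struct cs_rule {
    unsigned needs;                        /* content types the condition reads */
    int    (*cond)(const struct request *);
    int      server;                       /* action: forward to this server */
};

static int rules_evaluated;   /* counts conditions actually executed */

/* Strict-priority matching that skips rules whose content is absent. */
static int match_rules(const struct cs_rule *rules, size_t n, const struct request *req)
{
    for (size_t i = 0; i < n; i++) {
        if ((rules[i].needs & req->present) != rules[i].needs)
            continue;   /* skipped without evaluating the condition */
        rules_evaluated++;
        if (rules[i].cond(req))
            return rules[i].server;
    }
    return -1;   /* no rule matched */
}

/* Sample conditions. */
static int xml_total_large(const struct request *r)
{
    return r->xml_total > 50000;
}

static int url_is_gif(const struct request *r)
{
    size_t n = strlen(r->url);
    return n >= 3 && strcmp(r->url + n - 3, "gif") == 0;
}
```

A plain GET for an image never reaches the XML rule, so the cost of XML extraction is paid only for requests that carry XML.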

Simple CS Rule Editor GUI

Conflict Detection on Content Switching Rules

• Detect conflicts among rules or rule sets.
• Absolute conflict type:
    r1: if (xml.purchase/customerName == “CCL”) {routeTo(r1)}
    r2: if (xml.purchase/customerName == “CCL”) {routeTo(r2)}

• Potential conflict type:
    r1: if (xml.purchase/totalAmount > 5000) {routeTo(quickServers)}
    r2: if (xml.purchase/totalAmount > 20000) {routeTo(superServers)}

• Algorithm: build a tree of rules that use the same variable, check the operators and values to see whether they are the same or lead to a potential conflict, then compare the actions to decide the conflict type or duplication.

• Developed a conflict detection algorithm for rules with multiple-term conditions. It can be applied to policy-based rule conflict detection.

• The editor can build these trees while a user enters rules and warn about conflicts right away.
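
The classification step can be illustrated for the simplest case. The sketch below is not the tree-based algorithm itself: it assumes single-term rules over integer values, compares two rules pairwise, and treats rules on different variables as non-conflicting. The type and function names are illustrative.

```c
#include <stdint.h>
#include <string.h>

enum rel_op   { OP_EQ, OP_GT, OP_LT };
enum conflict { NO_CONFLICT, DUPLICATE, ABSOLUTE_CONFLICT, POTENTIAL_CONFLICT };

struct term_rule {
    const char *variable;   /* e.g. "xml.purchase/totalAmount" */
    enum rel_op op;
    int32_t     value;
    const char *action;     /* e.g. "quickServers" */
};

/* Closed interval [lo, hi] of values that satisfy the rule's condition. */
static void rule_range(const struct term_rule *r, int64_t *lo, int64_t *hi)
{
    *lo = INT32_MIN;
    *hi = INT32_MAX;
    if (r->op == OP_EQ) { *lo = r->value; *hi = r->value; }
    if (r->op == OP_GT) *lo = (int64_t)r->value + 1;
    if (r->op == OP_LT) *hi = (int64_t)r->value - 1;
}

static enum conflict classify(const struct term_rule *a, const struct term_rule *b)
{
    int64_t alo, ahi, blo, bhi, lo, hi;
    int same_action = strcmp(a->action, b->action) == 0;

    if (strcmp(a->variable, b->variable) != 0)
        return NO_CONFLICT;               /* simplification: see lead-in */
    if (a->op == b->op && a->value == b->value)
        return same_action ? DUPLICATE : ABSOLUTE_CONFLICT;

    rule_range(a, &alo, &ahi);
    rule_range(b, &blo, &bhi);
    lo = alo > blo ? alo : blo;
    hi = ahi < bhi ? ahi : bhi;
    if (lo <= hi && !same_action)
        return POTENTIAL_CONFLICT;        /* some value satisfies both rules */
    return NO_CONFLICT;
}
```

Applied to the totalAmount example, any amount above 20000 satisfies both r1 and r2, which is why the pair is reported as a potential conflict.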

XML Tag Value Extraction

• An xmlContentExtract() function was built to extract the tag values of a list of unique tag sequences.

• It is based on Clark Cooper’s expat 1.0 XML parser.
• Its arguments include the pointer to an XML document, the pointer to the array of strings (unique XML tag sequences; we follow the XSL selector syntax), and the number of sequences.

• It returns a list of node structures, each with the tag sequence, its attribute, and its value.

• Currently, it supports one attribute, and each tag sequence needs to be unique.

Persistence Handling in LVS

• Some network applications require packets from the same users/sessions to be routed to the same real servers.
  – For consistent treatment?
  – For fast performance, e.g. servers maintain persistent data/info for sessions.
• The Tomcat web server returns a cookie value so that returning client requests can be routed to the same Tomcat web server.
• But the cookie value is in the HTTP header, which is Layer 7 information. A Layer 4 switch cannot access it.
• This is the so-called persistence handling problem.
• One solution: sticky connection. The same IP address is served by the same server.

Persistent handling Problems

FTP Case:
• Normally FTP uses port 21 for control and port 20 for data.
• But for passive FTP, the server tells the client the port that it listens on. The client initiates the data connection by connecting to that port.
• For LVS/TUN and LVS/DR, LinuxDirector is only on the client-to-server half of the connection, so it is impossible for LinuxDirector to get the data port from the packet that goes to the client directly.

SSL Session Case:
• Port 443 is used for secure web servers and port 465 for secure mail servers.
• The key for the connection must be chosen/exchanged, and only the initial real server has the key.
• A persistent or sticky connection is needed.

Persistent Connection Solution

• When the client first accesses the service, LinuxDirector creates a template between the given client and the selected server, then creates an entry for the connection in the hash table.

• Connections from the client to any port will be sent to the same server until the template expires.

• The template expires after a configurable time, but it will not expire until all of its connections have expired.

• The timeout of persistent templates can be configured by users; the default is 300 seconds.
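
The template behaviour described above can be sketched in a few lines of C. This is a simplified model rather than the LVS code: it assumes one template slot per hash bucket (collisions overwrite), that the timer restarts on every new connection, and that time is passed in as plain seconds. All names are hypothetical.

```c
#include <stdint.h>

#define TEMPLATE_SLOTS  128   /* illustrative table size */
#define DEFAULT_TIMEOUT 300   /* seconds, the default quoted above */

struct persist_template {
    uint32_t client_ip;
    int      server_id;
    long     expires;        /* absolute time in seconds */
    int      active_conns;   /* connections still using this template */
    int      in_use;
};

static struct persist_template templates[TEMPLATE_SLOTS];

static struct persist_template *tmpl_slot(uint32_t client_ip)
{
    return &templates[client_ip % TEMPLATE_SLOTS];
}

/* Pick the server for a new connection. `pick` is the scheduler's choice,
 * used only when the client has no live template. */
static int persist_select(uint32_t client_ip, int pick, long now, long timeout)
{
    struct persist_template *t = tmpl_slot(client_ip);
    int live = t->in_use && t->client_ip == client_ip &&
               (now < t->expires || t->active_conns > 0);

    if (!live) {
        t->client_ip = client_ip;
        t->server_id = pick;
        t->active_conns = 0;
        t->in_use = 1;
    }
    t->expires = now + timeout;
    t->active_conns++;
    return t->server_id;
}

/* Called when one of the client's connections expires or closes. */
static void persist_conn_closed(uint32_t client_ip)
{
    struct persist_template *t = tmpl_slot(client_ip);
    if (t->in_use && t->client_ip == client_ip && t->active_conns > 0)
        t->active_conns--;
}
```

Because the template is keyed by client address only, a passive FTP data connection or a later SSL connection from the same client lands on the same real server without the director inspecting the payload.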

Problems Encountered in the Design of Linux-based Content Switch

• Handle a Request Contained in Multiple Packets
• Handle Different Data Encoding Methods
• Allow Referencing Specific XML Tags
• Handle Long Transactions in SSL and Email Network Services

Handle a Request Contained in Multiple Packets

• For a long request, its headers and content will be carried by multiple packets due to the packet size limitation.

• We have observed Netscape 4.7 splitting a short request (<1000 bytes) into two packets.

• Due to interleaving with other sessions, packets of the same session may not be allocated consecutive memory.

• Even when packets of the same session arrive without being interleaved with packets of other sessions, application-level data will be fragmented in kernel packet buffers such as sk_buff.

• Matching application data patterns in the kernel is tricky.

Example: Determine Content Length

TCP Segment n contains:
POST /cgi-bin/cs622/purchase.pl HTTP/1.0\r\n
Referer: http://archie.uccs.edu/~acsd/lcs/xmldemo.html\r\n
Connection: Keep-Alive\r\n
User-Agent: Mozilla/4.75 [en] (X11; U; Linux 2.2.16-22enterprise i686)\r\n
Host: viva.uccs.edu\r\n
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*\r\n
Accept-Encoding: gzip\r\n
Accept-Language: en\r\n
Accept-Charset: iso-8859-1,*,utf-8\r\n
Content-type: application/x-www-form-urlencoded\r\n
Content-length: 7

TCP Segment n+1 contains:
53\r\n
data (753 bytes)
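
The example shows why the Content-length value cannot be read from a single segment. A minimal sketch of one workaround, assuming the switch copies header bytes of a session into a contiguous per-session buffer and only trusts the value once the header line is terminated. The buffer size and the names are illustrative, and the search does not distinguish a real header from the same text appearing elsewhere.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define HDR_BUF_SIZE 4096   /* illustrative per-session buffer size */

struct hdr_accum {
    char   buf[HDR_BUF_SIZE];
    size_t len;
};

/* Case-insensitive search for `needle` in the first `len` bytes of `hay`. */
static const char *find_nocase(const char *hay, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    for (size_t i = 0; n <= len && i <= len - n; i++) {
        size_t j = 0;
        while (j < n && tolower((unsigned char)hay[i + j]) ==
                        tolower((unsigned char)needle[j]))
            j++;
        if (j == n)
            return hay + i;
    }
    return NULL;
}

/* Append one segment's payload. Returns the Content-Length once its header
 * line is complete, or -1 while the value is still unknown or malformed. */
static long feed_segment(struct hdr_accum *a, const char *data, size_t n)
{
    static const char name[] = "content-length:";
    size_t room = HDR_BUF_SIZE - a->len;
    const char *h;
    long value = 0;
    int digits = 0;

    if (n > room)
        n = room;   /* excess bytes are dropped in this sketch */
    memcpy(a->buf + a->len, data, n);
    a->len += n;

    h = find_nocase(a->buf, a->len, name);
    if (h == NULL)
        return -1;

    for (size_t i = (size_t)(h - a->buf) + strlen(name); i < a->len; i++) {
        char c = a->buf[i];
        if (c == '\r')
            return digits ? value : -1;   /* end of the header line */
        if (c == ' ' || c == '\t')
            continue;
        if (!isdigit((unsigned char)c))
            return -1;
        value = value * 10 + (c - '0');
        digits++;
    }
    return -1;   /* the value may continue in the next segment */
}
```

Fed the two segments from the example, the first call returns -1 (only "7" has arrived) and the second returns 753.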

Potential Solutions

• Allocate the application data of a session in consecutive memory. This requires major rework of most kernel packet buffer allocation schemes.

• Use carry lookahead memory hardware.
• Write more complicated pattern matching code that can match a pattern over fragmented data.
• Use application-level content switching and bear the overhead of copying data from the kernel to the application level.

Handle Different Data Encoding Methods

• XML data can be passed as text/plain.
• When it is submitted with a form, the XML request data is encoded using the application/x-www-form-urlencoded method.

• When extracting XML data for rule matching, the data encoding method needs to be detected through the Content-type header.
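
A sketch of the decoding step for form-submitted XML follows. It assumes the usual form encoding conventions ('+' for space, %XX for other bytes) and a case-sensitive prefix check on the Content-type value; the function names are illustrative.

```c
#include <stddef.h>
#include <string.h>

/* Nonzero when the Content-type value names the form encoding. */
static int is_form_urlencoded(const char *content_type)
{
    static const char mime[] = "application/x-www-form-urlencoded";
    return strncmp(content_type, mime, sizeof mime - 1) == 0;
}

static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode `src` into `dst`. Returns the decoded length, or -1 if the input
 * is malformed or does not fit in dst_size bytes (including the NUL). */
static long form_urldecode(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    if (dst_size == 0)
        return -1;
    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        if (c == '+') {
            c = ' ';
        } else if (c == '%') {
            int hi = hex_val(src[i + 1]);
            int lo = hi < 0 ? -1 : hex_val(src[i + 2]);
            if (lo < 0)
                return -1;
            c = (char)(hi * 16 + lo);
            i += 2;
        }
        if (out + 1 >= dst_size)
            return -1;
        dst[out++] = c;
    }
    dst[out] = '\0';
    return (long)out;
}
```

Only after this step does the request body look like the XML that the tag extraction code expects.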

An E-Commerce XML Example

Client submits via HTTP POST (or SOAP) the following purchase in XML:

<purchase>
  <customerName>CCL</customerName>
  <customerID>111222333</customerID>
  <item>
    <productID>309121544</productID>
    <productName>IBM Thinkpad T21</productName>
    <unitPrice>5000</unitPrice>
    <noOfUnits>10</noOfUnits>
    <subTotal>50000</subTotal>
  </item>
  <item>
    <productID>309121538</productID>
    <productName>Intel wireless LAN PC Card</productName>
    <unitPrice>200</unitPrice>
    <noOfUnits>10</noOfUnits>
    <subTotal>2000</subTotal>
  </item>
  <totalAmount>52000</totalAmount>
</purchase>

Allow Referencing Specific XML Tags

• An ambiguous XML tag sequence specification can match multiple instances.

• To avoid that and to speed up the matching, we propose an XML tag sequence specification that identifies one specific tag instance.

• For example, to specify a rule based on the subTotal value present in the second item tag within the first purchase tag, the condition of the rule is written as “purchase:1.item:2.subTotal > 5000”.

• As another example, “purchase:2.totalAmount < 15000” specifies the condition of a rule based on the totalAmount tag present within the second purchase tag.
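
The tag sequence part of such a condition can be parsed into (tag, occurrence) steps. The sketch below covers only the left-hand side of the condition, such as purchase:1.item:2.subTotal. It assumes that an omitted index means the first occurrence, which the slides do not state, and the limits and names are illustrative.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_STEPS 8    /* illustrative limits */
#define MAX_TAG   32

struct tag_step {
    char tag[MAX_TAG];
    int  index;        /* 1-based occurrence; 1 when omitted (assumption) */
};

/* Parse e.g. "purchase:1.item:2.subTotal" into steps.
 * Returns the number of steps, or -1 on a syntax error. */
static int parse_tag_spec(const char *spec, struct tag_step *steps, int max_steps)
{
    const char *p = spec;
    int n = 0;

    while (*p != '\0') {
        size_t len = strcspn(p, ":.");
        if (n == max_steps || len == 0 || len >= MAX_TAG)
            return -1;
        memcpy(steps[n].tag, p, len);
        steps[n].tag[len] = '\0';
        steps[n].index = 1;
        p += len;

        if (*p == ':') {
            char *end;
            long idx = strtol(p + 1, &end, 10);
            if (end == p + 1 || idx < 1 || idx > 1000000)
                return -1;
            steps[n].index = (int)idx;
            p = end;
        }
        n++;

        if (*p == '.') {
            p++;
            if (*p == '\0')
                return -1;   /* trailing dot */
        } else if (*p != '\0') {
            return -1;       /* unexpected character after an index */
        }
    }
    return n;
}
```

During XML parsing, each step then becomes a counter: the extractor descends into the n-th occurrence of the tag and ignores its siblings.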

Handle Long Transactions in SSL and Email network services

• Some of the packet processing functions are better handled at the application level.

• For example, there are many packages for detecting and removing email viruses, including McAfee’s uvscan, AMAVis scanmail, and mutt (to recombine email components), but almost all of them are implemented at the application level and interact with the sendmail program. It would require significant effort to rewrite them as kernel modules.

• The same observations were made for SSL processing.

Web Switching/SSL Processing Overhead and Performance Differences between Prefork and Dynamic Fork

• Significant SSL processing overhead: 240 req/sec vs. 38 req/sec.

• Content switching processing overhead may reduce the performance to below that of a single web server. What do we gain here? How can we improve it?

[Chart: Overall WebBench Requests/Second. x-axis: number of clients (1, 8, 16, 24, 32, 40, 48, 56). y-axis: requests per second (0 to 300). Series: PreforkNonSSLProxy, DynamicNonSSLProxy, ApacheNonSSL, DynamicSSLProxy, PreforkSSLProxy, ApacheSSL.]

IXP1200-based Content Switch

• We have ported OpenSSL and our Linux Secure Web System to run on the IXP12EB with VxWorks.

• Using WindRiver’s Tornado II IDE.
• The preliminary version runs purely on the StrongARM core.
• Currently working on offloading the header extraction and rule matching code to run as hardware threads on the microengines.

Intel IXP1200 NP and IXP12EB

• The IXP 1200 Network Processor
• The IXP12EB Evaluation Board:
  – PCI form factor board based on the IXP1200 Network Processor
  – eight 10/100 Mbps ports
  – two Gigabit Ethernet ports
  – PCI back-plane and an Ethernet Network Interface Card (NIC)

IXP 1200 Network Processor

Packets Receiving & Transmitting

Agere Network Processor

The following figures are from Douglas Comer’s new text “Network System Design using Network Processors”.

Agere’s FPP

Agere’s RSP

Alchemy’s Au1000

Applied Micro Circuit Corp nP7510

Cisco Parallel eXpress Forwarding (PXF)

Cognigine’s Reconfigurable Communication Unit (RCU)

EZChip NP-1

IBM PowerNP

IBM NP Embedded Processor Complex

Motorola’s C-Port

Motorola Single CP

Packet Flow and IXP2400

Intel IXP2400

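HA-LVS Configuration

[Diagram: a client (CIP) reaches the LinuxDirector over the Internet; behind the LinuxDirector are Real Server1, Real Server2 and Real Server3. A Backup Director is connected to the LinuxDirector by HeartBeat, and MON runs on both directors.]

High availability and fault tolerance are provided in two ways:

1. When the Backup Director detects a LinuxDirector failure through the heartbeat protocol, it “graciously negotiates” the take-over of the VIP.

2. MON monitors the server processes running on the real servers. Requests are routed to server processes that are alive, and restart/repair is initiated.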

High Available Web Server Cluster

[Diagram: a client (CIP) reaches WebSwitch1 over the Internet; behind it are Real Server1, Real Server2 and Real Server3. WebSwitch2 is connected to WebSwitch1 by HeartBeat, and MON runs on both web switches.]

1. Each web switch detects the failure of the other web switch and takes over the processing of routing requests.

2. The web switch monitors the server processes running on the real servers. When they die:
  • Route requests to server processes that are alive.
  • Rewrite the web switching rules. Initiate restart/repair.

Status of UCCS ACSD Project

• Two versions of the Linux kernel-based LCS content switch, LCS01 and LCS02, were developed.
• A Linux application-level secure web switch (LSWS) was developed using the OpenSSL package.
• LSWS is ported to run on the Intel IXP12EB and IXP1200 network processor with WindRiver VxWorks.
• Part of the above research projects are sponsored by CCL/ITRI.
• Based on Linux-2.2.16-3; the current release is LCS02.
• Being ported to Linux-2.4.18 and integrated with KTCPVS.
• ip_forward.c, ip_masq.c and ip_vs.c are modified to implement basic TCP delayed binding.
• ip_cs.c is added for most of the content switching functions, with HTTP header extraction and XML content extraction.
• A simple Java-based ruleEdit program was created for rule editing and conflict detection. A C-based program can detect conflicts among rules with regular expressions in their condition expressions.
• A rule translation program converts the rule set into a Linux kernel module and allows dynamic replacement of rules without restarting the system.
• Currently working on integrating KTCPVS and providing a unified configuration/monitoring command.

LCS Demo

• We set up viva.uccs.edu as a content switch, and wait and ace as two real servers.

• URL switching demo:
    http://viva.uccs.edu/~lcs1/ routes to ace.uccs.edu
    http://viva.uccs.edu/~lcs2/ routes to wait.uccs.edu

• XML web switching (e-commerce applications):
    http://archie.uccs.edu/~acsd/lcs/xmldemo.html
    When the 2nd subtotal tag >= 50000, route to ace.
    When the 2nd subtotal tag < 50000, route to wait.

• Let us know if you have problems accessing them. My students may be working on LCS extensions.

LCS Rule Example

R4: if (atoi(rule_fields[1].value) >= 50000) {
        return route_to("ace", NON_STICKY, saddr);
    }
R5: if ((atoi(rule_fields[1].value) > 0) && (atoi(rule_fields[1].value) < 50000)) {
        IP_RULE_MSG("server=wait\n");
        return route_to("wait", NON_STICKY, saddr);
    }
R10: if (strstr(url, "lcs1") != NULL) {
        IP_RULE_MSG("server=ace\n");
        return route_to("ace", NON_STICKY, saddr);
    }
R11: if (strstr(url, "lcs2") != NULL) {
        IP_RULE_MSG("server=wait\n");
        return route_to("wait", NON_STICKY, saddr);
    }

Intel 7280 Demo

• http://cs.uccs.edu/~chow/pub/master/ycai/doc/csdemo.html

Related Load Balancing Research Results

• Modified the Apache status module to report:
  – Total bytes to be transferred by child processes
  – Average document transfer speed

• Modified LB-DNS to receive server status and bandwidth probing results.

• LB-DNS returns the IP address of the best server based on a weight contributed by both server load and bandwidth.

• Modified the WebStone benchmark to test the performance of load-balancing web server clusters.
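
The slides do not give the weight formula, so the following is purely an assumed illustration of how server load and bandwidth could be combined: the weight is an estimated delay, namely the time to drain the bytes the server still has to send plus the time to move one document over the probed path. The structure, field names and formula are all hypothetical.

```c
#include <stddef.h>

struct server_status {
    double pending_bytes;   /* total bytes still to be transferred (server report) */
    double transfer_rate;   /* average document transfer speed, bytes/sec */
    double bandwidth;       /* probed bandwidth to the client side, bytes/sec */
};

/* Assumed weight: estimated seconds until a document of doc_bytes is delivered. */
static double server_weight(const struct server_status *s, double doc_bytes)
{
    return s->pending_bytes / s->transfer_rate + doc_bytes / s->bandwidth;
}

/* Index of the server with the lowest weight, or -1 if none is usable. */
static int best_server(const struct server_status *servers, size_t n, double doc_bytes)
{
    int best = -1;
    double best_w = 0.0;

    for (size_t i = 0; i < n; i++) {
        double w;
        if (servers[i].transfer_rate <= 0.0 || servers[i].bandwidth <= 0.0)
            continue;   /* skip servers without valid measurements */
        w = server_weight(&servers[i], doc_bytes);
        if (best < 0 || w < best_w) {
            best = (int)i;
            best_w = w;
        }
    }
    return best;
}
```

Whatever the actual formula, the point of the design is that the DNS answer depends on both the server's own report and the network path, not on rotation order.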

Load balancing Systems

[Diagram components: Modified Web Server 1 through Modified Web Server n; Statistics Gathering Daemon; LBA: Modified DNS. Labeled flows: Server Delay; Request for Web pages; Server Ranking (/tmp/StatFile); Bandwidth Probe Results.]

Connection Rate: LBA vs. Round-Robin

Server connection rate for 4 servers, in connections/sec:

Update for LBA, per sec    load balancing system    round-robin
 1                         418.2                    327.6
 2                         656.6                    327.6
 3                         907.9                    327.6
 4                         420                      327.6
 5                         636.7                    327.6
 6                         322.6                    327.6
 7                         711.6                    327.6
 8                         420.5                    327.6
 9                         638.3                    327.6
10                         670.6                    327.6
11                         683.4                    327.6
12                         899                      327.6

Round-robin was run only once.

Conclusion

• Content Delivery Network improves Internet content retrieval.
• LVS provides a low-cost layer 4 switching service for clusters.
• Linux Content Switch with generic rules can be easily configured for a wide variety of value-added services:
  – Premium services
  – Load balancing/High Available server farm
  – Firewall
  – Bandwidth control/Traffic shaping

• Requires efficient SW/HW architecture and rule matching algorithms to reduce processing overhead.

• Content rule design and conflict detection are important and challenging.

• TCP delayed binding can be improved.

References

• http://www.linuxvirtualserver.org/
• http://www.akamai.com/
• http://cs.uccs.edu/~chow/pub/contentsw/talk/contentswitching.ppt
• [Aron2000] Mohit Aron, “Differential and predictable QoS in web server systems,” Ph.D. dissertation, Rice University, Oct. 2000.
• [Zhang97] Lixia Zhang, Sally Floyd, and Van Jacobson, “Adaptive Web Caching,” April 25, 1997. http://www-nrg.ee.lbl.gov/floyd/web.html
• [Esi2001] Edge Side Includes, http://www.esi.org/.
• [Chow2001a] C. Edward Chow and Indira Semwal, “Web Load Balancing Through More Accurate Server Report,” Proceedings of PDCAT 2001, Taipei, Taiwan.
• [Chow2001b] C. Edward Chow, Ganesh Godavari, and Jianhua Xie, “Content Switch Rules and their Conflict Detection,” Proceedings of PDCAT 2001, Taipei, Taiwan.
• [Chow2001c] C. Edward Chow and Weihong Wang, “The Design and Implementation of Linux LVS-based Content Switch,” Proceedings of PDCAT 2001, Taipei, Taiwan.
• [Aversa2000] Luis Aversa and Azer Bestavros, “Load Balancing a Cluster of Web Servers: Using Distributed Packet Rewriting,” Proceedings of IPCCC 2000.
• [Cao98] Pei Cao, Jin Zhang and Kevin Beach, “Active Cache: Caching Dynamic Contents on the Web,” http://www.cs.wisc.edu/~cao/papers/active-cache.ps