Director 1.0

TECHNION Department of Computer Science

The Computer Communication Lab (236340)

Summer 2002

Submitted by: David Schwartz, Idan Zak, Yoav Helfman

A Director of distributed array of web servers

1. Introduction

1.1. General

• Our goal was to develop a Layer-5 director that switches itself into a Layer-4 director after making the "request routing" decision based on the URL. It then assigns a new connection to the requesting client by using NAPT (Network Address & Port Translation).

• As a platform for the director we use the Linux operating system.

2. General Layout

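[Figure: network layout. The Client (CIP) reaches the Load Balancer/Director (a Linux box) at the VIP through the Internet; the Director reaches Real Server 1 (RIP1), Real Server 2 (RIP2) and Real Server 3 (RIP3) over the WAN/LAN.]

CIP: Client IP Address & Port

VIP: Virtual IP Address & Port

RIP: Real Server IP Address & Port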

• The Director is connected to two networks: the web server farm network, and the network representing the outside world (the Internet).

General Layout cont.

• The Director reads HTTP requests (on port 80) from the global network adapter, processes them (using a hash function to find the server that holds the URL) and sends the requests to the selected web server through the local network adapter. At this stage the Director and NAPT work together.

• From this moment on NAPT “takes the initiative”, performing the translation between the actual physical server and the client. The Layer 5 level has finished its part at this stage.

3. Modules

The project consists of four main modules:

• Layer 5 URL Director

• NAPT timeout manager

• Layer 3/4 NAPT

• Debug

Modules cont.

• Layer 5 URL Director

Accept: Examines each "GET" request and makes a new routing decision based on a hash function of the URL.

Connect: Initiates a new connection to the selected server.

• Layer 3/4 NAPT

Listener: Receives new packets and classifies them.

Connection Establisher: Manages the NAPT table entries.

NAPT: Redirects the packet flow to the real web server and back to the client.

Modules cont.

• NAPT timeout manager

Timeout: Terminates inactive client-server connections and removes the NAPT entries of finished connections.

• Debug

Print: Provides a real-time view of the Director tables.

Modules cont.

[Figure: Layer 5 URL Director block diagram. Labels: Incoming Packets, Packets Buffer, Header Content Extraction, NAPT Entries, Packet Routing (Load Balancing), Forward Packet.]

Raw Sockets

• We used raw sockets in order to intercept the raw data directly from layer 3.

• Raw sockets allow the user to receive packets directly at user level, without passing through all the network layers on the way.

• The raw socket delivers a copy of each packet to us, and the real packet continues on its way to the TCP stack.

• Raw sockets intercept packets before they are processed by the TCP/IP stack, therefore we can receive and send data even if TCP/IP traffic is blocked.

• In terms of the data received, using raw sockets is identical to intercepting the packets at kernel level.

Algorithms used

• Layer 5 Director - Accept

Initializes the layer 4 threads and tables.

Calls accept() and waits for new connections.

When a new connection arrives, we create a new thread which handles the connection with the client.

Loops back to accept().
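The accept steps above can be sketched as a thread-per-connection loop. This is a minimal illustration in Python, not the project's code: `handle_client`, `accept_loop` and `demo` are hypothetical names, the worker only echoes the request line instead of selecting a server, and the loop is bounded so that the demo terminates.

```python
import socket
import threading


def handle_client(conn: socket.socket) -> None:
    """Hypothetical per-connection worker: read the request, reply, close."""
    with conn:
        request = conn.recv(4096)
        # The real Director would choose a server here; this sketch only
        # echoes the request line so the control flow is visible.
        first_line = request.split(b"\r\n", 1)[0]
        conn.sendall(b"handled: " + first_line + b"\n")


def accept_loop(listener: socket.socket, max_connections: int) -> None:
    """Accept connections and hand each one to a new thread."""
    for _ in range(max_connections):  # the Director would loop forever
        conn, _addr = listener.accept()
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()


def demo() -> bytes:
    """Run the loop on an ephemeral loopback port and send one request."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    port = listener.getsockname()[1]
    server = threading.Thread(target=accept_loop, args=(listener, 1), daemon=True)
    server.start()
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(b"GET /index.html HTTP/1.0\r\n\r\n")
        reply = client.recv(4096)
    server.join(timeout=5)
    listener.close()
    return reply
```

Creating the thread immediately and returning to accept() keeps the accepting loop free while each request is processed.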

Algorithms cont.

• Layer 5 Director - Connect

Reads the request from the client.

Calculates the length of the URL and uses it to decide which server to connect to.

Calls connect() with the address of the server containing the requested page.

Builds a semi-complete NAPT entry and inserts it into the semi-complete table.

The thread finishes and exits.
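The server selection in the Connect steps might look like the sketch below. The slides only say that the decision is derived from the length of the URL; mapping that length to a server with a modulo is an assumption made here for illustration, and `extract_url` and `select_server` are hypothetical names.

```python
from typing import List, Tuple

Server = Tuple[str, int]  # (IP address, port)


def extract_url(request: bytes) -> str:
    """Return the URL from the request line of an HTTP GET request."""
    request_line = request.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = request_line.split()
    if len(parts) < 2 or parts[0] != "GET":
        raise ValueError("not a GET request")
    return parts[1]


def select_server(url: str, servers: List[Server]) -> Server:
    """Assumed hash: URL length modulo the number of servers."""
    return servers[len(url) % len(servers)]


# Example with three hypothetical real servers.
servers = [("10.0.0.1", 80), ("10.0.0.2", 80), ("10.0.0.3", 80)]
url = extract_url(b"GET /index.html HTTP/1.0\r\n\r\n")
chosen = select_server(url, servers)
```

Because the choice depends only on the URL, repeated requests for the same page always reach the same server.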

Algorithms cont.

• Layer 3/4 Director - Listener

Creates a raw socket and calls recv() on it.

After intercepting a packet we categorize it. Only TCP packets are inspected; the protocol field in the IP header tells us which packets are TCP.

1. SYN packet – discarded

2. SYN-ACK packet – inserted into the SYN-ACK queue.

3. All the other packets are inserted into the Data queue.
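The classification rules above can be sketched as follows. The sketch assumes an IPv4 packet with the link-layer header already removed; the return labels, `classify` and `build_packet` are hypothetical names, and `build_packet` exists only to produce test input (its checksums are left at zero).

```python
import struct

TCP_PROTOCOL = 6   # value of the IP protocol field for TCP
FLAG_SYN = 0x02
FLAG_ACK = 0x10


def classify(packet: bytes) -> str:
    """Classify an IPv4 packet (no link-layer header) as the Listener does."""
    ihl = (packet[0] & 0x0F) * 4      # IP header length in bytes
    if packet[9] != TCP_PROTOCOL:     # protocol field of the IP header
        return "ignored"
    flags = packet[ihl + 13]          # TCP flags byte
    if flags & FLAG_SYN and flags & FLAG_ACK:
        return "syn-ack-queue"
    if flags & FLAG_SYN:
        return "discarded"
    return "data-queue"


def build_packet(flags: int, protocol: int = TCP_PROTOCOL) -> bytes:
    """Build a minimal 40-byte IPv4 + TCP packet for the demo."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, protocol, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    tcp = struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, 5 << 4, flags, 8192, 0, 0)
    return ip + tcp
```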

Algorithms cont.

• Layer 3/4 Director – Connection Establisher

In order to extract the sequence numbers we examine the SYN-ACK packets which are stored in the SYN-ACK queue.

Removes a packet from the queue and searches for the semi-complete entry which has the same port and IP.

Updates the sequence numbers according to the direction of the packet (client-server or server-client)

Inserts the sequence number into the ACK-3 queue (explained later).

If both directions are updated, the entry is removed from the semi-complete table and entered into the NAPT table.

Loops back to remove the next SYN-ACK packet.
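A minimal sketch of the Connection Establisher steps above, under stated assumptions: the `Entry` fields and the matching rule (destination equals the client for one direction, source equals the server for the other) are illustrative and not taken from the project, and plain lists stand in for the tables and the ACK-3 queue.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


@dataclass
class Entry:
    """Hypothetical semi-complete NAPT entry (field names are illustrative)."""
    client: Endpoint                       # client end of the client-director connection
    director: Endpoint                     # director's end of the director-server connection
    server: Endpoint                       # selected real server
    client_side_seq: Optional[int] = None  # from the SYN-ACK sent to the client
    server_side_seq: Optional[int] = None  # from the SYN-ACK sent by the server


def handle_syn_ack(src: Endpoint, dst: Endpoint, seq: int,
                   semi_complete: List[Entry], napt_table: List[Entry],
                   ack3_queue: List[int]) -> bool:
    """Process one SYN-ACK; return False if no semi-complete entry matches."""
    for entry in semi_complete:
        if dst == entry.client:
            entry.client_side_seq = seq    # director -> client direction
        elif src == entry.server and dst == entry.director:
            entry.server_side_seq = seq    # server -> director direction
        else:
            continue
        ack3_queue.append(seq)
        if entry.client_side_seq is not None and entry.server_side_seq is not None:
            semi_complete.remove(entry)    # both directions known: promote
            napt_table.append(entry)
        return True
    return False
```

Returning False for an unmatched packet leaves it to the caller to decide whether to retry later, for example when the semi-complete entry has not been inserted yet.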

Algorithms cont.

• Layer 3/4 Director - NAPT

Removes a packet from the Data queue.

Checks if the packet is the ACK packet from one of the handshakes (by comparing its sequence number to the sequence numbers stored in the ACK-3 queue); if it is, the packet is not passed on.

Searches for an entry in the NAPT table which has the same port and IP.

If no entry is found the packet is discarded.

If an entry is found we fix the source and destination port and IP, the sequence numbers and the checksums.

We update the time field in the NAPT entry.

The packet is sent onwards (to the server or to the client).

Algorithms cont.

If an entry has received RST (from any direction), the entry is removed.
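The header rewriting done in the NAPT step can be illustrated with the sketch below. It covers only part of the work: replacing the IP addresses, recomputing the IP header checksum with the standard Internet checksum, and shifting a sequence number by a fixed offset. The function names are hypothetical, a 20-byte IP header without options is assumed, and treating the sequence fix as a constant offset between the two connections is an assumption. The TCP checksum, which also covers a pseudo-header, is not shown.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by IP and TCP."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                      # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_ip_header(header: bytes, new_src: bytes, new_dst: bytes) -> bytes:
    """Replace the addresses of a 20-byte IPv4 header and fix its checksum."""
    patched = bytearray(header)
    patched[12:16] = new_src
    patched[16:20] = new_dst
    patched[10:12] = b"\x00\x00"            # checksum field is zero while summing
    patched[10:12] = struct.pack("!H", internet_checksum(bytes(patched)))
    return bytes(patched)


def translate_seq(seq: int, offset: int) -> int:
    """Shift a sequence number by an assumed per-connection offset (mod 2**32)."""
    return (seq + offset) & 0xFFFFFFFF
```

A header with a valid checksum sums to zero when passed through `internet_checksum` as a whole, which gives a quick way to check a rewritten header.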

• NAPT timeout manager - Timeout

Every 10 seconds the thread wakes up and goes over all the entries in the NAPT and semi-complete tables.

If an entry is found which has not been used in over 24 hours, it is removed from the tables.

If an entry has received both FINs (from each direction) and at least 60 seconds have passed, the entry is removed.
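One pass of the timeout thread could be sketched as below. The 10 second interval, the 24 hour idle limit and the 60 second delay come from the slides; the `Entry` fields are hypothetical, and measuring the 60 seconds from the entry's last use is an assumption, since the slides do not say what the delay is measured from.

```python
from dataclasses import dataclass
from typing import List

SWEEP_INTERVAL = 10          # seconds between sweeps
IDLE_LIMIT = 24 * 60 * 60    # seconds an unused entry may stay in the table
FIN_LINGER = 60              # seconds to keep an entry after both FINs


@dataclass
class Entry:
    """Hypothetical table entry holding only the fields the sweep needs."""
    last_used: float
    fin_from_client: bool = False
    fin_from_server: bool = False


def expired(entry: Entry, now: float) -> bool:
    """Return True if the entry should be removed at time `now`."""
    if now - entry.last_used > IDLE_LIMIT:
        return True
    both_fins = entry.fin_from_client and entry.fin_from_server
    return both_fins and now - entry.last_used >= FIN_LINGER


def sweep(table: List[Entry], now: float) -> List[Entry]:
    """Return the entries that survive one pass of the timeout thread."""
    return [entry for entry in table if not expired(entry, now)]
```

A timeout thread would call `sweep` on both tables, sleep for `SWEEP_INTERVAL` seconds, and repeat.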

Algorithms cont.

• Debug - Print

At any time we can examine all the tables and queues by typing a number and pressing Enter. A dedicated thread waits for this input and prints the contents of the tables and queues.

Tables and Queues

• NAPT and Semi-Complete tables:

Each entry consists of:

Source and destination IP

Source and destination port

Client-director sequence and ack numbers

Director-server sequence and ack numbers

Time stamp

Socket file descriptors (client-director and director-server)

Flags - indicating whether we’ve received both FINs

Tables and Queues cont.

The table is implemented as a queue.

Functions:

Enqueue – add a new entry at the head of the queue.

Dequeue – remove an entry from the end of the queue.

Find – finds an entry by the given source and destination port and IP.
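The three table functions can be sketched with a double-ended queue. The class and field names are hypothetical, and the entry is reduced to the four fields that Find compares.

```python
from collections import deque
from typing import Deque, NamedTuple, Optional


class NaptEntry(NamedTuple):
    """Reduced entry: only the fields used for lookup."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class EntryTable:
    """Table implemented as a queue, with a linear-scan find."""

    def __init__(self) -> None:
        self._entries: Deque[NaptEntry] = deque()

    def enqueue(self, entry: NaptEntry) -> None:
        self._entries.appendleft(entry)      # add at the head

    def dequeue(self) -> NaptEntry:
        return self._entries.pop()           # remove from the end

    def find(self, src_ip: str, src_port: int,
             dst_ip: str, dst_port: int) -> Optional[NaptEntry]:
        key = (src_ip, src_port, dst_ip, dst_port)
        for entry in self._entries:
            if tuple(entry) == key:
                return entry
        return None

    def __len__(self) -> int:
        return len(self._entries)
```

A linear scan is simple but costs time proportional to the number of open connections on every packet; a dictionary keyed by the same four fields would be a common alternative.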

Tables and Queues cont.

• Data and SYN-ACK queues

These queues hold the packet as received off the raw socket, with the link layer headers removed – just the IP and TCP layer headers are saved.

Functions:

Enqueue – add a new entry at the head of the queue

Dequeue – remove an entry from the end of the queue

Tables and Queues cont.

• ACK-3 queue

This queue holds the sequence number as received in the SYN-ACK packet. It is used to identify the 3rd packet of the handshake, so that it won’t be passed on to the server.

Functions:

Enqueue – add a new item at the head of the queue.

Remove – find and remove an item from the queue if it exists.

Tables and Queues cont.

• Address table

This table is used for storing the addresses of the servers and the clients for the use of the raw socket.

Each entry consists of the IP and port of the address, and a struct sockaddr_ll.

Functions:

Enqueue – add a new address to the table.

Remove – find and remove an address from the table.

Find – find an address by the given port and IP.

Notes

• In order to avoid having the kernel automatically send an ACK for every TCP packet received, we used the built-in Linux firewall:

After sending the SYN-ACK packet to the client we insert a rule into the firewall that blocks all TCP traffic to this client (port and IP).

After calling connect() we add a rule that blocks all TCP traffic to the server we just connected to (port and IP).

When the entry is removed from the NAPT table, the rule is removed from the firewall too.

Although we are blocking all output traffic to some servers, we can still send raw data to those servers using the raw sockets.
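The firewall rules described in these notes might be built as in the sketch below. The slides do not name the firewall tool, so the use of `iptables` and of the OUTPUT chain is an assumption; `block_rule` is a hypothetical helper that only builds the command line and does not run it.

```python
from typing import List


def block_rule(ip: str, port: int, delete: bool = False) -> List[str]:
    """Build an iptables command that drops outgoing TCP traffic to ip:port.

    Assumption: iptables is the firewall in use. With delete=True the same
    rule is removed again (-D) instead of appended (-A).
    """
    action = "-D" if delete else "-A"
    return ["iptables", action, "OUTPUT", "-p", "tcp",
            "-d", ip, "--dport", str(port), "-j", "DROP"]


# Rule added once the connection to a hypothetical client exists,
# and the matching removal for when the NAPT entry is deleted.
add_cmd = block_rule("192.0.2.10", 5000)
del_cmd = block_rule("192.0.2.10", 5000, delete=True)
```

Running such a command requires root privileges, for example through `subprocess.run(add_cmd, check=True)`.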

Questions?