Page 1: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

CSE 124 Networked Services

Fall 2009

Lecture 2: Networking architectures and Network Software APIs

B. S. Manoj, Ph.D
http://cseweb.ucsd.edu/classes/fa09/cse124

9/29/2009 UCSD CSE 124 Networked Services

Some of these slides are adapted from various sources/individuals including but not limited to Prof. Amin Vahdat, Prof. James Kurose, Prof. Keith Ross, and UNIX/Linux software documentation projects and associated sources. Use of these slides other than for pedagogical purposes for CSE 124 may require explicit permission from the respective sources.

Page 2: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Networking architecture

• There are two popular network architectural models
  – The TCP/IP architecture
  – The OSI (Open Systems Interconnection) reference model from the International Organization for Standardization (ISO)

Page 3: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

A comparison of the two models/architectures

ISO-OSI | TCP/IP
More a reference model than a successful architecture | A successful architecture
The model was defined before any prototypes existed | The model was retrofitted to the popular TCP/IP protocol suite
No working systems existed during modeling (lessons from TCP/IP contributed to the design) | No real model was intended in the original version; the later split between TCP and IP brought it closer to a model
Some example systems exist (e.g., X.25) | The Internet is a successful and growing working system
Some layers are not essential (session layer) and some important functions are missing (security) | Some important functions are not defined (e.g., security)
Design is influenced by administrative bodies | Design is influenced by the Internet technology development culture ("We reject kings, presidents, and voting. We believe in rough consensus and running code.")

Page 4: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Layer-wise comparison between the OSI model and the TCP/IP suite

Page 5: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Today’s popular 5-layer protocol stack

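Application: FTP, HTTP, SSH, TFTP, RTP
Transport: TCP, UDP
Network: IP
Datalink: 802.11, 802.3, ATM
PHY: DSSS/OFDM, Ethernet, SONET

[Figure: the stack narrows to the single IP layer in the middle, giving it an hourglass shape]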

Page 6: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Hour Glass model

• The hourglass model highlights the critical role of IP as the key integrator of
  – a variety of diverse applications, and
  – heterogeneous networks

Page 7: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

[Figure: the five-layer stack (Application, Transport, Network, Datalink, PHY) mapped to its implementation: application software at the top, network software (kernel modules, including TCP/UDP and the network layer) in the middle, and hardware at the bottom]

• Network software is usually implemented as a set of functions in the OS kernel
• A part of the MAC and PHY resides in the hardware
• In most cases, at least part of the network and transport layers is implemented in the kernel
• Real implementations are not strictly layer-wise
• The sequence of function calls makes up a layered operation
• Direct communication between the application and the network layer is possible

Page 8: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Network Software APIs

• Network software is usually part of the OS
• A common set of APIs, called Network APIs, is provided
• Applications use Network APIs for accessing network services
• These network software APIs are called socket APIs

Page 9: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

The main socket APIs are
• int socket(int domain, int type, int protocol)
• int bind(int socketfd, struct sockaddr *addr, int addr_len)
• int listen(int socketfd, int backlog)
• int accept(int socket, struct sockaddr *addr, socklen_t *addr_len)
• int connect(int socket, struct sockaddr *addr, int addr_len)
• int send(int socket, char *message, int msg_len, int flags)
• int recv(int socket, char *buffer, int buf_len, int flags)
• int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout)
• int close(int socket)

Page 10: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket APIs in detail
• int socket(int domain, int type, int protocol)
• domain
  – The domain parameter specifies a communication domain
  – It allows a single socket() API to serve a number of protocol families
  – It selects the protocol family that will be used for communication; sometimes called the address family (AF_xxxx in Unix systems)
  – PF_INET for Internet IPv4; PF_INET6 for Internet IPv6
  – PF_UNIX/PF_LOCAL for local (same-host) communication over Unix domain sockets
  – PF_PACKET for direct network access
    • Packet sockets are used to receive or send raw packets at the device driver (OSI Layer 2) level
    • They allow the user to implement protocol modules in user space on top of the physical layer

Page 11: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket APIs in detail
• int socket(int domain, int type, int protocol)
• type (usually defined in the <sys/socket.h> file)
  – Defines the type of communication, i.e., the end-to-end communication semantics
  – SOCK_STREAM: provides sequenced, reliable, two-way, connection-based byte streams (usually used for TCP-like reliable transport protocols)
  – SOCK_DGRAM: supports datagrams (connectionless, unreliable messages of a fixed maximum length); usually used for UDP-like connectionless transport protocols
  – SOCK_RAW: provides raw network protocol access (used, for example, with PF_INET for raw IP access or with PF_PACKET for link-level access)
  – SOCK_RDM: provides a reliable datagram layer that does not guarantee ordering
  – Some socket types may not be implemented by all protocol families; for example, SOCK_SEQPACKET is not implemented for PF_INET

Page 12: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket APIs in detail
• int socket(int domain, int type, int protocol)
• protocol
  – Defines the protocol to be used for communication
  – Normally only a single protocol exists to support a particular socket type within a given protocol family, in which case protocol can be specified as 0
  – It is therefore usually left unused, because the protocol is already determined by the domain and type parameters
  – e.g., PF_INET with SOCK_STREAM implies the use of TCP
  – PF_INET with SOCK_DGRAM implies the use of UDP, etc.
  – However, several protocols may exist in a given protocol family, in which case a particular protocol must be specified using the protocol field
• Return value
  – On success, a file descriptor for the new socket is returned
  – On error, -1 is returned, and errno is set appropriately
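As a minimal sketch of how the three socket() parameters combine, the helpers below open an IPv4 socket with protocol 0 (letting the kernel pick TCP or UDP from the type) and, for comparison, with the protocol named explicitly. The function names open_inet_socket and open_tcp_socket_explicit are hypothetical, chosen only for this illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: open an IPv4 socket of the given type
 * (SOCK_STREAM or SOCK_DGRAM). Passing protocol 0 lets the kernel
 * choose the default for that type: TCP for SOCK_STREAM, UDP for
 * SOCK_DGRAM. Returns the descriptor, or -1 with errno set. */
int open_inet_socket(int type)
{
    int fd = socket(PF_INET, type, 0);
    if (fd == -1)
        perror("socket");
    return fd;
}

/* Hypothetical helper: the same TCP socket, with the protocol
 * named explicitly instead of defaulted. */
int open_tcp_socket_explicit(void)
{
    return socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
}
```

A caller would typically check the result against -1 before using the descriptor and close() it when done.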

Page 13: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API in detail
• int bind(int socket_fd, struct sockaddr *addr, int addr_len)
  – bind() attaches the newly created socket to the local address addr
  – A local address must be assigned with bind() before a SOCK_STREAM socket can receive connections
  – A server that listens for incoming connections must call bind() before it can accept connection requests
• int socket_fd
  – The descriptor of the socket to which bind() applies
• struct sockaddr *addr
  – This structure contains the address
  – The actual structure passed for the addr argument depends on the address family
  – The sockaddr structure is defined as something like:
      struct sockaddr {
          sa_family_t sa_family;
          char        sa_data[14];
      }
• int addr_len specifies the length of the structure passed in addr; the length depends on the protocol family

Page 14: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

bind in a code example

    ……
    int sfd;
    struct sockaddr_un addr;

    sfd = socket(AF_UNIX, SOCK_STREAM, 0);   /* socket is opened */
    if (sfd == -1) {
        perror("socket");
        exit(EXIT_FAILURE);
    }

    memset(&addr, 0, sizeof(struct sockaddr_un));   /* Clear structure */
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, MY_SOCK_PATH, sizeof(addr.sun_path) - 1);

    /* address binding */
    if (bind(sfd, (struct sockaddr *) &addr, sizeof(struct sockaddr_un)) == -1) {
        perror("bind");
        exit(EXIT_FAILURE);
    }
    ……
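The bind() example above uses a Unix-domain address. For comparison, here is a sketch of the IPv4 case, assuming the socket should be bound to the loopback address. The helper name bind_loopback is hypothetical. Passing port 0 asks the kernel to pick a free port, which is then read back with getsockname().

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: bind fd to 127.0.0.1 on the given port
 * (0 = let the kernel choose). Returns the port actually bound,
 * in host byte order, or -1 on error. */
int bind_loopback(int fd, unsigned short port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);                    /* network byte order */
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* 127.0.0.1 */

    /* The address-family-specific structure is cast to the generic
     * struct sockaddr, and its size is passed as the address length. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        return -1;

    /* Ask the kernel which address and port were actually assigned. */
    if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1)
        return -1;
    return ntohs(addr.sin_port);
}
```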

Page 15: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API in detail
• int listen(int socketfd, int backlog)
• listen() signals the willingness to accept incoming connections on a newly created socket and sets a queue limit for those connections
• Required for server-side sockets
• int socketfd
  – The socket on which listen() is to be carried out
• int backlog
  – The backlog parameter defines the maximum length to which the queue of pending connections may grow
  – If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED
• Returns 0 on success; otherwise returns -1 and sets errno to an appropriate error code

Page 16: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API in detail
• int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
  – This system call is used with connection-based socket types (e.g., SOCK_STREAM)
  – It extracts the first connection request on the queue of pending connections
  – Creates a new connected socket
  – Returns a new file (socket) descriptor referring to that socket
  – The original socket sockfd is unaffected by this call
  – The newly created socket is not in the listening state
• The argument int sockfd
  – is a socket that has been created with socket(.), bound to a local address with bind(.), and is listening for connections after a listen(.) call

Page 17: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API in detail
• int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
• The argument struct sockaddr *addr
  – A pointer to a sockaddr structure
  – This structure is filled in with the address of the peer (remote host's) socket that is accepted into the communication session
• The argument socklen_t *addrlen
  – The addrlen argument is a value-result argument
  – It should initially contain the size of the structure pointed to by addr
  – On return it will contain the actual length (in bytes) of the address returned
  – When addr is NULL, nothing is filled in
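The socket(), bind(), listen(), accept() sequence described in these slides can be sketched as follows, assuming an IPv4 server on the loopback interface. The helper names make_listener and accept_peer are hypothetical. Note how peer_len is initialized to the buffer size before accept() and holds the real address length afterwards, matching the value-result behaviour of addrlen.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1 on a
 * kernel-chosen port and put it in the listening state. Stores the
 * port in *port_out. Returns the descriptor, or -1 on error. */
int make_listener(int backlog, unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);            /* 0 = any free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, backlog) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

/* Hypothetical helper: accept one connection and write the peer's
 * dotted-quad address into ip. Returns the new connected
 * descriptor, or -1 on error. The listening socket is unaffected. */
int accept_peer(int listen_fd, char *ip, socklen_t ip_len)
{
    struct sockaddr_in peer;
    socklen_t peer_len = sizeof(peer);   /* in: buffer size, out: address size */
    int conn_fd = accept(listen_fd, (struct sockaddr *)&peer, &peer_len);
    if (conn_fd == -1)
        return -1;
    if (inet_ntop(AF_INET, &peer.sin_addr, ip, ip_len) == NULL) {
        close(conn_fd);
        return -1;
    }
    return conn_fd;
}
```

A server would normally call accept_peer() in a loop, handling each returned descriptor and closing it when the session ends.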

Page 18: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API in detail
• int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
  – A socket can be either blocking or non-blocking
  – A blocking socket API call does not return until the call is completed with a result
  – e.g., accept(.) can block the caller until a connection is present (which sometimes can result in a long wait)
  – The system call select(..) lets the caller wait, with a timeout, until a connection is pending, so that the accept(.) that follows does not block
  – If the socket is marked non-blocking and no pending connections are present on the queue, accept(.) fails (returns) with the error EAGAIN
  – The caller therefore need not wait indefinitely for the accept(.) call to return
• Return values
  – On success, accept(.) returns a non-negative integer that is a descriptor for the accepted socket
  – On error, -1 is returned, and errno is set appropriately

Page 19: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API in detail
• int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout)
  – select() allows a program to monitor multiple file descriptors, waiting until one or more of them become "ready" for some class of I/O operation (e.g., a read that will not block)
  – nfds is the highest-numbered file descriptor in any of the three sets, plus 1
  – readfds
    • The file descriptors listed in readfds are watched to see if characters become available for reading (more precisely, to see if a read will not block)
    • In particular, a file descriptor is also ready on end-of-file
  – writefds
    • File descriptors are watched to see if a write will not block
  – exceptfds
    • File descriptors in this set are watched for exceptions
  – On exit, the sets are modified in place to indicate which file descriptors actually changed status
  – Each of the three file descriptor sets may be specified as NULL if no file descriptors are to be watched for the corresponding class of events
  – Four macros are provided to manipulate the file descriptor sets
    • FD_ZERO() clears a set
    • FD_SET() and FD_CLR() respectively add and remove a given file descriptor from a set
    • FD_ISSET() tests whether a file descriptor is part of the set (this is useful after select() returns)

Page 20: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

select() example

    fd_set read_sockfds;
    struct timeval tv;
    int retval;

    /* Watch file descriptor 0 (standard input) to see when it has input. */
    FD_ZERO(&read_sockfds);
    FD_SET(0, &read_sockfds);

    tv.tv_sec = 0;
    tv.tv_usec = 1000;   /* Wait up to 1 millisecond. */

    /* nfds is the highest-numbered descriptor in the sets, plus 1. */
    retval = select(0 + 1, &read_sockfds, NULL, NULL, &tv);
    /* Don't rely on the value of tv now! */

    if (retval == -1)
        perror("select()");
    else if (retval)
        printf("Data is available now.\n");   /* FD_ISSET(0, &read_sockfds) will be true. */
    else
        printf("No data within 1 millisecond.\n");

The timeout structure is defined as:

    struct timeval {
        time_t      tv_sec;    /* seconds */
        suseconds_t tv_usec;   /* microseconds */
    };

Page 21: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API details
• There are other ways to avoid blocking on a socket
  – pselect(.)
    • Similar to select(.), except that its timeout is a struct timespec (nanosecond resolution), it does not modify the timeout argument, and it takes an additional sigmask parameter
  – fcntl(.)
    • int fcntl(int fd, int cmd, long arg);
    • Setting the O_NONBLOCK flag in arg with the F_SETFL command makes a socket non-blocking
  – recv(.) with appropriate flags set
    • The MSG_DONTWAIT flag makes that single call non-blocking
    • May not work on all implementations of the network socket API

Page 22: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API details
• send(), sendto(), and sendmsg()
  – The system calls send(), sendto(), and sendmsg() are used to transmit a message to another socket
  – The send() call may be used only when the socket is in a connected state (so that the intended recipient is known)
  – ssize_t send(int socket_fd, const void *buf, size_t len, int flags);
    • socket_fd is the socket on which the send is to be carried out
    • buf points to the data to be sent
    • len is the length of the message
    • flags defines special control options to be considered for transmission (e.g., a bitwise OR of MSG_DONTROUTE, MSG_MORE, MSG_OOB)
  – If the message does not fit into the socket's send buffer, send() normally blocks; in non-blocking mode it fails with EAGAIN instead
  – Return values
    • On success, these calls return the number of bytes sent
    • On error, -1 is returned, and errno is set appropriately
  – ssize_t sendto(int socketfd, const void *buf, size_t len, int flags, const struct sockaddr *to, socklen_t tolen);
  – ssize_t sendmsg(int socketfd, const struct msghdr *msg, int flags);
    • sendto() and sendmsg() are preferred for UDP-like connectionless services
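Because send() returns the number of bytes actually sent, which on a stream socket can be less than len, callers often loop until the whole buffer is out. The sketch below shows one way to do that. The helper name send_all is hypothetical, and retrying on EINTR is a design choice of this example rather than something stated in the slides.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: keep calling send() until all len bytes of
 * buf have been handed to the kernel. Returns 0 on success, -1 on
 * error (errno set by send). Intended for blocking stream sockets. */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                /* skip past the bytes already sent */
        len -= (size_t)n;
    }
    return 0;
}
```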

Page 23: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Socket API details
• recv(), recvfrom(), and recvmsg() are used for receiving data from a socket
  – ssize_t recv(int socketfd, void *buf, size_t len, int flags);
  – The recv() call is normally used only on a connected socket, where the remote address is known
  – recv() is a blocking call unless explicitly made non-blocking
  – A blocking recv() can wait indefinitely until it gets data from the socket
  – On some implementations, a flag (e.g., MSG_DONTWAIT) makes an individual call non-blocking
  – recvfrom() and recvmsg() are mainly for message-based communication such as UDP
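For the connectionless case, here is a sketch of sendto() and recvfrom() over UDP on the loopback interface. All helper names (loopback_addr, udp_open, udp_send, udp_recv) are hypothetical. recvfrom() fills in the sender's address, which is how a UDP receiver learns where a datagram came from.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Fill a with 127.0.0.1:port (port in host byte order). */
static void loopback_addr(struct sockaddr_in *a, unsigned short port)
{
    memset(a, 0, sizeof(*a));
    a->sin_family = AF_INET;
    a->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a->sin_port = htons(port);
}

/* Open a UDP socket bound to a kernel-chosen loopback port. */
int udp_open(unsigned short *port_out)
{
    struct sockaddr_in a;
    socklen_t len = sizeof(a);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd == -1)
        return -1;
    loopback_addr(&a, 0);
    if (bind(fd, (struct sockaddr *)&a, sizeof(a)) == -1 ||
        getsockname(fd, (struct sockaddr *)&a, &len) == -1) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(a.sin_port);
    return fd;
}

/* Send one datagram to 127.0.0.1:port; no connection is needed. */
ssize_t udp_send(int fd, unsigned short port, const void *msg, size_t len)
{
    struct sockaddr_in to;
    loopback_addr(&to, port);
    return sendto(fd, msg, len, 0, (struct sockaddr *)&to, sizeof(to));
}

/* Receive one datagram and report the sender's port. */
ssize_t udp_recv(int fd, void *buf, size_t len, unsigned short *src_port)
{
    struct sockaddr_in from;
    socklen_t from_len = sizeof(from);
    ssize_t n = recvfrom(fd, buf, len, 0, (struct sockaddr *)&from, &from_len);
    if (n >= 0)
        *src_port = ntohs(from.sin_port);
    return n;
}
```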

Page 24: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

socket API details
• int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen)
  – Requests a connection from the socket referred to by the file descriptor sockfd to the address specified by serv_addr
  – Usually called by a client host to get connected to a server host
  – Can be used for both STREAM and DGRAM sockets
  – For DGRAM sockets, serv_addr is the remote address to which datagrams are sent by default
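The client side can be sketched as below, assuming an IPv4 server address given in dotted-quad form. The helper name connect_tcp is hypothetical. inet_pton() converts the text address into the binary form that struct sockaddr_in expects.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to ip:port, where ip
 * is a dotted-quad string such as "127.0.0.1". Returns the
 * connected descriptor, or -1 on error. */
int connect_tcp(const char *ip, unsigned short port)
{
    struct sockaddr_in srv;
    int fd;

    memset(&srv, 0, sizeof(srv));
    srv.sin_family = AF_INET;
    srv.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &srv.sin_addr) != 1)
        return -1;                       /* not a valid IPv4 address */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    /* For a STREAM socket this performs the TCP handshake. */
    if (connect(fd, (struct sockaddr *)&srv, sizeof(srv)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```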

Page 25: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Socket API details
• Two ways to close a socket
  – close() and shutdown()
• int close(int socketfd)
  – Closes a socket descriptor, so that it no longer refers to any socket and may be reused
  – Not checking the return value of close() is a serious programming error
• int shutdown(int socketfd, int how);
  – The shutdown() call causes all or part of a full-duplex connection on the socket associated with socketfd to be shut down
  – The argument how determines what communication remains possible after shutdown
    • If how is SHUT_RD, further receptions will be disallowed
    • If how is SHUT_WR, further transmissions will be disallowed
    • If how is SHUT_RDWR, further receptions and transmissions will be disallowed
  – Return values
    • On success, zero is returned
    • On error, -1 is returned, and errno is set appropriately
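One common way to combine the two calls is sketched below: shut down the sending direction, drain whatever the peer still sends until end-of-file, then close() and check its return value. The helper name finish_and_close is hypothetical, and this ordering is an illustrative choice, not the only valid teardown sequence.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: half-close the connection, read until the
 * peer signals end-of-file, then release the descriptor.
 * Returns 0 on a clean teardown, -1 otherwise. */
int finish_and_close(int fd)
{
    char sink[256];
    ssize_t n;

    /* SHUT_WR: we will send no more data, but can still receive. */
    if (shutdown(fd, SHUT_WR) == -1) {
        close(fd);
        return -1;
    }

    /* Discard remaining data; recv() returns 0 at end-of-file. */
    while ((n = recv(fd, sink, sizeof(sink), 0)) > 0)
        ;

    /* The return value of close() is checked, as the slide advises. */
    if (close(fd) == -1) {
        perror("close");
        return -1;
    }
    return n == 0 ? 0 : -1;
}
```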

Page 26: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

What happens when you click on a web link?


Page 27: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Encapsulation

[Figure: Encapsulation. At the source, the application message M gains a transport header Ht (segment: Ht M), then a network header Hn (datagram: Hn Ht M), then a link header Hl (frame: Hl Hn Ht M) as it moves down the stack (application, transport, network, link, physical). A switch handles the frame at the link and physical layers; a router handles it up to the network layer. The destination strips the headers in reverse order to recover M.]
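The header nesting in the encapsulation figure can be mimicked with a toy buffer, as sketched below. The text tags "Ht|", "Hn|" and "Hl|" stand in for real transport, network and link headers, and the helper names push_header, pop_header and encapsulate are hypothetical. Real headers are binary structures, not strings.

```c
#include <string.h>

/* Prepend header text hdr to the packet in buf (current length
 * *len, capacity cap). Returns 0 on success, -1 if it won't fit. */
int push_header(char *buf, size_t cap, size_t *len, const char *hdr)
{
    size_t h = strlen(hdr);
    if (*len + h > cap)
        return -1;
    memmove(buf + h, buf, *len);   /* shift the payload right */
    memcpy(buf, hdr, h);           /* header goes in front */
    *len += h;
    return 0;
}

/* Remove header hdr from the front of the packet. Returns 0 on
 * success, -1 if the packet does not start with hdr. */
int pop_header(char *buf, size_t *len, const char *hdr)
{
    size_t h = strlen(hdr);
    if (*len < h || memcmp(buf, hdr, h) != 0)
        return -1;
    memmove(buf, buf + h, *len - h);
    *len -= h;
    return 0;
}

/* Source side: M -> Ht M -> Hn Ht M -> Hl Hn Ht M. */
int encapsulate(char *buf, size_t cap, size_t *len)
{
    if (push_header(buf, cap, len, "Ht|") == -1)   /* transport: segment */
        return -1;
    if (push_header(buf, cap, len, "Hn|") == -1)   /* network: datagram */
        return -1;
    if (push_header(buf, cap, len, "Hl|") == -1)   /* link: frame */
        return -1;
    return 0;
}
```

The destination would call pop_header() with "Hl|", "Hn|" and "Ht|" in that order, mirroring the way each layer removes its own header on the way up.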

Page 28: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Major steps in downloading a web page

• Extract the hostname from the URL
  – http://www.google.com/index.html to www.google.com
• Use DNS to translate www.google.com to an IP address
  – The IP address is used for Internet routing
• Establish a TCP (socket) connection to the IP address (e.g., 66.102.7.104)
  – Protocol agreement: browser and server speak HTTP
  – TCP handles network problems (drops, corruption, etc.)
  – TCP is layered on top of IP/Ethernet
• Internet routers determine an efficient path to 66.102.7.104
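The first two steps can be sketched in C as follows. The helper names extract_host and resolve_ipv4 are hypothetical, and the URL parsing is deliberately simplistic (it assumes a scheme://host[:port]/path shape). Name resolution uses the standard getaddrinfo() call, which consults DNS when the host is not already a numeric address.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Hypothetical helper: copy the host part of a URL into host.
 * Returns 0 on success, -1 if no host is found or it won't fit. */
int extract_host(const char *url, char *host, size_t host_len)
{
    const char *start = strstr(url, "://");
    size_t n;

    start = start ? start + 3 : url;   /* skip the scheme, if any */
    n = strcspn(start, ":/");          /* host ends at ':' or '/' */
    if (n == 0 || n >= host_len)
        return -1;
    memcpy(host, start, n);
    host[n] = '\0';
    return 0;
}

/* Hypothetical helper: resolve host to its first IPv4 address,
 * written to ip in dotted-quad form. Returns 0 or -1. */
int resolve_ipv4(const char *host, char *ip, socklen_t ip_len)
{
    struct addrinfo hints, *res;
    struct sockaddr_in *sin;
    const char *ok;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;         /* IPv4 only, for simplicity */
    hints.ai_socktype = SOCK_STREAM;   /* we intend to use TCP */
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    sin = (struct sockaddr_in *)res->ai_addr;
    ok = inet_ntop(AF_INET, &sin->sin_addr, ip, ip_len);
    freeaddrinfo(res);
    return ok ? 0 : -1;
}
```

The resulting address would then be passed to connect() on a SOCK_STREAM socket to carry out the third step.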

Page 29: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Address types and their context

• Domain name (e.g., www.google.com)
  – Global, human readable
• IP address (e.g., 66.102.7.104)
  – Global, works across all networks
• Ethernet address (e.g., 08-00-2b-18-bc-65)
  – Local, works on a particular network

Page 30: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Name to address translation


Page 31: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Address resolution for finding the local address


Page 32: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Protocol stack efficiency
• The efficiency of a protocol stack depends on its implementation
• There are two ways to organize program execution in a protocol stack
  • Process-per-protocol
  • Process-per-message
• Process-per-protocol
  – In this strategy, every protocol in a layer is implemented as a separate process
  – One process per protocol in a layer
  – A process is an abstraction provided by the OS that enables concurrent execution of tasks
  – A certain amount of resources, such as address and data space and CPU cycles, is reserved for every process
  – Most applications are executed as a single process

Page 33: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Process-per-protocol model

• When a message moves up/down the protocol stack
  – Many context switches happen
  – Context switches happen between two layers/protocols
  – A memory copy is often required
  – Severe performance degradation can result

[Figure: the five layers (Application, Transport, Network, Datalink, PHY), with the protocol at each layer running as its own protocol process]

Page 34: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Process-per-message model

• In the process-per-message model, a process is associated with each message
• Each protocol becomes a static piece of code
• At each protocol/layer, the single process responsible for the message calls the layer-specific procedures

[Figure: the five layers (Application, Transport, Network, Datalink, PHY) traversed by one message process for transmission and one message process for reception]

Page 35: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Protocol stack efficiency

• Modern network protocol stacks prefer process-per-message
  – It causes fewer context switches
  – It needs fewer memory copies: memory is slower than the processor, so memory access is very expensive

Page 36: B. S. Manoj, Ph.D cseweb.ucsd/classes/fa09/cse124

Summary

• Network protocol stack
• Protocol stack API
• What happens when you click on a web link?
• Efficiency issues