Reliable Distributed Systems RPC and Client-Server Computing

Page 1: Reliable Distributed Systems RPC and Client-Server Computing

Reliable Distributed Systems

RPC and Client-Server Computing

Page 2: Reliable Distributed Systems RPC and Client-Server Computing

Remote Procedure Call

• Basic concepts
• Implementation issues, usual optimizations
• Where are the costs?
• Firefly RPC, Lightweight RPC, Winsock Direct and VIA
• Reliability and consistency
• Multithreading debate

Page 3: Reliable Distributed Systems RPC and Client-Server Computing

A brief history of RPC

• Introduced by Birrell and Nelson in 1984
• Pre-RPC: most applications were built directly over the Internet primitives
• Their idea: mask distributed computing system using a “transparent” abstraction
  • Looks like normal procedure call
  • Hides all aspects of distributed interaction
  • Supports an easy programming model
• Today, RPC is the core of many distributed systems

Page 4: Reliable Distributed Systems RPC and Client-Server Computing

More history

• Early focus was on RPC “environments”
• Culminated in DCE (Distributed Computing Environment), which standardizes many aspects of RPC
• Then emphasis shifted to performance; many systems improved by a factor of 10 to 20
• Today, RPC is often used from object-oriented systems employing CORBA or COM standards. Reliability issues are more evident than in the past.

Page 9: Reliable Distributed Systems RPC and Client-Server Computing

The basic RPC protocol

Client:
• “binds” to server
• prepares, sends request
• unpacks reply

Server:
• registers with name service
• receives request
• invokes handler
• sends reply

Page 10: Reliable Distributed Systems RPC and Client-Server Computing

Compilation stage

• Server defines and “exports” a header file giving interfaces it supports and arguments expected. Uses an “interface definition language” (IDL)
• Client includes this information
• Client invokes server procedures through “stubs”
  • provides interface identical to the server version
  • responsible for building the messages and interpreting the reply messages
  • passes arguments by value, never by reference
  • may limit total size of arguments, in bytes
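
The stub's job can be sketched in a few lines of Python. This is a minimal illustration, not how any particular RPC system generates stubs: `make_stub`, `server_dispatch`, `HANDLERS`, and `MAX_ARG_BYTES` are hypothetical names, JSON stands in for a real wire format, and the "transport" is a direct function call rather than a network send and receive.

```python
import json

MAX_ARG_BYTES = 1024  # hypothetical limit on the size of a request message


def server_add(a, b):
    """The real procedure, living in the server."""
    return a + b


HANDLERS = {"add": server_add}  # procedures the server "exports"


def server_dispatch(request_bytes):
    """Server side: unpack the request, invoke the handler, pack the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = HANDLERS[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def make_stub(proc_name, transport):
    """Build a client stub that is called just like the server procedure."""
    def stub(*args):
        # Arguments are copied into the message: by value, never by reference.
        message = json.dumps({"proc": proc_name, "args": list(args)}).encode("utf-8")
        if len(message) > MAX_ARG_BYTES:
            raise ValueError("argument too large for RPC message")
        reply = transport(message)  # stands in for send, then receive
        return json.loads(reply.decode("utf-8"))["result"]
    return stub


add = make_stub("add", server_dispatch)
print(add(2, 3))  # prints 5
```

The caller writes `add(2, 3)` exactly as it would for a local procedure; the size check shows where an "argument too large" error can surface.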

Page 11: Reliable Distributed Systems RPC and Client-Server Computing

Binding stage

• Occurs when client and server program first start execution
• Server registers its network address with name directory, perhaps with other information
• Client scans directory to find appropriate server
• Depending on how RPC protocol is implemented, may make a “connection” to the server, but this is not mandatory

Page 12: Reliable Distributed Systems RPC and Client-Server Computing

Data in messages

• We say that data is “marshalled” into a message and “demarshalled” from it
• Representation needs to deal with byte ordering issues (big-endian versus little-endian), strings (some CPUs require padding), alignment, etc.
• Goal is to be as fast as possible on the most common architectures, yet must also be very general
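
A small sketch of marshalling with an explicit byte order, using Python's standard `struct` module. The message layout (procedure id, one integer argument, then a length-prefixed string padded to a 4-byte boundary) is an assumption made for illustration, as are the names `marshal` and `demarshal`; it is not the wire format of any specific RPC system.

```python
import struct

HEADER = "!III"  # "!" = network (big-endian) byte order, no implicit padding
HEADER_SIZE = struct.calcsize(HEADER)  # three 4-byte unsigned ints


def marshal(proc_id, count, name):
    """Pack a request: proc id, an integer argument, and a padded string."""
    data = name.encode("utf-8")
    pad = (-len(data)) % 4  # pad the string out to a 4-byte boundary
    return struct.pack(HEADER, proc_id, count, len(data)) + data + b"\x00" * pad


def demarshal(message):
    """Unpack a request built by marshal()."""
    proc_id, count, length = struct.unpack_from(HEADER, message, 0)
    name = message[HEADER_SIZE:HEADER_SIZE + length].decode("utf-8")
    return proc_id, count, name


msg = marshal(7, 258, "abcde")
print(len(msg))        # 12 header bytes + 5 string bytes + 3 padding bytes
print(demarshal(msg))
```

Because both sides agree on big-endian fields, the same bytes decode correctly on either kind of CPU. A stub that knows both ends share a representation could skip the conversion, which is the cost saving the slide alludes to.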

Page 13: Reliable Distributed Systems RPC and Client-Server Computing

Request marshalling

• Client builds a message containing arguments, indicates what procedure to invoke
• Due to the need for generality, data representation is a potentially costly issue!
• Performs a send I/O operation to send the message
• Performs a receive I/O operation to accept the reply
• Unpacks the reply from the reply message
• Returns result to the client program

Page 14: Reliable Distributed Systems RPC and Client-Server Computing

Costs in basic protocol?

• Allocation and marshalling data into message (can reduce costs if you are certain client, server have identical data representations)
• Two system calls, one to send, one to receive, hence context switching
• Much copying all through the O/S: application to UDP, UDP to IP, IP to Ethernet interface, and back up to application

Page 15: Reliable Distributed Systems RPC and Client-Server Computing

Schroeder and Burrows

• Studied RPC performance in O/S kernel
• Suggested a series of major optimizations
• Resulted in performance improvements of about 10-fold for the DEC Firefly workstation (from 10ms to below 1ms)

Page 16: Reliable Distributed Systems RPC and Client-Server Computing

Typical optimizations?

• Compile the stub “inline” to put arguments directly into message
• Two versions of stub; if (at bind time) sender and dest. are found to have same data representations, use host-specific rep.
• Use a special “send, then receive” system call (requires O/S extension)
• Optimize the O/S kernel path itself to eliminate copying; treat RPC as the most important task the kernel will do

Page 17: Reliable Distributed Systems RPC and Client-Server Computing

Fancy argument passing

• RPC is transparent for simple calls with a small amount of data passed
  • “Transparent” in the sense that the interface to the procedure is unchanged
  • But exceptions thrown will include new exceptions associated with network
• What about complex structures, pointers, big arrays? These will be very costly, and perhaps impractical to pass as arguments
• Most implementations limit size, types of RPC arguments. Very general systems are less limited but much more costly.

Page 19: Reliable Distributed Systems RPC and Client-Server Computing

Overcoming lost packets

[Figure: message timeline between client and server. Labels: sends request; Timeout!; retransmit; ack for request; duplicate request: ignored]

Page 21: Reliable Distributed Systems RPC and Client-Server Computing

Overcoming lost packets

[Figure: message timeline between client and server. Labels: sends request; Timeout!; retransmit; ack for request; reply; ack for reply]
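
The timeline can be sketched as a client that retransmits after a timeout while the server avoids re-executing a request it has already seen. This is a simulation under stated assumptions, not a network protocol: `LossyServer`, `drop_requests`, `drop_replies`, and `call_with_retransmit` are hypothetical names, losses are scripted by attempt number rather than random, the timer is implied rather than run, and explicit ack packets are omitted (the reply doubles as the ack).

```python
class LossyServer:
    """Simulated server behind a network that loses chosen transmissions."""

    def __init__(self, drop_requests=(), drop_replies=()):
        self.drop_requests = set(drop_requests)  # attempts whose request is lost
        self.drop_replies = set(drop_replies)    # attempts whose reply is lost
        self.attempt = 0
        self.executed = 0
        self.seen = {}                           # request id -> saved reply

    def deliver(self, request_id, payload):
        """Return the reply, or None if the client would time out."""
        self.attempt += 1
        if self.attempt in self.drop_requests:
            return None                          # request never arrived
        if request_id not in self.seen:
            self.executed += 1                   # first arrival: run the handler
            self.seen[request_id] = payload.upper()
        if self.attempt in self.drop_replies:
            return None                          # work was done, reply was lost
        return self.seen[request_id]             # duplicates get the saved reply


def call_with_retransmit(server, request_id, payload, max_tries=5):
    """Send a request, retransmitting after each (simulated) timeout."""
    for _ in range(max_tries):
        reply = server.deliver(request_id, payload)
        if reply is not None:
            return reply
        # A real client would wait for its timer to expire here.
    raise TimeoutError("no reply after %d tries" % max_tries)


server = LossyServer(drop_requests={1}, drop_replies={2})
print(call_with_retransmit(server, 42, "ping"), server.attempt, server.executed)
```

In this run the first request and the second reply are lost, so the client sends three times while the handler runs once.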

Page 22: Reliable Distributed Systems RPC and Client-Server Computing

Costs in fault-tolerant version?

• Acks are expensive. Try to avoid them, e.g. if the reply will be sent quickly, suppress the initial ack
• Retransmission is costly. Try to tune the delay to be “optimal”
• For big messages, send packets in bursts and ack a burst at a time, not one by one

Page 23: Reliable Distributed Systems RPC and Client-Server Computing

Big packets

[Figure: message timeline between client and server. Labels: sends request as a burst; ack entire burst; reply; ack for reply]

Page 24: Reliable Distributed Systems RPC and Client-Server Computing

RPC “semantics”

• At most once: request is processed 0 or 1 times
• Exactly once: request is always processed 1 time
• At least once: request processed 1 or more times
• ... but exactly once is impossible because we can’t distinguish packet loss from true failures! In both cases, RPC protocol simply times out.

Page 25: Reliable Distributed Systems RPC and Client-Server Computing

Implementing at most/least once

• Use a timer (clock) value and a unique id, plus sender address
• Server remembers recent id’s and replies with same data if a request is repeated
• Also uses id to identify duplicates and reject them
• Very old requests detected and ignored by checking time
  • Assumes that the clocks are working
  • In particular, requires “synchronized” clocks
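
A minimal sketch of the server-side bookkeeping described above, assuming each request carries a sender address, a unique id, and the sender's clock value. `DuplicateFilter`, its `window_seconds` parameter, and the status strings are hypothetical names chosen for illustration; the caller passes `now` explicitly so the example does not depend on a real clock.

```python
class DuplicateFilter:
    """Remember recent request ids so a repeated request is not re-executed."""

    def __init__(self, handler, window_seconds=60.0):
        self.handler = handler
        self.window = window_seconds
        self.replies = {}  # (sender, request_id) -> (timestamp, saved reply)

    def handle(self, sender, request_id, timestamp, args, now):
        # Very old requests are ignored; this relies on synchronized clocks.
        if now - timestamp > self.window:
            return ("rejected", None)
        key = (sender, request_id)
        if key in self.replies:
            # Repeated request: answer with the same data, do not run it again.
            return ("duplicate", self.replies[key][1])
        reply = self.handler(*args)
        self.replies[key] = (timestamp, reply)
        # Entries older than the window can be forgotten, because any
        # retransmission of them would be rejected by the time check above.
        self.replies = {k: v for k, v in self.replies.items()
                        if now - v[0] <= self.window}
        return ("executed", reply)


calls = []
server = DuplicateFilter(lambda x: calls.append(x) or x * 2)
print(server.handle("client-a", 1, 100.0, (21,), now=101.0))
print(server.handle("client-a", 1, 100.0, (21,), now=102.0))
print(server.handle("client-a", 2, 10.0, (5,), now=200.0))
print(len(calls))
```

The time check is what lets the table stay small: without it the server would have to remember every id forever to be sure it never executed a stale retransmission.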

Page 26: Reliable Distributed Systems RPC and Client-Server Computing

RPC versus local procedure call

• Restrictions on argument sizes and types
• New error cases:
  • Bind operation failed
  • Request timed out
  • Argument “too large” can occur if, e.g., a table grows
• Costs may be very high
• ... so RPC is actually not very transparent!

Page 27: Reliable Distributed Systems RPC and Client-Server Computing

RPC costs in case of local destination process

• Often, the destination is right on the caller’s machine!
  • Caller builds message
  • Issues send system call, blocks, context switch
  • Message copied into kernel, then out to dest.
  • Dest is blocked... wake it up, context switch
  • Dest computes result
  • Entire sequence repeated in reverse direction
• If scheduler is a process, context switch 6 times!

Page 28: Reliable Distributed Systems RPC and Client-Server Computing

RPC example

[Figure labels: Source does xyz(a, b, c); Dest on same site; O/S]

Page 29: Reliable Distributed Systems RPC and Client-Server Computing

RPC in normal case

[Figure labels: Source does xyz(a, b, c); Dest on same site; O/S]

Destination and O/S are blocked

Page 30: Reliable Distributed Systems RPC and Client-Server Computing

RPC in normal case

[Figure labels: Source does xyz(a, b, c); Dest on same site; O/S]

Source, dest both block. O/S runs its scheduler, copies message from source out-queue to dest in-queue

Page 31: Reliable Distributed Systems RPC and Client-Server Computing

RPC in normal case

[Figure labels: Source does xyz(a, b, c); Dest on same site; O/S]

Dest runs, copies in message. Same sequence needed to return results

Page 32: Reliable Distributed Systems RPC and Client-Server Computing

Important optimizations: LRPC

• Lightweight RPC (LRPC): for case of sender, dest on same machine (Bershad et al.)
• Uses memory mapping to pass data
• Reuses same kernel thread to reduce context switching costs (user suspends and server wakes up on same kernel thread or “stack”)
• Single system call: send_rcv or rcv_send

Page 33: Reliable Distributed Systems RPC and Client-Server Computing

LRPC

[Figure labels: Source does xyz(a, b, c); Dest on same site; O/S]

O/S and dest initially are idle

Page 34: Reliable Distributed Systems RPC and Client-Server Computing

LRPC

[Figure labels: Source does xyz(a, b, c); Dest on same site; O/S]

Control passes directly to dest; arguments directly visible through remapped memory

Page 35: Reliable Distributed Systems RPC and Client-Server Computing

LRPC performance impact

• On same platform, offers about a 10-fold improvement over a hand-optimized RPC implementation
• Does two memory remappings, no context switch
• Runs about 50 times faster than standard RPC by same vendor (at the time of the research)
• Semantics stronger: easy to ensure exactly once

Page 36: Reliable Distributed Systems RPC and Client-Server Computing

Fbufs

• Peterson: tool for speeding up layered protocols
• Observation: buffer management is a major source of overhead in layered protocols (ISO style)
• Solution: uses memory management, protection to “cache” buffers on frequently used paths
• Stack layers effectively share memory
• Tremendous performance improvement seen

Page 37: Reliable Distributed Systems RPC and Client-Server Computing

Fbufs

control flows through stack of layers, or pipeline of processes

data copied from “out” buffer to “in” buffer

Page 38: Reliable Distributed Systems RPC and Client-Server Computing

Fbufs

control flows through stack of layers, or pipeline of processes

data placed into “out” buffer, shaded buffers are mapped into address space but protected against access

Page 39: Reliable Distributed Systems RPC and Client-Server Computing

Fbufs

control flows through stack of layers, or pipeline of processes

buffer remapped to eliminate copy

Page 40: Reliable Distributed Systems RPC and Client-Server Computing

Fbufs

control flows through stack of layers, or pipeline of processes

in buffer reused as out buffer

Page 41: Reliable Distributed Systems RPC and Client-Server Computing

Fbufs

control flows through stack of layers, or pipeline of processes

buffer remapped to eliminate copy

Page 42: Reliable Distributed Systems RPC and Client-Server Computing

Where are Fbufs used?

• Although this specific system is not widely used:
  • Most kernels use similar ideas to reduce costs of in-kernel layering
  • And many application-layer libraries use the same sorts of tricks to achieve clean structure without excessive overheads from layer crossing

Page 43: Reliable Distributed Systems RPC and Client-Server Computing

Active messages

• Concept developed by Culler and von Eicken for parallel machines
• Assumes the sender knows all about the dest, including memory layout, data formats
• Message header gives address of handler
• Applications copy directly into and out of the network interface
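
The "header names the handler" idea can be sketched as follows. Python cannot express a raw code address, so this sketch substitutes an index into a handler table that is assumed to be identical on every node (the "same program everywhere" assumption). The names `HANDLERS`, `store_word`, `add_word`, `send`, and `receive`, and the 9-byte packet layout, are hypothetical.

```python
import struct

memory = bytearray(16)  # stands in for the receiving application's memory


def store_word(offset, value):
    """Handler 0: write a 32-bit value at a known offset."""
    struct.pack_into("!I", memory, offset, value)


def add_word(offset, value):
    """Handler 1: add to the 32-bit value at a known offset."""
    (old,) = struct.unpack_from("!I", memory, offset)
    struct.pack_into("!I", memory, offset, (old + value) & 0xFFFFFFFF)


# Every node runs the same program, so index i means the same handler everywhere.
HANDLERS = [store_word, add_word]


def send(handler_index, offset, value):
    """Sender builds the packet: the header field selects the handler."""
    return struct.pack("!BII", handler_index, offset, value)


def receive(packet):
    """Receiver jumps straight to the named handler; no dispatch by name."""
    handler_index, offset, value = struct.unpack("!BII", packet)
    HANDLERS[handler_index](offset, value)


receive(send(0, 4, 10))
receive(send(1, 4, 5))
print(struct.unpack_from("!I", memory, 4)[0])  # prints 15
```

The sender must already know the destination's memory layout (here, that offset 4 holds a 32-bit counter), which is exactly the assumption the slide calls out.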

Page 44: Reliable Distributed Systems RPC and Client-Server Computing

Performance impact?

• Even with optimizations, standard RPC requires about 1000 instructions to send a null message
• Active messages: as few as 6 instructions! One-way latency as low as 35 µs
• But model works only if “same program” runs on all nodes and if application has direct control over communication hardware

Page 45: Reliable Distributed Systems RPC and Client-Server Computing

U-Net

• Low latency/high performance communication for ATM on normal UNIX machines, later extended to fast Ethernet
• Developed by von Eicken, Vogels and others at Cornell (1995)
• Idea is that application and ATM controller share memory-mapped region. I/O done by adding messages to queue or reading from queue
• Latency 50-fold reduced relative to UNIX, throughput 10-fold better for small messages!

Page 46: Reliable Distributed Systems RPC and Client-Server Computing

U-Net concepts

• Normally, data flows through the O/S to the driver, then is handed to the device controller
• In U-Net the device controller sees the data directly in shared memory region
• Normal architecture gets protection from trust in kernel
• U-Net gets protection using a form of cooperation between controller and device driver

Page 47: Reliable Distributed Systems RPC and Client-Server Computing

U-Net implementation

• Reprogram ATM controller to understand special data structures in memory-mapped region
• Rebuild ATM device driver to match this model
• Pin shared memory pages, leave mapped into I/O DMA map
• Disable memory caching for these pages (else changes won’t be visible to ATM)

Page 48: Reliable Distributed Systems RPC and Client-Server Computing

U-Net Architecture

User’s address space has a direct-mapped communication region

ATM device controller sees whole region and can transfer directly in and out of it

... organized as an in-queue, out-queue, freelist
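
The in-queue, out-queue, and freelist organization can be sketched as below. This is a toy model under stated assumptions: ordinary Python objects stand in for the pinned, memory-mapped region, the function `controller_transfer` stands in for the network controller, and all names (`Endpoint`, `send`, `poll`, `free_list`) are hypothetical rather than taken from the U-Net interface.

```python
from collections import deque


class Endpoint:
    """One application's communication region: buffers plus three queues."""

    def __init__(self, num_buffers=4, buffer_size=64):
        self.buffer_size = buffer_size
        self.buffers = [bytearray(buffer_size) for _ in range(num_buffers)]
        self.free_list = deque(range(num_buffers))  # indices of unused buffers
        self.out_queue = deque()  # descriptors (index, length) waiting to be sent
        self.in_queue = deque()   # descriptors of messages that have arrived

    def send(self, data):
        """Application side: fill a free buffer and enqueue its descriptor."""
        if len(data) > self.buffer_size:
            raise ValueError("message larger than one buffer")
        if not self.free_list:
            return False  # no buffer available right now
        index = self.free_list.popleft()
        self.buffers[index][:len(data)] = data
        self.out_queue.append((index, len(data)))
        return True

    def poll(self):
        """Application side: take the next arrived message, if any."""
        if not self.in_queue:
            return None
        index, length = self.in_queue.popleft()
        data = bytes(self.buffers[index][:length])
        self.free_list.append(index)
        return data


def controller_transfer(src, dst):
    """Controller stand-in: move queued messages, dropping when dst is full."""
    delivered = 0
    while src.out_queue:
        index, length = src.out_queue.popleft()
        if dst.free_list:
            target = dst.free_list.popleft()
            dst.buffers[target][:length] = src.buffers[index][:length]
            dst.in_queue.append((target, length))
            delivered += 1
        src.free_list.append(index)  # sender's buffer is reusable either way
    return delivered


a, b = Endpoint(), Endpoint()
a.send(b"hello")
controller_transfer(a, b)
print(b.poll())  # prints b'hello'
```

Neither `send` nor `poll` involves the operating system in this model; the application and the controller only touch shared queues, which is the source of the low latency the slides report.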

Page 49: Reliable Distributed Systems RPC and Client-Server Computing

U-Net protection guarantees

• No user can see contents of any other user’s mapped I/O region (U-Net controller sees whole region but not the user programs)
• Driver mediates to create “channels”; user can only communicate over channels it owns
• U-Net controller uses channel code on incoming/outgoing packets to rapidly find the region in which to store them

Page 50: Reliable Distributed Systems RPC and Client-Server Computing

U-Net reliability guarantees

• With space available, has the same properties as the underlying ATM (which should be nearly 100% reliable)
• When queues fill up, will lose packets
• Also loses packets if the channel information is corrupted, etc.

Page 51: Reliable Distributed Systems RPC and Client-Server Computing

Minimum U-Net costs?

• Build message in a preallocated buffer in the shared region
• Enqueue descriptor on “out queue”
• ATM immediately notices and sends it
• Remote machine was polling the “in queue”
• ATM builds descriptor for incoming message
• Application sees it immediately: 35 µs latency

Page 52: Reliable Distributed Systems RPC and Client-Server Computing

Protocols over U-Net

• Von Eicken, Vogels support IP, UDP, TCP over U-Net
• These versions run the TCP stack in user space!
• Later in course will look at other complex protocols over U-Net

Page 53: Reliable Distributed Systems RPC and Client-Server Computing

VIA and Winsock Direct

• Windows consortium (MSFT, Intel, others) commercialized U-Net: Virtual Interface Architecture (VIA)
  • Runs in NT Clusters
• But most applications run over UNIX-style sockets (“Winsock” interface in NT)
• Winsock Direct automatically senses and uses VIA where available
• Today is widely used on clusters and may be a key reason that they have been successful

Page 54: Reliable Distributed Systems RPC and Client-Server Computing

Broad comments on RPC

• RPC is not very transparent
• Failure handling is not evident at all: if an RPC times out, what should the developer do?
  • Reissuing the request only makes sense if there is another server available
  • Anyhow, what if the request was finished but the reply was lost? Do it twice? Try to duplicate the lost reply?
• Performance work is producing enormous gains: from the old 75 ms RPC to RPC over U-Net with a 75 µs round-trip time: a factor of 1000!

Page 55: Reliable Distributed Systems RPC and Client-Server Computing

Contents of an RPC environment

• Standards for data representation
• Stub compilers, IDL databases
• Services to manage server directory, clock synchronization
• Tools for visualizing system state and managing servers and applications

Page 56: Reliable Distributed Systems RPC and Client-Server Computing

Closely Related Topic

• Multithreading is a common performance-enhancing technique
• Idea is that server is often idle while doing I/O for one client, so use extra threads to allow concurrent request processing
• In the limit, leads to database transactional concurrency model, but many non-transactional servers use threads for enhanced performance

Page 57: Reliable Distributed Systems RPC and Client-Server Computing

Multithreading debate

• Three major options:
  • Single-threaded server: only does one thing at a time, uses send/recv system calls and blocks while waiting
  • Multi-threaded server: internally concurrent, each request spawns a new thread to handle it
  • Upcalls: event dispatch loop does a procedure call for each incoming event, as in X11 or PCs running Windows.

Page 58: Reliable Distributed Systems RPC and Client-Server Computing

Single threading: drawbacks

• Applications can deadlock if a request cycle forms: I’m waiting for you and you send me a request, which I can’t handle
• Much of system may be idle waiting for replies to pending requests
• Harder to implement RPC protocol itself (need to use a timer interrupt to trigger acks, retransmission, which is awkward)

Page 59: Reliable Distributed Systems RPC and Client-Server Computing

Multithreading

• Idea is to support internal concurrency as if each process was really multiple processes that share one address space
• Thread scheduler uses timer interrupts and context switching to mimic a physical multiprocessor using the smaller number of CPUs actually available

Page 60: Reliable Distributed Systems RPC and Client-Server Computing

Multithreaded RPC

• Each incoming request is handled by spawning a new thread
• Designer must implement appropriate mutual exclusion to guard against “race conditions” and other concurrency problems
• Ideally, server is more active because it can process new requests while waiting for its own RPCs to complete on other pending requests
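
A minimal sketch of the thread-per-request pattern with mutual exclusion, using Python's standard `threading` module. `CounterServer`, `on_request`, and `handle` are hypothetical names, and requests arrive as direct method calls rather than network messages.

```python
import threading


class CounterServer:
    """Thread-per-request server whose shared state is guarded by a lock."""

    def __init__(self):
        self.total = 0
        self.lock = threading.Lock()

    def handle(self, amount):
        # Mutual exclusion: only one thread at a time may update the total,
        # so concurrent read-modify-write sequences cannot interleave.
        with self.lock:
            self.total += amount

    def on_request(self, amount):
        """Spawn a new thread for each incoming request."""
        worker = threading.Thread(target=self.handle, args=(amount,))
        worker.start()
        return worker


server = CounterServer()
workers = [server.on_request(1) for _ in range(100)]
for worker in workers:
    worker.join()
print(server.total)  # prints 100
```

Every request here creates a thread with its own stack, so the design is simple but unbounded in the number of threads it may create.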

Page 61: Reliable Distributed Systems RPC and Client-Server Computing

Negatives to multithreading

• Users may have little experience with concurrency and will then make mistakes
• Concurrency bugs are very hard to find due to non-reproducible scheduling orders
• Reentrancy can come as an undesired surprise
• Threads need stacks, hence consumption of memory can be very high
• Deadlock remains a risk, now associated with concurrency control
• Stacks for threads must be finite and can overflow, corrupting the address space

Page 63: Reliable Distributed Systems RPC and Client-Server Computing

Threads: can spawn too many

[Figure labels: SCHED; event]

Thread spawned, but blocks

Page 64: Reliable Distributed Systems RPC and Client-Server Computing

Threads: can spawn too many

[Figure labels: SCHED; event]

Eventually, application becomes bloated, begins to thrash. Performance drops and clients may think the server has failed

Page 65: Reliable Distributed Systems RPC and Client-Server Computing

Upcall model

• Common in windowing systems
• Each incoming “event” is encoded as a small descriptive data structure
• User registers event handling procedures
• Dispatch loop calls the procedures as new events arrive, waits for the call to finish, then dispatches a new event
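
The dispatch loop can be sketched as follows. `Dispatcher`, `register`, `post`, and `run` are hypothetical names; a real windowing system or RPC runtime would block waiting for events instead of stopping when the queue is empty.

```python
from collections import deque


class Dispatcher:
    """Single-threaded event loop that makes one upcall per event."""

    def __init__(self):
        self.handlers = {}     # event kind -> registered procedure
        self.events = deque()  # pending events, oldest first

    def register(self, kind, procedure):
        """User code registers the procedure to call for one kind of event."""
        self.handlers[kind] = procedure

    def post(self, kind, data):
        """Encode an incoming event as a small descriptive structure."""
        self.events.append({"kind": kind, "data": data})

    def run(self):
        """Dispatch events one at a time until none remain."""
        while self.events:
            event = self.events.popleft()
            procedure = self.handlers.get(event["kind"])
            if procedure is not None:
                # The upcall: the loop waits here until the handler returns,
                # and only then dispatches the next event.
                procedure(event)


log = []
loop = Dispatcher()
loop.register("request", lambda event: log.append(event["data"]))
loop.post("request", "first")
loop.post("request", "second")
loop.run()
print(log)  # prints ['first', 'second']
```

Because handlers run to completion one after another, no locking is needed, but a slow handler stalls every event queued behind it.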

Page 66: Reliable Distributed Systems RPC and Client-Server Computing

Upcalls combined with threads

Perhaps the best model for RPC programming

Each handler can be tagged: needs thread, or can be executed “unthreaded”

Developer must still be very careful where threads are used
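
One way to read "tagged" handlers is sketched below: each registration carries a flag saying whether the handler needs its own thread. The `needs_thread` flag and the `ThreadedDispatcher` class are hypothetical; this illustrates the idea and is not a description of any particular RPC runtime.

```python
import threading
from collections import deque


class ThreadedDispatcher:
    """Upcall loop in which each handler is tagged threaded or unthreaded."""

    def __init__(self):
        self.handlers = {}     # kind -> (procedure, needs_thread)
        self.events = deque()
        self.workers = []

    def register(self, kind, procedure, needs_thread=False):
        self.handlers[kind] = (procedure, needs_thread)

    def post(self, kind, data):
        self.events.append((kind, data))

    def run(self):
        while self.events:
            kind, data = self.events.popleft()
            procedure, needs_thread = self.handlers[kind]
            if needs_thread:
                # Handlers that may block get their own thread, so the loop
                # can continue dispatching other events.
                worker = threading.Thread(target=procedure, args=(data,))
                worker.start()
                self.workers.append(worker)
            else:
                # Short handlers run inline, with no thread or stack to pay for.
                procedure(data)
        for worker in self.workers:
            worker.join()  # wait for threaded handlers before returning
        self.workers.clear()


results = []
results_lock = threading.Lock()


def record(data):
    with results_lock:  # threaded handlers still need mutual exclusion
        results.append(data)


loop = ThreadedDispatcher()
loop.register("quick", record)
loop.register("slow", record, needs_thread=True)
loop.post("quick", 1)
loop.post("slow", 2)
loop.run()
print(sorted(results))  # prints [1, 2]
```

Only handlers explicitly tagged as needing a thread pay the cost of one, which limits how many threads the server can accumulate while keeping inline handlers free of concurrency concerns.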

Page 67: Reliable Distributed Systems RPC and Client-Server Computing

Recent RPC history

• RPC was once touted as the transparent answer to distributed computing
• Today the protocol is very widely used
• ... but it isn’t very transparent, and reliability issues can be a major problem
• Today the strongest interest is in Web Services and CORBA, which use RPC as the mechanism to implement object invocation