ICS362 Distributed Systems Dr Ken Cosh Week 5

ICS362 Distributed Systems

Dr Ken Cosh

Week 5

Review

Communication
– Fundamentals
– Remote Procedure Calls (RPC)
– Message Oriented Communication
– Stream Oriented Communication
– Multicast Communication

This Week

Naming
– Names, Identifiers & Addresses
– Flat Naming
– Structured Naming

Names

A name is a string of bits / characters referring to an entity.
– An entity could be a resource, host, printer, disk, file, process, user, mailbox, newsgroup, webpage, message…

Entities can be operated on through their interfaces
– But for that we need an access point, and the name of an access point is its address

Access points

An entity can have more than one access point
– We have more than one telephone
– A host offers multiple ports

An entity can change its access points
– A new IP address in a new network
– A new email address

Entity <-> Access Point

It appears an access point is tightly associated with an entity

But the name of the entity and the name of the access point should be independent
– This makes a naming system more flexible and easier to use.

Identifiers

An identifier uniquely refers to an entity
– An identifier refers to at most one entity.
– Each entity is referred to by at most one identifier.
– An identifier always refers to the same entity (i.e., it is never reused).

Human Friendly Names

Most names (addresses, identifiers) are represented in machine readable form, i.e. as bit strings.

Human Friendly Names represent an entity with a character string instead.

Name Resolution

The crucial question is how to resolve names and identifiers to addresses.
– Close link to message routing

The simplest approach is a table of name<->address pairs
– With a large distributed system this becomes a large table which can’t be centralised.

Most of this section will deal with alternative approaches to name resolution
– Flat Naming
– Structured Naming
– Attribute Based Naming

Flat Naming

Generally names are just random bit strings – i.e. nothing about the name gives any indication of where the access point is.

– (In contrast to cis.payap.ac.th for example)

Alternatives here include:
– Broadcast Based
– Home Based
– Distributed Hash Tables
– Hierarchical Based

Broadcasting

A message is sent out to all machines in the network
– Broadcast a message containing the identifier of the entity that is being looked for
– Each machine checks whether it has the entity
– Those with an access point respond with its address

As the network grows this becomes inefficient
– Wasted Bandwidth
– Too many hosts being interrupted with messages they can’t answer
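The broadcast steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the `Host` class, the addresses and the entity id `printer-7` are hypothetical, and a real network would send one broadcast frame rather than loop over hosts.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    address: str
    # Ids of the entities that have an access point on this host (hypothetical).
    entities: set = field(default_factory=set)

    def on_broadcast(self, entity_id):
        # Every host is interrupted by the request; only a host that
        # offers an access point for the entity replies with its address.
        return self.address if entity_id in self.entities else None


def broadcast_lookup(hosts, entity_id):
    """Ask every host; return the addresses found and the number of hosts
    that were interrupted without being able to answer."""
    replies = [host.on_broadcast(entity_id) for host in hosts]
    addresses = [reply for reply in replies if reply is not None]
    interrupted_for_nothing = len(replies) - len(addresses)
    return addresses, interrupted_for_nothing


hosts = [Host("10.0.0.1", {"printer-7"}), Host("10.0.0.2"), Host("10.0.0.3")]
addresses, wasted = broadcast_lookup(hosts, "printer-7")
```

With three hosts, two are interrupted for nothing; that wasted share grows with the size of the network, which is the inefficiency noted above.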

Multicasting

Multicasting can improve things as only a specified group of machines will receive the ‘broadcast’

Forwarding Pointers

When an entity moves, it leaves a forwarding pointer at its last address
– Once an entity has been found we can find the current address by following forwarding pointers

Drawbacks
– The chain for a mobile entity can become very long!
– What happens if part of the chain is unreliable?
– Scalability?
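A minimal sketch of following a chain of forwarding pointers, assuming the pointers are held in a plain dictionary. The addresses `A` to `D`, the `down` set and the `max_hops` limit are hypothetical, chosen only to make both drawbacks visible.

```python
def follow_forwarding_pointers(start, forward, down=frozenset(), max_hops=100):
    """Follow pointers from a known old address to the current address.

    forward maps an old address to the address the entity moved to next.
    down is the set of addresses whose host is currently unreachable.
    """
    address, hops = start, 0
    while address in forward:
        if address in down:
            # One unreliable link makes the entity unreachable.
            raise LookupError(f"chain broken at {address}")
        if hops >= max_hops:
            raise RuntimeError("chain too long (or cyclic)")
        address = forward[address]
        hops += 1
    return address, hops


# The entity started at A and has moved three times.
forward = {"A": "B", "B": "C", "C": "D"}
current, hops = follow_forwarding_pointers("A", forward)
```

Every move adds one hop to each later lookup that starts from an old address, and the lookup fails as soon as any intermediate host is in `down`.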

Home Based Approaches

A Home Location keeps track of the current location of an entity.
– The current location is the ‘Care of’ address of the entity

If a request comes it is first routed to the home location, and then forwarded to the current location
– With the client being updated with the new location.
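The home based scheme can be sketched roughly as follows. The class names, the entity name `laptop` and the addresses are hypothetical, and the sketch leaves out what happens when the entity moves again after a client has learned its location.

```python
class HomeLocation:
    """Keeps track of the current (care-of) address of each entity."""

    def __init__(self):
        self._care_of = {}

    def register(self, entity, care_of_address):
        # Called by the entity whenever it moves.
        self._care_of[entity] = care_of_address

    def lookup(self, entity):
        return self._care_of[entity]


class Client:
    def __init__(self, home):
        self.home = home
        self.known = {}          # current locations learned so far
        self.home_contacts = 0   # how often the home location was involved

    def send(self, entity, message):
        if entity not in self.known:
            # First request goes via the home location, which tells
            # the client where the entity currently is.
            self.home_contacts += 1
            self.known[entity] = self.home.lookup(entity)
        return self.known[entity], message


home = HomeLocation()
home.register("laptop", "203.0.113.9")
client = Client(home)
first = client.send("laptop", "hello")
second = client.send("laptop", "hello again")
```

Only the first message involves the home location; the second one goes straight to the learned address.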

Home Based Approaches

Home Location Drawbacks

Communication latency due to potential distances between locations

What if the Home Location doesn’t exist or is unavailable?

What if the entity decides to move permanently?

Distributed Hash Tables

A hash function is used to allocate random identifiers to nodes and keys to entities

An entity with key k is under the jurisdiction of the node with the smallest id >= k

If a node needs to find an entity that isn’t under its jurisdiction it could simply check with its predecessor or successor node.
– This is made more efficient by storing a finger table of shortcuts to nodes at increasing distances around the identifier space.
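The lookup described above resembles Chord-style routing, and a minimal sketch under that assumption follows. The 5-bit identifier space and the node ids in `NODES` are hypothetical example values, and for simplicity every node computes finger tables from the global node list, which a real node could not do.

```python
from bisect import bisect_left

M = 5                 # bits per identifier (assumed for the example)
RING = 2 ** M         # identifiers live in [0, RING)
NODES = [1, 4, 9, 11, 14, 18, 20, 21, 28]   # hypothetical node ids, sorted


def successor(k):
    """Node responsible for key k: smallest id >= k, wrapping past the top."""
    i = bisect_left(NODES, k % RING)
    return NODES[i % len(NODES)]


def finger_table(n):
    """Entry i is a shortcut to successor(n + 2**i), for i = 0 .. M-1."""
    return [successor(n + 2 ** i) for i in range(M)]


def between(x, a, b, inclusive_right):
    """Is x in the ring interval (a, b), or (a, b] if inclusive_right?"""
    if x == b:
        return inclusive_right
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(start, k):
    """Route a request for key k from node start; return (owner, path)."""
    k %= RING
    n, path = start, [start]
    while n != k:
        succ = finger_table(n)[0]
        if between(k, n, succ, inclusive_right=True):
            n = succ                      # the next node on the ring owns k
            path.append(n)
            break
        # Otherwise jump as far as possible without overshooting k.
        n = next(f for f in reversed(finger_table(n))
                 if between(f, n, k, inclusive_right=False))
        path.append(n)
    return n, path


owner, path = lookup(1, 26)
```

Here `lookup(1, 26)` reaches node 28 in four hops instead of walking the ring node by node, because each finger jump covers a large part of the remaining distance.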

Distributed Hash Table

Distributed Hash Tables

With randomly assigned ids the requests could be routed across long distances

Topology based assignment of node identifiers
– Make sure that nearby nodes get nearby ids

Proximity Routing
– By storing multiple successors & predecessors a node can choose to check with a nearby node, assuming it satisfies the conditions (< or >) of the key

Hierarchical Approaches

The network is divided into a collection of domains, each with subdomains until you reach a leaf domain

Each domain D has an associated directory node dir(D), which leads to a tree of directory nodes.
– With a root directory node at the top.

Hierarchical Approaches

[Figure: hierarchy of domains, with labels Root Directory, Top Level Domain, Subdomain and Leaf Domain]

Location Records

Each directory node has a location record for each entity within its domain
– If an entity is within a subdomain then the record points to the directory node of the subdomain containing the entity.

If an entity has multiple locations (is replicated) a directory node may contain more than one reference for the entity

Location Records

Look Up

Look Up is done through ever increasing circles, based on locality.

Consider Worst Case?
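A rough sketch of hierarchical look up and insertion, assuming one location per entity (no replication). The domain names, the entity `E` and its address are hypothetical; hops count the directory nodes visited after the starting one.

```python
class DirNode:
    """Directory node dir(D) for a domain D."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        # entity -> child DirNode, or an address in a leaf domain
        self.records = {}


def insert(leaf, entity, address):
    """Record the address in the leaf domain and a pointer in every ancestor."""
    leaf.records[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        node.records[entity] = child
        child, node = node, node.parent


def lookup(start_leaf, entity):
    """Return (address, hops) for a request issued in the domain of start_leaf."""
    node, hops = start_leaf, 0
    while entity not in node.records:                  # climb: widen the circle
        if node.parent is None:
            raise KeyError(entity)
        node, hops = node.parent, hops + 1
    while isinstance(node.records[entity], DirNode):   # descend along pointers
        node, hops = node.records[entity], hops + 1
    return node.records[entity], hops


root = DirNode("root")
europe, asia = DirNode("europe", root), DirNode("asia", root)
amsterdam = DirNode("amsterdam", europe)
chiang_mai, bangkok = DirNode("chiang_mai", asia), DirNode("bangkok", asia)
insert(chiang_mai, "E", "addr-of-E")
```

A request from `bangkok` is answered within the `asia` domain in two hops, while a request from `amsterdam` has to climb to the root and descend again, which is the worst case in this tree.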

Insertion

Structured Naming

Flat names are convenient for machines,
– But not really for humans

File naming & host naming allow convenient human friendly names.

Here we discuss Namespaces & Name Resolution

Namespaces

Names can be represented as a labeled, directed graph.

2 Types of node
– Leaf Nodes: hold the address of a named entity, or the actual entity; no outgoing edges
– Directory Nodes: have a number of outgoing edges, each labeled with a name

Naming Graph with 1 root node

Naming Graphs

Most have a single root

Many are strictly hierarchical
– Making them into a tree where each node (except the root) has exactly 1 incoming edge

Some are directed acyclic graphs (as in previous slide)
– Each node can have multiple incoming edges, but no cycles allowed

Aliases

In the previous example the entity “/keys” has an alias “/home/steen/keys”
– Multiple absolute paths referring to the same node (Hard links)

An alternative is to use symbolic links
– When resolving “/home/steen/keys” the absolute path “/keys” is returned.
– (As in the following slide)
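The two kinds of alias can be sketched with a small naming graph in which directory nodes are plain dictionaries. This is an illustration only: the path `/home/max/keys` is a hypothetical addition so that a hard link and a symbolic link can be shown side by side, and the `max_links` limit is an assumption to stop endless link chains.

```python
class Leaf:
    """Leaf node: holds the entity itself (here just some data)."""

    def __init__(self, data):
        self.data = data


class SymLink:
    """Leaf node that stores an absolute path name instead of an entity."""

    def __init__(self, target):
        self.target = target


def resolve(root, path, max_links=8):
    """Follow the labels of an absolute path, starting at the root node."""
    node = root
    for label in filter(None, path.split("/")):
        node = node[label]
        if isinstance(node, SymLink):
            if max_links == 0:
                raise RuntimeError("too many symbolic links")
            # Resolution restarts from the root with the stored path.
            node = resolve(root, node.target, max_links - 1)
    return node


keys = Leaf("key data")
root = {
    "keys": keys,
    "home": {
        # Hard link: a second edge pointing at the very same node.
        "steen": {"keys": keys, "mbox": Leaf("mail")},
        # Symbolic link: a node that only stores the path "/keys".
        "max": {"keys": SymLink("/keys")},
    },
}
```

Both `/home/steen/keys` and `/home/max/keys` end at the same leaf, but the symbolic link costs a second resolution pass from the root.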

Symbolic Link

Name Resolution

Resolving a name involves following a path through the graph;

– E.g. /home/steen/mbox

Closure Mechanism
– Resolution works on the assumption that we know where to start the path from, i.e. where is the root node? Is it a node in a higher graph? Have we already resolved that node?
– What would you do with the string 0031204430784?

Mounting Points

A directory node can store the identifier of a directory node from a different namespace.
– The node storing the identifier is the mount point; the foreign directory node it refers to is the Mounting Point

To mount a foreign namespace in a distributed system we need:
– The name of an access protocol
– The name of the server
– The name of the mounting point in the foreign name space

For Example – ftp://cis.payap.ac.th
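As a small illustration, the three pieces of information can be pulled out of the example URL with Python's standard `urllib.parse.urlsplit`. The helper name `mount_info` is hypothetical, and treating a missing path as the root `/` of the foreign name space is an assumption.

```python
from urllib.parse import urlsplit


def mount_info(url):
    """Split a URL into (access protocol, server, mounting point)."""
    parts = urlsplit(url)
    # Assumption: no path means the root of the foreign name space.
    return parts.scheme, parts.hostname, parts.path or "/"


protocol, server, mounting_point = mount_info("ftp://cis.payap.ac.th")
```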

Foreign Mounting Point

Namespace Implementation

A naming service is implemented by name servers
– For large scale DS, it is distributed across multiple servers

This is separated into layers
– Global Layer: high level nodes (root node and neighbours), hence relatively fixed & stable.
– Administrational Layer: nodes from within a single organisation, e.g. groups of entities, perhaps a node for each department in an organisation
– Managerial Layer: frequently changing nodes, e.g. hosts in a local network

DNS example

Global Layer

High Availability is particularly necessary
– If a global layer server fails, a large part of the namespace will be unavailable, as resolution can not continue past the failed server.

But, as names rarely change, clients can cache the results
– So speedy results are not as important as availability

Normally implemented using replicated servers

Administrational Layer

Availability is important – for clients in the same organisation as the nameserver, but less important for those outside of the organisation.

Responsiveness is much more important at this layer
– Updates need to be processed more quickly, e.g. a new user account needs to be processed quickly.

Managerial Layer

Availability is less demanding
– Can be managed on a single machine

Performance is crucial
– Responses should be immediate

Layer Comparison

Name Resolution Implementation

Choices:
– Iterative or Recursive?

Let's consider needing to resolve:
– root:<nl, vu, cs, ftp, pub, globe, index.html>
– Otherwise known as:
– ftp://ftp.cs.vu.nl/pub/globe/index.html

Iterative Resolution

root:<nl, vu, cs, ftp, pub, globe, index.html>

The root server resolves ‘nl’ and returns that location to the client
– Remaining pathname: nl: <vu, cs, ftp, pub, globe, index.html>

The nl nameserver resolves ‘vu’
– Remaining pathname: vu: <cs, ftp, pub, globe, index.html>

The vu nameserver resolves ‘cs’ and ‘ftp’
– Remaining pathname: ftp: <pub, globe, index.html>

Then the ftp server can return the requested file.

Each time the location of the next server is returned to the client and the client makes a new request.
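The iterative steps above can be sketched as a loop driven by the client. The `ZONES` table is hypothetical toy data standing in for the name servers, and the `@` prefix is just a convention of this sketch for "contact this server next".

```python
# Hypothetical zone data: each name server knows only the labels below it.
# A string starting with "@" names the server to contact next.
ZONES = {
    "root": {"nl": "@nl"},
    "nl": {"vu": "@vu"},
    "vu": {"cs": {"ftp": "@ftp"}},   # vu resolves both 'cs' and 'ftp'
}


def resolve_step(server, path):
    """Resolve as many leading labels as server can; return (next, rest)."""
    node, used = ZONES[server], 0
    while isinstance(node, dict):
        node = node[path[used]]
        used += 1
    return node[1:], path[used:]


def resolve_iteratively(path):
    server, contacted = "root", []
    while server in ZONES:           # the client drives every step itself
        contacted.append(server)
        server, path = resolve_step(server, path)
    return server, path, contacted


PATH = ("nl", "vu", "cs", "ftp", "pub", "globe", "index.html")
server, remaining, contacted = resolve_iteratively(PATH)
```

The client ends up having contacted `root`, `nl` and `vu` in turn, and is left with the `ftp` server and the remaining path `pub/globe/index.html`.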

Iterative Name Resolution

Recursive Resolution

root:<nl, vu, cs, ftp, pub, globe, index.html>

The nameserver passes the request on to the next nameserver it finds;
– i.e. root identifies nl and passes on the request:
– nl: <vu, cs, ftp, pub, globe, index.html>
– nl passes on the request to vu:
– vu: <cs, ftp, pub, globe, index.html>
– vu passes on the request to ftp:
– ftp: <pub, globe, index.html>

Finally the results are returned to the client back through the chain.
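A minimal sketch of the recursive variant, in which each server hands the request to the next one and the answer travels back along the same chain (modelled here by the return of nested function calls). The `ZONES` table and the `@` convention are hypothetical toy data.

```python
# Hypothetical zone data; "@name" means "pass the request on to server name".
ZONES = {
    "root": {"nl": "@nl"},
    "nl": {"vu": "@vu"},
    "vu": {"cs": {"ftp": "@ftp"}},   # vu resolves both 'cs' and 'ftp'
}


def resolve_step(server, path):
    """Resolve as many leading labels as server can; return (next, rest)."""
    node, used = ZONES[server], 0
    while isinstance(node, dict):
        node = node[path[used]]
        used += 1
    return node[1:], path[used:]


def resolve_recursively(server, path, chain):
    """The server itself forwards the request; chain records who handled it."""
    chain.append(server)
    if server not in ZONES:
        return server, path          # end of the chain: start returning
    next_server, rest = resolve_step(server, path)
    return resolve_recursively(next_server, rest, chain)


PATH = ("nl", "vu", "cs", "ftp", "pub", "globe", "index.html")
chain = []
result = resolve_recursively("root", PATH, chain)
```

The client sends a single request to `root`; everything else happens between the servers.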

Recursive Resolution

Recursive vs Iterative

Recursive places more demands on the servers
– Which generally makes it prohibitive for global layer servers dealing with many requests

Recursive vs Iterative

Recursive name resolution enables each server to learn the address of lower level nodes
– And cache these results

This makes subsequent requests much quicker
– The results can be cached both by the root server and every other server in the chain
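The caching effect can be sketched by giving every server in a recursive chain its own cache. The zone data, the `@` convention and the unbounded, never-expiring cache are all simplifying assumptions of this sketch.

```python
from collections import defaultdict

# Hypothetical zone data; "@name" means "pass the request on to server name".
ZONES = {
    "root": {"nl": "@nl"},
    "nl": {"vu": "@vu"},
    "vu": {"cs": {"ftp": "@ftp"}},
}

cache = defaultdict(dict)   # server -> {path it was asked for: result}
contacted = []              # every server that had to handle a request


def resolve_step(server, path):
    """Resolve as many leading labels as server can; return (next, rest)."""
    node, used = ZONES[server], 0
    while isinstance(node, dict):
        node = node[path[used]]
        used += 1
    return node[1:], path[used:]


def resolve(server, path):
    contacted.append(server)
    if path in cache[server]:
        return cache[server][path]          # answered without going further
    next_server, rest = resolve_step(server, path)
    if next_server in ZONES:
        result = resolve(next_server, rest)
    else:
        result = (next_server, rest)
    cache[server][path] = result            # learned on the way back
    return result


PATH = ("nl", "vu", "cs", "ftp", "pub", "globe", "index.html")
first = resolve("root", PATH)
second = resolve("root", PATH)
```

The first request passes through `root`, `nl` and `vu`; the second is answered by `root` alone, and `nl` and `vu` hold cached results of their own as well.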

Recursive Caching

Recursive vs Iterative

Recursive can also be cheaper in terms of communication
– Consider if the request in the example given was made from Chiang Mai…
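A back-of-the-envelope sketch of that comparison, assuming the name servers are close to each other and all far from the client. The round-trip times (200 ms long distance, 5 ms between servers) and the count of three servers are made-up illustrative figures, not measurements.

```python
def iterative_cost(n_servers, long_rtt):
    # The distant client talks to every server itself.
    return n_servers * long_rtt


def recursive_cost(n_servers, long_rtt, local_rtt):
    # One long-distance round trip, then the servers talk to each other.
    return long_rtt + (n_servers - 1) * local_rtt


# Assumed figures: 3 name servers, 200 ms long haul, 5 ms between servers.
iterative_ms = iterative_cost(3, 200)
recursive_ms = recursive_cost(3, 200, 5)
```

Under these assumptions iterative resolution costs 600 ms of round trips against 210 ms for recursive resolution, and the gap widens with every extra server in the chain.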