EMERGING TRENDS IN DISTRIBUTED SYSTEM
PREPARED BY: G.S.MISHRA
UNIT-2
PEER-2-PEER SYSTEMS
WHAT IS PEER TO PEER?
Analogous to a telephone conversation
Two people of equal status communicate
A point to point connection
Definition:
P2P is a class of applications that takes advantage of resources e.g. storage, cycles, content, human presence, available at the edges of the Internet.
P2P CONTD…
For example, Gnutella differs from client/server based systems in two key ways:
• A peer can act as both a client and a server
• The network is completely decentralized and has
no central point of control.
Peers in a Gnutella network are typically
connected to three or four other nodes and to
search the network a query is broadcast
throughout the network.
HISTORY OF P2P
The Internet started as a peer-to-peer system.
The goal of the original ARPANET was to share
computing resources around the USA.
Its challenge was to connect a set of distributed
resources, using different network connectivity,
within one common network architecture.
The first hosts on the ARPANET were at US research
sites, e.g., the University of California, Los Angeles (UCLA),
UC Santa Barbara, SRI and the University of Utah.
CONTD….
These were already independent computing sites
with equal status and the ARPANET connected
them as such, not in a master/slave or client/server
relationship but rather as equal computing peers.
From the late 1960s until 1994, the Internet had
one model of connectivity.
Machines were assumed to be always switched on,
always connected, and assigned permanent IP
addresses.
CONTD…
With the invention of Mosaic, another model began
to emerge in the form of users connecting to the
Internet from dial-up modems.
This created a second class of connectivity
because PCs would enter and leave the network
frequently and unpredictably.
Because ISPs began to run out of IP addresses,
they began to assign IP addresses dynamically for
each session, giving each PC a different, possibly
masked, IP address.
CONTD…
This transient nature and instability prevented PCs
from being assigned permanent DNS entries, and
therefore prevented most PC users from hosting
any data or network-facing applications locally.
For a few years, treating PCs as clients worked
well. Over time though, as hardware and software
improved, the unused resources that existed
behind this veil of second-class connectivity
started to look like something worth getting at.
CONTD…
Given the vast array of available processors
mentioned earlier, the software community is
starting to take P2P applications very seriously.
Most importantly, P2P research is concerned with
addressing some of the main difficulties of current
distributed computing: scalability, reliability and
interoperability.
BINDING OF PEERS
Within today’s Internet, we rely on fixed IP addresses.
When a user types an address into his/her Web
browser (such as http://www.google.com/), the Web
server address is translated into the IP address (e.g.,
168.127.47.8) by a domain name server (DNS).
The Internet protocol (IP) then makes a routing
decision based on the IP Address.
If DNS is unavailable then typing http://168.127.47.8/
into a browser would be equivalent since the Web
page is permanently bound to the IP address.
THE PROCESS WHEREBY AN INTERNET ADDRESS IS CONVERTED INTO THE IP ADDRESS FOR LOCATING A WEB PAGE ON THE INTERNET
CONTD…
The above example shows Early Binding. Early bindings form a simple architecture very similar
to an address book on a mobile phone e.g., the person’s name is statically bound to his/her
telephone number.
If a Web site changed its IP address several times a
day then static binding will become impractical.
Devices often do not have a fixed address because they are
hidden behind Network Address Translation (NAT)
systems, and therefore need a late binding of their
addresses to their network identifier.
A P2P ENVIRONMENT: DEVICES ARE CONNECTED BEHIND NATS AND FIREWALLS
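The difference between early and late binding can be sketched in a few lines of Python. This is only an illustration: the host name www.example.org, the peer identifier, the addresses in the usage note and the LateBindingRegistry class are all hypothetical, and a real system would involve DNS or a P2P discovery service rather than an in-memory dictionary.

```python
# Early binding: the name-to-address mapping is fixed ahead of time,
# like a name stored next to a number in a phone's address book.
STATIC_BOOK = {"www.example.org": "168.127.47.8"}  # hypothetical entry


def resolve_early(name):
    """Look the name up in the fixed table; the answer never changes."""
    return STATIC_BOOK[name]


class LateBindingRegistry:
    """Hypothetical registry in which peers announce their current address."""

    def __init__(self):
        self._current = {}  # peer identifier -> most recently announced address

    def announce(self, peer_id, address):
        # Called whenever a peer comes online or its address changes.
        self._current[peer_id] = address

    def withdraw(self, peer_id):
        # Called when a peer leaves the network.
        self._current.pop(peer_id, None)

    def resolve(self, peer_id):
        # The binding is looked up at the moment it is needed.
        return self._current.get(peer_id)
```

A peer that reconnects with a new address simply calls announce again, so later calls to resolve return the new address without any static table being edited.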
MODERN DEFINITION: P2P
P2P is a class of applications that takes advantage of
resources e.g. storage, cycles, content, human
presence, available at the edges of the Internet
A peer can act as both a client and a server (they call
these servents i.e. server and client in Gnutella.)
The network is completely decentralized and has no
central point of control.
Peers in a Gnutella network are typically connected to
three or four other nodes and to search the network a
query is broadcast throughout the network.
SOCIAL IMPACTS OF P2P
Vaidhyanathan:
what we call P2P communicative networks actually
reflect and amplify - revise and extend - an old ideology
or cultural habit.
Electronic peer-to-peer systems like Gnutella merely
simulate other, more familiar forms of unmediated,
uncensorable, irresponsible, troublesome speech;
for example, anti-royal gossip before the French
revolution, trading cassette tapes among youth
subcultures as punk or rap, or the illicit Islamist cassette
tapes through the streets and bazaars of Cairo.
RAINSFORD : “INFORMATION FEUDALISM”
The current push for control over intellectual
property rights has bred a situation analogous to
the feudal agricultural system in the medieval
period.
In effect, songwriters and scientists work for
corporate feudal lords, licensing their own
inventions in exchange for a living and the right to
‘till the lands’ of the information society.
TRUE P2P
Within P2P, there are three categories of systems:
Centralized systems: where every peer connects to a server which coordinates and manages communication. Examples include CPU-sharing applications, e.g., SETI@Home.
Brokered systems: where peers connect to a server in order to discover other peers, but then manage the communication themselves (e.g., Napster).
Decentralized systems: where peers run independently without the need for centralized services. Here, the discovery is decentralized and the communication takes place between the peers. Peers do not need a known centralized service in order to operate, e.g., Gnutella, Freenet.
THE P2P ENVIRONMENT
Peers are: extremely transient (they are
continually disappearing and reappearing)
connections are often multi-hop (i.e., packets
travel via several intermediaries before they
reach their destination)
peers reside in hostile environments (i.e., they
live behind NAT routing systems and
firewalls).
THE P2P ENVIRONMENT
Various devices are used to partition a network:
Hubs: A hub is a repeater that works at the physical (lowest) layer of OSI. A hub takes data that comes into a port and sends it to the other ports in the hub.
Switches and Bridges: These are pretty similar. Both operate at the Data Link layer (just above Physical) and both can filter data so that only the appropriate segment or host receives a transmission.
Routers: These work at the Network layer of OSI (above Data Link) and operate on the IP address. Like switches and bridges, they filter by only forwarding packets destined for remote networks, thus minimizing traffic, but are significantly more complex than any other networking device.
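The contrast between a hub that repeats everything and a switch that filters can be sketched as follows. This is a simplified model, not a description of any particular device: the port numbers, the short MAC strings and the LearningSwitch class are assumptions made for illustration.

```python
def hub_forward(in_port, ports):
    """A hub repeats incoming data to every port except the one it arrived on."""
    return [p for p in ports if p != in_port]


class LearningSwitch:
    """Hypothetical switch that learns which port each address lives on."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port where it was last seen

    def forward(self, in_port, src_mac, dst_mac):
        # Learn (or refresh) the location of the sender.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Destination not yet known: behave like a hub for this frame.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the segment the frame came from: filter it.
            return []
        # Destination known: only that port receives the frame.
        return [out_port]
```

Once the switch has seen traffic from both hosts, frames between them reach a single port, whereas the hub keeps sending every frame to all other ports.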
NAT SYSTEMS
A network address translation system allows a single
device, such as a router, to act as an agent between the
Internet (public network) and a local (private) network.
i.e. only a single, unique IP address is required to
represent an entire group of computers.
The internal network is usually a LAN; commonly
referred to as the stub domain.
A stub domain is a LAN that uses IP addresses internally.
Any internal computers that use unregistered IP
addresses must use NAT to communicate with the rest of
the world.
CONTD…
There are two types of NAT: static and dynamic.
Static NAT involves mapping an unregistered IP
address to a registered IP address on a one-to-one
basis.
Particularly useful when a device needs to be
accessible from outside the network
e.g. the computer with the IP address of 192.168.0.0
will always translate to 131.251.45.110
CONTD…
Dynamic NAT, on the other hand, maps an
unregistered IP address to a registered IP address
from a group of local dynamically allocatable IP
addresses,
i.e., the stub domain computers will be allocated an
address from a specified range of addresses, e.g.,
192.168.0.0 to 192.168.0.50
A NAT SYSTEM CAN ALLOCATE DYNAMIC ADDRESSES OR TRANSLATE FROM FIXED STUB DOMAIN ADDRESSES TO OUTSIDE ONES.
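Static and dynamic NAT can be sketched as two small lookup tables. This is a minimal model under stated assumptions: the StaticNat and DynamicNat classes are hypothetical, the pool addresses in the usage note are invented for illustration, and real NAT devices also rewrite ports and track connection state, which is omitted here.

```python
class StaticNat:
    """One-to-one mapping that is configured once and never changes."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)  # inside address -> outside address

    def translate(self, inside_ip):
        return self.mapping[inside_ip]


class DynamicNat:
    """Hands out addresses from a pool on demand and reclaims them later."""

    def __init__(self, pool):
        self.free = list(pool)  # addresses not currently in use
        self.active = {}        # inside address -> address taken from the pool

    def translate(self, inside_ip):
        if inside_ip not in self.active:
            if not self.free:
                raise RuntimeError("address pool exhausted")
            self.active[inside_ip] = self.free.pop(0)
        return self.active[inside_ip]

    def release(self, inside_ip):
        # Return the address to the pool when the host no longer needs it.
        addr = self.active.pop(inside_ip, None)
        if addr is not None:
            self.free.append(addr)
```

For example, StaticNat({"192.168.0.0": "131.251.45.110"}) reproduces the fixed mapping from the static NAT slide, while DynamicNat(["131.251.45.111", "131.251.45.112"]) gives whichever pool address is free at the time, so the same host may receive a different address after a release.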
FIREWALLS
A firewall is a system designed to prevent
unauthorized access to or from a private network.
All messages entering or leaving the computer
system pass through the firewall, which examines
each message and blocks those that do not meet
the specified security criteria.
Specifically, firewalls are implemented by blocking
certain ports, thereby disabling certain types of
services that operate on those ports.
A FIREWALL BLOCKS TRAFFIC TO AND FROM SPECIFIED PORTS
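Port-based blocking can be sketched as a simple filter function. The policy below is hypothetical: it blocks port 23 (Telnet) and port 6346 (the port traditionally associated with Gnutella) purely as an example, and the packet is modelled as a plain dictionary rather than a real network packet.

```python
# Hypothetical security policy: ports on which traffic is refused.
BLOCKED_PORTS = {23, 6346}


def allow(packet, blocked_ports=BLOCKED_PORTS):
    """Return True if the packet may pass, False if the firewall drops it.

    Traffic is dropped when either its source or its destination port
    is on the blocked list, i.e. traffic to and from those ports.
    """
    return (packet["dst_port"] not in blocked_ports
            and packet["src_port"] not in blocked_ports)


def filter_traffic(packets, blocked_ports=BLOCKED_PORTS):
    """Keep only the packets that meet the security criteria."""
    return [p for p in packets if allow(p, blocked_ports)]
```

Under this policy a Web request to port 80 passes, while anything addressed to or sent from port 6346 is dropped, which disables the service that normally runs on that port.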
P2P OVERLAY NETWORKS
P2P implementations frequently involve the creation of
overlay networks with a structure that is completely
independent of that of the underlying network of
connected devices.
The purpose of overlay networks is to abstract
the complicated connectivity of a P2P network into a
higher-level programmatic view of the peers that make
up the network.
For example, within JXTA, a virtual network overlay sits
on top of the physical devices and is organized into
transient or persistent relationships, which JXTA calls peer
groups.
CONTD…
In JXTA, connections are represented through the use of
virtual pipes.
Virtual pipes simply define the endpoints of the
connection and leave it to the underlying
mechanisms to implement the appropriate
behaviour for that environment
e.g., for TCP, a fixed point-to-point connection is
created for the pipe but for UDP pipes this is not
required and therefore the pipe remains
connectionless.
AN ILLUSTRATION OF THE NOTION OF AN OVERLAY NETWORK.
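The TCP versus UDP distinction behind a pipe can be sketched with Python's standard socket module. This is not JXTA code and does not use any JXTA API: the two function names are hypothetical, both endpoints run in one process on the loopback interface, and the sketch assumes a small payload.

```python
import socket


def tcp_pipe_roundtrip(payload: bytes) -> bytes:
    """Deliver payload over TCP: a connection is set up before any data flows."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        server.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(2.0)
            client.connect(server.getsockname())  # fixed point-to-point connection
            conn, _ = server.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(payload)
                data = b""
                # TCP is a byte stream, so read until the whole payload arrived.
                while len(data) < len(payload):
                    chunk = conn.recv(len(payload) - len(data))
                    if not chunk:
                        break
                    data += chunk
                return data


def udp_pipe_roundtrip(payload: bytes) -> bytes:
    """Deliver payload over UDP: each datagram is addressed individually."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            # No connect/accept step: the pipe stays connectionless.
            sender.sendto(payload, receiver.getsockname())
            data, _ = receiver.recvfrom(65535)
            return data
```

In both cases the caller only names the endpoints and hands over the payload; whether a connection is established underneath is left to the transport, which is the idea the virtual pipe abstraction captures.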
P2P EXAMPLE APPLICATIONS
MP3 File Sharing with Napster
Napster, the famous MP3 file sharing program, was launched in 1999.
It had a revolutionary impact on the Internet due to its notoriety for the illegal sharing of MP3 files and its unique design, i.e., after the initial centralized Napster search, clients connected to each other and exchanged data directly from one system’s disk to another.
Napster is P2P because the Napster peers bypass DNS and because once the Napster server resolves the IP address of the PCs hosting a particular song, it shifts control of the file transfers to the nodes.
However, because peers depend on the central Napster server to find each other, Napster is an example of brokered P2P rather than a fully decentralized system.
THE NAPSTER SCENARIO FOR PROVIDING A DISTRIBUTED FILE SYSTEM FOR MUSIC FILES.
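The brokered pattern can be sketched as a central index plus peers that transfer files among themselves. This is an illustration only and not Napster's actual protocol: the IndexServer and Peer classes, the peer names and the in-memory "transfer" are all hypothetical.

```python
class IndexServer:
    """Hypothetical central index: knows who has what, never stores the files."""

    def __init__(self):
        self.index = {}  # title -> set of peer names hosting it

    def register(self, peer_name, titles):
        for title in titles:
            self.index.setdefault(title, set()).add(peer_name)

    def search(self, title):
        return sorted(self.index.get(title, ()))


class Peer:
    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})  # title -> file contents

    def serve(self, title):
        # Called directly by another peer; the server is not involved.
        return self.files[title]

    def download(self, title, server, peers):
        # Step 1: centralized search for peers hosting the title.
        hosts = [h for h in server.search(title) if h != self.name]
        if not hosts:
            return None
        # Step 2: direct transfer from the chosen peer.
        data = peers[hosts[0]].serve(title)
        self.files[title] = data
        return data
```

The server takes part only in the lookup step. If it is unreachable, peers that already know each other could still exchange files, but no new searches are possible, which is the weakness of the brokered approach.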
INSTANT MESSAGING WITH ICQ
One of the most popular instant messaging programs
ICQ notifies users when their friends come online and
allows them to send messages to each other.
Apart from its instant messaging capabilities it allows
users to exchange files.
ICQ is a hybrid of the decentralized and client/server
architectures
It uses a central server to monitor the users that are
currently on line and to notify interested parties when
new users connect to the network.
CONTD…
All other communication between users is
conducted between the users directly.
Therefore, this employs a brokered P2P architecture,
similar to Napster, having a central database of
users for lookup purposes only, with communication
taking place independently of this central authority.
THE ICQ SCENARIO USES A BROKERED APPROACH, WITH A CENTRAL DATABASE TO STORE USERS’ INFORMATION.
FILE SHARING WITH GNUTELLA
Gnutella is a ‘true P2P’ system. It does not rely on central control for lookup, organization and communication.
To find a first peer, a new node can use a GnuCache, which acts as a lookup server for a list of Gnutella nodes.
Another method: e.g., use newsgroups, Web sites, etc. to get lists of nodes.
The node joins the network by connecting initially to one
Gnutella node, which can be any node on the network
making it generally easy to join in a decentralized fashion.
Once it has joined, the node discovers other nodes through
the first node by issuing ping descriptors and receiving pong
descriptors from peers accepting connections.
CONTD…
Gnutella nodes typically connect to three nodes and
then search by broadcasting their search request to
all connected neighbours, as illustrated here.
Each neighbour repeats this search request to
his/her neighbours and so on, which is known as
flooding the network.
Here, User D has the required file so User A
connects directly to User D and downloads the file
using this point-to-point connection.
GNUTELLA DECENTRALIZED APPROACH. THERE ARE TWO ASPECTS TO DISCOVERY: JOINING THE NETWORK AND THEN DISCOVERING OTHER PEERS.
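Flooding can be sketched as a breadth-first traversal of the peer graph with a hop limit. This is a simplified model, not the Gnutella wire protocol: the flood_search function, the dictionary-based graph and the default ttl of 7 are assumptions, and the sketch ignores how query hits are routed back to the requester.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl=7):
    """Flood a query from `start`; return peers holding `wanted` within ttl hops.

    graph: peer name -> list of neighbouring peer names
    files: peer name -> set of file names that peer shares
    """
    seen = {start}                 # peers that have already received the query
    queue = deque([(start, ttl)])  # (peer, hops the query may still travel)
    hits = []
    while queue:
        node, hops_left = queue.popleft()
        if node != start and wanted in files.get(node, ()):
            hits.append(node)
        if hops_left == 0:
            continue  # hop limit reached: this peer does not forward further
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:  # a peer handles each query only once
                seen.add(neighbour)
                queue.append((neighbour, hops_left - 1))
    return hits
```

With a graph in which User A is connected to B and C, and B is connected to D, a search from A for a file held only by D reports D as a hit, after which A would contact D directly for the download. The hop limit keeps a query from circulating forever, at the cost of missing peers that are too far away.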