Master Thesis - TRITA-ICT-EX-2012-262
LIBSWIFT P2P PROTOCOL: AN
ANALYSIS AND EXTENSION
Fu Tang
Design and Implementation of ICT products and systems
Royal Institute of Technology (KTH)
October 30th, 2012
Supervisor: Flutra Osmani
Examiner: Björn Knutsson
Department: ICT School, NSLab
Royal Institute of Technology (KTH)
Abstract
More and more end-users are using P2P protocols for content sharing,
on-demand and live streaming, contributing considerably to overall Internet
traffic. A novel P2P streaming protocol named libswift was developed
to give users a better service while consuming fewer resources
and transferring less unnecessary metadata. This master
thesis studies the inner functioning of libswift and analyzes some of the
vulnerabilities that directly impact performance of the protocol, namely
download speed and response delay.
By investigating the behavior of libswift in scenarios with multiple peers,
we found that the lack of a peer selection mechanism inside the protocol
affects download efficiency and response time. We also discovered that
libswift's internal piece picking algorithm creates competition among peers,
thus not fully utilizing connected peers. In addition, we found that the current
libswift implementation does not follow the specification for PEX peer
discovery, so we modified the PEX algorithm to support an additional message
that is used to proactively request new peers from those currently connected.
Having made these observations, we designed and implemented a peer
selection extension interface that allows for third-party peer selection
mechanisms to be used with the libswift protocol. We then tested the
interface (or adapter) with an example peer selection mechanism that groups
peers according to properties such as latency and locality. Preliminary
experimental data shows that using our extension with an external peer
selection mechanism enables libswift to select peers based on various metrics
and thus enhances its download speed.
We argue that libswift is a good protocol for next-generation content
delivery systems and that it can achieve faster data transfer rates and lower
latency by integrating efficient peer selection mechanisms.
Contents

I Background

1 Introduction
1.1 Basic Introduction of Libswift
1.2 Problem Description
1.3 Motivation
1.4 Hypothesis
1.5 Goal
1.5.1 Tasks
1.6 Structure

2 Background and Related Work
2.1 Peer-to-Peer networks
2.2 Peer-to-Peer protocols
2.3 BitTorrent protocol
2.3.1 Peer tracking and piece picking
2.4 Libswift protocol
2.4.1 Basic operations in libswift
2.5 Related work
2.6 Experimental Evaluation
2.6.1 Extensions for libswift
2.6.2 Software resources and tools

3 Overview and Analysis
3.1 Peer discovery basics
3.2 Peer exchange and selection
3.3 Analysis of multiple-channel download
3.4 Congestion control

II Design and Implementation

4 Protocol Extensions
4.1 PEX REQ design
4.2 Design of the adapter
4.2.1 Interaction between components

5 Implementation
5.1 Libswift modifications
5.2 Implementation of the adapter module
5.3 Integration with PeerSelector

6 Experimental Evaluation
6.1 PEX modifications
6.2 Adapter extension
6.2.1 Experimental results
6.3 Comparison of policies
6.3.1 Side effects of PeerSelector

III Discussion and Conclusions

7 Discussion
7.1 Libswift
7.2 Extensions

8 Conclusions

A Appendix
List of Figures

2.1 P2P network
2.2 Centralized network
2.3 Metainfo with tracker
2.4 Metainfo, trackerless
2.5 Data request and response
2.6 Channels and file transfer
2.7 Chunk addressing in libswift
2.8 A libswift datagram
2.9 The process of connection
2.10 The process of exchanging data
3.1 The flow of PEX messages
3.2 State machine of the leecher
3.3 State machine of the seeder
3.4 Message flow I
3.5 Message flow II
3.6 Piece picking algorithm
3.7 Three peers in the same swarm
3.8 The setting for three groups
3.9 Group 1, trial 1
3.10 Group 1, trial 2
3.11 Group 1, trial 3
3.12 Group 1, trial 4
3.13 Group 2, trial 1
3.14 Group 2, trial 2
3.15 Group 2, trial 3
3.16 Group 2, trial 4
3.17 Group 3, trial 1
3.18 Group 3, trial 2
3.19 Group 3, trial 3
3.20 Group 3, trial 4
3.21 Seeder's HAVE and sent DATA messages
3.22 Congestion window size
3.23 Overall view
3.24 Detailed view
4.1 Sender side
4.2 Receiver side
4.3 Design overview
4.4 The interaction between components
4.5 Adapter and libswift core
4.6 Adapter and PeerSelector
4.7 API used by PeerSelector
4.8 API used by libswift
6.1 Test case 1
6.2 Test case 2
6.3 Download performance for Score
6.4 Score, trial 1
6.5 Score, trial 2
6.6 Download performance for AS-hops
6.7 AS-hops, trial 1
6.8 AS-hops, trial 2
6.9 Download performance for RTT
6.10 RTT, trial 1
6.11 RTT, trial 2
6.12 Download performance for Random
6.13 Random, trial 1
6.14 Random, trial 2
List of Tables

6.1 PlanetLab machine properties
6.2 Score-selected peers
6.3 AS-hops-selected peers
6.4 RTT-selected peers
6.5 Comparison of policy-based performances
Part I
Background
Chapter 1
Introduction
During recent years, the Internet has grown rapidly, and many protocols and
architectures have been developed to let people obtain information from it.
In particular, P2P networks and protocols play an important role in helping
end-users retrieve and consume content.
1.1 Basic Introduction of Libswift
Libswift is a recently developed P2P protocol that may be understood as
BitTorrent at the transport layer [13]. Its mission is to disseminate content
among a group of devices. In its current form, it is not only a file-sharing
protocol, but it can also be used for streaming on-demand and live video [15].
Using libswift, a group of computers shares and downloads the same file,
where each computer is a peer to the others. These computers and the
file together form a swarm. Like BitTorrent1, libswift divides the file into
small pieces which can be spread among a group of computers; the computers
holding these pieces may be located in different countries or cities. This is
why we call a libswift network a distributed content network. When a user
wants to retrieve pieces or the whole content, libswift is responsible for
transferring that content back to the user.
1 BitTorrent (software), available at http://en.wikipedia.org/wiki/BitTorrent_(software)
1.2 Problem Description
Naturally, users want to obtain the content they are after, or some parts of it,
as soon as possible [20]. Libswift and other P2P protocols try to retrieve
content from several computers at the same time. Just as explained above,
several devices may be serving the same content, meaning that the user
can download the corresponding content from any computer of the given
swarm. If swarm size is, say, five or six then libswift would be requesting
pieces from all five or six computers and, as a result, download speed would
increase proportionally with swarm size.
However, if swarm size were to be around five thousand [19], we should
not allow libswift to retrieve pieces from all of the computers as resources
of a local PC or a mobile device are limited, including CPU speed, memory,
or traffic bandwidth. Each request to another peer would consume some
of such resources. If thousands of requests were to be sent concurrently,
the local device would exhaust its memory and other resources. In this
thesis, we want to improve download performance by limiting the number
of simultaneous requests to certain — more wisely chosen — peers only.
1.3 Motivation
When we download data from different computers or any other device, we
may get different download speeds from each. In a P2P network, there
are many factors that may affect the performance of each peer. These
factors or properties include, among others, a peer's location [20] and its
Autonomous System (AS) [30]. Our initial motivation is to study how these
properties affect the behavior of libswift and, more specifically, how such
properties affect libswift's download speed, data transfer rate, and latency.
1.4 Hypothesis
We argue that peer management directly impacts the efficiency and
performance of libswift protocol. No previous thorough analysis or solution
on peer management for libswift has been provided. Identifying and
developing a suitable peer management mechanism for libswift can reduce
its download time and resource utilization.
1.5 Goal
1. Characterize libswift behavior in scenarios where libswift downloads from
multiple peers, and assess the protocol in terms of its correctness.
2. Identify and devise a peer management mechanism that reduces both
download time and resource utilization.
1.5.1 Tasks
To achieve our goal we will understand and analyze the internal peer
discovery mechanism and peer management policy. In P2P systems, the
process of selecting peers affects download performance.
We will also analyze congestion control and how it affects the behavior
of libswift. Generally, congestion control affects data transfer rate.
In addition, we will investigate the piece picking algorithm and profile
libswift's behavior in multiple-channel download scenarios. In a P2P
protocol, piece (chunk) picking affects the efficiency of data transfer.
We will design and implement a module to help libswift manage its
peers better. This module should act as an interface (adapter) between
libswift and a third-party peer selection mechanism. The interface should
be integrated with libswift but the peer selection mechanism should be
independent of libswift and should not require any major modifications
inside libswift’s implementation.
To verify the functionality of our module, we will integrate libswift with
an example peer selection engine.
Finally, we will define and carry out an experimental evaluation to verify
whether libswift’s download speed has improved.
1.6 Structure
Chapter 2 introduces background and related work, as well as necessary
information needed to understand our thesis and the topic. Chapter 3
provides an overview and analysis of libswift, while Chapter 4 introduces
the overall design and design considerations for the extensions. Chapter 5
outlines implementation details, and Chapter 6 presents an experimental
evaluation of the extension and describes experimental results. We discuss
preliminary results and lessons learned in Chapter 7 and conclude in
Chapter 8.
Chapter 2
Background and Related Work
In this chapter, we explain basic concepts of peer-to-peer networks and
protocols, introduce related work, and describe relevant background infor-
mation necessary to understand the work in this thesis.
2.1 Peer-to-Peer networks
In a centralized network, computers connect to a central server where files
are commonly located. When users want to download a file, they will connect
to that server, as depicted in Figure 2.2. Napster was the first prominent
centralized P2P system [29].
Figure 2.1: P2P network Figure 2.2: Centralized network
A peer-to-peer (P2P) network is a network architecture which organizes
computers as peers of an equivalent role [24]. Figure 2.1 illustrates a peer-
to-peer network where computers are connected to each other and files are
distributed across such machines. When a user wants to obtain a file, he
may connect to any computer in that network. Unlike in a centralized
network, there is no obvious server in a distributed network. Instead, every
computer plays the same role as the others, and the computers call each other peers.
Gnutella is a typical decentralized peer-to-peer network [6].
2.2 Peer-to-Peer protocols
A peer-to-peer protocol is used by users to exchange content or other
information in a P2P network. Since the late 90s, such protocols have been
widely used for file-sharing and distributed computing [29, 23, 26].
Terms we commonly encounter in P2P protocols are described in the
following:
• Peer — a computer participating in a P2P network, usually
identified by its IP address and port number.
• Content — a live transmission, a pre-recorded multimedia asset, or a
file [15].
• Swarm — a group of peers that are distributing the same content.
• Chunk (piece) — a unit of content, usually measured in kilobytes.
• Leecher — a peer that is downloading content, sometimes called a
receiver.
• Seeder — a peer that is serving the content, often called a sender.
• Tracker — a device that keeps track of a group of peers that have
a given content. Peers searching for content commonly contact the
tracker in order to get other peers’ addresses.
2.3 BitTorrent protocol
In recent years, BitTorrent has been widely used to share music, videos, and
other files on the Internet. Using BitTorrent to distribute content consists
of two main phases:
Publishing content
BitTorrent generates a metainfo (.torrent) file which contains a key named
info; the SHA-1 hash of this key's value, known as the infohash, uniquely
identifies the content. The info key contains subkeys describing the chunks
within the content [6].
A file is split into small fixed-size pieces, where the default size is 256 KB.
Every piece is hashed, and the corresponding 20-byte string uniquely
identifies the piece. All such 20-byte strings are concatenated and stored in
the metainfo file under the key pieces. Thus, the length of the pieces value
is always a multiple of 20.
Figure 2.3: Metainfo with tracker Figure 2.4: Metainfo, trackerless
By default, there should be an announce key that contains tracker’s
address inside metainfo (see Fig. 2.3). In a trackerless version, BitTorrent
uses Distributed Hash Tables (DHT) to get peers' contact information [16].
In this case, a key named nodes replaces the announce key inside metainfo,
and the value of the nodes field becomes a list of IP addresses and port
numbers (see Fig. 2.4).
Transferring data
Once a BitTorrent client obtains the metainfo file from a torrent indexing
website, it reads the tracker's contact information from it and then connects to
that tracker to get peers who are currently sharing the requested file. If this
is a trackerless torrent, the client reads the nodes from metainfo and then
contacts such nodes for peers that have the content. In the latter case, each
node works as a tracker.
Figure 2.5: Data request and response
As soon as the client gets peer addresses, it sends them request messages
for pieces of content. A request message contains: an integer index, begin,
and length [16]. The queried peer may return a message whose payload
contains a piece index. Once the client receives a piece, it can check its
correctness against the hash value stored inside the metainfo file. Figure 2.5
illustrates the data transfer process where index represents the piece index
that peer A wants to retrieve, begin is the byte offset within the piece, and
length is the requested length.
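For concreteness, the wire format of this request message, as defined by the BitTorrent protocol, is a 4-byte big-endian length prefix of 13, a one-byte message ID of 6, and the three 4-byte big-endian integers. A minimal C++ sketch of building such a message:

#include <cstdint>
#include <vector>

// Append a 32-bit integer in network (big-endian) byte order.
static void put_be32(std::vector<uint8_t>& buf, uint32_t v) {
    buf.push_back(uint8_t(v >> 24));
    buf.push_back(uint8_t(v >> 16));
    buf.push_back(uint8_t(v >> 8));
    buf.push_back(uint8_t(v));
}

// Build a BitTorrent "request" message: <len=13><id=6><index><begin><length>.
std::vector<uint8_t> build_request(uint32_t index, uint32_t begin, uint32_t length) {
    std::vector<uint8_t> msg;
    put_be32(msg, 13);       // payload length: 1 (id) + 3 * 4 (integers)
    msg.push_back(6);        // message id 6 = request
    put_be32(msg, index);    // piece index
    put_be32(msg, begin);    // byte offset within the piece
    put_be32(msg, length);   // requested block length (commonly 16 KB)
    return msg;
}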
2.3.1 Peer tracking and piece picking
Two peer tracking mechanisms exist in BitTorrent. Commonly, BitTorrent
relies on a tracker to find peers, as trackers maintain a list of peers for a
given infohash. When a new peer queries the tracker for a list of peers for a
certain file, the tracker returns the list of peers but also adds the querying
peer to the overall peer list for that infohash. Finally, whenever a peer wants
additional peers, it has to contact the tracker again.
Distributed Hash Tables (DHT) is an alternative, decentralized mech-
anism for peer discovery in BitTorrent [16]. In DHT, each peer maintains
(is responsible for) contact information for a list of peers. Using DHT, a client
searches for k closest nodes to the given infohash, where the definition of
closeness is specific to the DHT protocol variant and its implementation, as
well as the parameter k. Once it finds the closest nodes, it queries them for the
list of peers associated with the infohash. While nodes serve only as pointers
to the list of peers, the peers possess the actual content.
BitTorrent uses rarest chunk first heuristics to retrieve pieces that
are rare in the swarm, and optimistic unchoking to let random peers
bootstrap themselves into the swarm [7].
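To make the rarest-first heuristic concrete, the following sketch (our illustration, not BitTorrent's actual code) counts how many connected peers advertise each piece and selects a missing piece with the lowest count:

#include <cstddef>
#include <cstdint>
#include <vector>

// Pick the rarest piece we still miss. 'have' marks our own pieces;
// 'peer_bitfields' holds one availability bitfield per connected peer.
// Returns -1 when no missing piece is available anywhere.
int pick_rarest(const std::vector<bool>& have,
                const std::vector<std::vector<bool>>& peer_bitfields) {
    int best = -1;
    uint64_t best_count = UINT64_MAX;
    for (std::size_t i = 0; i < have.size(); ++i) {
        if (have[i]) continue;                 // already downloaded
        uint64_t count = 0;                    // swarm-wide availability
        for (const auto& bf : peer_bitfields)
            if (i < bf.size() && bf[i]) ++count;
        if (count > 0 && count < best_count) { // rarest available so far
            best_count = count;
            best = int(i);
        }
    }
    return best;
}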
2.4 Libswift protocol
Libswift is a recently developed multiparty transport protocol that supports
three use cases: file download, on-demand streaming, and live streaming. Unlike
BitTorrent, libswift can run over UDP or TCP; however, the current implementation
of the protocol is UDP-based [13, 15]. In this section, we will describe some
of the main libswift components.
File transfer
Libswift introduces the notion of a file transfer: an object used to distinguish
between the different transfers active in a client. More specifically, a file
transfer is used to read or save content that is being sent or received by a
libswift client. Every file in libswift has a corresponding file transfer. Though
the protocol specification does not limit the number of concurrent file transfers,
the current implementation enforces a default limit of 8.
Channels
When two peers are connected, a channel is established between these two
peers. A channel is used to transfer the content of a file between connected
peers. One file transfer can employ more than one channel to transfer the
same file, but every channel can only work for one file transfer. Figure 2.6
illustrates the relationship between file transfers and channels.
Figure 2.6: Channels and file transfer
Congestion control
In order to control data transfer rate between the receiver and sender,
libswift uses a third-party congestion control method called Low Extra
Delay Background Transport (LEDBAT) [17]. LEDBAT is designed to
utilize all available bandwidth. As soon as there is a bottleneck in the link,
it decreases the congestion window size based on the detected queuing delay.
In practice, libswift uses LEDBAT to manage its congestion window size;
libswift increases or decreases its data transfer rate according to the change
of congestion window size.
LEDBAT uses the following elements for its decision making:
• TARGET — if the queuing delay is bigger than this value, the sender
should reduce the congestion window size.
• GAIN — determines the ratio by which the current congestion window
size is increased or decreased.
Chunk addressing and chunk integrity check
Like BitTorrent, libswift splits the file into small pieces during data transfer.
The recommended chunk size is 1 kilobyte.
Figure 2.7: Chunk addressing in libswift
Figure 2.7 illustrates the chunk addressing tree of a 4-kilobyte file. Each
leaf of the tree corresponds to a chunk. In turn, each chunk has an index
and a corresponding bin number. For example, bin number 0 stands for the
first 1 kB of the file, bin 2 stands for the second 1 kB, bin 4 stands for the
third 1 kB, and so on.
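This numbering can be captured with simple bit arithmetic: a node at layer L covering chunks [c, c + 2^L) gets bin number 2c + 2^L - 1, so the leaves occupy the even bins and each internal node sits between its children. A small sketch of the arithmetic implied by the figure (our illustration, not libswift's actual code):

#include <cstdint>

// Bin number of a single chunk (layer 0): chunks 0,1,2,... map to bins 0,2,4,...
uint64_t leaf_bin(uint64_t chunk) { return chunk * 2; }

// Layer of a bin = number of trailing one-bits in its binary representation.
int layer_of(uint64_t bin) {
    int layer = 0;
    while (bin & 1) { bin >>= 1; ++layer; }
    return layer;
}

// Sibling and parent in the tree: bins 0 and 2 join under bin 1,
// bins 1 and 5 join under bin 3 (the root of a 4-chunk file).
uint64_t sibling(uint64_t bin) { return bin ^ (uint64_t(1) << (layer_of(bin) + 1)); }
uint64_t parent(uint64_t bin) {
    uint64_t hi = uint64_t(1) << (layer_of(bin) + 1);
    return (bin & ~hi) | (hi - 1);
}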
When a peer starts serving this file to other peers, it calculates a SHA1
hash for every chunk. Then, it recursively calculates the upper layer SHA1
hashes according to the hashes of lower layers. Finally, the top level’s hash
value is the root hash (roothash) of the file. The roothash uniquely identifies
the file.
A peer requests content using the roothash. With the first chunk
received, the peer also gets the necessary hashes to check the integrity of
that chunk. For example, to check the first chunk of the above file, the
sender will send the hashes of bin 2 and bin 5. First, when the receiver gets
the hash of bin 2, it calculates the hash of bin 1 from the hashes of bin 0 and
bin 2. Then, it obtains a hash by combining the hashes of bin 1 and bin 5.
Finally, it compares the hash obtained in this second step with the roothash.
If the two hashes match, the integrity of the first chunk has been verified.
Otherwise, the chunk is dropped.
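The verification walkthrough above translates directly into code. The sketch below assumes OpenSSL's SHA1 for hashing and takes the two uncle hashes needed for chunk 0 of a 4-chunk file; it is an illustration of the procedure, not libswift's implementation:

#include <openssl/sha.h>  // SHA1(); link with -lcrypto
#include <array>
#include <cstring>

using Hash = std::array<unsigned char, SHA_DIGEST_LENGTH>;

// Hash of an internal node = SHA1(left child hash || right child hash).
static Hash join_hashes(const Hash& left, const Hash& right) {
    unsigned char buf[2 * SHA_DIGEST_LENGTH];
    std::memcpy(buf, left.data(), SHA_DIGEST_LENGTH);
    std::memcpy(buf + SHA_DIGEST_LENGTH, right.data(), SHA_DIGEST_LENGTH);
    Hash out;
    SHA1(buf, sizeof(buf), out.data());
    return out;
}

// Verify chunk 0 (bin 0) against the roothash, given its uncle hashes.
bool verify_first_chunk(const unsigned char* chunk, size_t len,
                        const Hash& hash_bin2, const Hash& hash_bin5,
                        const Hash& roothash) {
    Hash hash_bin0;
    SHA1(chunk, len, hash_bin0.data());                  // leaf hash of bin 0
    Hash hash_bin1 = join_hashes(hash_bin0, hash_bin2);  // parent of bins 0 and 2
    Hash hash_bin3 = join_hashes(hash_bin1, hash_bin5);  // root of the 4-chunk file
    return hash_bin3 == roothash;                        // drop the chunk on mismatch
}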
Libswift messages
Libswift uses the following messages:
• HANDSHAKE — when a peer wants to connect to another peer, it
starts by sending a “handshake” message.
• HAVE — a peer uses this message to inform other peers what content
it has available.
• HINT — a peer uses this message to inform the sender what content
it wants.
• DATA — contains pieces of the file; usually, it carries 1 kB of content.
• HASH — this message is sent together with the DATA message. The
payload of this message contains the necessary hashes to verify the
integrity of transferred content.
• ACK — the receiver uses this message to acknowledge that it got the
content and successfully verified it.
• PEX REQ — a peer uses this message to query for new peers from its
communicating peer.
• PEX RSP — the sender uses this message to send back a list of peers to
the peer that requested them.
Peer exchange
Usually, a peer-to-peer network introduces a centralized tracker to keep track
of a group of computer addresses. When a peer wants to discover new peers,
it contacts this centralized tracker and the tracker gives back a subset of
peers. When a new peer joins the swarm, the tracker adds this new peer’s
address to the list of peers associated with certain content.
In addition, libswift supports trackerless peer discovery using the
gossiping algorithm called PEX (peer exchange). There are two types of
PEX messages: PEX REQ and PEX RSP. When peer A wants more
peers, it sends a PEX REQ to a connected peer B with whom it is transferring
data. Peer B may then respond with a PEX RSP message
containing its peers. The peers sent back to peer A are those that were
actively exchanging information with peer B in the last 60 seconds.
2.4.1 Basic operations in libswift
In libswift, the basic operational unit is a message and the basic transfer
unit is a datagram, which may contain one or more messages.
Figure 2.8: A libswift datagram
Figure 2.8 illustrates a datagram which contains a DATA and an
INTEGRITY message. The DATA message carries a piece of content and the
INTEGRITY message carries the hashes to verify integrity of the content.
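Because a datagram is simply a sequence of messages, a receiver can process it with one loop that reads a message type and then its payload. The sketch below is schematic: the type tags and payload sizes are illustrative, not libswift's exact wire encoding:

#include <cstddef>
#include <cstdint>

// Illustrative message tags; the real protocol defines its own values.
enum MsgType : uint8_t { HANDSHAKE, DATA, HAVE, HINT, HASH_MSG, ACK };

// Illustrative payload sizes per message type (bin numbers are 4 bytes here).
static std::size_t payload_size(MsgType t) {
    switch (t) {
        case HANDSHAKE: return 4;                  // channel id
        case HAVE: case HINT: case ACK: return 4;  // bin number
        case HASH_MSG: return 4 + 20;              // bin number + SHA1 hash
        case DATA: return 4 + 1024;                // bin number + 1 kB chunk
        default: return 0;
    }
}

// Walk a datagram: each message is a one-byte type followed by its payload.
void process_datagram(const uint8_t* buf, std::size_t len) {
    std::size_t off = 0;
    while (off < len) {
        MsgType type = static_cast<MsgType>(buf[off++]);
        // ... dispatch buf + off to the handler for 'type' here ...
        off += payload_size(type);
    }
}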
Joining a swarm
When a peer wants to download some specific content, it begins by
requesting a list of peers from the tracker. It knows that these peers reside
in the same swarm, where peers are exchanging the same file. Then, the peer
connects to one of these peers by sending a HANDSHAKE message;
the other side will send back a HANDSHAKE message, indicating that the
connection has been established. In the same datagram — returned by the
connected peer — there may be some HAVE messages to show the current
content that is available at its side. Details of this process are depicted in
Figure 2.9.

Figure 2.9: The process of connection
Exchanging chunks
Figure 2.10: The process of exchanging data
After the connection has been established, the leecher (requester) starts
to request data from the connected peers. The leecher sends HINT
(REQUEST) messages to its peers whereas the peers respond back with
DATA, HAVE and INTEGRITY messages. Next, the leecher will check
and save the received chunks and send acknowledgment messages back. In
the same datagram a new HINT message may be included. Figure 2.10
illustrates the data exchange process.
Leaving a swarm
Once a peer has downloaded all desired content, it can leave the swarm
by explicitly sending a leave message or by simply no longer responding to peers.
2.5 Related work
Many efforts have been made to improve the performance of P2P ap-
plications, and they can be mainly categorized into three areas: third-
party assisted approaches, end-user based approaches, and approaches using
congestion control. The first two categories focus on optimal peer selection
mechanisms, while the third category focuses on optimizing data transfer.
Third-party assisted approaches
Aggarwal et al [4] proposed a mechanism to help users find peers with good
performance. The authors propose an oracle service, provided by the ISP, to
P2P users. A P2P user provides a list of peers to the oracle, and the oracle
ranks the peers according to different criteria, including the peer's distance
to the edge of the AS, the bandwidth between users, and others. Given a sorted list of
peers, the user can then contact peers with better ranking. As the ISP has
more information about the network's topology, the calculated rank of peers can
be more accurate. Thus, a P2P user may benefit from this method. On the
other hand, ISPs may also benefit by ranking peers located in their own
network higher, thus keeping Internet traffic internal.
This solution seems like a win-win situation, where both P2P users and ISPs
benefit. However, some questions remain. First, in a P2P network, peers
frequently join and leave a swarm. This means that, in order to get the
latest peer rankings, a user must contact the ISP's oracle continuously. As a
result, new overhead is introduced at the user side. Secondly, given that
this approach requires the ISPs to deploy servers for the oracle service, it is
not clear whether ISPs are willing to incur additional costs by adding such
infrastructure. Finally, this proposal heavily relies on the ISP’s participation
and honesty.
Eugene et al [11] proposed a method to predict network distance between
two computers. The fundamental component of this method is Global
Network Positioning (GNP) which computes the coordinates of a computer
by seeing the whole network as a geometric space. First, it sets some
computers as landmarks and initializes these computers' coordinates in the
geometric space. Then, a computer calculates its coordinates based on the
predefined landmarks. Given that the authors select a spherical surface
as their modeling geometric space, if a computer knows the coordinate of
its peer, it can compute its distance to the peer. The authors reported that
the correlation between the calculated GNP distance and the measured distance is
0.915.
Predicting the distance of two P2P peers helps a P2P user choose a
nearby peer to download data from and thus potentially reduce data delay.
However, to set up landmarks — whose coordinates affect the accuracy of
the whole GNP system — is a challenge for a P2P network. The reason
is that, in a P2P network, it is difficult to find peers that are always
online to act as landmarks. Moreover, the P2P protocol in use must
be modified or extended so that peers can discover such landmarks.
Though we don’t integrate this approach in this master thesis, the proposal
motivates our choices and informs us that distance prediction can help in
choosing nearby peers.
Another solution given by Francis et al [12] shows a different way of
predicting distance between two computers. They built a system called
IDMaps, which lets computers check their mutual distances, along with
HOPS servers that serve this information. Given a computer A that wants
to know its distance to B, IDMaps calculates the distance from
A to its nearest Tracer C, and the distance from B to its nearest Tracer D.
Then, adding the distance between C and D, the sum of A-C, B-D, and
C-D approximates the distance between A and B. Any computer can query its HOPS
server to get its distance to others. Results of their research show that the
more Tracers there are in the system, the more accurate the calculated distance is.
Both GNP and IDMaps services are focused on the distance prediction of
network users. Other studies and projects try to provide applications with
more information, such as bandwidth and AS-hops. Haiyong Xie et
al [31] suggested a P4P service for both ISPs and P2P users. In P4P, a
network provider runs an iTracker which maintains network information
such as network backbone information and preferred inter-domain links of
the provider. A tracker or peer can contact the iTracker to obtain the
information and select its peers based on it. The authors show that
if ISPs and P2P users know more information about each other, the traffic
caused by P2P applications can be reduced. According to their results, the
bottleneck link traffic was reduced by up to 69%.
The IETF ALTO Working Group [28, 27] and the NAPA-WINE
project [5, 25] are trying to build more powerful services to provide more
network layer information to P2P applications. ALTO aims to provide
information such as bandwidth, cross-domain traffic and cost to the user.
To date, the ALTO working group is still working on this project [10]. NAPA-WINE
does similar things to ALTO; moreover, it allows P2P applications
to query their own network status, including link congestion and preferred
routing. The NAPA-WINE project claims that traffic on long-haul links can
be reduced by 50%.
End-user based approaches
The above research efforts show that third-party entities have the ability
to make P2P applications more friendly to ISPs and achieve better
performance. From the P2P application point of view, the benefits are
not always obvious.
Cramer et al introduced a method to reduce the bootstrapping delay of
P2P applications [8]. Every peer hashes the k-bit prefix of its IP address and
stores the hash value under a key in the DHT. When a peer joins the
DHT, it looks up the hash values of its own IP prefix. If another peer's prefix
hash matches, it is likely that the two peers are in the same location in
terms of network topology; the longer the prefix match, the higher the
locality accuracy. If enough peers match a reasonably long prefix, the peer
has found enough peers. Otherwise, it looks for peers matching a shorter prefix.
This approach provides a quick way to filter peers who most likely are
not within the same location. Especially, in a very big DHT, peers can use
this method as the first filter to select potential peers. And based on the
results of prefix matching, a peer can further filter peers.
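The core of the scheme is easy to sketch (our illustration; the paper's exact key construction may differ): mask the address down to its k-bit prefix and hash that, so peers behind the same prefix publish and look up under the same DHT key:

#include <cstdint>
#include <functional>

// DHT key for the first k bits of an IPv4 address (0 <= k <= 32).
// Peers sharing the k-bit prefix publish under the same key, so a joining
// peer can fetch likely-nearby peers with a single DHT lookup, and fall
// back to a smaller k when too few peers match.
uint64_t prefix_key(uint32_t ipv4, int k) {
    uint32_t mask = (k == 0) ? 0u : ~uint32_t(0) << (32 - k);
    uint32_t prefix = ipv4 & mask;
    return std::hash<uint32_t>{}(prefix) ^ uint64_t(k); // mix k in so layers differ
}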
Lucia D’Acunto et al [9] focus on improving the QoS of the P2P system.
In their proposal, the authors suggest that a peer which already has good QoS
should increase the bandwidth it allocates to random peers. Here, good QoS
means that the peer's sequential download progress is faster than the video playback
rate; in this case, the peer should increase its number of optimistic unchoke slots.
Then, by giving higher priority to newly arriving peers, the bootstrap
delay of these peers is reduced. Their simulation results show that
the average bootstrap delay is reduced by 48%; at the same time, there is no
negative impact on other peers’ performance.
Congestion Control
Most P2P clients are implemented in the application layer and use TCP
as their transport carrier. Applications such as PPLive, PPstream, Xunlei,
and BitTorrent open many concurrent connections to other peers. Data
shows that PPLive can open up to 256 connections. In this case, when a
P2P client is running, other applications and users suffer from the resulting
bottleneck. For
example, imagine a use case where a user has 10 MB of Internet bandwidth
and a normal non-P2P application A that needs 1 MB of bandwidth to run.
If the user also runs a P2P client that sets up 10 connections to its peers,
fair sharing leaves each of the 11 connections only 10/11 MB, roughly 0.9 MB.
In this case, application A lacks enough bandwidth and cannot run correctly.
Yaning Liu et al [21] presented a way to detect congestion and a way
to raise or decrease the number of TCP connections. Their friendlyP2P
solution calculates the throughput of every P2P connection. By comparing
the current throughput with the previously recorded throughput, it detects
how each connection's rate changes. If the throughput of half of the connections
has decreased, congestion is assumed and the P2P client can reduce
the number of connections.
TCP is designed to share a bottleneck fairly among all connections [22];
thus, a P2P application should consider fairness toward other applications
in order to be more friendly to the network.
Our decisions
Most of the above approaches show us that involving third-party entities
requires upfront costs in order to support such peer recommendation services
for P2P applications. On the other hand, most studies also highlight the
importance of locality and keeping traffic within local ISPs.
From a P2P application’s point of view, using cheap and simple strategies
to select the right peers to communicate with influences its overall performance.
When it comes to libswift, we consider that it should be able to select peers
with the following properties: low latency (RTT), geographic closeness, or
location in nearby ASes.
We mentioned that libswift prefers to run on top of UDP, and UDP
requires congestion control to be implemented in the application layer. Thus,
libswift has more potential to be friendly to non-P2P applications
than other P2P applications running over TCP. In this thesis, we will
investigate libswift’s congestion control to see whether it is more friendly to
other non-P2P applications.
2.6 Experimental Evaluation
This section describes the procedure to analyze libswift and set up the
experimental environment. Later, we explain the tests we devised to test
libswift extensions and the software we used.
Procedure and testbed
In order to analyze and find vulnerabilities in the libswift protocol, we
investigated the peer exchange procedure, the piece picking algorithm, and congestion
control. The first step is to study libswift's source code [14] and verify whether the
implementation functions according to the protocol specification. This study
mainly focuses on: how peers exchange addresses with each other, how these
peers are used or managed, how libswift chooses which chunks of content
will be downloaded next, and what kind of congestion control is used.
We want to do our test in an environment where libswift can run
close to a real Internet setting. These tests require computers located in
different countries to run our code since, in the real world, P2P users are
located in different regions and we want our experimental setting to work
in a similar way. At the same time, our tests may cause lots of traffic
and some unknown results, thus we cannot interfere with the real Internet
infrastructure. Therefore, we have chosen PlanetLab as our platform to do
the tests. It currently consists of 1129 nodes [3] and these computers are
located in different countries.
In particular, we investigate the following issues more thoroughly:
• Peer exchange behavior — we construct a small swarm to study
how a peer queries for new peers from the peers it is already connected
with. In particular, we want to investigate how PEX messages are
handled by libswift. Also, we want to analyze the download behavior
— which peer contributed more content to the local peer.
• Peer selection method — we construct a small swarm with
peers that have different properties — RTT, location, and AS number.
Specifically, we selected sixteen computers that are located in different
countries and are diverse enough to be good candidates for the local
peer.
• Congestion control — LEDBAT congestion window size is highly
correlated with RTT. In this thesis, we observe how the size of
congestion window (CWND) increases and under what circumstances
CWND size decreases. We also analyze how an initial CWND size
affects data transfer rate at the beginning of the connection. Finally, in
order to know whether a bigger CWND size can improve performance,
we changed CWND size manually.
2.6.1 Extensions for libswift
After we investigated libswift's peer exchange mechanism and peer selection
behavior, we found that libswift does not employ any specific peer selection
policy. Moreover, libswift does not support the PEX REQ message, which
is necessary when using the PEX protocol to exchange peers. Therefore, we
modified libswift's source code to add support for the PEX REQ message, and
we decided to design an extension which we call peer selector adapter for
libswift to help the protocol manage and select its peers. To test our new
adapter (interchangeably, middleware or interface) we integrated it with a
peer selection mechanism — PeerSelector — which was developed by another
master thesis project [18].
Testing extensions and modifications
We configured two scenarios to verify the function of PEX REQ. First,
a newly joined peer interacts with other peers without sending a PEX REQ
message. Second, a newly joined peer interacts with other peers and sends a
PEX REQ message. The local peer should not receive peers in the first
case, even if it is in a swarm with more than one peer. However, it should
receive peers in the second case.
Since the functions of peer selector adapter can be verified together with
PeerSelector, we did not run any independent test case for the adapter.
PeerSelector employs four types of peer selection policies: geo-location,
RTT, AS-hops, and random. We tested each policy at least twice to exclude
Internet random variables and we run all the tests in the same swarm to
maintain the consistency of results. For each policy, we want to study if the
selection of peers based on a given property or metric influences libswift’s
download performance.
We then compare the results of the first three policies with the random
policy. Aside from evaluating how libswift’s download speed changes under
different policies, we also verified which peers were selected by PeerSelector
and whether such peers worked as expected.
2.6.2 Software resources and tools
We use MaxMind’s GeoIP [2] library to get the peers’ geographic location.
GeoIP is a free library that provides an API for looking up the city, country,
and continent of an IP address.
An MP4 file of approximately 40 megabytes was used as the target
content in our tests. This is a 14-minute video file; a typical music video is
about half this length. The download time of this video is long enough for
us to observe libswift's behavior.
To analyze download speed and draw graphs, we use gnuplot [1], a free
command-line-driven graphing utility.
Chapter 3
Overview and Analysis
In this chapter, we provide an overview of libswift and discuss in greater
detail the internal functioning of the protocol. Mainly, we will focus our
discussion on the three issues outlined in the previous chapter: peer discovery,
peer exchange and selection, and congestion control.
3.1 Peer discovery basics
When retrieving content, there are at least two participants. One is the peer
trying to obtain the content, which initiates the whole process; the
other peer responds with data. We call the first peer a leecher and
the latter a seeder, which in our setting also acts as a tracker.
According to protocol specification, when a peer A wants new peer
addresses, it should follow the procedure illustrated in Figure 3.1. In step 1,
A explicitly sends a PEX REQ message to its connected peer B. In step 2,
peer B may respond with a PEX RSP message, which includes a peer C.
In step 3, peer A initiates a connection to C by sending a HANDSHAKE
message. In case peer A wants more peers, it sends PEX REQ messages to
both connected peers — B and C, where peer C may reply with a message
containing a peer D. At this point, peer A repeats step 3.
Figure 3.1: The flow of PEX messages
Leecher’s angle
Figure 3.2 illustrates the internal state of a leecher. When libswift runs as
a leecher, it starts by opening a new channel. First, it initializes itself and
creates a datagram containing a HANDSHAKE message, a channel ID, and
the roothash of the desired content. Then, it jumps into running state and
sends the datagram to the destination peer address. Finally, it waits for a
response from the destination peer. The length of waiting time is determined
by congestion control. After this waiting time is decided, the leecher jumps
to the waiting state.
Figure 3.2: State machine of the leecher
The destination peer may respond back with a datagram which contains
a HANDSHAKE and a HAVE message. While the first message informs
the leecher that the connection has been set up, the latter message informs
the leecher which content the destination has available.
Next, the leecher sends a HINT message to the destination, specifying
the content it wants. Again, it has to wait for the response, so the leecher
jumps back to the waiting state. Whenever it receives a datagram, it comes
back to its running state. If the received datagram contains a DATA
message, the leecher responds to the destination with an ACK message.
If, in the meantime, the leecher is transferring data with other peers, it will
update them about its currently available content by sending them HAVE
messages.
Seeder’s angle
When libswift runs as a seeder, it means that this peer possesses content
it can send to other peers. In this case, libswift initializes itself and starts
listening on a certain port number. Then, it goes to waiting state. When
it receives a new datagram with a HANDSHAKE message, the seeder
will respond back with a HANDSHAKE message and its channel ID. In
the same datagram, it may include HAVE messages to inform the leecher
of its currently available content. Now that the connection between
the seeder and leecher is established, the seeder goes to the waiting state
once more. This process is illustrated in Figure 3.3.
Figure 3.3: State machine of the seeder
3.2 Peer exchange and selection
To understand peer exchange behavior in libswift, we configured a small
swarm with fourteen peers. In addition, we set up an initial seeder A who,
at the same time, acts as a tracker. In this setting, our local peer will start
downloading data from seeder A.
To understand the inner process of peer exchange, we will use logs
generated by libswift. Keywords such as +hash, +hs, +hint, -have, -pex
correspond to libswift messages such as HASH, HANDSHAKE, HINT ,
HAV E and PEX ADD.
PEX messages
Figure 3.4 depicts the messages exchanged between the leecher and seeder
A. In line 3, the leecher sends the first datagram to seeder A. The datagram
includes two messages: +hash — informing seeder A that the leecher wants
to download the indicated content (the string carried by this message is the
roothash of the file) — and +hs — a HANDSHAKE message. From lines 6 to
15, the leecher receives HANDSHAKE and HAVE messages from seeder A.
On lines 16 and 17, the leecher sends a HINT message to seeder A. From line
18 onwards, the leecher receives a new datagram from seeder A, containing
PEX ADD (PEX RSP) messages that inform the leecher about peers
seeder A has.
Finally, we can see from Figure 3.4 that the leecher does not send
PEX REQ messages to the seeder, yet it still receives peers from the
seeder. This behavior goes against the protocol specification, which states
that "A peer that wants to retrieve some peer addresses MUST send a
PEX REQ message".
Peer selection
As we’ve seen from the logs in the previous section, the leecher gets a subset
of peers from seeder A. Figure 3.5 further shows how the leecher tries to
establish connections with the received peers, following the same sequence
of peers as it received them from seeder A. This implies that the current
implementation of libswift does not employ any peer selection strategy.

Figure 3.4: Message flow I   Figure 3.5: Message flow II
Piece picking algorithm
As previously explained, libswift downloads data by sending a HINT
message to the seeder. The seeder then sends DATA messages back to the
leecher, according to the HINT message which specifies the bin number.
The bin number represents the chunk that leecher wants to download, and
it is commonly determined (or picked) by the piece picking algorithm.
This means that, if we record the chunk index of a DATA message when
the leecher receives one, we can understand how libswift picks next pieces.
Figure 3.6 reveals that the current piece picking is linear, as most of the
chunks are downloaded sequentially. Roughly, we can observe that the first
received chunk has index 1, the second received chunk has index 2, and so
on; the last received chunk has the largest chunk index.
Figure 3.6: Piece picking algorithm

When data transfer is sequential, two or more concurrent channels end up
competing with each other, which is often the reason data is dropped.
Appendix A shows that the leecher had already received chunks 4256, 4257,
4258, and so on from peer 193.xxx.xxx.250, but peer 192.xxx.xxx.11 still sent
these chunks to the leecher. As a result, the leecher dropped the data coming
from 192.xxx.xxx.11. If libswift were to use a random piece picking algorithm
to download pieces from these two peers, the chances of chunks being dropped
would be lower.
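A randomized picker of the kind suggested here can be as simple as shuffling the indices of the missing chunks before requesting them (a sketch of the suggestion, not code from libswift):

#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Build a download order for one channel: the missing chunks in random
// order, so two channels serving the same leecher rarely race for the
// same chunk and fewer duplicate DATA messages get dropped.
std::vector<std::size_t> random_order(const std::vector<bool>& have) {
    std::vector<std::size_t> missing;
    for (std::size_t i = 0; i < have.size(); ++i)
        if (!have[i]) missing.push_back(i);
    static std::mt19937 rng{std::random_device{}()};
    std::shuffle(missing.begin(), missing.end(), rng);
    return missing;
}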
3.3 Analysis of multiple-channel download
Properties of peers can affect libswift’s download performance. In this
section, we want to figure out how a peer property such as RTT affects
download speed. To study this issue, we set up three different groups of test
cases (see Fig. 3.8). The leecher is connected to two seeders, as shown in
Figure 3.7.
As shown in Figure 3.8, in every group, there are three peers. One peer
works as a leecher who downloads content from the other two seeders. The
first seeder also acts as the initial seeder and tracker, which means that the
leecher learns from this seeder about the second seeder's address.

Figure 3.7: Three peers in the same swarm
Figure 3.8: The setting for three groups
In the first group, the two seeders run on the same virtual machine,
so they have the same RTT. In the second group, each seeder runs on a
different computer in PlanetLab; however, they are located in the same AS
and run in the same laboratory, thus they have very similar RTT values.
In the last group, we run one seeder on the local virtual machine and the
other on PlanetLab. In this group, the two seeders are very different:
the first seeder has a very small RTT compared to the RTT of the second
one. Finally, all tests for each group were run at least four times.
Observed behaviors
In the first group, two seeders have the same properties except their
port number. Results show that each of the two seeders provides half of
the content; moreover, the download speed from the two seeders is almost equal.
Figures 3.9 - 3.12 show four trials carried out for group one.
Figure 3.9: Group 1, trial 1   Figure 3.10: Group 1, trial 2
Figure 3.11: Group 1, trial 3   Figure 3.12: Group 1, trial 4

Figures 3.13 - 3.16 show the test results for the second group. From
the first two trials we can see that the content provided by the two seeders is
disproportionate. Seeder one provides more content than seeder two and
the ratio is about 7 to 3. The last two trials show that the ratio becomes
more even, approximately 55 to 45. We should emphasize that the RTT of
the first seeder is smaller than that of the second seeder.
The last group shows very different results from the previous two.
All test cases show that the seeder running on the local virtual machine
provides much more content. Specifically, more than 90% of the content is
downloaded from seeder one, as shown in Figures 3.17 - 3.20.
Analysis
Let’s say that the leecher, seeder one, and seeder two are A, B, and C
respectively. A has two channels to B and C.

Figure 3.13: Group 2, trial 1   Figure 3.14: Group 2, trial 2
Figure 3.15: Group 2, trial 3   Figure 3.16: Group 2, trial 4

A seeder typically has to handle HAVE messages and send DATA messages. This process is further
illustrated in Figure 3.21 and detailed in the following:
• Step 1: record that the leecher already has chunk N-1, to make sure
this chunk is not sent again
• Step 2: wait before sending the next datagram
• Step 3: create a new datagram containing DATA and INTEGRITY
messages
Assume that A just got a chunk of content from B or C and this chunk
of content is N-1. A then notifies both B and C with a HAVE message
(at time t0), confirming that A already got chunk N-1.

Figure 3.17: Group 3, trial 1   Figure 3.18: Group 3, trial 2
Figure 3.19: Group 3, trial 3   Figure 3.20: Group 3, trial 4

B and C receive
this confirmation after half the RTT time, and they now are aware that A
needs its next chunk N. The time B and C spend on processing is γB and
γC respectively. Both B and C wait for τB and τC before sending the next
datagram; this timer is determined by congestion control. Once this timer
expires, B and C start preparing their datagrams with the next chunk during
time δB and δC respectively, and then send it to A.
Figure 3.21: Seeder's HAVE and sent DATA messages

After half the RTT time, A receives data from B and C. The data from
B arrives at A at

$t_0 + \frac{RTT_B}{2} + \gamma_B + \tau_B + \delta_B + \frac{RTT_B}{2}$

and the data from C arrives at A at

$t_0 + \frac{RTT_C}{2} + \gamma_C + \tau_C + \delta_C + \frac{RTT_C}{2}.$

If

$\left(t_0 + \frac{RTT_B}{2} + \gamma_B + \tau_B + \delta_B + \frac{RTT_B}{2}\right) < \left(t_0 + \frac{RTT_C}{2} + \gamma_C + \tau_C + \delta_C + \frac{RTT_C}{2}\right)$
then A will accept data from B, otherwise, A will accept data from
C. Because the RTT is always much larger than the time spent preparing
datagrams or waiting, RTT plays a more important role in our tests.
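For illustration (with invented numbers): if $RTT_B = 20$ ms and $RTT_C = 200$ ms, while processing, waiting, and datagram preparation cost about 1 ms each on both seeders, then B's chunk arrives roughly $20 + 3 = 23$ ms after the HAVE message, against $200 + 3 = 203$ ms for C. B therefore wins virtually every round, which is consistent with the behavior we observed in group three.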
In the first group, the RTT values are equal, thus the two seeders provide the
same amount of content. In the second group, the RTT of the first seeder
is slightly smaller than that of the second seeder. Consequently, the first
seeder contributes more content. Although the ratio becomes smaller in
the third and fourth trials, we still observe that the first seeder contributed
more content. In the last group, the first seeder has a much smaller
RTT, so the data coming from it always arrives at the leecher first, compared to
the data coming from the second seeder. For example, if A receives chunk
N from B, it sends HAVE messages to B and C, saying "I got the chunk
with bin number N". Then B waits before sending the next chunk. If C has
not yet sent chunk N to A, C cancels sending it; however, if C receives the
HAVE notification only after sending the chunk, A still receives the chunk
and drops it. Therefore, we can say that peers with smaller RTT contribute
more content and achieve a
faster data transfer rate.
3.4 Congestion control
Libswift uses LEDBAT at the sender side to control data transfer rate,
whereas the receiver side does not use any congestion control. According to
the LEDBAT specification, a conservative implementation may skip slow-start
by setting an initial window size for LEDBAT. The purpose of slow-start is
to make a connection have a larger CWND (congestion window) as soon as
possible. In particular, there are two phases: slow-start, during which CWND
increases exponentially, and a second phase, in which congestion control
switches to LEDBAT once CWND has increased beyond a predefined threshold.
According to the libswift protocol specification, libswift employs slow-start.
However, during our tests we found that as soon as the seeder gets the
first ACK message from the leecher, libswift changes from slow-start
to LEDBAT, which is too early. Because slow-start increases CWND much
faster than LEDBAT, transitioning this early leaves libswift with a small
CWND. Often, this causes the data transfer rate to be very low at the
beginning of a new connection.
Figure 3.22 shows how CWND size changes with received ACK mes-
sages. The red curve corresponds to libswift's initial window size of 1, whereas
the green curve illustrates the window size when manually set to 70. A small
decrease in window size means the current RTT increased; a very large
decrease is caused by the timeout of ACK messages. The red line
shows that libswift’s CWND size increased slowly. If the initial window size
is set to a reasonable value, peers can get the first chunks faster. The green
line, representing a bigger window size, shows that our manually enlarged
window exhibits a better data transfer rate.

Figure 3.22: Congestion window size
Figure 3.23: Overall view Figure 3.24: Detailed view
Figure 3.24 shows that if we set a bigger CWND size, we get a faster
rate at the beginning. Although the overall download performance has not
improved, as can be seen in Figure 3.23, the time spent downloading the
first few hundred chunks has decreased, as shown in Figure 3.24. It should
be noted that the CWND of 200 in these illustrations is also a test value,
showing that if we have a larger CWND, download speed can be improved.
The LEDBAT specification suggests that TARGET should be 100 milliseconds [17];
however, libswift uses a value of 25 milliseconds. If libswift were to
run on a mobile phone, the current TARGET value is too big. In addition,
LEDBAT recommends a GAIN value of 1, whereas libswift implements a
much smaller value, 1/25000.
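For reference, the controller that these two constants feed can be sketched as follows; this is a simplification of the per-ACK update in the LEDBAT specification [17], with variable names of our choosing:

// One congestion-window update per received ACK, following the shape of
// the LEDBAT rule: grow while queuing delay is below TARGET, shrink above.
struct Ledbat {
    double target_ms = 100.0;    // the RFC suggests 100 ms; libswift uses 25 ms
    double gain = 1.0;           // the RFC recommends 1; libswift uses 1/25000
    double cwnd = 1.0;           // congestion window, in MSS units
    double base_delay_ms = 1e9;  // smallest one-way delay observed so far

    void on_ack(double one_way_delay_ms, double bytes_acked, double mss) {
        if (one_way_delay_ms < base_delay_ms) base_delay_ms = one_way_delay_ms;
        double queuing = one_way_delay_ms - base_delay_ms;      // estimated queue delay
        double off_target = (target_ms - queuing) / target_ms;  // positive: room to grow
        cwnd += gain * off_target * (bytes_acked / mss) / cwnd;
        if (cwnd < 1.0) cwnd = 1.0;  // keep at least one segment in flight
    }
};

With libswift's small GAIN, the growth term is scaled down by a factor of 25000, which is consistent with the slow window growth seen in Figure 3.22.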
To summarize, congestion control in libswift needs further investigation
and experimentation to find suitable values for the initial window size,
TARGET, and GAIN.
Part II
Design and Implementation
Chapter 4
Protocol Extensions
As previously explained, the current libswift implementation does not support
the PEX REQ message. In this situation, the leecher will always receive peers
from seeders and there is no way to stop seeders from sending peers to the
leecher. Thus, we will complement the existing implementation to fulfill
protocol specification — ”A peer that wants to retrieve some peer addresses
MUST send a PEX REQ message”. Now, if the leecher does not need
additional peers, seeders will stop sending peers back.
Secondly, we know that some peers in a swarm perform better than others,
but libswift has no way of determining which peer is better to use. To
manage peers better, we extend libswift with an interface module that
enables a more efficient peer rotation mechanism inside libswift; we
simply call it an adapter.

The rest of this chapter is divided into two sections, explaining the PEX
modifications and the adapter extension.
4.1 PEX REQ design
We do not need a new component to support this functionality. Instead, we
employ a true-false switch inside libswift. When a peer (sender) sends a
datagram to its connected peers, it checks the switch: if the value is
true, it sends a PEX_REQ; otherwise, it does not. The switch is controlled
by the adapter and defaults to true, but it can also be controlled by an
external peer selection mechanism. Whenever a peer (receiver) receives a
PEX_REQ, it sets the true-false state to true in the corresponding channel.
Before sending the next datagram, it checks this state: if true, it adds a
PEX_RSP message containing its current peers to the datagram and then
switches the state back to false. Figure 4.1 shows the process of adding a
PEX_REQ message; Figure 4.2 shows the process of responding to one.
Figure 4.1: Sender side Figure 4.2: Receiver side
Two functions were added to instruct the channel to send and receive
PEX_REQ:
• AddPexReq — checks the value of the switch; if it is true, libswift adds
a PEX_REQ message to the datagram that will be sent to the receiver. If
the switch's value is false, the datagram is sent without a PEX_REQ
message
• OnPexReq — when a PEX_REQ message is received, libswift uses this
function to set the channel's flag to true, indicating that the sender
wants peers. The flag is changed back to false when the current peers are
added to a datagram. Libswift checks this flag to decide whether it should
send its peers to the receiver
4.2 Design of the adapter
We designed and developed an interface module which acts as a middleware
between the existing libswift core and an external peer selection mechanism
(see Fig. 4.3). Therefore, we call it a peer selector adapter or shortly adapter.
Figure 4.3: Design overview
Libswift core
This module is responsible for interacting with other peers: it transfers
and verifies content, constructs datagrams, and sends and receives them.

When libswift discovers new peers, it relays them to the adapter.
Similarly, when libswift wants to open channels for communication with new
peers, it calls the adapter.
Adapter
The adapter sits between libswift and the peer selection mechanism, as
shown in Figure 4.4. The adapter is tightly coupled with libswift's
implementation and provides the interface to third-party peer selection
mechanisms. As soon as an external peer selection mechanism implements the
adapter's interface, the three components can work together.

The advantage of this design is that libswift is decoupled from external
mechanisms, enabling us to plug in (integrate) different mechanisms for
peer selection. When no external mechanism is attached, the adapter saves
peers' addresses to a queue; when libswift requests peers, the adapter
returns peers from the queue, without ranking them.
Figure 4.4: The interaction between components
Peer selection mechanism
The peer selection mechanism we tested with libswift is an externally
developed module called PeerSelector [18]. Once the adapter relays peers
to PeerSelector, the latter saves them to a database and then ranks them
according to the following peer ranking policies:
• RTT policy — the peer with the smallest RTT value is placed first, the
peer with the second smallest RTT second, and so on. The RTT value is
determined by probing (pinging) the given peer.
• AS-hop count policy — ranks peers by calculating their AS-hop
count. For every peer it gets from the adapter, it calculates its AS-
hop count to the local peer using public BGP records.
• Geo-location policy — ranks peers by calculating their geographical
distance to the local peer. Geographical closeness is determined using
a third-party GeoIP database and API [2].
• Random policy — peers are returned randomly, no ranking applied.
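As a concrete illustration of the ranking step, the RTT policy amounts to
a simple sort over the stored peer records. The PeerInfo type below is
hypothetical; PeerSelector's real implementation also involves probing and
a database:

#include <algorithm>
#include <string>
#include <vector>

// Hypothetical peer record; PeerSelector keeps such data in MySQL.
struct PeerInfo {
    std::string address;
    double rtt_ms;    // measured by pinging the peer
    int as_hops;      // derived from public BGP records
    double geo_score; // geographic distance score
};

// RTT policy: the peer with the smallest RTT comes first.
void rankByRtt(std::vector<PeerInfo>& peers) {
    std::sort(peers.begin(), peers.end(),
              [](const PeerInfo& a, const PeerInfo& b) {
                  return a.rtt_ms < b.rtt_ms;
              });
}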
PeerSelector provides three functions to us: addpeer, deletepeer, and
getpeer (see Source Code 5.6).
4.2.1 Interaction between components
Adapter and libswift
Figure 4.5: Adapter and libswift core
Libswift interacts with the adapter in several cases. First, it asks the
adapter whether more peers are needed. It also calls the adapter when it
discovers new peers and needs to store them locally, and when it needs
certain, ranked or not, peers. Figure 4.5 illustrates this process.
IsNeedPeer is called to check whether libswift needs to discover new
peers by issuing PEX_REQ messages. If new peers are discovered (through
PEX_RSP), libswift calls AddPeer to store them. Finally, when libswift
establishes a new channel, it calls GetBestPeer to obtain the "best" peer
and connect to it.
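The call sequence of Figure 4.5 can be demonstrated with a minimal
stand-in for the adapter, reduced to the three calls above. The names
follow the thesis, but the bodies are simplified for illustration (a
plain FIFO, no ranking):

#include <deque>
#include <iostream>
#include <string>

class MiniSelector {
public:
    bool IsNeedPeer() const { return need_peer; } // should we send PEX_REQ?
    void AddPeer(const std::string& addr) { inpeers.push_back(addr); }
    std::string GetBestPeer() {                   // here: FIFO, no ranking
        std::string p = inpeers.front();
        inpeers.pop_front();
        return p;
    }
private:
    bool need_peer = true;
    std::deque<std::string> inpeers;
};

int main() {
    MiniSelector sel;
    if (sel.IsNeedPeer())                   // 1. decide to issue a PEX_REQ
        sel.AddPeer("192.0.2.1:8000");      // 2. store a peer from PEX_RSP
    std::cout << sel.GetBestPeer() << "\n"; // 3. pick a peer for a new channel
    return 0;
}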
Adapter and PeerSelector
Figure 4.6: Adapter and PeerSelector
The adapter relays discovered peers to PeerSelector; Figure 4.6 illustrates
their interaction. When libswift sets up a new channel, the adapter calls
PeerSelector to get the "best" ranked peer. In the other direction,
PeerSelector uses handles such as NeedPeer to indicate whether it wants
new peers.
Designed APIs
Two types of API were defined to facilitate the interaction between
components.
Figure 4.7: API used by PeerSelector
Figure 4.8: API used by libswift
Chapter 5
Implementation
Implementation work on this master thesis consists of modifications to add
support for PEX_REQ, implementation of the adapter module to support a
"peer rotation" mechanism, and integration of PeerSelector with the
libswift source code.
5.1 Libswift modifications
Three files were modified: swift.h, transfer.cpp and sendrecv.cpp. The
PEX_REQ message was added to swift.h, and new member variables were added
to the FileTransfer class in transfer.cpp. Two functions, AddPexReq and
OnPexReq, were also added to support the PEX functionality:
typedef enum {
    ...
    SWIFT_PEX_REQ = 11  // make swift support PEX_REQ
} messageid_t;

class FileTransfer {
    ...
    // Members added to the FileTransfer private section.
private:
    Selector* peerSelector; // peer selector adapter pointer
    FILE*     pFile;        // file pointer to save log information
    int       maxpeernum;   // how many peers this swift instance wants to use
    int       role;         // identifies the swift instance: 0 is seeder, 1 is leecher
    int       sort;         // sort policy used by swift: 0 score, 1 rtt, 2 as-hops, 3 random
    int64_t   starttp;      // timestamp of the first request sent by swift
    ...
};

class Channel {
    ...
public:
    ...
    void AddPexReq(Datagram& dgram); // add a PEX_REQ to the datagram
    void OnPexReq(Datagram& dgram);  // received a PEX_REQ
protected:
    ...
    bool peer_req_rvd; // whether this side received a PEX_REQ
};
Source Code 5.1: swift.h
Module transfer.cpp was modified to add support for the adapter. When
libswift discovers (receives) new peers, we use the adapter to save them
locally:
FileTransfer::FileTransfer(const char* filename, const Sha1Hash& root_hash) :
    file_(filename, root_hash), hs_in_offset_(0), cb_installed(0) {
    // In the constructor of FileTransfer:
    ...
    peerSelector = new Selector(); // initialize the selector adapter
}
...
void FileTransfer::OnPexIn(const Address& addr) {
    ...
    if (1 == role) { // only the leecher manages peers this way
        // call the adapter to save the peer, then get (ranked) peers back
        peerSelector->AddPeer(addr, this->root_hash());
        std::vector<Address> peers = peerSelector->GetPeers(sort, this->root_hash());
        int counter = peers.size();
        if (counter >= SWARM_SIZE) {
            // open channels to at most maxpeernum peers
            for (int i = 0; i < maxpeernum; i++) {
                Address addrtemp = (Address)peers[i];
                bool connected = false;
                // check whether we are already connected to this peer
                for (uint j = 0; j < hs_in_.size(); j++) {
                    Channel* c = Channel::channel(hs_in_[j]);
                    if (c && c->transfer().fd() == this->fd() && c->peer() == addrtemp)
                        connected = true; // already connected
                }
                if (!connected) {
                    // set up a new channel to the peer
                    new Channel(this, Datagram::default_socket(), addrtemp);
                }
            }
        }
        ...
    }
}
Source Code 5.2: transfer.cpp
Module sendrecv.cpp was extended with two functions — AddPexReq
and OnPexReq:
bin64_t Channel::OnData(Datagram& dgram) {
    ...
    // log the peer address, content length and receive time of this piece of data
    fprintf(transfer().pFile,
            "first request time=%lld received data from %s data length=%d time=%lld\n",
            this->transfer().starttp, addr.str(), length, NOW);
    ...
}
...
void Channel::Reschedule() {
    ...
    // before deleting the channel, delete the peer from the peer selector
    transfer().peerSelector->DelPeer(this->peer(), transfer().root_hash());
    ...
}
...
void Channel::AddPexReq(Datagram& dgram) {
    // check whether we need to send a PEX_REQ
    if (!transfer().peerSelector->IsNeedPeer()) return;
    // yes: put the PEX_REQ message into the datagram
    dgram.Push8(SWIFT_PEX_REQ);
    dprintf("%s #%u +pex req\n", tintstr(), id);
}

void Channel::OnPexReq(Datagram& dgram) {
    // we received a PEX_REQ message; record it
    peer_req_rvd = true;
    dprintf("%s #%u -pex req\n", tintstr(), id);
}

void Channel::AddPex(Datagram& dgram) {
    // check whether we need to send peers to the other side
    if (!peer_req_rvd) return; // if false, just return
    ...
}
...
Source Code 5.3: sendrecv.cpp
5.2 Implementation of the adapter module
Two modules, selector.h and selector.cpp, were designed, implementing the
class Selector. The functions used by libswift and PeerSelector are
public: AddPeer, DelPeer, GetPeers, and IsNeedPeer are used by libswift,
while NeedPeer is used by PeerSelector.

In the private section there are three variables: need_peer — the switch
we designed to indicate whether libswift needs to send a PEX_REQ message;
inpeers — a queue used to save peers when no external peer selection
mechanism is present; and p_Btselector — a pointer to the PeerSelector
instance.
class Selector {
public:
    Selector();
    // add a peer to the database
    void AddPeer(const Address& addr, const Sha1Hash& root);
    // delete a peer from the database
    void DelPeer(const Address& addr, const Sha1Hash& root);
    // do not provide this peer back within half an hour
    void SuspendPeer(const Address& addr, const Sha1Hash& root);
    // whether the peer selector wants more peers
    void NeedPeer(bool need);
    // used by the swift core to decide whether to send a PEX_REQ
    bool IsNeedPeer() { return need_peer; }
    // get peers from the database; type: 0 score, 1 rtt, 2 as-hops, 3 random
    std::vector<Address> GetPeers(int type, const Sha1Hash& for_root);
    // future use
    Address GetBestPeer();
private:
    bool need_peer;
    // all peers received by swift; the queue mentioned in Section 5.2
    deque<Address> inpeers;
#ifdef BITswift
    // the third-party peer selector
    BitSwiftSelector* p_Btselector;
#endif
};
Source Code 5.4: selector.h
Module selector.cpp implements the adapter functions. In the constructor
we initialize PeerSelector and set the switch to true (the default
behavior). Function AddPeer first checks whether PeerSelector is null; if
it is, the peer is saved to the inpeers queue. Similarly, function
GetPeers checks whether PeerSelector is null; if it is, the peers stored
in the queue are returned.
Selector::Selector() {
#ifdef BITswift
    // initialize the third-party peer selector
    p_Btselector = new BitSwiftSelector();
#endif
    // by default, swift needs peers
    NeedPeer(true);
}

void Selector::AddPeer(const Address& addr, const Sha1Hash& root) {
    // if there is no peer selector, just save the peer in the queue
    if (p_Btselector == NULL) {
        inpeers.push_back(addr);
        return; // without this return the null selector would be used below
    }
    // first convert the peer address and root hash
    // to the format used by the peer selector
    uint32_t ipv4 = ntohl(addr.addr.sin_addr.s_addr);
    char rs[20] = { 0 };
    sprintf(rs, "%i.%i.%i.%i", ipv4 >> 24, (ipv4 >> 16) & 0xff,
            (ipv4 >> 8) & 0xff, ipv4 & 0xff);
    string ipString(rs);
    // then call addpeer to save it to the peer selector
    p_Btselector->addpeer(ipString, addr.port(), root.hex());
}

void Selector::DelPeer(const Address& addr, const Sha1Hash& root) {
    // if there is no peer selector, delete the peer from the queue
    if (p_Btselector == NULL) {
        for (size_t i = 0; i < inpeers.size(); ) {
            if (inpeers[i] == addr)
                inpeers.erase(inpeers.begin() + i); // do not advance after erasing
            else
                ++i;
        }
        return;
    }
    uint32_t ipv4 = ntohl(addr.addr.sin_addr.s_addr);
    char rs[20] = { 0 };
    sprintf(rs, "%i.%i.%i.%i", ipv4 >> 24, (ipv4 >> 16) & 0xff,
            (ipv4 >> 8) & 0xff, ipv4 & 0xff);
    string ipString(rs);
    // then call deletepeer to remove it from the peer selector
    p_Btselector->deletepeer(ipString, addr.port(), root.hex());
}

std::vector<Address> Selector::GetPeers(int type, const Sha1Hash& for_root) {
    std::vector<ipPorts> ip_port_list;
    std::vector<Address> peers;
    // if there is no peer selector, return the peers from our queue
    if (p_Btselector == NULL) {
        std::copy(inpeers.begin(), inpeers.end(), std::back_inserter(peers));
        return peers;
    }
    // otherwise call the getpeer function of the peer selector
    p_Btselector->getpeer(for_root.hex(), type, ip_port_list, 20);
    // then convert the result to the type used by the swift core
    for (std::vector<ipPorts>::const_iterator j = ip_port_list.begin();
         j != ip_port_list.end(); ++j) {
        Address addr(j->ipAddress.c_str(), j->port);
        peers.push_back(addr);
    }
    return peers;
}

void Selector::SuspendPeer(const Address& addr, const Sha1Hash& root) {
    // future use; needs peer selector support, does not affect our experiments
}

void Selector::NeedPeer(bool need) {
    need_peer = need;
}

Address Selector::GetBestPeer() {
    // future use; needs peer selector support, does not affect our experiments
}
Source Code 5.5: selector.cpp
5.3 Integration with PeerSelector
PeerSelector implements BitSwiftSelector class in order to save peers to
the local database, and request “ranked” peers from the database. We use
this class, BitSwiftSelector.h and BitSwiftSelector.cpp respectively, to
access the functions of PeerSelector.
class BitSwiftSelector {
public:
    void addpeer(string ip, int port, string hash);
    void deletepeer(string ip, int port, string hash, int type = 100);
    void getpeer(string hash, int type, std::vector<ipPorts>& ip_port,
                 int count = 0);
    ...
};
Source Code 5.6: BitSwiftSelector.h
Chapter 6
Experimental Evaluation
We conducted experiments to test the functions of the implemented modules
and to make sure the PEX_REQ message modifications work as expected. These
experiments also evaluate whether PeerSelector works according to its
design and whether libswift's download performance improves. In summary,
the first experiments test PEX_REQ; the rest evaluate the adapter
extension.
6.1 PEX modifications
Setup
We configured a swarm with three seeders running on the same computer
under different ports, and a leecher running on a separate machine. The
leecher connects to the initial seeder in two test cases. In the first
test case, the leecher calls NeedPeer(false) in the adapter's constructor,
meaning it does not need peers and should not send a PEX_REQ to the
initial seeder. In the second test case, the leecher calls NeedPeer(true)
in the adapter's constructor, meaning it needs peers and will send a
PEX_REQ to the initial seeder.
Results
Figure 6.1 shows that if the leecher does not send a PEX_REQ message to
the initial seeder, it cannot receive peers. Figure 6.2 shows that if the
leecher sends a PEX_REQ message to the initial seeder, it receives peers
as a result. Therefore, our design and implementation work as expected.
Figure 6.1: Test case 1 Figure 6.2: Test case 2
6.2 Adapter extension
Testbed environment
To set up a simulation swarm, we chose sixteen PlanetLab nodes running
Fedora Linux 2.6 (32-bit). In the swarm, one computer acts as the leecher
and all the other computers act as seeders. Table 6.1 lists the properties
of these machines. For the geographic score, a smaller value means closer
to the leecher, and a smaller RTT means lower data delay. AS-hop and RTT
values are relative to the leecher.
To run our implementation, we installed GNU Make 3.81 and gcc version
4.4.4 on the PlanetLab machines. To be able to run PeerSelector, we
installed mysql-devel, mysql-server and mysql++; PeerSelector uses MySQL
to save peer addresses. Finally, we also installed curl-devel and GeoIP,
as PeerSelector uses these utilities to retrieve the location of an IP
address.

IP Address Port AS-hop Score RTT(ms) Description
193.xxx.xxx.35 leecher
193.xxx.xxx.36 8000 0 1 0.24327 seeder
128.xxx.xxx.20 8000 3 7955 141.721 seeder
128.xxx.xxx.21 8000 3 7955 141.71 seeder
193.xxx.xxx.250 8000 2 626 8.7664 seeder
193.xxx.xxx.251 8000 2 626 8.74429 seeder
129.xxx.xxx.194 8000 3 8572 153.402 seeder
128.xxx.xxx.197 8000 3 9375 246.033 seeder
128.xxx.xxx.198 8000 3 9375 250.857 seeder
128.xxx.xxx.199 8000 3 9375 252.049 seeder
140.xxx.xxx.180 8000 2 9632 313.362 seeder
128.xxx.xxx.52 8000 4 7882 142.071 seeder
128.xxx.xxx.53 8000 4 7882 145.013 seeder
200.xxx.xxx.34 8000 3 9905 269.373 seeder
192.xxx.xxx.11 8000 2 525 0.874 initial seeder
192.xxx.xxx.12 8000 2 525 0.86848 seeder
Table 6.1: PlanetLab machine properties
We put the target file on all seeders. We then chose two machines, one
running as the leecher and another as the initial seeder: in our tests,
193.xxx.xxx.35 is the leecher and 192.xxx.xxx.11 is the initial seeder.
The initial seeder knows the other seeders' IP addresses, and these peers
are sent to the leecher as soon as the connection is set up. Since the
initial seeder tracks the rest of the peers and is the first peer the
leecher contacts, it is also considered a tracker.
When the leecher is launched, it reads a configuration file named config
for three properties:
• ROLE — defines whether libswift functions as a leecher or a seeder. We
set this value to 1, which means this instance is a leecher
• PEERNO — defines how many concurrent peers the leecher can communicate
with. In our tests, the number is 5
• SORT — defines which policy should be used to sort peers. Number 0
stands for geographic location, number 1 stands for AS-hop, number 2
stands for RTT, and number 3 stands for Random.
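For illustration, the config file for our leecher could look as follows.
The exact syntax is not important here; this hypothetical rendering simply
shows the three properties with values used in our tests (SORT=2 selects
the RTT policy):

ROLE=1      # run as a leecher
PEERNO=5    # communicate with at most 5 concurrent peers
SORT=2      # rank peers by RTT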
6.2.1 Experimental results
We tested the adapter module with the four policies of PeerSelector. In
the following sections, we show some of the obtained results for each
policy.
Score

The score policy ranks peers according to the geographic distance between
two peers; the closest peer is the first one returned by PeerSelector.
From Table 6.2, we know that seeder B is the closest to the leecher and
seeder C the second closest.

Seeder IP Score
A 193.xxx.xxx.250 626
B 193.xxx.xxx.36 1
C 192.xxx.xxx.12 525
D 193.xxx.xxx.251 626
Table 6.2: Score-selected peers

Figure 6.3 shows that the leecher needed 42 seconds to finish downloading
the target file in the first trial and 48 seconds in the second. Excluding
the initial waiting time, downloading the content takes 37 seconds in the
first trial and 29 seconds in the second.
Figure 6.3: Download performance for Score

From Figure 6.3 we cannot see an obvious trend. However, in Figures 6.4
and 6.5, we see that seeder B contributed about 80% of the content; the
scores saved by PeerSelector confirm that seeder B has the lowest score.
It should also be noted that the initial delay is caused by the peer
sorting process, which was even longer in the first trial due to an
implementation bug (see Fig. 6.4).
Figure 6.4: Score, trial 1 Figure 6.5: Score, trial 2
AS-hop count
This policy ranks peers according to the AS-hop count between two peers.
For example, if peers A and B are in the same AS, they have the same AS
number and the hop count from A to B is zero; peer A's PeerSelector
therefore ranks peer B in first place.
Seeder IP Hop Count
A 193.xxx.xxx.250 2
B 193.xxx.xxx.36 0
C 192.xxx.xxx.12 2
D 193.xxx.xxx.180 2
Table 6.3: AS-hops-selected peers
Figure 6.6 shows that the first trial took 36 seconds to finish and the
second trial 51 seconds. Excluding the waiting time, the leecher takes 34
seconds to download the content in the first trial and 48 seconds in the
second.
Figure 6.6: Download performance for AS-hops
From Figures 6.7 and 6.8 we can see that one seeder contributed more than
70% of the content. From Table 6.3, we learn that the AS-hop count to
seeder B is zero, while it is two for the other seeders. Seeder B clearly
performs better than the others: it not only has a zero AS-hop count to
the leecher but also the smallest distance. Finally, Figure 6.8 shows that
during the interval from second 15 to 22, seeder B stopped contributing.
This can happen if seeder B's machine runs at 100% CPU or runs out of
memory; a temporary failure of seeder B's gateway could also cause it.
Figure 6.7: AS-hops, trial 1 Figure 6.8: AS-hops, trial 2
RTT policy
This policy ranks peers based on the round-trip time between two
computers: the peer with the smallest RTT to the leecher is placed first
in PeerSelector's queue.
Seeder IP RTT
A 193.xxx.xxx.250 8.7664
B 193.xxx.xxx.36 0.24327
C 192.xxx.xxx.12 0.86848
D 193.xxx.xxx.251 8.74429
Table 6.4: RTT-selected peers
Figure 6.9 shows very different results compared to the previous two
policies. Overall, each trial took more than 100 seconds to finish.
However, if we ignore the time spent before the actual download started,
the download times are not long: 30 and 31 seconds, respectively. The
reason for the delay is that the PeerSelector version we tested spends a
very long time calculating the RTT of peers. In our test case,
PeerSelector pings fourteen peers, each 10 times, to obtain the average
RTT, 140 pings in total.
Figure 6.9: Download performance for RTT
Figures 6.10 and 6.11 show that seeder B provides most of the content,
with seeder A contributing the second largest share. The RTT values in
Table 6.4 indicate that seeder B has the smallest value and is thus ranked
first by PeerSelector. In the first trial, all lines are flat around
second 110; this can happen if the leecher's machine runs at 100% CPU or
runs out of memory.
Figure 6.10: RTT, Trial 1 Figure 6.11: RTT, Trial 2
Random policy
When we use this policy to select peers, PeerSelector does not rank them:
whenever libswift requests peers, it returns a random peer list. The first
trial took the leecher 41 seconds to finish and the second trial 49
seconds. Deducting the time spent before libswift starts downloading
content, the periods are 46 and 40 seconds (see Fig. 6.12).
Figure 6.12: Download performance for Random
According to Figures 6.13 and 6.14, no seeder is superior to the others;
all peers provided by PeerSelector were randomly selected.

Figure 6.13: Random, Trial 1 Figure 6.14: Random, Trial 2
6.3 Comparison of policies
In Table 6.5 we can see that the RTT policy is superior to the other
policies, with score the second best. The Random policy is equivalent to
libswift downloading from peers according to its default behavior.

Policies Trials Time (s) DW speed (MBps)
Score trial 1 37 1.06
trial 2 29 1.36
AS-hops trial 1 34 1.16
trial 2 48 0.82
RTT trial 1 30 1.32
trial 2 31 1.27
Random trial 1 46 0.86
trial 2 40 0.98
Table 6.5: Comparison of policy-based performance

Compared to the Random policy, on average, the score policy improves
download speed by 31% and RTT improves performance by up to 40%. The
AS-hop policy exhibits very different results: in the first trial
performance improves by 26%, in the second trial speed decreases by 11%.
From our logs, we know that there are 5 peers whose AS-hop count is two
and 7 peers whose AS-hop count is three. When PeerSelector ranks the peers
whose AS-hop count equals two, it cannot tell these 5 peers apart. Given
1000 peers, there could be many peers with the same AS-hop count but very
different capabilities, so libswift may not always get the best-performing
peers. However, we can use the AS-hop count policy as a first filter and,
based on its results, apply a second filter such as RTT to further select
better peers.
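Such a two-stage selection could look as follows; the PeerInfo record is
the same hypothetical type as in the ranking sketch of Section 4.2, and
this is a proposal, not part of the current PeerSelector:

#include <algorithm>
#include <vector>

struct PeerInfo { double rtt_ms; int as_hops; };

// First order by AS-hop count, then break ties by RTT, as proposed above.
void rankByAsHopsThenRtt(std::vector<PeerInfo>& peers) {
    std::sort(peers.begin(), peers.end(),
              [](const PeerInfo& a, const PeerInfo& b) {
                  if (a.as_hops != b.as_hops)
                      return a.as_hops < b.as_hops; // filter 1: fewer AS hops
                  return a.rtt_ms < b.rtt_ms;       // filter 2: smaller RTT
              });
}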
6.3.1 Side effects of PeerSelector
We have seen that PeerSelector needs time to rank peers; we call this the
ranking time. The more peers there are, the longer the ranking time. There
are two ways to reduce it. First, optimize PeerSelector's ranking method:
when using RTT, for example, we could ping each peer once instead of 10
times, cutting the ranking time by a factor of ten. Second, rank peers
based on historical records: if PeerSelector already has peer A's
geographic score in its database, it does not need to calculate the score
again the next time it encounters peer A.
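The second idea amounts to a simple cache in front of the expensive
computation. A sketch, with computeGeoScore standing in for the actual
GeoIP lookup (an assumed helper, not PeerSelector's real API):

#include <map>
#include <string>

double computeGeoScore(const std::string& addr); // assumed expensive GeoIP lookup

static std::map<std::string, double> score_cache; // peer address -> score

double geoScore(const std::string& addr) {
    std::map<std::string, double>::iterator it = score_cache.find(addr);
    if (it != score_cache.end())
        return it->second;            // cache hit: no recalculation
    double s = computeGeoScore(addr); // compute once on first encounter
    score_cache[addr] = s;
    return s;
}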
Part III
Discussion and Conclusions
Chapter 7
Discussion
In this thesis, we took the libswift protocol and analyzed its most
important processes and inner heuristics. In particular, we studied the
peer discovery process employed in libswift, the PEX gossiping algorithm,
and discovered its limitations; we then modified PEX messaging and tested
the new functionality. Through experimentation and observation, we
investigated libswift's behavior when it retrieves pieces over multiple
channels. Finally, we studied piece picking and congestion control,
emphasized their vulnerabilities, and suggested future enhancements for
them.
Having studied the above processes and having understood current
limitations in the implementation, we designed and implemented a simple
module — adapter — that acts as an interface between libswift and a peer
selection mechanism, helping libswift manage its peers more effectively.
In our experiments, we also discovered that some peers contribute more
content than others. In particular, we found that peers with certain
properties exhibit better performance and thus influence libswift's
download performance. Our tests with peers that have a small RTT value or
AS-hop count, or that are geographically closer, provide preliminary
evidence that such peers can improve download performance for a libswift
peer.
7.1 Libswift
We found that libswift downloads content unevenly from the peers it is
connected with. When peers have similar RTTs, libswift retrieves content
from them roughly equally; we observed this in the test cases for groups
one and two in Section 6.2.1, where the two channels between the leecher
and the seeders carry balanced traffic. In the test cases with group
three, however, one channel is very busy, with its seeder contributing
more than 90% of the content, while the other channel is almost idle, its
seeder waiting to be updated with the latest bin number. This uneven
contribution is due to libswift's piece selection strategy as well as its
congestion control. Currently, libswift downloads content pieces
sequentially and the seeders compete to serve the leecher. In the test
case with group three, seeder two's RTT is approximately 487 times larger
than seeder one's; according to LEDBAT, seeder one's congestion window
therefore grows much faster, giving seeder one a much higher data transfer
rate.
We consider that an improved piece selection mechanism, similar to
BitTorrent's, should be adopted in libswift for enhanced performance. By
randomly distributing piece requests among channels, download efficiency
should improve, as there is less chance that multiple peers will be
serving the same chunk to the leecher.
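A random picker is only a few lines; the sketch below assumes the leecher
keeps a list of still-missing bin numbers (the names are illustrative, not
libswift's):

#include <cstdint>
#include <random>
#include <vector>

// Pick a random missing chunk for the next request on a channel, so that
// concurrent channels rarely request the same chunk. The caller must
// ensure that 'missing' is non-empty.
uint64_t pickRandomMissing(const std::vector<uint64_t>& missing,
                           std::mt19937& rng) {
    std::uniform_int_distribution<size_t> pick(0, missing.size() - 1);
    return missing[pick(rng)];
}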
When evaluating our adapter module, we found another interesting result.
When libswift wants to connect to peers, it sends a handshake message to
the "best" peers returned by PeerSelector; however, the leecher does not
always receive responses from these best peers first. In the experiments
of Section 6.2, seeder B has a smaller RTT value than seeder D and the
other seeders, yet although libswift sent a handshake to seeder B first,
it was seeder D that responded to the leecher first.

Investigating this issue more closely, the process of downloading content
from these peers can be divided into two phases. During the first phase,
peers that received a handshake request initialize new channels to the
leecher. This initialization time depends on the machine's computing
speed; if seeder D computes faster than seeder B, seeder D may respond to
the leecher before seeder B does. A quick test of the machines' computing
speed confirmed that seeder D is indeed much faster. Consequently, RTT is
not the only factor affecting response time. In this circumstance, if a
leecher were to download a very small target file, it may be better to
download from a machine with higher computing speed than from one with
lower RTT.
7.2 Extensions
Before we enhanced the PEX functionality, libswift would passively receive
peers from seeders, violating the protocol specification. Furthermore, if
a swarm were as large as, say, 5000 peers, a libswift instance would
sooner or later receive all 5000 of them. Modifying the PEX behavior so
that peers are received only when actively requested avoids such issues.
Secondly, the adapter extension we devised enables libswift to manage the
peers it discovers more efficiently. In particular, when the adapter is
combined with a peer selection mechanism, libswift gains the ability to
describe its peers in terms of RTT, geo-location, and AS-hop count. This
feature, in turn, enables libswift to fine-tune its performance by
selecting peers with desired properties.
Regarding the policies for peer selection, some of the assumptions behind
them are not necessarily true. For example, the RTT policy assumes that a
peer with a better RTT to the leecher yields a faster download. Consider,
however, an extreme situation: seeder one's RTT is only a tenth of seeder
two's, but seeder two is one hundred times faster than seeder one. In this
case, seeder two gives the better download performance, a phenomenon that
also appeared in our tests.

Further, the theory behind the score policy is that if two computers are
geographically closer, the wire link between them is shorter and traffic
transfer is faster. This is only true if we ignore the computers' speed
and the traffic delay between different ASes.
In our implementation, we considered these properties independently in
order to reduce the complexity of the design and realization. Taking other
factors (peer properties) and the correlations between them into
consideration could improve system performance further.
Chapter 8
Conclusions
In this thesis, we argued that peer management is an important mechanism
that affects the performance of the libswift protocol. Through analysis
and experimentation, we showed that peer management indeed impacts
libswift's download speed and efficiency.
In particular, we investigated and profiled libswift behavior for scenarios
with congestion and multiple-channel download. As a result of our research,
we identified vulnerabilities in the peer management and piece picking
processes of libswift protocol and suggested improvements for them.
Specifically, we complemented the current peer discovery algorithm so that
a peer can discover new peers proactively rather than passively, reducing
the overhead of establishing unnecessary communication channels.

The main lesson we drew from analyzing the piece picking algorithm is that
it raises competition among seeders; an alternative, random piece picking
algorithm should therefore be considered.

The important inferences we can make about congestion control are that
changing the size of the congestion window impacts the data transfer rate
and thus the download speed, and that more suitable values for the initial
window size, TARGET, and GAIN should be investigated.

Having made the above inferences, we came to the natural conclusion that a
peer management mechanism is necessary for libswift to manage its peers
more effectively. We therefore designed and implemented an adapter module
that libswift can use to select and manage peers and, as a result, improve
download speed and reduce overhead. Preliminary results in this thesis
suggest that our adapter works as designed and, moreover, improves
libswift's download performance.
Appendix A
Appendix
...
192.xxx.xxx.11:8000 1024 4063544 chunk=(0,4244)
193.xxx.xxx.250:8000 1024 4064668 chunk=(0,4256)
192.xxx.xxx.11:8000 1024 4065273 chunk=(0,4245)
193.xxx.xxx.250:8000 1024 4067050 chunk=(0,4257)
192.xxx.xxx.11:8000 1024 4070193 chunk=(0,4246)
192.xxx.xxx.11:8000 1024 4070870 chunk=(0,4247)
193.xxx.xxx.250:8000 1024 4071050 chunk=(0,4258)
192.xxx.xxx.11:8000 1024 4071526 chunk=(0,4248)
192.xxx.xxx.11:8000 1024 4072356 chunk=(0,4249)
193.xxx.xxx.250:8000 1024 4072445 chunk=(0,4259)
192.xxx.xxx.11:8000 1024 4072922 chunk=(0,4250)
193.xxx.xxx.250:8000 1024 4073411 chunk=(0,4260)
193.xxx.xxx.250:8000 1024 4073611 chunk=(0,4261)
192.xxx.xxx.11:8000 1024 4074055 chunk=(0,4251)
192.xxx.xxx.11:8000 1024 4074512 chunk=(0,4252)
193.xxx.xxx.250:8000 1024 4074897 chunk=(0,4262)
192.xxx.xxx.11:8000 1024 4075262 chunk=(0,4253)
193.xxx.xxx.250:8000 1024 4075684 chunk=(0,4263)
192.xxx.xxx.11:8000 1024 4076110 chunk=(0,4254)
193.xxx.xxx.250:8000 1024 4076399 chunk=(0,4264)
192.xxx.xxx.11:8000 1024 4076504 chunk=(0,4255)
(droped) 192.xxx.xxx.11:8000 1024 4077195 chunk=(0,4256)
193.xxx.xxx.250:8000 1024 4077316 chunk=(0,4265)
(droped) 192.xxx.xxx.11:8000 1024 4077382 chunk=(0,4257)
193.xxx.xxx.250:8000 1024 4078310 chunk=(0,4266)
(droped) 192.xxx.xxx.11:8000 1024 4079086 chunk=(0,4258)
193.xxx.xxx.250:8000 1024 4079135 chunk=(0,4267)
(droped) 192.xxx.xxx.11:8000 1024 4079658 chunk=(0,4259)
193.xxx.xxx.250:8000 1024 4080033 chunk=(0,4268)
(droped) 192.xxx.xxx.11:8000 1024 4080275 chunk=(0,4260)
193.xxx.xxx.250:8000 1024 4080707 chunk=(0,4269)
(droped) 192.xxx.xxx.11:8000 1024 4081169 chunk=(0,4261)
193.xxx.xxx.250:8000 1024 4081527 chunk=(0,4270)
(droped) 192.xxx.xxx.11:8000 1024 4082012 chunk=(0,4262)
193.xxx.xxx.250:8000 1024 4082434 chunk=(0,4271)
(droped) 192.xxx.xxx.11:8000 1024 4083269 chunk=(0,4263)
193.xxx.xxx.250:8000 1024 4083489 chunk=(0,4272)
(droped) 192.xxx.xxx.11:8000 1024 4084073 chunk=(0,4264)
(droped) 192.xxx.xxx.11:8000 1024 4084332 chunk=(0,4265)
193.xxx.xxx.250:8000 1024 4084674 chunk=(0,4273)
(droped) 192.xxx.xxx.11:8000 1024 4087751 chunk=(0,4266)
(droped) 192.xxx.xxx.11:8000 1024 4089016 chunk=(0,4267)
193.xxx.xxx.250:8000 1024 4089434 chunk=(0,4274)
(droped) 192.xxx.xxx.11:8000 1024 4090456 chunk=(0,4268)
193.xxx.xxx.250:8000 1024 4090517 chunk=(0,4275)
193.xxx.xxx.250:8000 1024 4110318 chunk=(0,4276)
193.xxx.xxx.250:8000 1024 4111347 chunk=(0,4277)
193.xxx.xxx.250:8000 1024 4111416 chunk=(0,4278)
193.xxx.xxx.250:8000 1024 4112552 chunk=(0,4279)
(droped) 192.xxx.xxx.11:8000 1024 4112725 chunk=(0,4269)
193.xxx.xxx.250:8000 1024 4113319 chunk=(0,4280)
(droped) 192.xxx.xxx.11:8000 1024 4113882 chunk=(0,4270)
(droped) 192.xxx.xxx.11:8000 1024 4114206 chunk=(0,4271)
(droped) 192.xxx.xxx.11:8000 1024 4114715 chunk=(0,4272)
193.xxx.xxx.250:8000 1024 4115820 chunk=(0,4281)
(droped) 192.xxx.xxx.11:8000 1024 4116120 chunk=(0,4273)
193.xxx.xxx.250:8000 1024 4116515 chunk=(0,4282)
(droped) 192.xxx.xxx.11:8000 1024 4117447 chunk=(0,4274)
193.xxx.xxx.250:8000 1024 4117895 chunk=(0,4283)
(droped) 192.xxx.xxx.11:8000 1024 4118699 chunk=(0,4275)
193.xxx.xxx.250:8000 1024 4119161 chunk=(0,4284)
(droped) 192.xxx.xxx.11:8000 1024 4119712 chunk=(0,4276)
(droped) 192.xxx.xxx.11:8000 1024 4119832 chunk=(0,4277)
193.xxx.xxx.250:8000 1024 4121005 chunk=(0,4285)
(droped) 192.xxx.xxx.11:8000 1024 4121305 chunk=(0,4278)
193.xxx.xxx.250:8000 1024 4121674 chunk=(0,4286)
(droped) 192.xxx.xxx.11:8000 1024 4122011 chunk=(0,4279)
193.xxx.xxx.250:8000 1024 4125762 chunk=(0,4287)
(droped) 192.xxx.xxx.11:8000 1024 4126226 chunk=(0,4280)
193.xxx.xxx.250:8000 1024 4126473 chunk=(0,4288)
(droped) 192.xxx.xxx.11:8000 1024 4127015 chunk=(0,4281)
(droped) 192.xxx.xxx.11:8000 1024 4127387 chunk=(0,4282)
193.xxx.xxx.250:8000 1024 4127633 chunk=(0,4289)
193.xxx.xxx.250:8000 1024 4127696 chunk=(0,4290)
(droped) 192.xxx.xxx.11:8000 1024 4128487 chunk=(0,4283)
193.xxx.xxx.250:8000 1024 4128799 chunk=(0,4291)
(droped) 192.xxx.xxx.11:8000 1024 4132360 chunk=(0,4284)
(droped) 192.xxx.xxx.11:8000 1024 4132826 chunk=(0,4285)
193.xxx.xxx.250:8000 1024 4134114 chunk=(0,4292)
(droped) 192.xxx.xxx.11:8000 1024 4135121 chunk=(0,4286)
193.xxx.xxx.250:8000 1024 4135173 chunk=(0,4293)
(droped) 192.xxx.xxx.11:8000 1024 4135666 chunk=(0,4287)
193.xxx.xxx.250:8000 1024 4135717 chunk=(0,4294)
(droped) 192.xxx.xxx.11:8000 1024 4136038 chunk=(0,4288)
193.xxx.xxx.250:8000 1024 4136473 chunk=(0,4295)
(droped) 192.xxx.xxx.11:8000 1024 4136535 chunk=(0,4289)
193.xxx.xxx.250:8000 1024 4137154 chunk=(0,4296)
(droped) 192.xxx.xxx.11:8000 1024 4137261 chunk=(0,4290)
193.xxx.xxx.250:8000 1024 4142614 chunk=(0,4297)
(droped) 192.xxx.xxx.11:8000 1024 4142709 chunk=(0,4291)
193.xxx.xxx.250:8000 1024 4143231 chunk=(0,4298)
(droped) 192.xxx.xxx.11:8000 1024 4143429 chunk=(0,4292)
193.xxx.xxx.250:8000 1024 4143486 chunk=(0,4299)
(droped) 192.xxx.xxx.11:8000 1024 4144043 chunk=(0,4293)
...
Source Code A.1: Logs
Bibliography
[1] Gnuplot. http://www.gnuplot.info/.
[2] MaxMind GeoIP. http://www.maxmind.com/app/ip-location.
[3] Planetlab testbed. https://www.planet-lab.org/.
[4] V. Aggarwal, A. Feldmann, and Ch. Scheideler. Can ISPs and P2P users
cooperate for improved performance? SIGCOMM Comput. Commun.
Rev., 2007.
[5] R. Birke, E. Leonardi, M. Mellia, A. Bakay, T. Szemethy, C. Kiraly, and
L. Cigno. Architecture of a network-aware P2P-TV application: the
NAPA-WINE approach. In IEEE Communications Magazine, 2011.
[6] Bram Cohen. Bittorrent protocol specification.
http://www.bittorrent.org/beps/bep_0003.html.
[7] Bram Cohen. Incentives build robustness in bittorrent. In Proceedings
of the Workshop on Economics of Peer-to-Peer Systems, 2003.
[8] C. Cramer, K. Kutzner, and Th. Fuhrmann. Bootstrapping locality-
aware P2P networks. In 12th IEEE International Conference in
Networks (ICON), 2004.
[9] L. D'Acunto, N. Andrade, J. Pouwelse, and H. Sips. Peer selection
strategies for improved QoS in heterogeneous bittorrent-like VoD
systems. In IEEE International Symposium in Multimedia (ISM), 2010.
[10] R. Alimi et al. ALTO protocol.
http://tools.ietf.org/html/draft-ietf-alto-protocol-13.
[11] T. Eugene and H. Zhang. Predicting Internet network distance with
coordinates-based approaches. In INFOCOM, 2002.
[12] P. Francis, S. Jamin, V. Paxson, L. Zhang, D. Gryniewicz, and Y. Jin.
An architecture for a global Internet host distance estimation service.
In INFOCOM, 1999.
[13] Victor Grishchenko. Libswift protocol. http://libswift.org/.
[14] Victor Grishchenko. Libswift protocol implementation.
http://github.com/triblerteam/libswift.
[15] Victor Grishchenko and Arno Bakker. IETF PPSP working
group draft technical description, Peer-to-Peer Streaming Protocol.
http://datatracker.ietf.org/doc/draft-ietf-ppsp-peer-protocol/.
[16] BitTorrent Inc. DHT protocol specification.
http://www.bittorrent.org/beps/bep_0005.html.
[17] H. Iyengar, M. Kuehlewind, and S. Shalunov. Low extra delay
background transport (LEDBAT).
http://tools.ietf.org/html/draft-ietf-ledbat-congestion-09.
[18] Rakesh Kumar. PeerSelector: A framework for grouping peers in a P2P
system. 2012.
[19] R. LaFortune and C. Carothers. Simulating large-scale P2P assisted
video streaming. In HICSS, 2009.
[20] F. Lehrieder, S. Oechsner, T. Hossfeld, Z. Despotovic, W. Kellerer,
and M. Michel. Can P2P-users benefit from locality-awareness? In IEEE
Proceedings of Peer-to-Peer Computing, 2010.
[21] Y. Liu, H. Wang, Y. Lin, and Sh. Cheng. Friendly P2P: Application-
level congestion control for peer-to-peer applications. In IEEE
GLOBECOM, 2008.
[22] Jamshid Mahdavi and Sally Floyd. TCP-friendly unicast rate-based
flow control. ftp://ftp.cs.umass.edu/pub/net/cs691q/tcp-friendly.txt.
[23] D. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne,
B. Richard, S. Rollins, and Z. Xu. Peer-to-Peer Computing. 2003.
[24] J. Pouwelse, P. Garbacki, D. Epema, and H. Sips. An introduction to
the Bittorrent Peer-to-Peer file-sharing system. Report, 2005.
[25] NAPA WINE Project. Network-aware P2P-TV application over wise
networks. http://www.napa-wine.eu.
[26] Rudiger Schollmeier. A definition of peer-to-peer networking for
the classification of peer-to-peer architectures and applications. In
Proceedings of P2P, 2002.
[27] J. Seedorf, S. Kiesel, and M. Stiemerling. Traffic localization for
P2P-applications: The ALTO approach. In IEEE Ninth International
Conference in Peer-to-Peer Computing, 2009.
[28] Jan Seedorf. Application-Layer Traffic Optimization. RFC 5693, 2009.
[29] R. Steinmetz and K. Wehrle. Peer-to-Peer Systems and Applications.
2005.
[30] H. Wang, J. Liu, and K. Xu. On the locality of Bittorrent-based video
file swarming. In Proceedings of IPTPS. Usenix, 2009.
[31] H. Xie, R. Yang, A. Krishnamurthy, G. Liu, and A. Silberschatz. P4P:
provider portal for applications. In Proceedings of the ACM SIGCOMM
2008 conference on Data communication.