Master Thesis - TRITA-ICT-EX-2012-262
LIBSWIFT P2P PROTOCOL: AN
ANALYSIS AND EXTENSION
Fu Tang
Design and Implementation of ICT products and systems
Royal Institute of Technology (KTH)
October 30th, 2012
Supervisor: Flutra Osmani
Examiner: Björn Knutsson
Department: ICT School, NSLab
Royal Institute of Technology (KTH)
Abstract
More and more end-users are using P2P protocols for content sharing,
on-demand and live streaming, contributing considerably to overall Internet
traffic. A novel P2P streaming protocol named libswift was developed
to give users a better service while consuming fewer resources
and transferring less unnecessary metadata. This master
thesis studies the inner functioning of libswift and analyzes some of the
vulnerabilities that directly impact performance of the protocol, namely
download speed and response delay.
By investigating the behavior of libswift in scenarios with multiple peers,
we found that the lack of a peer selection mechanism inside the protocol
affects download efficiency and response time. We also discovered that
libswift's internal piece picking algorithm creates competition among peers,
thus not fully utilizing connected peers. In addition, we found that the current
libswift implementation does not follow the specification for PEX peer
discovery, so we modified the PEX algorithm to support an additional message
that is used to proactively request new peers from those currently connected.
Having made these observations, we designed and implemented a peer
selection extension interface that allows for third-party peer selection
mechanisms to be used with the libswift protocol. We then tested the
interface (or adapter) with an example peer selection mechanism that groups
peers according to properties such as latency and locality. Preliminary
experimental data shows that using our extension with an external peer
selection mechanism enables libswift to select peers based on various metrics
and thus enhances its download speed.
We argue that libswift is a good protocol for next-generation content
delivery systems and that it can achieve faster data transfer rates and lower
latency by integrating efficient peer selection mechanisms.
Contents

I Background

1 Introduction
1.1 Basic Introduction of Libswift
1.2 Problem Description
1.3 Motivation
1.4 Hypothesis
1.5 Goal
1.5.1 Tasks
1.6 Structure

2 Background and Related Work
2.1 Peer-to-Peer networks
2.2 Peer-to-Peer protocols
2.3 BitTorrent protocol
2.3.1 Peer tracking and piece picking
2.4 Libswift protocol
2.4.1 Basic operations in libswift
2.5 Related work
2.6 Experimental Evaluation
2.6.1 Extensions for libswift
2.6.2 Software resources and tools

3 Overview and Analysis
3.1 Peer discovery basics
3.2 Peer exchange and selection
3.3 Analysis of multiple-channel download
3.4 Congestion control

II Design and Implementation

4 Protocol Extensions
4.1 PEX REQ design
4.2 Design of the adapter
4.2.1 Interaction between components

5 Implementation
5.1 Libswift modifications
5.2 Implementation of the adapter module
5.3 Integration with PeerSelector

6 Experimental Evaluation
6.1 PEX modifications
6.2 Adapter extension
6.2.1 Experimental results
6.3 Comparison of policies
6.3.1 Side effects of PeerSelector

III Discussion and Conclusions

7 Discussion
7.1 Libswift
7.2 Extensions

8 Conclusions

A Appendix
List of Figures

2.1 P2P network
2.2 Centralized network
2.3 Metainfo with tracker
2.4 Metainfo, trackerless
2.5 Data request and response
2.6 Channels and file transfer
2.7 Chunk addressing in libswift
2.8 A libswift datagram
2.9 The process of connection
2.10 The process of exchanging data
3.1 The flow of PEX messages
3.2 State machine of the leecher
3.3 State machine of the seeder
3.4 Message flow I
3.5 Message flow II
3.6 Piece picking algorithm
3.7 Three peers in the same swarm
3.8 The setting for three groups
3.9 Group 1, trial 1
3.10 Group 1, trial 2
3.11 Group 1, trial 3
3.12 Group 1, trial 4
3.13 Group 2, trial 1
3.14 Group 2, trial 2
3.15 Group 2, trial 3
3.16 Group 2, trial 4
3.17 Group 3, trial 1
3.18 Group 3, trial 2
3.19 Group 3, trial 3
3.20 Group 3, trial 4
3.21 Seeder's HAVE and sent DATA messages
3.22 Congestion window size
3.23 Overall view
3.24 Detailed view
4.1 Sender side
4.2 Receiver side
4.3 Design overview
4.4 The interaction between components
4.5 Adapter and libswift core
4.6 Adapter and PeerSelector
4.7 API used by PeerSelector
4.8 API used by libswift
6.1 Test case 1
6.2 Test case 2
6.3 Download performance for Score
6.4 Score, trial 1
6.5 Score, trial 2
6.6 Download performance for AS-hops
6.7 AS-hops, trial 1
6.8 AS-hops, trial 2
6.9 Download performance for RTT
6.10 RTT, trial 1
6.11 RTT, trial 2
6.12 Download performance for Random
6.13 Random, trial 1
6.14 Random, trial 2
List of Tables

6.1 PlanetLab machine properties
6.2 Score-selected peers
6.3 AS-hops-selected peers
6.4 RTT-selected peers
6.5 Comparison of policy-based performances
Part I
Background
Chapter 1
Introduction
During recent years, the Internet has grown rapidly, and many protocols and
architectures have been developed to let people obtain information from it.
In particular, P2P networks and protocols play an important role in helping
end-users retrieve and consume content.
1.1 Basic Introduction of Libswift
Libswift is a recently developed P2P protocol that may be understood as
BitTorrent at the transport layer [13]. Its mission is to disseminate content
among a group of devices. In its current form, it is not only a file-sharing
protocol, but it can also be used for streaming on-demand and live video [15].
Using libswift, a group of computers shares and downloads the same file,
where each computer is a peer to the others. These computers and the
file together form a swarm. Like BitTorrent1, libswift divides the file into
small pieces which can be spread among a group of computers; the computers
holding these pieces may be located in different countries or cities. This is
why we call a libswift network a distributed content network. When a user
wants to retrieve pieces or the whole content, libswift is responsible for
transferring that content back to the user.
1 BitTorrent (software), available at http://en.wikipedia.org/wiki/BitTorrent_(software)
1.2 Problem Description
Naturally, users want to obtain the content they are after, or some parts of it,
as soon as possible [20]. Libswift and other P2P protocols try to retrieve
content from several computers at the same time. Just as explained above,
several devices may be serving the same content, meaning that the user
can download the corresponding content from any computer of the given
swarm. If swarm size is, say, five or six then libswift would be requesting
pieces from all five or six computers and, as a result, download speed would
increase proportionally with swarm size.
However, if swarm size were to be around five thousand [19], we should
not allow libswift to retrieve pieces from all of the computers as resources
of a local PC or a mobile device are limited, including CPU speed, memory,
or traffic bandwidth. Each request to another peer would consume some
of such resources. If thousands of requests were to be sent concurrently,
the local device would exhaust its memory and other resources. In this
thesis, we want to improve download performance by limiting the number
of simultaneous requests to certain — more wisely chosen — peers only.
1.3 Motivation
When we download data from different computers or any other device, we
may get different download speeds from each. In a P2P network, there
are many factors that may affect the performance of each peer. These
factors or properties include, among others, a peer's location [20] and its
Autonomous System (AS) [30]. Our initial motivation is to study how these
properties affect the behavior of libswift and, more specifically, how such
properties affect libswift's download speed, data transfer rate, and latency.
1.4 Hypothesis
We argue that peer management directly impacts the efficiency and
performance of libswift protocol. No previous thorough analysis or solution
on peer management for libswift has been provided. Identifying and
developing a suitable peer management mechanism for libswift can reduce
its download time and resource utilization.
1.5 Goal
1. Characterize libswift behavior in scenarios where libswift downloads from
multiple peers, and assess the protocol in terms of its correctness.
2. Identify and devise a peer management mechanism that reduces both
download time and resource utilization.
1.5.1 Tasks
To achieve our goal we will understand and analyze the internal peer
discovery mechanism and peer management policy. In P2P systems, the
process of selecting peers affects download performance.
We will also analyze congestion control and how it affects the behavior
of libswift. Generally, congestion control affects data transfer rate.
In addition, we will investigate the piece picking algorithm and profile
libswift's behavior in multiple-channel download scenarios. In a P2P
protocol, piece (chunk) picking affects the efficiency of data transfer.
We will design and implement a module to help libswift manage its
peers better. This module should act as an interface (adapter) between
libswift and a third-party peer selection mechanism. The interface should
be integrated with libswift but the peer selection mechanism should be
independent of libswift and should not require any major modifications
inside libswift’s implementation.
To verify the functionality of our module, we will integrate libswift with
an example peer selection engine.
Finally, we will define and carry out an experimental evaluation to verify
whether libswift’s download speed has improved.
1.6 Structure
Chapter 2 introduces background and related work, as well as necessary
information needed to understand our thesis and the topic. Chapter 3
provides an overview and analysis of libswift, while Chapter 4 introduces
the overall design and design considerations for the extensions. Chapter 5
outlines implementation details, and Chapter 6 presents an experimental
evaluation of the extension and describes experimental results. We discuss
preliminary results and lessons learned in Chapter 7 and conclude in
Chapter 8.
Chapter 2
Background and Related Work
In this chapter, we explain basic concepts of peer-to-peer networks and
protocols, introduce related work, and describe relevant background infor-
mation necessary to understand the work in this thesis.
2.1 Peer-to-Peer networks
In a centralized network, computers connect to a central server where files
are commonly located. When users want to download a file, they will connect
to that server, as depicted in Figure 2.2. Napster was the first prominent
centralized P2P system [29].
Figure 2.1: P2P network Figure 2.2: Centralized network
A peer-to-peer (P2P) network is a network architecture which organizes
computers as peers of an equivalent role [24]. Figure 2.1 illustrates a peer-
to-peer network where computers are connected to each other and files are
distributed across such machines. When a user wants to obtain a file, he
may connect to any computer in that network. Unlike in a centralized
network, there is no obvious server in a distributed network. Instead, every
computer plays the same role as the others, and the computers call each other peers.
Gnutella is a typical decentralized peer-to-peer network [6].
2.2 Peer-to-Peer protocols
A peer-to-peer protocol is used by users to exchange content or other
information in a P2P network. Since the late 90s, such protocols have been
widely used for file-sharing and distributed computing [29, 23, 26].
Terms we commonly encounter in P2P protocols are described in the
following:
• Peer — a computer participating in a P2P network, usually
identified by its IP address and port number.
• Content — a live transmission, a pre-recorded multimedia asset, or a
file [15].
• Swarm — a group of peers that are distributing the same content.
• Chunk (piece) — a unit of content, usually measured in kilobytes.
• Leecher — a peer that is downloading content, sometimes called a
receiver.
• Seeder — a peer that is serving the content, often called a sender.
• Tracker — a device that keeps track of a group of peers that have
a given content. Peers searching for content commonly contact the
tracker in order to get other peers’ addresses.
2.3 BitTorrent protocol
In recent years, BitTorrent has been widely used to share music, videos, and
other files on the Internet. Using BitTorrent to distribute content consists
of two main phases:
Publishing content
BitTorrent generates a metainfo (.torrent) file which contains a key named
info; the SHA-1 hash of this key's value, known as the infohash, uniquely
identifies the content. The info key contains subkeys describing the chunks
within the content [6].
A file is split into small fixed-size pieces, where the default size is 256 KB.
Every piece is hashed, and the corresponding 20-byte string uniquely
identifies the piece. All such 20-byte strings are concatenated and stored in
the metainfo file under the key pieces. Thus, the length of the pieces value
is always a multiple of 20.
Figure 2.3: Metainfo with tracker Figure 2.4: Metainfo, trackerless
By default, there should be an announce key that contains tracker’s
address inside metainfo (see Fig. 2.3). In a trackerless version, BitTorrent
uses Distributed Hash Tables (DHT) to get peers' contact information [16].
In this case, a key named nodes replaces the announce key inside metainfo,
and the value of the nodes field becomes a list of IP addresses and port
numbers (see Fig. 2.4).
Transferring data
Once a BitTorrent client obtains the metainfo file from a torrent indexing
website, it reads the tracker's contact information from it and then connects to
that tracker to get peers who are currently sharing the requested file. If this
is a trackerless torrent, the client reads the nodes from metainfo and then
contacts such nodes for peers that have the content. In the latter case, each
node works as a tracker.
Figure 2.5: Data request and response
As soon as the client gets peer addresses, it sends them request messages
for pieces of content. A request message contains: an integer index, begin,
and length [16]. The queried peer may return a message whose payload
contains a piece index. Once the client receives a piece, it can check its
correctness against the hash value stored inside the metainfo file. Figure 2.5
illustrates the data transfer process where index represents the piece index
that peer A wants to retrieve, begin is the byte offset within the piece, and
length is the requested length.
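For concreteness, the wire format of this request message, as defined by the BitTorrent protocol, is a 4-byte big-endian length prefix of 13, a one-byte message ID of 6, and the three 4-byte big-endian integers. A minimal C++ sketch of building such a message:

#include <cstdint>
#include <vector>

// Append a 32-bit integer in network (big-endian) byte order.
static void put_be32(std::vector<uint8_t>& buf, uint32_t v) {
    buf.push_back(uint8_t(v >> 24));
    buf.push_back(uint8_t(v >> 16));
    buf.push_back(uint8_t(v >> 8));
    buf.push_back(uint8_t(v));
}

// Build a BitTorrent "request" message: <len=13><id=6><index><begin><length>.
std::vector<uint8_t> build_request(uint32_t index, uint32_t begin, uint32_t length) {
    std::vector<uint8_t> msg;
    put_be32(msg, 13);       // payload length: 1 (id) + 3 * 4 (integers)
    msg.push_back(6);        // message id 6 = request
    put_be32(msg, index);    // piece index
    put_be32(msg, begin);    // byte offset within the piece
    put_be32(msg, length);   // requested block length (commonly 16 KB)
    return msg;
}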
2.3.1 Peer tracking and piece picking
Two peer tracking mechanisms exist in BitTorrent. Commonly, BitTorrent
relies on a tracker to find peers, as trackers maintain a list of peers for a
given infohash. When a new peer queries the tracker for a list of peers for a
certain file, the tracker returns the list of peers but also adds the querying
peer to the overall peer list for that infohash. Finally, whenever a peer wants
additional peers, it has to contact the tracker again.
Distributed Hash Tables (DHT) is an alternative, decentralized mech-
anism for peer discovery in BitTorrent [16]. In DHT, each peer maintains
(is responsible for) contact information for a list of peers. Using DHT, a client
searches for k closest nodes to the given infohash, where the definition of
closeness is specific to the DHT protocol variant and its implementation, as
well as the parameter k. Once it finds the closest nodes, it queries them for the
list of peers associated with the infohash. While nodes serve only as pointers
to the list of peers, the peers possess the actual content.
BitTorrent uses rarest chunk first heuristics to retrieve pieces that
are rare in the swarm, and optimistic unchoking to let random peers
bootstrap themselves into the swarm [7].
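To make the rarest-first heuristic concrete, the following sketch (our illustration, not BitTorrent's actual code) counts how many connected peers advertise each piece and selects a missing piece with the lowest count:

#include <cstddef>
#include <cstdint>
#include <vector>

// Pick the rarest piece we still miss. 'have' marks our own pieces;
// 'peer_bitfields' holds one availability bitfield per connected peer.
// Returns -1 when no missing piece is available anywhere.
int pick_rarest(const std::vector<bool>& have,
                const std::vector<std::vector<bool>>& peer_bitfields) {
    int best = -1;
    uint64_t best_count = UINT64_MAX;
    for (std::size_t i = 0; i < have.size(); ++i) {
        if (have[i]) continue;                 // already downloaded
        uint64_t count = 0;                    // swarm-wide availability
        for (const auto& bf : peer_bitfields)
            if (i < bf.size() && bf[i]) ++count;
        if (count > 0 && count < best_count) { // rarest available so far
            best_count = count;
            best = int(i);
        }
    }
    return best;
}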
2.4 Libswift protocol
Libswift is a recently developed multiparty transport protocol that supports
three use cases: file download, on-demand streaming, and live streaming. Unlike
BitTorrent, libswift can run over UDP or TCP; however, the current implementation
of the protocol is UDP-based [13, 15]. In this section, we will describe some
of the main libswift components.
File transfer
Libswift introduces the notion of a file transfer: an object used to distinguish
between the different transfers active in a client. More specifically, a file
transfer is used to read or save content that is being sent or received by a
libswift client. Every file in libswift has a corresponding file transfer. Though
the protocol specification does not limit the number of concurrent file transfers,
the current implementation enforces a default limit of 8.
Channels
When two peers are connected, a channel is established between these two
peers. A channel is used to transfer the content of a file between connected
peers. One file transfer can employ more than one channel to transfer the
same file, but every channel can only work for one file transfer. Figure 2.6
illustrates the relationship between file transfers and channels.
Figure 2.6: Channels and file transfer
Congestion control
In order to control data transfer rate between the receiver and sender,
libswift uses a third-party congestion control method called Low Extra
Delay Background Transport (LEDBAT) [17]. LEDBAT is designed to
utilize all available bandwidth. As soon as there is a bottleneck in the link,
it decreases the congestion window size based on the detected queuing delay.
In practice, libswift uses LEDBAT to manage its congestion window size;
libswift increases or decreases its data transfer rate according to the change
of congestion window size.
LEDBAT uses the following elements for its decision making:
• TARGET — if the queuing delay is bigger than this value, the sender
should reduce the congestion window size.
• GAIN — determines the ratio by which the current congestion window
size is increased or decreased.
Chunk addressing and chunk integrity check
Like BitTorrent, libswift splits the file into small pieces during data transfer.
The recommended chunk size is 1 kilobyte.
Figure 2.7: Chunk addressing in libswift
Figure 2.7 illustrates the chunk addressing tree of a 4-kilobyte file. Each
leaf of the tree corresponds to a chunk. In turn, each chunk has an index
and a corresponding bin number. For example, bin number 0 stands for the
first 1 kB of the file, bin 2 stands for the second 1 kB, bin 4 stands for the
third 1 kB, and so on.
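This numbering can be captured with simple bit arithmetic: a node at layer L covering chunks [c, c + 2^L) gets bin number 2c + 2^L - 1, so the leaves occupy the even bins and each internal node sits between its children. A small sketch of the arithmetic implied by the figure (our illustration, not libswift's actual code):

#include <cstdint>

// Bin number of a single chunk (layer 0): chunks 0,1,2,... map to bins 0,2,4,...
uint64_t leaf_bin(uint64_t chunk) { return chunk * 2; }

// Layer of a bin = number of trailing one-bits in its binary representation.
int layer_of(uint64_t bin) {
    int layer = 0;
    while (bin & 1) { bin >>= 1; ++layer; }
    return layer;
}

// Sibling and parent in the tree: bins 0 and 2 join under bin 1,
// bins 1 and 5 join under bin 3 (the root of a 4-chunk file).
uint64_t sibling(uint64_t bin) { return bin ^ (uint64_t(1) << (layer_of(bin) + 1)); }
uint64_t parent(uint64_t bin) {
    uint64_t hi = uint64_t(1) << (layer_of(bin) + 1);
    return (bin & ~hi) | (hi - 1);
}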
When a peer starts serving this file to other peers, it calculates a SHA1
hash for every chunk. Then, it recursively calculates the upper layer SHA1
hashes according to the hashes of lower layers. Finally, the top level’s hash
value is the root hash (roothash) of the file. The roothash uniquely identifies
the file.
A peer requests content using the roothash. With the first chunk
received, the peer also gets the necessary hashes to check the integrity of
that chunk. For example, to check the first chunk of the above file, the
sender will send the hashes of bin 2 and bin 5. First, when the receiver gets
the hash of bin 2, it calculates the hash of bin 1 from the hashes of bin 0 and
bin 2. Then, it obtains a hash by combining the hashes of bin 1 and bin 5.
Finally, it compares the hash obtained in this second step with the roothash.
If the two hashes match, the integrity of the first chunk has been verified.
Otherwise, the chunk is dropped.
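The verification walkthrough above translates directly into code. The sketch below assumes OpenSSL's SHA1 for hashing and takes the two uncle hashes needed for chunk 0 of a 4-chunk file; it is an illustration of the procedure, not libswift's implementation:

#include <openssl/sha.h>  // SHA1(); link with -lcrypto
#include <array>
#include <cstring>

using Hash = std::array<unsigned char, SHA_DIGEST_LENGTH>;

// Hash of an internal node = SHA1(left child hash || right child hash).
static Hash join_hashes(const Hash& left, const Hash& right) {
    unsigned char buf[2 * SHA_DIGEST_LENGTH];
    std::memcpy(buf, left.data(), SHA_DIGEST_LENGTH);
    std::memcpy(buf + SHA_DIGEST_LENGTH, right.data(), SHA_DIGEST_LENGTH);
    Hash out;
    SHA1(buf, sizeof(buf), out.data());
    return out;
}

// Verify chunk 0 (bin 0) against the roothash, given its uncle hashes.
bool verify_first_chunk(const unsigned char* chunk, size_t len,
                        const Hash& hash_bin2, const Hash& hash_bin5,
                        const Hash& roothash) {
    Hash hash_bin0;
    SHA1(chunk, len, hash_bin0.data());                  // leaf hash of bin 0
    Hash hash_bin1 = join_hashes(hash_bin0, hash_bin2);  // parent of bins 0 and 2
    Hash hash_bin3 = join_hashes(hash_bin1, hash_bin5);  // root of the 4-chunk file
    return hash_bin3 == roothash;                        // drop the chunk on mismatch
}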
Libswift messages
Libswift uses the following messages:
• HANDSHAKE — when a peer wants to connect to another peer, it
starts by sending a “handshake” message.
• HAVE — a peer uses this message to inform other peers what content
it has available.
• HINT — a peer uses this message to inform the sender what content
it wants.
• DATA — contains pieces of the file; usually, it carries 1 kB of content.
• HASH — this message is sent together with the DATA message. The
payload of this message contains the necessary hashes to verify the
integrity of transferred content.
• ACK — the receiver uses this message to acknowledge that it got the
content and successfully verified it.
• PEX REQ — a peer uses this message to query for new peers from its
communicating peer.
• PEX RSP — the sender uses this message to send back a list of peers to
the peer that requested them.
Peer exchange
Usually, a peer-to-peer network introduces a centralized tracker to keep track
of a group of computer addresses. When a peer wants to discover new peers,
it contacts this centralized tracker and the tracker gives back a subset of
peers. When a new peer joins the swarm, the tracker adds this new peer’s
address to the list of peers associated with certain content.
In addition, libswift supports trackerless peer discovery using the
gossiping algorithm called PEX (peer exchange). There are two types of
PEX messages: PEX REQ and PEX RSP. When peer A wants more
peers, it sends a PEX REQ to a connected peer B with whom it is transferring
data. Peer B may then respond with a PEX RSP message
containing its peers. The peers sent back to peer A are those that were
actively exchanging information with peer B in the last 60 seconds.
2.4.1 Basic operations in libswift
In libswift, the basic operational unit is a message and the basic transfer
unit is a datagram, which may contain one or more messages.
Figure 2.8: A libswift datagram
Figure 2.8 illustrates a datagram which contains a DATA and an
INTEGRITY message. The DATA message carries a piece of content and the
INTEGRITY message carries the hashes to verify integrity of the content.
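Because a datagram is simply a sequence of messages, a receiver can process it with one loop that reads a message type and then its payload. The sketch below is schematic: the type tags and payload sizes are illustrative, not libswift's exact wire encoding:

#include <cstddef>
#include <cstdint>

// Illustrative message tags; the real protocol defines its own values.
enum MsgType : uint8_t { HANDSHAKE, DATA, HAVE, HINT, HASH_MSG, ACK };

// Illustrative payload sizes per message type (bin numbers are 4 bytes here).
static std::size_t payload_size(MsgType t) {
    switch (t) {
        case HANDSHAKE: return 4;                  // channel id
        case HAVE: case HINT: case ACK: return 4;  // bin number
        case HASH_MSG: return 4 + 20;              // bin number + SHA1 hash
        case DATA: return 4 + 1024;                // bin number + 1 kB chunk
        default: return 0;
    }
}

// Walk a datagram: each message is a one-byte type followed by its payload.
void process_datagram(const uint8_t* buf, std::size_t len) {
    std::size_t off = 0;
    while (off < len) {
        MsgType type = static_cast<MsgType>(buf[off++]);
        // ... dispatch buf + off to the handler for 'type' here ...
        off += payload_size(type);
    }
}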
Joining a swarm
When a peer wants to download some specific content, it begins by
requesting a list of peers from the tracker. It knows that these peers reside
in the same swarm, where peers are exchanging the same file. Then, the peer
connects to one of these peers by sending a HANDSHAKE message;
the other side will send back a HANDSHAKE message, indicating that the
connection has been established. In the same datagram — returned by the
connected peer — there may be some HAVE messages to show the current
content that is available at its side. Details of this process are depicted in
Figure 2.9.

Figure 2.9: The process of connection
Exchanging chunks
Figure 2.10: The process of exchanging data
After the connection has been established, the leecher (requester) starts
to request data from the connected peers. The leecher sends HINT
(REQUEST) messages to its peers whereas the peers respond back with
DATA, HAVE and INTEGRITY messages. Next, the leecher will check
and save the received chunks and send acknowledgment messages back. In
the same datagram a new HINT message may be included. Figure 2.10
illustrates the data exchange process.
Leaving a swarm
Once a peer has downloaded all desired content, it can leave the swarm
by explicitly sending a leave message or by simply no longer responding to peers.
2.5 Related work
Many efforts have been made to improve the performance of P2P ap-
plications, and they can be mainly categorized into three areas: third-
party assisted approaches, end-user based approaches, and approaches using
congestion control. The first two categories focus on optimal peer selection
mechanisms, while the third category focuses on optimizing data transfer.
Third-party assisted approaches
Aggarwal et al [4] proposed a mechanism to help users find peers with good
performance. The authors propose an oracle service, provided by the ISP, to
P2P users. A P2P user provides a list of peers to the oracle, and the oracle
ranks the peers according to different criteria, including the peer's distance
to the edge of the AS, the bandwidth between users, and others. Given a sorted list of
peers, the user can then contact peers with better ranking. As the ISP has
more information about the network's topology, the calculated rank of peers can
be more accurate. Thus, a P2P user may benefit from this method. On the
other hand, ISPs may also benefit by ranking peers located in their own
network higher, thus keeping Internet traffic internal.
This solution seems like a win-win situation, where both P2P users and ISPs
benefit. However, some questions remain. First, in a P2P network, peers
frequently join and leave a swarm. This means that, in order to get the
latest peer rankings, a user must contact the ISP's oracle continuously. As a
result, new overhead is introduced at the user side. Secondly, given that
this approach requires the ISPs to deploy servers for the oracle service, it is
not clear whether ISPs are willing to incur additional costs by adding such
infrastructure. Finally, this proposal heavily relies on the ISP’s participation
and honesty.
Eugene et al [11] proposed a method to predict network distance between
two computers. The fundamental component of this method is Global
Network Positioning (GNP) which computes the coordinates of a computer
by seeing the whole network as a geometric space. First, it sets some
computers as landmarks and initializes these computers' coordinates in the
geometric space. Then, a computer calculates its coordinates based on the
predefined landmarks. Given that the authors select a spherical surface
as their modeling geometric space, if a computer knows the coordinate of
its peer, it can compute its distance to the peer. The authors reported that
the correlation between the calculated GNP distance and the measured distance is
0.915.
Predicting the distance of two P2P peers helps a P2P user choose a
nearby peer to download data from and thus potentially reduce data delay.
However, to set up landmarks — whose coordinates affect the accuracy of
the whole GNP system — is a challenge for a P2P network. The reason
is that, in a P2P network, it is difficult to find peers that are always
online to act as landmarks. Moreover, the P2P protocol in use must
be modified or extended so that peers can discover such landmarks.
Though we don’t integrate this approach in this master thesis, the proposal
motivates our choices and informs us that distance prediction can help in
choosing nearby peers.
Another solution given by Francis et al [12] shows a different way of
predicting distance between two computers. They built a system called
IDMaps, which lets computers check their mutual distances, along with
HOPS servers that serve this information. Given a computer A that wants
to know its distance to B, IDMaps calculates the distance from
A to its nearest Tracer C, and the distance from B to its nearest Tracer D.
Then, adding the distance between C and D, the sum of A-C, B-D, and
C-D approximates the distance between A and B. Any computer can query its HOPS
server to get its distance to others. Results of their research show that the
more Tracers there are in the system, the more accurate the calculated distance is.
Both GNP and IDMaps services are focused on the distance prediction of
network users. Other studies and projects try to provide applications with
more information, such as bandwidth and AS-hops. Haiyong Xie et
al [31] suggested a P4P service for both ISPs and P2P users. In P4P, a
network provider runs an iTracker which maintains network information
such as network backbone information and preferred inter-domain links of
the provider. A tracker or peer can contact the iTracker to obtain the
information and select its peers based on it. The authors show that
if ISPs and P2P users know more information about each other, the traffic
caused by P2P applications can be reduced. According to their results, the
bottleneck link traffic was reduced by up to 69%.
The IETF ALTO Working Group [28, 27] and the NAPA-WINE
project [5, 25] are trying to build more powerful services to provide more
network layer information to P2P applications. ALTO aims to provide
information such as bandwidth, cross-domain traffic and cost to the user.
To date, the ALTO working group is still working on this project [10]. NAPA-WINE
does similar things to ALTO; moreover, it allows P2P applications
to query their own network status, including link congestion and preferred
routing. The NAPA-WINE project claims that traffic on long-haul links can
be reduced by 50%.
End-user based approaches
The above research efforts show that third-party entities have the ability
to make P2P applications more friendly to ISPs and achieve better
performance. From the P2P application point of view, the benefits are
not always obvious.
Cramer et al introduced a method to reduce the bootstrapping delay of
P2P applications [8]. Every peer hashes the k-bit prefix of its IP address and
stores the hash value under a key in the DHT. When a peer joins the
DHT, it looks up the hash values of its own IP prefix. If another peer's prefix
hash matches, it is likely that the two peers are in the same location in
terms of network topology; the longer the prefix match, the higher the
locality accuracy. If enough peers match a reasonably long prefix, the peer
has found enough peers. Otherwise, it looks for peers matching a shorter prefix.
This approach provides a quick way to filter peers who most likely are
not within the same location. Especially, in a very big DHT, peers can use
this method as the first filter to select potential peers. And based on the
results of prefix matching, a peer can further filter peers.
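The core of the scheme is easy to sketch (our illustration; the paper's exact key construction may differ): mask the address down to its k-bit prefix and hash that, so peers behind the same prefix publish and look up under the same DHT key:

#include <cstdint>
#include <functional>

// DHT key for the first k bits of an IPv4 address (0 <= k <= 32).
// Peers sharing the k-bit prefix publish under the same key, so a joining
// peer can fetch likely-nearby peers with a single DHT lookup, and fall
// back to a smaller k when too few peers match.
uint64_t prefix_key(uint32_t ipv4, int k) {
    uint32_t mask = (k == 0) ? 0u : ~uint32_t(0) << (32 - k);
    uint32_t prefix = ipv4 & mask;
    return std::hash<uint32_t>{}(prefix) ^ uint64_t(k); // mix k in so layers differ
}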
Lucia D’Acunto et al [9] focus on improving the QoS of the P2P system.
In their proposal, the authors suggest that a peer which already has good QoS
should increase the bandwidth it allocates to random peers. Here, good QoS
means that the peer's sequential download progress is faster than the video playback
rate; in this case, the peer should increase its number of optimistic unchoke slots.
Then, by giving higher priority to newly arriving peers, the bootstrap
delay of these peers is reduced. Their simulation results show that
the average bootstrap delay is reduced by 48%; at the same time, there is no
negative impact on other peers’ performance.
Congestion Control
Most P2P clients are implemented in the application layer and use TCP
as their transport carrier. Applications such as PPLive, PPstream, Xunlei,
and BitTorrent open many concurrent connections to other peers. Data
shows that PPLive can open up to 256 connections. In this case, when a
P2P client is running, other applications and users suffer from the resulting
bottleneck. For
example, imagine a use case where a user has 10 MB of Internet bandwidth
and a normal non-P2P application A that needs 1 MB of bandwidth to run.
If the user also runs a P2P client that sets up 10 connections to its peers,
fair sharing leaves each of the 11 connections only 10/11 MB, roughly 0.9 MB.
In this case, application A lacks enough bandwidth and cannot run correctly.
Yaning Liu et al [21] presented a way to detect congestion and a way
to raise or decrease the number of TCP connections. Their friendlyP2P
solution calculates the throughput of every P2P connection. By comparing
the current throughput with the previously recorded throughput, it detects
how each connection's rate changes. If the throughput of half of the connections
has decreased, congestion is assumed and the P2P client can reduce
the number of connections.
TCP is designed to share a bottleneck fairly among all connections [22];
thus, a P2P application should consider fairness toward other applications
in order to be more friendly to the network.
Our decisions
Most of the above approaches show us that involving third-party entities
requires upfront costs in order to support such peer recommendation services
for P2P applications. On the other hand, most studies also highlight the
importance of locality and keeping traffic within local ISPs.
From a P2P application’s point of view, using cheap and simple strategies
to select the right peers to communicate with influences its overall performance.
When it comes to libswift, we consider that it should be able to select peers
with the following properties: low latency (RTT), geographic closeness, or
location in nearby ASes.
We mentioned that libswift prefers to run on top of UDP, and UDP
requires congestion control to be implemented in the application layer. Thus,
libswift has more potential to be friendly to non-P2P applications
than other P2P applications running over TCP. In this thesis, we will
investigate libswift’s congestion control to see whether it is more friendly to
other non-P2P applications.
2.6 Experimental Evaluation
This section describes the procedure to analyze libswift and set up the
experimental environment. Later, we explain the tests we devised to test
libswift extensions and the software we used.
Procedure and testbed
In order to analyze and find vulnerabilities in the libswift protocol, we
investigated the peer exchange procedure, the piece picking algorithm, and congestion
control. The first step is to study libswift's source code [14] and verify whether the
implementation functions according to the protocol specification. This study
mainly focuses on: how peers exchange addresses with each other, how these
peers are used or managed, how libswift chooses which chunks of content
will be downloaded next, and what kind of congestion control is used.
We want to do our test in an environment where libswift can run
close to a real Internet setting. These tests require computers located in
different countries to run our code since, in the real world, P2P users are
located in different regions and we want our experimental setting to work
in a similar way. At the same time, our tests may cause lots of traffic
and some unknown results, thus we cannot interfere with the real Internet
infrastructure. Therefore, we have chosen PlanetLab as our platform to do
the tests. It currently consists of 1129 nodes [3] and these computers are
located in different countries.
In particular, we investigate the following issues more thoroughly:
• Peer exchange behavior — we construct a small swarm to study
how a peer queries for new peers from the peers it is already connected
with. In particular, we want to investigate how PEX messages are
handled by libswift. Also, we want to analyze the download behavior
— which peer contributed more content to the local peer.
• Peer selection method — we construct a small swarm with
peers that have different properties — RTT, location, and AS number.
Specifically, we selected sixteen computers that are located in different
countries and are diverse enough to be good candidates for the local
peer.
• Congestion control — LEDBAT congestion window size is highly
correlated with RTT. In this thesis, we observe how the size of
congestion window (CWND) increases and under what circumstances
CWND size decreases. We also analyze how an initial CWND size
affects data transfer rate at the beginning of the connection. Finally, in
order to know whether a bigger CWND size can improve performance,
we changed CWND size manually.
2.6.1 Extensions for libswift
After we investigated libswift's peer exchange mechanism and peer selection
behavior, we found that libswift does not employ any specific peer selection
policy. Moreover, libswift does not support the PEX REQ message, which
is necessary when using the PEX protocol to exchange peers. Therefore, we
modified libswift's source code to add support for the PEX REQ message, and
we decided to design an extension which we call peer selector adapter for
libswift to help the protocol manage and select its peers. To test our new
adapter (interchangeably, middleware or interface) we integrated it with a
peer selection mechanism — PeerSelector — which was developed by another
master thesis project [18].
Testing extensions and modifications
We configured two scenarios to verify the function of PEX REQ. First,
a newly joined peer interacts with other peers without sending a PEX REQ
message. Second, a newly joined peer interacts with other peers and sends a
PEX REQ message. The local peer should not receive peers in the first
case, even if it is in a swarm with more than one peer. However, it should
receive peers in the second case.
Since the functions of peer selector adapter can be verified together with
PeerSelector, we did not run any independent test case for the adapter.
PeerSelector employs four types of peer selection policies: geo-location,
RTT, AS-hops, and random. We tested each policy at least twice to exclude
Internet random variables and we run all the tests in the same swarm to
maintain the consistency of results. For each policy, we want to study if the
selection of peers based on a given property or metric influences libswift’s
download performance.
We then compare the results of the first three policies with the random
policy. Aside from evaluating how libswift’s download speed changes under
different policies, we also verified which peers were selected by PeerSelector
and whether such peers worked as expected.
2.6.2 Software resources and tools
We use MaxMind’s GeoIP [2] library to get the peers’ geographic location.
GeoIP is a free library that provides an API for looking up the city, country,
and continent of an IP address.
An MP4 file of approximately 40 megabytes was used as the target
content in our tests. This is a 14-minute video file; a typical music video is
about half this length. The download time of this video is long enough for
us to observe libswift's behavior.
To analyze download speed and draw graphs, we use gnuplot [1], a free
command-line-driven graphing utility.
Chapter 3
Overview and Analysis
In this chapter, we provide an overview of libswift and discuss in greater
detail the internal functioning of the protocol. Mainly, we will focus our
discussion on the three issues outlined in the previous chapter: peer discovery,
peer exchange and selection, and congestion control.
3.1 Peer discovery basics
When retrieving content, there are at least two participants. One is the peer
trying to obtain the content, which initiates the whole process; the
other peer responds with data. We call the first peer a leecher and
the latter a seeder, which in our setting also acts as a tracker.
According to protocol specification, when a peer A wants new peer
addresses, it should follow the procedure illustrated in Figure 3.1. In step 1,
A explicitly sends a PEX REQ message to its connected peer B. In step 2,
peer B may respond with a PEX RSP message, which includes a peer C.
In step 3, peer A initiates a connection to C by sending a HANDSHAKE
message. In case peer A wants more peers, it sends PEX REQ messages to
both connected peers — B and C, where peer C may reply with a message
containing a peer D. At this point, peer A repeats step 3.
Figure 3.1: The flow of PEX messages
Leecher’s angle
Figure 3.2 illustrates the internal state of a leecher. When libswift runs as
a leecher, it starts by opening a new channel. First, it initializes itself and
creates a datagram containing a HANDSHAKE message, a channel ID, and
the roothash of the desired content. Then, it jumps into running state and
sends the datagram to the destination peer address. Finally, it waits for a
response from the destination peer. The length of waiting time is determined
by congestion control. After this waiting time is decided, the leecher jumps
to the waiting state.
Figure 3.2: State machine of the leecher
The destination peer may respond back with a datagram which contains
a HANDSHAKE and a HAVE message. While the first message informs
the leecher that the connection has been set up, the latter message informs
the leecher which content the destination has available.
Next, the leecher sends a HINT message to the destination, specifying
the content it wants. Again, it has to wait for the response, so the leecher
jumps back to the waiting state. Whenever it receives a datagram, it comes
back to its running state. If the received datagram contains a DATA
message, the leecher responds to the destination with an ACK message.
If, in the meantime, the leecher is transferring data with other peers, it will
update them about its currently available content by sending them HAVE
messages.
Seeder’s angle
When libswift runs as a seeder, it means that this peer possesses content
it can send to other peers. In this case, libswift initializes itself and starts
listening on a certain port number. Then, it goes to waiting state. When
it receives a new datagram with a HANDSHAKE message, the seeder
will respond back with a HANDSHAKE message and its channel ID. In
the same datagram, it may include HAVE messages to inform the leecher
of its currently available content. Now that the connection between
the seeder and leecher is established, the seeder goes to the waiting state
once more. This process is illustrated in Figure 3.3.
Figure 3.3: State machine of the seeder
3.2 Peer exchange and selection
To understand peer exchange behavior in libswift, we configured a small
swarm with fourteen peers. In addition, we set up an initial seeder A who,
at the same time, acts as a tracker. In this setting, our local peer will start
downloading data from seeder A.
To understand the inner process of peer exchange, we will use logs
generated by libswift. Keywords such as +hash, +hs, +hint, -have, -pex
correspond to libswift messages such as HASH, HANDSHAKE, HINT ,
HAV E and PEX ADD.
PEX messages
Figure 3.4 depicts the messages exchanged between the leecher and seeder
A. In line 3, the leecher sends the first datagram to seeder A. The datagram
includes two messages: +hash — informing seeder A that the leecher wants
to download the indicated content (the string carried by this message is the
roothash of the file) — and +hs — a HANDSHAKE message. From lines 6 to
15, the leecher receives HANDSHAKE and HAVE messages from seeder A.
On lines 16 and 17, the leecher sends a HINT message to seeder A. From line
18 onwards, the leecher receives a new datagram from seeder A, containing
PEX ADD (PEX RSP) messages that inform the leecher about peers
seeder A has.
Finally, we can see from Figure 3.4 that the leecher does not send
PEX REQ messages to the seeder, yet it still receives peers from the
seeder. This behavior goes against the protocol specification, which states
that "A peer that wants to retrieve some peer addresses MUST send a
PEX REQ message".
Peer selection
As we’ve seen from the logs in the previous section, the leecher gets a subset
of peers from seeder A. Figure 3.5 further shows how the leecher tries to
establish connections with the received peers, following the same sequence
of peers as it received them from seeder A. This implies that the current
implementation of libswift does not employ any peer selection strategy.

Figure 3.4: Message flow I   Figure 3.5: Message flow II
Piece picking algorithm
As previously explained, libswift downloads data by sending a HINT
message to the seeder. The seeder then sends DATA messages back to the
leecher, according to the HINT message which specifies the bin number.
The bin number represents the chunk that leecher wants to download, and
it is commonly determined (or picked) by the piece picking algorithm.
This means that, if we record the chunk index of a DATA message when
the leecher receives one, we can understand how libswift picks next pieces.
Figure 3.6 reveals that the current piece picking is linear, as most of the
chunks are downloaded sequentially. Roughly, we can observe that the first
received chunk has index 1, the second received chunk has index 2, and so
on; the last received chunk has the largest chunk index.
Figure 3.6: Piece picking algorithm

When data transfer is sequential, two or more concurrent channels end up
competing with each other, which is often the reason data is dropped.
Appendix A shows that the leecher had already received chunks 4256, 4257,
4258, and so on from peer 193.xxx.xxx.250, but peer 192.xxx.xxx.11 still sent
these chunks to the leecher. As a result, the leecher dropped the data coming
from 192.xxx.xxx.11. If libswift were to use a random piece picking algorithm
to download pieces from these two peers, the chances of chunks being dropped
would be lower.
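A randomized picker of the kind suggested here can be as simple as shuffling the indices of the missing chunks before requesting them (a sketch of the suggestion, not code from libswift):

#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Build a download order for one channel: the missing chunks in random
// order, so two channels serving the same leecher rarely race for the
// same chunk and fewer duplicate DATA messages get dropped.
std::vector<std::size_t> random_order(const std::vector<bool>& have) {
    std::vector<std::size_t> missing;
    for (std::size_t i = 0; i < have.size(); ++i)
        if (!have[i]) missing.push_back(i);
    static std::mt19937 rng{std::random_device{}()};
    std::shuffle(missing.begin(), missing.end(), rng);
    return missing;
}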
3.3 Analysis of multiple-channel download
Properties of peers can affect libswift’s download performance. In this
section, we want to figure out how a peer property such as RTT affects
download speed. To study this issue, we set up three different groups of test
cases (see Fig. 3.8). The leecher is connected to two seeders, as shown in
Figure 3.7.
As shown in Figure 3.8, in every group, there are three peers. One peer
works as a leecher who downloads content from the other two seeders. The
first seeder also acts as the initial seeder and tracker, which means that the
leecher learns from this seeder about the second seeder's address.

Figure 3.7: Three peers in the same swarm
Figure 3.8: The setting for three groups
In the first group, the two seeders run on the same virtual machine,
so they have the same RTT. In the second group, each seeder runs on a
different computer in PlanetLab; however, they are located in the same AS
and run in the same laboratory, thus they have very similar RTT values.
In the last group, we run one seeder on the local virtual machine and the
other on PlanetLab. In this group, the two seeders are very different:
the first seeder has a very small RTT compared to the RTT of the second
one. Finally, all tests for each group were run at least four times.
Observed behaviors
In the first group, two seeders have the same properties except their
port number. Results show that each of the two seeders provides half of
the content; moreover, the download speed from the two seeders is almost equal.
Figures 3.9 - 3.12 show four trials carried out for group one.
Figure 3.9: Group 1, trial 1   Figure 3.10: Group 1, trial 2
Figure 3.11: Group 1, trial 3   Figure 3.12: Group 1, trial 4

Figures 3.13 - 3.16 show the test results for the second group. From
the first two trials we can see that the content provided by the two seeders is
disproportionate. Seeder one provides more content than seeder two and
the ratio is about 7 to 3. The last two trials show that the ratio becomes
more even, approximately 55 to 45. We should emphasize that the RTT of
the first seeder is smaller than that of the second seeder.
The last group shows very different results from the previous two.
All test cases show that the seeder running on the local virtual machine
provides much more content. Specifically, more than 90% of the content is
downloaded from seeder one, as shown in Figures 3.17 - 3.20.
Analysis
Let’s say that the leecher, seeder one, and seeder two are A, B, and C
respectively. A has two channels to B and C.

Figure 3.13: Group 2, trial 1   Figure 3.14: Group 2, trial 2
Figure 3.15: Group 2, trial 3   Figure 3.16: Group 2, trial 4

A seeder typically has to handle HAVE messages and send DATA messages. This process is further
illustrated in Figure 3.21 and detailed in the following:
• Step 1: record that the leecher already has chunk N-1, to make sure
this chunk is not sent again
• Step 2: wait before sending the next datagram
• Step 3: create a new datagram containing DATA and INTEGRITY
messages
Assume that A just got a chunk of content from B or C and this chunk
of content is N-1. A then notifies both B and C with a HAVE message
(at time t0), confirming that A already got chunk N-1.

Figure 3.17: Group 3, trial 1   Figure 3.18: Group 3, trial 2
Figure 3.19: Group 3, trial 3   Figure 3.20: Group 3, trial 4

B and C receive
this confirmation after half the RTT time, and they now are aware that A
needs its next chunk N. The time B and C spend on processing is γB and
γC respectively. Both B and C wait for τB and τC before sending the next
datagram; this timer is determined by congestion control. Once this timer
expires, B and C start preparing their datagrams with the next chunk during
time δB and δC respectively, and then send it to A.
Figure 3.21: Seeder's HAVE and sent DATA messages

After half the RTT time, A receives data from B and C. The data from
B arrives at A at

$t_0 + \frac{RTT_B}{2} + \gamma_B + \tau_B + \delta_B + \frac{RTT_B}{2}$

and the data from C arrives at A at

$t_0 + \frac{RTT_C}{2} + \gamma_C + \tau_C + \delta_C + \frac{RTT_C}{2}.$

If

$\left(t_0 + \frac{RTT_B}{2} + \gamma_B + \tau_B + \delta_B + \frac{RTT_B}{2}\right) < \left(t_0 + \frac{RTT_C}{2} + \gamma_C + \tau_C + \delta_C + \frac{RTT_C}{2}\right)$
then A will accept data from B, otherwise, A will accept data from
C. Because the RTT is always much larger than the time spent preparing
datagrams or waiting, RTT plays a more important role in our tests.
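For illustration (with invented numbers): if $RTT_B = 20$ ms and $RTT_C = 200$ ms, while processing, waiting, and datagram preparation cost about 1 ms each on both seeders, then B's chunk arrives roughly $20 + 3 = 23$ ms after the HAVE message, against $200 + 3 = 203$ ms for C. B therefore wins virtually every round, which is consistent with the behavior we observed in group three.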
In the first group, the RTT values are equal, thus the two seeders provide the
same amount of content. In the second group, the RTT of the first seeder
is slightly smaller than that of the second seeder. Consequently, the first
seeder contributes more content. Although the ratio becomes smaller in
the third and fourth trials, we still observe that the first seeder contributed
more content. In the last group, the first seeder has a much smaller
RTT, so the data coming from it always arrives at the leecher first, compared to
the data coming from the second seeder. For example, if A receives chunk
N from B, it sends HAVE messages to B and C, saying "I got the chunk
with bin number N". Then B waits before sending the next chunk. If C has
not yet sent chunk N to A, C cancels sending it; however, if C receives the
HAVE notification only after sending the chunk, A still receives the chunk
and drops it. Therefore, we can say that peers with smaller RTT contribute
more content and achieve a
faster data transfer rate.
3.4 Congestion control
Libswift uses LEDBAT at the sender side to control data transfer rate,
whereas the receiver side does not use any congestion control. According to
the LEDBAT specification, a conservative implementation may skip slow-start
by setting an initial window size for LEDBAT. The purpose of slow-start is
to make a connection have a larger CWND (congestion window) as soon as
possible. In particular, there are two phases: slow-start, during which CWND
increases exponentially, and a second phase, in which congestion control
switches to LEDBAT once CWND has increased beyond a predefined threshold.
According to the libswift protocol specification, libswift employs slow-start.
However, during our tests we found that as soon as the seeder gets the
first ACK message from the leecher, libswift changes from slow-start
to LEDBAT, which is too early. Because slow-start increases CWND much
faster than LEDBAT, transitioning this early leaves libswift with a small
CWND. Often, this causes the data transfer rate to be very low at the
beginning of a new connection.
Figure 3.22 shows how CWND size changes with received ACK mes-
sages. The red curve corresponds to libswift's initial window size of 1, whereas
the green curve illustrates the window size when manually set to 70. A small
decrease in window size means the current RTT increased; a very large
decrease is caused by the timeout of ACK messages. The red line
shows that libswift’s CWND size increased slowly. If the initial window size
is set to a reasonable value, peers can get the first chunks faster. The green
line, representing a bigger window size, shows that our manually enlarged
window exhibits a better data transfer rate.

Figure 3.22: Congestion window size
Figure 3.23: Overall view Figure 3.24: Detailed view
Figure 3.24 shows that if we set a bigger CWND size, we get a faster
rate at the beginning. Although the overall download performance has not
improved, as can be seen in Figure 3.23, the time spent downloading the
first few hundred chunks has decreased, as shown in Figure 3.24. It should
be noted that the CWND of 200 in these illustrations is also a test value,
showing that if we have a larger CWND, download speed can be improved.
The LEDBAT specification suggests that TARGET should be 100 milliseconds [17];
however, libswift uses a value of 25 milliseconds. If libswift were to
run on a mobile phone, the current TARGET value is too big. In addition,
LEDBAT recommends a GAIN value of 1, whereas libswift implements a
much smaller value, 1/25000.
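For reference, the controller that these two constants feed can be sketched as follows; this is a simplification of the per-ACK update in the LEDBAT specification [17], with variable names of our choosing:

// One congestion-window update per received ACK, following the shape of
// the LEDBAT rule: grow while queuing delay is below TARGET, shrink above.
struct Ledbat {
    double target_ms = 100.0;    // the RFC suggests 100 ms; libswift uses 25 ms
    double gain = 1.0;           // the RFC recommends 1; libswift uses 1/25000
    double cwnd = 1.0;           // congestion window, in MSS units
    double base_delay_ms = 1e9;  // smallest one-way delay observed so far

    void on_ack(double one_way_delay_ms, double bytes_acked, double mss) {
        if (one_way_delay_ms < base_delay_ms) base_delay_ms = one_way_delay_ms;
        double queuing = one_way_delay_ms - base_delay_ms;      // estimated queue delay
        double off_target = (target_ms - queuing) / target_ms;  // positive: room to grow
        cwnd += gain * off_target * (bytes_acked / mss) / cwnd;
        if (cwnd < 1.0) cwnd = 1.0;  // keep at least one segment in flight
    }
};

With libswift's small GAIN, the growth term is scaled down by a factor of 25000, which is consistent with the slow window growth seen in Figure 3.22.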
To summarize, congestion control in libswift needs further investigation
and experimentation to find suitable values for the initial window size,
TARGET, and GAIN.
Part II
Design and Implementation
Chapter 4
Protocol Extensions
As previously explained, the current libswift implementation does not support
the PEX REQ message. In this situation, the leecher will always receive peers
from seeders and there is no way to stop seeders from sending peers to the
leecher. Thus, we will complement the existing implementation to fulfill
protocol specification — ”A peer that wants to retrieve some peer addresses
MUST send a PEX REQ message”. Now, if the leecher does not need
additional peers, seeders will stop sending peers back.
Secondly, we know that some peers in a swarm perform better than others,
but libswift has no way of determining which peer is better to use. To
manage peers better, we extend libswift with an interface module that
enables a more efficient peer rotation mechanism inside libswift; we
simply call it an adapter.

The rest of this chapter is divided into two sections, explaining the PEX
modifications and the adapter extension.
4.1 PEX REQ design
We do not need a new component to support this functionality. Instead, we
employ a true-false switch inside libswift. When a peer (sender) sends a
datagram to its connected peers, it checks the switch: if the value is
true, it sends a PEX_REQ; otherwise, it does not. The switch is controlled
by the adapter and defaults to true, but it can also be controlled by an
external peer selection mechanism. Whenever a peer (receiver) receives a
PEX_REQ, it sets the true-false state to true in the corresponding channel.
Before sending the next datagram, it checks this state: if true, it adds a
PEX_RSP message containing its current peers to the datagram and then
switches the state back to false. Figure 4.1 shows the process of adding a
PEX_REQ message; Figure 4.2 shows the process of responding to one.
Figure 4.1: Sender side Figure 4.2: Receiver side
Two functions were added to instruct the channel to send and receive
PEX_REQ:
• AddPexReq — checks the value of the switch; if it is true, libswift adds
a PEX_REQ message to the datagram that will be sent to the receiver. If
the switch's value is false, the datagram is sent without a PEX_REQ
message
• OnPexReq — when a PEX_REQ message is received, libswift uses this
function to set the channel's flag to true, indicating that the sender
wants peers. The flag is changed back to false when the current peers are
added to a datagram. Libswift checks this flag to decide whether it should
send its peers to the receiver
4.2 Design of the adapter
We designed and developed an interface module which acts as a middleware
between the existing libswift core and an external peer selection mechanism
(see Fig. 4.3). Therefore, we call it a peer selector adapter or shortly adapter.
Figure 4.3: Design overview
Libswift core
This module is responsible for interacting with other peers: it transfers
and verifies content, constructs datagrams, and sends and receives them.

When libswift discovers new peers, it relays them to the adapter.
Similarly, when libswift wants to open channels for communication with new
peers, it calls the adapter.
Adapter
The adapter sits between libswift and the peer selection mechanism, as
shown in Figure 4.4. The adapter is tightly coupled with libswift's
implementation and provides the interface to third-party peer selection
mechanisms. As soon as an external peer selection mechanism implements the
adapter's interface, the three components can work together.

The advantage of this design is that libswift is decoupled from external
mechanisms, enabling us to plug in (integrate) different mechanisms for
peer selection. When no external mechanism is attached, the adapter saves
peers' addresses to a queue; when libswift requests peers, the adapter
returns peers from the queue, without ranking them.
Figure 4.4: The interaction between components
Peer selection mechanism
The peer selection mechanism we tested with libswift is an externally
developed module called PeerSelector [18]. Once the adapter relays peers
to PeerSelector, the latter saves them to a database and then ranks them
according to the following peer ranking policies:
• RTT policy — the peer with the smallest RTT value is placed first, the
peer with the second smallest RTT second, and so on. The RTT value is
determined by probing (pinging) the given peer.
• AS-hop count policy — ranks peers by calculating their AS-hop
count. For every peer it gets from the adapter, it calculates its AS-
hop count to the local peer using public BGP records.
• Geo-location policy — ranks peers by calculating their geographical
distance to the local peer. Geographical closeness is determined using
a third-party GeoIP database and API [2].
• Random policy — peers are returned randomly, no ranking applied.
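As a concrete illustration of the ranking step, the RTT policy amounts to
a simple sort over the stored peer records. The PeerInfo type below is
hypothetical; PeerSelector's real implementation also involves probing and
a database:

#include <algorithm>
#include <string>
#include <vector>

// Hypothetical peer record; PeerSelector keeps such data in MySQL.
struct PeerInfo {
    std::string address;
    double rtt_ms;    // measured by pinging the peer
    int as_hops;      // derived from public BGP records
    double geo_score; // geographic distance score
};

// RTT policy: the peer with the smallest RTT comes first.
void rankByRtt(std::vector<PeerInfo>& peers) {
    std::sort(peers.begin(), peers.end(),
              [](const PeerInfo& a, const PeerInfo& b) {
                  return a.rtt_ms < b.rtt_ms;
              });
}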
PeerSelector provides three functions to us: addpeer, deletepeer, and
getpeer (see Source Code 5.6).
4.2.1 Interaction between components
Adapter and libswift
Figure 4.5: Adapter and libswift core
Libswift interacts with the adapter in several cases. First, it asks the
adapter whether more peers are needed. It also calls the adapter when it
discovers new peers and needs to store them locally, and when it needs
certain, ranked or not, peers. Figure 4.5 illustrates this process.
IsNeedPeer is called to check whether libswift needs to discover new
peers by issuing PEX_REQ messages. If new peers are discovered (through
PEX_RSP), libswift calls AddPeer to store them. Finally, when libswift
establishes a new channel, it calls GetBestPeer to obtain the "best" peer
and connect to it.
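The call sequence of Figure 4.5 can be demonstrated with a minimal
stand-in for the adapter, reduced to the three calls above. The names
follow the thesis, but the bodies are simplified for illustration (a
plain FIFO, no ranking):

#include <deque>
#include <iostream>
#include <string>

class MiniSelector {
public:
    bool IsNeedPeer() const { return need_peer; } // should we send PEX_REQ?
    void AddPeer(const std::string& addr) { inpeers.push_back(addr); }
    std::string GetBestPeer() {                   // here: FIFO, no ranking
        std::string p = inpeers.front();
        inpeers.pop_front();
        return p;
    }
private:
    bool need_peer = true;
    std::deque<std::string> inpeers;
};

int main() {
    MiniSelector sel;
    if (sel.IsNeedPeer())                   // 1. decide to issue a PEX_REQ
        sel.AddPeer("192.0.2.1:8000");      // 2. store a peer from PEX_RSP
    std::cout << sel.GetBestPeer() << "\n"; // 3. pick a peer for a new channel
    return 0;
}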
Adapter and PeerSelector
Figure 4.6: Adapter and PeerSelector
The adapter relays discovered peers to PeerSelector; Figure 4.6 illustrates
their interaction. When libswift sets up a new channel, the adapter calls
PeerSelector to get the "best" ranked peer. In the other direction,
PeerSelector uses handles such as NeedPeer to indicate whether it wants
new peers.
Designed APIs
Two types of API were defined to facilitate the interaction between
components.
Figure 4.7: API used by PeerSelector
Figure 4.8: API used by libswift
Chapter 5
Implementation
Implementation work on this master thesis consists of modifications to add
support for PEX_REQ, implementation of the adapter module to support a
"peer rotation" mechanism, and integration of PeerSelector with the
libswift source code.
5.1 Libswift modifications
Three files were modified: swift.h, transfer.cpp and sendrecv.cpp. The
PEX_REQ message was added to swift.h, and new member variables were added
to the FileTransfer class in transfer.cpp. Two functions, AddPexReq and
OnPexReq, were also added to support the PEX functionality:
typedef enum {
    ...
    SWIFT_PEX_REQ = 11  // make swift support PEX_REQ
} messageid_t;

class FileTransfer {
    ...
    // Members added to the FileTransfer private section.
private:
    Selector* peerSelector; // peer selector adapter pointer
    FILE*     pFile;        // file pointer to save log information
    int       maxpeernum;   // how many peers this swift instance wants to use
    int       role;         // identifies the swift instance: 0 is seeder, 1 is leecher
    int       sort;         // sort policy used by swift: 0 score, 1 rtt, 2 as-hops, 3 random
    int64_t   starttp;      // timestamp of the first request sent by swift
    ...
};

class Channel {
    ...
public:
    ...
    void AddPexReq(Datagram& dgram); // add a PEX_REQ to the datagram
    void OnPexReq(Datagram& dgram);  // received a PEX_REQ
protected:
    ...
    bool peer_req_rvd; // whether this side received a PEX_REQ
};
Source Code 5.1: swift.h
Module transfer.cpp was modified to add support for the adapter. When
libswift discovers (receives) new peers, we use the adapter to save them
locally:
FileTransfer::FileTransfer(const char* filename, const Sha1Hash& root_hash) :
    file_(filename, root_hash), hs_in_offset_(0), cb_installed(0) {
    // In the constructor of FileTransfer:
    ...
    peerSelector = new Selector(); // initialize the selector adapter
}
...
void FileTransfer::OnPexIn(const Address& addr) {
    ...
    if (1 == role) { // only the leecher manages peers this way
        // call the adapter to save the peer, then get (ranked) peers back
        peerSelector->AddPeer(addr, this->root_hash());
        std::vector<Address> peers = peerSelector->GetPeers(sort, this->root_hash());
        int counter = peers.size();
        if (counter >= SWARM_SIZE) {
            // open channels to at most maxpeernum peers
            for (int i = 0; i < maxpeernum; i++) {
                Address addrtemp = (Address)peers[i];
                bool connected = false;
                // check whether we are already connected to this peer
                for (uint j = 0; j < hs_in_.size(); j++) {
                    Channel* c = Channel::channel(hs_in_[j]);
                    if (c && c->transfer().fd() == this->fd() && c->peer() == addrtemp)
                        connected = true; // already connected
                }
                if (!connected) {
                    // set up a new channel to the peer
                    new Channel(this, Datagram::default_socket(), addrtemp);
                }
            }
        }
        ...
    }
}
Source Code 5.2: transfer.cpp
Module sendrecv.cpp was extended with two functions — AddPexReq
and OnPexReq:
bin64_t Channel::OnData(Datagram& dgram) {
    ...
    // log the peer address, content length and receive time of this piece of data
    fprintf(transfer().pFile,
            "first request time=%lld received data from %s data length=%d time=%lld\n",
            this->transfer().starttp, addr.str(), length, NOW);
    ...
}
...
void Channel::Reschedule() {
    ...
    // before deleting the channel, delete the peer from the peer selector
    transfer().peerSelector->DelPeer(this->peer(), transfer().root_hash());
    ...
}
...
void Channel::AddPexReq(Datagram& dgram) {
    // check whether we need to send a PEX_REQ
    if (!transfer().peerSelector->IsNeedPeer()) return;
    // yes: put the PEX_REQ message into the datagram
    dgram.Push8(SWIFT_PEX_REQ);
    dprintf("%s #%u +pex req\n", tintstr(), id);
}

void Channel::OnPexReq(Datagram& dgram) {
    // we received a PEX_REQ message; record it
    peer_req_rvd = true;
    dprintf("%s #%u -pex req\n", tintstr(), id);
}

void Channel::AddPex(Datagram& dgram) {
    // check whether we need to send peers to the other side
    if (!peer_req_rvd) return; // if false, just return
    ...
}
...
Source Code 5.3: sendrecv.cpp
5.2 Implementation of the adapter module
Two modules, selector.h and selector.cpp, were designed, implementing the
class Selector. The functions used by libswift and PeerSelector are
public: AddPeer, DelPeer, GetPeers, and IsNeedPeer are used by libswift,
while NeedPeer is used by PeerSelector.

In the private section there are three variables: need_peer — the switch
we designed to indicate whether libswift needs to send a PEX_REQ message;
inpeers — a queue used to save peers when no external peer selection
mechanism is present; and p_Btselector — a pointer to the PeerSelector
instance.
class Selector {
public:
    Selector();
    // add a peer to the database
    void AddPeer(const Address& addr, const Sha1Hash& root);
    // delete a peer from the database
    void DelPeer(const Address& addr, const Sha1Hash& root);
    // do not provide this peer back within half an hour
    void SuspendPeer(const Address& addr, const Sha1Hash& root);
    // whether the peer selector wants more peers
    void NeedPeer(bool need);
    // used by the swift core to decide whether to send a PEX_REQ
    bool IsNeedPeer() { return need_peer; }
    // get peers from the database; type: 0 score, 1 rtt, 2 as-hops, 3 random
    std::vector<Address> GetPeers(int type, const Sha1Hash& for_root);
    // future use
    Address GetBestPeer();
private:
    bool need_peer;
    // all peers received by swift; the queue mentioned in Section 5.2
    deque<Address> inpeers;
#ifdef BITswift
    // the third-party peer selector
    BitSwiftSelector* p_Btselector;
#endif
};
Source Code 5.4: selector.h
Module selector.cpp implements the adapter functions. In the constructor
we initialize PeerSelector and set the switch to true (the default
behavior). Function AddPeer first checks whether PeerSelector is null; if
it is, the peer is saved to the inpeers queue. Similarly, function
GetPeers checks whether PeerSelector is null; if it is, the peers stored
in the queue are returned.
Selector::Selector() {
#ifdef BITswift
    // initialize the third-party peer selector
    p_Btselector = new BitSwiftSelector();
#endif
    // by default, swift needs peers
    NeedPeer(true);
}

void Selector::AddPeer(const Address& addr, const Sha1Hash& root) {
    // if there is no peer selector, just save the peer in the queue
    if (p_Btselector == NULL) {
        inpeers.push_back(addr);
        return; // without this return the null selector would be used below
    }
    // first convert the peer address and root hash
    // to the format used by the peer selector
    uint32_t ipv4 = ntohl(addr.addr.sin_addr.s_addr);
    char rs[20] = { 0 };
    sprintf(rs, "%i.%i.%i.%i", ipv4 >> 24, (ipv4 >> 16) & 0xff,
            (ipv4 >> 8) & 0xff, ipv4 & 0xff);
    string ipString(rs);
    // then call addpeer to save it to the peer selector
    p_Btselector->addpeer(ipString, addr.port(), root.hex());
}

void Selector::DelPeer(const Address& addr, const Sha1Hash& root) {
    // if there is no peer selector, delete the peer from the queue
    if (p_Btselector == NULL) {
        for (size_t i = 0; i < inpeers.size(); ) {
            if (inpeers[i] == addr)
                inpeers.erase(inpeers.begin() + i); // do not advance after erasing
            else
                ++i;
        }
        return;
    }
    uint32_t ipv4 = ntohl(addr.addr.sin_addr.s_addr);
    char rs[20] = { 0 };
    sprintf(rs, "%i.%i.%i.%i", ipv4 >> 24, (ipv4 >> 16) & 0xff,
            (ipv4 >> 8) & 0xff, ipv4 & 0xff);
    string ipString(rs);
    // then call deletepeer to remove it from the peer selector
    p_Btselector->deletepeer(ipString, addr.port(), root.hex());
}

std::vector<Address> Selector::GetPeers(int type, const Sha1Hash& for_root) {
    std::vector<ipPorts> ip_port_list;
    std::vector<Address> peers;
    // if there is no peer selector, return the peers from our queue
    if (p_Btselector == NULL) {
        std::copy(inpeers.begin(), inpeers.end(), std::back_inserter(peers));
        return peers;
    }
    // otherwise call the getpeer function of the peer selector
    p_Btselector->getpeer(for_root.hex(), type, ip_port_list, 20);
    // then convert the result to the type used by the swift core
    for (std::vector<ipPorts>::const_iterator j = ip_port_list.begin();
         j != ip_port_list.end(); ++j) {
        Address addr(j->ipAddress.c_str(), j->port);
        peers.push_back(addr);
    }
    return peers;
}

void Selector::SuspendPeer(const Address& addr, const Sha1Hash& root) {
    // future use; needs peer selector support, does not affect our experiments
}

void Selector::NeedPeer(bool need) {
    need_peer = need;
}

Address Selector::GetBestPeer() {
    // future use; needs peer selector support, does not affect our experiments
}
Source Code 5.5: selector.cpp
5.3 Integration with PeerSelector
PeerSelector implements BitSwiftSelector class in order to save peers to
the local database, and request “ranked” peers from the database. We use
this class, BitSwiftSelector.h and BitSwiftSelector.cpp respectively, to
access the functions of PeerSelector.
class BitSwiftSelector {
public:
    void addpeer(string ip, int port, string hash);
    void deletepeer(string ip, int port, string hash, int type = 100);
    void getpeer(string hash, int type, std::vector<ipPorts>& ip_port,
                 int count = 0);
    ...
};
Source Code 5.6: BitSwiftSelector.h
Chapter 6
Experimental Evaluation
We conducted experiments to test the functions of the implemented modules
and to make sure the PEX_REQ message modifications work as expected. These
experiments also evaluate whether PeerSelector works according to its
design and whether libswift's download performance improves. In summary,
the first experiments test PEX_REQ; the rest evaluate the adapter
extension.
6.1 PEX modifications
Setup
We configured a swarm with three seeders running on the same computer
under different ports, and a leecher running on a separate machine. The
leecher connects to the initial seeder in two test cases. In the first
test case, the leecher calls NeedPeer(false) in the adapter's constructor,
meaning it does not need peers and should not send a PEX_REQ to the
initial seeder. In the second test case, the leecher calls NeedPeer(true)
in the adapter's constructor, meaning it needs peers and will send a
PEX_REQ to the initial seeder.
Results
Figure 6.1 shows that if the leecher does not send a PEX_REQ message to
the initial seeder, it cannot receive peers. Figure 6.2 shows that if the
leecher sends a PEX_REQ message to the initial seeder, it receives peers
as a result. Therefore, our design and implementation work as expected.
Figure 6.1: Test case 1 Figure 6.2: Test case 2
6.2 Adapter extension
Testbed environment
To set up a simulation swarm, we chose sixteen PlanetLab nodes running
Fedora Linux 2.6 (32-bit). In the swarm, one computer acts as the leecher
and all the other computers act as seeders. Table 6.1 lists the properties
of these machines. For the geographic score, a smaller value means closer
to the leecher, and a smaller RTT means lower data delay. AS-hop and RTT
values are relative to the leecher.
To run our implementation, we installed GNU Make 3.81 and gcc version
4.4.4 on the PlanetLab machines. To be able to run PeerSelector, we
installed mysql-devel, mysql-server and mysql++; PeerSelector uses MySQL
to save peer addresses. Finally, we also installed curl-devel and GeoIP,
as PeerSelector uses these utilities to retrieve the location of an IP
address.

IP Address Port AS-hop Score RTT(ms) Description
193.xxx.xxx.35 leecher
193.xxx.xxx.36 8000 0 1 0.24327 seeder
128.xxx.xxx.20 8000 3 7955 141.721 seeder
128.xxx.xxx.21 8000 3 7955 141.71 seeder
193.xxx.xxx.250 8000 2 626 8.7664 seeder
193.xxx.xxx.251 8000 2 626 8.74429 seeder
129.xxx.xxx.194 8000 3 8572 153.402 seeder
128.xxx.xxx.197 8000 3 9375 246.033 seeder
128.xxx.xxx.198 8000 3 9375 250.857 seeder
128.xxx.xxx.199 8000 3 9375 252.049 seeder
140.xxx.xxx.180 8000 2 9632 313.362 seeder
128.xxx.xxx.52 8000 4 7882 142.071 seeder
128.xxx.xxx.53 8000 4 7882 145.013 seeder
200.xxx.xxx.34 8000 3 9905 269.373 seeder
192.xxx.xxx.11 8000 2 525 0.874 initial seeder
192.xxx.xxx.12 8000 2 525 0.86848 seeder
Table 6.1: PlanetLab machine properties
We put the target file on all seeders. We then chose two machines, one
running as the leecher and another as the initial seeder: in our tests,
193.xxx.xxx.35 is the leecher and 192.xxx.xxx.11 is the initial seeder.
The initial seeder knows the other seeders' IP addresses, and these peers
are sent to the leecher as soon as the connection is set up. Since the
initial seeder tracks the rest of the peers and is the first peer the
leecher contacts, it is also considered a tracker.
When the leecher is launched, it reads a configuration file named config
for three properties:
• ROLE — defines whether libswift functions as a leecher or a seeder. We
set this value to 1, which means this instance is a leecher
• PEERNO — defines how many concurrent peers the leecher can communicate
with. In our tests, the number is 5
• SORT — defines which policy should be used to sort peers. Number 0
stands for geographic location, number 1 stands for AS-hop, number 2
stands for RTT, and number 3 stands for Random.
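For illustration, the config file for our leecher could look as follows.
The exact syntax is not important here; this hypothetical rendering simply
shows the three properties with values used in our tests (SORT=2 selects
the RTT policy):

ROLE=1      # run as a leecher
PEERNO=5    # communicate with at most 5 concurrent peers
SORT=2      # rank peers by RTT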
6.2.1 Experimental results
We tested the adapter module with the four policies of PeerSelector. In
the following sections, we show some of the obtained results for each
policy.
Score

The score policy ranks peers according to the geographic distance between
two peers; the closest peer is the first one returned by PeerSelector.
From Table 6.2, we know that seeder B is the closest to the leecher and
seeder C the second closest.

Seeder IP Score
A 193.xxx.xxx.250 626
B 193.xxx.xxx.36 1
C 192.xxx.xxx.12 525
D 193.xxx.xxx.251 626
Table 6.2: Score-selected peers

Figure 6.3 shows that the leecher needed 42 seconds to finish downloading
the target file in the first trial and 48 seconds in the second. Excluding
the initial waiting time, downloading the content takes 37 seconds in the
first trial and 29 seconds in the second.
Figure 6.3: Download performance for Score

From Figure 6.3 we cannot see an obvious trend. However, in Figures 6.4
and 6.5, we see that seeder B contributed about 80% of the content; the
scores saved by PeerSelector confirm that seeder B has the lowest score.
It should also be noted that the initial delay is caused by the peer
sorting process, which was even longer in the first trial due to an
implementation bug (see Fig. 6.4).
Figure 6.4: Score, trial 1 Figure 6.5: Score, trial 2
AS-hop count
This policy ranks peers according to the AS-hop count between two peers.
For example, if peers A and B are in the same AS, they have the same AS
number and the hop count from A to B is zero; peer A's PeerSelector
therefore ranks peer B in first place.
Seeder IP Hop Count
A 193.xxx.xxx.250 2
B 193.xxx.xxx.36 0
C 192.xxx.xxx.12 2
D 193.xxx.xxx.180 2
Table 6.3: AS-hops-selected peers
Figure 6.6 shows that the first trial took 36 seconds to finish and the
second trial 51 seconds. Excluding the waiting time, the leecher takes 34
seconds to download the content in the first trial and 48 seconds in the
second.
Figure 6.6: Download performance for AS-hops
From Figures 6.7 and 6.8 we can see that one seeder contributed more than
70% of the content. From Table 6.3, we learn that the AS-hop count to
seeder B is zero, while it is two for the other seeders. Seeder B clearly
performs better than the others: it not only has a zero AS-hop count to
the leecher but also the smallest distance. Finally, Figure 6.8 shows that
during the interval from second 15 to 22, seeder B stopped contributing.
This can happen if seeder B's machine runs at 100% CPU or runs out of
memory; a temporary failure of seeder B's gateway could also cause it.
Figure 6.7: AS-hops, trial 1 Figure 6.8: AS-hops, trial 2
RTT policy
This policy ranks peers based on the round-trip time between two
computers: the peer with the smallest RTT to the leecher is placed first
in PeerSelector's queue.
Seeder IP RTT
A 193.xxx.xxx.250 8.7664
B 193.xxx.xxx.36 0.24327
C 192.xxx.xxx.12 0.86848
D 193.xxx.xxx.251 8.74429
Table 6.4: RTT-selected peers
Figure 6.9 shows very different results compared to the previous two
policies. Overall, each trial took more than 100 seconds to finish.
However, if we ignore the time spent before the actual download started,
the download times are not long: 30 and 31 seconds, respectively. The
reason for the delay is that the PeerSelector version we tested spends a
very long time calculating the RTT of peers. In our test case,
PeerSelector pings fourteen peers, each 10 times, to obtain the average
RTT, 140 pings in total.
Figure 6.9: Download performance for RTT
Figures 6.10 and 6.11 show that seeder B provides most of the content,
with seeder A contributing the second largest share. The RTT values in
Table 6.4 indicate that seeder B has the smallest value and is thus ranked
first by PeerSelector. In the first trial, all lines are flat around
second 110; this can happen if the leecher's machine runs at 100% CPU or
runs out of memory.
Figure 6.10: RTT, Trial 1 Figure 6.11: RTT, Trial 2
Random policy
When we use this policy to select peers, PeerSelector does not rank them:
whenever libswift requests peers, it returns a random peer list. The first
trial took the leecher 41 seconds to finish and the second trial 49
seconds. Deducting the time spent before libswift starts downloading
content, the periods are 46 and 40 seconds (see Fig. 6.12).
Figure 6.12: Download performance for Random
According to Figures 6.13 and 6.14, no seeder is superior to the others;
all peers provided by PeerSelector were randomly selected.

Figure 6.13: Random, Trial 1 Figure 6.14: Random, Trial 2
6.3 Comparison of policies
In Table 6.5 we can see that the RTT policy is superior to the other
policies, with score the second best. The Random policy is equivalent to
libswift downloading from peers according to its default behavior.

Policies Trials Time (s) DW speed (MBps)
Score trial 1 37 1.06
trial 2 29 1.36
AS-hops trial 1 34 1.16
trial 2 48 0.82
RTT trial 1 30 1.32
trial 2 31 1.27
Random trial 1 46 0.86
trial 2 40 0.98
Table 6.5: Comparison of policy-based performance

Compared to the Random policy, on average, the score policy improves
download speed by 31% and RTT improves performance by up to 40%. The
AS-hop policy exhibits very different results: in the first trial
performance improves by 26%, in the second trial speed decreases by 11%.
From our logs, we know that there are 5 peers whose AS-hop count is two
and 7 peers whose AS-hop count is three. When PeerSelector ranks the peers
whose AS-hop count equals two, it cannot tell these 5 peers apart. Given
1000 peers, there could be many peers with the same AS-hop count but very
different capabilities, so libswift may not always get the best-performing
peers. However, we can use the AS-hop count policy as a first filter and,
based on its results, apply a second filter such as RTT to further select
better peers.
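Such a two-stage selection could look as follows; the PeerInfo record is
the same hypothetical type as in the ranking sketch of Section 4.2, and
this is a proposal, not part of the current PeerSelector:

#include <algorithm>
#include <vector>

struct PeerInfo { double rtt_ms; int as_hops; };

// First order by AS-hop count, then break ties by RTT, as proposed above.
void rankByAsHopsThenRtt(std::vector<PeerInfo>& peers) {
    std::sort(peers.begin(), peers.end(),
              [](const PeerInfo& a, const PeerInfo& b) {
                  if (a.as_hops != b.as_hops)
                      return a.as_hops < b.as_hops; // filter 1: fewer AS hops
                  return a.rtt_ms < b.rtt_ms;       // filter 2: smaller RTT
              });
}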
6.3.1 Side effects of PeerSelector
We have seen that PeerSelector needs time to rank peers; we call this the
ranking time. The more peers there are, the longer the ranking time. There
are two ways to reduce it. First, optimize PeerSelector's ranking method:
when using RTT, for example, we could ping each peer once instead of 10
times, cutting the ranking time by a factor of ten. Second, rank peers
based on historical records: if PeerSelector already has peer A's
geographic score in its database, it does not need to calculate the score
again the next time it encounters peer A.
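The second idea amounts to a simple cache in front of the expensive
computation. A sketch, with computeGeoScore standing in for the actual
GeoIP lookup (an assumed helper, not PeerSelector's real API):

#include <map>
#include <string>

double computeGeoScore(const std::string& addr); // assumed expensive GeoIP lookup

static std::map<std::string, double> score_cache; // peer address -> score

double geoScore(const std::string& addr) {
    std::map<std::string, double>::iterator it = score_cache.find(addr);
    if (it != score_cache.end())
        return it->second;            // cache hit: no recalculation
    double s = computeGeoScore(addr); // compute once on first encounter
    score_cache[addr] = s;
    return s;
}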
Part III
Discussion and Conclusions
Chapter 7
Discussion
In this thesis, we took the libswift protocol and analyzed its most
important processes and inner heuristics. In particular, we studied the
peer discovery process employed in libswift, the PEX gossiping algorithm,
and discovered its limitations; we then modified PEX messaging and tested
the new functionality. Through experimentation and observation, we
investigated libswift's behavior when it retrieves pieces over multiple
channels. Finally, we studied piece picking and congestion control,
emphasized their vulnerabilities, and suggested future enhancements for
them.
Having studied the above processes and having understood current
limitations in the implementation, we designed and implemented a simple
module — adapter — that acts as an interface between libswift and a peer
selection mechanism, helping libswift manage its peers more effectively.
In our experiments, we also discovered that some peers contribute more
content than others. In particular, we found that peers with certain
properties exhibit better performance and thus influence libswift's
download performance. Our tests with peers that have a small RTT value or
AS-hop count, or that are geographically closer, provide preliminary
evidence that such peers can improve download performance for a libswift
peer.
7.1 Libswift
We found that libswift downloads content unevenly from the peers it is
connected with. When peers have similar RTTs, libswift retrieves content
from them roughly equally; we observed this in the test cases for groups
one and two in Section 6.2.1, where the two channels between the leecher
and the seeders carry balanced traffic. In the test cases with group
three, however, one channel is very busy, with its seeder contributing
more than 90% of the content, while the other channel is almost idle, its
seeder waiting to be updated with the latest bin number. This uneven
contribution is due to libswift's piece selection strategy as well as its
congestion control. Currently, libswift downloads content pieces
sequentially and the seeders compete to serve the leecher. In the test
case with group three, seeder two's RTT is approximately 487 times larger
than seeder one's; according to LEDBAT, seeder one's congestion window
therefore grows much faster, giving seeder one a much higher data transfer
rate.
We consider that an improved piece selection mechanism, similar to
BitTorrent's, should be adopted in libswift for enhanced performance. By
randomly distributing piece requests among channels, download efficiency
should improve, as there is less chance that multiple peers will be
serving the same chunk to the leecher.
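A random picker is only a few lines; the sketch below assumes the leecher
keeps a list of still-missing bin numbers (the names are illustrative, not
libswift's):

#include <cstdint>
#include <random>
#include <vector>

// Pick a random missing chunk for the next request on a channel, so that
// concurrent channels rarely request the same chunk. The caller must
// ensure that 'missing' is non-empty.
uint64_t pickRandomMissing(const std::vector<uint64_t>& missing,
                           std::mt19937& rng) {
    std::uniform_int_distribution<size_t> pick(0, missing.size() - 1);
    return missing[pick(rng)];
}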
When evaluating our adapter module, we found another interesting result.
When libswift wants to connect to peers, it sends a handshake message to
the "best" peers returned by PeerSelector; however, the leecher does not
always receive responses from these best peers first. In the experiments
of Section 6.2, seeder B has a smaller RTT value than seeder D and the
other seeders, yet although libswift sent a handshake to seeder B first,
it was seeder D that responded to the leecher first.

Investigating this issue more closely, the process of downloading content
from these peers can be divided into two phases. During the first phase,
peers that received a handshake request initialize new channels to the
leecher. This initialization time depends on the machine's computing
speed; if seeder D computes faster than seeder B, seeder D may respond to
the leecher before seeder B does. A quick test of the machines' computing
speed confirmed that seeder D is indeed much faster. Consequently, RTT is
not the only factor affecting response time. In this circumstance, if a
leecher were to download a very small target file, it may be better to
download from a machine with higher computing speed than from one with
lower RTT.
7.2 Extensions
Before we enhanced the PEX functionality, libswift would passively receive
peers from seeders, violating the protocol specification. Furthermore, if
a swarm were as large as, say, 5000 peers, a libswift instance would
sooner or later receive all 5000 of them. Modifying the PEX behavior so
that peers are received only when actively requested avoids such issues.
Secondly, the adapter extension we devised enables libswift to manage the
peers it discovers more efficiently. In particular, when the adapter is
combined with a peer selection mechanism, libswift gains the ability to
describe its peers in terms of RTT, geo-location, and AS-hop count. This
feature, in turn, enables libswift to fine-tune its performance by
selecting peers with desired properties.
Regarding the policies for peer selection, some of the assumptions behind
them are not necessarily true. For example, the RTT policy assumes that a
peer with a better RTT to the leecher yields a faster download. Consider,
however, an extreme situation: seeder one's RTT is only a tenth of seeder
two's, but seeder two is one hundred times faster than seeder one. In this
case, seeder two gives the better download performance, a phenomenon that
also appeared in our tests.

Further, the theory behind the score policy is that if two computers are
geographically closer, the wire link between them is shorter and traffic
transfer is faster. This is only true if we ignore the computers' speed
and the traffic delay between different ASes.
In our implementation, we considered these properties independently in
order to reduce the complexity of the design and realization. Taking other
factors (peer properties) and the correlations between them into
consideration could improve system performance further.
Chapter 8
Conclusions
In this thesis, we argued that peer management is an important mechanism
that affects the performance of the libswift protocol. Through analysis
and experimentation, we showed that peer management indeed impacts
libswift's download speed and efficiency.
In particular, we investigated and profiled libswift behavior for scenarios
with congestion and multiple-channel download. As a result of our research,
we identified vulnerabilities in the peer management and piece picking
processes of libswift protocol and suggested improvements for them.
Specifically, we complemented the current peer discovery algorithm so that
a peer can discover new peers proactively rather than passively, reducing
the overhead of establishing unnecessary communication channels.

The main lesson we drew from analyzing the piece picking algorithm is that
it raises competition among seeders; an alternative, random piece picking
algorithm should therefore be considered.

The important inferences we can make about congestion control are that
changing the size of the congestion window impacts the data transfer rate
and thus the download speed, and that more suitable values for the initial
window size, TARGET, and GAIN should be investigated.

Having made the above inferences, we came to the natural conclusion that a
peer management mechanism is necessary for libswift to manage its peers
more effectively. We therefore designed and implemented an adapter module
that libswift can use to select and manage peers and, as a result, improve
download speed and reduce overhead. Preliminary results in this thesis
suggest that our adapter works as designed and, moreover, improves
libswift's download performance.
Appendix A
Appendix
...
192.xxx.xxx.11:8000 1024 4063544 chunk=(0,4244)
193.xxx.xxx.250:8000 1024 4064668 chunk=(0,4256)
192.xxx.xxx.11:8000 1024 4065273 chunk=(0,4245)
193.xxx.xxx.250:8000 1024 4067050 chunk=(0,4257)
192.xxx.xxx.11:8000 1024 4070193 chunk=(0,4246)
192.xxx.xxx.11:8000 1024 4070870 chunk=(0,4247)
193.xxx.xxx.250:8000 1024 4071050 chunk=(0,4258)
192.xxx.xxx.11:8000 1024 4071526 chunk=(0,4248)
192.xxx.xxx.11:8000 1024 4072356 chunk=(0,4249)
193.xxx.xxx.250:8000 1024 4072445 chunk=(0,4259)
192.xxx.xxx.11:8000 1024 4072922 chunk=(0,4250)
193.xxx.xxx.250:8000 1024 4073411 chunk=(0,4260)
193.xxx.xxx.250:8000 1024 4073611 chunk=(0,4261)
192.xxx.xxx.11:8000 1024 4074055 chunk=(0,4251)
192.xxx.xxx.11:8000 1024 4074512 chunk=(0,4252)
193.xxx.xxx.250:8000 1024 4074897 chunk=(0,4262)
192.xxx.xxx.11:8000 1024 4075262 chunk=(0,4253)
193.xxx.xxx.250:8000 1024 4075684 chunk=(0,4263)
192.xxx.xxx.11:8000 1024 4076110 chunk=(0,4254)
193.xxx.xxx.250:8000 1024 4076399 chunk=(0,4264)
192.xxx.xxx.11:8000 1024 4076504 chunk=(0,4255)
(droped) 192.xxx.xxx.11:8000 1024 4077195 chunk=(0,4256)
193.xxx.xxx.250:8000 1024 4077316 chunk=(0,4265)
(droped) 192.xxx.xxx.11:8000 1024 4077382 chunk=(0,4257)
193.xxx.xxx.250:8000 1024 4078310 chunk=(0,4266)
(droped) 192.xxx.xxx.11:8000 1024 4079086 chunk=(0,4258)
193.xxx.xxx.250:8000 1024 4079135 chunk=(0,4267)
(droped) 192.xxx.xxx.11:8000 1024 4079658 chunk=(0,4259)
193.xxx.xxx.250:8000 1024 4080033 chunk=(0,4268)
(droped) 192.xxx.xxx.11:8000 1024 4080275 chunk=(0,4260)
193.xxx.xxx.250:8000 1024 4080707 chunk=(0,4269)
(droped) 192.xxx.xxx.11:8000 1024 4081169 chunk=(0,4261)
193.xxx.xxx.250:8000 1024 4081527 chunk=(0,4270)
(droped) 192.xxx.xxx.11:8000 1024 4082012 chunk=(0,4262)
193.xxx.xxx.250:8000 1024 4082434 chunk=(0,4271)
(droped) 192.xxx.xxx.11:8000 1024 4083269 chunk=(0,4263)
193.xxx.xxx.250:8000 1024 4083489 chunk=(0,4272)
(droped) 192.xxx.xxx.11:8000 1024 4084073 chunk=(0,4264)
(droped) 192.xxx.xxx.11:8000 1024 4084332 chunk=(0,4265)
193.xxx.xxx.250:8000 1024 4084674 chunk=(0,4273)
(droped) 192.xxx.xxx.11:8000 1024 4087751 chunk=(0,4266)
(droped) 192.xxx.xxx.11:8000 1024 4089016 chunk=(0,4267)
193.xxx.xxx.250:8000 1024 4089434 chunk=(0,4274)
(droped) 192.xxx.xxx.11:8000 1024 4090456 chunk=(0,4268)
193.xxx.xxx.250:8000 1024 4090517 chunk=(0,4275)
193.xxx.xxx.250:8000 1024 4110318 chunk=(0,4276)
193.xxx.xxx.250:8000 1024 4111347 chunk=(0,4277)
193.xxx.xxx.250:8000 1024 4111416 chunk=(0,4278)
193.xxx.xxx.250:8000 1024 4112552 chunk=(0,4279)
(droped) 192.xxx.xxx.11:8000 1024 4112725 chunk=(0,4269)
193.xxx.xxx.250:8000 1024 4113319 chunk=(0,4280)
(droped) 192.xxx.xxx.11:8000 1024 4113882 chunk=(0,4270)
(droped) 192.xxx.xxx.11:8000 1024 4114206 chunk=(0,4271)
(droped) 192.xxx.xxx.11:8000 1024 4114715 chunk=(0,4272)
193.xxx.xxx.250:8000 1024 4115820 chunk=(0,4281)
(droped) 192.xxx.xxx.11:8000 1024 4116120 chunk=(0,4273)
193.xxx.xxx.250:8000 1024 4116515 chunk=(0,4282)
(droped) 192.xxx.xxx.11:8000 1024 4117447 chunk=(0,4274)
193.xxx.xxx.250:8000 1024 4117895 chunk=(0,4283)
(droped) 192.xxx.xxx.11:8000 1024 4118699 chunk=(0,4275)
193.xxx.xxx.250:8000 1024 4119161 chunk=(0,4284)
(droped) 192.xxx.xxx.11:8000 1024 4119712 chunk=(0,4276)
(droped) 192.xxx.xxx.11:8000 1024 4119832 chunk=(0,4277)
193.xxx.xxx.250:8000 1024 4121005 chunk=(0,4285)
(droped) 192.xxx.xxx.11:8000 1024 4121305 chunk=(0,4278)
193.xxx.xxx.250:8000 1024 4121674 chunk=(0,4286)
(droped) 192.xxx.xxx.11:8000 1024 4122011 chunk=(0,4279)
193.xxx.xxx.250:8000 1024 4125762 chunk=(0,4287)
(droped) 192.xxx.xxx.11:8000 1024 4126226 chunk=(0,4280)
193.xxx.xxx.250:8000 1024 4126473 chunk=(0,4288)
(droped) 192.xxx.xxx.11:8000 1024 4127015 chunk=(0,4281)
(droped) 192.xxx.xxx.11:8000 1024 4127387 chunk=(0,4282)
193.xxx.xxx.250:8000 1024 4127633 chunk=(0,4289)
193.xxx.xxx.250:8000 1024 4127696 chunk=(0,4290)
(droped) 192.xxx.xxx.11:8000 1024 4128487 chunk=(0,4283)
193.xxx.xxx.250:8000 1024 4128799 chunk=(0,4291)
(droped) 192.xxx.xxx.11:8000 1024 4132360 chunk=(0,4284)
(droped) 192.xxx.xxx.11:8000 1024 4132826 chunk=(0,4285)
193.xxx.xxx.250:8000 1024 4134114 chunk=(0,4292)
(droped) 192.xxx.xxx.11:8000 1024 4135121 chunk=(0,4286)
193.xxx.xxx.250:8000 1024 4135173 chunk=(0,4293)
(droped) 192.xxx.xxx.11:8000 1024 4135666 chunk=(0,4287)
193.xxx.xxx.250:8000 1024 4135717 chunk=(0,4294)
(droped) 192.xxx.xxx.11:8000 1024 4136038 chunk=(0,4288)
193.xxx.xxx.250:8000 1024 4136473 chunk=(0,4295)
(droped) 192.xxx.xxx.11:8000 1024 4136535 chunk=(0,4289)
193.xxx.xxx.250:8000 1024 4137154 chunk=(0,4296)
(droped) 192.xxx.xxx.11:8000 1024 4137261 chunk=(0,4290)
193.xxx.xxx.250:8000 1024 4142614 chunk=(0,4297)
(droped) 192.xxx.xxx.11:8000 1024 4142709 chunk=(0,4291)
193.xxx.xxx.250:8000 1024 4143231 chunk=(0,4298)
(droped) 192.xxx.xxx.11:8000 1024 4143429 chunk=(0,4292)
193.xxx.xxx.250:8000 1024 4143486 chunk=(0,4299)
(droped) 192.xxx.xxx.11:8000 1024 4144043 chunk=(0,4293)
...
Source Code A.1: Logs
Bibliography
[1] Gnuplot. http://www.gnuplot.info/.
[2] MaxMind GeoIP. http://www.maxmind.com/app/ip-location.
[3] Planetlab testbed. https://www.planet-lab.org/.
[4] V. Aggarwal, A. Feldmann, and Ch. Scheideler. Can ISPs and P2P users
cooperate for improved performance? SIGCOMM Comput. Commun.
Rev., 2007.
[5] R. Birke, E. Leonardi, M. Mellia, A. Bakay, T. Szemethy, C. Kiraly, and
L. Cigno. Architecture of a network-aware P2P-TV application: the
NAPA-WINE approach. In IEEE Communications Magazine, 2011.
[6] Bram Cohen. Bittorrent protocol specification.
http://www.bittorrent.org/beps/bep_0003.html.
[7] Bram Cohen. Incentives build robustness in bittorrent. In Proceedings
of the Workshop on Economics of Peer-to-Peer Systems, 2003.
[8] C. Cramer, K. Kutzner, and Th. Fuhrmann. Bootstrapping locality-
aware P2P networks. In 12th IEEE International Conference in
Networks (ICON), 2004.
[9] L. D'Acunto, N. Andrade, J. Pouwelse, and H. Sips. Peer selection
strategies for improved QoS in heterogeneous bittorrent-like VoD
systems. In IEEE International Symposium in Multimedia (ISM), 2010.
[10] R. Alimi et al. ALTO protocol.
http://tools.ietf.org/html/draft-ietf-alto-protocol-13.
[11] T. Eugene and H. Zhang. Predicting Internet network distance with
coordinates-based approaches. In INFOCOM, 2002.
[12] P. Francis, S. Jamin, V. Paxson, L. Zhang, D. Gryniewicz, and Y. Jin.
An architecture for a global Internet host distance estimation service.
In INFOCOM, 1999.
[13] Victor Grishchenko. Libswift protocol. http://libswift.org/.
[14] Victor Grishchenko. Libswift protocol implementation.
http://github.com/triblerteam/libswift.
[15] Victor Grishchenko and Arno Bakker. IETF PPSP working
group draft technical description, Peer-to-Peer Streaming Protocol.
http://datatracker.ietf.org/doc/draft-ietf-ppsp-peer-protocol/.
[16] BitTorrent Inc. DHT protocol specification.
http://www.bittorrent.org/beps/bep_0005.html.
[17] H. Iyengar, M. Kuehlewind, and S. Shalunov. Low extra delay
background transport (LEDBAT).
http://tools.ietf.org/html/draft-ietf-ledbat-congestion-09.
[18] Rakesh Kumar. PeerSelector: A framework for grouping peers in a P2P
system. 2012.
[19] R. LaFortune and C. Carothers. Simulating large-scale P2P assisted
video streaming. In HICSS, 2009.
[20] F. Lehrieder, S. Oechsner, T. Hossfeld, Z. Despotovic, W. Kellerer,
and M. Michel. Can P2P-users benefit from locality-awareness? In IEEE
Proceedings of Peer-to-Peer Computing, 2010.
[21] Y. Liu, H. Wang, Y. Lin, and Sh. Cheng. Friendly P2P: Application-
level congestion control for peer-to-peer applications. In IEEE
GLOBECOM, 2008.
[22] Jamshid Mahdavi and Sally Floyd. TCP-friendly unicast rate-based
flow control. ftp://ftp.cs.umass.edu/pub/net/cs691q/tcp-friendly.txt.
[23] D. Milojicic, V. Kalogeraki, R. Lukose, K. Nagaraja, J. Pruyne,
B. Richard, S. Rollins, and Z. Xu. Peer-to-Peer Computing. 2003.
[24] J. Pouwelse, P. Garbacki, D. Epema, and H. Sips. An introduction to
the Bittorrent Peer-to-Peer file-sharing system. Report, 2005.
[25] NAPA WINE Project. Network-aware P2P-TV application over wise
networks. http://www.napa-wine.eu.
[26] Rudiger Schollmeier. A definition of peer-to-peer networking for
the classification of peer-to-peer architectures and applications. In
Proceedings of P2P, 2002.
[27] J. Seedorf, S. Kiesel, and M. Stiemerling. Traffic localization for
P2P-applications: The ALTO approach. In IEEE Ninth International
Conference in Peer-to-Peer Computing, 2009.
[28] Jan Seedorf. Application-Layer Traffic Optimization. RFC 5693, 2009.
[29] R. Steinmetz and K. Wehrle. Peer-to-Peer Systems and Applications.
2005.
[30] H. Wang, J. Liu, and K. Xu. On the locality of Bittorrent-based video
file swarming. In Proceedings of IPTPS. Usenix, 2009.
[31] H. Xie, R. Yang, A. Krishnamurthy, G. Liu, and A. Silberschatz. P4P:
provider portal for applications. In Proceedings of the ACM SIGCOMM
2008 conference on Data communication.