IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEBRUARY 2011

Live Streaming with Receiver-based Peer-division Multiplexing

Hyunseok Chang, Sugih Jamin, and Wenjie Wang

H. Chang is with Alcatel-Lucent Bell Labs, Holmdel, NJ 07733 USA (e-mail: [email protected]).
S. Jamin is with the EECS Department, University of Michigan, Ann Arbor, MI 48109 USA (e-mail: [email protected]).
W. Wang is with IBM Research, CRL, Beijing 100193, China (e-mail: [email protected]).
This work was done when authors Chang and Wang were at Zattoo Inc.
Abstract—A number of commercial peer-to-peer systems for live streaming have been introduced in recent years. The behavior of these popular systems has been extensively studied in several measurement papers. Due to the proprietary nature of these commercial systems, however, these studies have to rely on a black-box approach, where packet traces are collected from a single or a limited number of measurement points, to infer various properties of traffic on the control and data planes. Although such studies are useful to compare different systems from the end-user's perspective, it is difficult to intuitively understand the observed properties without fully reverse-engineering the underlying systems. In this paper we describe the network architecture of Zattoo, one of the largest production live streaming providers in Europe at the time of writing, and present a large-scale measurement study of Zattoo using data collected by the provider. To highlight, we found that even when the Zattoo system was heavily loaded, with as many as 20,000 concurrent users on a single overlay, the median channel join delay remained less than 2 to 5 seconds, and that, for a majority of users, the streamed signal lags the over-the-air broadcast signal by no more than 3 seconds.
Index Terms—Peer-to-peer system, live streaming, network architecture
I. INTRODUCTION
THERE is an emerging market for IPTV. Numerous commercial systems now offer services over the Internet that
are similar to traditional over-the-air, cable, or satellite TV.
Live television, time-shifted programming, and content-on-
demand are all presently available over the Internet. Increased
broadband speed, growth of broadband subscription base, and
improved video compression technologies have contributed to
the emergence of these IPTV services.
We draw a distinction between three uses of peer-to-peer (P2P) networks: delay-tolerant file download of archival material, delay-sensitive progressive download (or streaming) of archival material, and real-time live streaming. In the first case, the completion of download is elastic, depending on available bandwidth in the P2P network. The application buffer receives data as it trickles in and informs the user upon the completion of download. The user can then start playing back the file for viewing in the case of a video file. BitTorrent and its variants are examples of delay-tolerant file download systems. In the second case, video playback starts as
soon as the application assesses it has sufficient data buffered
that, given the estimated download rate and the playback rate,
it will not deplete the buffer before the end of file. If this
assessment is wrong, the application would have to either
pause playback and rebuffer, or slow down playback. While
users would like playback to start as soon as possible, the
application has some degree of freedom in trading off playback
start time against estimated network capacity. Most video-on-
demand systems are examples of delay-sensitive progressive-
download applications. The third case, real-time live streaming,
has the most stringent delay requirement. While progressive download may tolerate initial buffering of tens of seconds or
even minutes, live streaming generally cannot tolerate more
than a few seconds of buffering. Taking into account the
delay introduced by signal ingest and encoding, and network
transmission and propagation, the live streaming system can
introduce only a few seconds of buffering time end-to-end and
still be considered live [1].
The Zattoo peer-to-peer live streaming system was a free-
to-use network serving over 3 million registered users in eight
European countries at the time of study, with a maximum
of over 60,000 concurrent users on a single channel. The
system delivers live streams using a receiver-based, peer-
division multiplexing scheme as described in Section II. To ensure real-time performance when peer uplink capacity is below requirement, Zattoo subsidizes the network's bandwidth requirement, as described in Section III. After delving into Zattoo's architecture in detail, we study in Sections IV and V
large-scale measurements collected during the live broadcast
of the UEFA European Football Championship, one of the
most popular one-time events in Europe, in June, 2008 [2].
During the course of the month of June 2008, Zattoo served
more than 35 million sessions to more than one million distinct
users. Drawing from these measurements, we report on the operational scalability of Zattoo's live streaming system by addressing several key questions:
1) How does the system scale in terms of overlay size and its effectiveness in utilizing peers' uplink bandwidth?
2) How responsive is the system during channel switching,
for example, when compared to the 3-second channel
switch time of satellite TV?
3) How effective is the packet retransmission scheme in
allowing a peer to recover from transient congestion?
4) How effective is the receiver-based peer-division multi-
plexing scheme in delivering synchronized sub-streams?
5) How effective is the global bandwidth subsidy system
in provisioning for flash crowd scenarios?
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
7/31/2019 Live Streaming With
2/14
2
6) Would a peer further away from the stream source
experience adversely long lag compared to a peer closer
to the stream source?
7) How effective is the error-correcting code in isolating packet
losses on the overlay?
We also discuss in Section VI several challenges in increasing the bandwidth contribution of Zattoo peers. Finally, we describe related work in Section VII and conclude in Section VIII.
II. SYSTEM ARCHITECTURE
The Zattoo system rebroadcasts live TV, captured from
satellites, onto the Internet. The system carries each TV
channel on a separate peer-to-peer delivery network and is not
limited in the number of TV channels it can carry. Although a peer can freely switch from one TV channel to another, thereby departing one peer-to-peer network and joining another, it can only join one peer-to-peer network at any one time. We henceforth limit our description of the Zattoo delivery network as it pertains to carrying one TV channel. Fig. 1 shows a typical setup of a single TV channel carried on the
Zattoo network. TV signal captured from satellite is encoded
into H.264/AAC streams, encrypted, and sent onto the Zattoo
network. The encoding server may be physically separated
from the server delivering the encoded content onto the Zattoo
network. For ease of exposition, we will consider the two as
logically co-located on an Encoding Server. Users are required
to register themselves at the Zattoo website to download a free
copy of the Zattoo player application. To receive the signal
of a channel, the user first authenticates itself to the Zattoo
Authentication Server. Upon authentication, the user is granted
a ticket with limited lifetime. The user then presents this ticket,
along with the identity of the TV channel of interest, to the Zattoo Rendezvous Server. If the ticket specifies that the user
is authorized to receive signal of the said TV channel, the
Rendezvous Server returns to the user a list of peers currently
joined to the P2P network carrying the channel, together with
a signed channel ticket. If the user is the first peer to join a channel, the list of peers it receives contains only the Encoding
Server. The user joins the channel by contacting the peers
returned by the Rendezvous Server, presenting its channel
ticket, and obtaining the live stream of the channel from them
(see Section II-A for details).
Each live stream is sent out by the Encoding Server as
n logical sub-streams. The signal received from satellite is
encoded into a variable-bit-rate stream. During periods of source quiescence, no data is generated. During source busy
periods, generated data is packetized into a packet stream,
with each packet limited to a maximum size. The Encoding
Server multiplexes this packet stream onto the Zattoo network
as n logical sub-streams: the first packet generated is considered part of the first sub-stream, the second packet that of the second sub-stream, and the n-th packet that of the n-th sub-stream. The (n+1)-th packet cycles back to the first sub-stream, and so on, such that the i-th sub-stream carries the (mn+i)-th packets, where m ≥ 0, 1 ≤ i ≤ n, and n is a user-defined constant. We call the set of n packets with the same index multiplier m a segment. Thus m serves as the segment index, while i serves as the packet index within a segment. Each segment is of size n packets. Being the packet index, i also serves as the sub-stream index. The number mn + i is carried in each packet as its sequence number.

Fig. 1. Zattoo delivery network architecture (Encoding Servers, Demultiplexer, Rendezvous Server, Authentication Server, Feedback Server, and other administrative servers).
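For illustration, the following Python sketch (our own rendering, not Zattoo's code; the function name is hypothetical) recovers the segment index m and sub-stream index i from a packet's sequence number:

def seq_to_indices(seq, n):
    """Map sequence number seq = m*n + i (m >= 0, 1 <= i <= n) to (m, i)."""
    m, i = divmod(seq - 1, n)   # shift to 0-based before splitting
    return m, i + 1

# e.g., with n = 16 sub-streams, packet 17 is the first packet of segment 1:
# seq_to_indices(17, 16) == (1, 1)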
Zattoo uses the Reed-Solomon (RS) error correcting code (ECC) for forward error correction. The RS code is a systematic code: of the n packets sent per segment, k < n packets carry the live stream data while the remainder carries the redundant data [3, Section 7.3]. Due to the variable-bit-rate nature of the data stream, the time period covered by a segment is variable, and a packet may be of size less than the maximum packet size. A packet smaller than the maximum packet size is zero-padded to the maximum packet size for the purposes of computing the (shortened) RS code, but is transmitted in its original size. Once a peer has received k packets of a segment, it can reconstruct the remaining n − k packets. We do not differentiate between streaming data and redundant data in our discussion in the remainder of this paper.
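The per-segment bookkeeping this implies can be sketched as follows (class and field names are our own; the actual RS decode would be delegated to any systematic RS(n, k) codec):

class Segment:
    """Collects packets of one segment; decodable once any k of n arrive."""
    def __init__(self, n, k):
        self.n, self.k = n, k
        self.packets = {}              # packet index i (1..n) -> payload

    def add(self, i, payload):
        self.packets[i] = payload

    def decodable(self):
        # a systematic RS(n, k) code recovers all n packets from any k of them
        return len(self.packets) >= self.k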
When a new peer requests to join an existing peer, it
specifies the sub-stream(s) it would like to receive from the
existing peer. These sub-streams do not have to be consecutive.
Contingent upon availability of bandwidth at existing peers,
the receiving peer decides how to multiplex a stream onto
its set of neighboring peers, giving rise to our description of
the Zattoo live streaming protocol as a receiver-based, peer-
division multiplexing protocol. The details of peer-division multiplexing are described in Section II-A, while the details of how a peer manages sub-stream forwarding and stream reconstruction are described in Section II-B. Receiver-based peer-division multiplexing has also been used by the latest version of the CoolStreaming peer-to-peer protocol, though CoolStreaming differs from Zattoo in its stream management (Section II-B) and adaptive behavior (Section II-C) [4].
A. Peer-Division Multiplexing
To minimize per-packet processing time of a stream, the
Zattoo protocol sets up a virtual circuit with multiple fan outs
at each peer. When a peer joins a TV channel, it establishes
a peer-division multiplexing (PDM) scheme amongst a set of
neighboring peers, by building a virtual circuit to each of the
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
7/31/2019 Live Streaming With
3/14
3
neighboring peers. Barring departure or performance degrada-
tion of a neighbor peer, the virtual circuits are maintained
until the joining peer switches to another TV channel. With the
virtual circuits set up, each packet is forwarded without further
per-packet handshaking between peers. We describe the PDM bootstrapping mechanism in this section and the adaptive
PDM mechanism to handle peer departure and performance
degradation in Section II-C.
The PDM establishment process consists of two phases:
the search phase and the join phase. In the search phase, the
new, joining peer determines its set of potential neighbors. In
the join phase, the joining peer requests peering relationships
with a subset of its potential neighbors. Upon acceptance of a
peering relationship request, the peers become neighbors and
a virtual circuit is formed between them.
Search phase. To obtain a list of potential neighbors, a
joining peer sends out a SEARCH message to a random subset
of the existing peers returned by the Rendezvous Server. The
SEARCH message contains the sub-stream indices for which
this joining peer is looking for peering relationships. The sub-
stream indices are usually represented as a bitmask of n bits, where n is the number of sub-streams defined for the TV channel. In the beginning, the joining peer will be looking for
peering relationships for all sub-streams and have all the bits
in the bitmask turned on. In response to a SEARCH message,
an existing peer replies with the number of sub-streams it can
forward. From the returning SEARCH replies, the joining peer
constructs a set of potential neighbors that covers the full set of
sub-streams comprising the live stream of the TV channel. The
joining peer continues to wait for SEARCH replies until the
set of potential neighbors contains at least a minimum number
of peers, or until all SEARCH replies have been received.
With each SEARCH reply, the existing peer also returns a
random subset of its known peers. If a joining peer cannot form a set of potential neighbors that covers all of the sub-
streams of the TV channel, it initiates another SEARCH round,
sending SEARCH messages to peers newly learned from the
previous round. The joining peer gives up if it cannot obtain
the full stream after two SEARCH rounds. To help the joining
peer synchronize the sub-streams it receives from multiple
peers, each existing peer also indicates for each sub-stream
the latest sequence number it has received for that sub-stream,
and the existence of any quality problem. The joining peer can
then choose sub-streams with good quality that are closely
synchronized.
Join phase. Once the set of potential neighbors is estab-
lished, the joining peer sends a JOIN request to each potential neighbor. The JOIN request lists the sub-streams for which the joining peer would like to construct a virtual circuit with the potential neighbor. If a joining peer has l potential neighbors, each willing to forward it the full stream of a TV channel, it would typically choose to have each forward only 1/l-th of the stream, to spread out the load amongst the peers and to speed
up error recovery, as described in Section II-C. In selecting
which of the potential neighbors to peer with, the joining
peer gives highest preference to topologically close-by peers,
even if these peers have less capacity or carry lower quality
sub-streams. The topological location of a peer is defined to be its subnet number, autonomous system (AS) number, and country code, in that order of precedence.

Fig. 2. Zattoo peer with IOB: the player application maintains a PDM to its neighbor peers.

A joining
peer obtains its own topological location from the Zattoo
Authentication Server as part of its authentication process.
The list of peers returned by both the Rendezvous Server
and potential neighbors all come attached with topological
locations. A topology-aware overlay not only allows us to
be ISP-friendly, by minimizing inter-domain traffic and
thus save on transit bandwidth cost, but also helps reduce
the number of physical links and metro hops traversed in the overlay network, potentially resulting in enhanced user-
perceived stream quality.
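This preference order can be expressed as a simple sort key; in the Python sketch below, subnet, asn, and country are our assumed names for the location fields attached to each peer:

def proximity_key(peer, me):
    # False sorts before True: a subnet match outranks an AS match,
    # which outranks a country match, in that order of precedence
    return (peer.subnet != me.subnet,
            peer.asn != me.asn,
            peer.country != me.country)

# candidates.sort(key=lambda p: proximity_key(p, me))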
B. Stream Management
We represent a peer as a packet buffer, called the IOB,
fed by sub-streams incoming from the PDM constructed as
described in Section II-A.1 The IOB drains to (1) a local
media player if one is running, (2) a local file if recording
is supported, and (3) potentially other peers. Fig. 2 depicts
a Zattoo player application with virtual circuits established to
four peers. As packets from each sub-stream arrive at the peer,
they are stored in the IOB for reassembly to reconstruct the full stream. Portions of the stream that have been reconstructed
are then played back to the user. In addition to providing
a reassembly area, the IOB also allows a peer to absorb
some variabilities in available network bandwidth and network
delay.
The IOB is referenced by an input pointer, a repair pointer,
and one or more output pointers. The input pointer points
to the slot in the IOB where the next incoming packet with
sequence number higher than the highest sequence number
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
7/31/2019 Live Streaming With
4/14
4
received so far will be stored. The repair pointer always
points one slot beyond the last packet received in order and is
used to regulate packet retransmission and adaptive PDM as
described later. We assign an output pointer to each forwarding
destination. The output pointer of a destination indicates the
destination's current forwarding horizon on the IOB. In accordance with the three types of possible forwarding destinations
listed above, we have three types of output pointers: player
pointer, file pointer, and peer pointer. One would typically
have at most one player pointer and one file pointer but
potentially multiple concurrent peer pointers, referencing an
IOB. The Zattoo player application does not currently support recording.

1 In the case of the Encoding Server, which we also consider a peer on the Zattoo network, the buffer is fed by the encoding process.
Since we maintain the IOB as a circular buffer, if the
incoming packet rate is higher than the forwarding rate of a
particular destination, the input pointer will overrun the output
pointer of that destination. We could move the output pointer
to match the input pointer so that we consistently forward the
oldest packet in the IOB to the destination. Doing so, however,
requires checking the input pointer against all output pointers on every packet arrival. Instead, we have implemented the IOB as a double buffer. With the double buffer, the positions of the output pointers are checked against that of the input pointer only when the input pointer moves from one sub-buffer to the other. When the input pointer moves from sub-buffer a to sub-buffer b, all the output pointers still pointing to sub-buffer b are moved to the start of sub-buffer a, and sub-buffer b is flushed, ready to accept new packets. When a sub-buffer is flushed while there are still output pointers referencing it, packets that have not been forwarded to the destinations associated with those pointers are lost to them, resulting in quality degradation. To minimize packet loss due to sub-buffer flushing, we would like to use large sub-buffers. However, the real-time delay requirement of live streaming limits the usefulness of late-arriving packets and effectively caps the maximum size of the sub-buffers.
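The pointer-reset rule can be sketched as follows (buffer and pointer objects are hypothetical):

def on_input_crossing(a, b, output_pointers):
    """Input pointer has filled sub-buffer a and is entering sub-buffer b."""
    for ptr in output_pointers:
        if ptr.sub_buffer is b:      # still lagging in the sub-buffer to be flushed
            ptr.sub_buffer = a       # unsent packets in b are lost to this pointer
            ptr.slot = 0             # resume at the oldest surviving data
    b.clear()                        # flush b, ready to accept new packets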
Different peers may request different numbers of, possi-
bly non-consecutive, sub-streams. To accommodate the differ-
ent forwarding rates and regimes required by the destinations,
we associate a packet map and forwarding discipline with each
output pointer. Fig. 3 shows the packet map associated with an
output peer pointer where the peer has requested sub-streams
1, 4, 9, and 14. Every time a peer pointer is repositioned
to the beginning of a sub-buffer of the IOB, all the packet
slots of the requested sub-streams are marked NEEDed and
all the slots of the sub-streams not requested by the peer are
marked SKIP. When a NEEDed packet arrives and is stored in the IOB, its state in the packet map is changed to READY.
As the peer pointer moves along its associated packet map,
READY packets are forwarded to the peer and their states
changed to SENT. A slot marked NEEDed but not READY,
such as slot n + 4 in Fig. 3, indicates that the packet is lost or will arrive out of order and is bypassed. When an out-of-
order packet arrives, its slot is changed to READY and the
peer pointer is reset to point to this slot. Once the out-of-order
packet has been sent to the peer, the peer pointer will move
forward, bypassing all SKIP, NEED, and SENT slots until it
reaches the next READY slot, where it can resume sending.
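A minimal sketch of the per-slot states and the forwarding advance rule (our own rendering):

from enum import Enum

class Slot(Enum):
    SKIP = 0     # sub-stream not requested by this peer
    NEED = 1     # requested, not yet received
    READY = 2    # received and stored in the IOB, eligible to forward
    SENT = 3     # already forwarded to this peer

def advance(packet_map, ptr):
    # move forward past SKIP, NEED, and SENT slots to the next READY slot
    while ptr < len(packet_map) and packet_map[ptr] is not Slot.READY:
        ptr += 1
    return ptr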
Fig. 3. Packet map associated with a peer pointer (requested sub-streams 1, 4, 9, and 14; slot states SKIP, NEED, READY, SENT).
Fig. 4. IOB, input/output pointers and packet maps: a double buffer (sub-buffers 0 and 1, each holding segments 0 through m−1 of size n), referenced by the input and repair pointers and by Peer0, Peer1, Player, and File output pointers, each with an associated packet map.
The player pointer behaves the same as a peer pointer except
that all packets in its packet map will always start out marked
NEEDed.
Fig. 4 shows an IOB consisting of a double buffer, with an input pointer, a repair pointer, an output file pointer, an output player pointer, and two output peer pointers referencing the IOB. Each output pointer has a packet map associated
with it. For the scenario depicted in the figure, the player
pointer tracks the input pointer and has skipped over some
lost packets. Both peer pointers are lagging the input pointer,
indicating that the forwarding rates to the peers are bandwidth
limited. The file pointer is pointing at the first lost packet.
Archiving a live stream to file does not impose a real-time delay
bound on packet arrivals. To achieve the best quality recording
possible, a recording peer always waits for retransmission of
lost packets that cannot be recovered by error correction.
In addition to achieving lossless recording, we use retransmission to let a peer recover from transient network congestion. A peer sends out a retransmission request when the distance between the repair pointer and the input pointer has reached a threshold of R packet slots, usually spanning multiple segments. A retransmission request consists of an R-bit packet mask, with each bit representing a packet, and the sequence number of the packet corresponding to the first bit.
Marked bits in the packet mask indicate that the corresponding
packets need to be retransmitted. When a packet loss is
detected, it could be caused by congestion on the virtual
circuits forming the current PDM or congestion on the path
beyond the neighboring peers. In either case, current neighbor
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
7/31/2019 Live Streaming With
5/14
5
peers will not be good sources of retransmitted packets. Hence
we send our retransmission requests to r random peers that are not neighbor peers. A peer receiving a retransmission request
will honor the request only if the requested packets are still
in its IOB and it has sufficient left-over capacity, after serving
its current peers, to transmit all the requested packets. Once a
retransmission request is accepted, the peer will retransmit all
the requested packets to completion.
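Building such a request is simple bit manipulation, sketched here under an assumed IOB interface (iob.has() is hypothetical):

def build_retx_request(iob, repair_seq, R):
    """Return (base sequence number, R-bit mask); bit j is set if
    packet repair_seq + j is missing and needs retransmission."""
    mask = 0
    for j in range(R):
        if not iob.has(repair_seq + j):   # hypothetical presence test
            mask |= 1 << j
    return repair_seq, mask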
C. Adaptive PDM
While we rely on packet retransmission to recover from transient congestion, we have two channel capacity adjust-
ment mechanisms to handle longer-term bandwidth fluctua-
tions. The first mechanism allows a forwarding peer to adapt
the number of sub-streams it will forward given its current
available bandwidth, while the second allows the receiving
peer to switch provider at the sub-stream level.
Peers on the Zattoo network can redistribute a highly
variable number of sub-streams, reflecting the high variability
in uplink bandwidth of different access network technologies.
For a full-stream consisting of sixteen constant-bit-rate sub-
streams, our prior study shows that, based on realistic peer
characteristics measured from the Zattoo network, half of the
peers can support less than half of a stream, 82% of peers can
support less than a full-stream, and the remainder can support
up to ten full streams (peers that can redistribute more than a full stream are conventionally known as supernodes in the
literature) [5]. With variable-bit-rate streams, the bandwidth
carried by each sub-stream is also variable. To increase peer
bandwidth usage, without undue degradation of service, we
instituted measurement-based admission control at each peer.
In addition to controlling resource commitment, another goal
of the measurement-based admission control module is to continually estimate the amount of available uplink bandwidth
at a peer.
The amount of available uplink bandwidth at a peer is
initially estimated by the peer sending a pair of probe packets
to Zattoo's Bandwidth Estimation Server. Once a peer starts
forwarding sub-streams to other peers, it will receive from
those peers quality-of-service feedback that informs its updates of the available uplink bandwidth estimate. A peer sends quality-
of-service feedback only if the quality of a sub-stream drops
below a certain threshold.2 Upon receiving quality feedback
from multiple peers, a peer first determines if the identified
sub-streams are arriving in low quality. If so, the low quality of
service may not be caused by a limit on its own available uplink bandwidth, in which case it ignores the low-quality feedback.
Otherwise, the peer decrements its estimate of available uplink
bandwidth. If the new estimate is below the bandwidth needed to support the existing number of virtual circuits, the peer closes
a virtual circuit. To reduce the instability introduced into the network, a peer first closes the virtual circuit carrying the smallest number of sub-streams.

2 Depending on a peer's NAT and/or firewall configuration, Zattoo uses either UDP or TCP as the underlying transport protocol. The quality of a sub-stream is measured differently for UDP and TCP. A packet is considered lost under UDP if it doesn't arrive within a fixed threshold. The quality measure for UDP is computed as a function of both the packet loss rate and the burst error rate (the number of contiguous packet losses). The quality measure for TCP is defined to be how far behind a peer is, relative to other peers, in serving its sub-streams.
A peer periodically attempts to increase its available uplink bandwidth estimate if it has fully utilized its current estimate of available uplink bandwidth without triggering any bad
quality feedback from neighboring peers. A peer doubles the
estimated available uplink bandwidth if the current estimate is
below a threshold, switching to linear increase above the
threshold, similar to how TCP maintains its congestion win-
dow size. A peer also increases its estimate of available uplink
bandwidth if a neighbor peer departs the network without any
bad quality feedback.
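A minimal sketch of the increase rule (the threshold and step are deployment parameters not given in the text):

def raise_uplink_estimate(estimate, threshold, linear_step):
    # double while below the threshold (slow-start-like),
    # then grow linearly (congestion-avoidance-like)
    if estimate < threshold:
        return estimate * 2
    return estimate + linear_step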
When the repair pointer lags behind the input pointer by R packet slots, in addition to initiating a retransmission request, a peer also computes a loss rate over the R packets. If the loss rate is above a threshold, the peer considers the neighbor
slow and attempts to reconfigure its PDM. In reconfiguring
its PDM, a peer attempts to shift half of the sub-streams
currently forwarded by the slow neighbor to other existing
neighbors. At the same time, it searches for new peer(s) to forward these sub-streams. If new peer(s) are found, the load
will be shifted from existing neighbors to the new peer(s).
If sub-streams from the slow neighbor continue to suffer
after the reconfiguration of the PDM, the peer will drop the
neighbor completely and initiate another reconfiguration of the
PDM. When a peer loses a neighbor due to reduced available
uplink bandwidth at the neighbor or due to neighbor departure,
it also initiates a PDM reconfiguration. A peer may also
initiate a PDM reconfiguration to switch to a topologically
closer peer. Similar to the PDM establishment process, PDM
reconfiguration is accomplished by peers exchanging sub-
stream bitmasks in a request/response handshake, with each
bit of the bitmask representing a sub-stream. During and after a PDM reconfiguration, slow-neighbor detection is disabled for a short period of time to allow the system to stabilize.
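A sketch of the slow-neighbor rule (our own rendering: pdm maps each neighbor to the sub-streams it forwards, and reassign() stands in for the shift-and-search step described above):

def on_repair_gap(pdm, neighbor, lost, R, loss_threshold):
    """Called when the repair pointer lags the input pointer by R slots."""
    if lost / R <= loss_threshold:
        return                       # transient loss; retransmission suffices
    substreams = pdm[neighbor]
    half = len(substreams) // 2
    moved, pdm[neighbor] = substreams[:half], substreams[half:]
    reassign(moved)  # hypothetical: shift to existing neighbors, search for new peers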
III. GLOBAL BANDWIDTH SUBSIDY SYSTEM
Each peer on the Zattoo network is assumed to serve a
user through a media player, which means that each peer
must receive, and can potentially forward, all n sub-streams of the TV channel the user is watching. The limited redistri-
bution capacity of peers on the Zattoo network means that
a typical client can contribute only a fraction of the sub-
streams that make up a channel. This shortage of bandwidth
leads to a global bandwidth deficit in the peer-to-peer net-
work. Whereas BitTorrent-like delay-tolerant file downloads or the delay-sensitive progressive download of video-on-demand
applications can mitigate such global bandwidth shortage by
increasing download time, a live streaming system such as
Zattoo's must subsidize the bandwidth shortfall to provide a real-time delivery guarantee.
Zattoo's Global Bandwidth Subsidy System (or simply, the
Subsidy System), consists of a global bandwidth monitoring
subsystem, a global bandwidth forecasting and provisioning
subsystem, and a pool of Repeater nodes. The monitoring
subsystem continuously monitors the global bandwidth re-
quirement of a channel. The forecasting and provisioning
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
7/31/2019 Live Streaming With
6/14
6
subsystem projects the global bandwidth requirement based on
measured history and allocates Repeater nodes to the chan-
nel as needed. The monitoring and provisioning of global
bandwidth is complicated by two highly varying parameters
over time, client population size and peak streaming rate, and
one varying parameter over space, available uplink bandwidth,
which is network-service provider dependent. Forecasting of
bandwidth requirement is a vast subject in itself. Zattoo adopted a very simple mechanism, described in Section III-B, which has performed adequately in provisioning the network for both daily demand fluctuations and flash crowd scenarios
(see Section IV-C).
When a bandwidth shortage is projected for a channel, the
Subsidy System assigns one or more Repeater nodes to the
channel. Repeater nodes function as bandwidth multipliers, to
amplify the amount of available bandwidth in the network.
Each Repeater node serves at most one channel at a time;
it joins and leaves a given channel at the behest of the
Subsidy System. Repeater nodes receive and serve all n sub-streams of the channel they join, run the same PDM protocol, and are treated by actual peers like any other peers on the network; however, as bandwidth amplifiers, they are
usually provisioned to contribute more uplink bandwidth than
the download bandwidth they consume. The use of Repeater
nodes makes the Zattoo network a hybrid P2P and content
distribution network.
We next describe the bandwidth monitoring subsystem of
the Subsidy System, followed by the design of the simple band-
width projection and Repeater node assignment subsystem.
A. Global Bandwidth Measurement
The capacity metric of a channel is the tuple (D, C), where D is the aggregate download rate required by all users on the channel, and C is the aggregate upload capacity of those users. Usually C < D, and the difference between the two is the global bandwidth deficit of the channel. Since
channel viewership changes over time as users join or leave
the channel, we need a scalable means to measure and update
the capacity metric. We rely on aggregating the capacity
metric up the peer division multiplexing tree. Each peer in the
overlay periodically aggregates the capacity metric reported
by all its downstream receiver peers, adds its own capacity
measure (D, C) to the aggregate, and forwards the resulting capacity metric upstream to its forwarding peers. By the time the capacity metric percolates up to the Encoding Server, it contains the total download and upload rate aggregates of the whole streaming overlay. The Encoding Server then simply forwards the obtained (D, C) to the Subsidy Server.
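A sketch of the per-peer aggregation step (field names are ours): each peer folds the most recent reports of its downstream receivers into its own measure before reporting upstream.

def aggregate_capacity(peer):
    """Return the (D, C) aggregate for this peer's subtree."""
    D = peer.download_rate           # this peer's own required download rate
    C = peer.upload_capacity         # this peer's own upload capacity
    for receiver in peer.receivers:
        d, c = receiver.last_report  # latest (D, C) reported by the downstream peer
        D += d
        C += c
    return D, C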
B. Global Bandwidth Projection and Provisioning
For each channel, the Subsidy Server keeps a history of the capacity metric (D, C) reports received from the channel's Encoding Server. The channel utilization ratio U is the ratio of D over C. Based on recent movements of the ratio U, we classify the capacity trend of each channel into the following four categories.

• Stable: the ratio U has remained within [S−, S+] for the past Ts reporting periods.
• Exploding: the ratio U increased by at least E within the last Te (e.g., Te = 2) reporting periods.
• Increasing: the ratio U has steadily increased by I (I ≤ E) over the past Ti reporting periods.
• Falling: the ratio U has decreased by F over the past Tf reporting periods.
Orthogonal to the capacity-trend-based classification above, each channel is further categorized in terms of its capacity utilization ratio as follows.

• Under-utilized: the ratio U is below a low threshold L, e.g., U ≤ 0.5.
• Near Capacity: the ratio U is close to 1.0, e.g., U ≥ 0.9.
If the capacity trend of a channel has been "Exploding," one or more Repeater nodes will be assigned to it immediately. If the channel's capacity trend has been "Increasing," it will be assigned a Repeater node with a smaller capacity. The goal of the Subsidy System is to keep a channel below "Near Capacity." If a channel's capacity trend is "Stable" and the channel is "Under-utilized," the Subsidy System attempts to free Repeater nodes (if any) assigned to the channel. If a channel's capacity utilization is "Falling," the Subsidy System waits for the utilization to stabilize before reassigning any Repeater nodes.
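A hedged sketch of the classifier (the thresholds S−, S+, E, I, F and window lengths Ts, Te, Ti, Tf follow the symbols above; the precedence among the rules is our assumption):

def capacity_trend(U, S_lo, S_hi, Ts, E, Te, I, Ti, F, Tf):
    """U: utilization (D/C) history, newest last; assumes len(U) exceeds
    the largest window so that the negative indexing below is valid."""
    if U[-1] - U[-1 - Te] >= E:
        return "Exploding"
    window = U[-1 - Ti:]
    if U[-1] - window[0] >= I and all(a <= b for a, b in zip(window, window[1:])):
        return "Increasing"
    if U[-1 - Tf] - U[-1] >= F:
        return "Falling"
    if all(S_lo <= u <= S_hi for u in U[-Ts:]):
        return "Stable"
    return "Unclassified"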
Each Repeater node periodically sends a keep-alive message
to the Subsidy System. The keep-alive message tells the Subsidy System which channel the Repeater node is serving, plus
its CPU and capacity utilization ratio. This allows the Subsidy
System to monitor the health of Repeater nodes and to increase
the stability of the overlay during Repeater reassignment.
When reassigning Repeater nodes from a channel, the Subsidy
System will start from the Repeater node with the lowest
utilization ratio. It will notify the selected Repeater node to
stop accepting new peers and then to leave the channel after
a specified time.
In addition to Repeater nodes, the Subsidy System may
recognize extra capacity from idle peers whose owners are
not actively watching a channel. However, our previous study
shows that a large number of idle peers are required to make
any discernible impact on the global bandwidth deficit of a
channel [5]. Our current Subsidy System therefore does not
solicit bandwidth contribution from idle peers.
IV. SERVER-SIDE MEASUREMENTS
In the Zattoo system, two separate centralized collector servers collect usage statistics and error reports, which we call the "stats server" and the "user-feedback server," respectively. The stats server periodically collects aggregated
player statistics from individual peers, from which full session
logs are constructed and entered into a session database.
The session database gives a complete picture of all past
and present sessions served by the Zattoo system. A given
database entry contains statistics about a particular session,
which includes join time, leave time, uplink bytes, download
bytes, and channel name associated with the session. We
study the sessions generated on three major TV channels from
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 19, NO. 1, FEB 2011
7/31/2019 Live Streaming With
7/14
7
three different countries (Germany, Spain, and Switzerland), from June 1st to June 30th, 2008. Throughout the paper, we label those channels from Germany, Spain, and Switzerland as ARD, Cuatro, and SF2, respectively. Euro 2008 games were held during this period, and those three channels broadcast a majority of the Euro 2008 games, including the final match. See Table I for information about the collected session data sets.

TABLE I
SESSION DATABASE (6/1/2008 - 6/30/2008).

Channel   # sessions   # distinct users
ARD       2,102,638    298,601
Cuatro    1,845,843    268,522
SF2       1,425,285    157,639

TABLE II
FEEDBACK LOGS (6/20/2008 - 6/29/2008).

Channel   # feedback logs   # sessions
ARD       871               1,253
Cuatro    2,922             4,568
SF2       656               1,140
The user-feedback server, on the other hand, collects users' error logs submitted asynchronously by users. The user feedback data here is different from the peers' quality feedback used in PDM reconfiguration described in Section II-C. The Zattoo player maintains an encrypted log file which contains, for debugging purposes, the detailed behavior of the client-side P2P engine, as well as a history of all the streaming sessions initiated by a user since player startup. When users encounter any error while using the player, such as a log-in error, join failure, or bad quality streaming, they can choose to report the error by clicking a "Submit Feedback" button on the player, which causes the Zattoo player to send the generated log file to the user-feedback server. Since a given feedback
log not only reports on a particular error, but also describes
normal sessions generated prior to the occurrence of the
error, we can study users' viewing experience (e.g., channel
join delay) from the feedback logs. Table II describes the
feedback logs collected from June 20th to June 29th. A given
feedback log can contain multiple sessions (for the same or different channels), depending on the user's viewing behavior.
The second column in the table represents the number of
feedback logs which contain at least one session generated
on the channel listed in the corresponding entry in the first
column. The numbers in the third column indicate how many
distinct sessions generated on said channel are present in the feedback logs.
A. Overlay Size and Sharing Ratio
We first study how many concurrent users are supported by
the Zattoo system, and how much bandwidth is contributed by
them. For this purpose, we use the session database presented
in Table I. By using the join/leave timestamps of the collected
sessions, we calculate the number of concurrent users on a given channel at time i. Then we calculate the average sharing ratio of the given channel at the same time.
TABLE III
AVERAGE SHARING RATIO.

           Average sharing ratio
Channel    Off-peak   Peak
ARD        0.335      0.313
Cuatro     0.242      0.224
SF2        0.277      0.222
The average sharing ratio is defined as the total users' uplink rate divided
by their download rate on the channel. A sharing ratio of
one means users contribute to other peers in the network as
much traffic as they download at the time. We calculate the
average sharing ratio from the total download/uplink bytes
of the collected sessions. We first obtain all the sessions which are active across time i; we call this set of sessions Si. Then, assuming the uplink/download bytes of each session are spread uniformly throughout the entire session duration, we approximate the average sharing ratio at time i as

    Σ_{s∈Si} uplink_bytes(s)/duration(s)
    ------------------------------------------
    Σ_{s∈Si} download_bytes(s)/duration(s)
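Equivalently, in Python (session field names are ours):

def average_sharing_ratio(sessions, t):
    """Approximate the sharing ratio at time t from session records,
    assuming bytes are spread uniformly over each session's duration."""
    active = [s for s in sessions if s.join_time <= t < s.leave_time]
    up = sum(s.uplink_bytes / s.duration for s in active)
    down = sum(s.download_bytes / s.duration for s in active)
    return up / down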
Fig. 5 shows the overlay size (i.e., the number of concurrent users) and the average sharing ratio superimposed across the month of June 2008. According to the figure, the overlay
size grew to more than 20,000 (e.g., 20,059 on ARD on 6/18
and 22,152 on Cuatro on 6/9). As opposed to the overlay
size, the average sharing ratio tends to stay flatter throughout
the month. Occasional spikes in the sharing ratio all occurred
during 2AM to 6AM (GMT) when the channel usage is very
low, and therefore may be considered statistically insignificant.
By segmenting a 24-hour day into two time periods, e.g.,
off-peak hours (0AM-6PM) and peak hours (6PM-0AM),
Table III shows the average sharing ratio in the two time
periods separately. Zattoo's usage during peak hours typically accounts for about 50% to 70% of the total usage of the day. According to the table, the average sharing ratio during peak hours is slightly lower than, but not very different from, that during off-peak hours. The Cuatro channel in Spain exhibits a
bandwidth test site [6] reports that average uplink bandwidth in
Spain is about 205 kbps, which is much lower than in Germany
(582 kbps) and Switzerland (787 kbps). The lower sharing
ratio on the Spanish channel may reflect regional difference
in residential access network provisioning. The balance of the
required bandwidth is provided by Zattoo's Encoding Server and Repeater nodes.
B. Channel Switching Delay
When a user clicks on a new channel button, it takes some time (the "channel switching delay") for the user to be able to start watching the streamed video on the Zattoo player. The channel switching delay has two components. First, the Zattoo player needs to contact other available peers and retrieve all required sub-streams from them. We call the delay incurred during this stage the "join delay." Once all necessary sub-streams have been negotiated successfully, the player then needs to wait and buffer a minimum amount of stream data (e.g., 3 seconds' worth) before starting to show the video to the user.
Fig. 5. Overlay size and sharing ratio, June 2008: (a) ARD; (b) Cuatro; (c) SF2.
Fig. 6. CDF of user arrival time, feedback logs vs. session database: (a) ARD; (b) Cuatro; (c) SF2.
We call the resulting wait time the "buffering delay." The total channel switching delay experienced by users is thus the sum of the join delay and the buffering delay. PPLive reports channel switching delays of around 20 to 30 seconds, which can be as high as 2 minutes, of which join delay typically accounts for 10 to 15 seconds [7].
We measure the join delay experienced by Zattoo users from
the feedback logs described in Table II. Debugging information contained in the feedback logs tells us when a user clicked on a particular channel and when the player successfully joined the P2P overlay and started to buffer content. One
concern in relying on user-submitted feedback logs to infer
join delay is the potential sampling bias associated with them.
Users typically submit feedback logs when they encounter
some kind of errors, and that brings up the question of whether
the captured sessions are representative samples to study.
We attempt to address this concern by comparing the data
from feedback logs against those from the session database.
The latter captures the complete picture of users' channel watching behavior, and therefore can serve as a reference. In our analysis, we compare the user arrival time distributions obtained from the two data sets. For a fair comparison, we used a subset of the session database which was generated during the same period when the feedback logs were collected (i.e., from June 20th to 29th).
Fig. 6 plots the CDF distribution of user arrivals per hour
obtained from feedback logs and session database separately.
The steep slope of the distributions during hour 18-20 (6-
8PM) indicates the high frequency of user arrivals during
those hours. On ARD and Cuatro, the user arrival distributions
inferred from feedback logs are almost identical to those from
session database. On the other hand, on SF2, the distribution
obtained from feedback logs tends to grow slowly during early
hours, which indicates that the feedback submission rate during off-peak hours on SF2 was relatively lower than normal.

TABLE IV
MEDIAN CHANNEL JOIN DELAY.

           Median join delay       Maximum overlay size
Channel    Off-peak   Peak         Off-peak   Peak
ARD        2.29 sec   1.96 sec     2,313      19,223
Cuatro     3.67 sec   4.48 sec     2,357      8,073
SF2        2.49 sec   2.67 sec     1,126      11,360

Later
during peak hours, however, feedback submission rate picks
up as expected, closely matching the actual user arrival rate.
Based on this observation, we argue that feedback logs can
serve as representative samples of daily user activities.
Fig. 7 shows the CDF distributions of channel join delay
for ARD, Cuatro and SF2 channels. We show the distributions
for off-peak hours (0AM-6PM) and peak hours (6PM-0AM)
separately. Median channel join delay is also presented in a
similar fashion in Table IV. According to the CDF distribu-
tions, 80% of users experience less than 4 to 8 seconds of
join delay, and 50% of users even less than 2 seconds of join
delay. Also, Table IV shows that even a 10-fold increase in the number of concurrent users during peak hours does not
unduly lengthen the channel join delay (up to 22% increase
in median join delay).
C. Repeater Node Assignment
As illustrated in Fig. 5, the live coverage of Euro Cup games
brought huge flash crowds to the Zattoo system. With typical users contributing only about 25-30% of the average streaming bitrate, Zattoo must subsidize the balance. As described in Section III, Zattoo's Subsidy System assigns Repeater
Fig. 7. CDF of channel join delay, off-peak vs. peak hours: (a) ARD; (b) Cuatro; (c) SF2.
Fig. 8. Overlay size and channel provisioning (number of Repeater nodes assigned): (a) ARD; (b) Cuatro; (c) SF2.
nodes to channels that require more bandwidth than their own globally available aggregate upload bandwidth.
Fig. 8 shows the Subsidy System assigning more Repeater
nodes to a channel as flash crowds arrived during each Euro
Cup game and then gradually reclaiming them as the flash
crowd departed. For each of the channels reported, we choose a particular date with the biggest flash crowd on the channel.
The dip in overlay sizes on ARD and Cuatro channels occurred
during the half-time break of the games. The Subsidy Server
was less aggressive in assigning Repeater nodes to the ARD
channel, as compared to the other two channels, because the
Repeater nodes in the vicinity of the ARD server have higher
capacity than those near the other two.
V. CLIENT-SIDE MEASUREMENTS
To further study the P2P overlay beyond details obtain-
able from aggregated session-level statistics, we run several
modified Zattoo clients which periodically retrieve the internal
states of other participating peers in the network by exchanging SEARCH/JOIN messages with them. After a given probe session is over, the monitoring client archives a log file from which we can analyze the control/data traffic exchanged and detailed protocol behavior. We run the experiment during Zattoo's live coverage of Euro 2008 (June 7th to 29th). The monitoring clients tuned to game channels from one of Zattoo's data centers located in Switzerland while the games were broadcast
live. The data sets presented in this paper were collected
during the coverage of the championship final on two separate
channels: ARD in Germany and Cuatro in Spain. Soccer teams
from Germany and Spain participated in the championship
final.
As described in Section II-A, Zattoo's peer discovery is guided by peers' topology information. To minimize potential sampling bias caused by our use of a single vantage point for monitoring, we assigned an empty AS number and country code to our monitoring clients, so that their probing is not
geared towards those peers located in the same AS and
country.
A. Sub-Stream Synchrony
To ensure good viewing quality, a peer should not only obtain all necessary sub-streams (discounting redundant sub-streams), but also have those sub-streams delivered temporally synchronized with each other for proper online decoding. Receiving out-of-sync sub-streams typically results in a pixelated screen on the player. As described in Sections II-A and II-C, Zattoo's protocol favors sub-streams that are relatively in-sync when constructing the PDM, and continually monitors the sub-streams' progression over time, replacing those sub-streams that have fallen behind and reconfiguring the PDM when necessary. In this section we measure the effectiveness of Zattoo's adaptive PDM in selecting sub-streams that are largely in-sync.
To quantify such inter-sub-stream synchrony, we measure
the difference in the latest (i.e., maximum) packet sequence
numbers belonging to different incoming sub-streams. When a
remote peer responds to a SEARCH query message, it includes
in its SEARCH reply the latest sequence numbers that it has
received for all sub-streams. If some sub-streams happen to be
lossy or stalled at that time, the peer marks such sub-streams
in its SEARCH replies. Thus, we can inspect SEARCH replies
from existing peers to study their inter-sub-stream synchrony.
Fig. 9. Sub-stream synchrony, Euro 2008 final on ARD and Cuatro: (a) CDF of the number of bad sub-streams; (b) CDF of sub-stream synchrony (number of packets).
In our experiment, we collected SEARCH replies from
4,420 and 6,530 distinct peers from ARD and Cuatro channels
respectively, during the 2-hour coverage of the final game.
From the collected SEARCH replies, we check how many sub-streams (out of 16) are bad (e.g., lossy or missing) for each peer. Fig. 9(a) shows the CDF distribution of the number of bad sub-streams. According to the figure, about 99% (ARD) and 96% (Cuatro) of peers have 3 or fewer bad sub-streams. The current Zattoo deployment dedicates n − k = 3 sub-streams (out of n = 16) for loss recovery purposes. That is, given a segment of 16 consecutive packets, if a peer has received at least 13 packets, it can reconstruct the remaining 3 packets from the RS error correcting code (see Section II). Thus if a peer can receive any 13 sub-streams out of 16 reliably, it can decode the full stream properly. The result in Fig. 9(a) suggests that the number of bad sub-streams is low enough so as not to cause quality issues in the Zattoo network.
After discounting bad sub-streams, we then look at the
synchrony of the remaining good sub-streams in each peer.
Fig. 9(b) shows the CDF distribution of the sub-stream syn-
chrony in the two channels. Sub-stream synchrony of a given
peer is defined as the difference between maximum and mini-
mum packet sequence numbers among all sub-streams, which
is obtained from the peer's SEARCH reply. For example, if some peer has a sub-stream synchrony of 100, it means that the peer has one sub-stream that is ahead of another sub-stream by 100 packets. If all the packets are received in order, the sub-stream synchrony of a peer measures at most n − 1. If we received multiple SEARCH replies from the same peer,
we average the sub-stream synchrony across all the replies.
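In code form, the per-reply metric is simply (our rendering):

def substream_synchrony(latest_seq, bad):
    """latest_seq: sub-stream index -> latest sequence number in a SEARCH
    reply; bad: indices flagged lossy/stalled, which are discounted."""
    good = [seq for i, seq in latest_seq.items() if i not in bad]
    return max(good) - min(good)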
Given the 500 kbps average channel data rate, 60 consecutive packets roughly correspond to 1 second's worth of streaming.
Thus, the figure shows that on the Cuatro channel, 20% of peers
have their sub-streams completely in-sync, while more than
90% have their sub-streams lagging each other by at most
5 seconds; on the ARD channel, 30% are in-sync, and more than
90% are within 1.5 seconds. The buffer space of the Zattoo player has been dimensioned sufficiently to accommodate such a low degree of out-of-sync sub-streams.
Fig. 10. CDF of relative peer synchrony (number of packets), Euro 2008 final on ARD and Cuatro.
B. Peer Synchrony
While sub-stream synchrony tells us the stream quality different peers may experience, peer synchrony tells us how varied in time peers' viewing points are. With small-scale P2P networks,
all participating peers are likely to watch live streaming
roughly synchronized in time. However, as the size of the
P2P overlay grows, the viewing point of edge nodes may be
delayed significantly compared to those closer to the Encoding
Server. In the experiment, we define the viewing point of a
peer as the median of the latest sequence numbers across
its sub-streams. Then we choose one peer (e.g., a Repeater
node directly connected to the Encoding Server) as a reference
point, and compare other peers' viewing points against the reference viewing point.
Fig. 10 shows the CDFs of relative peer synchrony. The relative peer synchrony of peer X is obtained by subtracting the reference viewing point from the viewing point of X. A peer synchrony of −60 thus means that a given peer's viewing point is delayed by 60 packets (roughly 1 second for a 500 kbps stream) compared to the reference viewing point. A positive value means that a given peer's stream is ahead of the reference point, which can happen for peers that
receive streams directly from the Encoding Server. The figure
shows that about 1% of peers on ARD and 4% of peers on
Cuatro experienced more than 3 seconds (i.e., 180 packets)
delay in streaming compared to the reference viewing point.
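The two viewing-point metrics of this subsection, in sketch form (our naming):

from statistics import median

def viewing_point(latest_seq):
    # a peer's viewing point: the median of the latest sequence
    # numbers across its sub-streams
    return median(latest_seq.values())

def relative_peer_synchrony(peer_seq, reference_seq):
    # negative values mean the peer lags the reference viewing point
    return viewing_point(peer_seq) - viewing_point(reference_seq)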
Fig. 11. Peer synchrony vs. peer hops from the Encoding Server (2 to 6 hops): (a) ARD; (b) Cuatro.
To better understand to what extent a peer's position in the overlay affects peer synchrony, we plot in Fig. 11 the CDFs of peer synchrony at different depths, i.e., distances from the Encoding Server. We look at how much delay is introduced for sub-streams traversing i peers from the Encoding Server (i.e., depth i). For this purpose, we associate per-sub-stream viewing point information from a SEARCH reply with per-sub-stream overlay depth information from a JOIN reply, where the SEARCH/JOIN replies were sent from the same peer, close in time (e.g., less than 1 second apart). If the length of the peer-hop
from the Encoding Server has an adverse effect on playback
delay and viewing point, we expect peers further away from
the Encoding Server to be further offset from the reference
viewing point, resulting in CDFs that are shifted to the upper
left corner for peers at further distances from the Encoding
Server. The figure shows an absence of such a leftward shift
in the CDFs. The median delay at depth 6 does not grow
by more than 0.5 seconds compared to the median delay at
depth 2. The figure shows that having more peers further away
from the Encoding Server increases the number of peers that
are 0.5 seconds behind the reference viewing point, without
increasing the offset in viewing point itself. On the Zattoo network, once the virtual circuits comprising a PDM have been set up, each packet is streamed with minimal delay through each peer. Hence each peer-hop from the Encoding Server introduces delay only in the tens-of-milliseconds range, attesting to the suitability of the network architecture to carry
live media streaming on large-scale P2P networks.
C. Effectiveness of ECC in Isolating Loss
We now investigate the effect of overlay size on the performance scalability of the Zattoo system, focusing on client-side quality (e.g., loss rate). As described in Section II, Zattoo-broadcast media streams are RS encoded, which allows peers to reconstruct a full stream once they obtain at least k of n sub-streams. Since ECC-encoded stream reconstruction occurs hop by hop in the overlay, it can mask sporadic packet losses and thus prevent packet losses from being propagated throughout the overlay at large.
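The hop-by-hop loss masking follows directly from the k-of-n property. The following minimal sketch tracks only when a segment becomes decodable; the actual Reed-Solomon decode, and Zattoo's real (k, n) parameters, are outside its scope.

```python
# Sketch: hop-by-hop loss masking with a k-of-n erasure code. A segment
# becomes recoverable once any k of its n sub-stream packets arrive; the
# Reed-Solomon decode itself is not shown.
class Segment:
    def __init__(self, k: int, n: int):
        self.k, self.n = k, n
        self.shares = {}  # sub-stream index -> packet payload

    def add(self, substream: int, payload: bytes) -> bool:
        """Record a received sub-stream packet; True once decodable."""
        if 0 <= substream < self.n:
            self.shares[substream] = payload
        return len(self.shares) >= self.k

# Example: with a (k=3, n=4) code, losing any single sub-stream upstream
# is masked locally and never propagates downstream.
seg = Segment(k=3, n=4)
for i in (0, 2, 3):  # sub-stream 1 lost
    done = seg.add(i, b"...")
assert done
```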
To see whether such packet loss containment actually occurs in the production system, we ran the following experiment. We let our monitoring client join a game channel, stay tuned for 15 seconds, and then leave, waiting 10 seconds after each 15-second session before repeating. We repeated this process throughout the 2-hour game coverage and collected the logs. The log file from each session contains a complete list of packet sequence numbers received from the connected peers during the session, from which we can detect upstream packet losses. We then associate each packet loss with the peer-path from the Encoding Server. This gives us a rough sense of whether packets traversing more hops experience a higher loss rate. In our analysis, we discount packets delivered during the first 3 seconds of a session to allow the PDM to stabilize.
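The loss detection described here reduces to gap-counting over sequence numbers after the warm-up period; a sketch, with log fields assumed:

```python
# Sketch (assumed log format): infer upstream losses from gaps in the
# packet sequence numbers recorded for one connected peer, skipping the
# first 3 seconds of the session to let the PDM stabilize.
def upstream_losses(seq_numbers, pkt_times, start_time, warmup=3.0):
    """seq_numbers: sorted sequence numbers received from one peer;
    pkt_times: arrival time of each packet, parallel to seq_numbers."""
    kept = [s for s, t in zip(seq_numbers, pkt_times)
            if t - start_time >= warmup]
    if len(kept) < 2:
        return 0, 0
    expected = kept[-1] - kept[0] + 1
    lost = expected - len(kept)
    return lost, expected  # loss rate = lost / expected
```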
Fig. 12 shows how the average packet loss rate changes across different overlay depths (in all cases < 1%). On both channels, the packet loss rate does not grow with overlay depth. This result confirms our expectation that ECC helps localize packet losses on the overlay.
VI. PEER-TO-PEER SHARING RATIO
The premise of a P2P system's success in providing scalable stream distribution is sufficient bandwidth sharing by participating users [5]. Section IV-A shows that the average sharing ratio of Zattoo users ranges from 0.2 to 0.35, which, for a 500-kbps stream, translates into uplink contributions ranging from 100 kbps to 175 kbps. This is far lower than the numbers reported as typical uplink bandwidth in countries where Zattoo is available [6]. Aside from the possibility that users' bandwidth may be shared with other applications, we find factors such as user behavior, support for variable-bit rate encoding, and heterogeneous NAT environments contributing to sub-optimal sharing performance. Designers of P2P streaming systems must pay attention to these factors to achieve good bandwidth sharing. One must also keep in mind, however, that improving bandwidth sharing should not come at the expense of user viewing experience, e.g., due to more frequent uplink bandwidth saturation.
[Figure: average packet loss rate (%) vs. peer hops from the Encoding Server (0-7), for (a) ARD and (b) Cuatro.]
Fig. 12. Packet loss rate vs. peer hops from Encoding Server.
[Figure: average receiver-peer churns vs. session length (0-60 min).]
Fig. 13. Receiver-peer churn frequency.
User churns: It is known that frequent user churn accompanied by short sessions, as well as flash-crowd behavior, poses a unique challenge to live streaming systems [7], [8]. User churn can occur in both upstream (e.g., forwarding peers) and downstream (e.g., receiver peers) connectivity on the overlay. While frequent churn of forwarding peers can cause quality problems, frequent changes of receiver peers can lead to under-utilization of a forwarding peer's uplink capacity. To estimate receiver peers' churn behavior, we visualize in Fig. 13 the relationship between session length and the frequency of receiver-peer churns experienced during the sessions. We studied receiver-peer churn behavior for the Cuatro channel using Zattoo's feedback logs described in Table II. The y-value at session length x minutes denotes the average number of receiver-peer churns experienced by sessions of length [x, x + 1). According to the figure, the receiver-peer churn frequency tends to grow linearly with session length, with peers experiencing approximately one receiver-peer churn every minute.
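The per-bucket averaging behind Fig. 13 might look as follows; the field names are assumed, not taken from Zattoo's feedback-log schema.

```python
# Sketch: average receiver-peer churns per session-length bucket
# [x, x + 1) minutes, as plotted in Fig. 13.
from collections import defaultdict

def churns_by_session_length(sessions):
    """sessions: iterable of (session_length_min, num_receiver_churns)."""
    buckets = defaultdict(list)
    for length, churns in sessions:
        buckets[int(length)].append(churns)
    return {x: sum(v) / len(v) for x, v in sorted(buckets.items())}
```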
Variable-bit rate streams: Variable-bit rate (VBR) encoding is commonly used in production streaming systems, including Zattoo, due to its better quality-to-bandwidth ratio compared to constant-bit rate encoding. However, VBR streams may put additional strain on peers' bandwidth contribution and quality optimization. As described in Section II-C, the presence of VBR streams requires peers to perform measurement-based admission control when allocating resources to set up a virtual circuit. To avoid degradation of service due to overloading of the uplink bandwidth, the measurement-based admission control module must be conservative both in its reaction to increases in available bandwidth and in its allocation of available bandwidth to newly joining peers. This conservativeness necessarily leads to under-utilization of resources.
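One plausible shape for such a conservative admission controller is sketched below; the smoothing constants and headroom margin are illustrative assumptions, not Zattoo's actual parameters.

```python
# Sketch of conservative measurement-based admission control for VBR
# sub-streams; constants are illustrative, not Zattoo's.
class UplinkAdmission:
    def __init__(self, max_cp_kbps: float):
        self.max_cp = max_cp_kbps
        self.est_used = 0.0  # smoothed measured uplink usage

    def on_measurement(self, used_kbps: float):
        # React slowly to apparent new capacity but quickly to load
        # increases, to avoid saturating the uplink.
        alpha = 0.05 if used_kbps < self.est_used else 0.5
        self.est_used += alpha * (used_kbps - self.est_used)

    def admit(self, substream_peak_kbps: float, margin: float = 0.8) -> bool:
        # Admit a new virtual circuit only if its *peak* (VBR) rate fits
        # within a conservative fraction of the remaining headroom.
        return substream_peak_kbps <= margin * (self.max_cp - self.est_used)
```

This captures why under-utilization is unavoidable: sizing against the VBR peak rate and reacting slowly to freed capacity both leave headroom idle most of the time.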
NAT reachability: Asymmetric reachability imposed by prevalent NAT boxes adds to the difficulty of achieving full utilization of users' uplink capacity. The Zattoo delivery system supports 6 different NAT configurations: open host, full cone, IP-restricted, port-restricted, symmetric, and UDP-disabled, listed in increasing degree of restrictiveness. Not every pairwise communication can occur among different NAT types. For example, peers behind a port-restricted NAT box cannot communicate with those of symmetric NAT type (we documented a NAT reachability matrix in an earlier work [5]).
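The pairwise constraint can be expressed as a small reachability table. Only the port-restricted/symmetric failure and the restrictiveness ordering are stated in the text; the remaining entries below follow common STUN-style traversal rules and are assumptions, not the measured matrix of [5].

```python
# Illustrative pairwise reachability over the six NAT types listed above.
TYPES = ["open", "full_cone", "ip_restricted", "port_restricted",
         "symmetric", "udp_disabled"]

UNREACHABLE = {
    ("port_restricted", "symmetric"),  # stated in the text
    ("symmetric", "symmetric"),        # assumed (both map per-destination)
} | {("udp_disabled", t) for t in TYPES}  # no UDP => no peer exchange (assumed)

def can_communicate(a: str, b: str) -> bool:
    return (a, b) not in UNREACHABLE and (b, a) not in UNREACHABLE

assert not can_communicate("symmetric", "port_restricted")
assert can_communicate("open", "full_cone")
```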
To examine how varied NAT reachability affects sharing performance, we performed the following two controlled experiments. In both experiments, we ran four Zattoo clients, each with a distinct NAT type, tuned to the same channel concurrently for 30 minutes. In the first experiment, we fixed the maximum allowable uplink capacity (denoted max cp) of the clients to the same constant value.3 In the second experiment, we let the max cp of the clients self-adjust over time (the default setting of the Zattoo player). In both experiments, we monitored how the uplink bandwidth utilization ratio (i.e., the ratio between actively used uplink bandwidth and the maximum allowable bandwidth max cp) changes over time. The experiments were repeated three times for reliability.
Fig. 14 shows the capacity utilization ratio from these two experiments for the four different NAT types. Fig. 14(a) visualizes how fast the capacity utilization ratio converges to 1.0 over time after the client joins the network. Fig. 14(b) plots the average utilization ratio, i.e., [∫t current capacity(t) dt] / [∫t max cp(t) dt]. In both the constant and adjustable cases, the
3 The experiments were performed from one of Zattoo's data centers in Europe, and there was sufficient uplink bandwidth available to support 4 × max cp.
results clearly exemplify the varied sharing performance across different NAT types. In particular, the symmetric NAT type, the most restrictive among the four, shows an inferior sharing ratio compared to the rest. Unfortunately, the symmetric NAT type is the second most popular NAT type among the Zattoo population [5], and can therefore seriously affect Zattoo's overall sharing performance. A relatively small number of peers run behind IP-restricted NATs, hence that type's sharing performance has not been fine-tuned at Zattoo.
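The average utilization ratio of Fig. 14(b) is just the ratio of two integrals, which in practice reduces to discrete sums over measurement intervals; a sketch with an assumed sample format:

```python
# Sketch: the average capacity utilization ratio of Fig. 14(b) computed
# as discrete integrals over per-interval samples.
def average_utilization(samples):
    """samples: [(dt_seconds, used_kbps, max_cp_kbps)] over the session."""
    used = sum(dt * u for dt, u, _ in samples)  # ~ integral of capacity(t)
    cap = sum(dt * m for dt, _, m in samples)   # ~ integral of max_cp(t)
    return used / cap if cap else 0.0
```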
Reliability of NAT detection: To allow communication among clients behind NAT gateways (i.e., NATed clients), each Zattoo client performs a NAT detection procedure upon player startup to identify its NAT configuration and advertises it to the rest of the Zattoo network. Zattoo's NAT detection procedure implements the standard UDP-based STUN protocol, which involves communicating with external STUN servers to discover the presence and type of a NAT gateway [9]. The communication occurs over UDP with no reliable-transport guarantee. Lost or delayed STUN UDP packets may lead to inaccurate NAT detection, preventing NATed clients from contacting each other and thereby adversely affecting their sharing performance.
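For reference, a single RFC 3489 Binding round trip looks roughly like the following sketch; the server address is a placeholder, and production code would add the retransmission schedule the RFC specifies precisely because of the loss mode discussed above.

```python
# Sketch of one RFC 3489 STUN Binding round trip over UDP. A lost or late
# datagram here is exactly the failure mode discussed above.
import os, socket, struct

def stun_mapped_address(server=("stun.example.net", 3478), timeout=2.0):
    txid = os.urandom(16)
    req = struct.pack("!HH16s", 0x0001, 0, txid)  # Binding Request, no attrs
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(req, server)
        data, _ = sock.recvfrom(2048)
    except socket.timeout:
        return None  # unreliable UDP: caller should retry, then give up
    finally:
        sock.close()
    if len(data) < 20:
        return None
    mtype, mlen, rtxid = struct.unpack("!HH16s", data[:20])
    if mtype != 0x0101 or rtxid != txid:  # Binding Response for our txid?
        return None
    off = 20
    while off + 4 <= 20 + mlen:
        atype, alen = struct.unpack("!HH", data[off:off + 4])
        if atype == 0x0001 and alen >= 8:  # MAPPED-ADDRESS (IPv4)
            _, fam, port = struct.unpack("!BBH", data[off + 4:off + 8])
            ip = socket.inet_ntoa(data[off + 8:off + 12])
            return ip, port
        off += 4 + alen
    return None
```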
To understand the reliability and accuracy of the STUN-based NAT detection procedure, we utilize Zattoo's session database (Table I), which stores, among other things, the NAT information, public IP address, and private IP address of each session. We assume that all sessions associated with the same public/private IP address pair are generated under the same NAT configuration. If sessions with the same public/private IP address pair report inconsistent NAT detection results, we consider those sessions to have experienced failed NAT detection, and apply the majority rule to determine the correct result for that configuration.
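The majority-rule consistency check can be sketched as follows, with the database fields assumed:

```python
# Sketch: group sessions by (public IP, private IP) and count detections
# that disagree with the majority result for that pair.
from collections import Counter, defaultdict

def nat_detection_failure_rate(sessions):
    """sessions: iterable of (public_ip, private_ip, detected_nat_type)."""
    by_pair = defaultdict(list)
    for pub, priv, nat in sessions:
        by_pair[(pub, priv)].append(nat)
    failed = total = 0
    for results in by_pair.values():
        majority, _ = Counter(results).most_common(1)[0]
        failed += sum(1 for r in results if r != majority)
        total += len(results)
    return failed / total if total else 0.0
```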
Fig. 15 plots the daily trend of the NAT detection failure rate derived from ARD, Cuatro and SF2 during the month of June 2008. The NAT detection failure rate at hour x is the number of bogus NAT detection results divided by the total number of NAT detections occurring in that hour. The total number of NAT detections indicates how busy the Zattoo system was throughout the day. The NAT detection failure rate grows from around 1.5% at 2-3 AM to a peak of almost 6% at 8 PM. This means that at the busiest times, the NAT type of about 6% of clients is not determinable, leading to their not contributing any bandwidth to the P2P network.
VII. RELATED WORK
Aside from Zattoo, several commercial peer-to-peer systems intended for live streaming have been introduced since 2005, notably PPLive, PPStream, SopCast, TVAnts, and UUSee from China, and Joost, Livestation, Octoshape, and RawFlow from the EU. A large number of measurement studies have been done on one or another of these systems [7], [10]-[16]. Many research prototypes and improvements to existing P2P systems have also been proposed and evaluated [4], [17]-[24]. Our study is unique in that we are able to collect network-core data from a large production system with over 3 million registered users, with intimate knowledge of the underlying network architecture and protocol.
[Figure: NAT detection failure rate (%, left axis) and number of NAT detections (right axis) vs. hour of day (0-24).]
Fig. 15. NAT detection failure rate.
P2P systems are usually classified as either tree-based push or mesh-based swarming [25]. In tree-based push schemes, peers organize themselves into multiple distribution trees [26]-[28]. In mesh-based swarming, peers form a randomly connected mesh, and content is swarmed via dynamically constructed directed acyclic paths [21], [29], [30]. Zattoo is not only a hybrid between these two, similar to the latest version of Coolstreaming [4]; its dependence on Repeater nodes also makes it a hybrid of a P2P network and a content-distribution network (CDN), similar to PPLive [7].
VIII. CONCLUSION
We have presented a receiver-based, peer-division multiplexing engine to deliver live streaming content on a peer-to-peer network. The same engine can be used to transparently build a hybrid P2P/CDN delivery network by adding Repeater nodes to the network. By analyzing a large amount of usage data collected on the network during one of the largest viewing events in Europe, we have shown that the resulting network can scale to a large number of users and can take good advantage of the available uplink bandwidth at peers. We have also shown that error-correcting codes and packet retransmission can help improve network stability by isolating packet losses and preventing transient congestion from resulting in PDM reconfigurations. We have further shown that the PDM and adaptive PDM schemes presented have small enough overhead to make our system competitive with digital satellite TV in terms of channel switch time, stream synchronization, and signal lag.
ACKNOWLEDGMENT
The authors would like to thank Adam Goodman for developing the Authentication Server, Alvaro Saurin for the error-correcting code, Zhiheng Wang for an early version of the stream buffer management, Eric Wucherer for the Rendezvous Server, Zach Steindler for the Subsidy System, and the rest of the Zattoo development team for fruitful collaboration.
REFERENCES
[1] R. auf der Maur, "Die Weiterverbreitung von TV- und Radioprogrammen über IP-basierte Netze," in Entertainment Law (f. d. Schweiz), 1st ed. Stämpfli Verlag, 2006.
[2] UEFA, "Euro2008," http://www1.uefa.com/.
[Figure: (a) current capacity utilization ratio vs. time since joining (0-30 min) under constant maximum capacity, and (b) average capacity utilization ratio under adjustable maximum capacity, each for full-cone, IP-restricted, port-restricted, and symmetric NAT types.]
Fig. 14. Uplink capacity utilization ratio.
[3] S. Lin and D. J. Costello, Jr., Error Control Coding, 2nd ed. Pearson Prentice-Hall, 2004.
[4] S. Xie, B. Li, G. Y. Keung, and X. Zhang, "CoolStreaming: Design, Theory, and Practice," IEEE Trans. on Multimedia, vol. 9, no. 8, December 2007.
[5] K. Shami et al., "Impacts of Peer Characteristics on P2PTV Networks Scalability," in Proc. IEEE INFOCOM Mini Conference, April 2009.
[6] Bandwidth-test.net, "Bandwidth test statistics across different countries," http://www.bandwidth-test.net/stats/country/.
[7] X. Hei, C. Liang, J. Liang, Y. Liu, and K. W. Ross, "Insights into PPLive: A Measurement Study of a Large-Scale P2P IPTV System," in Proc. IPTV Workshop, International World Wide Web Conference, 2006.
[8] B. Li et al., "An Empirical Study of Flash Crowd Dynamics in a P2P-Based Live Video Streaming System," in Proc. IEEE Global Telecommunications Conference, 2008.
[9] J. Rosenberg et al., "STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs)," RFC 3489, 2003.
[10] A. Ali, A. Mathur, and H. Zhang, "Measurement of Commercial Peer-to-Peer Live Video Streaming," in Proc. Workshop on Recent Advances in Peer-to-Peer Streaming, August 2006.
[11] L. Vu et al., "Measurement and Modeling of a Large-scale Overlay for Multimedia Streaming," in Proc. Int'l Conf. on Heterogeneous Networking for Quality, Reliability, Security and Robustness, 2007.
[12] T. Silverston and O. Fourmaux, "Measuring P2P IPTV Systems," in Proc. ACM NOSSDAV, November 2008.
[13] C. Wu, B. Li, and S. Zhao, "Magellan: Charting Large-Scale Peer-to-Peer Live Streaming Topologies," in Proc. ICDCS'07, 2007, p. 62.
[14] M. Cha et al., "On Next-Generation Telco-Managed P2P TV Architectures," in Proc. Int'l Workshop on Peer-to-Peer Systems, 2008.
[15] D. Ciullo et al., "Understanding P2P-TV Systems Through Real Measurements," in Proc. IEEE GLOBECOM, November 2008.
[16] E. Alessandria et al., "P2P-TV Systems under Adverse Network Conditions: a Measurement Study," in Proc. IEEE INFOCOM, April 2009.
[17] S. Banerjee et al., "Construction of an Efficient Overlay Multicast Infrastructure for Real-time Applications," in Proc. IEEE INFOCOM, 2003.
[18] D. Tran, K. Hua, and T. Do, "ZIGZAG: An Efficient Peer-to-Peer Scheme for Media Streaming," in Proc. IEEE INFOCOM, 2003.
[19] D. Kostic et al., "Bullet: High Bandwidth Data Dissemination Using an Overlay Mesh," in Proc. ACM SOSP, Bolton Landing, NY, USA, October 2003.
[20] R. Rejaie and S. Stafford, "A Framework for Architecting Peer-to-Peer Receiver-driven Overlays," in Proc. ACM NOSSDAV, 2004.
[21] X. Liao et al., "AnySee: Peer-to-Peer Live Streaming," in Proc. IEEE INFOCOM, April 2006.
[22] F. Pianese et al., "PULSE: An Adaptive, Incentive-Based, Unstructured P2P Live Streaming System," IEEE Trans. on Multimedia, vol. 9, no. 6, 2007.
[23] J. Douceur, J. Lorch, and T. Moscibroda, "Maximizing Total Upload in Latency-Sensitive P2P Applications," in Proc. ACM SPAA, 2007, pp. 270-279.
[24] Y.-W. Sung, M. Bishop, and S. Rao, "Enabling Contribution Awareness in an Overlay Broadcasting System," in Proc. ACM SIGCOMM, 2006.
[25] N. Magharei, R. Rejaie, and Y. Guo, "Mesh or Multiple-Tree: A Comparative Study of Live P2P Streaming Approaches," in Proc. IEEE INFOCOM, May 2007.
[26] V. N. Padmanabhan, H. J. Wang, and P. A. Chou, "Resilient Peer-to-Peer Streaming," in Proc. IEEE ICNP, November 2003.
[27] M. Castro et al., "SplitStream: High-Bandwidth Multicast in Cooperative Environments," in Proc. ACM SOSP, October 2003.
[28] J. Liang and K. Nahrstedt, "DagStream: Locality Aware and Failure Resilient Peer-to-Peer Streaming," in Proc. SPIE Multimedia Computing and Networking, January 2006.
[29] N. Magharei and R. Rejaie, "PRIME: Peer-to-Peer Receiver-drIven MEsh-based Streaming," in Proc. IEEE INFOCOM, May 2007.
[30] M. Hefeeda et al., "PROMISE: Peer-to-Peer Media Streaming Using CollectCast," in Proc. ACM Multimedia, November 2003.
Hyunseok Chang received the B.S. degree in electrical engineering from Seoul National University, Seoul, Korea, in 1998, and the M.S.E. and Ph.D. degrees in computer science and engineering from the University of Michigan, Ann Arbor, in 2001 and 2006, respectively. He is currently a Member of Technical Staff with the Network Protocols and Systems Department, Alcatel-Lucent Bell Labs, Holmdel, NJ. Prior to that, he was a Software Architect at Zattoo, Inc., Ann Arbor, MI. His research interests include content distribution networks, mobile applications, and network measurements.
Sugih Jamin received the Ph.D. degree in computer science from the University of Southern California, Los Angeles, in 1996. He is an Associate Professor with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor. He cofounded a peer-to-peer live streaming company, Zattoo, Inc., Ann Arbor, MI, in 2005. Dr. Jamin received the National Science Foundation (NSF) CAREER Award in 1998, the Presidential Early Career Award for Scientists and Engineers (PECASE) in 1999, and the Alfred P. Sloan Research Fellowship in 2001.
Wenjie Wang received the Ph.D. degree in computer science and engineering from the University of Michigan, Ann Arbor, in 2006. He is a Research Staff Member with IBM Research China, Beijing, China. He co-founded a peer-to-peer live streaming company, Zattoo Inc., Ann Arbor, MI, in 2005, based on his Ph.D. dissertation, and was its CTO. He also worked at Microsoft Research China and Oracle China as a visiting student and consultant intern.