www.ischool.drexel.edu

INFO 331 Computer Networking Technology II
Chapter 7: Multimedia Networking

Glenn Booker

Multimedia Networking

• Recent years have seen massive growth in Internet audio and video apps
– Streaming video, IP telephony, Internet radio, teleconferencing, interactive games, distance learning, etc.
• Older Internet apps (email, WWW, FTP) were very elastic in bandwidth needs, but multimedia is much fussier
– Multimedia apps are delay sensitive
– But they’re also more loss-tolerant

Multimedia Networking

• Multimedia apps are in three categories
– Streaming stored audio & video (AV)
– Streaming live AV
– Real-time interactive AV
• This excludes download-stuff-and-play, such as MP3s, iTunes, etc.
– Download entire file before play starts
– FTP or HTTP are fine for them

Streaming stored AV

• Here, clients request on-demand compressed AV files that are stored on servers
– Content may include lectures, music, TV, etc.
– CNN video, YouTube, etc.
• Main features of this app type are:
– Stored, prerecorded media; pause, ffwd, rew
– Streaming, hence can play part of the media while downloading more of it
– Continuous play out – should keep original timing

Streaming live AV

• This is live broadcast of radio or TV over the Internet
– Can’t fast forward, since it hasn’t happened yet, but local storage of what’s been received can allow pausing and rewinding in some cases
– Can be accomplished using IP multicasting or IPTV, but is more often done via separate unicast streams
– Also has continuous play out, and can tolerate some startup delay

IPTV

• IPTV (TV over IP) is a challenge in terms of bandwidth
– Traditional client/server can’t work
– Often use P2P techniques (see CoolStreaming, PPLive), or content distribution networks (CDNs, which will be discussed later)

Real-time interactive AV

• This class of apps allows interaction between people at both ends (or many ends) of a connection, such as Internet telephony or teleconferencing
– Apps can be integrated, e.g. Web-phone
– Microsoft Live Messenger, Live Meeting (was NetMeeting), WebEx, GoToMeeting, Skype
– Transmission delays under 150 ms are good, under 400 ms is okay, and over 400 ms is bad

Multimedia Challenges

• The Internet provides best-effort service
– No guarantees from IP on when stuff will get there, consistency in delay times, if it will get there, or getting there in order

– All packets are equal in the Internet!

• Streaming stored or live AV has been pretty successful, real-time interactive AV less so

• To help, use tricks to make transmission smoother

Multimedia Tricks

• At app level, we’ll look at common tricks to make multimedia smoother, such as
– Use UDP to avoid TCP congestion control
– Delay playback by 100 ms to help allow for jitter
– Timestamp packets to know when they should be played
– Stored data can be fetched in advance, to help cover slow periods
– Send redundant data to help cover data losses

Fix the Internet!

• Some argue the Internet should allow for end-to-end bandwidth guarantees, like virtual circuit networks can provide
– Would require massive changes to routers to establish fixed paths for some service types
– Hard guarantees mean you’re certain to receive the QoS you paid for
– Soft guarantees mean you’re likely to receive the QoS you paid for

Fix the Internet!

• Others insist the Internet doesn’t need massive changes
– Let ISPs upgrade bandwidth as customers demand it
– Increase ISP caching for commonly requested stored AV
– Add content distribution networks (CDNs) for paid stored media, conveniently near network edges

Fix the Internet!

– Create multicast overlay networks – servers which help distribute streams to large audiences

• Third approach is to add pricing at the network & transport layers, to pay for better service (Diffserv approach)

AV Compression

• Compression speeds up transmission by reducing the volume of data to send
• Long done for static images (JPG, GIF)
– Each pixel is 24 bits of color data (RGB)
  • A 1024x768 pixel image is 1024x768x24/8 = 2.36 MB
– Compression can reduce image size by a factor of 10 without severe quality loss
• Key tradeoff: more compression = more losses
• Huge field (e.g. ISBN 1565921615)

Audio Compression

• Digital audio signals have three major characteristics
– The sampling rate (samples per second)
– The number of channels sampled (mono, stereo, etc.)
– The number of bits per sample

• These all affect the raw amount of data being sent, before compression is applied

Audio Compression

• Raw audio signals are recorded at some sample rate, per channel
– 8k, 16k, or 32k samples/sec (Hz) are common low grade rates
– 44.1k (CD quality), 48k, 96k, and even 192 kHz sample rates are used for professional audio

• The number of channels used is typically one (mono), two (stereo), or 5 to 7 (surround)

Audio Compression

• Each sample is quantized into some number of bits to describe its relative strength
– 8 bits gives 256 values from silent to REALLY LOUD, which is typical for cheap built-in audio
– CD quality uses 16 bits per sample
– Pro audio typically uses 24 or 32 bits per sample
• So one minute of absurd quality 7 channel surround would use 192k samples/sec x 7 channels x 32 bits/sample / 8 bits/byte x 60 sec/min = 322.56 MB/min!

Audio Compression

• This approach for audio is formally called Pulse Code Modulation (PCM)
– Mono speech recording at 8 kHz sample rate and 8 bits/sample equals 64 kbps of data
– CD quality (44.1 kHz, 16 bit, stereo) = 1.411 Mbps

• Modems can’t handle 64 kbps, and most broadband users can’t consistently get 1.4 Mbps, so compression is needed even for just audio
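The raw PCM arithmetic from the last few slides can be sketched as below. This is a minimal illustration, assuming decimal units (1 kbps = 1000 bps, 1 MB = 10^6 bytes) as the slide figures do; the function names are made up for this example.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def pcm_megabytes_per_minute(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM storage needed per minute, in decimal megabytes."""
    bytes_per_second = pcm_bit_rate(sample_rate_hz, bits_per_sample, channels) / 8
    return bytes_per_second * 60 / 1e6


if __name__ == "__main__":
    print(pcm_bit_rate(8_000, 8, 1))                 # mono speech, in bps
    print(pcm_bit_rate(44_100, 16, 2))               # CD quality, in bps
    print(pcm_megabytes_per_minute(192_000, 32, 7))  # 7-channel surround
```

Plugging in the three cases from the slides reproduces the 64 kbps, 1.411 Mbps, and 322.56 MB/min figures.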

Audio Compression

• Common audio compression standards include
– GSM, plus G.729 and G.723.1 from ITU
– “MPEG 1 layer 3”, a.k.a. MP3, which compresses to 96, 128, or 160 kbps

• From Drexel library, can get electronic copy of Compression technologies for video and audio, Jerry C. Whitaker, ISBN 0071391460

Video Compression

• Video is a series of images presented at 24 or 30 images per second
– Hence the size of the images (X by 0.75*X pixels), and the color resolution (number of bits per pixel) also affect the amount of raw data

– Widescreen image sizes have a 16x9 ratio instead of 4x3 ratio

Video Compression

• Video compression standards are mostly the MPEG family
– MPEG 1 for CD-ROM quality video at 1.5 Mbps
– MPEG 2 for DVD quality video, 3-6 Mbps
– MPEG 4 for object oriented video
– H.261 from ITU
– Proprietary formats, such as QuickTime (includes MPEG-4 and H.264), Real networks, etc.

Streaming Stored AV

• Streaming audio/video has become terribly popular (YouTube, AOL Video) because
– Disk space is dirt cheap (< 15¢/GB)
– Internet infrastructure is improving
– There’s enormous demand to entertain me NOW
• In this multimedia mode, clients request compressed AV files that live on servers
– Files are segmented, with special headers used for encapsulating them

Streaming Stored AV

• Other examples of streaming stored AV
– Rhapsody, a proprietary client from RealNetworks
– MSN Video, usually played through Windows Media Player (WMP)
– Muze, an audio service to retailers

Streaming Stored AV

• Various protocols can be used
– RTP – Real-time Transport Protocol is used to encapsulate the segments
– RTSP – Real Time Streaming Protocol is used for client/server interaction

• Users typically request files via a web browser which has a media player plug-in, such as Flash, Quicktime, Shockwave, RealPlayer or Windows Media Player

Streaming Stored AV

• The media player has several functions
– Decompress the audio or video files
– Buffer the incoming data to smooth out jitter
– Repair damage from lost packets as an attempt at error correction
• The media player has a GUI, which may be integrated into the rest of a web page, or appear as a new dedicated window

Accessing AV via Web Server

• Streaming stored AV can either be on a web server, or on dedicated streaming servers

• In the former case:
– A TCP connection is established
– An HTTP message requests the desired file
– The audio file is encapsulated in the HTTP response message

• A video file may be separate from audio, so the media player may have to assemble them

Accessing AV via Web Server

– HTTP can have parallel downloads, so both audio and video can be downloaded at the same time

• Or audio and video might be in one file, making the process simpler

• The file(s) are passed to the media player, which decompresses them and plays them

• But this assumes the entire file is downloaded before playing begins *grumble*

Accessing AV via Web Server

• To avoid this, most media players have the server send the AV file directly to the media player process
– Done via a meta file, which tells what kind of file will be streamed
• The process becomes:
– User clicks on link for the desired file
– Hyperlink is to the meta file
– HTTP response contains the meta file

Accessing AV via Web Server

– Client’s web browser passes meta file to the media player

• The meta file tells the browser which media player to use!

– Media player sets up TCP connection with the HTTP server, and requests the actual AV file

– AV file is sent to the media player, which then streams it

• So this could work, but it’s slow (TCP) and doesn’t allow pausing or rewinding easily

Accessing AV via Streaming Server

• To avoid the slowness of HTTP/TCP, AV can be sent via a dedicated streaming server over UDP
– Custom protocols can be used in place of HTTP
• Now one server has HTTP and meta files, and the streaming server has the AV files
– These could be two physical servers, or one

Accessing AV via Streaming Server

• Now the AV is sent over UDP, at a rate equal to its play rate (or drain rate)
– Playout is delayed 2-5 seconds, to allow for jitter
– Data is put in a client buffer, from which it’s played after the delay

• Or could send AV data over TCP, to get better sound quality, at the risk of sound pauses (client buffer starvation)

[Figure: Accessing AV via Streaming Server]

[Figure: Client Buffer]

RTSP

• The real time streaming protocol (RTSP) allows the user to control playback of audio or video
– Defined by RFC 2326
• It does NOT define compression schemes to be used, encapsulation (see RTP), transport protocol (TCP or UDP), or buffering approach
– Those are all application-level concerns

RTSP

• So what DOES it do?
– Allow user to pause/resume, reposition playback, fast forward, or rewind
• RTSP sends control messages out-of-band, using port 554, and could be over TCP or UDP
– Recall FTP also used out-of-band control messages
– The actual media stream is a separate band
• Like HTTP, commands are plain text, with code-identified responses
– But RTSP maintains client session state information

RTSP Example

• An HTTP GET command requests the presentation (meta) file from the web server

• The browser then passes it to the media player to manage getting the media and playing it

• In this example, the audio and video files are separate, but are played together (lipsynch)

Meta file example

<title>Twister</title>
<session>
  <group language=en lipsync>
    <switch>
      <track type=audio e="PCMU/8000/1" src="rtsp://audio.example.com/twister/audio.en/lofi">
      <track type=audio e="DVI4/16000/2" pt="90 DVI4/8000/1" src="rtsp://audio.example.com/twister/audio.en/hifi">
    </switch>
    <track type="video/jpeg" src="rtsp://video.example.com/twister/video">
  </group>
</session>

[Figure: RTSP Example]

RTSP

• Many RTSP methods are pretty self-explanatory – PLAY, PAUSE, RECORD
– SETUP establishes the connection
– TEARDOWN closes the connection
– DESCRIBE identifies the media to be played
– ANNOUNCE updates the session description

• RTSP is used by Real Networks
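To make the "plain text, like HTTP" point concrete, here is a hedged sketch that formats RTSP requests in the style of RFC 2326. The helper name `rtsp_request` and the session ID `4231` are hypothetical; the URL is the one from the meta file example, and `CSeq` and `Session` are standard RTSP header fields.

```python
def rtsp_request(method, url, cseq, session=None, extra_headers=None):
    """Format an RTSP/1.0 request as plain text, one header per line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session is not None:
        # The Session header is how RTSP ties a request to its client state.
        lines.append(f"Session: {session}")
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Like HTTP, headers end with a blank line.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    url = "rtsp://audio.example.com/twister/audio.en/lofi"
    print(rtsp_request("SETUP", url, 1, extra_headers={"Transport": "rtp/udp"}))
    print(rtsp_request("PLAY", url, 2, session="4231"))
    print(rtsp_request("PAUSE", url, 3, session="4231"))
    print(rtsp_request("TEARDOWN", url, 4, session="4231"))
```

The sequence SETUP, PLAY, PAUSE, TEARDOWN mirrors the methods listed above; a real client would also parse the server's numeric status responses.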

Internet Phone

• Real-time apps, such as Internet phone and video conferencing, are very sensitive to packet delay, jitter, and packet loss

• See how these issues are handled for Internet phone
– Data is generated at 8 kBps
– Every 20 ms, a chunk of 160 bytes of data gets a header attached, and the packet is sent via UDP

• Ideally, these packets all get to the receiver

Internet Phone

• The receiver must decide
– When to play back a given packet
– What to do if a packet is missing
• Recall packets can be lost if they arrive at a router with a full input or output queue
• From 1-20% packet loss can be tolerated
– Forward error correction (FEC) can help make up for lost packets

Internet Phone

• End-to-end delay can be confusing for the receiver, if it simply takes too long for the data to get there
– Recall the under 150 ms, 150-400 ms, and over 400 ms ranges for good, ok, and bad delays

– Internet phone may discard packets over 400 ms old

• Jitter is typically caused by variation in queuing delays

Internet Phone

• Some jitter problems can be removed by using sequence numbers, time stamps, and playout delays to make sure packets are in order as well as possible

• Playout delay can be fixed or adaptive
• For fixed playout delay, anything arriving after its planned play time is discarded
– The amount of delay is typically 150-400 ms, less under good network conditions
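A minimal sketch of the fixed playout rule, assuming each packet is described by its sender timestamp and arrival time in milliseconds (the tuple layout and function name are choices made for this example, not part of any protocol).

```python
def fixed_playout(packets, playout_delay_ms):
    """Apply a fixed playout delay.

    packets: list of (send_time_ms, arrival_time_ms); arrival is None if the
    packet never arrived. Returns (played, discarded), where played holds
    (send_time_ms, play_time_ms) pairs and discarded holds send times.
    """
    played, discarded = [], []
    for sent, arrived in packets:
        play_time = sent + playout_delay_ms
        if arrived is not None and arrived <= play_time:
            played.append((sent, play_time))
        else:
            # Lost, or arrived after its planned play time: both count as lost.
            discarded.append(sent)
    return played, discarded


if __name__ == "__main__":
    trace = [(0, 120), (20, 250), (40, None), (60, 200)]
    print(fixed_playout(trace, playout_delay_ms=150))
```

With a 150 ms delay, the packet sent at 20 ms is thrown away even though it arrived, because it showed up after its 170 ms play time.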

Adaptive playout delay

• Adaptive playout delay is often used because long delay is annoying to the users
– Want to make the delay as small as possible without losing a lot of packets
– Delay is reassessed for each talk spurt (period of transmission)

• Similar to calculation of timeout interval for TCP, use previous history, amended by the most recent data

Adaptive playout delay

• $d_i = (1-u)\,d_{i-1} + u\,(r_i - t_i)$
• Where $d_i$ is the average network delay
– $u$ is a fixed value, e.g. 0.01
– $r_i$ is when the $i$th packet was received
– $t_i$ is the time that packet was timestamped by the sender
• Similarly the average deviation is
• $v_i = (1-u)\,v_{i-1} + u\,|r_i - t_i - d_i|$

Adaptive playout delay

• Then the playout time for the first packet in a talk spurt is
• $p_i = t_i + d_i + K\,v_i$
– Where $K$ is typically 4, much like TimeoutInterval = EstimatedRTT + 4*DevRTT

• This gives us an approach to calculate the playout delay, and keep adjusting it to compensate for network traffic conditions
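The adaptive playout formulas translate almost directly into code. The sketch below assumes times in milliseconds and uses the slide's symbols (u, K, r, t, d, v) as parameter names; the function names are invented for illustration.

```python
def update_estimates(d_prev, v_prev, r_i, t_i, u=0.01):
    """One step of the running averages for delay (d) and deviation (v)."""
    d_i = (1 - u) * d_prev + u * (r_i - t_i)
    v_i = (1 - u) * v_prev + u * abs(r_i - t_i - d_i)
    return d_i, v_i


def playout_time(t_i, d_i, v_i, k=4):
    """Playout time for the first packet of a talk spurt."""
    return t_i + d_i + k * v_i


if __name__ == "__main__":
    d, v = 100.0, 10.0                      # assumed starting estimates
    for t, r in [(0, 95), (20, 130), (40, 150)]:
        d, v = update_estimates(d, v, r, t)
    print(d, v, playout_time(1000, d, v))
```

The estimates are updated on every packet, but the playout time is only recomputed at the start of a talk spurt, so timing within a spurt stays steady.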

Stored vs real time AV

• Streaming stored audio & video uses the same techniques to help smooth out network jitter
– Sequence numbers, timestamps, playout delay

• But stored AV can tolerate much larger delays before playout begins (compared to real time AV), which gives the app developer much more design flexibility

Recovering from packet loss

• The method for compensating for lost packets is the loss recovery scheme
– Here, ‘lost’ means it never got there, or it got there after its planned play time
• Retransmitting lost packets doesn’t make sense here (why?)
• Instead anticipate loss, using methods like
– Forward Error Correction
– Interleaving

Forward Error Correction

• Forward Error Correction adds a little redundant data to packets to help allow recreating what missing packets had in them
– RAT (Robust Audio Tool) is an open source Internet phone app that uses FEC and RTP

• There are lots of FEC approaches; we’ll look at two of them

Forward Error Correction

• First is to send a redundant encoded chunk of data after every n normal chunks
– The redundant chunk is obtained from XOR-ing the normal chunks
• If any ONE of the n chunks is missing, it can be reconstructed mathematically
– But if two or more packets are lost, tough luck
• This increases the transmission rate by 100/n %
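A small sketch of the XOR scheme, assuming all chunks in a group have equal length and a lost chunk is represented by `None` (both are simplifications chosen for this example).

```python
from functools import reduce


def xor_chunks(chunks):
    """Byte-wise XOR of equal-length chunks; used to build the parity chunk."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def recover_missing(received, parity):
    """Rebuild the single missing chunk (marked None) in a group of n chunks."""
    missing = [i for i, chunk in enumerate(received) if chunk is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost chunk per group")
    present = [chunk for chunk in received if chunk is not None]
    repaired = list(received)
    # XOR of the parity with the surviving chunks cancels them out,
    # leaving the bytes of the lost chunk.
    repaired[missing[0]] = xor_chunks(present + [parity])
    return repaired


if __name__ == "__main__":
    group = [b"abcd", b"efgh", b"ijkl"]
    parity = xor_chunks(group)
    print(recover_missing([group[0], None, group[2]], parity))
```

Here n = 3, so one parity chunk per group adds about 33% to the transmission rate, and the receiver must wait for the whole group before it can repair a loss.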

Forward Error Correction

• The second approach is sneakier
– Add a second, lower quality data stream in parallel with the primary stream (e.g. tack on a 13 kbps stream to a 64 kbps stream)
– When loss occurs, use the low quality stream
– This increases playout delay only a little
– Many variations on this approach are possible, especially to allow for higher loss rates

Interleaving

• Interleaving breaks the stream of data into smaller chunks, and rearranges how the chunks are sent
– So if each packet carries four chunks, then the first packet sent has chunks number 1, 5, 9, and 13 instead of just 1-4
  • The second packet sent has 2, 6, 10, & 14; the third packet has 3, 7, 11, & 15, etc.
– That way if a packet is lost, the loss is distributed across a wider range of time
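The 1, 5, 9, 13 pattern can be sketched with list slicing. This is an illustration only: it assumes the number of pieces is a multiple of the stride, and it marks a lost unit with `None`; the function names are made up.

```python
def interleave(pieces, stride):
    """Unit k carries pieces k, k+stride, k+2*stride, ... (0-based k)."""
    return [pieces[k::stride] for k in range(stride)]


def deinterleave(units, stride):
    """Undo interleave(); pieces from a lost unit (None) come back as None."""
    per_unit = max(len(unit) for unit in units if unit is not None)
    restored = [None] * (stride * per_unit)
    for k, unit in enumerate(units):
        if unit is None:
            continue
        for j, piece in enumerate(unit):
            restored[k + j * stride] = piece
    return restored


if __name__ == "__main__":
    units = interleave(list(range(1, 17)), stride=4)
    print(units)          # first unit holds pieces 1, 5, 9, 13
    units[1] = None       # pretend the second unit was lost in transit
    print(deinterleave(units, stride=4))
```

Losing one unit leaves four short, widely spaced gaps instead of one long gap, which is much easier to conceal.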

Interleaving

• This increases latency (have to break up & reassemble chunks)

• Doesn’t increase bandwidth of data – no extra data is being sent

• So how can we fix the data if some is lost?

Repair of damaged audio

• Voice is relatively easy to fix if a little data is missing
– Loss rates under 15% for packets 4-40 ms long
• One technique is simply repetition – copy the previous packet and play it again
– Easy to do and usually works ok
• Or can try to interpolate between before and after packets
– More tricky computationally, but sounds better
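Both repair techniques can be sketched in a few lines. The example assumes each packet is a list of integer PCM samples and a lost packet is `None`; the interpolation shown is a deliberately crude sample-wise average of the neighbors, not what a production codec would do.

```python
def conceal_by_repetition(packets):
    """Replace each lost packet (None) with a copy of the last good packet."""
    repaired, last_good = [], None
    for packet in packets:
        if packet is None:
            packet = last_good      # stays None if nothing has arrived yet
        else:
            last_good = packet
        repaired.append(packet)
    return repaired


def conceal_by_interpolation(before, after):
    """Estimate a lost packet as the sample-wise average of its neighbors."""
    return [(a + b) // 2 for a, b in zip(before, after)]


if __name__ == "__main__":
    stream = [[1, 2], None, [5, 6]]
    print(conceal_by_repetition(stream))
    print(conceal_by_interpolation(stream[0], stream[2]))
```

Repetition needs only the previous packet, while interpolation has to wait for the packet after the gap, which is part of why it is the more costly option.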

Content Distribution Networks

• Content Distribution Networks (CDNs) address the need for lots of people accessing the same stored media often and/or at once
– After all, a single server source would have terrible bandwidth and packet loss issues

• Often a content provider (CNN, MSN, etc.) will pay a CDN company (Akamai) to provide videos to users with as little delay as possible

Content Distribution Networks

• The CDN approach is simple – make lots of copies of the media, and put them on lots of servers everywhere

• CDN servers get put throughout the Internet

• Often the CDN company will lease data center space to house the servers
– Data centers might be located at second or third tier ISPs

Content Distribution Networks

• The CDN gets source media (videos) from the customer, and copies them to the servers

• When a user requests content, the nearest (or most available) CDN server delivers it
– DNS redirection is used to find the correct CDN server
– This results in URLs with two addresses in them – the first is the CDN, the second is the file name
  • http://www.cdn.com/www.cnn.com/zoo/turtle.mpg
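The two-part URL can be pulled apart with the standard library. This is a sketch that assumes the slide's convention (CDN host first, then the origin host as the first path segment); real CDNs use a variety of URL schemes, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def split_cdn_url(url):
    """Return (cdn_host, origin_host, file_path) for a CDN-style URL."""
    parts = urlsplit(url)
    origin_host, _, path = parts.path.lstrip("/").partition("/")
    return parts.netloc, origin_host, "/" + path


if __name__ == "__main__":
    print(split_cdn_url("http://www.cdn.com/www.cnn.com/zoo/turtle.mpg"))
```

DNS resolution of the first host name is where the CDN gets to pick which of its servers answers the request.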

Content Distribution Networks

• The ‘best’ server for each ISP is determined using the same approaches we saw for BGP routing tables

• CDNs may also be used within a large corporation to stream, for example, training videos locally

Dimensioning best-effort networks

• Many techniques we’ve seen can be used to improve the quality of service of multimedia apps
– We want low RTT, little jitter, few packets lost
• Another solution is to throw enough money at the network to avoid all these problems
– Given enough resources, none of these problems occur!

Dimensioning best-effort networks

• To do so requires adequate bandwidth provisioning, and at a larger scale, good network dimensioning; issues include
– Model traffic demand between network end points
– Clear performance requirements
– Model performance for a given workload
– Model to predict minimum bandwidth to meet all requirements

Real-time Interactive Protocols

• Demand for real time interactive applications is huge, so there has been a lot of work on protocols to help make such apps easier to develop

• Here we’ll look at three common ones
– RTP, the Real-time Transport Protocol
– SIP
– H.323

RTP

• RTP defines a standard AV chunk-of-data structure for data, sequence numbers, timestamps, etc.
– The actual data format can be a proprietary format, or other AV formats such as PCM, GSM, MP3, MPEG, H.263, etc.

• RTP is defined by RFC 3550, and usually runs over UDP (no specific port number)

• Often used for Internet telephony apps

RTP

[Figure: the application’s AV data (format can be PCM, GSM, MP3, MPEG, H.263, proprietary AV formats, etc.) gets RTP header info (timestamp, sequence number), is passed to the transport layer (UDP), and then to the network layer (IP)]

RTP

• An RTP packet consists of the RTP header plus the data
– The RTP header is at least 12 B (3 rows x 4 B ea.)
• Apps that both use RTP have a better chance of cooperating (e.g. if two users have different apps at each end of a phone call)
• RTP doesn’t guarantee data delivery, or sequence
– Routers can’t tell RTP data from anything else

RTP

• Every data source (video camera, microphone) can have an RTP data stream
– Video conference could have four RTP streams to send video each direction, and audio each way
– Some compression formats (MPEG-1 and -2) combine audio and video into one stream
• RTP can be used with unicast or multicast
– A group of streams from the same origin form an RTP session

RTP Header Fields

• The RTP header is very simple
– First line has:
  • Nine bits of misc. identifiers (version, how long the CSRC list will be, if custom extensions are used, etc.)
  • Payload type (7 b)
  • Sequence number (16 b)
– Timestamp (32 b)
– SSRC (32 b)
– CSRC list (0 to 15 lines of 32 b each)
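The header layout above can be packed with Python's `struct` module. This is a hand-rolled sketch following the RFC 3550 bit layout (version 2, padding and extension bits left at zero), not a full RTP implementation; the function name and the sample SSRC value are arbitrary.

```python
import struct


def pack_rtp_header(payload_type, seq, timestamp, ssrc, csrc_list=(), marker=False):
    """Build an RTP header: 12 fixed bytes plus 4 bytes per CSRC entry."""
    if len(csrc_list) > 15:
        raise ValueError("the CSRC count field is only 4 bits wide")
    version = 2
    # First byte: version (2 b), padding (1 b), extension (1 b), CSRC count (4 b).
    byte0 = (version << 6) | len(csrc_list)
    # Second byte: marker (1 b) and payload type (7 b).
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    for csrc in csrc_list:
        header += struct.pack("!I", csrc & 0xFFFFFFFF)
    return header


if __name__ == "__main__":
    # Payload type 0 is PCM mu-law audio in the standard RTP audio/video profile.
    header = pack_rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=0x12345678)
    print(len(header), header.hex())
```

The `!` in the format string selects network (big-endian) byte order, which is what goes on the wire.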

RTP Header

• The Payload Type is a numeric code that identifies what format the data has – PCM, GSM, MPEG, etc.

• The Sequence Number starts at a random value, and “increments by one for each RTP data packet sent” [RFC 3550]

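• The Timestamp is when the first data in the packet was sampled; for audio its clock ticks once per sampling period, so the change from one packet to the next equals the number of samples per packet (e.g. 160 for 20 ms of 8 kHz audio)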

RTP Header

• The SSRC is the synchronization source identifier – just a random number to identify the source
– Only goal is that no two SSRCs are the same in a session
• The CSRC list is the contributing sources for audio data
– Is often used for mixing, or to identify specific data sources (Fred vs. Wilma)

RTP App Development

• RTP can be implemented two ways
– Use RFC 3550 and manually implement RTP headers within your application
– Use existing API libraries (e.g. in C or Java) to implement RTP for you, and have your app call them

– Hmm, I wonder which is easier?

• RTP can be coded as an app-layer protocol, or used in conjunction with UDP via the API

RTP Control Protocol (RTCP)

• RTCP is also defined by RFC 3550
• It defines control packets that are sent by data senders and receivers in an RTP session via IP multicast
– Main purpose is to “provide feedback on the quality of the data distribution”
– It identifies the RTP source by a canonical name or CNAME
– Control rate of data transmission, & identify users

RTCP

• RTCP uses the next port number above that used by RTP

• RTCP packets are sent periodically by all session systems

• They have no ‘data’ attached – just headers
– Sender Report (SR) has SSRC of the session, timestamp of most recent RTP packet, # of packets and bytes sent in the stream

RTCP

– Sender reports can synch streams in a session

– The Reception Report (RR) has reception statistics
  • SSRC of the RTP stream of interest
  • Fraction of packets lost
  • Last sequence number received
  • Interarrival jitter

– There are source description packets, giving user name, email, app, SSRC

RTCP

• RFC 3550 has massive detail on the statistics which can be collected (bandwidth, packet size, jitter, transmission interval, packets lost, etc.) but leaves analysis and interpretation of the data to the app developer

• Sender and receiver reports can be stacked into a combined report

RTCP Scaling

• The amount of RTP traffic doesn’t change with a large number of receivers, but the amount of RTCP traffic does grow
– RTCP tries to limit itself to 5% of session bandwidth(!)
• This results in different RTCP transmission periods (T) for senders versus receivers
– Two senders and 100 receivers will get T(receiver) = 16.67*T(sender)
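The 16.67 ratio follows from how RFC 3550 splits the RTCP budget: a quarter of it is shared by the senders and the rest by the receivers. The sketch below is a simplified version of that calculation (it ignores the RFC's minimum interval and randomization rules); the session bandwidth and packet size in the demo are made-up numbers.

```python
def rtcp_periods(senders, receivers, session_bw_bps, avg_rtcp_packet_bits,
                 rtcp_fraction=0.05, sender_share=0.25):
    """Return (T_sender, T_receiver) in seconds between RTCP packets."""
    rtcp_bw = rtcp_fraction * session_bw_bps
    # Each group shares its slice of the RTCP bandwidth equally among members.
    t_sender = senders * avg_rtcp_packet_bits / (sender_share * rtcp_bw)
    t_receiver = receivers * avg_rtcp_packet_bits / ((1 - sender_share) * rtcp_bw)
    return t_sender, t_receiver


if __name__ == "__main__":
    t_s, t_r = rtcp_periods(senders=2, receivers=100,
                            session_bw_bps=2_000_000, avg_rtcp_packet_bits=800)
    print(t_s, t_r, t_r / t_s)
```

The ratio depends only on the head counts and the 25/75 split, (100 / 0.75) / (2 / 0.25) = 16.67, so it holds whatever bandwidth and packet size are assumed.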

SIP

• The Session Initiation Protocol (SIP) is defined by RFC 3261
– SIP can be over UDP or TCP, and uses port 5060
• It is designed to handle Internet telephone across LANs (not local phone exchanges)
– It allows calls to be placed over IP
– Allows caller to determine IP of callee (varies)
– Can add media streams during call

SIP Session Example

• Caller sends an INVITE message
– Tells caller and callee IP addresses, media format and encoding, type of encapsulation (RTP) and receiving port number
• Callee sends response msg
– Confirms IP address, encoding, encapsulation, and port number

• Caller sends ACK to callee, and the call can take place
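The address, port, and encoding in an INVITE or its response travel as short `c=` and `m=` lines in the message body. A hedged sketch of reading them, using the values from this deck's SIP session example; the function name and returned dictionary keys are invented, and only these two line types are handled.

```python
def parse_sdp_media(sdp_text):
    """Extract the receive address, port, and payload type from c= and m= lines."""
    info = {}
    for line in sdp_text.splitlines():
        key, _, value = line.strip().partition("=")
        if key == "c":
            # e.g. "IN IP4 167.180.112.24": network type, address type, address
            _net_type, _addr_type, address = value.split()
            info["address"] = address
        elif key == "m":
            # e.g. "audio 38060 RTP/AVP 0": media, port, transport, payload type
            media, port, transport, fmt = value.split()[:4]
            info.update(media=media, port=int(port),
                        transport=transport, payload_type=int(fmt))
    return info


if __name__ == "__main__":
    invite_body = "c=IN IP4 167.180.112.24\nm=audio 38060 RTP/AVP 0"
    ok_body = "c=IN IP4 193.64.210.89\nm=audio 48753 RTP/AVP 3"
    print(parse_sdp_media(invite_body))
    print(parse_sdp_media(ok_body))
```

Each side announces where and in what format it wants to receive media, which is why the two directions of the call can use different encodings and ports.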

SIP Session Example

[Figure: SIP session timeline between Alice (167.180.112.24) and Bob (193.64.210.89). Alice sends INVITE [email protected] to port 5060, with c=IN IP4 167.180.112.24 and m=audio 38060 RTP/AVP 0. Bob’s terminal rings, and Bob replies 200 OK with c=IN IP4 193.64.210.89 and m=audio 48753 RTP/AVP 3. Alice sends ACK to port 5060. µ-law audio then flows to port 38060 and GSM audio to port 48753.]

SIP

• So what?
– The two parties can be using completely different encoding and encapsulation methods, yet carry on the call

– Also note that the control messages stay over port 5060, yet the media are over two other, negotiated ports – so SIP uses out-of-band control messages

– SIP requires ACK of all messages, so either UDP or TCP can be used

SIP Addressing

• SIP addresses can look like email addresses, or cite IP addresses, or be a phone number, or even a full personal name
– [email protected][email protected][email protected]

SIP Messages

• The SIP protocol is huge – 269 pages – so only a brief overview is in order

• The INVITE message typically is addressed to an email-like address, e.g. [email protected]

• Each device the message passes through adds a Via: header with that device’s IP

• A SIP proxy finds the IP of the device the callee is currently using

SIP Messages

• Every SIP user is associated with a SIP registrar
– When a SIP device is launched, it registers with the registrar to give its current IP address
• So the SIP proxy asks the callee’s registrar for their IP address
– The proxy might be redirected to another registrar if the callee is not nearby

SIP applications

• SIP has been described for voice over IP, but can also be used for any media – video, even text messaging

• SIP software is widely available for many common functions, on various platforms (PC/Mac/Linux)

H.323

• H.323 (catchy name, huh?) is an alternative to SIP

• Computers can use SIP or H.323 to connect to plain telephones, as well as do IP-only calling

• H.323 supports audio and optionally, video
– Audio must support at least G.711 speech compression at 56 or 64 kbps

H.323

– Video, if used, must support at least QCIF H.261, which is a massive 176x144 pixels

• H.323 is a suite of protocols
– Includes a separate control protocol (H.245), signaling channel (Q.931), and uses RAS for registering with the gatekeeper

– A gateway sits between the IP and plain telephone networks

• Whereas, SIP only manages connections

H.323 vs SIP

• H.323 is from the ITU (mainly telephone basis), whereas SIP is from IETF (the RFC folks)

• SIP can work with RTP, G.711, and H.261, but doesn’t require any of them

• SIP is simple compared to H.323


Beyond Best Effort

• So lots of techniques have been used to get the most out of the Internet’s ‘best effort’ approach

– The current quality of service (QoS) has no guarantees

– How can we improve on that?

• Change the architecture of the Internet!

• Look at a simple network to examine the problems


A Sample Network


• In this example, the local networks are assumed to be much faster than the 1.5 Mbps connection between them

– Two apps are competing for that 1.5 Mbps of bandwidth, e.g. leaving the first router (R1)

• Scenario 1: If host H1 is sending FTP to H3, and then H2 tries to send audio traffic to H4, the output queue at router R1 may already be full, so audio packets are delayed or lost


A Sample Network

• But IF we could mark audio packets versus FTP packets, we could give priority to audio packets, since they are delay sensitive

– Principle 1: packet marking lets a router distinguish between different classes of traffic

• Scenario 2 – what if the FTP traffic was on a high priority (paid) connection, but the audio traffic was normal free Internet?

– Principle 1a: packet classification lets a router distinguish between different classes of traffic


A Sample Network

• The distinction is important

– Marking packets for type of payload is one way to classify them, but hardly the only one

– The packets could be prioritized by the originating IP address (which could imply whether premium service was used)

• Scenario 3: Bad application

– What if the audio app should get priority, but it abuses that and hogs the whole link?


A Sample Network

– Instead of allowing 1.0 Mbps for audio, the app tries to take the whole 1.5 Mbps, thereby shutting off the FTP service

– Just like circuit switching, we want to isolate each traffic flow, so a misbehaving app doesn’t ruin the link for everyone

– Principle 2: Some degree of isolation is desirable to protect traffic flows from each other

– This might imply the need for traffic cops, to ensure app compliance


A Sample Network

• So two approaches could address these concerns and improve the QoS

– Mark packets, and police app behavior to share bandwidth fairly

– Or, logically isolate traffic flows, and designate some specific bandwidth for each

• But if the flows are isolated, we need to make sure the full bandwidth is used if a traffic flow isn’t active


A Sample Network

– Principle 3: When traffic flows are isolated, need to use resources (bandwidth, router buffers) as efficiently as possible

• Scenario 4: Two 1.0 Mbps audio apps over a 1.5 Mbps link

– Both audio apps can’t get the full bandwidth needed – each would lose 25% of their packets

– To maintain any decent QoS level, the network should allow or block the flow

• Telephones have done this for years!
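
The 25% figure in Scenario 4 follows from simple arithmetic. The sketch below assumes the excess traffic is dropped evenly across the flows; the function name `loss_fraction` is made up for illustration.

```python
def loss_fraction(offered_mbps, link_mbps):
    """Fraction of offered traffic that cannot fit on the link.

    Assumes the overload is spread evenly over all flows.
    """
    total = sum(offered_mbps)
    if total <= link_mbps:
        return 0.0
    return (total - link_mbps) / total


# Two 1.0 Mbps audio flows offered to a 1.5 Mbps link:
# 2.0 Mbps offered, 0.5 Mbps excess, so 0.5 / 2.0 of the packets are lost.
scenario4_loss = loss_fraction([1.0, 1.0], 1.5)
```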


A Sample Network

– So to guarantee a minimum level of QoS we need a call admission process

– Principle 4: a call admission process is needed to admit or block calls from the network, based on the call’s QoS requirements

• Now that we have the principles of guaranteeing QoS, look at how these are implemented in the Internet


Scheduling and Policing

• Recall at the link level, packets from various flows are multiplexed and queued on the output buffers to go onto a link

• The link-scheduling discipline is the process for queuing packets, which is vital for making QoS guarantees

• Look at FIFO, priority, round robin, and WFQ approaches for link-scheduling


FIFO Scheduling

• FIFO (first-in first-out), also called first-come first-served (FCFS), is the McDonald’s approach to scheduling

– Packets are queued in the order in which they arrived at the device

– If the output buffer gets full, the packet-discarding policy is used to decide which packets are dropped


Priority Queuing

• Here, packets are grouped by their priority – a separate queue for each level of priority

– How they are grouped could be by type of service, source IP, or other methods

• The highest priority queue is emptied before the next priority queue is transmitted

– If the high priority queue is empty, a lower priority packet will still be sent immediately

• Within each queue, FIFO applies
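
A minimal sketch of priority queuing as described above, assuming two priority levels and string stand-ins for packets. The class name `PriorityScheduler` is hypothetical, not taken from any router implementation.

```python
from collections import deque


class PriorityScheduler:
    """One FIFO queue per priority level; index 0 is the highest priority."""

    def __init__(self, levels):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, packet, priority):
        self.queues[priority].append(packet)

    def dequeue(self):
        # Serve the highest-priority non-empty queue; FIFO within each queue.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing waiting


sched = PriorityScheduler(levels=2)
sched.enqueue("ftp-1", 1)
sched.enqueue("audio-1", 0)
sched.enqueue("ftp-2", 1)
sched.enqueue("audio-2", 0)
order = [sched.dequeue() for _ in range(4)]
```

Both audio packets leave before any FTP packet, even though an FTP packet arrived first.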


Round Robin Queuing

• In round robin, packets are sorted into classes which have no relative priority

– One packet from class 1 is sent, then one from class 2, etc.

– Then go back to class 1 and keep repeating the cycle

• A work-conserving round robin tries to keep the link busy, so will check the next class if one is empty
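
The work-conserving cycle can be sketched as follows, assuming each class is a simple list of packet labels; the function name `round_robin` is illustrative only.

```python
from collections import deque


def round_robin(class_queues):
    """Send one packet per class per cycle, skipping classes that are empty."""
    queues = [deque(q) for q in class_queues]
    sent = []
    while any(queues):          # an empty deque is falsy
        for queue in queues:
            if queue:           # work-conserving: move on if this class is empty
                sent.append(queue.popleft())
    return sent


rr_order = round_robin([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]])
```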


Weighted Fair Queuing (WFQ)

• WFQ is a variation on round robin queuing

– Here each class is assigned a ‘weight,’ which determines how much transmission time they get

– In short, some classes are more equal than others

– The weights for all classes total 100%

– Incoming packets are classified into the correct class, which has its own queue

• WFQ is very widely used
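
As a rough sketch of how weights translate into bandwidth, the function below splits a link among the classes that currently have packets waiting. The class names, weights, and the helper `wfq_rates` are assumptions for illustration.

```python
def wfq_rates(link_rate, weights, active):
    """Share link_rate among active classes in proportion to their weights."""
    total = sum(weights[c] for c in active)
    return {c: link_rate * weights[c] / total for c in active}


weights = {"audio": 0.5, "video": 0.25, "ftp": 0.25}  # weights total 100%

# All three classes busy on a 1.5 Mbps link.
all_active = wfq_rates(1.5, weights, ["audio", "video", "ftp"])

# Video goes idle: its share is split among the remaining classes.
no_video = wfq_rates(1.5, weights, ["audio", "ftp"])
```

With video idle, audio rises from 0.75 to 1.0 Mbps and FTP from 0.375 to 0.5 Mbps, so the link stays fully used.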


Policing

• We may need to control the rate data are allowed to enter the network – a traffic cop

• Could control three aspects

– Long term average data rate

– Peak or max rate

– Burst size – max number of packets in a short time interval


The Leaky Bucket Analogy

• Bucket holds up to ‘b’ tokens; a packet must remove one token before it can enter the network

• New tokens are created at rate ‘r’ tokens per sec

• If bucket is full, new tokens are discarded

[Figure: arriving traffic passes through a leaky bucket (token rate r, bucket size b) into a WFQ scheduler with per-flow rate R; maximum delay D_max = b/R]


The Leaky Bucket Analogy

• The token generation rate, r, limits the long term average rate

• In any time ‘t’, the max number of packets which could enter the network is r*t + b, which caps the traffic sent over any interval

• The size of the bucket limits the max burst size

– No more than b tokens can be removed in a short time
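
A minimal token-bucket policer along these lines, assuming one token per packet and caller-supplied timestamps so the behavior is deterministic. The class name `TokenBucket` and the values of r and b are illustrative.

```python
class TokenBucket:
    """Policer with token rate r (tokens/sec) and bucket size b (tokens)."""

    def __init__(self, r, b):
        self.r = r
        self.b = b
        self.tokens = b      # assumption: the bucket starts full
        self.last = 0.0      # time of the previous update, in seconds

    def allow(self, now):
        # Add the tokens generated since the last call; overflow is discarded.
        elapsed = now - self.last
        self.tokens = min(self.b, self.tokens + elapsed * self.r)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # the packet removes one token and may enter
            return True
        return False          # no token available: packet must wait or be dropped


bucket = TokenBucket(r=2, b=3)
burst = [bucket.allow(now=0.0) for _ in range(5)]  # five packets arrive at once
later = bucket.allow(now=1.0)                      # one second later
```

Only b = 3 packets of the burst get through; after one second, r = 2 new tokens have accumulated and traffic flows again.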


WFQ + Leaky buckets

• If we combine the WFQ scheduling with the leaky bucket policing options, what happens to the delay for any class of packets?

– If the class’ bucket is full, and ‘bi’ packets arrive in a burst, the rate at which the class’ queue is emptied is at least R*wi/sum(w), where R is the link rate, wi is the weight of this class, and sum(w) is the sum of all active classes’ weights


WFQ + Leaky buckets

• So the time for the bi packets to be handled is time = quantity/rate

– di = bi/[R*wi/sum(w)]

– Hence there is a provable maximum delay

• As long as the rate at which this class’ tokens are generated, ri, is less than R*wi/sum(w), the above di equation is the max delay in a WFQ queue
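
To make the bound concrete, here is a small worked computation of di = bi/[R*wi/sum(w)]. The numbers and the function name `max_wfq_delay` are made up for illustration.

```python
def max_wfq_delay(b_i, link_rate, w_i, active_weights):
    """Worst-case queuing delay for a class policed by a bucket of size b_i.

    link_rate is in packets/sec; the class is guaranteed at least
    link_rate * w_i / sum(active_weights).
    """
    guaranteed_rate = link_rate * w_i / sum(active_weights)
    return b_i / guaranteed_rate


# Bucket of 10 packets, 1000 packets/sec link, weight 1 out of a total of 4:
# guaranteed rate = 250 packets/sec, so the burst drains in 10 / 250 seconds.
d = max_wfq_delay(b_i=10, link_rate=1000, w_i=1, active_weights=[1, 1, 2])
```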


Integrated vs Differentiated Services

• Now we’ve covered the types of mechanisms needed to provide QoS guarantees in the Internet

– Cool. So, how is this being implemented?

• Two major architectures are trying to provide an answer – Intserv and Diffserv

– Integrated Services = Intserv

– Differentiated Services = Diffserv


Intserv

• Intserv provides a way for QoS to be specified for a given session

– Kind of like setting up a virtual circuit

• Intserv depends on two key features

– Must know what resources are already occupied by routers along a desired path

– Must do call setup to prepare for a session


Intserv

• Call setup includes

– A session must define the QoS needs

• An Rspec defines the type of QoS desired

• A Tspec defines the traffic the session will be sending

– See RFC 2210 and 2215 for Rspec and Tspec

– Then a signaling protocol, RSVP, is used to set up the reservation along the path’s routers

– Each router (element) decides whether to allow the session


Intserv

• Intserv provides two different types of QoS

– Guaranteed QoS – the bounds on queuing delays for each router are assured

• See RFC 2212 for gory details how this is done

– Controlled-load Network Service – the session will receive a “close approximation” to the QoS an unloaded element would provide

• Oddly, this QoS is not quantified

• RFC 2211, since you were wondering


Diffserv

• Diffserv (RFC 2475) was in response to problems with Intserv:

– Scalability problems with per-flow (session) reservations

• Gets very time consuming for large networks

– Flexibility in service classes – Intserv only provides specific service classes

• Want ability to grade or rank service classes (low/med/high, silver/gold/platinum, etc.)


Diffserv

• Diffserv assumes the whole network will be aware of Diffserv protocols, and that most traffic flows are under Diffserv

– You will be assimilated!

• Two main functions under Diffserv

– At the edges of the network, hosts or first hop routers mark the DS field in the packet header

– Within the core of the network, traffic is handled based on its class’ per-hop behavior


Diffserv

• Analogy

– The marking process is like getting your hand stamped at a night club, or having a VIP or backstage pass, or getting comped for spending too much money at a casino

• Packets are classified based on source or destination IP or port, or (transport) protocol

– There could be different marking algorithms for various classes of flows


Diffserv

• Classes of flows could follow a traffic profile

– May limit their peak transmission rate, or burstiness

– A metering function could be used to control flow rate into the network, or shape flow rates

• The differentiated services (DS) field (RFC 3260) replaces the type of service and traffic class fields from IPv4 and IPv6

– The packets are marked using the DS field


Diffserv

• Within the network, each class’ per-hop behavior (PHB) is critical to controlling its handling

– PHB influences the forwarding behavior (performance) at a Diffserv node

– No method for implementing behavior is given

– Only key is that they are externally measurable

• Two PHBs have been defined – expedited forwarding (EF), and assured forwarding (AF)


Diffserv

• Expedited forwarding (EF, RFC 3246) requires that the departure rate of a class equal or exceed a configured rate

– Implies isolation from other traffic classes, since the absolute rate is guaranteed

• Assured forwarding (AF, RFC 2597) divides traffic into four classes

– Each is guaranteed bandwidth and buffer space

– Per class there are three levels of drop preference
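
The EF and AF markings map onto the 6-bit codepoint (DSCP) carried in the DS field. The sketch below uses the recommended codepoints from RFC 3246 and RFC 2597, which the slides do not list; the helper names `af_dscp` and `ds_byte` are hypothetical.

```python
EF_DSCP = 46  # 0b101110, the recommended EF codepoint (RFC 3246)


def af_dscp(af_class, drop_precedence):
    """Recommended codepoint for AFxy: class x in 1..4, drop precedence y in 1..3."""
    if af_class not in (1, 2, 3, 4) or drop_precedence not in (1, 2, 3):
        raise ValueError("AF has four classes and three drop precedences")
    return 8 * af_class + 2 * drop_precedence


def ds_byte(dscp):
    """The DSCP occupies the upper six bits of the old ToS / traffic class byte."""
    return dscp << 2


af11 = af_dscp(1, 1)
af43 = af_dscp(4, 3)
ef_byte = ds_byte(EF_DSCP)
```

Four classes times three drop levels gives the twelve AF codepoints, from AF11 (10) to AF43 (38).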


Intserv, Diffserv, and Reality

• So while techniques exist to provide QoS guarantees, they aren’t widely used

• Why? The Internet is highly distributed, and not everyone will agree 1) to implement QoS methods, 2) on the type of QoS needed, and 3) on how to pay for it

• And if traffic levels are low, ‘best effort’ will provide the same service as Intserv/Diffserv!


RSVP Protocol

• In order to reserve resources in the Internet, a signaling protocol is needed

– The Resource reSerVation Protocol (RSVP, RFC 2205) does so

– Here we focus on bandwidth reservation, but RSVP can be used for other applications

• To work, hosts and routers need to implement the RSVP protocol


RSVP Protocol

• RSVP provides reservations for bandwidth in multicast trees (unicast is easy then)

• The receiver of the data controls getting the reservations (not the sender)

• In a session, each sender can control many flows (audio and video, e.g.)

– Each flow has the same multicast address

– To distinguish flows, use the flow identifier (IPv6)


RSVP Protocol

• RSVP does not specify how the bandwidth is obtained – see earlier scheduling methods (priority scheduling, WFQ, etc.)

• RSVP depends on other routing protocols to determine the path needed, and hence the specific links in the path

– If the path changes during the session, RSVP re-reserves the path


RSVP Protocol

• In multicast, receivers could have many different speeds – dial up, cable, DSL, etc.

• Sender doesn’t need to know the rate of all receivers, only the rate of the fastest one

• Encode the media in layers at different data rates (e.g. 20 kbps + 100 kbps)

– A slow base layer (dial up), and add other layers to enhance the image/signal for faster receivers

– Each receiver picks up the layers they can use


RSVP Protocol

• The source of a multicast advertises content using RSVP path messages through a multicast tree

– Tells the bandwidth required, timeout interval, and upstream path to the sender

• Receivers send RSVP reservation messages

– Tells what rate they want the data

– Message is passed upstream to the source, where each router allocates the bandwidth


RSVP Protocol

• Routers can merge requests for the same multicast event – saves having many reservations for the same signal!

– Hence routers only need to reserve the maximum downstream receiver rate

– Routers only send one reservation message upstream
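
Merging can be pictured as taking a maximum at each router on the way up the multicast tree. The nested-dictionary tree and the function name `upstream_request` below are a made-up representation for illustration; RSVP itself carries this information in reservation messages.

```python
def upstream_request(node):
    """Rate (kbps) that a node asks its upstream neighbor to reserve.

    A receiver is {"rate": kbps}; a router is {"children": [...]}.
    """
    if "rate" in node:
        return node["rate"]
    # A router merges its downstream requests into a single one: the maximum.
    return max(upstream_request(child) for child in node["children"])


# Layered stream: 20 kbps base layer, 120 kbps with the enhancement layer.
router_b = {"children": [{"rate": 120}, {"rate": 20}]}
router_a = {"children": [{"rate": 20}, router_b]}

request_from_b = upstream_request(router_b)
request_from_a = upstream_request(router_a)
```

Router A sends a single 120 kbps request upstream, even though three receivers sit below it.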


RSVP Protocol

• In a conference call situation, all participants are typically senders and receivers of video, so the bandwidths are added

• For audio only, it’s rarer for many to speak at once, so if one user is transmitting at rate ‘b’ bps, reserving 2*b bps is typically adequate bandwidth


RSVP Protocol

• When a router gets a reservation request, it must do an admission test

• If there isn’t enough bandwidth to handle the reservation, the router rejects the reservation and returns an error message to the receiver

– RSVP is very receiver-oriented, since they request reservations
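
A minimal sketch of such an admission test on one outgoing link, assuming the router simply tracks the total bandwidth already reserved. The `Link` class is hypothetical; a real router would also send the RSVP error message back toward the receiver.

```python
class Link:
    """Tracks reserved bandwidth on one outgoing link (rates in kbps)."""

    def __init__(self, capacity_kbps):
        self.capacity = capacity_kbps
        self.reserved = 0

    def admit(self, rate_kbps):
        # Admission test: reject if the new reservation would exceed capacity.
        if self.reserved + rate_kbps > self.capacity:
            return False
        self.reserved += rate_kbps
        return True


link = Link(capacity_kbps=1500)
first = link.admit(1000)    # first 1.0 Mbps audio flow
second = link.admit(1000)   # second one no longer fits on the 1.5 Mbps link
```

This is the call admission idea from Scenario 4: the second flow is blocked rather than letting both flows lose packets.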


Soft & Hard State Reservations

• In general, a signaling protocol can have soft or hard state reservations

• Soft state reservations will time out unless renewed periodically

– You have to say “I’m alive!” or I’ll assume you’re dead

– A system could crash and reboot, and it wouldn’t lose its reservation, if it responded fast enough

– RSVP, PIM, SIP, and IGMP use soft state
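
Soft state amounts to a table whose entries carry an expiry time that each refresh pushes back. The sketch below assumes caller-supplied timestamps and a hypothetical `SoftStateTable` class; the 30-second timeout is an arbitrary illustrative value, not one taken from any of these protocols.

```python
class SoftStateTable:
    """Reservations that vanish unless refreshed within `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.expires = {}  # reservation id -> time at which it expires

    def refresh(self, key, now):
        # Installing and renewing are the same operation: "I'm alive!"
        self.expires[key] = now + self.timeout

    def active(self, now):
        # Drop every entry whose timer has run out, then report what is left.
        for key in [k for k, t in self.expires.items() if t <= now]:
            del self.expires[key]
        return set(self.expires)


table = SoftStateTable(timeout=30)
table.refresh("r1", now=0)
table.refresh("r2", now=0)
table.refresh("r1", now=20)        # r1 renews in time; r2 stays silent
still_active = table.active(now=35)
```

No explicit tear-down message is needed: r2 disappears on its own once its timer expires.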


Soft & Hard State Reservations

• Hard state signaling means the reservation stays forever, until explicitly removed by an uninstall, delete, tear-down, or other message

– Requires a means to look for orphaned states (its installer is dead or gone)

– Typically used for reliable systems (not best effort) so few Internet protocols use it

– RSVP has optional removal of reservations


Summary

• Multimedia is huge in the Internet, and getting bigger!

– May replace circuit switched telephones

• Here we reviewed the types of multimedia apps, looked at methods for best-effort networking, added methods for QoS guarantees, and outlined architectures for providing QoS
