IP Multicasting - KTH | V¤lkommen till KTH

IP Multicasting

Internetworking

Olof Hagsand

Literature:Forouzan, TCP/IP Protocol Suite Ch 10, 15

IP Multicast applications

● Unicast is point-to-point● But many applications relays the same information to

many.● Examples:

– Radio

– TV – Conferencing

– Distribution of control information

– Distributed games● But in fact, few use native multicasting today, except

control and IPTV

IP Multicast Summary

● The Internet abstraction of hardware multicasting.● Prime architect: Steve Deering● Group addresses (class D)● Exploits multicast-capable networking hardware if available● Best-effort delivery semantics (unreliable)● Receiver-based multicast:

– Senders send to any group

– Receivers join groups

● Dynamic group membership– Hosts leave and join groups dynamically

● Notation:

– (S, G) – specific sender S to group G

– (*, G) – all senders to group G

Reliable multicast

● There is no “multicast-TCP”. Why?

● Problem: how to deal with all acknowledgments

– TCP-like ACKs would cause “ACK-implosions”

● Ideas:

– Acknowledgment aggregation points - keep copies of data for retransmission cases

– Use NACKs (Negative acknowledgements)

– Send redundant information so that lost information can be recomputed from the information received, i.e. use forward error correcting codes (FEC)

● There is no general-purpose reliable multicast protocol for the Internet

– Only application-specific

IP Multicast Architecture

3. Host-to router protocol

2. Link-level Multicast

4. Intra-domain Multicast Routing

1. Multicast in hosts

5. Inter-domain Multicast Routing

IP Multicast Addresses● IP-multicast addresses, class D addresses (binary prefix: 1110)

224.0.0.0 - 239.255.255.255 ● 28 bit multicast group id● Selected addresses reserved by IANA for special purposes:

Scope restricted to one site239.253.0.0/16

Global Scope224.0.1.0 – 238.255.255.255

All Routers on this subnet224.0.0.2All Systems on this subnet224.0.0.1

DVMRP Routers224.0.0.4RIP Routers224.0.0.9

Scope restricted to one organization239.192.0.0/16

Limited Scope239.0.0.0-239.255.255.255

Local Network Control Block (dont forward)

224.0.0.0 – 224.0.0.255DescriptionAddress

Multicast in hosts● Send is easy – just use a class D

address as destination

● Receive is more complicated

● Processes create sockets and bind to port

● Process calls setsockopt(s, IP_PROTO, IP_ADD_MEMBERSHIP)

– Joins multicast group

● IP-stack sends IGMP report on network

● More than one process may join same group

– Can use different (UDP) ports

● But only one IGMP report is sent

● When last process leaves IGMP leave is sent

Process1

Socket

Kernel

Userspace

Process2

UDPIP

Link

Network

Link-level/Hardware Multicast

● Ethernet - good example of hardware multicast– Most Ethernet NICs support multicast

● Ethernet multicast addresses:

– The low order bit of the high order byte is 1:

*1:**:**:**:**:**● Many NICs on the same network may listen to the same

Ethernet multicast address.● Other Link-level layers may not support multicast, being

NBMA (Non-Broadcast Multiple Access)– eg ATM, FR, X25

Mapping IP Multicast to Ethernet

● To use HW multicast on a LAN, the IP multicast address is translated to an Ethernet multicast address. – How is this done?

● The 23 low order bits of the IP multicast address placed in the 23 low order bits of the Ethernet IP multicast address: 01:00:5E:00:00:00

● Example, IP multicast address 227.141.54.33 (0xE38D3621):0xD3621 into 01:00:5E:00:00:00 --> 01:00:5E:0D:36:21

● IP to Ethernet multicast address mapping is not unique! – 32:1 overlap

– IP may receive multicast despite the lack of receiving process

– IP-layer must be able to do filtering (based on IP multicast address)

Host to router: IGMP

● Group membership communication between hosts and multicast routers

– Not for routing of multicast packets

● IGMP enables routers to maintain group members to each router interface

– Without IGMP, routers would have to broadcast all multicast packets

● Internet Group Management Protocol

– version 1 – RFC 1112 (Historic)● Query from router and response from host

– version 2 – RFC 2236 (most common today)● Leave group by host

– version 3 – RFC 3376 (coming up)● Source filtering

Position of IGMP in TCP/IP

● Part of the network layer– Encapsulated in IP (like ICMP)

● IGMP messages always addressed to a multicast address – often all systems (224.0.0.1), all routers (224.0.0.2)

– or to a specific multicast group

IGMPv2 Messages

● General membership query– Sent regularly by routers to query all membership

● Specific membership query – Sent by routers to query specific group membership

● Membership report– Sent by hosts to report joined groups

● Leave group – Sent by hosts to leave groups

Host Behaviour

● A process joins a multicast group on a given interface● Host sends IGMP report to group address when first process joins a

group.

– Host keeps a table of all groups which have a reference count > 0

● Host sends IGMP leave to 224.0.0.2 when last process leaves group

– In IGMPv1 hosts did not send explicit leaves

● Router sends IGMP queries to 224.0.0.1 at regular intervals.

– general query: group = 0.0.0.0

– specific group query: group = multicast address of the group

● Host responds to IGMP query by sending IGMP report to group address

– Hosts snoop for other hosts’ reports

– Set random timer Suppress if other host on same segment sends it

Dynamics of IGMP Messages

RouterHost

General Membership QuerySent to 224.0.0.1

General Membership QuerySent to 224.0.0.1

Membership reportSent to M

join multicast group M

Membership reports (M, N,...)

Leave group M – sent to 224.0.0.2

Specific Membership QuerySent to M

Random timer

leave multicast group M

Periodic ~60s

IGMP v3

● Allows selection of senders – not only groups– Enables: Source specific multicast

● A host can join a group and specific sender:– (S, G) not only (*, G)

● This may allow for pruning of certain senders● IGMP v3 is not commonly deployed.

IGMP snooping

● IGMP is an L3 protocol. Switches are L2.● When a multicast L2 message comes in to a switch, it

does not know where the members are -> must flood on all ports.

● For large bridged networks, this may waste a lot of bandwidth, especially for high bw applications (eg IPTV)

● Bridges may therefore listen to IGMP messages, and prune/graft traffic based on this.

● This is called IGMP snooping● Many modern bridges have this capability

Multicast Router

● Listens to all multicast traffic and forwards if necessary.● Multicast router listens to all multicast addresses.

– Ethernet: 223 link layer multicast addresses

– Listens promiscuously to all LAN multicast traffic

● Communicates with directly connected hosts via IGMP● Communicates with other multicast routers with

multicast routing protocols

Multicast in routers

● Packets are replicated to many output ports

● Forwarding used to be done in slowpath – in the main CPU

● Modern routers can do forwarding in the linecards or in hardware

● Replication can be made in:

– incoming linecards,

– backplane

– outgoing linecards

● Routes computed by main CPU by multicast routing protocol are installed in the linecard FIB

Network

LC 1 LC 2 LC 3

Backplane

CPU

Multicast Routing● A packet received on a router is forwarded on many

interfaces

● The network replicates the packets – not the hosts

● All routing protocols aim at building delivery trees through a network

● Mostly multicast routing is more concerned where packets come from instead of where they are going

Delivery Trees

● Group Shared Trees: (*, G)

– Same delivery tree for all senders

– Router state: O(G)

– But suboptimal paths and delays

– Unidirectional or Bidirectional

● Source Based Trees: (S, G)

– Different delivery tree for each sender

– Router state: O(S*G),

– Optimal paths and delay.

● Data-driven / Push

– Trees built when data packets are sent

● Demand-driven / Pull

– Build when members join

Group Shared Trees

Rendezvouspoint

Sender

Receiver

Sender

Receiver

Receiver

● Build one common tree per group (*, G)

– Bidirectional – data flows up and down the tree

– Unidirectional – data flows only downwards in the tree - must first be sent to rendez-vous point (in the figure)

Source Based Trees

Sender

Receiver

Sender

Receiver

Receiver

● Build shortest path trees, one per sender and group

● (S,G)

Protocol classification● Dense-mode protocols

– Intended for domains with high receiver density

– Push, Source trees

– Examples: PIM-DM, DVMRP

● Sparse-mode protocols

– Intended for domains with small receiver density

– Pull, shared trees (not always,..)

– Examples: PIM-SM, CBT

● Link-state protocols

– Source trees– Examples: MOSPF

Reverse Path Forwarding (RPF)

● Basic forwarding principle in multicast forwarding● Forward a multicast datagram only if it arrives on the

interface used to send unicast to the source– Send out on all other interfaces– Flooding!

● Make a lookup of the source address in the FIB!

shortestpath

packet forwarded

not shortestpath

packet is dropped

DVMRP

● Distance-Vector Multicast Routing Protocol

– Based on unicast distance vector (eg RIP)

● DVMRP is data-driven/push and uses source-based trees

● DVMRP builds Truncated Broadcast Trees

– Reverse Path Broadcasting (Forouzan)

– Builds routing tables with source networks

– Using poison reverse (infinity is 32 in DVMRP)

● DVMRP floods trees with data

● Then prunes and grafts tree depending on receiver dymanics.

Example: Building truncated broadcast trees (1)

N

R1 R2

S1E0 E0 E1

● Assume sender S1 on network N

● Two routers R1 and R2 with tables as shown

● R1 sends DVMRP Router Report message on 224.0.0.4 to neighbors (every ~60s)

N 1 E1 1


N

R1 R2

S1 E0 E0 E1

● R2 adds 1 to metric and add in its table

● Next report sent, it adds 32 (poison reverse) for routes it learned on that interface

● Now R1 knows it is upstreams for sources from N,

● And adds E0 in its delivery tree for N

N 2 E0N 1 E1

Delivery tree

34


N

R1 R2S1

E0 E0

● Both R3 and R4 announces N on shared network M

● The one with lowest metric (R3) is elected as designated router for forwarding multicast from N on network M

● R4 truncates the delivery tree.

N 2 E0N 1 E1

S0 S0

E0 E0

N 3 S0N 2 S0

M

R3R4

34

1

2 3

341 352

E1


N

R1 R2S1

E0 E0

● Multicast traffic from S1 will flow according to the final delivery tree below.

● The tree is a shortest path tree – a source-based tree.

S0 S0

E0 E0 M

R3R4

E1

S0

Multicast forwarding: flooding When multicast data arrives from sender S1 to a group G,

– An RPF check is made using the DVMRP table● Packets arriving on wrong interface are discarded

– An (S, G) entry is created in the router (data-driven) with the outgoing interface list according to the truncated broadcast tree.

– Packets are flooded according to the (S, G) entry● Example: (S,G): E0, S0

N

R1S1

E0 E1

S0

DVMRP pruning● But suppose there are no receivers on a segment

– Router knows this because of IGMP join/leaves

● Router sends DVMRP Prunes upwards in the tree

● Interface removed from (S,G) entry

– If (S,G) entry empty, send prune message upwards in tree

RPM Pruning Example

Xprune tree: no members

XParent router

propagate pruning msgs up shortest path tree

X

X

DVMRP grafting● Suppose a new receiver joins a group

– Router knows this because of IGMP join

● Router sends DVMRP Graft upwards in the tree

● Interface added to (S,G) entry upstreams

– If new (S,G) entry, send graft messages upwards in tree

● Forouzan calls truncated broadcast trees + pruning + grafting for reverse path multicasting.

Link-State Multicast: MOSPF● Add multicast to a given link-state routing protocol

– MOSPF

● Uses the multiprotocol facility in OSPF to carry multicast information

● Extend LSAs with group-membership LSA

– Only containing members of a group

● Uses the link-state database in OSPF to build delivery trees

– Every router knows the topology of the complete network

– Least-cost source-based trees using metrics

– One tree for all (S,G) pairs with S as source

● Expensive to keep all this information

– Cache active (S,G) pairs

– MOSPF is Data-driven: computes Dijkstra when datagram arrives

Core Based Tree - CBT

● CBT - Core Based Trees● Group shared multicast trees – (*, G)● Demand-driven

– Routers send join messages when hosts join groups

● Divide the internet into regions where each region has a core router

● When a host joins a multicast group the nearest multicast router attaches to the forwarding tree by sending a join request towards its core router

● Senders sends multicast datagrams to core router encapsulated in unicast datagrams – unidirectional shared trees

Protocol Independent Multicasting - PIM

● PIM relies on any unicast routing tables for its RPF check

– Does not build its own tables (as DVMRP)

● Split into two protocols for different uses

– PIM-DM – PIM Dense mode

– PIM-SM – PIM Sparse mode

● PIM - Dense Mode– Broadcast-and-prune strategy – similar to DVMRP

PIM-SM shared tree joins

● Explicit joins – traffic only reaches the part of the network where there is traffic

● Joins travel up to rendez-vous point to join the shared tree

– (*, G) join● Install a new (*,G) entry for every new interface that

the join is received on

PIM-SM join example (1)

IGMP join

(*, G) Join

Receiver

Sender


(*, G) Join

Receiver

Receiver

IGMP join

Sender

PIM-SM shortest path trees

● For performance reasons, PIM-SM tries to build source-trees after the shared tree join

– Most vendors: try to build source tree directly

● As soon as a receiving router gets its first multicast data packet from a new source S, it sends an (S, G) Join upwards towards the source.


(S, G) Join

Receiver

Receiver

Sender S

Registering sources

● PIM-SM trees are unidirectional, senders must send to RP to distribute traffic.

– All routers must know the Group-to-RP mapping● Upstreams routers sends PIM Register messages

towards the RP

– Multicast data is encapsulated within

● The RP then joins the source tree!

– Sends a (S,G) Join towards the source

PIM-SM source registring

Register

Sender S

(S, G) Join

MSDP – Multicast Source Discovery Protocol

● Discovering sources is a major problem in PIM, especially in the inter-domain routing case:

– One AS manages an independent RP for that domain – following the autonomous systems architecture

– But what if there are senders in other domains?

● MSDP operation:

– The RPs in different domains communicate with each other on a peer-to-peer basis

– The RP keeps track of all senders in one domain and sends MSDP Source Active (SA) messages to the RPs in the other domains

– If an RP has receivers in a group G, and discovers a sender S in another domain, it performs a (S,G) Join towards S.

– SA messages are reliably flooded to all RP peers● Concern for scalability

MSDP Example(1)

● Assume a group G and a sender S in AS3 and receiver R in AS5

● S starts sending to G and registers with RP3 in AS3 and R has joined the shared tree in AS5

● RP3 floods Source Active messages for S

AS1

AS5

AS2

AS3AS4

R

S RP3

RP1 RP2

RP4

RP5

SA(S,G)SA(S,G)SA(S,G)

SA(S,G) SA(S,G)

SA(S,G)

MSDP Example(2)

● RP5 receives the SA (S,G), and since it has receivers in its domain,

● It joins the source-specific tree (S,G)

AS1

AS5

AS2

AS3AS4

R

S RP3

RP1 RP2

RP4

RP5Join(S,G)Join(S,G)

Join(S,G)Data

MBGP

● Multiprotocol BGP (Multicast BGP)● MSDP and PIM uses regular unicast routing to make

RPF● But sometimes, one may wish to traffic engineer the

multicast traffic, without affecting normal unicast traffic● Advertise new attributes:

– MP_REACH_NLRI/ MP_UNREACH_NLRI● Makes it possible to advertise different paths for unicast

and multicast traffic flow between AS:s

MBGP Example

● AS3 announces prefix S/24 to AS2 for unicast, but to AS4 for multicast

● Now, unicast traffic to S flows a different way than multicast traffic from S

AS1

AS5

AS2

AS3AS4

R

S/24 RP3

RP1 RP2

RP4

RP5

Announce S(unicast)

Unicast to S

Multicast from S

Announce S(multicast)

Source-specific multicast

● Regular IP multicast model:

– A receiver R joins a group G (*, G)

– R receives all traffic destined to G

– Senders unknown a priori -> The routing protocol must detect senders

● Source-specific multicast

– A receiver R joins a group G for sender S only

– R receives traffic only from S destined to G

– The routing protocol need not detect sender

● Requires changes to socket API

● IGMPv3

● But detection of senders not necessary

Tunneling● All routers need not be multicast enabled - does this mean

that we cannot reach all hosts that want to join a multicast group on the Internet?

– No, because we can use tunneling over non-multicast enabled sub-nets

– VPN and Ipv6 also often works like this

● This is the way the MBONE – the Multicast BackBONE is constructed

– Islands of multicast enabled routers interconnected by tunnels

Non-multicast enabled network

Multicast enabled routers

Deployment

● Multicast Routing is in general not deployed in current networks

● Some sites (eg ”stadsnät”) have deployed local multicast delivery– IPTV distribution may be the killer application

● IP Multicast is slowly gaining acceptance

Summary: IP Multicast

● Multicast routing uses network resources more efficiently than unicast emulation

● IP multicast is receiver-based and uses best effort delivery

● Multicast architecture

– Host behaviour

– Link-level multicast

– Host-to router: IGMP

– Multicast routing

– Inter-domain multicast routing

● Multicast routing protocols

– DVMRP, MOSPF, CBT, PIM-DM, PIM-SM, MSDP, MBGP

Documents

IP Multicasting - KTH | V¤lkommen till KTH