Page 1: Congestion Issues Stewart Bryant stbryant@cisco.com

Congestion Issues

Stewart Bryant stbryant@cisco.com

Creation of PWE3 WG

• When the PWE3 WG was formed it was put in the Transport Directorate because of concerns that the PWs would congest the Internet.

• Arguments at the time that PWs would only run over traffic engineered networks were countered by noting that in all instances where a protocol could be run over the public Internet it was eventually run over the public Internet.

• We have explicit examples of this occurring - TDM PWs that are known to be running over the Internet.

PWE3 Charter

Whilst a service provider may traffic engineer their network in such a way that PW traffic will not cause significant congestion, a PW deployed by an end-user may cause congestion of the underlying PSN. Suitable congestion avoidance mechanisms are therefore needed to protect the Internet from the unconstrained deployment of PWs.

Requirements

• RFC 3916 (Requirements for Pseudo-Wire Emulation Edge-to-Edge (PWE3)) says nothing about congestion.

• RFC 4197 (Requirements for Edge-to-Edge Emulation of Time Division Multiplexed (TDM) Circuits over Packet Switching Networks) says:

– TDM circuits run at a constant rate, and hence offer constant traffic loads to the PSN. The rate varying mechanism that TCP uses to match the demand to the network congestion state is, therefore, not applicable.

– The ability to shut down a TDM PW when congestion has been detected MUST be provided.

– Precautions should be taken to avoid situations wherein multiple TDM PWs are simultaneously shut down or re-established, because this leads to PSN instability.

– Further congestion considerations are discussed in chapter 6.5 of [RFC3985].
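One way to read the RFC 4197 precaution about simultaneous shutdown and re-establishment is as a randomised, growing hold-down timer, so that PWs stopped by the same congestion event do not all return at once. The following is only a sketch under that assumption: the function name `holddown_delay`, the 30 s base, and the 600 s cap are hypothetical and do not come from any PWE3 document.

```python
import random

def holddown_delay(attempt, base=30.0, cap=600.0, rng=random):
    """Seconds to wait before re-establishing a PW (hypothetical policy).

    attempt: how many times this PW has already been shut down for congestion.
    base, cap: illustrative values only, not taken from any RFC.
    """
    # Double the ceiling on each successive shutdown, up to a fixed cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Draw from the upper half of the range so that PWs shut down by the
    # same event come back at different times rather than in lock-step.
    return rng.uniform(ceiling / 2, ceiling)
```

A PE could call `holddown_delay(n)` after the n-th congestion shutdown of a given PW and keep the PW out of service for that long; whether exponential growth is the right policy for TDM PWs is an open design choice.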

Architecture

• RFC 3985 (PW Architecture) says:

• The PSN carrying the PW may be subject to congestion which will vary with the PSN type, the network architecture and configuration, and the loading of the PSN.

• If the traffic carried over the PW is known to be TCP friendly no additional congestion avoidance action is necessary.

• If the PW is operating over a PSN that provides enhanced delivery, the PEs should monitor packet loss to ensure that the requested service is actually being delivered.

• If it is not, then the PE should assume that the PSN is providing a best-effort service and should use a best-effort service congestion avoidance measure.

• If best-effort service is being used and the traffic is not known to be TCP friendly, the PEs should monitor packet loss to ensure that the loss rate is within acceptable parameters. Packet loss is considered acceptable if a TCP flow across the same network path and experiencing the same network conditions would achieve an average throughput, measured on a reasonable timescale, not less than that which the PW flow is achieving.

• This condition can be satisfied by implementing a rate-limiting measure in the NSP, or by shutting down one or more PWs.

• The choice of which approach to use depends upon the type of traffic being carried.

• Where congestion is avoided by shutting down a PW, a suitable mechanism must be provided to prevent it from immediately returning to service and causing a series of congestion pulses.

• The comparison to TCP cannot be specified exactly but is intended as an "order-of-magnitude" comparison in timescale and throughput. The timescale on which TCP throughput is measured is the round-trip time of the connection. In essence, this requirement states that it is not acceptable to deploy an application (using PWE3 or any other transport protocol) on the best-effort Internet, which consumes bandwidth arbitrarily and does not compete fairly with TCP within an order of magnitude.

• One method of determining an acceptable PW bandwidth is described in [RFC3448].
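The method in [RFC3448] is the TFRC throughput equation, which estimates the rate a TCP flow would achieve for a given packet size, round-trip time, and loss event rate. The sketch below evaluates that equation and applies the "order-of-magnitude" comparison from RFC 3985. The helper `pw_is_acceptable` and its `slack=10.0` factor are assumptions made for illustration; RFC 3985 does not define a numeric threshold.

```python
from math import sqrt

def tfrc_rate(s, rtt, p, b=1, t_rto=None):
    """TCP throughput equation of RFC 3448 (section 3.1), in bytes/second.

    s: packet size in bytes, rtt: round-trip time in seconds,
    p: loss event rate (0 < p <= 1), b: packets acknowledged per ACK.
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt  # simplification recommended by RFC 3448
    denom = (rtt * sqrt(2 * b * p / 3)
             + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denom

def pw_is_acceptable(pw_rate, s, rtt, p, slack=10.0):
    """Hypothetical check: is the PW within `slack` times the TCP rate?

    slack=10.0 is one reading of "order of magnitude", not a value from an RFC.
    """
    return pw_rate <= slack * tfrc_rate(s, rtt, p)
```

For example, `pw_is_acceptable(256_000, 1460, 0.1, 0.01)` asks whether a 256 kB/s constant-rate PW is tolerable on a path with a 100 ms round-trip time and a 1% loss event rate.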

The Encapsulation Drafts

• The packet and ATM Encapsulation drafts say little about congestion, other than to reference RFC3985.

• The TDM drafts introduce the special considerations that apply to TDM, but do not provide a congestion avoidance design.

ATM draft

• draft-ietf-pwe3-atm-encap-11.txt

• DISCUSS: RFC3985 also says: "Where congestion is avoided by shutting down a PW, a suitable mechanism must be provided to prevent it from immediately returning to service and causing a series of congestion pulses." I don't see such a mechanism in this draft.

• Discuss resolved by agreeing that PWE3 WG should work on the issue of congestion.

PWE3 work on congestion

The only attempt to address PWE3 congestion has been the draft:

PWE3 Congestion Control Framework draft-rosen-pwe3-congestion-03.txt

which we summarise in the next slides.

Congestion Control and PWE3

• Congestion:

– Everyone sends as much as they want

  • less and less traffic gets through,

  • because more and more is dropped

• Congestion Control:

– Feedback loop:

  • packet loss causes transmission rates to go down

  • absence of packet loss allows rates to climb
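The feedback loop above can be illustrated with an additive-increase, multiplicative-decrease (AIMD) rule, the classic TCP-style response. This is a generic sketch, not a mechanism from the PWE3 drafts; the function name and the `increase`, `decrease`, and `floor` values are assumptions.

```python
def aimd_step(rate, loss_seen, increase=1.0, decrease=0.5, floor=1.0):
    """Return the next sending rate after one feedback interval.

    loss_seen: True if packet loss was reported in the last interval.
    increase, decrease, floor: illustrative parameters, in arbitrary rate units.
    """
    if loss_seen:
        # Packet loss causes the transmission rate to go down (multiplicatively).
        return max(floor, rate * decrease)
    # Absence of packet loss allows the rate to climb (additively).
    return rate + increase

# Example run: two quiet intervals, one with loss, then one quiet interval.
rate = 10.0
for loss in [False, False, True, False]:
    rate = aimd_step(rate, loss)
```

In this run the rate climbs from 10 to 12, halves to 6 on the loss report, and then resumes climbing.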

Does PWE3 Need It?

• No, PWE3 mostly carries IP payloads, already congestion controlled

• No, PWE3 only runs in environments in which congestion is impossible:

– Never on public Internet

– Always traffic engineered and policed

• First point worth considering, second seems doubtful.

• So maybe the answer is yes.

Existing Solutions Apply?

• Existing solutions aimed at end-system implementations

• PWE3 aimed at router implementation:

– No acks

– No complex state machines invoked in the data path

– Limited interactions between:

  • hardware and software,

  • line cards,

  • line cards and central processor, etc.

• Rules out, e.g., use of TCP to carry PWs

– And much else

Detecting Congested PWs

• By sequence number gaps: problematic

• Periodic marking of PW data stream:

– VCCV packets with transmit counts (inserted by hardware just before transmission??)

– Receiving hardware inserts receive count and passes to processor

– Report discrepancies via control plane, perhaps per-tunnel

• Per-tunnel approximation??
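The periodic-marking idea can be sketched as comparing the growth of the sender's transmit count with the growth of the receiver's receive count between two marker packets. The `CountSample` record and `loss_rate` function below are hypothetical; the slides do not define a packet format, and counter wrap-around is ignored here for simplicity.

```python
from dataclasses import dataclass

@dataclass
class CountSample:
    """Counts carried by one marker packet (hypothetical layout)."""
    tx_count: int  # packets sent so far, stamped by the sending hardware
    rx_count: int  # packets received so far, stamped by the receiving hardware

def loss_rate(prev, curr):
    """Fraction of packets lost between two consecutive marker packets."""
    sent = curr.tx_count - prev.tx_count
    received = curr.rx_count - prev.rx_count
    if sent <= 0:
        return 0.0  # nothing was sent in this interval, so nothing to report
    # Clamp at zero in case reordering makes the receiver appear ahead.
    lost = max(0, sent - received)
    return lost / sent
```

The control plane would then report only intervals whose `loss_rate` exceeds some threshold, possibly aggregated per tunnel rather than per PW, as the slide suggests.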

Responding to Loss

• AIMD vs. TFRC vs. ??

– TFRC is slower responding, and rate based, may be more appropriate

– Don’t want response too slow, but really don’t want false alarms

• Why TCP-friendly?

– May be on Internet

– An SP may have paying non-PWE3 customers

• Adjusting rates on TDM PWs

– Selective stopping of channels?

Proposal

1. Adopt draft-rosen-pwe3-congestion-03.txt as a WG draft and use as a framework for the congestion solution.

2. Form a design team consisting of PWE3, L2TP and congestion experts to write a protocol draft to address the PWE3/L2TPv3 congestion problem.