Recent collapses of SIP servers in the carrier networks indicate that the built-in SIP overload control mechanism cannot mitigate overload effectively. In this paper, by employing a control-theoretic approach that models the interaction between an overloaded downstream server and its upstream server as a feedback control system, we investigate the root cause of SIP server crash by studying the impact of the retransmission on the queuing delay of the overloaded server. Then we design a PI rate controller to mitigate the overload by regulating the retransmission rate based on the round trip delay. We derive the guidelines for choosing PI controller gains to ensure the system stability. Our OPNET simulation results demonstrate that our proposed control theoretic approach can cancel the short-term SIP overload effectively, thus preventing widespread SIP network failure. Survey on SIP overload control algorithms: Y. Hong, C. Huang, and J. Yan, “A Comparative Study of SIP Overload Control Algorithms,” Network and Traffic Engineering in Emerging Distributed Computing Applications, Edited by J. Abawajy, M. Pathan, M. Rahman, A.K. Pathan, and M.M. Deris, IGI Global, 2012, pp. 1-20. http://www.igi-global.com/chapter/comparative-study-sip-overload-control/67496 http://www.researchgate.net/publication/231609451_A_Comparative_Study_of_SIP_Overload_Control_Algorithms
IEEE ICC 2011
Yang Hong, Changcheng Huang, and James Yan
Department of Systems and Computer Engineering,
Carleton University, Ottawa, Canada
What is SIP?
• Session Initiation Protocol: a protocol that establishes and manages (multimedia) sessions [RFC 3261]
• Used for VoIP, presence & video conference
• SIP consists of two basic elements: UA (User Agent) and P-Server (Proxy Server)
• About 1000 companies produce SIP products; Microsoft’s Windows Messenger (≥4.7) includes SIP
[Figure: Simplified SIP Network Configuration (UA, Proxy Server, Internet, Proxy Server, UA)]
IMS SIP Server Overload – A Performance Management Challenge
• 3GPP has adopted SIP as the basis of IMS architecture
• Problem: Server(s) cannot complete the processing of requests under overload conditions
• Multiple causes: insufficient capacity, component failures, unexpected traffic surges, DoS attacks [RFC 5390]
• Impact: performance degradation, drop in throughput, revenue loss, network collapse
[Figure: Simplified IMS Control Layer Overview]
Why Worry About SIP Message Retransmission?
• Retransmission is built in to maintain SIP reliability against message loss
• Loss is detected as a long delay in acknowledgment
• A surge in user demand can cause SIP server overload and a long delay in acknowledging SIP messages
• Long delays may trigger more retransmissions, creating a positive feedback loop that exacerbates server overload
Contributions of This Paper
• Using a control-theoretic approach to model the interaction of an overloaded server and its upstream server as a feedback control system
• Proposing the Round Trip Delay Control (RTDC) algorithm (a PI rate control algorithm) to mitigate the overload by regulating retransmissions and clamping the round trip delay below a desirable target value
• Performing OPNET simulations under two typical overload scenarios to validate the RTDC (implicit SIP overload control) algorithm
Outline
• SIP Retransmission Mechanism Overview
• Related Work on SIP Overload Control
• Queuing Dynamics of Overloaded Server
• Control-Theoretic Design for Overload Control Based on Round-Trip Delay
• Performance Evaluation to Validate RTDC SIP Overload Control Algorithm
• Conclusions
Typical SIP Procedure
Retransmission Mechanism
• Purpose: confirmation of successful transmission between UA and UA via P-servers
• Two types:
  • Hop-by-Hop: the first retransmission occurs after T1, and each subsequent interval is 2 times the previous one. Total intervals up to 64 x T1 (maximum 6 retransmissions). Default T1 = 0.5 s.
  • End-to-End: the first retransmission occurs after T1, and each subsequent interval is 2 times the previous one, up to a maximum of T2. Total intervals up to 64 x T1 (maximum 11 retransmissions). Default T2 = 4.0 s.
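The doubling timer schedule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `retransmission_times` and its parameters are hypothetical, and the sketch assumes that a retransmission fires only while the elapsed time is strictly below 64 x T1. The count it produces for the capped (T2) case depends on that boundary convention and on whether the initial transmission is counted, so it should not be read as reproducing the slide's maxima exactly.

```python
from typing import List, Optional


def retransmission_times(t1: float = 0.5,
                         cap: Optional[float] = None,
                         limit_factor: int = 64) -> List[float]:
    """Return the times (seconds after the first send) at which
    retransmissions fire, assuming no response ever arrives.

    t1           -- initial retransmission interval (T1)
    cap          -- optional upper bound on the interval (T2); None = no cap
    limit_factor -- the transaction gives up at limit_factor * t1
    """
    deadline = limit_factor * t1
    times: List[float] = []
    interval = t1
    elapsed = interval
    while elapsed < deadline:
        times.append(elapsed)
        # Double the interval, optionally clamping it at the cap.
        interval = interval * 2 if cap is None else min(interval * 2, cap)
        elapsed += interval
    return times


hop_by_hop = retransmission_times(t1=0.5)        # uncapped doubling
capped = retransmission_times(t1=0.5, cap=4.0)   # doubling clamped at T2
print(hop_by_hop)
print(capped)
```

With the default T1 = 0.5 s, the uncapped schedule fires six times before the 32 s deadline, which matches the "maximum 6 retransmissions" stated for the hop-by-hop case.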
Related Work on Overload Control
• Most existing overload control solutions adopt a push-back mechanism, which can
  • cancel the overload effectively by introducing overhead to notify upstream servers to reduce their message sending rate
  • produce overload propagation from server to server until it reaches end-users
  • block a large number of calls unnecessarily, causing revenue loss for service providers
• Our Proposal: Reduce the retransmission rate only to mitigate overload, maintaining the original message rate to keep the revenue of service providers
SIP Overload Control Mechanism Classification
Figure 3. The classification for the existing SIP overload control schemes
Y. Hong, C. Huang, and J. Yan, “A Comparative Study of SIP Overload Control Algorithms,” Network and Traffic Engineering in Emerging Distributed Computing Applications, Edited by J. Abawajy, M. Pathan, M. Rahman, A.K. Pathan, and M.M. Deris, IGI Global, 2012, pp. 1-20.
Queuing Dynamics of Overloaded Server
[Figure: queuing model of Server 1 and Server 2, showing Invite requests, 100 Trying responses, the message buffers $q_1(t)$ and $q_2(t)$, and the timer buffer $q_{r1}(t)$ of Server 1 (timer starts, timer is reset, timer expires and fires)]
Queuing dynamics of Server 2: $\dot{q}_2(t) = \lambda_2(t) + r_2(t) + \nu_2(t) - \mu_2(t)$  (1)
Queuing dynamics of Server 1: $\dot{q}_1(t) = \lambda_1(t) + r_1(t) + r'_2(t) + \nu_1(t) - \mu_1(t)$  (2)
Notation: $\lambda_1(t)$ original message rate, $r_1(t)$ message retransmission rate, $\mu_2(t)$ service rate, $\nu_1(t)$ response rate, $q_1(t)$ queue size
Overload Scenario: Server slowdown at Server 2 due to routine maintenance
Overload Collapse: $\lambda_2(t) + \nu_2(t) > \mu_2(t)$ (see Eq. (1)) ⇒ $\dot{q}_2(t) > 0$ ⇒ $q_2(t)$ increases ⇒ triggers $r'_2(t)$ ⇒ $r_2(t)$ increases ⇒ $q_2(t)$ increases more quickly
Overload Propagation: $r'_2(t)$ enters Server 1 (see Eq. (2)) ⇒ $\dot{q}_1(t) > 0$ ⇒ $q_1(t)$ increases
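The queue balance on this slide says that a server's queue grows at the rate by which incoming traffic (original requests, retransmissions, and responses) exceeds the service rate. A minimal fluid-model sketch of that balance, using forward Euler integration, is shown below. The function name `simulate_queue`, the step size, and the constant rates are illustrative assumptions, not values or code from the paper.

```python
from typing import Callable, List, Tuple

Rate = Callable[[float], float]  # a rate in messages/sec as a function of time


def simulate_queue(arrival: Rate, retrans: Rate, response: Rate, service: Rate,
                   q0: float = 0.0, dt: float = 0.01,
                   horizon: float = 30.0) -> List[Tuple[float, float]]:
    """Euler integration of dq/dt = arrival + retrans + response - service.

    The queue size is clamped at zero because a buffer cannot hold a
    negative number of messages. Returns a list of (time, queue size).
    """
    q = q0
    trace: List[Tuple[float, float]] = []
    steps = int(round(horizon / dt))
    for k in range(steps):
        t = k * dt
        dq = arrival(t) + retrans(t) + response(t) - service(t)
        q = max(0.0, q + dq * dt)
        trace.append((t + dt, q))
    return trace


# Hypothetical slowdown: 200 msg/s arrive, only 100 msg/s are served.
slow = simulate_queue(lambda t: 200.0, lambda t: 0.0,
                      lambda t: 0.0, lambda t: 100.0)
# Same slowdown with an extra 50 msg/s of retransmissions on top.
worse = simulate_queue(lambda t: 200.0, lambda t: 50.0,
                       lambda t: 0.0, lambda t: 100.0)
print(slow[-1][1], worse[-1][1])
```

After 30 s the first run ends near 3000 queued messages and the second near 4500, which illustrates why retransmissions triggered by delay make the queue grow more quickly. A closed-loop version would make the retransmission rate depend on the queuing delay; that coupling is left out here to keep the sketch short.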
Overload Controller Design
Upstream Server 1 can process all arrival messages without any delay
• before the overload is propagated from its downstream Server 2
• $\lambda_2(t) = \lambda_1(t)$ and $r_2(t) = r'_2(t)$ (see the previous slide)
Queuing dynamics of Server 2: $\dot{q}_2(t) = \lambda_1(t) + r_2(t) + \nu_2(t) - \mu_2(t)$  (3)
Queuing delay of Server 2: $\dot{\tau}_2(t) = [r_2(t) + \lambda_1(t) + \nu_2(t) - \mu_2(t)] / \mu_2(t)$  (4)
• Each request message corresponds to a response message [SIP RFC]
• Thus the request message service rate (i.e., the response message rate $\nu_1(t)$) can approximate the total service rate $\mu_2(t)$
Queuing delay of Server 2: $\dot{\tau}_2(t) = [r_2(t) + \lambda_1(t) + \nu_2(t) - \nu_1(t)] / \nu_1(t)$  (5)
• The round trip delay of the upstream server can approximate the queuing delay $\tau_2(t)$ of the overloaded downstream server when overload happens and queuing delay is dominant
PI controller regulates the retransmission rate $r'_2(t)$:
$r'_2(t) = K_P e(t) + K_I \int_0^t e(\eta)\,d\eta = K_P(\tau_0 - \tau_2(t)) + K_I \int_0^t (\tau_0 - \tau_2(\eta))\,d\eta$
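A PI rate controller of the kind named on this slide can be sketched as a generic discrete-time update: the error is the target delay minus the measured round trip delay, and the allowed retransmission rate is a proportional term plus an accumulated integral term. The class below is an illustrative assumption, not the paper's OPNET implementation. The class name `PIRateController`, the sampling period `dt`, and the clamping of the output to a non-negative range are all choices made here for the example.

```python
class PIRateController:
    """Discrete-time PI law: rate = Kp * e + Ki * sum(e * dt),
    where e = target_delay - measured_delay."""

    def __init__(self, kp: float, ki: float, target_delay: float,
                 dt: float, max_rate: float = float("inf")) -> None:
        self.kp = kp
        self.ki = ki
        self.target_delay = target_delay  # desired round trip delay (s)
        self.dt = dt                      # sampling period (s)
        self.max_rate = max_rate          # optional upper bound on the rate
        self.integral = 0.0               # running integral of the error

    def update(self, measured_delay: float) -> float:
        """Return the retransmission rate allowed for the next period."""
        error = self.target_delay - measured_delay
        self.integral += error * self.dt
        rate = self.kp * error + self.ki * self.integral
        # Assumption of this sketch: a rate cannot be negative, so the
        # output is clamped to [0, max_rate].
        return min(max(rate, 0.0), self.max_rate)


ctrl = PIRateController(kp=2.0, ki=1.0, target_delay=0.5, dt=1.0)
first = ctrl.update(0.25)   # delay below target: retransmissions allowed
second = ctrl.update(1.0)   # delay above target: retransmissions cut off
print(first, second)
```

When the measured delay exceeds the target, the error turns negative and the allowed retransmission rate drops, which is the negative feedback that counters the positive feedback of timer-driven retransmissions. A production controller would typically also guard against integral wind-up; that is omitted here.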
Feedback Overload Control System
Figure 4. Block diagram of feedback SIP overload control system
Control plant: $P(s) = \tau_2(s)/r'_2(s) = \mathcal{L}\{\tau_2(t)\}/\mathcal{L}\{r'_2(t)\} \approx 1/(\nu_1 s)$
PI controller: $C(s) = K_P + K_I/s$
Open-loop overload control system: $G(s) = C(s)P(s) = (K_P + K_I/s)/(\nu_1 s)$
A positive phase margin $\phi_m$ of $G(s)$ can guarantee control system stability
PI controller gains $K_P$ and $K_I$ can be obtained based on the phase margin $\phi_m$ [gain formulas in terms of $\tan(\phi_m)$ and $1 + \tan^2(\phi_m)$]
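One way to obtain PI gains from a phase margin can be sketched numerically. For an open loop of the form G(s) = (K_P + K_I/s)/(ν1 s), requiring |G(jω_g)| = 1 and a phase of −180° + φ_m at a chosen gain crossover frequency ω_g gives K_P = ν1 ω_g tan(φ_m)/sqrt(1 + tan²(φ_m)) and K_I = ν1 ω_g²/sqrt(1 + tan²(φ_m)). The crossover frequency ω_g is a symbol introduced here for the derivation, and these expressions are derived for this example; they may not match the exact form used on the slide. The function names and numeric values below are likewise hypothetical.

```python
import cmath
import math
from typing import Tuple


def pi_gains(nu1: float, phase_margin_rad: float,
             omega_g: float) -> Tuple[float, float]:
    """Gains that place the gain crossover of
    G(s) = (Kp + Ki/s) / (nu1 * s) at omega_g with the given phase margin."""
    t = math.tan(phase_margin_rad)
    root = math.sqrt(1.0 + t * t)
    kp = nu1 * omega_g * t / root
    ki = nu1 * omega_g ** 2 / root
    return kp, ki


def open_loop(kp: float, ki: float, nu1: float, omega: float) -> complex:
    """Evaluate G(j*omega) for the PI controller in series with 1/(nu1*s)."""
    s = 1j * omega
    return (kp + ki / s) / (nu1 * s)


# Hypothetical numbers: nu1 = 200 msg/s, 45 degree margin, crossover 1 rad/s.
nu1, phi_m, omega_g = 200.0, math.radians(45.0), 1.0
kp, ki = pi_gains(nu1, phi_m, omega_g)
g = open_loop(kp, ki, nu1, omega_g)
margin = math.pi + cmath.phase(g)  # phase margin recovered from G(j*omega_g)
print(kp, ki, abs(g), math.degrees(margin))
```

The check at the end evaluates the open loop at the crossover frequency and recovers a unit magnitude and the requested margin, which confirms that the two gain expressions are consistent with the stated open-loop transfer function.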
SIP Overload Control Algorithm (RTDC)
SIP Network Topology For OPNET Simulation
Scenario to Validate Overload Control Algorithm
• Poisson distributed message generation rate and service rate
• Two typical overload scenarios
• 4 originating servers generated original messages with the same rate $\lambda_o = (1/4)\lambda_1$; the mean message arrival rate of Server 1 was $\lambda_1 = 4\lambda_o$
• Mean service capacity of each originating server was $C_o$ = 500 messages/sec
Scenario 1: Initial overload at Server 1 due to demand burst
• Mean arrival rate $\lambda_1$ = 800 messages/sec (emulating a short surge of user demands) from time t=0s to t=30s
• Mean arrival rate $\lambda_1$ = 200 messages/sec (emulating regular user demands) from time t=30s to t=90s
• Mean service capacities of two proxy servers were $C_1 = C_2$ = 1000 messages/sec
Scenario 2: Initial overload at Server 2 due to server slowdown
• Mean arrival rate $\lambda_1$ = 200 messages/sec
• Mean server capacity $C_1$ = 1000 messages/sec
• Mean server capacity $C_2$ = 100 messages/sec (emulating server slowdown) from time t=0s to t=30s, and $C_2$ = 1000 messages/sec from time t=30s to t=90s
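The two scenarios above can be written down as time-dependent rate profiles, which is a convenient starting point for anyone re-creating the experiment outside OPNET. The sketch below only encodes the mean rates listed on the slide; the names `Scenario`, `step`, and `constant` are hypothetical, and the choice to switch values exactly at t = 30 s (with the new value applying from t = 30 s onward) is an assumption, since the slide does not say which side the boundary belongs to.

```python
from dataclasses import dataclass
from typing import Callable

Profile = Callable[[float], float]  # mean rate (messages/sec) at time t


def constant(value: float) -> Profile:
    """A rate that never changes."""
    return lambda t: value


def step(before: float, after: float, switch_time: float = 30.0) -> Profile:
    """A rate that changes once, at switch_time seconds."""
    return lambda t: before if t < switch_time else after


@dataclass(frozen=True)
class Scenario:
    name: str
    arrival_rate: Profile   # mean arrival rate at Server 1
    capacity_c1: Profile    # mean service capacity of Server 1
    capacity_c2: Profile    # mean service capacity of Server 2
    duration: float = 90.0  # simulated time in seconds
    originating_servers: int = 4

    def originating_rate(self, t: float) -> float:
        """Mean rate generated by each originating server."""
        return self.arrival_rate(t) / self.originating_servers


SCENARIO_1 = Scenario("demand burst", step(800.0, 200.0),
                      constant(1000.0), constant(1000.0))
SCENARIO_2 = Scenario("server slowdown", constant(200.0),
                      constant(1000.0), step(100.0, 1000.0))

for sc in (SCENARIO_1, SCENARIO_2):
    print(sc.name, sc.arrival_rate(10.0), sc.capacity_c2(10.0),
          sc.originating_rate(10.0))
```

Keeping the profiles as plain functions of time makes it easy to feed them into a queue model or to draw Poisson inter-arrival times from the current mean rate.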
Simulation Results of Scenario 1
[Figure: Queue size q1 (messages) of Server 1 versus time (sec), without overload control (NOLC) and with overload control (OLC)]
[Figure: Queue size qo (messages) of an originating server versus time (sec), NOLC and OLC]
• Without the overload control algorithm applied, Server 1 became CPU overloaded; the overload deteriorated as time evolved, leading to the eventual crash of Server 1
• The overload control algorithm made the queue size of Server 1 increase slowly, taking 27s to cancel the overload at Server 1 after the new user demand rate was reduced at time t=30s, 11s faster than the RRRC algorithm proposed in the IEEE Globecom 2010 paper
Simulation Results of Scenario 2
[Figure: Queue size q1 (messages) of Server 1 versus time (sec), without overload control (NOLC) and with overload control (OLC)]
[Figure: Queue size q2 (messages) of Server 2 versus time (sec), NOLC and OLC]
• Without the overload control algorithm applied, overload was propagated from Server 2 to Server 1 when the initial overload happened at Server 2
• Persistent overload would crash Server 1 after Server 2 resumed its normal service
• The overload control algorithm prevented overload propagation to Server 1, taking only 7s to cancel the overload at Server 2, 2s faster than the RRRC algorithm proposed in the IEEE Globecom 2010 paper
Conclusions
• Employing a control-theoretic approach to model the SIP overload problem as a feedback control problem
• Developing the Round Trip Delay Control (RTDC) algorithm (a PI rate control algorithm) to mitigate the overload by controlling the retransmission rate and clamping the round trip delay below a desirable target value
• Simulation results demonstrate that RTDC (implicit SIP overload control) can prevent overload propagation and cancel the overload effectively
• Our solution does NOT require modification of the SIP header or a time-consuming standardization process, and can be freely implemented in any SIP servers of different carriers
Remarks (1)
• Explicit SIP overload control algorithms require modification of the SIP header and cooperation among different carriers in different countries
• Implicit SIP overload control algorithms do NOT require modification of the SIP header or cooperation among different carriers in different countries. Any carrier can freely implement an implicit SIP overload control algorithm in its SIP servers to avoid potential widespread server crash
• OPNET simulation code for 3 implicit SIP overload control algorithms (RRRC, RTDC, and RTQC) published at IEEE Globecom 2010/ICC 2011 is available for non-commercial research use upon request
• The RTDC algorithm (proposed by this IEEE ICC 2011 paper) has been recommended as a white paper by TechRepublic (an online trade publication and social community for IT professionals, part of CBS Interactive)
http://www.techrepublic.com/whitepapers/design-of-a-pi-rate-controller-for-mitigating-sip-overload/25142469
Remarks (2)
• The journal version discusses how to apply the RTDC algorithm to mitigate SIP overload for both SIP over UDP and SIP over TCP (with TLS): “Applying control theoretic approach to mitigate SIP overload,” Telecommunication Systems, 54(4), 2013, pp. 387-404. Available at
http://www.researchgate.net/publication/257667871_Applying_control_theoretic_approach_to_mitigate_SIP_overload
• Survey on SIP overload control algorithms: “A Comparative Study of SIP Overload Control Algorithms,” Network and Traffic Engineering in Emerging Distributed Computing Applications, IGI Global, 2012, pp. 1-20.
http://www.igi-global.com/chapter/comparative-study-sip-overload-control/67496
http://www.researchgate.net/publication/231609451_A_Comparative_Study_of_SIP_Overload_Control_Algorithms
• Discussions on control system design can be found in the answers to the ResearchGate question “What are trends in control theory and its applications in physical systems (from a research point of view)?”
https://www.researchgate.net/post/What_are_trends_in_control_theory_and_its_applications_in_physical_systems_from_a_research_point_of_view2