SIP over Client Initiated Connections

HELSINKI UNIVERSITY OF TECHNOLOGY

Department of Computer Science

Laboratory of Telecommunication Software and Multimedia

Yang Yang

SIP over Client Initiated Connections

Master’s Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Technology.

Otaniemi, May 1, 2007

Supervisor: Professor Antti Yla-Jaaski

Instructor: Sasu Tarkoma, Ph.D.

HELSINKI UNIVERSITY OF TECHNOLOGY

ABSTRACT OF THE MASTER’S THESIS

Author: Yang Yang

Name of the Thesis: SIP over Client Initiated Connections

Date: May 1, 2007

Number of pages: 46 + 9

Department: Department of Computer Science

Professorship: T-110 Telecommunications Software and Multimedia

Supervisor: Prof. Antti Yla-Jaaski

Instructor: Sasu Tarkoma, Ph.D.

SIP outbound, an extension of SIP, enables client-initiated connections in a SIP signaling system. This feature is desirable when a NAT or firewall sits between the public and the private side. In such a situation, connections are only allowed from the private side to the public side. SIP outbound proposes a mechanism that keeps the client-initiated connections between a UA and its proxies open and later reuses the same connections to push data from the proxies to the UA. This mechanism ensures successful traversal of the NAT or firewall.

In this thesis we implemented the SIP outbound protocol as an extension of SIP, integrated it into the WeSAHMI experimental infrastructure, and then evaluated the performance of the system as a whole.

Keywords: SIP, SIP outbound, STUN keepalive, backoff mechanism, flow token, NAT.

Acknowledgements

I want to thank my supervisor, Professor Antti Yla-Jaaski, and my instructor, Sasu Tarkoma, Ph.D., for giving me the opportunity to participate in the WeSAHMI project and for their guidance in completing my thesis.

Many thanks go to Jani Heikkinen and Sergio Lembo for their constructive ideas and practical help.

My gratitude also goes to my parents, my husband and my friends for their moral support.

Otaniemi, May 1, 2007

Yang Yang

Contents

Abbreviations
List of Figures
List of Tables

1 Introduction
1.1 Research problem
1.2 Brief motivation
1.3 Structure of the thesis

2 Background

3 System Model
3.1 WeSAHMI architecture
3.2 WeSAHMI security architecture
3.3 SIP outbound

4 SIP client-initiated outbound
4.1 Overview of the mechanism
4.2 User agent behavior
4.2.1 Flow establishment
4.2.2 Flow recovery
4.2.3 Keepalive mechanism
4.3 Edge proxy behavior
4.3.1 Flow token
4.3.2 Forwarding Mechanism
4.3.3 Keepalive mechanisms
4.4 Registrar behavior
4.5 Authoritative proxy behavior

5 Implementation
5.1 Open source libraries
5.2 User agent routine
5.2.1 Termination of a flow
5.2.2 Failures of a flow
5.2.3 Re-registration
5.3 TCP keepalive
5.4 STUN keepalive over UDP
5.4.1 Overview of the mechanism
5.4.2 STUN server and client
5.4.3 STUN attributes
5.4.4 STUN retransmission mechanism
5.5 Edge proxy

6 Experimentation
6.1 Experimental infrastructure deployment
6.2 Experiment for SIP over UDP with SIP outbound features
6.2.1 Experiment for STUN keepalive
6.3 Experiment TCP keepalive

7 Discussion

8 Conclusions
8.1 Summary
8.2 Future work

A Appendix
A.1 Important data structures
A.2 Important modifications to the eXosip and osip libraries
A.3 APIs for base64 encoding
A.4 APIs for STUN keepalive

Abbreviations

AOR Address of Record, a well-known address for a user. In SIP, it is a

SIP URI.

ALG Application Layer Gateway

API Application Programming Interface

B2BUA Back to Back User Agent

DNS Domain Name System, a global decentralized directory that translates domain names into IP addresses.

DNSSRV Domain Name System Service Record Working Group, an IETF working group that specified a DNS extension for finding the IP address of a service based on a protocol and domain.

DHCP Dynamic Host Configuration Protocol, an Internet protocol for automating the configuration of devices using TCP/IP.

DTLS Datagram Transport Layer Security

EP Edge Proxy, any proxy that is located topologically between the registering User Agent and the Authoritative Proxy.

HTTP Hypertext Transfer Protocol, a web browsing protocol.

HMAC Hash-based Message Authentication Code, a type of message authentication code calculated using a cryptographic hash function in combination with a secret key.

ICE Interactive Connectivity Establishment

IETF Internet Engineering Task Force

IP Internet Protocol

NAT Network Address Translation, enables a local area network to use one set of IP addresses for internal traffic and a second set of addresses for external traffic.

NTP Network Time Protocol, a protocol for synchronizing the clocks of computer systems over data networks.

SDP Session Description Protocol, a format for describing the types of media to use in a session.

SHA-1 Secure Hash Algorithm Version 1.0, a standard for computing a condensed representation of data.

SIPCOMP Signaling compression, a framework used to compress signaling messages using arbitrary compression algorithms.

SIP Session Initiation Protocol

SIP URI A uniform resource identifier with the scheme "sip:". SIP systems use the domain component along with DNS to determine where to send SIP messages.

SMTP Simple Mail Transfer Protocol, a protocol for email.

SSL Secure Socket Layer, a predecessor of TLS.

STUN Simple Traversal Underneath Network Address Translation

TCP Transmission Control Protocol, an Internet protocol that establishes reliable connections over IP.

TLS Transport Layer Security

UAC User Agent Client

UDP User Datagram Protocol, a connectionless Internet protocol running on top of IP.

UMTS Universal Mobile Telecommunications System

URL Uniform Resource Locator, a name used to represent an address or location on the Internet.

UUID Universally Unique Identifier.

WeSAHMI Web Services in Ad-Hoc and Mobile Infrastructure.

List of Figures

2.1 Data push and pull service
2.2 Data pull service with an edge proxy
3.1 Deployment of SIP outbound in WeSAHMI security architecture
4.1 Explicit probe before sending STUN messages
4.2 Explicit probe after no successful STUN response received
4.3 The format of S
4.4 Forwarding mechanism of EPs
6.1 Experimental environment
6.2 Flow sequence of SIP messages

List of Tables

4.1 Updated binding behaviour in SIP outbound
5.1 Registration behavior
5.2 STUN attributes supported by the implementation
6.1 REGISTER request proxied to the primary EP
6.2 REGISTER request proxied to the secondary EP
6.3 200 OK response received by the UA from the primary EP
6.4 200 OK response received by the UA from the secondary EP
6.5 SUBSCRIBE request sent by the UA to its Notifier
6.6 NOTIFY request sent from the Notifier to the UA
6.7 STUN Binding request
A.1 Modification to eXosip and osip libraries

Chapter 1

Introduction

The growth of Internet usage has drawn telephony services into Internet Protocol (IP) [6] technology, which has stimulated the development of signaling protocols to set up and tear down multimedia sessions. Different communities have proposed solutions in accordance with their own priorities and interests. The Session Initiation Protocol (SIP), born in a computer science laboratory within the last decade, satisfies the growing thirst for a new generation of IP-based services [4].

SIP is a signaling protocol used for establishing sessions in an IP network. It was developed by the IETF as part of the Internet Multimedia Conferencing Architecture [29]. It incorporates elements of two well-known protocols: the Web’s Hypertext Transfer Protocol (HTTP) [23] and the Simple Mail Transfer Protocol (SMTP) for e-mail [22] [1]. Its first major use has been signaling in Internet telephony [30]. However, SIP’s utility does not end with telephony: it is already employed as a basic technology for instant messaging and presence.

SIP resolves two significant issues in establishing these real-time communication sessions. First, it helps participants who want to communicate locate each other on the Internet (rendezvous). Second, it allows those participants to negotiate how they are willing to communicate.

Nowadays, more and more carriers and providers offer SIP-based services such as local and long-distance telephony, presence and instant messaging, voice messaging, push-to-talk, rich media conferencing, and so on. All these media communications rely on SIP as a signaling protocol, since SIP allows proxy servers to initiate TCP connections and send asynchronous UDP datagrams to User Agents (UAs). SIP is expected to be the primary signaling technology in next-generation mobile communication.

However, Network Address Translators (NATs) and firewalls segment the network, so that SIP servers, such as registrars or proxies, cannot initiate connections to UAs. A firewall between the UA and the proxy servers will block connections to the UA. Similarly, NATs only allow connections from the private address side to the public side.

The effect of NAT has been studied in recent years. Several extensions to the original SIP specification [8] have been proposed that allow a UA to receive incoming signaling requests from the server side.

1.1 Research problem

SIP enables end systems and proxy servers to establish multimedia sessions with each other. However, as discussed above, only connections initiated by the UA towards the server can be established; connections in the reverse direction, i.e. server-initiated connections, are not possible. This is because a SIP endpoint behind a NAT sends messages containing only its private address and unmapped port, which are useless to endpoints that are not behind the same NAT. Moreover, most NATs and firewalls block incoming TCP connections and UDP traffic from the public side. This drawback of NATs impedes the end-to-end connectivity of SIP. Without additional SIP extensions, a SIP endpoint will not work in such a situation.

The above problem can be partially resolved by deploying an Application Layer Gateway (ALG) [20] inside the NAT. A SIP-aware ALG can inspect the messages and map the internal addresses and ports to external addresses and ports. But this requires the ALG to know the nuances of every new use of SIP. Since SIP is a framework protocol rather than a single application, this method cannot be a cure-all.

An improved version of the same idea is to put a pair of UAs back to back across the NAT/firewall point. This pair of UAs is known as a Back to Back User Agent (B2BUA) or session border controller [10]. The B2BUA acts as a UA server on one side and as a UA client on the other side, terminating and re-originating signaling and media on both sides. However, a B2BUA has to learn any new protocol features before allowing them to pass.

To make NAT traversal easier for endpoints, Simple Traversal Underneath NATs (STUN) [12] was proposed some years ago. Through the STUN protocol, a SIP UA can discover how a NAT device maps its private IP address and port to the public side. But the addresses obtained may not be usable by all peers, so STUN alone cannot solve the NAT traversal problem. An extension of STUN, known as Traversal Using Relays around NAT (TURN), allows a SIP client behind a NAT/firewall to receive incoming data over TCP or UDP connections [11]. However, it only supports the connection of a client behind a NAT to a single peer. Moreover, the cost of providing a TURN relay server is so high that TURN is only desirable as a last resort. The Interactive Connectivity Establishment (ICE) methodology [26] can be used to discover the optimal means of connectivity using various techniques, such as STUN and TURN [27].

In the worst case, a SIP client may find itself behind a NAT/firewall that blocks all incoming traffic except packets of a TCP stream that the client itself opened. The SIP outbound extension [5] was proposed for this case: after the UA has successfully established a connection to the edge proxy (EP) by sending a REGISTER request, the proxy can later reuse this flow to push SIP messages to the UA. Since the server cannot reach the UA, it is the UA’s duty to keep the flow active.

This thesis presents an implementation of the SIP outbound extension based on the Internet draft [5], together with experiments and an evaluation of the new features, in order to examine the complexity and usability of SIP outbound. Some updates to the specification are proposed to meet implementation needs.

1.2 Brief motivation

This thesis was carried out in the WeSAHMI project. The project implements an experimental infrastructure for interactive wireless applications that can operate in an ad-hoc networking environment. In addition, a demo application suite for an airport environment is to be implemented [2]. The WeSAHMI security architecture employs SIP as the session-level communication protocol in IP networks. After the upper layer has completed identification of all entities, the client system starts a secure session with a gateway. This thesis implemented the SIP outbound protocol as an extension of SIP. With the extended features described in [5], the client can initiate a secure channel that opens ports for the client in the gateway (referred to as the EP in the following chapters). After the secure channel has been established, it is kept active by the client, so the gateway can later push SIP messages to the client.


1.3 Structure of the thesis

Chapter 1 introduces the general background and presents the research problem. Chapter 2 addresses the effects of combining NATs and firewalls with SIP signaling and gives background information on the WeSAHMI project. Chapter 3 introduces the system model of the WeSAHMI project and shows how SIP outbound fits into the overall WeSAHMI architecture. Chapter 4 presents the SIP outbound protocol in more detail and discusses its challenges from an implementation point of view. Chapter 5 reviews our SIP outbound implementation and how we integrated the STUN protocol into SIP. Chapter 6 tests the implementation in a simplified system against the features required by SIP outbound. Chapter 7 discusses the performance of the system after it has been extended with SIP outbound, as well as other possibilities for the flow token algorithm. Chapter 8 concludes the thesis and presents future work.

Chapter 2

Background

Originally, NAT devices were used to connect an isolated address realm to an external realm with globally unique registered addresses [21], which effectively extends the address space. SIP packets leave a NATed client with its private IP addresses packed into the message headers (Via and Contact headers) and SDP bodies [9], and a NAT device is not aware of them. So when the packets reach their destination, they are processed and the responses are sent to completely useless addresses.

The effect of NATs and firewalls on signaling systems has become an active research topic [17] [28] [34]. Several solutions have been proposed to allow SIP to traverse NATs and firewalls effectively [5], [26]. These include using TCP for SIP instead of UDP, employing a keepalive program to maintain NAT bindings, and using STUN/TURN servers.

The key to successful NAT/firewall traversal is that the remote host knows which global port and IP address have been assigned by the NAT for a given flow. The SIP extension called ICE [26] relies on two new protocols being developed in the IETF, STUN and TURN. STUN allows a host to learn the global IP address and UDP port assigned by its outermost NAT box. The address can subsequently be conveyed by SIP to allow direct UDP connectivity between hosts. TURN allows a host to select a globally addressable TCP relay, which can subsequently be used to bridge a TCP connection between two NATed hosts. Unlike STUN, TURN does not allow direct connectivity between NATed hosts.

Unlike the ICE extension, SIP outbound inserts an extra network entity, the edge proxy (EP), to traverse NATs and firewalls with a client-initiated connection mechanism. The SIP client initiates secured connections to EPs (at least two) by sending REGISTER requests. These secured connections are maintained by the client and the EPs so that the EPs can later push data to the client through them. This feature requires the EP to work not only as a SIP proxy but also as a keepalive server. The EP also has to be able to distinguish connections initiated by different clients, which it does by assigning a different flow token to each connection. Communication with untrusted external domains is delegated to the EPs, since clients are invisible to outside domains. A fault tolerance mechanism is also considered in [5], which proposes multiple registrations and deployment on multiple physical hosts.

As part of the security model of the WeSAHMI architecture, this thesis presents an implementation of SIP outbound as an extension of SIP. The WeSAHMI project implemented an application for an airport environment. In the airport scenario, a crucial matter is the delivery of real-time information updates to the passengers and employees of the airport. Such updates include flight delays or cancellations, changes of departure gates, and the like. The delay introduced by information delivery is also crucial: the airline information system should push updates to passengers in time. In the WeSAHMI project, two principal communication services are required between the Finnair application server and the passengers: pull and push services. Both of these services are carried out through SIP.

SIP enables clients to register for certain services. Once registered, clients can pull information from the content server, and the server can send asynchronous notifications to the client. As shown on the left side of Figure 2.1, the client sends a SUBSCRIBE message, which is acknowledged by the notifier with a NOTIFY message. This is the push service.

The pull service is similar. The client has to know what content to pull from the notifier. The notifier can send descriptions of the available content by using the push service. Once the client knows what services are available, it can decide what content to pull from the notifier. As shown on the right side of Figure 2.1, the notifier first sends a NOTIFY message that carries a description of the available services. Later, the client sends a SUBSCRIBE request to query the service, which is acknowledged by a NOTIFY carrying the actual data of the service.

The security architecture of the WeSAHMI system is used to establish authentication and authorization between clients and the WeSAHMI server. SIP outbound proposes an additional networking element (the Edge Proxy) consisting of transport and security mechanisms. The EP is inserted topologically between the UA and the notifier. The pull service procedure above therefore has to be adjusted as shown in Figure 2.2. The push service is similar, so it is not illustrated in the figure. All incoming and outgoing messages have to be forwarded through the EP.

Figure 2.1: Data push and pull service

Figure 2.2: Data pull service with an edge proxy

Chapter 3

System Model

3.1 WeSAHMI architecture

In [2], an experimental infrastructure is specified for interactive wireless applications operating in a mobile ad-hoc networking [25] environment. A practical application is deployed for an airport environment. The system provides a mobile check-in service for passengers in the airport. Users of the system can carry out the necessary actions with their mobile devices, such as check-in, registration for a flight, baggage drop and passing the security gate.

To support the above functions, the infrastructure must provide identification of mobile users and tracking of their presence; delivery of content, notifications, and status updates to mobile users in a server-initiated fashion; and management and updating of the state of both clients and servers in real time.

The WeSAHMI architecture consists of the following components:

• WeSAHMI server: plays a central role, relaying data from the external model to the client browser,

• client browser: an X-smile browser on a client node,

• security architecture: used to establish a secure channel between clients and server,

• WWW server: an Apache WWW server that hosts the user interface components and relays client input to the WeSAHMI server.

3.2 WeSAHMI security architecture

Our implementation resides in the security architecture. The security architecture is designed to push data from the trusted WeSAHMI environment to the untrusted wireless network environment. An extra network entity (the edge proxy) is added to the architecture to ensure secure data push. The edge proxy is equipped with transport and security mechanisms. It is a logical entity; physically, we can deploy multiple hosts to reduce the risk of notifications being lost because of a single element failure.

Other elements included in the architecture are the mobile hosts and the notification service. The mobile host, working as a SIP UA, initiates a connection to the EP by sending a REGISTER request to the registrar. The registrar then challenges the mobile host for authentication. After successful registration, indicated by a 200 OK response, the mobile host sends STUN Binding requests over the same flow that it uses for SIP messages in order to keep the flow active. This established, ongoing flow is later used for secure push. The notification service also works like a SIP UA. It fetches the contact address of the mobile host by querying the registrar. The NOTIFY request is sent to the EP, which forwards it to the mobile host through the existing connection initiated by the mobile host.

3.3 SIP outbound

SIP is used to provide pull and push services to the WeSAHMI system. For example, a client can register for certain services, and then pull data from the service provider or receive asynchronous notifications from it. However, because of the presence of NATs and firewalls, connections from the server side to the client side become impossible. That is, the service provider cannot deliver asynchronous data to clients, which is an expected feature of the WeSAHMI system. To solve this problem, we have to add new features to basic SIP according to one of the SIP extensions, namely SIP outbound [5].

We insert an extra entity into the security architecture, namely the EP. Any client that wants to subscribe to a service must first establish a direct flow to each of its EPs by sending REGISTER requests. A local daemon on the client takes charge of the registration and also handles the SUBSCRIBE/NOTIFY messages. After successful registration, the daemon may send a SUBSCRIBE message to a content server through one of its outbound EPs, which the content server acknowledges with a NOTIFY. Conversely, when a message from the content server arrives, the daemon delivers it to the client application, such as the browser. Figure 3.1 shows where we deploy the SIP outbound component in the WeSAHMI architecture.

Figure 3.1: Deployment of SIP outbound in WeSAHMI security architecture

The client daemon uses a keepalive mechanism to keep the flows to its outbound EPs active at all times. So when the content server wants to push messages to a client, it can always reach the client from the public side through a secured channel.

Chapter 4

SIP client-initiated outbound

This chapter briefly describes the SIP outbound extension. We have rearranged the structure of the SIP outbound draft [5] to make it more convenient for implementation.

4.1 Overview of the mechanism

SIP outbound is intended for environments in which a registrar, or more generally a proxy server, cannot initiate direct connections to a UA behind a NAT box or firewall. The key idea of SIP outbound is that when a UA initiates a connection to a proxy server, the proxy server can later reuse the same connection to forward requests to the UA. The UA must, of course, keep the connection active by using a keepalive mechanism.

To achieve high reliability, the UA can form multiple flows to the proxy servers (known as EPs in SIP outbound) by registering multiple times over different connections for the same SIP AOR. Each REGISTER request includes an instance-id (which identifies the UA uniquely) and a reg-id (which distinguishes the different flows). Each flow is kept active by using either the STUN keepalive mechanism over a UDP connection or TCP keepalive.

In the following sections, we describe in more detail the behavior of the four networking entities (UA, EP, registrar and authoritative proxy) that support the SIP outbound features.

4.2 User agent behavior

4.2.1 Flow establishment

At configuration time, UAs obtain a set of SIP URIs representing the default outbound proxy set. The configuration mechanism is outside the scope of [5], but it is a key point for the implementation; for implementation details, please refer to chapters 5 and 7. The number of URIs in this set should be at least two and no more than four. For each outbound proxy URI in the set, the UA must send a REGISTER request to form a direct flow to the EP. The EP forwards the request to the registrar, and from then on everything works as in normal SIP: the registrar may challenge the UA for authentication; the UA sends its credentials and waits for the 200 OK response from the registrar, which indicates a successful registration.

The UAC is required to support the Path header mechanism and indicates this by including the 'path' option-tag in a Supported header field value in its REGISTER requests. A successful registration is indicated by the presence of the 'outbound' option-tag in a Supported header field value in the response, which shows that the registrar and all EPs traversed by the request support the SIP outbound extension.
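The check for the 'outbound' option-tag can be sketched as follows. This is only a minimal illustration in Python and is not taken from the implementation described in chapter 5: the function names `option_tags` and `supports_outbound` and the sample response are hypothetical, and the parser deliberately ignores header line folding and compact header forms.

```python
def option_tags(message: str, header: str = "Supported") -> set:
    """Collect the option-tags listed in every `header` field of a SIP message."""
    tags = set()
    # Skip the start line; the header section ends at the first empty line.
    for line in message.split("\r\n")[1:]:
        if not line:
            break
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == header.lower():
            tags.update(t.strip().lower() for t in value.split(",") if t.strip())
    return tags


def supports_outbound(response: str) -> bool:
    """True if the response advertises the 'outbound' option-tag."""
    return "outbound" in option_tags(response)


# Hypothetical 200 OK response, shortened to the fields relevant here.
sample_response = (
    "SIP/2.0 200 OK\r\n"
    "Supported: path, outbound\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
print(supports_outbound(sample_response))
```

A UA could run such a check on the 200 OK response before it starts sending keepalives on the new flow.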

A failed registration is indicated by the UA receiving a 503 (Service Unavailable) response with a Retry-After header field. The UA then needs to recover the flow, using a backoff mechanism to decide when to re-register. Details about flow recovery can be found in section 4.2.2, in the paragraph on the backoff mechanism.

Instance ID and Register ID

SIP outbound [5] introduces two new parameters for the Contact header field: the Instance Identifier (instance-id) and the Registration Identifier (reg-id). In a signaling system supporting SIP outbound, each UA is identified uniquely by a persistent instance-id URN. This instance-id must persist even if the UA reboots or is power cycled, and must not change as the device moves from one network to another. The UA uses a UUID URN [19] as its instance-id and attaches it to the Contact header field as a "+sip.instance" media feature tag.

A UUID URN does not require a central registration process, so no centralized authority is needed to administer it. In our mobile wireless environment, this is a favorable feature because it minimizes additional entities. Furthermore, a UUID has a fixed size of 128 bits, which is reasonably small compared to other alternatives. The ability to generate a new UUID without a registration process makes UUIDs one of the URNs with the lowest minting cost [19].

The other new Contact header field parameter is reg-id, added by the UA. The UA uses reg-id to distinguish different flows, since it can register multiple times over different connections for the same SIP AOR. The reg-id does not have to be incremented sequentially, but it has to be unique for each flow. When the UA power cycles or reboots, the reg-id has to remain the same as that of the previous flow so that the registrar can replace the older registration [5].
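To make the two parameters concrete, the following Python sketch builds Contact header fields for two flows of the same UA. It is an illustration under assumptions, not the code of our implementation: the helper names `make_instance_id` and `contact_header`, the user name and the address 192.0.2.10 are hypothetical, and a real UA would generate its UUID once and keep it in persistent storage so that it survives reboots.

```python
import uuid


def make_instance_id(device_uuid=None):
    """Return a UUID URN to be used as the instance-id.

    Assumption: a real UA loads a previously stored UUID; a fresh one is
    generated here only when none is supplied.
    """
    if device_uuid is None:
        device_uuid = uuid.uuid4()
    return f"urn:uuid:{device_uuid}"


def contact_header(user, host, instance_id, reg_id):
    """Build a Contact header field carrying +sip.instance and reg-id."""
    return (f"Contact: <sip:{user}@{host}>"
            f';+sip.instance="<{instance_id}>";reg-id={reg_id}')


# Same instance-id for both flows, different reg-id per flow.
instance_id = make_instance_id(uuid.UUID("00000000-0000-1000-8000-000a95a0e128"))
for reg_id in (1, 2):
    print(contact_header("alice", "192.0.2.10", instance_id, reg_id))
```

Both header fields carry the same instance-id, which lets the registrar recognize them as belonging to one UA, while the reg-id tells the two flows apart.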

4.2.2 Flow recovery

An ongoing flow may fail because of various network problems, so the UA should be able to detect failures by some mechanism, such as keepalives. If a flow fails, the UA uses the procedure described in section 4.2.1 to form a new flow to replace the failed one. However, before recovering the flow, the UA should wait for some time, as described in the following paragraph.

Backoff mechanism

The UA employs a backoff mechanism to avoid avalanche restart on the EPs. That is, the UA needs to wait for a certain amount of time before trying to establish a new flow to replace the failed one.

The following algorithm is used to calculate the waiting time in seconds:

$$\mathrm{TIME}_{wait} = \min\left(\mathrm{TIME}_{max},\ \mathrm{TIME}_{base} \times 2^{\mathit{failures}}\right)$$

• TIMEmax: the default value is 1800 seconds.
• failures: the number of consecutive registration failures.
• TIMEbase: 30 seconds if all of the flows to every URI in the outbound
proxy set have failed; otherwise, if at least one of the flows has not failed,
90 seconds.

A flow is considered successful if the outbound registration succeeded and the
keepalives have not expired for min-regtime seconds (default of 120 seconds) after
the registration. The time to re-register, known as the delay time, is computed by
selecting a uniform random time between 50 and 100 percent of TIMEwait. The UA
must wait for the delay time before re-registering. The default flow registration
backoff time table can be found in Appendix A of [5].
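The calculation above can be sketched in a few lines of Python. The function and parameter names are our own; only the default values come from the text.

```python
import random


def wait_time(failures: int, all_flows_failed: bool,
              time_max: float = 1800, base_all_failed: float = 30,
              base_some_alive: float = 90) -> float:
    """TIMEwait = min(TIMEmax, TIMEbase * 2**failures), in seconds."""
    base = base_all_failed if all_flows_failed else base_some_alive
    return min(time_max, base * 2 ** failures)


def delay_time(failures: int, all_flows_failed: bool, rng=random) -> float:
    """Uniform random delay between 50 and 100 percent of TIMEwait."""
    upper = wait_time(failures, all_flows_failed)
    return rng.uniform(0.5 * upper, upper)
```

With the defaults and all flows failed, the upper bound doubles from 30 seconds per consecutive failure until it is capped at 1800 seconds. The random factor spreads the re-registrations of many UAs over time.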


4.2.3 Keepalive mechanism

Two keepalive methods are proposed: STUN over UDP and TCP keepalive. For

SIP over UDP, a limited version of STUN [12] keepalive mechanism is employed.

The only STUN messages required by this usage are Binding Requests, Binding

Responses, and Error Responses.

The UAC sends STUN messages over the same UDP flow used for sending SIP
messages. The server (EP or registrar) must in turn provide a limited version
of a STUN server listening on the same network interface and port as the SIP proxy
server.

The UA needs two phases of validation for STUN keepalive support. In the first
phase, the UA inspects whether the URIs in its outbound proxy set contain the
’keep-stun’ parameter. In most circumstances, this explicit indication should
be sufficient, but misconfiguration may happen. If a node sends binary STUN
data to a proxy that does not support STUN, the node could be blacklisted for UDP
traffic. We therefore need a second phase of validation, namely an explicit probe.
The UA sends an OPTIONS request to the next hop with the Max-Forwards header
field set to 0, and expects the next hop to respond with the ’sip-stun’ option tag in
its Supported header field. If either of these two validation phases fails,
the UA must stop sending additional STUN messages.

The UA can perform the explicit probe just after it establishes a direct flow to the
EP, as shown in Figure 4.1, or it can probe for STUN support after it has sent a STUN
Binding Request and has not received a STUN success response, as shown in Figure 4.2.
The order of these two phases of validation is an implementation-specific issue, and
is left for the implementor to decide.
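The two validation phases can be summarized in a small Python sketch. It works on plain strings rather than parsed SIP messages, and the helper names are hypothetical.

```python
from typing import Optional


def has_keep_stun(proxy_uri: str) -> bool:
    """Phase 1: look for the 'keep-stun' parameter in an outbound proxy URI."""
    params = proxy_uri.strip().strip("<>").split(";")[1:]
    return any(p.split("=")[0].strip().lower() == "keep-stun" for p in params)


def advertises_sip_stun(supported: Optional[str]) -> bool:
    """Phase 2: check the Supported header value of the OPTIONS response.

    `supported` is None when the probe produced no usable response.
    """
    if supported is None:
        return False
    return "sip-stun" in (tag.strip().lower() for tag in supported.split(","))


def may_send_stun(proxy_uri: str, supported: Optional[str]) -> bool:
    """STUN keepalives are allowed only if both phases succeed."""
    return has_keep_stun(proxy_uri) and advertises_sip_stun(supported)
```

The order in which the two checks are evaluated does not matter for the result, which matches the freedom left to the implementor.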

For SIP over TCP or SIP over TLS over TCP, TCP keepalive is sufficient to keep
the flow active. Some operating systems, such as Linux, support per-connection TCP
keepalive, which facilitates the keepalive support.

4.3 Edge proxy behavior

The Edge Proxy is located topologically between the UA and the AP and works
as a stateless forwarding proxy. It receives SIP requests and forwards them
to the next hop (a registrar, another EP, or a UA). If it wishes to be
visited by subsequent requests, it adds itself to the Path vector [35]. The EP
is expected to reuse the ongoing flow when forwarding. To achieve this, it inserts
an identifier, containing information about the flow from the previous hop, in
its Path URI.

Figure 4.1: Explicit probe before sending STUN messages

Figure 4.2: Explicit probe after no success STUN response received

4.3.1 Flow token

When the EP receives a REGISTER request from a UA, it needs to create an
identifier value that uniquely identifies this flow, and add this identifier to the
user part of its Path URI. The identifier allows the EP to map future requests back
to the correct flow. Moreover, the presence of the identifier in a successful
registration response serves as an indirect check that the user has been
authenticated.

SIP outbound [5] proposes the flow token as a flow identifier, together with two
algorithms for generating flow tokens statelessly. For the sake of security, our
implementation uses algorithm 2 proposed in SIP outbound, but modifies its input S
by replacing the local IP address and port with the file descriptor; the result is
then encoded with base64 [15].

In SIP outbound [5] the first algorithm generates a token that is 16 octets long.
Equation 4.1 is for a TCP connection; NTP is the time at which the connection was
created [18]. Equation 4.2 is for a UDP-based transport, so no NTP time is needed,
but the remote IP address and port are required.

Algorithm 1:

Token = BASE64encode(fileDescriptor||NTP ) (4.1)

Token = BASE64encode(fileDescriptor||remoteIP ||remotePort) (4.2)

This algorithm by itself provides no security assurance, so an attacker can easily
hijack another user’s call. Unless we employ SIP-level security protection, this
algorithm must not be used. Security mechanisms at the SIP level are expensive,
however, so we preferred the second algorithm.

Algorithm 2:

Token = BASE64encode(HMAC-SHA1-80(K, S)||S) (4.3)

In equation 4.3, K is a 20-octet cryptographically random key that is shared
among the EPs (it can be obtained from a trusted third party). The input S is
formatted as shown in Figure 4.3. We used HMAC-SHA1-80 [16] to compute the
keyed-hash value of S, and then encoded the concatenation of the HMAC of S and
S itself using base64 [15]. This results in a 32-octet identifier.

Figure 4.3: The format of S

In our implementation, we used algorithm 2, but replaced the local IP and local
port fields of S with the file descriptor of the socket.
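Algorithm 2 can be sketched in Python as follows. The byte layout of S used here (transport code, file descriptor, remote IPv4 address, remote port) is an assumption made for illustration only; it is not the layout of Figure 4.3, so the resulting token length differs from the 32 octets mentioned above.

```python
import base64
import hashlib
import hmac
import socket
import struct
from typing import Optional

MAC_LEN = 10  # HMAC-SHA1-80 keeps the first 80 bits of the HMAC-SHA1 output


def pack_s(fd: int, remote_ip: str, remote_port: int, transport: int = 0) -> bytes:
    """Hypothetical layout of S: transport | fd | remote IPv4 | remote port."""
    return struct.pack("!BI4sH", transport, fd,
                       socket.inet_aton(remote_ip), remote_port)


def make_token(key: bytes, s: bytes) -> str:
    """Token = BASE64encode(HMAC-SHA1-80(K, S) || S)."""
    mac = hmac.new(key, s, hashlib.sha1).digest()[:MAC_LEN]
    return base64.b64encode(mac + s).decode("ascii")


def verify_token(key: bytes, token: str) -> Optional[bytes]:
    """Return S if the HMAC carried in the token is correct, else None."""
    try:
        raw = base64.b64decode(token, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    mac, s = raw[:MAC_LEN], raw[MAC_LEN:]
    expected = hmac.new(key, s, hashlib.sha1).digest()[:MAC_LEN]
    return s if hmac.compare_digest(mac, expected) else None
```

Because S travels inside the token, the EP needs no per-flow table to recover it; the key K alone is enough to detect a forged or modified token.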

4.3.2 Forwarding Mechanism

Two kinds of requests traverse the EP. The first kind is an intermediate request,
which is generated by a UA in another domain that has no direct flow to the EP.
The second kind is a request that the EP receives from a UA or another EP,
depending on the configuration. When the EP acts as an intermediate proxy, receives
a request from another EP and finds that it is the host in the topmost Route header
field value, it compares the flow in the flow token with the source of the request.
If these refer to the same flow, the EP removes the Route header and continues
processing the request. If the flow token is invalid, the EP has to reject the
request.

Figure 4.4 shows a concrete example. The solid bi-directional arrows indicate
direct flows between entities. The dashed lines indicate flows that are established
when needed. Suppose UA1 in domain 1 wants to contact UA2 with any kind of SIP
request. First, UA1 consults its registrar and obtains the contact information of
UA2 together with the flow token from the Path header [35]. Then it sends its
request to EP1, to which it has a direct flow. EP1 finds that it is the topmost
host in the Route header and that the Route header contains a flow token, so EP1
checks whether the flow token is valid. If so, it applies the normal routing
procedure to decide the next hop. We assume that EP1’s next hop is EP2, so it
routes the request to EP2. When EP2 receives the request, it checks whether the
request contains a valid flow token and whether the flow token was created by
itself. In this example EP2 notices that the destination is UA2, which has a direct
flow to it, so EP2 sends the request to UA2 through the direct flow.

EP1 and EP2 process the flow token according to the algorithm they used to
generate it. If they use algorithm 1, they first base64-decode the user part of
the Route header. Then, for a TCP-based transport, if the connection specified by
the file descriptor matches the encoded creation time, they forward the request
over that connection. For a UDP-based transport, they forward the request from
the socket identified by the encoded file descriptor to the encoded remote IP
address and port.

Figure 4.4: Forwarding mechanism of EPs

If they use algorithm 2, they likewise decode the flow token. Then they verify
the HMAC by recomputing it over S and checking that the two values match. If the
HMACs do not match, the EP should send a 403 (Forbidden) response. Otherwise, it
should forward the request on the flow specified by the information in the flow
identifier. If this flow no longer exists, the EP should send a 430 (Flow Failed)
response to the requesting side. To ensure that mid-dialog requests are routed over
the existing flow, [13] proposes that the EP add a Record-Route entry to each
dialog-initiating request. The Record-Route entry contains a SIP URI which is
comprised of a flow token and a domain name.
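The decision an EP takes for a request carrying a flow token can be condensed into the following Python sketch. The `verify` callable and the `active_flows` table are assumptions of this sketch: `verify` stands for whichever token check the EP uses and returns the decoded flow information or None.

```python
from typing import Callable, Dict, Optional, Tuple

FORWARD, REJECT = "forward", "reject"


def forwarding_decision(token: str,
                        verify: Callable[[str], Optional[bytes]],
                        active_flows: Dict[bytes, object]) -> Tuple[str, object]:
    """Map a flow token to a forwarding action.

    Returns (FORWARD, flow) on success, or (REJECT, status_code) otherwise.
    """
    s = verify(token)
    if s is None:
        return REJECT, 403          # token fails the integrity check
    flow = active_flows.get(s)
    if flow is None:
        return REJECT, 430          # token is valid but the flow is gone
    return FORWARD, flow
```

Separating the integrity check (403) from the flow lookup (430) lets the upstream proxy distinguish a forged token from a flow that merely needs to be replaced.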

4.3.3 Keepalive mechanisms

The EP must also support the keepalive mechanisms presented in section 4.2.3: it
functions as a STUN server for UDP flows and supports TCP keepalive for TCP
connections.

4.4 Registrar behavior

As described in the SIP specification [8], a SIP client periodically sends a
REGISTER request to a server (known as a SIP registrar) to associate the client’s
SIP or SIPS URI with the machine into which the client is currently logged
(conveyed as a SIP or SIPS URI in the Contact header field). The registrar writes
this association, also called a binding, to a database, called the location
service. A REGISTER request can add a new binding between an AOR and one or more
contact addresses. A client can also remove previous bindings or query which
bindings are currently in place for an AOR.


SIP outbound updates the definition of a binding in [8]. The updated binding
behavior, which depends on the presence of instance-id and reg-id, is shown in
Table 4.1.

instance-id reg-id Binding Behaviour

Registrar * * Bind an AOR with the combination of

* instance-id and reg-id

* Invalid reg-id to be ignored

Normal binding behaviour

Table 4.1: Updated binding behaviour in SIP outbound

According to Table 4.1, a Contact header field value with an instance-id but
no reg-id is still valid. This does not apply to the reverse situation, in which
only a reg-id but no instance-id is present: the reg-id parameter is simply ignored
when the instance-id is not present. Moreover, the registrar must also be prepared
to receive, for the same AOR, some registrations that use instance-id and reg-id
and some that do not. This implies that the registrar has to work both as a normal
SIP registrar and as a registrar supporting SIP outbound when needed.
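A possible way to express these rules is to choose the key under which a binding is stored. The Python sketch below is not the registrar used in the thesis. In particular, treating a Contact that has an instance-id but no reg-id as a normal binding is an assumption of this sketch.

```python
from typing import Dict, Optional, Tuple


def binding_key(aor: str, contact_uri: str,
                instance_id: Optional[str] = None,
                reg_id: Optional[int] = None) -> Tuple:
    """Choose the key that identifies a binding for an AOR."""
    if instance_id is not None and reg_id is not None:
        # Outbound binding: AOR combined with instance-id and reg-id.
        return (aor, instance_id, reg_id)
    # reg-id without instance-id is ignored; other cases are treated as
    # normal bindings keyed by the contact URI (assumption of this sketch).
    return (aor, contact_uri)


def register(bindings: Dict[Tuple, dict], aor: str, contact_uri: str,
             instance_id: Optional[str] = None, reg_id: Optional[int] = None,
             path: Optional[str] = None, now: float = 0.0) -> Tuple:
    """Store or replace a binding and remember when it was last updated."""
    key = binding_key(aor, contact_uri, instance_id, reg_id)
    bindings[key] = {"contact": contact_uri, "path": path, "updated": now}
    return key
```

With this key, a UA that re-registers after a reboot with the same instance-id and reg-id overwrites its old binding even if its contact address has changed.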

The registrar must store all the Contact header field information, and store the

time at which the binding was last updated. If a Path header field is present, the

registrar stores this information as well. If the registrar receives a re-registration, it

must update any information that uniquely identifies the network flow over which

the request arrived, and should update the time the binding was last updated.

The registrar must include the ’outbound’ option-tag in a Supported header field

value in its responses to REGISTER requests for which it has performed outbound

processing. This explicitly informs EPs and UAs that this registrar supports SIP

outbound.

4.5 Authoritative proxy behavior

The AP entity is present when location service is needed by the UA. The location

service contains information that allows a proxy to input a URI and receive a set of

zero or more URIs that tell the proxy where to send the request [8]. This information

is created by registrations. As shown in Figure 4.4, UA1 looks up a registration

binding to get the contact information of UA2 by using the location service provided


by the AP and then sends a request through EP1. An AP selects a contact to use
in the normal way, with a few additional rules:

• The proxy must not populate the target set with more than one contact with
the same AOR and instance-id at a time. If a request for a particular AOR and
instance-id fails with a 430 (Flow Failed) response, the proxy should replace
the failed branch with another target (if one is available) with the same AOR
and instance-id, but a different reg-id.
• If the proxy receives a final response from a branch other than a 408 (Request
Timeout) or a 430 (Flow Failed) response, the proxy must not forward the
same request to another target representing the same AOR and instance-id.
The targeted instance has already provided its response.
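These two rules can be sketched as target selection over the bindings of one AOR. The dictionary fields and function names below are hypothetical, and picking the first binding per instance-id is an arbitrary choice of this sketch.

```python
from typing import Dict, List, Optional, Set

Binding = Dict[str, object]


def initial_targets(bindings: List[Binding]) -> List[Binding]:
    """Pick at most one contact per instance-id for the target set."""
    chosen: Dict[object, Binding] = {}
    for b in bindings:
        # Bindings without an instance-id are kept individually.
        key = b.get("instance_id") or ("contact", b["contact"])
        chosen.setdefault(key, b)
    return list(chosen.values())


def retry_target(bindings: List[Binding], failed: Binding, status: int,
                 tried_reg_ids: Set[int]) -> Optional[Binding]:
    """After a 408 or 430, try another flow to the same instance.

    Any other final response means the instance has answered: no retry.
    """
    if status not in (408, 430) or failed.get("instance_id") is None:
        return None
    for b in bindings:
        if (b.get("instance_id") == failed["instance_id"]
                and b.get("reg_id") not in tried_reg_ids):
            return b
    return None
```

A caller would add the reg-id of every attempted branch to `tried_reg_ids`, so that each flow to an instance is tried at most once per request.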

Chapter 5

Implementation

In this chapter, we describe how SIP outbound [5] was implemented as an
extension of the existing SIP framework, and how our implementation was integrated
into the WeSAHMI architecture.

5.1 Open source libraries

We used the open source SIP libraries eXosip2 and oSIP to build the basic SIP
application routine. To minimize changes to the original libraries’ interfaces, we
implemented most SIP outbound features at the application level. That is, all SIP
outbound features, except for the keepalive mechanisms (STUN and TCP keepalive),
were implemented by calling APIs provided by the eXosip2 library. eXosip2 is an
extension of the oSIP library, which is a low-level SIP library implementing SIP
transactions. The oSIP library provides SIP message parsing and wrappers.
eXosip2 sends and receives SIP messages in isolation, and creates a separate thread
for the SIP application built upon the eXosip2 and oSIP libraries. A transaction
state machine of the oSIP library calls callback functions to send SIP messages. A
listening socket needs to be initialized in another thread to receive incoming SIP
messages in the application program. eXosip2 provides the implementation of
the callback functions for sending outgoing SIP requests over a network transport.
In order to reuse already established TCP connections, eXosip2 looks up a data
structure which stores all previously established, still active UDP or TCP
connections.

The STUN keepalive mechanism and the flow token algorithm were implemented in
separate files. Please refer to appendix A for important modifications and data
structures. Other open source libraries, including OpenSSL, uuid and base64, were
also used to facilitate our implementation. OpenSSL is a cryptographic library
implementing the SSL, TLS [32] and DTLS [7] protocols. We used APIs provided
by OpenSSL to construct the HMAC for the flow token.

5.2 User agent routine

First the UA daemon, also called the client daemon, initializes the eXosip library,
which constructs some important data structures. Then it registers with the
registrar by sending two REGISTER requests through its primary and secondary EPs
respectively. The registrar may challenge the UA, so the UA should provide its
identity as its credential. Successful registrations are indicated by the UA
receiving 200 OK responses. This finally leads to the establishment of two direct
flows between the UA and its EPs. If either of these two flows fails, that is, if
any of the situations described in section 5.2.2 occurs, the UA should use the
backoff mechanism to re-register.

After the initial flow establishment, a timer is started, and the UA can
start normal SIP traffic. We assume it sends a SUBSCRIBE to a remote service
provider. First the UA consults the registrar to get the contact information
of the service provider. Then it sends the request to either of its two proxies
over an established flow. There is no preference as to which EP should be used
first. In our implementation, we always pick the primary EP to proxy requests. A
more intelligent mechanism is discussed in chapter 7. When the timer expires,
keepalive messages are sent. For SIP over UDP, STUN Binding requests are sent
(refer to section 5.4 for details); for TCP or TLS over TCP, the Linux kernel uses
TCP keepalive to keep the flow active.

5.2.1 Termination of a flow

Our system should be able to terminate a flow gracefully. Once the user wants to
terminate SIP communication, he or she can send a REGISTER request with a value
of 0 in the Expires header field. The registrar removes the binding so that no
further requests will be sent to the user’s UA.

Depending on the presence of the Contact and Expires headers [14] in the
REGISTER request, the registrar takes different actions, as shown in Table 5.1.

Request headers                   Registration behavior

Contact:*                         Cancel all registrations
Expires: 0

Contact:sip:[email protected];    Add URL to current registrations;
expires=30                        registration expires in 30 seconds

Table 5.1: Registration behavior

The REGISTER request may contain an expires parameter in the Contact header
or an Expires header field. According to [8], a REGISTER request with a wildcard
Contact header field must only be used together with an Expires header whose value
is 0, in order to remove all registrations. The expires parameter in the Contact
header is optional and only indicates the desired expiration time of the
registration. If it is absent, the Contact header uses the Expires header as the
default value.
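The precedence between the two expiry sources can be written as a short Python sketch. The function names are hypothetical, and the fallback of 3600 seconds used when neither value is present is an assumed local default, not something taken from the experiments.

```python
from typing import Optional


def effective_expires(contact_expires: Optional[int],
                      expires_header: Optional[int],
                      local_default: int = 3600) -> int:
    """Expiry in seconds requested for one Contact.

    The per-contact 'expires' parameter wins; otherwise the Expires header
    applies; otherwise an assumed registrar default is used.
    """
    if contact_expires is not None:
        return contact_expires
    if expires_header is not None:
        return expires_header
    return local_default


def is_remove_all(contact_value: str, expires_header: Optional[int]) -> bool:
    """A wildcard Contact is only meaningful together with 'Expires: 0'."""
    return contact_value.strip() == "*" and expires_header == 0
```

A request terminating a single flow would thus yield `effective_expires(...) == 0` for that Contact, whereas `is_remove_all` recognizes the wildcard form that cancels every registration.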

5.2.2 Failures of a flow

Taking the STUN keepalive and implementation practice into account, we categorize
the situations of a flow failure as follows:

• 503 (Service Unavailable) response;
• a change of the XOR-MAPPED-ADDRESS attribute in the STUN Binding Response;
• 408 (Request Timeout) response to a next-hop OPTIONS probe for STUN
support;
• 430 (Flow Failed) response;
• any transport layer failure, such as a fatal ICMP error;
• failure of a STUN request, for example after the STUN retransmissions are
exhausted.

If any of the above situations occurs, the UA considers the flow to have failed.
It then cleans up this flow and waits for the right time to re-register by using
the backoff mechanism.

5.2.3 Re-registration

We implemented the backoff mechanism described in section 4.2.2, so before the
UA registers again, it has to wait for a certain amount of time. The UA has to use
the same reg-id as for its previous flow, so that the registrar knows that this is
a new flow replacing the old one.


5.3 TCP keepalive

For SIP over TCP, or SIP over TLS over TCP, we use TCP keepalive. The Linux
kernel supports per-connection TCP keepalive, but it is disabled by default. We
enabled it by setting the SO_KEEPALIVE socket option at the SOL_SOCKET level [31].
This feature is integrated into the eXosip library: when the UA routine calls
eXosip_listen_addr with the TCP protocol, eXosip creates a TCP socket with
keepalive enabled. In addition, three TCP keepalive parameters need to be
configured:

• /proc/sys/net/ipv4/tcp_keepalive_time: the number of seconds the keepalive
routines wait before sending the first keepalive probe;
• /proc/sys/net/ipv4/tcp_keepalive_intvl: the time interval between keepalive
probes after the first probe;
• /proc/sys/net/ipv4/tcp_keepalive_probes: the number of consecutive
unacknowledged probes before the connection is marked as broken.

Many alternative methods can also be used to modify these parameters. We
simply picked the one most convenient for us.
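One of the alternative methods is to set the parameters per socket instead of system-wide. The following Python sketch illustrates this with the standard socket module; it is not the eXosip modification itself, and the per-socket options are applied only where the platform exposes them, as Linux does.

```python
import socket
from typing import Optional


def enable_keepalive(sock: socket.socket, idle: Optional[int] = None,
                     interval: Optional[int] = None,
                     probes: Optional[int] = None) -> None:
    """Enable TCP keepalive on one socket and optionally tune it.

    idle, interval and probes correspond to tcp_keepalive_time,
    tcp_keepalive_intvl and tcp_keepalive_probes respectively.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)  # not defined on every platform
        if value is not None and option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

Per-socket values override the system-wide defaults for that connection only, which avoids changing the keepalive behaviour of unrelated applications on the same host.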

5.4 STUN keepalive over UDP

5.4.1 Overview of the mechanism

Before addressing more technical details, we must clarify one point that may appear
confusing later. STUN support is relatively independent of SIP outbound. SIP
outbound requires STUN support, but a UA or proxy that supports STUN does not
necessarily need to support SIP outbound. STUN, or more generally the keepalive
mechanism, can therefore be perceived as an extension of SIP. This is one reason
why STUN keepalive was integrated into eXosip as independent files.

As specified in [5], we implemented a limited version of a STUN client and server
on the SIP UA and the SIP EP respectively. Only STUN Binding Requests, Binding
Responses, and Error Responses are needed.

The UA must generate STUN keepalive messages towards the EP to refresh the
binding on the NAT before it expires. Rather than using expensive application layer
messages such as SIP messages, the UA sends a STUN Binding request to the EP, to
exactly the same transport address used for SIP, such as port 5060 or 5061. This
has the effect of keeping the bindings in the NAT alive. The STUN Binding responses
inform the UA that the EP is still responsive, and also whether its transport
address towards the EP has changed. In our case, a change of transport address
indicates a failure of the flow. The time interval between STUN Binding requests
is a random time between 24 and 29 seconds [12].

The binding refresh usage requires STUN traffic to be multiplexed on the same
transport address as SIP, so STUN messages must first be separated from SIP
messages. A distinguishing feature is that all STUN messages start with a first
byte of either 0 or 1, whereas the first byte of a SIP packet never has a value of
0 or 1. This may not suffice if other valid application layer data packets could be
confused with STUN packets. STUN therefore defines a special field called the magic
cookie, which has a fixed 32-bit value, 0x2112A442. Even if a packet of another
protocol passes the first-byte check, there is only a one in 2^32 chance that its
second 32-bit word equals the magic cookie.
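The separation of STUN from SIP traffic can be sketched as a single check on the received datagram. The function name is our own; the two tests are exactly the first-byte rule and the magic cookie comparison described above.

```python
import struct

MAGIC_COOKIE = 0x2112A442
STUN_HEADER_LEN = 20


def looks_like_stun(datagram: bytes) -> bool:
    """Decide whether a datagram received on the SIP port is a STUN message."""
    if len(datagram) < STUN_HEADER_LEN:
        return False
    if datagram[0] > 1:             # STUN messages start with byte 0 or 1
        return False
    # The magic cookie occupies the second 32-bit word of the header.
    (cookie,) = struct.unpack_from("!I", datagram, 4)
    return cookie == MAGIC_COOKIE
```

SIP messages start with an ASCII method name or "SIP/2.0", so they are rejected by the first-byte test before the cookie is even examined.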

For SIP over UDP, eXosip opens one UDP socket, which we access through
eXosip.net_interface[0].net_socket. The eXosip variable is globally visible once
the eXosip library is initialized. STUN messages are sent through this socket
periodically. To reduce processing on the UA (which is a mobile phone in the
WeSAHMI scenario), all the registrations share the same timer. That is, when the
timer expires, the UA traverses all of its registrations and sends STUN Binding
requests over each of them.

5.4.2 STUN server and client

On the STUN server side, the server daemon reads the buffer from a socket and
then checks whether it contains a SIP or a STUN packet. If it is a STUN message,
the daemon passes the message to the STUN message parser instead of the SIP parser.
According to the type of the STUN message, the SIP state machine may mark one of
three kinds of events. These events do not trigger any state transitions in the
SIP state machine. They are only used to mark the type of non-SIP messages received
on the SIP port.

The new events added to the oSIP event types are as follows:

• RCV_BIND_REQUEST: an incoming STUN Binding request
• RCV_BIND_RESPONSE: an incoming STUN Binding response
• RCV_BIND_ERROR_RESPONSE: an incoming STUN Error response

The receiver (either STUN client or server) may thus generate the above events
after parsing the buffer. If the message is a STUN Binding request, the server
encodes a STUN Binding response including STUN attributes and sends it over the
same flow.

5.4.3 STUN attributes

The following attributes may be present in the attributes field of STUN response
messages, as shown in table 5.2:

Value    Name                 Binding Response   Error Response

0x0001   MAPPED-ADDRESS              *
0x0004   SOURCE-ADDRESS              *
0x0005   CHANGED-ADDRESS             *
0x0009   ERROR-CODE                                     *
0x000A   UNKNOWN-ATTRIBUTES                             *
0x0020   XOR-MAPPED-ADDRESS          *

Table 5.2: STUN attributes supported by the implementation

After receiving the STUN response with any of the above attributes, the STUN

client decides its next action, by checking the attributes present in Binding response.

5.4.4 STUN retransmission mechanism

Because UDP is a connectionless transport protocol, the reliability of STUN
messages is provided by the STUN client retransmission mechanism. Clients should
retransmit the request starting with an interval of RTO [33], doubling after each
retransmission.

Initial value for RTO should be configurable. 3 seconds is recommended [33]. The

value of RTO must not be rounded up to the nearest second.

The value of RTO should be cached by an agent after the completion of the

transaction, and used as the starting value for RTO for the next transaction to the

same host. The value should be considered stale and discarded after 10 minutes.

Retransmissions continue until a response is received, or a total of 7 requests have

been sent. If no response is received by 1.6 seconds after the last request has been

sent, the client should consider the flow to have failed [12].
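The resulting timing can be illustrated with a short Python sketch that lists when each request is sent and when the client gives up. The function name is our own; the parameters correspond to the values quoted above.

```python
from typing import List, Tuple


def retransmit_schedule(rto: float, max_requests: int = 7,
                        final_wait: float = 1.6) -> Tuple[List[float], float]:
    """Return the send times of all requests and the give-up time, in seconds.

    The first request is sent at time 0; the interval starts at RTO and
    doubles after each retransmission.
    """
    times = []
    t, interval = 0.0, rto
    for _ in range(max_requests):
        times.append(t)
        t += interval
        interval *= 2
    return times, times[-1] + final_wait
```

For example, `retransmit_schedule(3)` places the seven requests at 0, 3, 9, 21, 45, 93 and 189 seconds, so a smaller configured RTO shortens the time needed to detect a failed flow considerably.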


5.5 Edge proxy

For the sake of security, our system uses the second algorithm described in
section 4.3.1, since the first algorithm can only be used if the connection
between the EP and the registrar is integrity protected. The second algorithm uses
a keyed HMAC to assure the integrity of the flow token. This is a cheap and
efficient way to protect against malicious modification.

When it decides to generate a flow token according to the mechanism described in
section 4.3.2, the EP first generates a 20-octet random key, and then uses this key
to compute the keyed hash value of S, formatted according to Figure 4.3. By calling
APIs provided by the OpenSSL library, we obtain a 20-octet message digest. The EP
uses only the first 10 octets and concatenates them with S. The final step is to
apply base64 encoding to the result.

The validation of the token is the reverse procedure. We base64-decode the
token, compute the HMAC of the S extracted from the token, and then check whether
the computed HMAC and the one carried in the token are identical. We implemented
base64 encoding in independent files. The important interfaces can be found in
appendix A.

Chapter 6

Experimentation

6.1 Experimental infrastructure deployment

The experimental environment used for testing our implementation is shown in
figure 6.1. In the initial stage, the UA is manually configured with two outbound
proxy URIs (the minimal number of URIs required in [5]). We ignored DNS and
the location service and used IP addresses directly for the sake of simplicity.
Another open issue, left for future work, is that we did not test the reliability
of our system. Even though we established two direct flows to the UA’s two EPs, we
did not test how our system would behave if the primary EP failed and the UA had
to use the secondary EP.

The solid bi-directional arrows indicate the direct flows between the UA
and the EP, namely always-active UDP or TCP flows. The dashed bi-directional
arrows indicate indirect flows between the EPs and the registrar, which are
established when needed. We did not deploy APs, since we ignored the location
service.

Figure 6.1: Experimental environment

Figure 6.2: Flow sequence of SIP messages

Figure 6.2 illustrates the basic registration and SUBSCRIBE/NOTIFY procedure
that we ran against our system. In the following sections, we present these
messages in detail.

6.2 Experiment for SIP over UDP with SIP outbound features

The UA registers twice with the same registrar, through its primary and secondary
EPs respectively. The REGISTER requests generated by the UA are listed as follows:

These two REGISTER requests are almost the same, except for the Route headers
and the reg-id parameters in the Contact header fields, as shown in tables 6.1 and
6.2. In the Route header, we specified the IP address of the respective EP together
with two parameters. In this way, the REGISTER requests are proxied to these two
EPs, and the two parameters indicate that the EPs support loose routing and STUN
keepalive,

REGISTER sip:10.1.0.7 SIP/2.0

Via: SIP/2.0/UDP 10.1.0.10:5060;rport;branch=z9hG4bK1835142445

Route: <sip:10.1.0.11;lr;keep-stun>

From: <sip:[email protected]>;tag=37305113

To: <sip:[email protected]>

Call-ID: [email protected]

CSeq: 1 REGISTER

Contact: <sip:[email protected]:5060>;

+sip-instance="<urn:uuid:c00bb5b6-677f-4ab3-bdd7-f9ae756ea544>";

reg-id=1

Max-Forwards: 70

User-Agent: eXosip/3.0.1

Expires: 3600

auth: ffn:hash

Supported: path

Content-Length: 0

Table 6.1: REGISTER request proxied to the primary EP

that is, the EPs can work as STUN keepalive servers. In table 6.2, the reg-id
parameter is set to 2 in the Contact header of the REGISTER request sent to the
secondary EP. We later use this parameter to identify the different flows
established by the same UA. This information is recorded by the registrar together
with the Contact header. The Supported header shows that the UA supports the Path
header, so the EPs can later use this function if needed. We used a very simple
authentication mechanism, adding an Auth header to the request. The registrar is
configured to recognize the value of this field, so that requests with different
values will be denied. A more intelligent mechanism is expected in future work.

After receiving the REGISTER requests, the two EPs proxy them to the registrar
and deliver the responses from the registrar to the UA. The responses
received by the UA from the registrar through the two EPs are listed as follows:

In tables 6.3 and 6.4, we notice that a new header, the Path header, with three
parameters appears in the responses. That is because the EPs generate a flow token,
insert it into


REGISTER sip:10.1.0.7 SIP/2.0






CSeq: 2 REGISTER



reg-id=2

Max-Forwards: 70


Expires: 3600

auth: ffn:hash

Supported: path

Content-Length: 0

Table 6.2: REGISTER request proxied to the secondary EP

the Path header, and add the Path header to the REGISTER requests. After these
actions, the EPs proxy the requests to the registrar. The registrar records the
flow token as part of the binding information. Then the registrar forms its
responses by copying the Path header, and these eventually become the 200 OK
responses received by the UA. The value of the Supported header is set to outbound,
indicating that the EPs support the SIP outbound extension.

After the UA receives the two 200 OK responses, it sends a SUBSCRIBE request,
shown in table 6.5, to its content service provider, the Notifier, through its
primary EP. Whether the primary or the secondary EP is used is decided randomly.
If one EP fails, the UA can use the other one. In our experiment, the logical
Notifier is physically hosted on the registrar. Compared to the REGISTER request,
a new field is attached to the URI in the Route header: the flow token that the UA
extracted from the Path header of the 200 OK response. We do not list the response
to the SUBSCRIBE request, since it is a normal SIP response.


SIP/2.0 200 OK

Via: SIP/2.0/UDP 10.1.0.10:5060;rport=5060;branch=z9hG4bK1835142445





CSeq: 1 REGISTER



reg-id=1

Path:<sip:[email protected]:5060;lr;ob>

Max-forwards: 70

User-agent: eXosip/3.0.1

Expires: 3600

auth: ffn:hash

Supported: outbound

Content-Length: 0

Table 6.3: 200OK response received by the UA from the primary EP

Table 6.6 lists the NOTIFY request sent by the Notifier. Similarly, we notice the
flow token in the Route header. This request is forwarded to the UA’s primary EP,
which parses the flow token to find the exact flow and sends the request over it
to its final destination.

6.2.1 Experiment for STUN keepalive

After the first successful registration, we set the STUN keepalive interval to a
random time between 24 and 29 seconds. The UA then sends STUN Binding requests
periodically.

The STUN Binding request sent by the UA to its two EPs is binary data; in table 6.7
we list the parsed data in a human-readable form. As can be seen, we did not give
any value for the attributes field. This field may be used


SIP/2.0 200 OK

Via: SIP/2.0/UDP 10.1.0.10:5060;rport=5060;branch=z9hG4bK1094232440





CSeq: 2 REGISTER



reg-id=2

Path: <sip:[email protected]:5060;lr;ob>

Max-forwards: 70

User-agent: eXosip/3.0.1

Expires: 3600

auth: ffn:hash

Supported: outbound

Content-Length: 0

Table 6.4: 200OK response received by the UA from the secondary EP

later when errors occur in STUN messages. For the rest of the STUN message, we
simply padded with zeros to align it to 20 bytes. The data structure used in the
program is listed in appendix A.

The STUN Binding response is similar to the Binding request except for the
STUN message type field, which is 0x0101.
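The 20-byte header described here can be built and parsed with Python's struct module, as in the following sketch. The function names are our own; the zero-filled final 12 bytes mirror the zero padding mentioned above, whereas a general STUN client would normally place a unique transaction ID there.

```python
import struct

BINDING_REQUEST = 0x0001
BINDING_RESPONSE = 0x0101
MAGIC_COOKIE = 0x2112A442
HEADER_FORMAT = "!HHI12s"  # type, length, magic cookie, remaining 12 bytes


def build_header(msg_type: int, length: int = 0,
                 tail: bytes = b"\x00" * 12) -> bytes:
    """Encode a STUN header with no attributes (message length 0)."""
    return struct.pack(HEADER_FORMAT, msg_type, length, MAGIC_COOKIE, tail)


def parse_header(data: bytes) -> dict:
    """Decode the first 20 bytes of a STUN message."""
    msg_type, length, cookie, tail = struct.unpack(HEADER_FORMAT, data[:20])
    return {"type": msg_type, "length": length, "cookie": cookie, "tail": tail}


request = build_header(BINDING_REQUEST)
response = build_header(BINDING_RESPONSE)
```

The request and the response built this way differ only in the message type field, as noted in the text.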

6.3 Experiment for TCP keepalive

TCP keepalive is supported by the Linux kernel. We enabled the TCP keepalive
feature in our code as described in chapter 5, section 5.3. We captured the TCP
keepalive replies, which were segments with the ACK flag set and no data.


SUBSCRIBE sip:[email protected] SIP/2.0


Route: <sip:[email protected];lr;keep-stun>




CSeq: 20 SUBSCRIBE

Contact: <sip:[email protected]:5060>

Max-Forwards: 70


Expires: 3600

Event: resource-update

Service: finnair

Content-Length: 0

Table 6.5: SUBSCRIBE request sent by the UA to its Notifier.


NOTIFY sip:[email protected]:5060 SIP/2.0


Route: <sip:[email protected];lr;keep-stun>


To: <sip:[email protected]>;tag=1530564204


CSeq: 21 NOTIFY

Contact: <sip:[email protected]:5060>

Max-Forwards: 70


Subscription-State: active;expires=3595

Event: resource-update

Content-Type: application/soap+xml

Content-Length: 9

Table 6.6: NOTIFY request sent from the Notifier to the UA.

        Name             Length    Value
Header  First two bits   2 bits    0
        Message type     2 bytes   0x0001
        Message length   2 bytes   0x0000
        Magic cookie     4 bytes   0x2112A442

Table 6.7: STUN Binding request.

Chapter 7

Discussion

SIP outbound [5] does not specify how the outbound proxy registration URIs are configured. The configuration procedure can be regarded as a matter of implementation practice. A trusted third party can be used to distribute the outbound-proxy-set to UAs in the initial stage. In the WeSAHMI scenario, the WeSAHMI server, which provides a backbone for the whole platform, can act as this third party.

Each URI in the outbound-proxy-set can be resolved to several different physical hosts. In other words, one URI represents one logical EP, but one logical EP can be deployed on several physical hosts. This kind of deployment enhances scalability and reliability, since the failure of a single server cannot bring down the whole system. To deploy the system in this fashion, a DNS service is needed, and it must ensure that the different URIs in the outbound-proxy-set do not resolve to the same host.

Every UA may have at least two and up to four logical EPs. The SIP outbound draft does not specify which of them should be chosen to proxy requests. In our implementation, we simply picked the primary EP to proxy requests unless it had failed. In a large system with many UAs, however, the primary EP may become overloaded while the other EPs stay idle.
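
The primary-first policy described above can be sketched as follows. The structure and function names are hypothetical, and the example assumes that the EPs are stored in order of preference with the primary EP at index 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one logical EP in the outbound-proxy-set. */
struct edge_proxy {
    const char *uri;  /* registration URI of the EP */
    int alive;        /* nonzero while the flow to this EP is usable */
};

/* Returns the index of the EP that should proxy the next request:
 * the primary EP (index 0) while it is alive, otherwise the first
 * alive EP after it, or -1 if no EP is alive. */
int select_edge_proxy(const struct edge_proxy *eps, size_t count)
{
    assert(eps != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (eps[i].alive)
            return (int)i;
    }
    return -1;
}
```

Because the scan always starts at index 0, every UA with the same ordering loads the same EP first, which is the imbalance noted above.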

To optimize the system, we may design a way to distribute the workload evenly. For example, we might limit the number of direct flows from an EP to UAs. When this limit is reached, the EP refuses a UA’s connection and responds with a message that tells the UA to try another EP in its outbound-proxy-set. This response could be a 200 OK SIP response carrying a special header that distinguishes it from normal responses to requests. The value of the limit should be decided on the basis of practical measurements or a mathematical model.
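
The proposed limit on direct flows could be enforced by an admission check such as the one below. This is a sketch of the idea only: the names are hypothetical, the limit is a free parameter, and the SIP response that would tell the UA to try another EP is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-EP load record. */
struct ep_load {
    unsigned active_flows;  /* direct flows currently held by this EP */
    unsigned max_flows;     /* limit chosen by measurement or modelling */
};

enum admit_result { ADMIT_OK, ADMIT_TRY_ANOTHER_EP };

/* Called when a UA opens a new flow. Accepts the flow while the EP is
 * below its limit, otherwise asks the UA to try another EP. */
enum admit_result ep_admit_flow(struct ep_load *ep)
{
    assert(ep != NULL);
    if (ep->active_flows >= ep->max_flows)
        return ADMIT_TRY_ANOTHER_EP;
    ep->active_flows++;
    return ADMIT_OK;
}

/* Called when a flow is closed or declared dead. */
void ep_release_flow(struct ep_load *ep)
{
    assert(ep != NULL);
    if (ep->active_flows > 0)
        ep->active_flows--;
}
```

A fixed counter keeps the check cheap. A real deployment would probably also have to account for flows that die without an explicit close.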

We only implemented STUN over UDP, so client retransmission is desirable to achieve reliability. STUN is transparent to transport protocols, so it is also possible to implement it over TCP. If we implement STUN over TCP, we do not need to add client retransmission to STUN, since TCP is connection-oriented and retransmits lost segments itself.
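
A client retransmission scheme for STUN over UDP could follow a doubling schedule like the one sketched below. The function name, the initial timeout and the cap are assumptions made for this example and are not taken from our implementation.

```c
#include <assert.h>

/* Hypothetical helper: returns how long to wait, in milliseconds, for a
 * response after the given transmission attempt (0 for the first send).
 * The wait doubles after every attempt and never exceeds cap_ms. */
unsigned stun_retransmit_timeout_ms(unsigned initial_ms, unsigned attempt,
                                    unsigned cap_ms)
{
    unsigned t = initial_ms;

    assert(initial_ms > 0 && cap_ms >= initial_ms);
    for (unsigned i = 0; i < attempt; i++) {
        if (t > cap_ms / 2)   /* doubling would exceed the cap */
            return cap_ms;
        t *= 2;
    }
    return t;
}
```

With an assumed initial timeout of 500 ms and a cap of 8000 ms, the waits would be 500, 1000, 2000, 4000 and then 8000 ms for every later attempt.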

Chapter 8

Conclusions

8.1 Summary

In this thesis, we examined the SIP outbound protocol and its applications, and described our implementation of SIP outbound as a component of the WeSAHMI system. SIP outbound, as an extension of SIP, updates several behaviors of general SIP and makes NAT traversal possible. We then described how our implementation was integrated into the WeSAHMI architecture and how it worked with the whole system. Finally, we designed several experiments to evaluate our implementation. The experiments mainly cover the client-initiated connection features of SIP outbound and its keepalive mechanisms.

During the implementation, most of the difficulties we encountered came from the lack of documentation for the open source libraries we used, eXosip and oSIP. This may be a common problem for open source developers. Our implementation is built at the application level on top of these two libraries, so in principle it is enough for us to know which application programming interfaces (APIs) they provide. However, the documentation does not explain clearly or sufficiently how to use these APIs, so we had to inspect the source code thoroughly. Going through such a large body of source code was time consuming. On the other hand, it helped us learn how SIP transactions are implemented in the library. With this knowledge, we may later be able to integrate all the SIP outbound features into the library, so that other application developers can use it directly to build SIP applications that support the SIP outbound extension.


8.2 Future work

Our implementation only realizes STUN keepalive over UDP and enables TCP keepalive in the kernel. SIP outbound [5] also proposes CRLF keepalive. To make our system more intelligent, in the future we may enable the UA to select a keepalive approach according to its transport protocol and preferences.
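
Such a selection could look like the following sketch. The enumeration names and the mapping are assumptions: STUN keepalive is used for UDP as in our implementation, while a TCP flow uses either CRLF keepalive or the kernel keepalive depending on a preference flag.

```c
#include <assert.h>

/* Hypothetical enumerations for this example. */
enum transport { TRANSPORT_UDP, TRANSPORT_TCP };
enum keepalive { KEEPALIVE_STUN, KEEPALIVE_CRLF, KEEPALIVE_TCP_KERNEL };

/* Picks a keepalive approach from the transport of the flow and a
 * user preference. prefer_kernel_keepalive only matters for TCP. */
enum keepalive choose_keepalive(enum transport t, int prefer_kernel_keepalive)
{
    assert(t == TRANSPORT_UDP || t == TRANSPORT_TCP);
    if (t == TRANSPORT_UDP)
        return KEEPALIVE_STUN;
    return prefer_kernel_keepalive ? KEEPALIVE_TCP_KERNEL : KEEPALIVE_CRLF;
}
```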

In our experiments, we colocated the registrar and the notifier on one physical host. For the logical registrar, we stored the binding information in main memory instead of a database or other persistent storage. This was only a temporary solution for the registrar and should be improved in the future. To write the binding information to files, we need to consider how to format it so that it is easy to look up.

Since STUN keepalive is transparent to the transport protocol, we may also extend it to TCP connections. Its performance could then be evaluated in comparison with the kernel-enabled TCP keepalive. We may also implement a client retransmission mechanism for STUN over UDP to achieve higher reliability.

Scalability is also expected of the SIP outbound system. To achieve high scalability and fault tolerance, multiple physical hosts may be deployed for one logical EP entity. This may need an extra mechanism such as DNS SRV [3]. Moreover, an individual timer should be set for each registration when the registrar performs its binding operation.

SIP outbound also mentions SigComp compression [24]. When SigComp is applied, both communicating endpoints need to perform compression and decompression. This feature is desirable, since a SIP message may reach two thousand bytes or more, which is too large for wireless transmission.

Bibliography

[1] Understanding SIP. Internet, 2007. www.sipcenter.com/sip.nsf/.

[2] WeSAHMI System Specification, 2007.

[3] A. Gulbrandsen, P. Vixie, and L. Esibov. A DNS RR for specifying the location of services (DNS SRV). Network Working Group, 2000.

[4] Abhijit Sur, Dean Skidmore, and Shoma Chakravarty. Web services based SOA for next generation telecom networks. In IEEE international conference on services computing, page 520, 2006.

[5] C. Jennings and R. Mahy. Managing Client Initiated Connections in the Session Initiation Protocol. Internet Draft (work in progress), Internet Engineering Task Force, 2007.

[6] J. Postel, editor. Internet Protocol. Network Working Group, September 1981.

[7] E. Rescorla and N. Modadugu. Datagram Transport Layer Security. Network Working Group, April 2006.

[8] J. Rosenberg et al. SIP: Session Initiation Protocol RFC 3261. Internet Engineering Task Force, 2002.

[9] M. Handley, V. Jacobson, and C. Perkins. SDP: Session Description Protocol. Network Working Group, July 2006.

[10] Henry Sinnreich and Alan B. Johnston. Internet Communication Using SIP. 1st edition, October 2001.

[11] J. Rosenberg, R. Mahy, P. Matthews, and D. Wing. Traversal Using Relays around NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN). Internet Engineering Task Force, 2007.


[12] J. Rosenberg, C. Huitema, R. Mahy, and D. Wing. Simple Traversal Underneath Network Address Translators (NAT) (STUN). Internet Draft (work in progress), Internet Engineering Task Force, 2006.

[13] K. Johns. Routing of mid dialog requests using sip-outbound. Internet Draft (work in progress), Internet Engineering Task Force, 2006.

[14] Alan B. Johnston. Understanding the Session Initiation Protocol. 1st edition, 2001.

[15] S. Josefsson. The Base16, Base32, and Base64 Data Encodings RFC 3548. Internet Draft (work in progress), Internet Engineering Task Force, 2006.

[16] H. Krawczyk. HMAC: Keyed-Hashing for Message Authentication RFC 2104.

[17] Milind Buddhikot, Adiseshu Hari, Kundan Singh, and Scott Miller. MobileNAT: A New Technique for Mobility Across Heterogeneous Address Spaces. Mobile Networks and Applications, 10(3), 2005.

[18] David L. Mills. Computer Network Time Synchronization: The Network Time Protocol. 1st edition, March 2006.

[19] P. Leach, M. Mealling, and R. Salz. A Universally Unique IDentifier (UUID) URN Namespace RFC 4122. Internet Engineering Task Force, 2005.

[20] P. Srisuresh and M. Holdrege. IP network address translator (NAT) terminology and considerations RFC 2663. Network Working Group, 1999.

[21] P. Srisuresh and M. Holdrege. IP Network Address Translator (NAT) Terminology and Considerations. Internet Draft (work in progress), Internet Engineering Task Force, August 1999.

[22] Jonathan B. Postel. Simple Mail Transfer Protocol. Network Working Group, August 1982.

[23] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol–HTTP/1.1. Network Working Group, June 1999.

[24] R. Price, C. Bormann, J. Christoffersson, H. Hannu, and Z. Liu. Signaling Compression (SigComp). Network Working Group, January 2003.


[25] Howard Rheingold. Smart Mobs: The Next Social Revolution. 1st edition, October 2002.

[26] J. Rosenberg. Interactive Connectivity Establishment (ICE): A Methodology for Network Address Translator (NAT) Traversal for Offer/Answer Protocols. Internet Draft (work in progress), Internet Engineering Task Force, 2005.

[27] J. Rosenberg. Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols. Internet Draft (work in progress), Internet Engineering Task Force, 2007.

[28] Saikat Guha, Yutaka Takeda, and Paul Francis. NUTSS: A SIP-based Approach to UDP and TCP Network Connectivity. ACM SIGCOMM, 2004.

[29] What is SIP? Internet, 2007. http://www.sipcenter.com/sip.nsf/html/Background.

[30] Robert Sparks. SIP Basics and Beyond. ACM Press, 2007.

[31] W. Richard Stevens. UNIX Network Programming Volume 1 Networking APIs: Sockets and XTI. 2nd edition, January 1998.

[32] T. Dierks and E. Rescorla. The Transport Layer Security (TLS) Protocol Version 1.1. Network Working Group, April 2006.

[33] V. Paxson and M. Allman. Computing TCP’s Retransmission Timer RFC 2988.

[34] Victor Paulsamy and Samir Chatterjee. Network Convergence and the NAT/Firewall Problems. In System Sciences, 2003. Proceedings of the 36th Annual Hawaii International Conference, page 10, 2003.

[35] D. Willis and B. Hoeneisen. Session Initiation Protocol (SIP) Extension Header Field for Registering Non-Adjacent Contacts RFC 3327. Internet Engineering Task Force, 2002.

Appendix A

Appendix

A.1 Important data structures

STUN message header data structure and STUN message data structure:

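#include <sys/types.h>   /* u_int8_t, u_int16_t, u_int32_t */

/* STUN message header, 20 bytes on the wire. */
typedef struct stun_msg_hdr
{
    u_int16_t msgType;       /* message type, e.g. 0x0001 for a Binding request */
    u_int16_t msgLength;     /* length of the attributes after the header */
    u_int32_t magic_cookie;  /* fixed value 0x2112A442 */
    u_int8_t  id[12];        /* 96-bit transaction ID; C has no 96-bit integer type */
} stun_msg_hdr_t;

/* Parsed STUN message. Each has* flag tells whether the attribute that
   follows it is present. The stun_atr_* attribute types are defined
   elsewhere in the implementation and are not listed here. */
typedef struct stun_msg
{
    stun_msg_hdr_t msgHdr;

    int hasMappedAddress;
    stun_atr_address4_t mappedAddress;

    int hasSourceAddress;
    stun_atr_address4_t sourceAddress;

    int hasChangedAddress;
    stun_atr_address4_t changedAddress;

    int hasErrorCode;
    stun_atr_error_t errorCode;

    int hasUnknownAttributes;
    stun_atr_unknown_t unknownAttributes;

    int hasXorMappedAddress;
    stun_atr_address4_t xorMappedAddress;
} stun_msg_t;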

A.2 Important modifications to the eXosip and osip libraries

Library   File name              Function name
eXosip    eXconf.c               eXosip_keep_alive
          eXregister_api.c       eXosip_register_send_register
          eXtransport.c          eXosip_tcp_connect_socket
          udp.c                  eXosip_read_message
          stun.c; stun.h         new files
          base64.c; base64.h     new files
osip      osipevent.c            osip_message_parse
          osip_message_parse.c   pro_stunmsg; compare_addr

Table A.1: Modifications to the eXosip and osip libraries

A.3 APIs for base64 encoding

#include <stdbool.h>   /* bool */
#include <stddef.h>    /* size_t */

void base64_encode (const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t outlen);

bool base64_decode (const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t *outlen);

A.4 APIs for STUN keepalive

int stun_parse_message (char *buf, unsigned int bufLen,
                        stun_msg_t *pmsg, int verbose);

unsigned int stun_encode_message (const stun_msg_t msg, char *buf,
                                  unsigned int bufLen, int verbose);
