BP109: Sametime Voice and Video in the Real World Jeremy Sanders, ThinkRite Ltd


Page 1: BP109 : Sametime Voice and Video in the Real World

BP109: Sametime Voice and Video in the Real World

Jeremy Sanders, ThinkRite Ltd

Page 2: BP109 : Sametime Voice and Video in the Real World

Bigger On the Inside; When It's Working, Troubleshoot It; When It's Fixed, Make It Mobile; Bigger On the Outside; When It's Resilient, Break It; When It's Secure, Hack It; Lessons Learned; Outside the Box

Page 3: BP109 : Sametime Voice and Video in the Real World

Bigger on the Inside

More to unlock - which you have already paid for - than you may have previously thought

Simple per-user licensing

No additional software cost to add voice and video

No additional software cost to cluster for scale and reliability

Mobile device access is always included

Page 4: BP109 : Sametime Voice and Video in the Real World

Bigger on the Inside - Sametime AV Product Options

“Sametime Audio Video” : Connect Client to Client AV “calls”, click user (no number)

“Sametime Voice” / “ST ” / “SUT-Lite” : Client calls to/from Phones and external Video System/Clients – by numbers/SIP URIs (sip:...@...)

Android / iOS Mobile Clients provide connectivity for the Mobile User

Sametime Meetings offers a zero-download AV browser client

Sametime Video Manager/MCU will talk to any/all such clients for Conferencing

(Full-Fat) Sametime Unified Telephony : phone control (full telephony)

All of the above uses SIP, SDP and RTP – for more details see last year’s presentation

http://www.slideshare.net/kbmsg/jmp206

[Diagram: Sametime packages Communicate, Conference and Complete, plus SUT]

Page 5: BP109 : Sametime Voice and Video in the Real World

Crash Recap of Voice, Video, Conferencing and Telephony Terminology

SIP - Session Initiation Protocol: standard for making calls (sessions) between endpoints using INVITEs; endpoints which may move typically REGISTER first

SDP - Session Description Protocol: standard for describing audio/video/etc sessions

(S)RTP - (Secure) Real-time Transport Protocol: standard for sending/receiving audio/video/etc in packets

Codec - standard for packaging audio/video – G.711 is telephone quality voice, G.729 (patented/licensed) and iLBC (open source/free) are highly compressed audio

MCU - Multipoint Control Unit: audio/video mixer for conference calls

TLS - Transport Layer Security: encryption standard providing secure communications. Early versions of TLS were called SSL (Secure Sockets Layer).
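To make the SDP and codec terms above concrete, here is a minimal sketch that reads the audio media line of an SDP body and names the codecs offered. The SDP text, addresses and port are invented for illustration and are not taken from a Sametime trace; the payload type numbers 0, 8 and 18 are the static assignments from the RTP audio/video profile.

```python
import re

# Static RTP payload types (RTP A/V profile) for the codecs named above.
STATIC_PAYLOAD_TYPES = {
    0: "PCMU (G.711 mu-law)",
    8: "PCMA (G.711 A-law)",
    18: "G729",
}

# Hypothetical SDP body, invented for illustration only.
EXAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=alice 1 1 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 20830 RTP/AVP 0 8 97",
    "a=rtpmap:97 iLBC/8000",
]) + "\r\n"


def audio_codecs(sdp):
    """Return (port, [codec names]) for the first audio media line in an SDP body."""
    dynamic = {}          # dynamic payload types declared with a=rtpmap
    port, payloads = None, []
    for line in sdp.splitlines():
        rtpmap = re.match(r"a=rtpmap:(\d+) ([^/\s]+)", line)
        if rtpmap:
            dynamic[int(rtpmap.group(1))] = rtpmap.group(2)
        elif line.startswith("m=audio ") and port is None:
            fields = line.split()
            port = int(fields[1])                      # RTP port the endpoint listens on
            payloads = [int(p) for p in fields[3:]]    # payload types in preference order
    names = [dynamic.get(p) or STATIC_PAYLOAD_TYPES.get(p, "unknown(%d)" % p)
             for p in payloads]
    return port, names


if __name__ == "__main__":
    print(audio_codecs(EXAMPLE_SDP))
```

The longer the codec list, the larger the SDP, which is why packet-size limits matter once real clients are involved.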

Page 6: BP109 : Sametime Voice and Video in the Real World

KickOff: Consider Your Raison D’Etre for Sametime Voice and Video

Do you want to cut costs by reducing phone handsets?

– Does Sametime Voice and Video therefore need to be as reliable as your PBX?

– Can ST fit into your dialplan?

Or cut costs by centralizing external calls?

– Watch out for internal billing issues as well as regulatory restrictions

– If you want to keep external calls routing out of each site, configuration is very complex without SUT

Or cut costs by using internal conferencing?

Or simply Improve Collaboration?

Page 7: BP109 : Sametime Voice and Video in the Real World

KickOff: When you think you know what to do, Assume you don’t!

Hold at least one full day workshop with all parties - including decision makers - to

– Discuss functional as well as non-functional (scale, resilience, security) requirements

– Ensure everyone is aware of all the possibilities

Compile, document and plan to perform a comprehensive Test Plan

– Anything not tested is not guaranteed to work – therefore test comprehensively and do not just cover a few use cases

Plan a Pilot in an equivalent environment to production OR Plan a suitably sized Reference/Staging environment – clustered and secure if these will be used in production – reconfiguring and re-testing for complex issues in production is painful!

Page 8: BP109 : Sametime Voice and Video in the Real World

Bigger On the Inside; When It's Working, Troubleshoot It; When It's Fixed, Make It Mobile; Bigger On the Outside; When It's Resilient, Break It; When It's Secure, Hack It; Lessons Learned; Outside the Box

Page 9: BP109 : Sametime Voice and Video in the Real World

“When It’s Working, Troubleshoot It!”

Understanding the basic ways the calls flow

– Arm yourself for real troubleshooting

– Prepare your mind for the added complexities of clustering

Page 10: BP109 : Sametime Voice and Video in the Real World

ST Connect Client to Client Call Flow

[Diagram: Client, CS, CM, SIPPR, BWM and Client, linked by VP, SIP and RTP, with steps numbered 1 to 9]

Client asks Conference Manager (CM) to set up a call via Virtual Places (VP) request to Community Server (CS) (1,2)

CM sends all SIP requests through SIP Proxy and Registrar (SIPPR) (3,6)

SIPPR may consult with Bandwidth Manager (BWM) – a B2BUA* which can modify SDP or deny call (4,7)

CM/SIPPR sends requests to Caller Client first (3,5) and then Called client (6,8)

Clients accept calls (200 OK) with media details in SIP SDPs - these flow through the above paths in ACKs and (re-)INVITEs, giving each client the details

Real-time Transport Protocol (RTP) audio/video flows directly from client to client (9)

* B2BUA = SIP Back to Back User Agent, this is two SIP User Agents (UAs) combined: a User Agent Server (UAS) which receives a call and a separate User Agent Client (UAC) which initiates a new call based heavily on the original call but modified as required

Video Manager not involved even for Video Calls

Two “calls” without BWM, Four with it

SIPPR and BWM “see everything” EXCEPT conditions at/between the Clients

Page 11: BP109 : Sametime Voice and Video in the Real World

Conference Leg Call Flow

[Diagram: Client, CS, CM, SIPPR, BWM, VMGR and VMCU, linked by VP and SIP, with steps numbered 1 to 10]

Client asks CM to set up calls via VP request to CS (1,2)

CM sends SIP requests to clients through SIPPR (3,5)

SIPPR consults with BWM if configured (4)

SIPPR sends request to Client and it accepts call (200 OK) (5)

CM sends call request direct to Video Manager (VMGR) (6)

VMGR sends new call request (like a B2BUA) to Video MCU (VMCU) via SIPPR and BWM if configured (7,8,9) – VMCU accepts call (200 OK), responding via port 15000

Confirmations (ACK) flow through the above paths, ultimately exchanging client AV details with the VMCU in the SIP SDPs

RTP AV flows between VMCU and clients (10)

As a result of BWM and VMGR there are 5 calls/sessions here, and even SIPPR cannot see the entire set (CM talks directly to VMGR, with call/session details different to the VMGR call to VMCU). Without BWM there would still be 3 calls, two of which SIPPR would not see: CM would send one call to VMGR, and VMGR would send a call with different call/session details directly to VMCU.

Note: The CM, VMGR and VMCU also communicate with each other to ready conference bridges for use by means of XML over HTTPS/HTTP on various ports such as 8443, 9443, 443 and 8080


Three “calls” without BWM, Five with it

SIPPR “blind” to CM <-> VMGR

VMGR always involved
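The call counts quoted on the two call-flow slides (two or four for client to client, three or five for a conference leg) follow from one rule: a B2BUA ends one call and starts another, so every leg that passes through BWM counts twice. The sketch below is only a reading of the slides, not an IBM model; the leg lists and the function name are assumptions made for illustration.

```python
# Each leg is (description, passes_through_bwm_when_configured).
# These lists are an interpretation of the two call-flow slides.
CLIENT_TO_CLIENT = [
    ("CM -> SIPPR -> caller client", True),
    ("CM -> SIPPR -> called client", True),
]

CONFERENCE_LEG = [
    ("CM -> SIPPR -> client", True),
    ("CM -> VMGR (direct, SIPPR never sees it)", False),
    ("VMGR -> VMCU (via SIPPR and BWM if configured)", True),
]


def count_calls(legs, bwm_configured):
    """Count SIP calls/sessions: a leg through a B2BUA (BWM) is two calls, not one."""
    return sum(2 if (via_bwm and bwm_configured) else 1 for _, via_bwm in legs)


if __name__ == "__main__":
    for name, legs in (("client to client", CLIENT_TO_CLIENT),
                       ("conference leg", CONFERENCE_LEG)):
        print(name, count_calls(legs, False), "without BWM,",
              count_calls(legs, True), "with BWM")
```

Counting the legs this way is a quick check on how many separate Call-IDs to expect when reading traces from each server.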

Page 12: BP109 : Sametime Voice and Video in the Real World

Bigger On the Inside; When It's Working, Troubleshoot It; When It's Fixed, Make It Mobile; Bigger On the Outside; When It's Resilient, Break It; When It's Secure, Hack It; Lessons Learned; Outside the Box

Page 13: BP109 : Sametime Voice and Video in the Real World

“When It’s Fixed, Make it Mobile!”

Mobile Client access is available for all ST Packages: Communicate, Conference and Complete

Traversing the public internet, DMZ, etc. securely adds significant complexity

Page 14: BP109 : Sametime Voice and Video in the Real World

External/Mobile Clients (Best Practice)

[Diagram: Mobile Client on the Internet / Private Intranet / Wifi / 4G etc; HRPS, SIP EDGE and TURN in the DMZ; ST Proxy, CS, SIPPR, CM, VMGR and VMCU on the Corporate Intranet; links labelled HTTPS, VP, SIP, STUN/TURN and tunnelled RTP]

Mobile Clients use an HTTP Reverse Proxy Server (HRPS) to talk to the Sametime Proxy Server which translates from HTTPS to Virtual Places (VP), allowing Mobile Clients to access all of the services of Community Server

Mobile Clients rely on SIP EDGE server for SIP to reach SIPPR and TURN server for RTP to reach intranet - such as VMCU or other clients

An External ST Connect Client would use a Sametime Multiplexer (MUX) in the DMZ instead of HRPS and ST Proxy but would still use SIP EDGE and TURN servers

The Sametime Meetings zero-download browser client plugin for AV also uses the Sametime Proxy Server (and HRPS if external)

[Diagram also shows BWM, DB2 and APNs, with SIP and RTP links]

Page 15: BP109 : Sametime Voice and Video in the Real World

Client to External/Mobile Client Call

[Diagram: Client, CS, CM, SIPPR, BWM on the Corporate Intranet; SIP EDGE and TURN in the DMZ; External/Mobile Client on the Internet / Private Intranet / Wifi / 4G etc; links labelled VP, SIP, STUN/TURN, RTP and tunnelled RTP, with steps numbered 1 to 12]

Flow is as Client to Client flow but SIP Edge server handles SIP to External/Mobile Client (9)

External/Mobile Client uses Interactive Connectivity Establishment (ICE) with STUN (Session Traversal Utilities for NAT) / TURN (Traversal Using Relays around NAT) server to determine all RTP candidates (10) before media flows - which in this case uses TURN server to relay the RTP (11,12)
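As background on what "determine all RTP candidates" involves, the sketch below applies the generic candidate priority formula and recommended type preferences from the ICE specification. It is not taken from Sametime and says nothing about how the Sametime clients weight their candidates; the candidate addresses and the local preference of 65535 are assumptions for illustration.

```python
# Recommended type preferences from the ICE specification.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)."""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))


# Hypothetical candidates gathered by one client (addresses are documentation ranges).
CANDIDATES = [
    ("relay", "198.51.100.7:40000"),   # allocated on the TURN server
    ("host", "192.0.2.10:20830"),      # the client's own interface
    ("srflx", "203.0.113.5:31000"),    # public address learned via STUN
]


def ordered(candidates):
    """Sort candidates from most to least preferred."""
    return sorted(candidates, key=lambda c: candidate_priority(c[0]), reverse=True)


if __name__ == "__main__":
    for cand_type, address in ordered(CANDIDATES):
        print(cand_type, address, candidate_priority(cand_type))
```

Under these generic preferences a relayed path is the last resort, which fits the slide's wording that TURN is used "in this case", that is, when nothing more direct works.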

Page 16: BP109 : Sametime Voice and Video in the Real World

Conference Leg with External/Mobile Client

[Diagram: CS (VP from ST Proxy or MUX in DMZ), CM, SIPPR, BWM, VMGR and VMCU on the Corporate Intranet; SIP EDGE and TURN in the DMZ; External/Mobile Client on the Internet / Private Intranet / Wifi / 4G etc; links labelled VP, SIP, STUN/TURN, RTP and tunnelled RTP, with steps numbered 1 to 13]

Flow is as Conference Leg flow but SIP Edge server handles SIP to External/Mobile Client (6)

External/Mobile Client uses ICE with STUN / TURN server to help determine RTP candidates (7) before final negotiation of RTP stream - which in this case uses TURN server to relay the RTP (12,13)


Page 17: BP109 : Sametime Voice and Video in the Real World

Considerations for Mobile/External Clients

Use Split Horizon DNS if services are available both internally and externally – inside and outside addresses for:

– SIP Proxy and Registrar / SIP EDGE

– TURN Server (0.0.0.0 internally)

– Sametime Proxy Server / HRPS

– Sametime Meeting Server / HRPS

– Community Server / Mux

Consistent domain name (eg, thinkrite.com) for LTPA tokens to work correctly

TLS Certificates from official Certificate Authority using this consistent domain name

For STUN/TURN no NAT can be configured and firewalls must be in transparent/bridging mode as Clients must be able to connect to TURN servers in DMZ directly for STUN (3478) to work

VMCU must be able to talk to TURN in the same way and send/receive RTP (20830+/40000+)

Page 18: BP109 : Sametime Voice and Video in the Real World

Troubleshooting Mobile/External Clients

488 Not Acceptable Here often indicates an unexpected failure to establish AV via ICE/STUN/TURN – check that TURN server (via its hostname on STUN port 3478) AND other Client is reachable (Firewalls / NAT / VPNs / routing / DNS may prevent it – this may not be immediately evident as both Clients may be able to chat through CS/MUX/Proxy, reach SIPPR/EDGE, etc.)

ICE time-out errors – AV may still be established – network/negotiations may be strangely slow – try changing RTO in Media Manager ICE properties in Sametime System Console to 500
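One way to check the "TURN server reachable on STUN port 3478" point from a client's network is to send a plain STUN Binding request and see whether a success response comes back. The following is a minimal sketch using the standard STUN header layout (message type, length, magic cookie, 12-byte transaction ID). It assumes the server answers unauthenticated Binding requests over UDP, which this deck does not confirm for the Sametime TURN server; the function names are illustrative.

```python
import os
import socket
import struct

STUN_BINDING_REQUEST = 0x0001
STUN_BINDING_SUCCESS = 0x0101
STUN_MAGIC_COOKIE = 0x2112A442


def build_binding_request(transaction_id=None):
    """Build a 20-byte STUN Binding request with no attributes."""
    tid = transaction_id or os.urandom(12)
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + tid


def is_binding_success(reply, request):
    """True if reply is a Binding success response to this request."""
    if len(reply) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", reply[:8])
    return (msg_type == STUN_BINDING_SUCCESS
            and cookie == STUN_MAGIC_COOKIE
            and reply[8:20] == request[8:20])   # transaction IDs must match


def stun_reachable(host, port=3478, timeout=2.0):
    """Send one Binding request over UDP; False on timeout, DNS or socket errors."""
    request = build_binding_request()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(request, (host, port))
            reply, _addr = sock.recvfrom(2048)
        except OSError:
            return False
    return is_binding_success(reply, request)
```

Calling `stun_reachable` with the TURN server's external hostname from the mobile network, and with its internal hostname from the intranet, tests both halves of the split-horizon set-up. A False result only says that no valid answer arrived within the timeout, not why.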

Page 19: BP109 : Sametime Voice and Video in the Real World

Bigger On the Inside; When It's Working, Troubleshoot It; When It's Fixed, Make It Mobile; Bigger On the Outside; When It's Resilient, Break It; When It's Secure, Hack It; Lessons Learned; Outside the Box

Page 20: BP109 : Sametime Voice and Video in the Real World

Scaling Up SIPPR/CM: WAS-SIP Container-Based Servers

WebSphere Application Server can be clustered vertically (on same machine) or horizontally (on different machines) – in either case the active memory is shared

For SIP Applications the amount of communications to share active memory between physical machines is very high

WAS Clusters must be fronted by a WebSphere Proxy Server (simple to create using SSC/Deployment Manager) with the main IP address, this is a stateless SIP Proxy which load balances WAS instances and offloads the actual TCP/IP or TLS connections from them

[Diagram: WS Proxy Server (WPS) receives TCP/UDP/TLS, distributes load and maintains session across WAS1 and WAS2 in a WAS Cluster with a shared environment]

Page 21: BP109 : Sametime Voice and Video in the Real World

Gotchas for Scaling up Conference Manager / SIP Proxy & Registrar

Without Clustering both CM and SIPPR can be on same server, but with Clustering they must be in separate clusters

Limiting factor is the ability of WS Proxy to handle connections (now in SIP/SIPS_PROXY_CHAIN > inbound channel, was 20,000 before)

OS capabilities may need to be tuned as may external factors such as LDAP

Installing multiple WAS instances on the same machine may result in port conflicts (can be resolved by manual editing or WAS 8.5.5.2)

Some manual editing of files outside of SSC configuration is required - clustered CMs each need a separate stavconfig.xml file with a different NotificationServerHost (CM’s own FQDN) / NotificationServerPort (normally 9443)

– http://www-01.ibm.com/support/docview.wss?uid=swg21663243

Page 22: BP109 : Sametime Voice and Video in the Real World

How One becomes Many – Scaling Up

[Diagram: Standalone Media Manager (CM and PR; could also include SSC and DB2) compared with Clustered Media Manager: CM WS Proxy Server (PS1) in front of CM WAS Cluster (CM1, CM2) and SIPPR WS Proxy Server (PS2) in front of SIPPR WAS Cluster (PR1, PR2), with SSC and DB2; SSC includes deployment manager for all CM, SIPPR, PS, etc.; SIP ports shown are :5060, :5080, :506x, :506y, :508x and :508y]

Page 23: BP109 : Sametime Voice and Video in the Real World

Gotchas for Scaling up ST 9 SIPPR

Single Handled Domain must be configured for ST9

– Use the same FQDN as the DNS for SIPPR, same domain as in your certificates

– Clients/trunks setting this domain is all important – all incoming calls/SIP is expected to feature this name in the Request URI/To headers for SIPPR to use rules to send calls to clients – all other SIP will just be forwarded according to Request URI (which could result in a loop and 483 Too Many Hops if that address comes back to SIPPR itself)

– For Sametime Voice/Phone/SUT-Lite Conference Manager constructs a MESSAGE for client notification based on the received INVITE, only sending it to the Proxy Registrar if the Request URI for a received call matches the SIP Proxy Registrar FQDN shown in stavconfig.xml

sippr.thinkrite.com

Page 24: BP109 : Sametime Voice and Video in the Real World

Scaling Up VMGR and VMCU Servers

VMCUs run on Linux only and are not WebSphere/Java-based, they can be configured in resource pools for specific geographic areas or for other purposes

VMGRs while running with SIP in WebSphere (on Linux only) do not use the WebSphere SIP Container so cannot use the WebSphere Proxy – they include their own load balancer component running on ports 5080 and 7443 instead of 5060 and 8443

Solid database replicates information from Master (M) to Hot Standby (HS) and other Replicas (R)

[Diagram: VMGR Farm (VMGR1 to VMGR3, with load balancer components VMGR MLB, VMGR HSLB and VMGR RLB on :5080/:7443) distributes load and maintains session to the VMCU Farm (VMCU1 to VMCU3, in VMCU pool 1 and VMCU pool 2); other ports shown are :5060, :8443 and :8080]

Page 25: BP109 : Sametime Voice and Video in the Real World

“End to End” AV Scaling (without EDGE/TURN)

[Diagram: Client and CS; CM Cluster (CM1, CM2), SIPPR Cluster (PR1, PR2) and BWM Cluster (BWM1, BWM2), each with a WS Proxy (WPS1 to WPS3); DB2; VMGR Farm (VMGR1, VMGR2) with VMGR LB1 and LB2; VMCU Farm (VMCU1 to VMCU3); flows shown for Calls to Clients, Inbound Calls (SUT-Lite) and Conference Calls]

Page 26: BP109 : Sametime Voice and Video in the Real World

Bigger On the Inside; When It's Working, Troubleshoot It; When It's Fixed, Make It Mobile; Bigger On the Outside; When It's Resilient, Break It; When It's Secure, Hack It; Lessons Learned; Outside the Box

Page 27: BP109 : Sametime Voice and Video in the Real World

“When it’s Resilient, Break It!”

(Take full backups and test restore procedure first!)

Perform Failover testing, initially “gently” but also try more severe tests

Have clients logged in and make calls at the time of the tests to see what happens

Page 28: BP109 : Sametime Voice and Video in the Real World

Redundancy for WAS-SIP Container-Based Servers like SIPPR and CM

For a single IP address to reach these clusters use simple Load Balancers/IP Sprayers such as WebSphere EDGE Components LB for IPv4/6 or F5 BIG-IP LTM

[Diagram: Load Balancers LB1 and LB2 share a Virtual IP Address (VIP) that can fail over between them and spray IP/details from a single address over TCP/UDP/TLS to WS Proxy Servers WPS1 to WPS3, which distribute load and maintain session over TCP/UDP/TLS to WAS1 to WAS3 in a WAS Cluster with a shared environment]

Page 29: BP109 : Sametime Voice and Video in the Real World

Load Balancers

One Load Balancer server can theoretically be used for all Sametime Servers (a redundant pair is obviously recommended!)

– Needs a FQDN and Virtual IP address for each Sametime Service (SIPPR, CM, VMGR, Proxy, Meetings, TURN) – plus its own physical address(es)

MAC Forwarding – fastest option (and LB out of IP connection) but must be on same VLAN

– Necessary to set up a loopback (extra, non-ARP, not in routing table) IP address on the WS Proxy etc. to receive packets from the Load Balancer

– LVS/IPVS uses same technique for “Direct Connection” (F5 calls this L2 nPath routing)

Other methods overcome VLAN/loopback limitations but are slower and interfere more

– Encapsulation/Tunnelling (F5 L3 nPath routing), NAT/SNAT (source address translation), etc.

– With SNAT must configure special settings in WS Proxy to rewrite packet details – IP address of Load Balancer to FQDN of service

In /etc/sysctl.conf, or at runtime with sysctl -w:

sysctl -w net.ipv4.conf.all.arp_ignore=3 net.ipv4.conf.all.arp_announce=2

ip addr add $CLUSTER_ADDRESS/32 scope host dev lo

Page 30: BP109 : Sametime Voice and Video in the Real World

Gotchas for Scaling up

The Load Balancer must be extremely simple for the WS Proxy / Application Server logic to work correctly

– Ideally just the Layer 2 (MAC) address details of the IP packet are changed to forward the packet, allowing the WS Proxy to take over negotiating the entire TCP/TLS session

– If the Load Balancer is to actually read and forward a new TCP/TLS packet no SIP details should be changed and no new headers should be added

– For F5 BIG-IP Local Traffic Manager (LTM) do not configure SIP / SIP Persistence / mblb profiles as these result in LTM acting like a SIP UA/Proxy and Via/Record-Route headers are added – this results in lost connections after around 5 minutes because:

- the WS Proxy detects this SIP UA is in front of the client and doesn’t add RFC 5626 flow tokens

- special TCP keep-alive messages on the SIP connection do not make it through the F5

Page 31: BP109 : Sametime Voice and Video in the Real World

WS Proxy Health Check Settings

An intelligent Load Balancer will only send packets to online WS Proxies – which it can determine from responses to SIP OPTIONS requests – the WS Proxy should respond immediately to such OPTIONS

(in comparison the WS Proxy uses Distribution and Consistency Services (DCS) rather than SIP to determine if its Application Servers – eg, SIPPR - are running)

If you need to configure more than two addresses it is possible to modify the comma separated LBIPAddr setting in the file proxy-settings.xml – but returning to the configuration page will remove all but the first two addresses
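For readers who have not seen one, the sketch below builds the kind of SIP OPTIONS request a load balancer health monitor would send, and pulls the status code out of a reply. It is a generic SIP message with the mandatory headers, not the exact probe any particular load balancer emits; the `healthcheck` user name and the function names are assumptions for illustration.

```python
import uuid


def build_options(target_host, local_host, local_port=5060, transport="TCP"):
    """Build a minimal SIP OPTIONS request as a string ending in a blank line."""
    branch = "z9hG4bK" + uuid.uuid4().hex          # RFC 3261 branch magic cookie
    lines = [
        "OPTIONS sip:%s SIP/2.0" % target_host,
        "Via: SIP/2.0/%s %s:%d;branch=%s" % (transport, local_host, local_port, branch),
        "Max-Forwards: 70",
        "From: <sip:healthcheck@%s>;tag=%s" % (local_host, uuid.uuid4().hex[:8]),
        "To: <sip:%s>" % target_host,
        "Call-ID: %s@%s" % (uuid.uuid4().hex, local_host),
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def status_code(response):
    """Return the numeric status of a SIP response, or None if it is not one."""
    first = response.split("\r\n", 1)[0].split()
    if len(first) >= 2 and first[0] == "SIP/2.0" and first[1].isdigit():
        return int(first[1])
    return None


if __name__ == "__main__":
    print(build_options("sippr.thinkrite.com", "lb1.thinkrite.com"))
```

A monitor would typically treat any prompt SIP response as "proxy is up" and a timeout as "proxy is down"; which status codes count as healthy is a load balancer setting, not something this sketch decides.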

Page 32: BP109 : Sametime Voice and Video in the Real World

WS Proxy IP Forwarding Load Balancer and other Custom Properties

contactRegistryEnabled false for faster shutdown

disableAllHostNameLookups should be set to true for performance, this does not affect the use of hostnames in the below IPSprayer settings…

tcp/tls/udp.IPSprayer.host is the hostname of the virtual IP of the load balancer – ie, for the SIPPR it is the hostname of the address to which clients expect to connect

ipForwardingLBEnabled true – replaces the host and port from LB with the IPSprayer.host/port details

isSipComplianceEnabled false to avoid logging interoperability events for TCP keep-alives, etc.

enableMultiClusterRouting true to allow (eg, keep-alive) packets with apparently invalid routing info to SIPPR

http://www-01.ibm.com/support/docview.wss?uid=swg21666746

Page 33: BP109 : Sametime Voice and Video in the Real World

WS Proxy Custom Property for Older Clients

Older clients (Including ST 8.5.2 embedded in Notes 9 – especially common on Linux where full AV/SUT is not yet available in ST 9) need special handling:

– Import WebSphereSIPProxy/ConnectionReuseFilter.jar from disk 1 of Media Manager as an Asset on the WebSphere Proxy Server

– Configure a Business Level Application (BLA) and BLA CU (Composition Unit) using this artefact

– Set forceRport=true custom property

http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/install/inst_config_clus_av_sippr_wasproxy_filter.dita

Page 34: BP109 : Sametime Voice and Video in the Real World

How One becomes Many – SIPPR/CM Redundancy

[Diagram: Standalone Media Manager (CM and PR; could also include SSC and DB2) compared with Clustered Media Manager: Load Balancers LB1 and LB2 with VIPs in front of WS Proxy Servers (CM PS1, CM PS2, PR PS1, PR PS2), which front the CM and PR WAS Clusters (CM1, CM2, PR1, PR2), with SSC and DB2 HADR; SSC includes deployment manager for all CM, SIPPR, PS, etc.; SIP ports shown are :5060, :5080, :506x, :506y, :508x and :508y]

Page 35: BP109 : Sametime Voice and Video in the Real World

Redundancy for VMGR and VMCU Servers

VMCUs run on Linux only and are not WebSphere/Java-based, they can be configured in resource pools for redundancy

VMGRs while running with SIP in WebSphere (on Linux only) do not use the WebSphere SIP Container or WebSphere Proxy – they include their own load balancers which are aware of where requests were previously sent and are being handled

For a single IP address to reach the VMGRs use an IP Sprayer which is SIP (5080/5081) and HTTP/HTTPS (7443) compliant (the same as for other Sametime servers is fine)

[Diagram: IP Sprayers IS1 and IS2 share a Virtual IP Address (VIP) that can fail over between them and spray IP/details from a single address to the VMGR Farm (VMGR1 to VMGR3, with load balancer components VMGR MLB, VMGR HSLB and VMGR RLB on :5080/:7443), which distributes load and maintains session to the VMCU Farm (VMCU1 to VMCU3, in VMCU pool 1 and VMCU pool 2); other ports shown are :5060, :8443 and :8080]

Page 36: BP109 : Sametime Voice and Video in the Real World

“End to End” AV Redundancy (without EDGE/TURN)

[Diagram: Client and CS; pairs of Load Balancers (LB1, LB2) with a VIP in front of each tier; CM Cluster (CM1, CM2), SIPPR Cluster (PR1, PR2) and BWM Cluster (BWM1, BWM2), each with a pair of WS Proxys (WPS1 to WPS6); DB2 HADR; VMGR Farm (VMGR1, VMGR2) with VMGR LB1 and LB2; VMCU Farm (VMCU1 to VMCU3); flows shown for Calls to Clients, Inbound Calls (SUT-Lite) and Conference Calls]

Page 37: BP109 : Sametime Voice and Video in the Real World

How Highly Available is a Clustered Sametime AV Environment?

Failover of a MAC-Forwarding Load Balancer should not affect calls

– Load Balancer is not involved in the actual connection, only new incoming connections

– Connection information can also be replicated from one Load Balancer to its partner(s)

Loss of a WebSphere Application Server should not affect calls – shared environment

– However some SIP being processed by that Application Server could be lost, disrupting call set-up, tear-down or continuation of a very small number of calls

Loss of a WS Proxy will result in calls being lost

– Unless you use UDP (which cannot normally cope with the size of packets which include all the Sametime Codecs) the TCP/TLS connection from the client was established to a specific WS Proxy so if that goes down its connections are dropped

– Each connection is a client’s ability to make/receive/continue calls so any calls are lost and the clients will have to re-REGISTER when they detect the failure (within 1 minute, configurable)

– WS Proxies can be clustered but this does not provide High Availability / Connection information being shared or any method to maintain TCP/TLS connection

Page 38: BP109 : Sametime Voice and Video in the Real World

High Availability Comparison – Sametime Unified Telephony

[Diagram: Client and CS; Load Balancers LB3 and LB4 with a VIP in front of WS Proxys WPS3 and WPS4 and the SIPPR Cluster (PR1, PR2); Telephony Control Server nodes with SIPSM1/SIPSM2, CSTASM1/CSTASM2, UCE1/UCE2 and Solid DB; Telephony Application Server nodes FW1/MS1/WAS1 and FW2/MS2/WAS2 plus a spare node (FW?, MS?, WAS?) on a SAN; VIPs; IP PBXs]

Active/Active Telephony Control Server (TCS) Cluster 99.999% available

Hot/Hot Solid DB replication

Hot/Hot Universal Call Engine (UCE) with shared call context memory

SIP Service Manager (SIPSM) and Computer Supported Telecoms Apps Service Manager (CSTASM) can failover

Telephony Application Server Cluster with Hot Standby: Framework (FW) and Media Server (MS) on one SAN partition and WebSphere Application Server (WAS) on another

System Automation for MultiPlatforms (SAMP) and Reliable Scalable Cluster Technology (RSCT) manages failover to spare node

Softphone calls still go through SIPPR Cluster

Page 39: BP109 : Sametime Voice and Video in the Real World

Comparing Other types of High Availability and Scalability

High Availability Disaster Recovery (HADR) replication for DB2 server pair with SAMP/RSCT handling failover – no Virtual IP Addresses/Aliases – DB2 clients aware of both servers

VMware High Availability – much like SUT TAS but fails over the entire virtual machine

VMware Fault Tolerance – much like SUT TCS, second virtual machine in vLockStep becomes active upon failure of first - but can only use one vCPU until SMP-FT in ESXi 6.0

Page 40: BP109 : Sametime Voice and Video in the Real World

Scalability and Redundancy for Other Sametime servers

SIP EDGE Servers scale up in the same way as SIPPR and CM using WS Proxy and LBs

Sametime Meeting Servers scale up in the same way using WS Proxy for HTTP and LBs

Bandwidth Manager can scale up in the same way but only with two nodes

– uses WAS7 so needs its own Deployment Manager to configure the cluster

Sametime Proxy Servers do not need WS Proxy Servers (just Load Balancers)

TURN Servers can be fronted by IP or MAC Forwarding Load Balancers – http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/install/inst_config_turn_properties.dita

– Remember that no NAT can be configured and firewalls must be in transparent/bridging mode, Clients must (appear to) be able to connect to TURN servers in DMZ directly

Page 41: BP109 : Sametime Voice and Video in the Real World

Bigger On the Inside; When It's Working, Troubleshoot It; When It's Fixed, Make It Mobile; Bigger On the Outside; When It's Resilient, Break It; When It's Secure, Hack It; Lessons Learned; Outside the Box

Page 42: BP109 : Sametime Voice and Video in the Real World

“When it’s Secure, Hack It!”

HTTPS / TLS / SRTP should be configured

– Force web traffic to SSL/TLS using boundary devices/firewalls

– Test media with TCP/RTP first and then switch to TLS/SRTP and re-test

– 3rd party devices may need certificates exchanged for TLS to work

– If need be can have some (eg, intranet to VCS) connections using TCP and others using TLS

Certificates from official Certificate Authority should be used on internet side

Discover what it takes to decode TLS using Wireshark

Discover what it could take to commit fraud or a DoS attack

Appreciate why you need to keep certificates and their (non-default!) passwords safe

Tighten security as a result of any findings and re-test to check nothing is broken

openssl pkcs12 -in key.p12 -nocerts -nodes -out decryptkey.pem

Page 43: BP109 : Sametime Voice and Video in the Real World

SSO and Securing anonymous access

Edit stavconfig.xml, changing SIPAuthenticationType to LTPA if you have configured SSO

Enable anonymous access by token authentication on CS to avoid DoS attacks

http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/config/st_adm_security_allow_token_auth_enable.dita

Ensure there is an anonymous user in LDAP

Put the shared key txt files in a directory which can be found – with appropriate permissions - on both SIPPR and CM (not in regular WAS profile directories which are unique per system) and set shared secret key paths in WAS Trust Association Interceptors, restart SIPPR and CM and check stavconfig.xml has these paths

Set TURNTokenAuthEnabled=true if clients are all ST 9.0 (TURN authentication not supported by previous clients)

For TURN server put file from SecretKeyPathForTurnAuthToken and key txt files in root directory / and put filenames in TurnServer.properties

Page 44: BP109 : Sametime Voice and Video in the Real World

Bigger On the Inside; When It's Working, Troubleshoot It; When It's Fixed, Make It Mobile; Bigger On the Outside; When It's Resilient, Break It; When It's Secure, Hack It; Lessons Learned; Outside the Box

Page 45: BP109 : Sametime Voice and Video in the Real World

Heads Up on Common Issues – IP Telephony

Restrictions on packet size (eg, UDP / SIP-aware firewalls) cause issues with the long list of codecs, ICE/STUN/TURN candidates and encryption options in SIP from Sametime Clients and VMCU – IP telephony may not have hit this issue in the same environment

G.729 is not currently supported except with SUT, iLBC is not yet supported in ST9, calls over the WAN may be prevented from using G.711 – discuss your needs and options with IBM

Lossy codecs especially in combination or used twice (eg, on an external conference bridge) may produce poor voice quality – ensure such use cases are evaluated

SIP session timers may provoke issues – set these low for testing and high for production

Test on/off hold, transfers, any conferencing and other special features like bridging, TLS...

Page 46: BP109 : Sametime Voice and Video in the Real World

Heads Up on Common Issues – WiFi and Firewalls

Corporate WiFi is a completely different environment to Mobile Data – test both!

Corporate Guest WiFi is another different environment - ensure the expectation and/or testing receives focus early on, as changes in this environment are a sensitive area

WiFi in other environments (some airports, hotels, etc.) may also be too restrictive

– Using Mobile Data instead by switching off WiFi on phone would be expected to work

Move non-standard ports to 80 and 443 where possible to overcome firewall issues, or specifically ask for firewalls to be opened for SIP (5060) and TURN (3478) and RTP

Page 47: BP109 : Sametime Voice and Video in the Real World

Lessons Learned in Hosting

VMCU really needs dedicated hardware meeting minimum spec (4 core, 8GB RAM) which is best placed on-premises in customer data center to keep latency to a minimum

VMCU requires eth0 to be used for its connection to VMGR - this is not generally possible to create without access to a Bare Metal Server (BMS)

Once you have one BMS get a second for redundancy and/or high speed (consistent performance guaranteed iops) shared storage between them for clones

Reserving CPU, memory and bandwidth as documented are all important in high-performance enterprise environments (much less so in small evaluations but reserve now or suffer later)

BMS reboots can cause datastore corruption – resist the urge to exploit simplistic automated monitoring which can result in this!

Page 48: BP109 : Sametime Voice and Video in the Real World

Why I ♥ UNIX/Linux/AIX/…

Pick an OS which gives you fast, secure access to the command line and the ability to troubleshoot the entire foundation of the system from that command line including the boot process and background processes

Standardize on one OS … logically RHEL or SLES by virtue of VMGR / VMCU

Make exceptions where necessary (eg, Document Conversion, ability to restart services without restarting entire Community Server)

Page 49: BP109 : Sametime Voice and Video in the Real World

OS Tips

Use bonding to both protect against physical adapter failure and simplify virtual machine cloning

Reduce TCP keepalive time to prevent backed-up queues – net.ipv4.tcp_keepalive_time=60

Reduce TCP FIN timeout to allow connections to end faster – net.ipv4.tcp_fin_timeout=30

Check and increase default system limits – ulimit / limits.conf / sysctl.conf

Use LVM with ext3/ext4 and leave space for snapshots

Install wireshark before you need it
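The two TCP settings above can be audited across servers with a short script. This is a minimal sketch assuming a Linux host where kernel parameters are exposed under /proc/sys; the thresholds are simply the values from this slide, and the function names are illustrative.

```python
from pathlib import Path

# Values suggested on the slide; treat them as starting points for your
# environment, not universal defaults.
RECOMMENDED_MAX = {
    "net.ipv4.tcp_keepalive_time": 60,
    "net.ipv4.tcp_fin_timeout": 30,
}


def read_sysctl(name, proc_sys=Path("/proc/sys")):
    """Read an integer kernel parameter from its /proc/sys file (Linux only)."""
    path = proc_sys.joinpath(*name.split("."))
    return int(path.read_text().strip())


def audit(recommended=RECOMMENDED_MAX, proc_sys=Path("/proc/sys")):
    """Return {name: (current, recommended)} for parameters above the recommended value."""
    findings = {}
    for name, limit in recommended.items():
        current = read_sysctl(name, proc_sys)
        if current > limit:
            findings[name] = (current, limit)
    return findings
```

An empty result from `audit()` means both parameters are already at or below the suggested values; the `proc_sys` argument exists so the check can be pointed at a copy of the tree for testing.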

Page 50: BP109 : Sametime Voice and Video in the Real World

Draw a diagram

Draw (or purloin) some deployment diagrams to share with IBM support – they will ask you for them

– at a minimum include all Sametime components, proxies, load balancers

– if possible include additional detail on firewalls, VPNs, etc.

Page 51: BP109 : Sametime Voice and Video in the Real World

Bigger On the Inside When It's Working Troubleshoot It When It's Fixed Make It Mobile Bigger On the Outside When It's Resilient, Break It When It's Secure Hack It Lessons Learned

Page 52: BP109 : Sametime Voice and Video in the Real World

Phones Outside the Box

The call flows we showed included only Clients – but the scenarios can also involve Phones

Sametime Meetings can also call out to Phones with simple SIPPR rules

– Condition: Method=INVITE RequestURI=sip:[0-9]{6}@.*

– Destination: Request-URI pattern=sip:(.+)@.* Output pattern=sip:[email protected]:5060;transport=tcp

– Also set TelephoneConferenceEnabled=true in ConferenceManager.properties in /opt/IBM/WebSphere/profiles/*/installedApps/*/ConferenceFocus.ear/ConferenceFocus.war

Phones can also call into conference calls set up by Sametime Meetings

– Condition: Method=INVITE RequestURI=sip:[0-9]{4}@.* Source Address=ippbx.x.y.com

– Destination: sip:stvmgr.x.y.com:5060;transport=tcp
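The matching logic of these rules can be tried out offline before touching the proxy. The sketch below is illustrative only: it uses Python regular expressions rather than the SIPPR engine, it assumes the condition must match the whole Request-URI, and the outbound target `pbx.example.com` is a hypothetical stand-in for the real PBX address.

```python
import re

# Patterns copied from the routing rules on this slide.
OUTBOUND_CONDITION = re.compile(r"sip:[0-9]{6}@.*")  # Meetings calling a 6-digit number
INBOUND_CONDITION = re.compile(r"sip:[0-9]{4}@.*")   # phone dialling a 4-digit conference
REQUEST_URI_PATTERN = re.compile(r"sip:(.+)@.*")

# Hypothetical PBX address, used only for illustration.
PBX_TEMPLATE = r"sip:\1@pbx.example.com:5060;transport=tcp"


def route_outbound(method, request_uri):
    """Return the rewritten Request-URI if the outbound rule matches, else None."""
    if method != "INVITE" or not OUTBOUND_CONDITION.fullmatch(request_uri):
        return None
    # Keep the dialled number (capture group 1) and swap the host part.
    return REQUEST_URI_PATTERN.sub(PBX_TEMPLATE, request_uri, count=1)


def route_inbound(method, request_uri, source_address, pbx_host="ippbx.x.y.com"):
    """Return the Video Manager destination if the inbound rule matches, else None."""
    if (method == "INVITE" and source_address == pbx_host
            and INBOUND_CONDITION.fullmatch(request_uri)):
        return "sip:stvmgr.x.y.com:5060;transport=tcp"
    return None
```

Feeding a handful of real Request-URIs from a trace through both functions is a quick way to confirm that the digit counts in the conditions match your dial plan.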

Page 53: BP109 : Sametime Voice and Video in the Real World

ST telephone numbers

ST will normally REGISTER using what is in the telephoneNumber field from LDAP

In fact ST really uses whatever is in the Person document cache – which is taken from the Business Card

Business Card can be changed in SSC or by editing XML but the Telephone Number field should normally show the PSTN number

If there is no (valid) telephoneNumber then some outbound calls may work using the e-mail address registration for P2P calls – but a valid telephoneNumber is required for reliable ST telephone number and/or sip dialling

Obviously the numbers in LDAP for ST must be unique!
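Uniqueness is easy to verify against a directory export. The following is a minimal sketch that assumes you have already extracted (user, telephoneNumber) pairs from LDAP by whatever means you prefer; the normalization rule (dropping spaces, dashes, dots and parentheses) is an assumption and may need adjusting to your number format.

```python
import re
from collections import defaultdict


def normalize(number):
    """Drop spaces, dashes, dots and parentheses so formatting variants compare equal."""
    return re.sub(r"[\s\-.()]", "", number)


def find_duplicates(entries):
    """Find telephone numbers held by more than one user.

    entries: iterable of (user_id, telephone_number) pairs, e.g. from an LDAP export.
    Returns {normalized_number: [user_ids]} for every shared number.
    """
    owners = defaultdict(list)
    for user_id, number in entries:
        if number:  # users without a number are skipped, not reported
            owners[normalize(number)].append(user_id)
    return {num: users for num, users in owners.items() if len(users) > 1}
```

Any non-empty result lists the users whose registrations would collide.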

Page 54: BP109 : Sametime Voice and Video in the Real World

How can you call ST from a phone?

Assuming a user has a real phone and its number is in telephoneNumber then calls to telephoneNumber would go to the real phone

An internal dialling code could be used to reach the softphone instead if IP PBXes can transform the dialled number (SIPPR configuration cannot transform a received number to another number – the “To:” header cannot be manipulated – what is received in INVITE must match what is REGISTERed)

An external dialling convention is not possible but a call-forward on the real phone could reach the softphone

SUT has superior support both for allowing a user to select their preferred device to receive a call on and integration without call-forwarding, called and calling number translation, etc.

Page 55: BP109 : Sametime Voice and Video in the Real World

ST Plugin allows a field other than telephoneNumber for softphone

Custom Plugin allows use of other Business Card fields or other LDAP fields

Often users are allowed to edit their Telephone Number

– by using another field, issues with user-edited numbers can be eliminated

Page 56: BP109 : Sametime Voice and Video in the Real World

3rd Party Video Conferencing

Different capabilities are available with different vendors – external TCSPI integration allows CS / CM to start conferences and provide moderator controls

Some integration can be achieved through SIP addressing (ST softphone)

– (Outbound) Condition: Method=INVITE Request URI=.*@x\.y\.com.*

– Destination: Request URI pattern=sip:(.+)@.* Output pattern=sip:[email protected]:5060;transport=tcp

– (Inbound) Condition: Method=INVITE Source Address=dma.x.y.com

– Destination: sip:stcm.x.y.com:5060;transport=tcp (Push Route)

It is also possible to integrate with 3rd party video clients through such a bridge

– (Outbound) Condition: Method=INVITE Request URI=.*\.3pvc.*

– Destination: Request URI pattern=sip:(.+)@.* Output pattern=sip:[email protected]:5060;transport=tcp

Page 57: BP109 : Sametime Voice and Video in the Real World

Monitoring Inside and Outside the Box

ThinkRite managed services hinge on proactive monitoring scripts and a server dashboard (not a product), which run 24/7 and notify staff of potential issues. Many scripts run checks on the servers, but dedicated SIP and VP (watchit) bots run on both the intranet and the internet and can send independent alerts, as can the dashboard itself if updates dry up

There are many interfaces which may be useful for monitoring; identify which you can use

– the Bandwidth Manager ISC and STDBBWM database are particularly useful for monitoring calls

db2 -x "select cast(fromuserid as varchar(50)),cast(touserid as varchar(50)),endtime,endreason from bwm_media_sessions where starttime > (current_timestamp -1 day)"

– also the Conference Manager logs callsummary.log.0 and conference.log.0 in …WebSphere/AppServer/profiles/*/logs/STMediaServer

– (REST) APIs on Conference Manager, Video Manager, … (see Links)

Greatest assurance of course remains Connect Client tests for which watchit is invaluable
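The "alert if updates dry up" idea is a simple dead-man's switch. The sketch below is not ThinkRite's dashboard code; it only illustrates the principle, and the source names and the five-minute threshold are assumptions.

```python
from datetime import datetime, timedelta, timezone


def stale_sources(last_seen, now=None, max_age=timedelta(minutes=5)):
    """Return sorted names of sources whose last update is older than max_age.

    last_seen maps a source name (server, bot, script) to the timezone-aware
    datetime of its most recent report.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(name for name, seen in last_seen.items() if now - seen > max_age)
```

Run on a schedule from a host that is independent of the monitored servers, a non-empty result is the trigger for an alert, which covers the case where a monitoring script itself has silently stopped.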

Page 58: BP109 : Sametime Voice and Video in the Real World

Useful Links

http://www.slideshare.net/a8us/utf-8enibm-sametime-9-voice-and-video-deployment

http://www-01.ibm.com/support/docview.wss?uid=swg27040186&aid=1

http://www-10.lotus.com/ldd/stwiki.nsf/xpViewCategories.xsp?lookupName=Voice%20and%20Video

– BWM deployment best practices, CM and VMGR REST APIs, new tricks for ST AV…

https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/WebSphere+SIP+and+CEA/page/Configuring+and+Deploying+WebSphere+SIP+Environments

http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tsip_tunelinux.html

Page 59: BP109 : Sametime Voice and Video in the Real World

Links not relevant to Sametime AV

https://www.ibm.com/developerworks/community/wikis/home?lang=en#/wiki/WebSphere%20SIP%20and%20CEA/page/Achieving%20High%20Availability%20with%20WebSphere%20Application%20Server%20SIP%20Container%20and%20F5%20BIG-IP%20Local%20Traffic%20Manager (does not apply to Sametime!)

http://www.f5.com/pdf/deployment-guides/ibm-sametime-dg.pdf (does not include SIP!)

Page 60: BP109 : Sametime Voice and Video in the Real World

Related Sessions

Mon 1:00pm Mockingbird 1 & 2 MAS204 IBM Sametime Deployment Do’s and Don’ts: Tips, Tricks, Perils and Pitfalls

Tues 1:00pm Swan SW 1-2 BP103 Solving the Weird, Obscure and The Mind-Bending

Tues 3:45pm Dolphin S Hem 1 ID102 IBM Sametime: Design and Implementation of a full HADR Deployment

Weds 10:30am Mockingbird 1&2 ID109 Digital Nightmares – The Biggest Performance Killers in Your Environment

Weds 11:45am Swan SW 7-10 ID112 Connect the Dots: IBM Sametime Audio/Video Planning, Deployment, Troubleshooting and Beyond

Weds 1:30pm Dolphin S Hem 1 ID108 Mobile Security Roundup

Page 61: BP109 : Sametime Voice and Video in the Real World

Who Was That Man?

Jeremy Sanders, MSc (Proj Mgmt) is the Chief Technical Officer of ThinkRite Ltd (UK/EMEA) and continues to work with the ThinkRite team to integrate and develop enhancements for IBM SUT, Sametime Voice/Softphone (“SUT-Lite”) and IBM Unified Messaging. He’s been involved with IBM in development, integration, support and administration of what we now call Unified Communications for over 20 years.

For further details see the first few slides of last year’s presentation…

http://www.slideshare.net/kbmsg/jmp206

Page 62: BP109 : Sametime Voice and Video in the Real World

Think What?

ThinkRite Ltd is the European division of ThinkRite Inc/ThinkRite Pty

ThinkRite provides Sametime/SUT installation services, managed services, hosting services, development services and innovative products including ThinkRite Assistant – Single Click to connect to all voice and web meetings using Sametime softphone and Mobile clients

http://www.thinkrite.com/brochures/ThinkRite%20Assistant%20Brochure.pdf

One unique system for internal and external

Secured VPN to connect to Directory and PBX if needed

Available anywhere and on mobile devices without VPN access

Cloud 9.0

Page 63: BP109 : Sametime Voice and Video in the Real World

Engage Online

SocialBiz User Group socialbizug.org

– Join the epicenter of Notes and Collaboration user groups

Social Business Insights blog ibm.com/blogs/socialbusiness

– Read and engage with our bloggers

Follow us on Twitter

– @IBMConnect and @IBMSocialBiz

LinkedIn http://bit.ly/SBComm

– Participate in the IBM Social Business group on LinkedIn

Facebook https://www.facebook.com/IBMConnected

– Like IBM Social Business on Facebook