TCP Performance for Mobile Applications
Vladimir Kirillov, @darkproger
Networking Stack

Level        Protocol               API / Implementation
Data Link    WiFi, Edge, 3G, LTE    hardware
Network      IP                     kernel
Transport    TCP                    SOCK_STREAM
Application  HTTP                   (Http|NS)URLConnection
Session      TLS                    OpenSSL

Introspection: gdb, ptrace, socket API, bpf(4) / LSF, dtrace
capturing iPhone traffic
% udid=$(system_profiler SPUSBDataType \
    | awk '/iPhone/{go=1} /Serial/ {if (go) print $3; go=0}')
276cb9530201bcehelloworldcd55560ed015d00
% rvictl -s $udid
Starting device 276cb9530201bcehelloworldcd55560ed015d00 [SUCCEEDED]
% ifconfig rvi0
rvi0: flags=3005<UP,DEBUG,LINK0,LINK1> mtu 0
capturing Android traffic
# adb connect 192.168.56.100
# adb shell
shell@android:/ $ su
Test prop
su allows access thanks to androVM.su.bypass property
shell@android:/ # tcpdump -i eth1

tcpdump -i lo0 -w t.pcap -s0 &
nc -l 5000 &
echo hello | nc localhost 5000
kill %1
# tcpdump -r t.pcap -nnvv -tttt -K 'tcp port 5000'
2012-11-24 12:23:35.511134 IP6 (hlim 64, next-header TCP (6) payload length: 44)
    ::1.51734 > ::1.5000: Flags [S], seq 453038127, win 65535, options [mss 16324,nop,wscale 4,nop,nop,TS val 303407352 ecr 0,sackOK,eol], length 0
2012-11-24 12:23:35.511175 IP6 (hlim 64, next-header TCP (6) payload length: 20)
    ::1.5000 > ::1.51734: Flags [R.], seq 0, ack 453038128, win 0, length 0
2012-11-24 12:23:35.511226 IP (tos 0x0, ttl 64, id 8400, offset 0, flags [DF], proto TCP (6), length 64)
    127.0.0.1.51735 > 127.0.0.1.5000: Flags [S], seq 2527137802, win 65535, options [mss 16344,nop,wscale 4,nop,nop,TS val 303407352 ecr 0,sackOK,eol], length 0
2012-11-24 12:23:35.511276 IP (tos 0x0, ttl 64, id 58311, offset 0, flags [DF], proto TCP (6), length 64)
    127.0.0.1.5000 > 127.0.0.1.51735: Flags [S.], seq 494520280, ack 2527137803, win 65535, options [mss 16344,nop,wscale 4,nop,nop,TS val 303407352 ecr 303407352,sackOK,eol], length 0
2012-11-24 12:23:35.511287 IP (tos 0x0, ttl 64, id 47796, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.51735 > 127.0.0.1.5000: Flags [.], seq 1, ack 1, win 9186, options [nop,nop,TS val 303407352 ecr 303407352], length 0
2012-11-24 12:23:35.511298 IP (tos 0x0, ttl 64, id 52186, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.5000 > 127.0.0.1.51735: Flags [.], seq 1, ack 1, win 9186, options [nop,nop,TS val 303407352 ecr 303407352], length 0
2012-11-24 12:23:35.511332 IP (tos 0x0, ttl 64, id 31417, offset 0, flags [DF], proto TCP (6), length 58)
    127.0.0.1.51735 > 127.0.0.1.5000: Flags [P.], seq 1:7, ack 1, win 9186, options [nop,nop,TS val 303407352 ecr 303407352], length 6
2012-11-24 12:23:35.511351 IP (tos 0x0, ttl 64, id 29060, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.51735 > 127.0.0.1.5000: Flags [F.], seq 7, ack 1, win 9186, options [nop,nop,TS val 303407352 ecr 303407352], length 0
2012-11-24 12:23:35.511354 IP (tos 0x0, ttl 64, id 4019, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.5000 > 127.0.0.1.51735: Flags [.], seq 1, ack 7, win 9186, options [nop,nop,TS val 303407352 ecr 303407352], length 0
2012-11-24 12:23:35.511367 IP (tos 0x0, ttl 64, id 20879, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.5000 > 127.0.0.1.51735: Flags [.], seq 1, ack 8, win 9186, options [nop,nop,TS val 303407352 ecr 303407352], length 0
2012-11-24 12:23:35.511378 IP (tos 0x0, ttl 64, id 59633, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.51735 > 127.0.0.1.5000: Flags [F.], seq 7, ack 1, win 9186, options [nop,nop,TS val 303407352 ecr 303407352], length 0
2012-11-24 12:23:35.511388 IP (tos 0x0, ttl 64, id 56794, offset 0, flags [DF], proto TCP (6), length 52)
    127.0.0.1.5000 > 127.0.0.1.51735: Flags [F.], seq 1, ack 8, win 9186, options [nop,nop,TS val 303407352 ecr 303407352], length 0
17 packets captured
% stat -f %z t.pcap
1306
% tcptrace t.pcap
17 packets seen, 17 TCP packets traced
elapsed wallclock time: 0:00:00.001344, 12648 pkts/sec analyzed
trace file elapsed time: 0:00:00.000305
TCP connection info:
  1: localhost:52132 - localhost:5000 (a2b) 1> 1< (reset)
  2: localhost:52133 - localhost:5000 (c2d) 8> 7< (complete) (reset)
% tcptrace -o2 -l t.pcap
...
  adv wind scale:          4          adv wind scale:          4
  req sack:                Y          req sack:                Y
  sacks sent:              0          sacks sent:              0
  urgent data pkts:        0 pkts     urgent data pkts:        0 pkts
  urgent data bytes:       0 bytes    urgent data bytes:       0 bytes
  mss requested:       16344 bytes    mss requested:       16344 bytes
  max segm size:           6 bytes    max segm size:           0 bytes
  min segm size:           6 bytes    min segm size:           0 bytes
  avg segm size:           5 bytes    avg segm size:           0 bytes
  max win adv:        146976 bytes    max win adv:        146976 bytes
  min win adv:        146976 bytes    min win adv:        146976 bytes
  zero win adv:            0 times    zero win adv:            0 times
  avg win adv:        146976 bytes    avg win adv:        122480 bytes
  initial window:          6 bytes    initial window:          0 bytes
  initial window:          1 pkts     initial window:          0 pkts
  ttl stream length:       6 bytes    ttl stream length:       1 bytes
  missed data:             0 bytes    missed data:             1 bytes
  truncated data:          0 bytes    truncated data:          0 bytes
  truncated packets:       0 pkts     truncated packets:       0 pkts
  data xmit time:      0.000 secs     data xmit time:      0.000 secs
  idletime max:          0.1 ms       idletime max:          0.0 ms
  throughput:          27027 Bps      throughput:              0 Bps
[Diagram: two endpoints, each with SO_RCVBUF and SO_SNDBUF, exchanging segments (SEG) over a link characterized by LATENCY and BANDWIDTH]
2 * LATENCY = RTT
Latency
• Time from one endpoint to another
• Each connection spans multiple links
• latency = sum (lat foreach link)
• RTT = 2 * latency
Bandwidth
• Number of bytes a link can carry per unit of time
• bw = min (bw foreach link)
Bandwidth Delay Product
BDP = RTT * BANDWIDTH
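The three formulas above compose directly. A minimal sketch in Python follows; the per-link latencies and bandwidths are made-up illustrative values, not measurements from this talk, and the function names are hypothetical.

```python
def path_latency(link_latencies_s):
    """One-way latency of a path: the sum of per-link latencies (seconds)."""
    return sum(link_latencies_s)


def path_bandwidth(link_bandwidths_Bps):
    """Bandwidth of a path: the slowest link is the bottleneck (bytes/second)."""
    return min(link_bandwidths_Bps)


def bdp_bytes(link_latencies_s, link_bandwidths_Bps):
    """Bandwidth-delay product: RTT * bandwidth, with RTT = 2 * latency."""
    rtt = 2 * path_latency(link_latencies_s)
    return rtt * path_bandwidth(link_bandwidths_Bps)


# Hypothetical three-link path: Wi-Fi hop, long-haul hop, server-side hop.
latencies = [0.005, 0.040, 0.005]              # seconds, one way
bandwidths = [6_000_000, 1_000_000, 12_000_000]  # bytes per second

# RTT is about 0.1 s and the bottleneck is 1 MB/s, so roughly 100 kB in flight.
print(round(bdp_bytes(latencies, bandwidths)))
```

Under these assumed numbers, a window or socket buffer much smaller than about 100 kB could not keep the path full.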
[Diagram: client and server, each with SO_RCVBUF and SO_SNDBUF; segments (SEG) in flight between them are bounded by the sender window and the receiver window]
TCP byte stream
• stateful
• ordered
• reliable
• managed
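[Diagram: protocol stack IP, TCP, HTTP, TLS, with layers labeled "has state", "no state" and "paired"]
[Diagram: protocol stack IP, TCP, HTTP, TLS; TCP handshake, 1 RTT: SYN -> SYN,ACK -> ACK]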
TLS
"Oh, an SSL certificate warning.
I'll read it carefully and understand the possible implications before proceeding."
-- no User, ever.
TLS
"Oh, an SSL library.
I'll carefully understand its semantics and will not break authentication."
-- unknown developer.
TLS
% openssl s_client -showcerts -connect internet.velcom.by:443
CONNECTED(00000003)
depth=3 Thawte Premium Server CA
verify error:num=19:self signed certificate in certificate chain
verify return:0
Certificate chain
 0 s:/C=BY/ST=Minsk/L=Minsk/O=FE Velcom/CN=internet.velcom.by
   i:/C=US/O=Thawte, Inc./CN=Thawte SSL CA
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
 1 s:/C=US/O=Thawte, Inc./CN=Thawte SSL CA
   i:/C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
 2 s:/C=US/O=thawte, Inc./OU=Certification Services Division/OU=(c) 2006 thawte, Inc. - For authorized use only/CN=thawte Primary Root CA
   i:/C=ZA/ST=Western Cape/L=Cape Town/O=Thawte Consulting cc/OU=Certification Services Division/CN=Thawte Premium Server CA/[email protected]
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
 3 s:/C=ZA/ST=Western Cape/L=Cape Town/O=Thawte Consulting cc/OU=Certification Services Division/CN=Thawte Premium Server CA/[email protected]
   i:/C=ZA/ST=Western Cape/L=Cape Town/O=Thawte Consulting cc/OU=Certification Services Division/CN=Thawte Premium Server CA/[email protected]
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
Server certificate
subject=/C=BY/ST=Minsk/L=Minsk/O=FE Velcom/CN=internet.velcom.by
issuer=/C=US/O=Thawte, Inc./CN=Thawte SSL CA
SSL handshake has read 4736 bytes and written 328 bytes
TLS
% openssl s_client -showcerts -connect ciklum.com:443
CONNECTED(00000003)
depth=0 /C=UA/OU=Domain Control Validated/CN=*.ciklum.net
verify error:num=20:unable to get local issuer certificate
verify return:1
depth=0 /C=UA/OU=Domain Control Validated/CN=*.ciklum.net
verify error:num=27:certificate not trusted
verify return:1
depth=0 /C=UA/OU=Domain Control Validated/CN=*.ciklum.net
verify error:num=21:unable to verify the first certificate
verify return:1
---
Certificate chain
 0 s:/C=UA/OU=Domain Control Validated/CN=*.ciklum.net
   i:/O=AlphaSSL/CN=AlphaSSL CA - G2
...
Server certificate
subject=/C=UA/OU=Domain Control Validated/CN=*.ciklum.net
issuer=/O=AlphaSSL/CN=AlphaSSL CA - G2
SSL handshake has read 1854 bytes and written 328 bytes
[Diagram: protocol stack IP, TCP, HTTP, TLS, with the exchanges of one HTTPS request]
TCP, 1 RTT:
  SYN -> SYN,ACK
TLS, 2 RTTs:
  ACK, ClientHello -> ServerHello, Certificate
  ClientKEX, ChangeCipherSpec -> ChangeCipherSpec, Finished
HTTP, 1 RTT:
  GET -> OK
It takes 4 RTTs to serve an HTTPS request
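To put numbers on the round trips counted above, here is a small Python sketch. The per-layer counts come from the slide; the RTT values passed in are illustrative assumptions, and the estimate ignores server processing and transfer time.

```python
# Round trips per layer, as counted on the slide: TCP 1, TLS 2, HTTP 1.
ROUND_TRIPS = {"TCP": 1, "TLS": 2, "HTTP": 1}


def https_request_time(rtt_s):
    """Lower bound on time to the first response byte of a fresh HTTPS request."""
    return sum(ROUND_TRIPS.values()) * rtt_s


# Hypothetical RTTs: a good Wi-Fi link versus a slow cellular link.
for label, rtt in [("wifi", 0.030), ("cellular", 0.300)]:
    print(label, "%.2f s" % https_request_time(rtt))
```

With a 300 ms RTT, the handshakes alone cost more than a second before any payload arrives, which is why connection setup dominates on high-latency links.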
TCP Reliability
[Diagram: client and server, each with SO_RCVBUF and SO_SNDBUF, connected through an AirPort Express and three routers; segments (SEG) flow within the sender and receiver windows and ACKs flow back]
retransmit on timeout (~200ms)
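The slide gives only the ballpark timeout. As a sketch of how a stack can derive one from measured round trips, the following Python follows the smoothed estimator described in RFC 6298. The 0.2 s floor is an assumption chosen to match the ~200ms figure above (the RFC itself recommends a 1 second floor), clock granularity is ignored, and the class name is hypothetical.

```python
class RtoEstimator:
    """Retransmission timeout estimate from RTT samples (RFC 6298 style)."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variance

    def __init__(self, min_rto=0.2):
        self.srtt = None      # smoothed RTT, seconds
        self.rttvar = None    # RTT variance estimate, seconds
        self.min_rto = min_rto  # assumed floor, see lead-in

    def update(self, rtt_sample):
        """Feed one RTT measurement (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement initializes both estimators.
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            # Variance is updated first, using the previous smoothed RTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt_sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_sample
        return max(self.min_rto, self.srtt + 4 * self.rttvar)


est = RtoEstimator()
for sample in (0.05, 0.5):  # a stable link, then a sudden latency spike
    print("rtt=%.2f rto=%.3f" % (sample, est.update(sample)))
```

A jittery wireless link inflates the variance term, so the timeout grows well beyond the floor after a single slow round trip.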
TCP Congestion Control
[Diagram: client and server, each with SO_RCVBUF and SO_SNDBUF, connected through an AirPort Express and routers, one of them an overloaded router; segments (SEG) queue up along the path and ACKs flow back]
^^^ What congestion control is actually designed for
[Diagram: client and server, each with SO_RCVBUF and SO_SNDBUF, connected through an AirPort Express and three routers; segments (SEG) and ACKs in flight within the sender and receiver windows]
^^^ What actually happens on mobile devices
Crappy Wi-Fi
TCP Artifacts
• Nagle algorithm
  while (1) write(fd, "5", 1);
  (telnet syndrome)
• Delayed ACK
  http://www.stuartcheshire.org/papers/NagleDelayedAck/
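One way to stay clear of the one-byte-write pattern above is to coalesce small payloads in the application before handing them to the kernel. A minimal Python sketch, using a local socket pair as a stand-in for a real TCP connection; `send_coalesced` is a hypothetical helper name.

```python
import socket


def send_coalesced(sock, chunks):
    """Join many tiny payloads and pass them to the kernel in a single call."""
    payload = b"".join(chunks)
    sock.sendall(payload)
    return len(payload)


def recv_exactly(sock, n):
    """Read until n bytes have arrived (a stream socket may return fewer)."""
    data = b""
    while len(data) < n:
        part = sock.recv(n - len(data))
        if not part:
            break
        data += part
    return data


# Local socket pair used only for demonstration.
left, right = socket.socketpair()
sent = send_coalesced(left, [b"5"] * 100)  # instead of 100 separate writes
print(sent, recv_exactly(right, sent)[:5])
left.close()
right.close()
```

The design choice is to batch at the application layer rather than rely on the stack to merge tiny segments.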
TCP Artifacts
• SO_OOBINLINE
• TCP URG
• RFC 6093
API Issues
• Async NSURLConnection
• UIScrollView
• CFRunLoopAddCommonMode
[Diagram: client and server, each with SO_RCVBUF and SO_SNDBUF; segments (SEG) fill the path and the buffers between the sender window and the receiver window]
Congestion Avoidance
TCP Reno
• Additive Increase
• Multiplicative Decrease
• Slow Start
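The three Reno ingredients listed above can be sketched as a toy per-RTT simulation in Python. This is a simplified model under stated assumptions: the window is counted in whole segments, every loss is treated as a halving (no timeout that resets the window to one segment), and the parameters are illustrative.

```python
def reno_cwnd_trace(rounds, loss_rounds, ssthresh=8, initial_cwnd=1):
    """Return the congestion window (in segments) at the start of each RTT."""
    cwnd = initial_cwnd
    trace = []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            # Multiplicative decrease: halve the window on loss.
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: double every RTT until ssthresh is reached.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Additive increase: one more segment per RTT.
            cwnd += 1
    return trace


# One loss in round 6 of a 10-round transfer.
print(reno_cwnd_trace(10, {6}))
```

The printed trace shows the familiar sawtooth: exponential growth, linear probing, then a sharp drop after the loss.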
Android
# cat /proc/sys/net/ipv4/tcp_slow_start_after_idle
1
# cat /proc/sys/net/ipv4/tcp_no_metrics_save
0
# echo 0 > /proc/sys/net/ipv4/tcp_slow_start_after_idle
# echo 1 > /proc/sys/net/ipv4/tcp_no_metrics_save
# find /proc/sys/net/ipv4 | grep cong | xargs -tn1 cat
cat /proc/sys/net/ipv4/tcp_allowed_congestion_control
cubic reno
cat /proc/sys/net/ipv4/tcp_available_congestion_control
cubic reno
cat /proc/sys/net/ipv4/tcp_congestion_control
cubic
# ip route show
default via 192.168.56.1 dev eth1 initcwnd 10 initrwnd 10
Sockets
• setsockopt(2)
  • adjust window size
  • socket buffer sizes
  • TCP_NODELAY (Nagle)
  • etc
• getsockopt(2)
  • monitoring
• respond to socket events with low latency
• do not let the buffer stay full
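As an illustration of the setsockopt(2) and getsockopt(2) items above, a minimal Python sketch using the standard `socket` module. The buffer sizes are arbitrary example values and `tune_socket` is a hypothetical helper; the kernel is free to round or scale the sizes it actually applies, so the values are read back rather than assumed.

```python
import socket


def tune_socket(sock, rcvbuf=None, sndbuf=None, nodelay=True):
    """Apply buffer sizes and TCP_NODELAY, then report what the kernel kept."""
    if rcvbuf is not None:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    if sndbuf is not None:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
    if nodelay:
        # Disables the Nagle algorithm for this socket.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return {
        "SO_RCVBUF": sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        "SO_SNDBUF": sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        "TCP_NODELAY": sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY),
    }


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(tune_socket(s, rcvbuf=65536, sndbuf=65536))
s.close()
```

Reading the options back with getsockopt doubles as the simplest form of the monitoring mentioned on the slide.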
getsockopt(SOL_TCP, TCP_INFO)
ESTAB 0 176 10.1.1.1:22 10.1.1.2:61984 users:(("sshd",18989,3))
  mem:(r0,w1168,f2928,t0) ts sack bic wscale:4,5 rto:280 rtt:56.25/7.5 ato:40 cwnd:8 ssthresh:7 send 1.6Mbps rcv_rtt:50 rcv_space:14480
#include <linux/tcp.h>
iproute2
Speedup
Do not create connections!
for _i in $(seq 10); do ssh -f thailand cat; done
for _i in $(seq 10); do ssh -o 'ControlMaster yes' -f thailand cat; done
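The same idea at the socket level: reuse one connection for many requests instead of paying a handshake each time. The Python sketch below counts how many TCP connections a loopback echo server accepts under each strategy; the server, handler and function names are hypothetical and exist only for this demonstration.

```python
import socket
import socketserver
import threading


class CountingEcho(socketserver.BaseRequestHandler):
    """Echo handler that counts accepted TCP connections."""
    connections = 0
    lock = threading.Lock()

    def handle(self):
        with CountingEcho.lock:
            CountingEcho.connections += 1
        while True:
            data = self.request.recv(1024)
            if not data:
                break
            self.request.sendall(data)


def request_each_time(addr, n):
    """Open a fresh connection (and handshake) for every request."""
    for _ in range(n):
        with socket.create_connection(addr) as s:
            s.sendall(b"ping")
            s.recv(1024)


def request_reusing(addr, n):
    """Send every request over a single long-lived connection."""
    with socket.create_connection(addr) as s:
        for _ in range(n):
            s.sendall(b"ping")
            s.recv(1024)


def count_connections(requester, n=10):
    """Run a requester against a loopback server; return connections accepted."""
    CountingEcho.connections = 0
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), CountingEcho) as srv:
        worker = threading.Thread(target=srv.serve_forever, daemon=True)
        worker.start()
        requester(srv.server_address, n)
        srv.shutdown()
    return CountingEcho.connections


print(count_connections(request_each_time), count_connections(request_reusing))
```

On loopback the difference is invisible, but on a high-latency link each avoided connection saves at least one round trip, more with TLS.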
Responsive UI
• Instagram
• VK
• best UI
• worst reliability
Steroids
• TCP Fast Open
• Linux 3.6
• HAProxy
Steroids
• TCP/NC
• TCP and math (maths)
• http://dspace.mit.edu/openaccess-disseminate/1721.1/58796
Scheduling, Algorithms
• TCP Westwood+ (LFN)
• TCP Veno (Wi-Fi)
• http://www.apan.net/meetings/honolulu2004/materials/engineering/APAN_ppt.pdf
• CONFIG_TCP_CONG_VENO
Steroids
• TLS False Start
• TLS NPN
  • Next Protocol Negotiation
• HTTP Pipelining
• SPDY
Research
• https://github.com/proger/iproute2
  ss -I
• https://github.com/proger/captcp
• tcptrace
• tcpflow
• monitoring
kthxbai
@darkproger
http://kirillov.im