servload: Generating Representative Workloads for Web Server Benchmarking

Jörg Zinke, Jan Habenschuß and Bettina Schnor

Potsdam University, Institute of Computer Science
Operating Systems and Distributed Systems

SPECT@SummerSim 2012, 9 July 2012



Motivation

Server Load Balancing

Bettina Schnor (Potsdam University) servload Frame 2 of 26


OpenBSD PF Development

OpenBSD Packet Filter (PF) for Server Load Balancing

implementation of Weighted Round Robin (WRR) and Least States (LS)

part of OpenBSD 5.0, released on 1 November 2011

match in on bge0 proto tcp from any to any port 80 \
    rdr-to {www1, www2, www3} least-states

[PhD candidate Jörg Zinke, master student Fabian Hahn]


Credit Based Server Load Balancing

using InfiniBand Network RDMA feature


Table of Contents

1 Related Work

2 Features

3 Workload Generation

4 Evaluation

5 Implementation

6 Summary and Future Work


Related Work

Existing benchmarks:

httperf: supports loading requests from a session file, but it has a limit of 1000 sessions and cannot replay them exactly. These problems can be solved ... just a matter of implementation and tuning effort ;-)

Tsung: the drawback is that the result metrics are aggregated into an unsuitable mean value over 10-second sampling intervals. Interesting metrics, like the number of requests which exceeded a threshold response time, are not available.

SPECweb2009: supports neither load-balancing clusters nor replay

...

Workload generation:

Lots of theoretical work done starting in the 90s.

no tools available (?)


Features

1 web server benchmarking ⇒ replaying workloads

2 generate higher loads to determine the capability limits of the web server and/or verify Service Level Agreements (SLAs) (scalability tests) ⇒ workload generation

Real Workload
A real workload is defined as a sequence of requests which was received by a real-world production web server.

Representative Workload
A representative workload is defined as a (generated) workload which has the same characteristics as a given real workload.


Workload Generation

given: web server log trace

idea: add representative sessions

needs no knowledge about the probability of successor pages or other a priori knowledge

servload supports different methods:

1 multiply

2 peak

3 score: The score method identifies sessions and rates them.


Combined Log Format

127.0.0.1 - sven [22/Oct/2011:16:35:46 -0200] "GET /test.png HTTP/1.0" 200 232 "http://example.com/index.html" "Mozilla/5.0 (Windows NT 6.0;)"

remote host, e.g. IP address of the client
RFC 1413 identity check of the client (rarely enabled)
remote user (rarely enabled)
timestamp when the server finished processing the request
request
HTTP response status
size of the response in bytes
referrer, corresponds to the predecessor page
user agent

Identifying User Sessions: the following fields are considered in this prioritized order: remote user, remote host and user agent, or remote host only.
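The field layout and the prioritized session key can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expression, the helper names `parse_line` and `session_key`, and the treatment of `-` as "field not set" are hypothetical and are not taken from the servload sources.

```python
import re
from datetime import datetime

# Regular expression for one Combined Log Format line (hypothetical helper,
# not taken from servload itself).
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Split one log line into its nine fields, or return None if malformed."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


def session_key(entry):
    """Pick the session identifier in the prioritized order from the slide.

    Assumption: a field holding "-" (or an empty string) counts as not set.
    """
    if entry["user"] != "-":
        return ("user", entry["user"])
    if entry["agent"] not in ("", "-"):
        return ("host+agent", entry["host"], entry["agent"])
    return ("host", entry["host"])


line = ('127.0.0.1 - sven [22/Oct/2011:16:35:46 -0200] '
        '"GET /test.png HTTP/1.0" 200 232 '
        '"http://example.com/index.html" "Mozilla/5.0 (Windows NT 6.0;)"')
entry = parse_line(line)
print(session_key(entry))  # ('user', 'sven')
```

With the remote user missing, the same request would fall back to the pair of remote host and user agent, and only to the remote host if the user agent is missing as well.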


Metrics for Identifying User Behavior:

number of requests in a session

session length in seconds

think time (in seconds) within session

body sizes of requests within session
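A minimal sketch of how these four metrics could be computed for one session is given here. The input layout (a sorted list of timestamp and body size pairs) and the function name `session_metrics` are assumptions for illustration. Reducing think times and body sizes to their per-session median follows the row labels "Median Think-Time" and "Median Bytes" in the evaluation tables; whether servload computes them exactly this way is not stated in the slides.

```python
from statistics import median


def session_metrics(requests):
    """Compute the four metrics for one session.

    `requests` is a list of (timestamp_in_seconds, body_size_in_bytes)
    tuples sorted by timestamp; this input layout is an assumption.
    """
    times = [t for t, _ in requests]
    sizes = [s for _, s in requests]
    # Think time: gap between consecutive requests of the same session.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "requests": len(requests),
        "length": times[-1] - times[0],
        "think_time": median(gaps) if gaps else 0,
        "bytes": median(sizes),
    }


example = [(0, 1200), (2, 300), (5, 4000), (11, 800)]
print(session_metrics(example))
```

A session with a single request has no think time; the sketch reports 0 in that case, which is again a choice made for the example.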


The score method identifies sessions and rates them:

Each session metric is compared to the median of the same metric calculated from all sessions.

The nearer the current session metric value is to the overall median, the higher the rating of the session.

Higher rated sessions are added to the given trace with randomly chosen start times in the destination time frame.
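The three steps can be sketched as follows. The slides do not give the rating formula, so the one used here (relative deviation from the median, summed over all metrics and mapped to a score between 0 and 1) is an assumption, as are the names `score_sessions` and `pick_sessions` and the choice to keep each added session fully inside the time frame.

```python
import random
from statistics import median


def score_sessions(sessions):
    """Rate each session by closeness to the overall median of every metric.

    `sessions` is a list of dicts that share the same numeric metric keys.
    Returns one score per session; higher means closer to the medians.
    """
    keys = sessions[0].keys()
    medians = {k: median(s[k] for s in sessions) for k in keys}
    scores = []
    for s in sessions:
        # Assumed distance: relative deviation from the median, summed over metrics.
        distance = sum(abs(s[k] - medians[k]) / (medians[k] or 1) for k in keys)
        scores.append(1.0 / (1.0 + distance))
    return scores


def pick_sessions(sessions, count, frame_seconds, rng=random):
    """Choose the `count` best-rated sessions and give each a random start time."""
    scores = score_sessions(sessions)
    ranked = sorted(range(len(sessions)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in ranked[:count]:
        length = sessions[i]["length"]
        # Assumption: the whole session should fit into the destination time frame.
        start = rng.uniform(0, max(0, frame_seconds - length))
        chosen.append((start, sessions[i]))
    return chosen


sessions = [
    {"requests": 11, "length": 90, "think_time": 3, "bytes": 1275},
    {"requests": 12, "length": 95, "think_time": 3, "bytes": 1300},
    {"requests": 400, "length": 86000, "think_time": 900, "bytes": 90000},
]
added = pick_sessions(sessions, count=2, frame_seconds=1800, rng=random.Random(1))
```

In this toy input the third session is an outlier in every metric, so it receives the lowest score and is not among the two sessions that are added.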


Evaluation: score method

Internet Service Provider (ISP) testbed

web client running servload located in Europe/Berlin

a web server cluster located in South Korea

30 min web server log from ISP with 5684 sessions

Problem: no dedicated network

score method: added 2972 sessions


Metric                     Original   Modified

Requests                     102081     153118
Sessions                       5684       8656

Number of Requests
  Median                         11         13
  Average                     17.96      17.69
  Standard deviation          21.57      18.88

Median Think-Time (s)
  Median                          3        2.5
  Average                       514        321
  Standard deviation           3431       2718

Length (s)
  Median                         90         96
  Average                      2578       1776
  Standard deviation          11536       9500

Median Bytes
  Median                       1275       1319
  Average                      2205       2009
  Standard deviation           3277       2740

Standard deviations and averages always decrease, as expected (outliers get a bad score).

Medians are comparable.


Evaluation: Replay Capabilities of servload

Metric                     Original     Replay

Requests                      10024      10024
Sessions                        118        118
Length (s)                     1800       1809

Requests per second
  Median                          5          5
  Average                      5.57       5.54
  Standard deviation           4.37       2.62

Median Think-Time (s)
  Median                          0          1
  Average                        39         39
  Standard deviation            172        172

Length (s)
  Median                      166.5        168
  Average                    414.85     416.86
  Standard deviation         526.36     526.06

Median Bytes
  Minimum                       287        288
  Maximum                     12394      12395
  Median                       1245       1314
  Average                      2024       2037
  Standard deviation           1921       1893


servload Implementation

Back to the central approach and efficient UNIX system calls:

First approach:

combination of Lua and C

Experiences with a Wikipedia dump of one of the 10 proxy caches of the Wikimedia Foundation, consisting of 25.6 billion HTTP requests

ANSI C code (tested on Linux and OpenBSD)

servload binary parses the given logfile, applies the score method (optional) and does the replay

based on poll() with a single process and no threads
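To illustrate the single-process, poll()-based style, here is a minimal sketch in Python, whose `select.poll` wraps the same UNIX system call. It is not servload code: the function name `poll_roundtrip` is hypothetical, and local socket pairs stand in for connections to a web server so that the example runs without a network.

```python
import select
import socket


def poll_roundtrip(messages, timeout_ms=1000):
    """Collect one reply per connection with a single poll() loop, no threads."""
    poller = select.poll()
    pending = {}  # file descriptor -> (client socket, peer socket)
    for msg in messages:
        client, peer = socket.socketpair()
        client.setblocking(False)
        peer.sendall(msg)  # stand-in for a server response arriving
        poller.register(client.fileno(), select.POLLIN)
        pending[client.fileno()] = (client, peer)
    replies = []
    while pending:
        events = poller.poll(timeout_ms)
        if not events:
            break  # nothing became readable in time: treat as timeout
        for fd, mask in events:
            client, peer = pending.pop(fd)
            if mask & select.POLLIN:
                replies.append(client.recv(4096))
            poller.unregister(fd)
            client.close()
            peer.close()
    # Close any connections that were left over after a timeout.
    for client, peer in pending.values():
        client.close()
        peer.close()
    return sorted(replies)


print(poll_roundtrip([b"reply-1", b"reply-2", b"reply-3"]))
```

One process watches all open connections at once and handles whichever becomes readable, which is the property the slide refers to; `select.poll` is available on UNIX-like systems such as Linux and OpenBSD.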


Evaluation: Scalability of servload

servload was also extended to support UDP traffic, for example to evaluate DNS servers (Master Thesis of Sebastian Menski)

servload on a single node, Intel Xeon E5520, 2.27 GHz with 12 GB RAM

3 heterogeneous (AMD Opteron, Intel Pentium and Intel Xeon) BIND backend servers

LVS as Load Balancer with WRR

UDP Receive Queue increased to 24 MB

5 minute original log

generating higher workload using the multiply method with factors 400, 800 and 1600

→ millions of requests


Factor      Requests   Sessions   avg Req/s   max Req/s
     1        22 594         33          75         204
   400     9 037 600     13 200      30 125      81 600
   800    18 075 200     26 400      60 250     163 200
  1600    36 150 400     52 800     120 501     326 400
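The rows of this table follow from the factor-1 row by simple arithmetic, which the following sketch reproduces. It assumes that the multiply method replicates every session of the original trace `factor` times within the same 5 minute (300 s) window; the slides do not describe the method in more detail, and the function name `multiply_row` is hypothetical.

```python
def multiply_row(base_requests, base_sessions, base_max_rate, factor, seconds):
    """Scale the base trace by `factor`, assuming every session is replicated."""
    requests = base_requests * factor
    return {
        "factor": factor,
        "requests": requests,
        "sessions": base_sessions * factor,
        "avg_rate": requests // seconds,  # whole requests per second
        "max_rate": base_max_rate * factor,
    }


# Base values from the factor-1 row; 300 s is the 5 minute original log.
rows = [multiply_row(22594, 33, 204, f, 300) for f in (1, 400, 800, 1600)]
for row in rows:
    print(row)
```

For factor 1600 this gives 36 150 400 requests, 52 800 sessions, an average of 120 501 requests per second and a peak of 326 400 requests per second, matching the last row.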


servload Output parameters

[Figure: three plots for WRR, each over the factors 400, 800 and 1600 (9 037 600, 18 075 200 and 36 150 400 requests): number of timeouts (axis from 0 to 3000), median first response time in milliseconds (axis from 0 to 4.5) and median connect time in milliseconds (axis from 0 to 0.7).]


Summary and Future Work

replay benchmark in only ~2000 lines of ANSI C code

support for HTTP and DNS services

also works with server load balancers

representative workload generation: score method adds complete user sessions

scales with millions of requests

current work: support for DNS Security Extensions (DNSSEC)

→ Open Source under BSD license, available from: www.salbnet.org
