56
Break My Site practical stress testing and tuning of Cocoon applications photo credit: Môsieur J a talk for CocoonGT 2007 by Jeremy Quinn 1 This is designed as a beginner’s talk. I am the beginner.

Break My Site

  • Upload
    mrquinn

  • View
    4.241

  • Download
    1

Embed Size (px)

DESCRIPTION

Practical stress testing and tuning of Cocoon applications. It is best to know before you go live, whether your new site will stand up to the traffic you expect it to receive. There is only so much you can tell about the speed and capacity of a web project from browsing it by hand. This talk offers practical advice, garnered from real-life experience about how to measure and tune the performance of web applications, using free tools like JMeter, the Open Source load-tester from Apache. The talk will cover, planning your tests, setting up the tools, advise on what data to capture, how to interpret it and some of the possibilities for tuning your project. It will help you answer questions like : How many users will it take to break my site ? How can I handle more users, faster ? How reliable will my site be ? How can I measure the effect of my changes during development ? How do I compare different implementations of the same functionality ? How do I determine the right size for my server infrastructure ? The photo-credit links are missing from the Flash version, here they are :

Citation preview

Page 1: Break My Site

Break My Sitepractical stress testing and tuning of Cocoon applications

photo credit: Môsieur J

a talk for CocoonGT 2007 by Jeremy Quinn

1

This is designed as a beginner’s talk. I am the beginner.

Page 2: Break My Site

scenario

Approximately 600 hits per data-point

Users

0

50

100

150

200

250

300

350

400

1 10 20 30 40 50 60 70 80 90 100

Original

Page Load (secs)Throughput (pages/min)Stability %

2

My starting point for optimisationYou may be able to see how useless one person clicking in a browser is at testing, the server never gets loaded.Disregard the absolute speed values, this was done on my laptop.

Page 3: Break My Site

scenario

Approximately 600 hits per data-point

Users

0

50

100

150

200

250

300

350

400

1 10 20 30 40 50 60 70 80 90 100

Original

Page Load (secs)Throughput (pages/min)Stability %Average

seconds per page

OUCH !!!!

Throughputpages per

minuteEEEK !!!

3

The original server configuration being taken to only 30 users, with disastrous effect.Throughput drops to single figures, page load time climbs sky high, stability reduces by nearly half.From the logs I could see there were not enough threads or memory to cope.This is badly broken.

Page 4: Break My Site

1 10 20 30 40 50 60 70 80 90 100

Memory & Thread Optimisation

scenario

Approximately 600 hits per data-pointAll graphs at the same scale

accumulated optimisations -–— what’s possible

Users

0

50

100

150

200

250

300

350

400

1 10 20 30 40 50 60 70 80 90 100

Original

Page Load (secs)Throughput (pages/min)Stability %

4

Memory and Thread optimisation has made a tremendous difference.Now I can easily test up to 100 users, before I could not go past 30.Page Load time, Throughput etc. is now linear with the increase in users.At some point this configuration will also ‘break’, you need to know where this point isso that you can plan for it not happening.

Page 5: Break My Site

1 10 20 30 40 50 60 70 80 90 100

Memory & Thread Optimisation

scenario

Approximately 600 hits per data-pointAll graphs at the same scale

accumulated optimisations -–— what’s possible

Users

0

50

100

150

200

250

300

350

400

1 10 20 30 40 50 60 70 80 90 100

Original

Page Load (secs)Throughput (pages/min)Stability %

1 10 20 30 40 50 60 70 80 90 100

Read Optimisation

5

The server is doing slightly less work due to this optimisationThis shows as the gradients flattening out a bit more.

Page 6: Break My Site

1 10 20 30 40 50 60 70 80 90 100

Memory & Thread Optimisation

scenario

Approximately 600 hits per data-pointAll graphs at the same scale

accumulated optimisations -–— what’s possible

Users

0

50

100

150

200

250

300

350

400

1 10 20 30 40 50 60 70 80 90 100

Original

Page Load (secs)Throughput (pages/min)Stability %

1 10 20 30 40 50 60 70 80 90 100

Read Optimisation

1 10 20 30 40 50 60 70 80 90 100

List Optimisation200

150

100

50

0

6

Another optimisation, further reducing the work done on the server.Flatter curves, better performance.

This gives you an idea of the scale of improvement you may be able to make.In my case it is quite dramatic!

Page 7: Break My Site

purposewhat do I hope to achieve?

photo credit: Brian Uhreen

7

Load testing can be used to solve different sorts of problems.You don’t want your site broken by popularity.Find out how much traffic will make a site break, so you can plan to cope.

Page 8: Break My Site

comparing changes

• code optimisations

• different implementations

• caching strategies

• different datasources

• different server architectures

8

Load testing can be used during development to test changes.

Page 9: Break My Site

test content

• test the content of pages

• test the behaviour of webapps

9

JMeter also has exhaustive tools for crawling and testing content within web pages.It can also send parameters as POST requests etc. to help you test webapps.

Page 10: Break My Site

gauge your server

• get a specification

• how many of what power of machine?

• choosing and tuning a load-balancer

10

Before you start to work out what servers you need, you have to know what level of traffic it is intended to cope with.This has to come from either the implementation you are replacing or from marketing.Test the machines you plan to use, so you know what capacity they can handle.Round-Robins with machines of different capacities may work really badly.

Page 11: Break My Site

planningwhat am I going to do?

photo credit: Paul Goyette

11

Plan well and hopefully you will get useful data

Page 12: Break My Site

what can I change?

• implementations

• server configuration

• memory

• threads

• system architecture

12

What kind of changes do you want to compare?

Page 13: Break My Site

what will happen?

• build a mental model

• predict what you expect to happen

• learn to read the data

• test your ideas against reality

13

As your tests run, relate events in the server logs to how the graphs look

Page 14: Break My Site

order of change

• only make one change at a time

• only make one change at a time

• only make one change at a time

• only make one change at a time

14

If there are several changes you can makeIt is important to plan what order you do them in

Page 15: Break My Site

order of change

• have you got that?

15

If you make more than one change at a timeyou don’t know which change had what effect which makes it more difficult to build a mental model of what is happening

Page 16: Break My Site

order of change

• in the most meaningful order

• which depends on your purpose

16

I tested optimisations before playing with threads and memoryFor each change, I ran the tests again.Optimisations may require different ideal thread and memory settings.

Page 17: Break My Site

what tools?on MacOSX I used :

• JMeter

• Terminal

• Console

• Activity Monitor

• Grab

17

The tools I used to capture and watch real-time performance and issues

Page 18: Break My Site

what tools?on MacOSX I used :

• JMeter

• Terminal

• Console

• Activity Monitor

• Grab

18

I found it was fine to run JMeter on the server for running low level tests.It is better to run JMeter on another machine though.For very high throughput testing, you can aggregate results from many JMeter clients.

Page 19: Break My Site

what tools?on MacOSX I used :

• JMeter

• Terminal

• Console

• Activity Monitor

• Grab

19

I can watch Cocoon in its Terminal window.See errors as they happen

Page 20: Break My Site

what tools?on MacOSX I used :

• JMeter

• Terminal

• Console

• Activity Monitor

• Grab

20

I used the Console to filter and watch Jetty’s logs for the repository.

Page 21: Break My Site

what tools?on MacOSX I used :

• JMeter

• Terminal

• Console

• Activity Monitor

• Grab

21

In Activity Monitor I can track threads, memory and cpu usage.Here we can see the two java processesThe top one is Cocoon, the bottom one the repositoryHere you can see that through bottlenecks, the system is being under-utilised.IMHO both cores should be working fully on a well tuned system

Page 22: Break My Site

setupwhat do I do to get ready?

photo credit: Fabio Venni

22

getting started

Page 23: Break My Site

make a test

keep it simple stupid

23

Here is the simplest way to use JMeter

Page 24: Break My Site

start JMeter

24

Starting JMeter from the command-line

Page 25: Break My Site

start a plan

25

Start a new plan, give it a name

Page 26: Break My Site

threads

26

Set up the number of threads you want to test with

Page 27: Break My Site

request

27

Add an HTTP Request to the Thread GroupSet up the server’s IP, port and path

Page 28: Break My Site

recording

28

A graph for plotting resultsIf you are running many tests, make sure you give everything suitable names and comments so you know what you were testing.The test output can be written to file for further analysis.

Page 29: Break My Site

check everything

• content of datafeeds

• behaviour of pipelines

• outcome of transformations

29

Check every stage, so you know everything is working properly and set up as you expected.Do this before you start testing.Otherwise you might not be testing what you think you are.

Page 30: Break My Site

caching

• repository?

• datasource?

• pipeline?

30

Make sure you know the state of all the caching,so it matches what you want to test.

Page 31: Break My Site

warm-up

• hit your test page(s) with a browser

• last chance for a visual check

• get Cocoon up to speed

31

Warm-up Cocoon.Hit the pages you want to test in a browser to make sure they work.

Page 32: Break My Site

be local

• clients & server on the same subnet

• avoid possible network issues

32

Be on the server's local subnet.

Page 33: Break My Site

enough clients?

• creating many threads

• creates its own overhead

33

Have enough clients to perform the level of testing you need.

Page 34: Break My Site

quit processes

• everything that is unnecessary

• periodic processes skew your results

34

Letting something like your email client fire off periodic checks for new mail, will skew your results.

Page 35: Break My Site

record your data

• server logs

• JMeter graphs

• JMeter data file (XML or CSV ?)

• screen grabs of specific events

• test it !!!!!

35

Prepare properly, to make sure all of your data will be recorded.Grab relevant sections of logs.Grab graphs after a run.Choose what data to write to a file.

Page 36: Break My Site

running teststime for a cup of tea?

photo credit: thespeak

36

No, you have lots to do.

Page 37: Break My Site

run-time data

• clear log windows on each run

• keep it organised

• use a file or folder naming convention

37

Record the real-time data you need.I clear the log windows for each run so after the run is finished, I can easily scroll back looking for exceptions etc.Use naming conventions for multiple runs.

Page 38: Break My Site

watch it run

• part of the learning process

• see where the problems happen

• see what you are fixing

• make the link between cause and effect

• take snapshots

38

Watch the tools as the tests run.Learn to read the graphs.

Page 39: Break My Site

my screen

39

I can follow errors and timings change in Cocoon’s output.Errors and query times changing for the repo, in Jetty’s logs.Thread, memory and cpu usage in Activity Monitor.

Page 40: Break My Site

interpreting the results

what does it all mean?

photo credit: woneffe

40

The key for me, to begin to learn how to interpret the results, was to watch the tests live, relating what appeared on the graph to what was happening (errors, slowdowns, garbage collection etc.) on the server.

Page 41: Break My Site

reading the graph

• load speed

• throughput

• deviation

• hits

41

NB. Unless you are testing on real hardware, these speeds have no meaning in isolationbut are useful for comparison

Page 42: Break My Site

reading the graph

• load speed

• throughput

• deviation

• hits

42

Page load speeds in milliseconds.Lower is betterOn a stable system, it should go flat

Page 43: Break My Site

reading the graph

• load speed

• throughput

• deviation

• hits

43

Throughput in pages per second.Higher is betterOn a stable system, it should go flat

Page 44: Break My Site

reading the graph

• load speed

• throughput

• deviation

• hits

44

The deviation (variability) of the load speed in millisecondsLower is betterOn a stable system, it should go flat

Page 45: Break My Site

reading the graph

• load speed

• throughput

• deviation

• hits

45

The black dots are individual sample times

Page 46: Break My Site

reading the graph

• load speed

• throughput

• deviation

• hits

46

Sometimes JMeter draws data off the graphI was using a beta, this may have been fixed in the latest release

Page 47: Break My Site

chaotic start

• test for long enough

• JMeter ramps threads

• JVMs warm up

• Servlet adds threads

47

Your tests need to run for long enough for you to get meaningful results.I ran a minimum of 600 hits per test

Page 48: Break My Site

what’s this sawtooth?

• hysteresis?

• randomise requests?

48

You can sometimes see a sawtooth in the throughput.What causes this?Is it important?This is what I think is happening ....

Maybe you could remove the effect by randomising requests.

Page 49: Break My Site

what’s this sawtooth?

• hysteresis?

• randomise requests?

processing

49

Throughput slows as jobs build up and threads process.

Page 50: Break My Site

what’s this sawtooth?

• hysteresis?

• randomise requests?

sending

50

Responses comes in spurts as jobs complete.JMeter issues new requests

Page 51: Break My Site

what’s this sawtooth?

• hysteresis?

• randomise requests?

51

The change in gradient can tell you the change in the increase in throughput (?)

Page 52: Break My Site

shi‘appen

52

Learn to relate problems on the server to its effect on the graph.Big spikes indicate that you have problemsYou may see the effects of:exceptionsgarbage collection

Page 53: Break My Site

good signs

• speed flattening

• throughput flattening

• deviation dropping

• no exceptions ; )

53

When the values on the graph begin to flatten out,it shows that the system has become stable at that load.

Page 54: Break My Site

• authorise

• fill in data

• upload files

• wrap tests in logic

• etc. etc.

more tests

54

I used the simplest possible configuration in JMeter.I was only hitting one url.It does a lot more.

Page 55: Break My Site

• JDBC

• LDAP

• FTP

• AJP

• JMS

• POP, IMAP

• etc. etc.

more protocols

55

You can apply tests to many different parts of your infrastructure.

Page 56: Break My Site

• design to cope

• test early, test often

• plan well

• pay attention

• capture everything

• maintain a model

conclusions

56