© 2006, National Research Council Canada; © 2006, IBM Corporation

Solving performance issues in OTS-based systems

Erik Putrycz

Software Engineering Group

National Research Council Canada

Marius Slavescu

IBM Rational Automated Software Quality Division

IBM Canada

Outline

• Introduction: performance problems in OTS-based systems

• Iterative process for diagnosing problems

• Applying the iterative process with Eclipse TPTP

• Example

• Conclusions

Outline

• Introduction: performance problems in OTS-based systems
  – Memory leaks
  – Inefficient code
  – Current tools and techniques
  – OTS-based systems context

• Iterative process for diagnosing problems

• Applying the iterative process with Eclipse TPTP

• Example

• Conclusions

Memory Leaks

• All applications need to reserve memory
  – … but sometimes an application doesn’t release memory that is no longer used

• The execution environment or runtime is responsible for managing memory
  – C/C++: manual reservation and release
  – Java, .NET, most other VMs: the runtime is responsible for reserving and releasing memory

Memory Leaks (2)

• Diagnostic
  – The problem is noticeable only once no more physical memory is available
    • Swapping and paging occur

• Solution
  – “Black box” component
    • Get the vendor to fix it, or change the way the product is used
  – Open source component
    • Instrument and locate the source of the problem

Inefficient code

• Inefficient algorithms:
  – unnecessary complexity in the algorithm (it can be replaced by a less complex version);

• Inefficient use of caching:
  – without caching where possible (e.g., database, network or files), heavy resource access can lead to slow execution;

• Expensive interactions with other products:
  – some interactions within the system may be expensive and lead to high resource utilization or long latencies (e.g., remote interactions used inappropriately).
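
The caching point can be illustrated with a short sketch. It assumes a hypothetical `fetch_from_database` function standing in for any expensive resource access, and uses Python's standard `functools.lru_cache` to avoid repeating it; the key name and cache size are arbitrary.

```python
from functools import lru_cache

calls = {"count": 0}


def fetch_from_database(key):
    """Stand-in for an expensive resource access (hypothetical)."""
    calls["count"] += 1
    return key.upper()


@lru_cache(maxsize=128)
def fetch_cached(key):
    """Same lookup, but repeated keys are served from memory."""
    return fetch_from_database(key)


# One thousand identical lookups reach the expensive resource only once.
results = [fetch_cached("customer-42") for _ in range(1000)]
```

Whether caching is safe depends on how often the underlying data changes, so a real integration would also need an invalidation policy.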

Current tools and techniques

• Focused on custom-built software

• If source code is available:
  – test performance
  – locate the performance issue in the code
  – fix the problem and iterate

• If source code is not available:
  – use the performance troubleshooting section of the manual (if available)
  – get a product expert to help identify the cause

OTS-based Systems Context

• Possible sources of the problem
  – integration code
  – one or more components

• Possible actions
  – change the configuration of the affected component(s)
  – notify the vendor of the bug
  – replace a product
  – modify the way the product is used

Outline

• Introduction: performance problems in OTS-based systems

• Iterative process for diagnosing problems
  – Objectives
  – Process
  – Alternatives

• Applying the iterative process with Eclipse TPTP

• Example

• Conclusions

Objectives

• Currently
  – Product experts are in charge of analyzing logs and finding the cause of the problem

• Iterative process
  – Solve a performance problem in a systematic manner

Iterative process: 4 steps

1. Data collection: trace files are collected from the OTS products;

2. Performance problem reproduction: the problem has to be reproduced in order to capture it in the traces;

3. Trace correlation: the collected trace files are correlated to understand the causality of events in the system;

4. Trace analysis: the correlated traces may need further analysis to focus on the relevant information.

1. Data Collection

• Start with the least intrusive method across the different OTS components

• Usually tracing end-user response time

• Subsequent steps:
  – Collect more focused data

• Possible methods:
  – Adding instrumentation to the integration code
  – Enabling logging on other products
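
As an illustration of adding instrumentation to the integration code, the sketch below wraps a call in a decorator that records a timestamp and a duration. The names `traced`, `trace` and `call_ots_product` are hypothetical, and a real setup would more likely write to a log file than to an in-memory list.

```python
import functools
import time

trace = []  # in-memory trace; a real setup would write to a log file


def traced(func):
    """Record the start time and duration of each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()              # wall-clock time, used for correlation
        t0 = time.perf_counter()         # monotonic clock, used for the duration
        try:
            return func(*args, **kwargs)
        finally:
            trace.append({
                "event": func.__name__,
                "start": start,
                "duration": time.perf_counter() - t0,
            })
    return wrapper


@traced
def call_ots_product(query):
    """Hypothetical integration-code call into an OTS product."""
    time.sleep(0.01)                     # placeholder for the real call
    return f"result for {query}"


call_ots_product("q1")
```

Recording both a start time and a duration matters for the correlation step, since it allows events from other logs to be matched against the interval of the call.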

2. Performance problem reproduction

• Test system:
  – Load testing tool
    • generate load
  – Automated testing framework
    • scenario to reproduce the problem

• Production system:
  – wait until the problem reappears
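
A very small load generator can be sketched as follows. It is not a substitute for a load testing tool; the `scenario` function is a hypothetical placeholder for a real request against the test system, and the request count and concurrency are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scenario(request_id):
    """Hypothetical scenario; a real one would issue a request to the test system."""
    t0 = time.perf_counter()
    time.sleep(0.001)                      # placeholder for the real request
    return request_id, time.perf_counter() - t0


def generate_load(n_requests, concurrency):
    """Run the scenario n_requests times with a fixed number of workers."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(scenario, range(n_requests)))


latencies = generate_load(n_requests=50, concurrency=5)
slowest = max(latencies, key=lambda item: item[1])
```

Keeping the per-request latencies makes it possible to check whether the problem was actually reproduced before moving on to correlation.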

3. Trace correlation

• Finding the causality of events in the different traces

• Reconstruct the causality
  – Trace timestamps
  – Logs with only a time:
    • Correlate by similar time
  – Logs with time + duration:
    • Find events that happened during another one
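
For logs that carry both a time and a duration, the containment rule above can be sketched as an interval check. The `Event` type, the sample events and the `tolerance` parameter (meant to absorb clock differences between hosts) are assumptions made for this example, not part of the original slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str
    name: str
    start: float      # seconds since a common reference point
    duration: float   # seconds

    @property
    def end(self):
        return self.start + self.duration


def events_during(parent, candidates, tolerance=0.0):
    """Return the candidates whose interval lies inside the parent's interval."""
    return [
        c for c in candidates
        if c.start >= parent.start - tolerance and c.end <= parent.end + tolerance
    ]


request = Event("web", "GET /report", start=100.0, duration=3.0)
queries = [
    Event("db", "SELECT a", start=100.2, duration=2.5),   # inside the request
    Event("db", "SELECT b", start=104.0, duration=0.1),   # after the request ended
]
caused = events_during(request, queries)
```

Containment only suggests causality; with concurrent requests, several parents may contain the same event, and additional information is needed to disambiguate.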

4. Trace analysis

• Statistical
  – Locate the most significant events in the traces
  – Group the most repeated ones
  – Calculate where time is spent

• Visual
  – Most common method
  – Graph or UML interaction diagrams

• Analysis of profiler traces
  – [CASCON04]: traces are collected of all the interactions between integration code and OTS products
  – Find which interactions are the cause of the performance problem
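
The statistical approach can be sketched in a few lines: group events by name, count repetitions, and sum the time spent. The trace contents below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical correlated trace: (event name, duration in seconds).
trace = [
    ("SELECT orders", 1.8),
    ("render page", 0.2),
    ("SELECT orders", 2.1),
    ("SELECT customer", 0.1),
    ("SELECT orders", 1.9),
]


def summarize(trace):
    """Group events by name and return (name, count, total time), slowest first."""
    totals = defaultdict(lambda: [0, 0.0])
    for name, duration in trace:
        totals[name][0] += 1
        totals[name][1] += duration
    rows = [(name, count, total) for name, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


summary = summarize(trace)
```

Sorting by total time rather than by single-event duration highlights events that are individually cheap but repeated often.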

Alternative solutions to the iterative process

• Online analysis
  – Enable diagnostics on the production system without going through a cycle

• JMX
  – Java Management Extensions: interface to query the state of Java virtual machines

• JRockit (BEA JVM)
  – Integrated diagnosis of memory leaks
  – Integrated profiler

• Existing solutions are limited to specific types of performance problems

• Stability concerns on production systems

Outline

• Introduction: performance problems in OTS-based systems

• Iterative process for diagnosing problems

• Applying the iterative process with Eclipse TPTP
  – TPTP Platform Project
  – Tracing and Profiling Tools Project
  – Monitoring Tools Project
  – OTS-based Systems Support

• Example

• Conclusions

TPTP Platform Project: The Foundation

1) Reference UIs and perspectives
  – Basic metaphors for interacting with target systems and resources; includes both remote and local systems.
  – Extensible UI frameworks and common navigators, viewers, editors, and wizards.

2) Standard data models and assets repository
  – Information model implementations for test, trace, log and statistical data.
  – Framework for running rule-based queries against data model instances, and some simple queries.

3) Common data collection and execution framework
  – Execution environment that supports deployment, launch, and control of test cases and applications.
  – Data collection and control frameworks and agents.
  – Communication service used by the distributed data collection and control frameworks.

[Figure labels: Eclipse Reference UIs; Execution Framework; XMI Assets]

TPTP Tracing and Profiling Tools Project

• Frameworks for tracing and profiling tools by extending the TPTP platform.

• Profiling tools for both single-system and distributed Java applications.

• A JVMPI monitoring agent that collects trace and profile data.
  – Collects and analyzes heap and stack information
  – JVMTI-based monitoring agent.

• A generic tool kit for probe insertion - can instrument byte code of Java applications.

TPTP Monitoring Tools Project

• Frameworks for building monitoring tools by extending the TPTP platform.

• Includes tools for monitoring application servers (JBoss, JOnAS, and WebSphere) and system performance.

• Collects, analyzes, aggregates, and visualizes data captured in the log and statistical models.

• Supports Common Base Event (CBE), provides services for mapping of custom log formats to CBE, and regular expression based log filtering.

• Correlates data across multiple instances of log and statistical models, and also across instances of trace and test history models
  – Enables symptom and pattern analysis.
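
The idea of mapping custom log formats to a common event format and filtering with regular expressions can be sketched outside TPTP as follows. This is not TPTP's API and not the actual Common Base Event schema: the log layout, the record fields and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical custom log format: "<date> <time> <LEVEL> <message>".
LINE = re.compile(r"^(?P<time>\S+ \S+) (?P<level>[A-Z]+) (?P<msg>.*)$")


def to_common_record(line, source):
    """Map one custom log line to a simplified common record, or None."""
    match = LINE.match(line)
    if match is None:
        return None
    return {
        "creationTime": match["time"],
        "severity": match["level"],
        "msg": match["msg"],
        "source": source,
    }


def filter_records(records, pattern):
    """Keep the records whose message matches the regular expression."""
    regex = re.compile(pattern)
    return [r for r in records if regex.search(r["msg"])]


lines = [
    "2006-05-01 10:00:00 INFO request served in 12 ms",
    "2006-05-01 10:00:05 WARN request served in 4200 ms",
    "not a log line",
]
records = [r for r in (to_common_record(l, "appserver") for l in lines) if r]
slow = filter_records(records, r"served in \d{4,} ms")
```

Once every product's log is mapped to the same record shape, correlation and analysis code no longer needs to know which product an event came from.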

OTS-based systems support

[Diagram]

OTS Component A
  – Documentation: configuration guide, tuning guide
  – Logging facilities, logging files

OTS Component B
  – Documentation: configuration guide, tuning guide
  – Logging facilities, logging files

TPTP Tools
  – Analysis, correlation, visualization

Outline

• Introduction: performance problems in OTS-based systems

• Iterative process for diagnosing problems

• Applying the iterative process with Eclipse TPTP

• Example
  – Description
  – Applying the iterative process

• Conclusions

Example

[Diagram: two hosts (Host A, Host B). Components: a Tomcat JSP server with its access log; a MySQL database with its error log and slow query log. Flows: HTTP queries and web pages; database queries and query results.]

Applying the iterative process

• Iteration 1:
  – Data collection:
    • Tomcat: default logging doesn’t provide execution time -> customized to include processing time
    • MySQL: slow query log enabled -> all slow queries recorded
    • Change the default value of 2 seconds
  – Performance problem reproduction
  – Trace correlation
    • Use Tomcat’s and MySQL’s timestamps and execution times
    • Custom log extractors + parsers
  – Trace analysis
    • Visual analysis with UML2 interaction diagrams -> problem located in the database
    • Solution: modify integration code
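
A custom log extractor of the kind mentioned above might look like the following sketch. The access-log layout is an assumption (common log format with the processing time in milliseconds appended as the last field); the real layout depends on how the access log was customized, and the sample line and threshold are invented.

```python
import re
from datetime import datetime

# Assumed layout: common log format plus processing time in milliseconds
# as the last field. The real layout depends on the configured log pattern.
ACCESS = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) (?P<millis>\d+)$'
)


def parse_access_line(line):
    """Return (timestamp, request, duration in seconds), or None if no match."""
    match = ACCESS.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return timestamp, match["request"], int(match["millis"]) / 1000.0


sample = ('192.0.2.1 - - [01/May/2006:10:00:05 -0400] '
          '"GET /report.jsp HTTP/1.1" 200 5120 4200')
timestamp, request, seconds = parse_access_line(sample)
is_slow = seconds > 2.0   # illustrative threshold, not a recommended value
```

The parsed timestamp and duration are what the correlation step needs in order to match each web request against the queries recorded in the database's slow query log.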

Applying the iterative process (2)

• Iteration 2:
  – Data collection
    • ProbeKit: instrument the integration code
  – Performance problem reproduction: same as iteration 1
  – Data correlation: same algorithm as iteration 1
  – Trace analysis: locate the source of the problem inside the integration code

Applying the iterative process (3)

Outline

• Introduction: taxonomy of performance problems in OTS-based systems

• Iterative process for diagnosing problems

• Applying the iterative process with Eclipse TPTP

• Example

• Conclusions

Conclusion

• Diversity of OTS components
  – new challenge for solving performance problems

• Existing tools and methods are targeted at custom software
  – More powerful methods are required

• Solution:
  – Solve performance problems in a systematic manner with an iterative process
  – At each iteration, focus on a specific part

• Availability of generic tools for performance analysis
  – Automate and customize the analysis

• Example and TPTP available at www.eclipse.org/tptp