Software Testing and Reliability Robustness Testing Aditya P. Mathur Purdue University May 19-23, 2003 @ Guidant Corporation Minneapolis/St Paul, MN Graduate Assistants: Ramkumar Natarajan Baskar Sridharan Last update: April 17, 2003


Page 1

Software Testing and Reliability: Robustness Testing

Aditya P. Mathur, Purdue University, May 19-23, 2003

@ Guidant Corporation, Minneapolis/St Paul, MN

Graduate Assistants: Ramkumar Natarajan, Baskar Sridharan

Last update: April 17, 2003

Page 2

Software Testing and Reliability Aditya P. Mathur 2002 2

Learning objectives

What is robustness testing?

How to perform robustness testing? (The Ballista test methodology.)

Results from the Ballista test methodology.

Page 3

References

The Ballista Software Robustness Testing Service, John P. DeVale, Philip J. Koopman, and David J. Guttendorf, Testing Computer Software, June 1999. [http://www.ices.cmu.edu/ballista]

Robustness Testing of the Microsoft Win32 API, Charles P. Shelton, Philip J. Koopman, and Kobey DeVale, Proceedings of the International Conference on Dependable Systems and Networks (DSN 2000), June 25-28, 2000, IEEE Press.

Page 4

Robustness: Review

Robustness testing

Robustness is the degree to which a software component functions correctly in the presence of exceptional inputs or stressful environmental conditions.

Clues come from requirements. The goal is to test a program under scenarios not stipulated in the requirements.

Page 5

Failures due to Lack of Robustness

Apollo 11: Lunar Module experienced three computer crashes and reboots during powered lunar descent due to exceptional sensor and equipment settings.

Ariane 5 rocket: Flight had to be aborted shortly after liftoff due to an unhandled exceptional error condition.

Others: We observe program crashes almost daily! Often these are due to erroneous handling, or non-handling, of exceptional conditions.

Page 6

Programming error or requirements error?

Often, program crashes or hangs occur due to the incorrect handling of an exceptional condition.

The input that triggers the mishandled exceptional condition might arise due to an error in the requirements, in the design, or in the code. For the user, however, it is simply a defect in the application that leads to a failure.

The impact of an application crash could be as mild as inconvenience to the user or as extreme as a catastrophe.

Page 7

Why lack of robustness?

Specifications are often incomplete. This may lead to an implementation that does not handle cases left unspecified.

Resource constraints often leave code segments untested for extreme or unexpected values.

Example: atoi(NULL) ---> segmentation fault?

Why would anyone invoke atoi() with a NULL parameter?

What forms of black-box and white-box testing are needed to detect hidden sources of exceptional conditions?

Page 8

Robustness, functionality, and operational profiles

Most development organizations plan for functional testing. Often, various types of black-box testing techniques are used.

Justification of functional testing is relatively straightforward by invoking the “unhappy customer” scenario.

But functional testing is unlikely to exercise exceptional conditions.

Tests based on well-derived operational profiles are also unlikely to test for exceptional conditions for the obvious reason that the operational profile merely prioritizes functionality.

Page 9

Off-the-shelf components and robustness

Often an application is composed by acquiring one or more components from outside vendors and integrating these with code developed in-house.

The source code and related documents for an off-the-shelf component might not be available, thus making code-based testing impossible.

Hence the need for a methodology for robustness testing that does not require access to the source code of the components under test.

Page 10

The Ballista Approach to Robustness Testing


Page 11

The Ballista Testing Methodology

Based on combinational generation of tests consisting of valid and invalid parameter values.

For each test case, a module under test (MuT) is called once to check whether it is robust to that input.

Parameter values are drawn from a pool of valid and invalid values for a set of data types.

The functionality of the MuT is not a concern.

Of concern is whether or not the MuT crashes or hangs on the test case.

Page 12

Test Cases from Ballista

Each test case has the form {MuT, param_value1, param_value2, …, param_valueN} and corresponds to the procedure call MuT(param_value1, param_value2, …, param_valueN).

Page 13

Test Cases from Ballista: Example

int trap(double a, double b, int N)

Test values for parameter a (double): ZERO, ONE, NEGONE, TWO, PI, PIBY2, TWOPI, E, DBLMAX, DBLMIN, SMALLNOTZERO, NEGSMALLNOTZERO

Test values for parameter b (double): ZERO, ONE, NEGONE, TWO, PI, PIBY2, TWOPI, E, DBLMAX, DBLMIN, SMALLNOTZERO, NEGSMALLNOTZERO

Test values for parameter N (int): MAXINT, MININT, ZERO, ONE, NEGONE, 2, 4, 8, 16, 32, 64, 1K, 64K

All combinations are generated; one such test case is trap(ONE, DBLMAX, 64K).

Page 14

Scalability of Ballista Tests

Scalability: Growth in the number of test cases is linear, or sub-linear, in the number of modules tested.

Scalability achieved by…

constraining specifications to “don’t crash” and “don’t hang,”

basing the test cases on the exceptional values of data types, and

reusing test cases amongst modules that have interfaces with matching parameter types.

Example: the 233 functions and system calls in the POSIX API with real-time extensions required defining only 20 data types.

Page 15

Organization of Test Values

Each data type is treated as a module and fits into a hierarchical, object-oriented class structure. Thus, all values (test cases) of a parent type are inherited by the child.

Generic Pointer: NULL, DELETED, 1K, MAXSIZE, INVALID, PAGESIZE

Generic String: BIGSTRING, STRINGLEN1, ALLASCII, NONPRINTABLE

Date String: 12/1/1899, 1/1/1900, 1/1/2047, 8/31/1992

Page 16

Generation of Test Values

Each test value is associated with up to three code segments.

The first code segment creates an instance of a data type with specific properties.

The optional second code segment modifies the effects of the constructor, such as deleting a file while retaining its file handle!

The optional third code segment deletes or frees system resources that might be reserved by the constructors, such as deleting a file or freeing system memory.

Page 17

Phantom Parameters

How should one handle modules that have no parameters but read input from file(s) and/or the system state?

Set up a dummy module that establishes the system state and/or files to be used by the MuT, and then invoke the MuT.

Ballista allows the specification of "phantom parameters" during parameter set-up. Ballista invokes constructor/destructor pairs for these parameters but does not actually pass them to the MuT.

For example, for a file input, a phantom parameter is defined to create a file. If this file is accessed via, say, a central registry, then the phantom parameter can appropriately register the file object.

Page 18

Failure Types

Catastrophic failure corrupts the state of the OS and generally leads to a restart of the machine.

Restart failure causes a process to "hang," requiring a kill signal for its termination after an application-dependent time interval.

Abort failure causes abnormal termination of the process such as a “core dump.”

Hindering failure is characterized by the MuT returning an error code that incorrectly describes the exception that led to the error. [Might require manual analysis.]

Silent failure causes return of a value with no indication of an error when an exceptional value is input, such as returning a value for log (0). [Might require manual analysis.]

Page 19

Test Results [Sample]

For 233 function calls across 15 different POSIX-compliant operating systems, normalized failure rates were found to range from 9.99% to 22.69%.

Code that produces catastrophic failure on Win 95, 98, and CE:

GetThreadContext(GetCurrentThread(),NULL);

Overall robustness failure rate = number of failed tests / total number of tests

Windows 2000: varies from 7.4% to 48.9% (no catastrophic failures)

Windows NT: varies from 6.2% to 51% (no catastrophic failures)

Windows CE: varies from 6.7% to 33.4% (excluding catastrophic failures)


Page 20

Ballista Server [architecture diagram]

Ballista Web Server: Interface Specification Capture, Test Selection, Test Reporting, Hardening Wrapper Creation, Result Pattern Discovery.

User machine: Ballista Test Client and Module under Test. The server and the user machine communicate via WWW and RPC.

Page 21

Summary

What is robustness?

A robustness testing methodology

Sample robustness failure rates of a few operating systems