Well-defined coverage metrics for the glass box test

Rainer Schmidberger

ISTE (Institute for Software Technology), University of Stuttgart

24.04.2014


Agenda:

■ Background and motivation
■ Overview of today's glass box test
■ A closer look at the underlying models and metrics
■ Requirements for a GBT model
■ My approach: A new and precise model for the GBT
    ■ The Reduced Program Representation (RPR)
    ■ RPR execution semantics using Petri nets
    ■ RPR-based metric definition
■ The tool CodeCover
    ■ Overview
    ■ Test case selective GBT
    ■ GBT tool support for test case development
■ Conclusion



Glass box test (1)

■ The glass box test (GBT), also known as white box test or structural test, shows which parts of the program under test have, or have not, been executed. This degree of execution is called coverage.

■ GBT results can be used as a test completion criterion or as input for developing test cases.

Legend of the coverage highlighting:
green: executed by at least one of the test cases
red: not executed
yellow: partly executed



Glass box test (2)

■ Tools are required, and many GBT tools are available for almost any programming language.

■ Coverage Report:

■ Empirical studies clearly indicate that higher GBT coverage correlates with lower post-release defect density.

■ Standards for safety-critical software require very high or even complete coverage (e.g. IEC 61508, DO-178B).

So at first glance, the GBT seems to be a well-established and mature testing technique …



A closer look at the underlying models …

■ Typically, the control flow graph (CFG) is used to build an abstraction model of the original program code.

■ Most popular GBT metrics are defined with respect to the CFG.

But the transformation of real programs into the CFG is ambiguous:

Program code:

void foo() {
    if (a) {
        stmt1;
        while (b) {
            stmt2;
        }
    }
}

[Figure: CFG of foo() with the nodes Entry, if(a), stmt1, while(b), stmt2, and Exit]

Are the entry and exit nodes part of the CFG?

Does the if statement have a distinct end node?

Does the while statement have a distinct end node?



… and metrics

Even more severe is the missing representation for

■ exception handling,
■ conditional expressions,
■ and the short-circuit operations of Boolean expressions.

GBT tools show different coverage results for the same execution of a given 45-statement reference Java program:

Tool                                      Statement coverage                    Branch/Block coverage
CodeCover 1.0.2.2                         62.8 %                                Branch: 50.0 %, Block: 52.2 %
Clover 3.1.0                              58.5 %
Emma v2.1.5320                            Line: 62.0 %                          Block: 54.0 %
EclEmma 2.2.1                             Instruction: 56.7 %, Line: 63.6 %     Block: 50.0 %
eCobertura 0.9.8                          64.3 %                                Branch: 50.0 %
CodePro 7.1.0                             Instruction: 57.7 %, Line: 58.5 %     Block: 60.6 %
Rational Application Developer 9.0.0      Line: 67.0 %

A (new) reference model for the GBT is required!



Requirements for a new model for the GBT

■ The model forms the basis on which the popular control-flow-based metrics as well as the logic and conditional-expression-based metrics can be defined.

■ The model supports exception handling.

■ There is an easy and precise transformation rule to transform real programs into the model.

■ The model does not depend on any particular programming language. An algorithm implemented in different programming languages should have the same model representation.

■ The model specifies how to place the probes in the program under test that count the execution of the relevant GBT items.



A new and precise model for the GBT

1. Definition of a primitive language, RPR (Reduced Program Representation), which abstracts the GBT-relevant aspects of the real programming languages.

   Statement = PrimitiveStatement | IfStatement …
   IfStatement = "if" "(" BoolExpression ")" "then" StatementBlock "else" StatementBlock.
   BoolExpression = Condition | CompoundExpression.
   …

2. Definition of the execution semantics using Petri nets. The nets will also include the execution counters that “measure” the execution of a particular item.

   [Figure: model net with entry area, execution area, and exit area. Places: sIn, sCIn, sE, sCN, sCA, sN, sA. Transitions: tIn, tENormal, tEAbrupt, tNormal, tAbrupt.]

   $$\mathit{exe}(A, t) \Leftrightarrow |M(\mathit{sCN})| > 0$$

3. On this basis: precise definition of the popular GBT metrics.

   $$A \in \mathit{exeStmts}(P, T) \Leftrightarrow \exists t \in T : \mathit{exe}(A, t)$$

   $$\mathit{stmtCov}(P, T) = \frac{|\mathit{exeStmts}(P, T)|}{|\mathit{stmts}(P)|}$$


1. The model language RPR – Control flow

RPR grammar (control flow):

Program = StatementBlock.
StatementBlock = "{" StatementList "}".
StatementList = Statement ( StatementList | empty ).
Statement = ( PrimitiveStatement | TerminateStatement | WhileStatement | IfStatement | SwitchStatement | TryStatement ) SubExpressions.
PrimitiveStatement = "stmt".
TerminateStatement = "throw" | "return" | "break" | "continue".
IfStatement = "if" "(" BoolExpression ")" "then" StatementBlock "else" StatementBlock.
WhileStatement = "while" "(" BoolExpression ")" StatementBlock.
…

Example:

Java:

System.out.print("GBT");
x1 = ((b*(-1) + Math.sqrt(D))/(2*a));
if(n > 20) { n++; return; }
while (x > 0)
    field[x] = x--;

RPR:

S1 stmt []
S2 stmt []
S3 if( ... ) then B2 {
    S4 stmt []
    S5 return []
} else B3 { } []
S6 while( ... ) B4 {
    S7 stmt []
} []

The GBT items are attributed with a unique identifier (ID). This ID is necessary to manage coverage information, but is not part of the original code.
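The RPR example above can be held in memory as a small tree of blocks and statements, from which the set stmts(P) falls out by a simple traversal. The following is a minimal sketch only: the class names `RprSketch`, `Block`, and `Stmt` are hypothetical and not CodeCover's data model, sub-expressions are omitted, and the ID "B1" for the outermost block is an assumption (the slide only shows B2 to B4).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-model of the RPR control-flow productions (not CodeCover's classes).
final class RprSketch {

    /** A statement block with its ID and the statements it contains. */
    static final class Block {
        final String id;
        final List<Stmt> stmts = new ArrayList<>();

        Block(String id, Stmt... stmts) {
            this.id = id;
            for (Stmt s : stmts) {
                this.stmts.add(s);
            }
        }
    }

    /** A statement with its ID, its RPR keyword, and any nested blocks. */
    static final class Stmt {
        final String id;
        final String keyword;
        final List<Block> blocks = new ArrayList<>();

        Stmt(String id, String keyword, Block... blocks) {
            this.id = id;
            this.keyword = keyword;
            for (Block b : blocks) {
                this.blocks.add(b);
            }
        }
    }

    /** Collects the IDs of all statements in document order, i.e. stmts(P). */
    static List<String> stmtIds(Block program) {
        List<String> ids = new ArrayList<>();
        collect(program, ids);
        return ids;
    }

    private static void collect(Block block, List<String> ids) {
        for (Stmt s : block.stmts) {
            ids.add(s.id);
            for (Block nested : s.blocks) {
                collect(nested, ids);
            }
        }
    }

    /** The example program from the slide; the outer block ID "B1" is assumed. */
    static Block example() {
        return new Block("B1",
            new Stmt("S1", "stmt"),
            new Stmt("S2", "stmt"),
            new Stmt("S3", "if",
                new Block("B2", new Stmt("S4", "stmt"), new Stmt("S5", "return")),
                new Block("B3")),
            new Stmt("S6", "while",
                new Block("B4", new Stmt("S7", "stmt"))));
    }

    public static void main(String[] args) {
        System.out.println(stmtIds(example())); // [S1, S2, S3, S4, S5, S6, S7]
    }
}
```

Because the IDs live in the tree and not in the source text, the same traversal would work for any frontend that produces such a tree.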


1. The model language RPR – Expressions

RPR grammar (expressions):

Expression = BoolExpression | ConditionalExpression.
BoolExpression = ( Condition | CompoundExpression ).
Condition = "expr" SubExpressions.
CompoundExpression = ( "andThen" | "orElse" | "and" | "or" ) "(" BoolExpression "," BoolExpression ")".
ConditionalExpression = BoolExpression "?" SubExpressions ":" SubExpressions.
SubExpressions = "[" ExpressionList "]".
ExpressionList = Expression ";" ExpressionList | empty.

Example (each GBT item carries its ID):

Java                 RPR
A && B               E1 andThen( E2 expr [], E3 expr [] )
f(A | B)             E1 expr [ E2 or( E3 expr [], E4 expr []); ]
A = B ? 7 : 42;      S1 stmt [ E1 E2 expr [] ? [] : []; ]

RPR defines all GBT-relevant aspects of the real programming languages.
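The difference between "andThen" and "or" matters for coverage because a short-circuit operator may leave its right operand unevaluated. The sketch below illustrates this with hand-placed counters for the first two example rows. The class and method names are hypothetical, and the counter map is only a stand-in for real instrumentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of execution counters on Boolean expressions.
final class ShortCircuitSketch {

    /** Increments the execution counter of the GBT item with the given ID. */
    static void hit(Map<String, Integer> counters, String id) {
        counters.merge(id, 1, Integer::sum);
    }

    /** A condition ("expr"): counted whenever it is evaluated. */
    static boolean expr(Map<String, Integer> counters, String id, boolean value) {
        hit(counters, id);
        return value;
    }

    /** Java "A && B", modelled as E1 andThen( E2 expr [], E3 expr [] ). */
    static boolean andThen(Map<String, Integer> counters, boolean a, boolean b) {
        hit(counters, "E1");
        // E3 is evaluated only if E2 is true (short circuit).
        return expr(counters, "E2", a) && expr(counters, "E3", b);
    }

    /** Java "A | B", modelled as E2 or( E3 expr [], E4 expr [] ). */
    static boolean or(Map<String, Integer> counters, boolean a, boolean b) {
        hit(counters, "E2");
        // Both operands are always evaluated (no short circuit).
        return expr(counters, "E3", a) | expr(counters, "E4", b);
    }

    public static void main(String[] args) {
        Map<String, Integer> c1 = new LinkedHashMap<>();
        andThen(c1, false, true);
        System.out.println(c1); // {E1=1, E2=1}

        Map<String, Integer> c2 = new LinkedHashMap<>();
        or(c2, true, false);
        System.out.println(c2); // {E2=1, E3=1, E4=1}
    }
}
```

In the first call, E3 never receives a count although the enclosing expression E1 was executed, which is exactly the kind of information a CFG on statement level does not carry.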



2. GBT model nets – primitive items

■ Primitive GBT items, such as a primitive statement or a primitive Boolean expression, are described as a place-bordered Petri (sub)net called a GBT model net.

■ The model nets have exactly one distinct input place and one or more distinct output places for normal and abrupt completion.

■ Places with an empty post-set “count” the GBT item's execution.

[Figure: model net of a PrimitiveStatement with entry area, execution area, and exit area. Input place: sIn. Output places: sN and sA. Further places: sCIn, sE, sCN, sCA. Transitions: tIn, tENormal, tEAbrupt, tNormal, tAbrupt. Two smaller drawings show the abstracted forms of a Statement: a super place sS and a sub net with the border places sIn, sN, and sA.]

The initial marking is exactly one token in the input place. The only final markings are those with exactly one token in one of the output places.

Because all model nets are place-bordered and token-preserving, they can be abstracted into sub nets or super places.
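The token game of a primitive item can be mimicked in a few lines. This is a deliberately simplified sketch: the exact wiring of the transitions in the figure is not reproduced, and it is assumed that each pass puts one token on the counter place sCIn and then one token on either sCN or sCA, depending on how the item completes. The class name `ModelNetSketch` is hypothetical.

```java
// Simplified, hypothetical simulation of a primitive GBT model net.
final class ModelNetSketch {

    /** The two output places: normal (sN) and abrupt (sA) completion. */
    enum Output { S_N, S_A }

    // Counter places: they have an empty post-set, so tokens only accumulate.
    int sCIn;
    int sCN;
    int sCA;

    /** Plays one token from the input place sIn to exactly one output place. */
    Output run(boolean completesAbruptly) {
        sCIn++;                 // entry is counted (assumed effect of tIn)
        if (completesAbruptly) {
            sCA++;              // abrupt completion is counted
            return Output.S_A;
        }
        sCN++;                  // normal completion is counted
        return Output.S_N;
    }

    /** exe(A, t) as defined on the slides: at least one token on sCN. */
    boolean executed() {
        return sCN > 0;
    }

    public static void main(String[] args) {
        ModelNetSketch net = new ModelNetSketch();
        System.out.println(net.run(true) + " executed=" + net.executed());  // S_A executed=false
        System.out.println(net.run(false) + " executed=" + net.executed()); // S_N executed=true
    }
}
```

Under this reading, an item that only ever completes abruptly is entered (sCIn > 0) but does not count as executed.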



2. GBT model nets – complex items

■ Complex GBT items, such as if statements or compound Boolean expressions, contain other GBT items as part of their own structure. They are described as a composition of model nets.

■ RPR provides the composition rules.

■ This embedding technique automatically provides a dominance relationship between the GBT items.

[Figure: model net of an IfStatement with entry area, execution area, and exit area. The execution area embeds the sub net of a Boolean expression (C, with the places sIn, sT, sF, and sA) and the sub nets of the statement blocks (B1, the then-block, and B2, the else-block, each with the places sIn, sN, and sA). Further places: sIn, sCIn, sCN, sCA, sN, sA. Transitions: tIn, tNormal, tAbrupt.]

$A = \mathit{ddom}(B) \Leftrightarrow A \ \mathit{ddom}\ B \Leftrightarrow$ the model net of B is (directly) embedded in the model net of A



3. GBT metric definition

■ Statement coverage:

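$P$ is an RPR program, $\mathit{stmts}(P)$ is the set of all GBT items corresponding to the RPR Statement production, and $T$ is a set of test cases.

$$A \in \mathit{exeStmts}(P, T) \Leftrightarrow A \in \mathit{stmts}(P) \land \exists t \in T : \mathit{exe}(A, t)$$

$$\mathit{stmtCov}(P, T) = \frac{|\mathit{exeStmts}(P, T)|}{|\mathit{stmts}(P)|}$$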
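The statement coverage definition translates almost literally into set operations. The sketch below assumes that the execution information is available as a map from test case to the IDs of the items it executed. That layout, the class name `StmtCoverageSketch`, and the sample data (two invented test cases t1 and t2 for a program with the statements S1 to S7) are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of stmtCov(P, T); not CodeCover's implementation.
final class StmtCoverageSketch {

    /** exeStmts(P, T): statements of P executed by at least one test case in T. */
    static Set<String> exeStmts(Set<String> stmts,
                                Map<String, Set<String>> executedBy,
                                Set<String> tests) {
        Set<String> executed = new TreeSet<>();
        for (String t : tests) {
            for (String a : executedBy.getOrDefault(t, Collections.<String>emptySet())) {
                if (stmts.contains(a)) {   // only items from the Statement production
                    executed.add(a);
                }
            }
        }
        return executed;
    }

    /** stmtCov(P, T) = |exeStmts(P, T)| / |stmts(P)|; NaN if P has no statements. */
    static double stmtCov(Set<String> stmts,
                          Map<String, Set<String>> executedBy,
                          Set<String> tests) {
        return (double) exeStmts(stmts, executedBy, tests).size() / stmts.size();
    }

    static Set<String> setOf(String... ids) {
        return new TreeSet<>(Arrays.asList(ids));
    }

    /** Invented sample data: t1 skips the then-block, t2 returns from it. */
    static Map<String, Set<String>> sampleLog() {
        Map<String, Set<String>> log = new HashMap<>();
        log.put("t1", setOf("S1", "S2", "S3", "S6", "S7"));
        log.put("t2", setOf("S1", "S2", "S3", "S4", "S5"));
        return log;
    }

    public static void main(String[] args) {
        Set<String> stmts = setOf("S1", "S2", "S3", "S4", "S5", "S6", "S7");
        System.out.println(stmtCov(stmts, sampleLog(), setOf("t1")));       // 5 of 7 statements
        System.out.println(stmtCov(stmts, sampleLog(), setOf("t1", "t2"))); // 7 of 7 statements
    }
}
```

Passing a subset of the test cases as `tests` yields the coverage of just that selection, which is the basis of the test case selective evaluation.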


Additional GBT metrics:

■ Branch coverage: the degree of executed branches
■ Block coverage: the degree of executed statement blocks
■ Decision coverage: like branch coverage, but “forking” expressions are also taken into account
■ Loop coverage, term coverage, and MC/DC



The tool CodeCover

www.CodeCover.org

[Screenshot: the CodeCover perspective with the coverage report and the coverage visualization]

Coverage visualization and coverage report are based on the selected test cases only.


CodeCover key features:

■ Has frontends for Java, C and COBOL
■ Is a reference implementation for the described GBT metrics
■ Is integrated into Eclipse and provides Ant interfaces
■ Supports the test case selective GBT
■ Is licensed under the Eclipse Public License (EPL)



Test case selective GBT

■ The test case selective GBT provides evaluations not only for the entire test suite but also for a single test case.

■ It can be used, for example, for selective regression testing or test case development.

[Figure: test suite, JMX interface (Init, Store), PUT with CodeCover enhancements, GBT log file]

double getTotal(Customer customer, double b) {
    // normal customer: no discount
    double discount = 0.0;
    if (customer.getRevenue() > 2000) {
        // all major customers get 10% discount
        discount = 0.1;
        if (customer.isCommercial()) {
            // commercial customers get an additional 5%
            discount += 0.05;
        }
    }
    double total = b * (1.0 - discount);
    return total;
}

Test case ID:      4711
Name:              Print invoice
Precondition:      Customer is selected
Action, Inputs:    1. …
                   2. …
Expected Result:   … 201.40€ …

Counter values for test case 4711:

S1   3
S2   3
B1   1
…

"Justus"

justus.tigris.org
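To make the idea of per-test-case counters concrete, the following sketch instruments getTotal by hand and keeps one counter table per test case. Everything beyond getTotal itself is an assumption: the probe placement, the ID assignment (S1 for the declaration, S2 for the outer if, B1 and B2 for the then-blocks), the `startTestCase` hook, and the `Customer` stand-in are hypothetical and do not show how CodeCover actually instruments code or talks to the program under test.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical hand-instrumented version of getTotal with per-test-case counters.
final class SelectiveGbtSketch {

    /** Minimal stand-in for the Customer type used on the slide. */
    static final class Customer {
        private final double revenue;
        private final boolean commercial;

        Customer(double revenue, boolean commercial) {
            this.revenue = revenue;
            this.commercial = commercial;
        }

        double getRevenue() { return revenue; }
        boolean isCommercial() { return commercial; }
    }

    /** Assumed log layout: test case ID -> (GBT item ID -> execution count). */
    final Map<String, Map<String, Integer>> log = new LinkedHashMap<>();
    private Map<String, Integer> current;

    /** Must be called before the program under test runs for a test case. */
    void startTestCase(String testCaseId) {
        current = log.computeIfAbsent(testCaseId, id -> new LinkedHashMap<>());
    }

    private void probe(String itemId) {
        current.merge(itemId, 1, Integer::sum);
    }

    double getTotal(Customer customer, double b) {
        probe("S1");                       // declaration of discount
        double discount = 0.0;
        probe("S2");                       // outer if statement
        if (customer.getRevenue() > 2000) {
            probe("B1");                   // then-block of the outer if
            discount = 0.1;
            if (customer.isCommercial()) {
                probe("B2");               // then-block of the inner if
                discount += 0.05;
            }
        }
        return b * (1.0 - discount);
    }

    public static void main(String[] args) {
        SelectiveGbtSketch gbt = new SelectiveGbtSketch();
        gbt.startTestCase("4711");
        gbt.getTotal(new Customer(500.0, false), 100.0);
        gbt.getTotal(new Customer(1000.0, false), 100.0);
        gbt.getTotal(new Customer(2500.0, false), 100.0);
        System.out.println(gbt.log); // {4711={S1=3, S2=3, B1=1}}
    }
}
```

With three invented calls, of which one passes the revenue threshold, the sketch yields the same shape of counter values as listed for test case 4711 (S1 3, S2 3, B1 1). B2 has no entry, so it is a not-executed item for this test case.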



Developing new test cases – basic concept

■ Tool-based support for the tester: developing new input data for test cases that increase coverage.

double getTotal(Customer customer, double b) {
    // normal customer: no discount
    double discount = 0.0;
    if (customer.getRevenue() > 2000) {
        // all major customers get 10% discount
        discount = 0.1;
        if (customer.isCommercial()) {
            // commercial customers get an additional 5%
            discount += 0.05;
        }
    }
    double total = b * (1.0 - discount);
    return total;
}

[Figure callouts: one part of the code is marked "The test target", another part "The dominator of the test target".]

How can I find input values in order to execute this part of the program?

Idea: As a starting point, use the test cases that execute the dominator of the test target!



Developing new test cases - definitions

$P$ is an RPR program, $A \in P$ is a GBT item (the test target), $T$ is a set of test cases for $P$, and $t \in T$ is a test case.

$$t \in \mathit{testcases}(A) \Leftrightarrow t \in T \land \mathit{exe}(A, t)$$

Select all GBT items $A$ with

$$|\mathit{testcases}(A)| = 0 \land |\mathit{testcases}(\mathit{ddom}(A))| > 0$$

(all GBT items that were not executed but whose direct dominator was executed) and put them into a list with the following columns:

GBT item A | ddom(A) | testcases(ddom(A))

Each entry in this list is called a test case recommendation and provides systematic support for the tester in developing new test cases that increase coverage.
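The selection rule can be sketched as a filter over all GBT items. The code below assumes that ddom is given as a map from each item to its direct dominator and that testcases(A) is given as a map from item ID to test case IDs. The class names and the sample data (test case 4711 reaching the inner if statement S4 but not its then-block B2) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the recommendation list; not CodeCover's implementation.
final class RecommendationSketch {

    /** One list entry: GBT item A, ddom(A), testcases(ddom(A)). */
    static final class Recommendation {
        final String item;
        final String dominator;
        final Set<String> testCases;

        Recommendation(String item, String dominator, Set<String> testCases) {
            this.item = item;
            this.dominator = dominator;
            this.testCases = testCases;
        }
    }

    /** Selects all items with |testcases(A)| = 0 and |testcases(ddom(A))| > 0. */
    static List<Recommendation> recommend(Map<String, String> ddom,
                                          Map<String, Set<String>> testcases) {
        List<Recommendation> result = new ArrayList<>();
        for (Map.Entry<String, String> e : ddom.entrySet()) {
            Set<String> own = testcases.getOrDefault(e.getKey(), Collections.<String>emptySet());
            Set<String> ofDom = testcases.getOrDefault(e.getValue(), Collections.<String>emptySet());
            if (own.isEmpty() && !ofDom.isEmpty()) {
                result.add(new Recommendation(e.getKey(), e.getValue(), ofDom));
            }
        }
        return result;
    }

    /** Invented dominance data: blocks are embedded in statements and vice versa. */
    static Map<String, String> sampleDdom() {
        Map<String, String> ddom = new LinkedHashMap<>();
        ddom.put("B1", "S2");
        ddom.put("S3", "B1");
        ddom.put("S4", "B1");
        ddom.put("B2", "S4");
        ddom.put("S5", "B2");
        return ddom;
    }

    /** Invented execution data: test case 4711 executed S2, B1, S3, and S4. */
    static Map<String, Set<String>> sampleTestcases() {
        Map<String, Set<String>> testcases = new HashMap<>();
        for (String executed : Arrays.asList("S2", "B1", "S3", "S4")) {
            testcases.put(executed, Collections.singleton("4711"));
        }
        return testcases;
    }

    public static void main(String[] args) {
        for (Recommendation r : recommend(sampleDdom(), sampleTestcases())) {
            System.out.println(r.item + " | " + r.dominator + " | " + r.testCases);
        }
        // prints: B2 | S4 | [4711]
    }
}
```

S5 is not recommended in this data because its direct dominator B2 was not executed either. Once a new test case reaches B2, S5 would move into the list if it were still uncovered.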



CodeCover's "RecommendationsView"

www.CodeCover.org



Conclusion

■ To date, most GBT metrics are defined either intuitively or based on the CFG. Both kinds of definition have severe shortcomings.

■ In the approach described, the popular GBT metrics are precisely defined by using the following model:
    ■ A notation (RPR) that is applicable to a large class of programming languages. In RPR, control flow, expressions, and exception handling are well integrated.
    ■ Model nets that describe the execution semantics of the GBT items in a mathematically sound way. The model net's counters provide a precise specification for the code instrumentation. The model net also provides a dominance relationship between the GBT items.

■ The tool CodeCover
    ■ is a reference implementation of the presented metrics, and
    ■ offers a tool-based technique that supports the tester in developing new test cases.
