Bioinformatics software testing and quality assurance Joshua W. K. Ho Head, Bioinformatics and Systems Medicine Laboratory Victor Chang Cardiac Research Institute Winter School in Mathematical and Computational Biology, UQ 2016-07-06


How do I know the output of my program is correct?


How do you know an R package is implemented correctly?

•  Is running the ‘example’ code alone sufficient?
•  If not, how many test cases do we need?
•  How do we generate additional test cases?
•  How do we verify the correctness of the outputs?


Nature 2010 (News Feature)

Nature 2013 (News In Focus)

Science 2013 (Policy Forum)

Nature 2015 (World View)


Clinical application: how can we be sure that our variant calling pipeline is implemented correctly?

Challenge: very hard to check the correctness of the output, especially false negatives


Genome Medicine, 2013

Five commonly employed variant-calling pipelines. SNV concordance ~57.4%, and indel concordance ~26.8% => Need caution interpreting results in genomic medicine setting

Genome Medicine, 2014

Compared results from ANNOVAR and VEP (using ENSEMBL transcripts): matching annotation in only 65% of loss-of-function variants, and 87% of all exonic variants


Why is testing challenging in bioinformatics?

•  Lack of rigorous review (compared to journal peer review)
•  We do not spend enough time and energy on testing
•  Misusing external software / components
•  Often caused by not checking the limitations and scope of the software function
•  Over-reliance on ‘validation testing’: merely checking whether the results “make sense”
•  Nonetheless, it is often hard to check exactly the correctness of the output of complex algorithms
•  Often relying on only a very small number of (simple) test cases
•  A software fault may only show up for some inputs, so a failure may not be observed unless we search the input space widely
•  We need to use diverse and realistic test cases


Main objectives

•  Understand the importance and challenges of software testing in bioinformatics

•  Understand basic concepts and techniques in software testing

•  Understand how we can implement QA in bioinformatics, especially in translational genomic applications


Software testing concepts and techniques


Some definitions in software testing

Error: A defect in the human thought process, e.g. misspecification of the range of a variable.

Fault: Concrete manifestation of an error within the software, e.g. a bug, use of wrong parameters, or an incorrect software dependency.

Failure: Departure of the operational software system behaviour from user expectation.

Test case: Input and execution conditions developed for verifying the compliance of the program with a specific requirement.

Oracle: A mechanism to check the correctness of a test result from any given test case.


Case study: testing a kNN classifier

Mary is a bioinformatician in a research lab. Her project studies whether promoter DNA sequences in yeast can distinguish among three groups of genes: highly expressed, highly repressed, and dynamically expressed. She wants to build a supervised classifier of promoter DNA sequences for her research. She downloaded a new R package that was described in a recent publication. The algorithm behind the main function of this package, kNN, is described in the paper and the R ‘help’ page. The ‘example’ code can be executed successfully and seems to produce reasonable-looking outputs, even though she is not 100% sure of the expected outputs.


What does this R package do?

Training sequences with training class labels:

Class1, TCGATCGATCGGGGATTAGC
Class1, ACGATCGGGGACGAGCTACCCATG
Class1, CATCGATGGGCTAGCT
Class2, ATCGTGGGCTAGCTAGCCCCCC
Class2, GATGCTAAACGGGGATCGATCA
Class3, ACGTGGATCGAAAAAGCTAGC
Class3, GTAGATCGAAATCGATGCATCGAGC
Class3, ATGCTAGGGCTAGCTAC

Test sequences:

TGATGCGACGATCGATCGCATAC
ACGAGGGGCTAGCTACA
GCTAGCCCATCGATCTAGATCGAGCGATCGA
ACGTTGGCTAGCTACG

Calculation of k-mer frequency

Training data:

          AA  AT  AC  AG  TA  TT  …
Class1,    8   2   4   5   2   1  …
Class1,    3   5   1   6   7   2  …
Class1,    0   2   3   5   2   1  …
Class2,    2   9   4   0   9   1  …
Class2,    1   4   0   3   0   5  …
Class3,    8   1   1   5   1   1  …
Class3,    6   0   3   7   7   7  …
Class3,    5   2   7   1   6   2  …

Test data:

AA  AT  AC  AG  TA  TT  …
 3   5   1   6   7   2  …
 0   2   3   5   2   1  …
 2   9   4   0   9   1  …
 6   0   3   7   7   7  …

Calculate Euclidean distance (using k-mer frequency) between test data and all training data

[Figure legend: Class1, Class2, Class3, Test data]

The label of the test data is determined by the most frequently occurring class in the k nearest training instances. If k=3, the test data will be classified as Class 3 in this example. If there is a tie for the most frequently occurring class, our classifier will return ‘Uncertain’.
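The workflow described on this slide can be sketched in a few lines. This is a minimal illustration in Python, not the R package's actual code: the function names `kmer_counts` and `knn_classify`, the parameters `kmer_len` and `n_neighbours`, and the toy sequences are all hypothetical.

```python
from collections import Counter
from itertools import product
import math


def kmer_counts(seq, kmer_len=2):
    """Count overlapping k-mers and return them in a fixed (alphabetical) order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=kmer_len)]
    counts = Counter(seq[i:i + kmer_len] for i in range(len(seq) - kmer_len + 1))
    return [counts[m] for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train_seqs, train_labels, test_seq, kmer_len=2, n_neighbours=3):
    """Return the majority class of the nearest neighbours, or 'Uncertain' on a tie."""
    test_vec = kmer_counts(test_seq, kmer_len)
    # Sort (distance, index) pairs; ties in distance fall back to training order.
    dists = sorted(
        (euclidean(kmer_counts(s, kmer_len), test_vec), i)
        for i, s in enumerate(train_seqs)
    )
    votes = Counter(train_labels[i] for _, i in dists[:n_neighbours]).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return "Uncertain"
    return votes[0][0]


# Toy data (hypothetical), chosen so the distances are easy to check by hand.
train_seqs = ["AAAAAAAA", "AAAAAAAT", "CCCCCCCC", "CCCCCCCG", "GGGGGGGG"]
train_labels = ["Class1", "Class1", "Class2", "Class2", "Class3"]
print(knn_classify(train_seqs, train_labels, "AAAAAATA"))  # Class1
```

Having a small reference version like this is also useful later: it can serve as a second implementation to compare against, or as the program under test when trying out test-case ideas.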


Correct implementation

What are good test cases? What is a good oracle for this program?


This version has a fault.

Correct: sqrt(sum((thisTest - thisTrain)^2))


This version has a fault.

Correct: k==kk


[Figure: the testing process in three stages. Test case selection: test cases are chosen from the input space, part of which is failure-causing input. Test execution: each test case is executed by the Program Under Test (PUT). Output verification: the output is verified by the oracle, giving either a successful test case or a detected failure.]

1. Test Case Selection Problem: how to increase the chance of selecting a test case from the failure-causing inputs?

2. Oracle Problem: how to decide whether the output of any given test case is correct?
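The select / execute / verify loop and the test case selection problem can be illustrated with a toy example. Everything here is hypothetical: `gc_fraction_put` is an invented program under test with a deliberately planted fault, and `gc_fraction_oracle` stands in for an oracle that, in practice, is usually much harder to obtain.

```python
import random


def gc_fraction_put(seq):
    """Program under test (hypothetical). Planted fault: lower-case g/c are ignored."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_fraction_oracle(seq):
    """Oracle (hypothetical): a trusted reference computation."""
    return sum(base in "GCgc" for base in seq) / len(seq)


def run_tests(test_cases):
    """Execute each test case with the PUT and verify its output with the oracle."""
    failures = []
    for seq in test_cases:
        if abs(gc_fraction_put(seq) - gc_fraction_oracle(seq)) > 1e-12:
            failures.append(seq)
    return failures


# A narrow selection (upper-case only) never reaches the failure-causing inputs.
narrow = ["ACGT", "GGCC", "ATAT"]
# A wider random selection over mixed-case input is far more likely to reach them.
rng = random.Random(0)
wide = ["".join(rng.choice("ACGTacgt") for _ in range(10)) for _ in range(50)]

print(len(run_tests(narrow)), "failures detected by the narrow selection")
print(len(run_tests(wide)), "failures detected by the wide selection")
```

The fault is present in both runs; only the choice of test cases determines whether a failure is observed.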


Can we learn from the software testing field?

A good software testing strategy should actively reveal as many faults as possible using a selected set of test cases.

Selection of test cases (input)
•  Special test cases? Random test cases? Tests based on program flow? Tests based on failure pattern?

Test execution and automation
•  Which test to execute first, and what to execute next? When to stop? Can we automate this process?

Verification of test cases
•  How to check the correctness of the output from large and complex software? Simulation programs? Machine learning software?

Test reporting and documentation
•  Reporting testing for validation and verification


Weyuker (1982) Computer Journal


Example: how to test sin(x)?

sin function:
sin(0°) = 0
sin(30°) = 0.5

Suppose the program returns:
sin(29.8°) = 0.51234 (incorrect)
sin(29.8°) = 0.49876 (correct?)

How do I design test cases without knowing the implementation of the program? E.g.,

$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$$


Three standard techniques

Special test cases / special value testing
•  Selected cases where the correct output is known (from external experimental validation or simulation)

N-version programming
•  Check concordance between multiple implementations or variants of the same initial specification

Expected range
•  Check that the output is within the ‘expected range’ of values. Even though we cannot determine precisely what the value may be, it is often possible to determine an expected range of values the output should fall in.


Three solutions

Solution 1: special test cases, such as x = 0, π/6, π/2, …
Problem: can only test a small subset of inputs

Solution 2: N-version programming, compare the results of multiple independent versions of sin(x)
Problem: what happens when an inconsistency is detected?

Solution 3: check the expected range, such as visual inspection of the plot of sin(x)
Problem: not quantitative
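The three solutions can each be written as a small check. This is a sketch under stated assumptions: the helper names are hypothetical, the "second version" used for N-version comparison is the imaginary part of exp(ix) from Python's `cmath` module, and the range check is made quantitative by using the known bound −1 ≤ sin(x) ≤ 1 rather than visual inspection.

```python
import cmath
import math


def special_values_ok(sin_fn, tol=1e-12):
    """Solution 1: compare against a handful of inputs with known outputs."""
    cases = [(0.0, 0.0), (math.pi / 6, 0.5), (math.pi / 2, 1.0)]
    return all(abs(sin_fn(x) - expected) <= tol for x, expected in cases)


def n_version_ok(sin_fn, xs, tol=1e-9):
    """Solution 2: compare against an independently computed version."""
    return all(abs(sin_fn(x) - cmath.exp(1j * x).imag) <= tol for x in xs)


def range_ok(sin_fn, xs):
    """Solution 3: check that every output falls in the expected range."""
    return all(-1.0 <= sin_fn(x) <= 1.0 for x in xs)


def faulty_sin(x):
    """Hypothetical faulty version: only correct for |x| <= 2*pi."""
    return math.sin(x) if abs(x) <= 2 * math.pi else 0.0


xs = [i * 0.1 for i in range(-100, 101)]  # -10.0 ... 10.0 radians
for name, fn in [("math.sin", math.sin), ("faulty_sin", faulty_sin)]:
    print(name, special_values_ok(fn), n_version_ok(fn, xs), range_ok(fn, xs))
```

For this particular planted fault, the special values and the range check both pass, and only the comparison against a second version reveals the failure. A different fault could just as easily slip past the N-version check, which is why the techniques are usually combined.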


What about systems biology?

Software testing concept: Special test cases

•  Compile a test suite of models that have been solved analytically or using numerical methods
•  Run each stochastic simulator many times, and check that the results do not deviate substantially from the analytical solution
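As a sketch of this idea, consider a model simple enough to have a closed-form answer. The model (first-order decay A → ∅), the function names, the number of runs and the tolerance are all assumptions chosen for illustration; they are not taken from any particular test suite.

```python
import math
import random


def simulate_decay(n0, rate, t_end, rng):
    """One stochastic (Gillespie-style) run of first-order decay A -> 0.

    Returns the number of molecules remaining at time t_end.
    """
    n, t = n0, 0.0
    while n > 0:
        t += rng.expovariate(rate * n)  # waiting time until the next decay event
        if t > t_end:
            break
        n -= 1
    return n


def mean_matches_analytic(n0, rate, t_end, runs=2000, tol=1.0, seed=1):
    """Run the simulator many times and compare its mean with n0 * exp(-rate * t)."""
    rng = random.Random(seed)
    mean = sum(simulate_decay(n0, rate, t_end, rng) for _ in range(runs)) / runs
    expected = n0 * math.exp(-rate * t_end)  # analytical mean of the decay process
    return abs(mean - expected) <= tol


print(mean_matches_analytic(n0=100, rate=1.0, t_end=1.0))
```

The tolerance has to reflect the sampling noise of the simulator: too tight and a correct simulator fails by chance, too loose and real faults go unnoticed.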


Software testing concept: N-version programming


How to determine the correctness of the program output?

Indeed, it is very hard to verify the correctness of any given output of these programs. (If we knew the expected output, we would not need these programs in the first place!)

Common techniques for dealing with the Oracle problem:

Special test cases
•  Selected cases where the correct output is known (from external experimental validation or simulation)

N-version programming
•  Check concordance between multiple implementations or variants of the same initial specification

Expected range
•  Check that the range of values is within expectation


Metamorphic Testing

The sin function has the following properties:
sin(x) = sin(x + 360°)
……

Execute the program with inputs x = 29.8° and x = 389.8°, and check that sin(29.8°) = sin(389.8°).

Key idea: multiple executions of the same program.
•  Identify the expected output of a program from previously executed test cases
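The periodicity check can be written directly as a test. In this sketch, `taylor_sin` is a hypothetical implementation under test (a truncated Taylor series with no range reduction), and `periodicity_mr_holds` and its tolerance are illustrative choices.

```python
import math


def taylor_sin(x_rad, terms=4):
    """Hypothetical implementation under test: truncated Taylor series."""
    return sum(
        (-1) ** n * x_rad ** (2 * n + 1) / math.factorial(2 * n + 1)
        for n in range(terms)
    )


def periodicity_mr_holds(sin_fn, x_deg, tol=1e-6):
    """Metamorphic relation: sin(x) should equal sin(x + 360 degrees)."""
    source = sin_fn(math.radians(x_deg))
    follow_up = sin_fn(math.radians(x_deg + 360.0))
    return abs(source - follow_up) <= tol


print(periodicity_mr_holds(math.sin, 29.8))    # True
print(periodicity_mr_holds(taylor_sin, 29.8))  # False
```

Note that `taylor_sin` is accurate at 29.8° itself, so a test that only looked at that single output would pass. The failure only shows up when the two executions are compared, and no knowledge of the true value of sin(29.8°) is needed.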


Chen et al (2009) BMC Bioinformatics

Core idea: Execute the same program multiple times with slightly modified input, such that their output could be compared to some expected properties


Different metamorphic relations (MRs) and test cases differ in effectiveness

We tested a network simulator with real and simulated data using 10 Metamorphic Relations on the original and mutant programs

Chen et al (2009) BMC Bioinformatics


Advantages of Metamorphic Testing

•  We can use real data (instead of simulated data) as test cases as there is now a mechanism to verify the output

•  Usually quite easy to implement if you know some properties of the algorithm

•  Can be used in conjunction with other testing techniques, such as special test cases and N-version programming


An example MR for a kNN classifier

Source test case
•  Source input: Train.seq, Train.cls, OneTest.seq, k, kk
•  Source output: cls

Follow-up test case (if cls != ‘Uncertain’)
•  Follow-up input: (Train.seq, OneTest.seq), (Train.cls, cls), OneTest.seq, k, kk
•  Expected follow-up output: cls


Another example MR for a kNN classifier

Source test case
•  Source input: Train.seq, Train.cls, OneTest.seq, k, kk
•  Source output: cls

Follow-up test case (if cls != ‘Uncertain’)
•  Follow-up input: (Train.seq + duplicate all sequences from class cls), (Train.cls + cls), OneTest.seq, k, kk
•  Expected follow-up output: cls
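Both kNN metamorphic relations can be expressed as short checks around any classifier with this interface. The sketch below uses a compact stand-in classifier in Python (`knn`, fixed to 2-mer counts, with a hypothetical `n_neighbours` parameter in place of the slide's `k` and `kk`); it is not the R package, and `math.dist` requires Python 3.8 or later.

```python
from collections import Counter
from itertools import product
import math

KMERS = ["".join(p) for p in product("ACGT", repeat=2)]


def features(seq):
    """2-mer count vector in a fixed order."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    return [counts[m] for m in KMERS]


def knn(train_seqs, train_cls, test_seq, n_neighbours=3):
    """Stand-in classifier: majority vote of nearest neighbours, 'Uncertain' on a tie."""
    test_vec = features(test_seq)
    order = sorted(
        range(len(train_seqs)),
        key=lambda i: math.dist(features(train_seqs[i]), test_vec),
    )
    votes = Counter(train_cls[i] for i in order[:n_neighbours]).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return "Uncertain"
    return votes[0][0]


def mr_add_test_point(train_seqs, train_cls, test_seq, n_neighbours=3):
    """First MR: adding the test sequence, labelled cls, must not change the output."""
    cls = knn(train_seqs, train_cls, test_seq, n_neighbours)
    if cls == "Uncertain":
        return True  # relation not applicable
    follow_up = knn(train_seqs + [test_seq], train_cls + [cls], test_seq, n_neighbours)
    return follow_up == cls


def mr_duplicate_class(train_seqs, train_cls, test_seq, n_neighbours=3):
    """Second MR: duplicating every training sequence of class cls must not change the output."""
    cls = knn(train_seqs, train_cls, test_seq, n_neighbours)
    if cls == "Uncertain":
        return True  # relation not applicable
    extra = [s for s, c in zip(train_seqs, train_cls) if c == cls]
    follow_up = knn(
        train_seqs + extra, train_cls + [cls] * len(extra), test_seq, n_neighbours
    )
    return follow_up == cls


train_seqs = ["AAAAAAAA", "AAAAAAAT", "CCCCCCCC", "CCCCCCCG", "GGGGGGGG"]
train_cls = ["Class1", "Class1", "Class2", "Class2", "Class3"]
print(mr_add_test_point(train_seqs, train_cls, "AAAAAATA"))
print(mr_duplicate_class(train_seqs, train_cls, "AAAAAATA"))
```

Because neither check needs to know the correct class, the same two functions can be run over real promoter sequences rather than hand-built toy data.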


Other useful tips for choosing good test cases

•  Test boundary values
•  Use diverse test cases
•  The order of execution of the test cases matters because failure-causing inputs are generally clustered together in the input space


Failure pattern and test case diversity


Failure-causing pattern fixed but unknown


•  Can reduce the number of test cases by up to 50%
•  Challenge is to define a good distance measure
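One way to obtain diverse test cases is to pick each new test case as far as possible from those already executed, an approach often described as adaptive random testing. The slides do not spell out the algorithm, so the following is an assumption: a minimal candidate-set sketch over a two-dimensional numeric input space, with squared Euclidean distance as the (hypothetical) distance measure.

```python
import random


def diverse_test_cases(n_tests, n_candidates=10, seed=0):
    """Select n_tests (>= 1) points in the unit square, spreading them out.

    Each new test case is the random candidate whose nearest already-chosen
    test case is farthest away.
    """
    rng = random.Random(seed)
    chosen = [(rng.random(), rng.random())]
    while len(chosen) < n_tests:
        candidates = [(rng.random(), rng.random()) for _ in range(n_candidates)]

        def min_sq_dist(c):
            # Squared distance from candidate c to its nearest chosen test case.
            return min((c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2 for p in chosen)

        chosen.append(max(candidates, key=min_sq_dist))
    return chosen


for point in diverse_test_cases(5):
    print(point)
```

For sequence data the difficult part is exactly the one named on the slide: replacing `min_sq_dist` with a distance measure that is meaningful for the program's inputs.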


Kamali (2015) Biophysical Reviews


Quality assurance in translational bioinformatics


Motivation: Quality assurance of clinically-oriented bioinformatics pipelines for human genetic mutation identification

[Figure: clinical workflow. Take blood for DNA sequencing → identify genetic variants from hundreds of millions of short reads → genetic counseling, inform treatment options]


Motivation – Clinical Guidelines about quality assurance

•  RCPA – Massively Parallel Sequencing Implementation Guidelines
•  Aimed at diagnostic laboratories implementing next generation sequencing
•  “The validation study must establish the analytical validity of the bioinformatics pipeline in terms of being able to correctly detect sequence variants”
•  “The laboratory must validate the entire bioinformatics pipeline as a whole, under the given operational environment”
•  … but the guidelines do not specify how


Motivation – Many variant calling pipelines exist, but their results have low concordance…

J. O’Rawe, et al., “Low concordance of multiple variant-calling pipelines: practical implications for exome and genome sequencing,” Genome Med., vol. 5, no. 3, p. 28, Mar. 2013.


A genomic variant calling pipeline

A sequence with length ~3 × 10^9

~200 million sequence reads, each with length 100

~2 million variants


Vision: Validation and Quality Control on the cloud


Framework Overview


Tests

Test   Description
MR0    Deterministic output
MR1    Random permutation of input
MR2    Duplication of reads
MR3    Unmapped reads
MR4    Mapped reads
SI0    Simulated reads – no mutations
SI1    Simulated reads – mutations
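To make some of these tests concrete, here is a sketch against a toy variant caller. Everything in it is an assumption: `call_variants` is a hypothetical strict-majority pileup caller working on pre-aligned reads, not a real pipeline, and the expected outcomes (identical calls for MR0, MR1 and MR2; no calls for SI0) are what holds for this toy caller rather than something the table states. MR3, MR4 and SI1 are not covered.

```python
import random
from collections import Counter, defaultdict


def call_variants(reference, reads):
    """Toy caller: report positions where a strict majority of bases differs from the reference.

    reads is a list of (start_position, sequence) pairs, already aligned.
    """
    pileup = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1
    variants = []
    for pos in sorted(pileup):
        depth = sum(pileup[pos].values())
        for base, count in pileup[pos].items():
            if base != reference[pos] and 2 * count > depth:
                variants.append((pos, reference[pos], base))
    return variants


reference = "ACGTACGT"
reads = [(0, "ACGA"), (1, "CGAA"), (2, "GAAC"), (4, "ACGT")]
source = call_variants(reference, reads)

# MR0: running the same input twice gives the same output.
assert call_variants(reference, reads) == source

# MR1: a random permutation of the input reads gives the same output.
shuffled = reads[:]
random.Random(0).shuffle(shuffled)
assert call_variants(reference, shuffled) == source

# MR2: duplicating every read gives the same output (for this toy caller).
assert call_variants(reference, reads + reads) == source

# SI0: reads simulated from the reference without mutations give no variants.
simulated = [(i, reference[i:i + 4]) for i in range(5)]
assert call_variants(reference, simulated) == []

print(source)  # [(3, 'T', 'A')]
```

Whether a relation such as MR2 should hold exactly for a production pipeline depends on the tools involved, for example on duplicate marking or depth-dependent filters, so each relation needs to be justified for the pipeline being validated.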


Results

Cost: on-demand vs spot
•  9 x c3.8xlarge instances for 6 hours ($1.68/hr/instance on-demand)
•  76% saving using spot instances

On-Demand: $90.72
Spot: $21.60
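The figures on this slide can be checked with a few lines of arithmetic. The function name is hypothetical, and the per-instance spot rate is not given on the slide; it is only implied by the quoted totals.

```python
def cost_summary(instances, hours, on_demand_rate, spot_total):
    """Return (on-demand total, fractional saving, implied spot rate per instance-hour)."""
    on_demand_total = instances * hours * on_demand_rate
    saving = 1 - spot_total / on_demand_total
    implied_spot_rate = spot_total / (instances * hours)
    return on_demand_total, saving, implied_spot_rate


total, saving, spot_rate = cost_summary(
    instances=9, hours=6, on_demand_rate=1.68, spot_total=21.60
)
print(f"On-demand total: ${total:.2f}")                       # $90.72
print(f"Saving with spot: {saving:.0%}")                      # 76%
print(f"Implied spot rate: ${spot_rate:.2f}/hr/instance")     # $0.40
```

The on-demand total is 9 × 6 × $1.68 = $90.72, and $21.60 is about 24% of that, which matches the quoted 76% saving after rounding.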


Summary

•  Understand the importance and challenges of software testing in bioinformatics

•  Understand basic concepts and techniques in software testing

•  Understand how we can implement QA in bioinformatics, especially in translational genomic applications


http://bioinformatics.victorchang.edu.au

Joint work with

Eleni Giannoulatou (Lab Head)

Michael Troup (RA)

Andrian Yang (PhD student)

Amir Kamali (MPhil student)
