Software Testing and the R Language

useR! 2014, U.C.L.A.

Stephen Kaluzny

TIBCO Software Inc.

July 3, 2014

DESCRIPTION

As the number of packages available for R continues to grow, maintaining and testing these packages becomes more difficult. This difficulty is compounded as independent implementations of the R language, such as TIBCO Enterprise Runtime for R (TERR), are developed. To address this, we have created a test automation framework for testing packages with both TERR and R. We will describe how the framework automatically creates tests from a package's source files. Issues with testing on multiple platforms will be discussed. Suggestions for improving packages with tests will also be presented.


Introduction

1 Testing R engines

TIBCO’s approach

Not validating R (Andy Nichols talk on Wednesday)

2 Testing R packages

How does one test a package?


Motivation and Background

Company has been working on S/R engines for 26 years

Originally S and S-PLUS with StatSci / MathSoft / Insightful / TIBCO

Now working with TIBCO Enterprise Runtime for R (TERR)

TERR is an embeddable R engine for enterprises

the analytics engine in TIBCO Spotfire visualization software

“Develop in R, Deploy in TERR”


Importance of Testing

Need to ensure correct results from the software

Is R giving the correct results?

R Core has this covered fairly well

Do your results match R?

There are now several R engines that need testing, including:

TERR

Renjin (JVM)

ORE Statistics Engine (Oracle)

Test with use of different software libraries, e.g.

linear algebra

math functions and statistical distributions (Intel Math Library)

R on special hardware

GPU for linear algebra

multi-core systems


Testing the TERR Engine

Developers write unit tests as features (functions) are developed

in C++ and R

Quality Control group writes tests for each feature

often converting tests developed for S-PLUS

Quality Control group writes regression tests for bugs that are fixed

these tests often cover complex areas of the code

We have created several packages to assist with development and testing


Assertion Test Framework

Based on the loop test idea from John Chambers for S version 3

Each individual test is an S expression that evaluates to true if successful

The tests for a function are grouped into a file

Each file is run in a clean environment

Metadata at the top of the file is used by the test framework

Entire test system is controlled by a Jenkins job

can request only particular tests
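The idea on this slide can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the TERR framework itself: the helper name `run_assertion_file` is hypothetical, and the sketch treats every top-level expression in the file as a test.

```r
# Hypothetical helper (not the TERR framework): evaluate each top-level
# expression of a test file in a fresh environment and record whether it
# evaluates to TRUE. Errors count as failures.
run_assertion_file <- function(path) {
  exprs <- as.list(parse(file = path, keep.source = FALSE))
  env <- new.env(parent = globalenv())  # clean environment for this file
  passed <- vapply(exprs, function(e) {
    value <- tryCatch(eval(e, envir = env), error = function(err) FALSE)
    isTRUE(value)
  }, logical(1))
  text <- vapply(exprs, function(e) paste(deparse(e), collapse = " "),
                 character(1))
  data.frame(expression = text, passed = passed, stringsAsFactors = FALSE)
}

# A small test file: two assertions that hold and one that does not.
test_file <- tempfile(fileext = ".t")
writeLines(c(
  "all.equal(mean(c(1, 2, 3)), 2)",
  "identical(rev(1:3), 3:1)",
  "sum(1:4) == 11"
), test_file)

results <- run_assertion_file(test_file)
print(results)
```

Using `isTRUE()` means that a character result from `all.equal()` (its way of describing a difference) is counted as a failure rather than as a truthy value.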


Assertion Test Metadata

## TestName: glm.t
## Package: stats
## FunctionalArea: statistical models
## Tags: models, glm, model.frame
## Description: Basic glm function
## RequiredPackages:
## UsesRandom: FALSE
## Author: ctaylor, meriksso, nmockler
## TestType: unit
## TestFormat: assertion
## NotBeforeVersion:
## NotAfterVersion:
## ExcludePlatforms:
## RunsInSnext: TRUE
## RunsInR: TRUE
## RunsInSplus:
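A header made of `## Key: value` lines is easy to read with base R. The sketch below is an assumption about how such a header could be parsed, not the framework's actual code; `parse_test_metadata` is a hypothetical name.

```r
# Hypothetical parser for "## Key: value" header lines. Lines that do not
# match the pattern (the test expressions themselves) are ignored.
parse_test_metadata <- function(lines) {
  header <- grep("^## *[A-Za-z]+:", lines, value = TRUE)
  keys <- sub("^## *([A-Za-z]+):.*$", "\\1", header)
  values <- trimws(sub("^## *[A-Za-z]+:", "", header))
  as.list(stats::setNames(values, keys))
}

example_lines <- c(
  "## TestName: glm.t",
  "## Package: stats",
  "## Tags: models, glm, model.frame",
  "## UsesRandom: FALSE",
  "## RunsInR: TRUE",
  "## ExcludePlatforms:",
  "all.equal(1, 1)"
)

meta <- parse_test_metadata(example_lines)
tags <- trimws(strsplit(meta$Tags, ",")[[1]])
runs_in_r <- identical(meta$RunsInR, "TRUE")
```

A test runner could then use fields such as `RunsInR` or `ExcludePlatforms` to decide whether to run the file on a given engine and platform; empty fields come back as empty strings.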


assertionTest Package

Created to support Assertion Test Framework

Not unlike the testthat package

Key Functions:

assertionTest

the main function, run on a file of expressions

atExpectStop and atExpectWarnings

atUsesRandom

resets seed after each expression

error if .Random.seed changes outside of this function

atSource
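The check on `.Random.seed` can be illustrated in base R. This is a sketch of the general idea only, not the assertionTest API; `changes_random_seed` is a hypothetical helper.

```r
# Hypothetical helper: evaluate an expression and report whether it changed
# the global random number generator state (.Random.seed).
changes_random_seed <- function(expr, env = parent.frame()) {
  get_seed <- function() {
    if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
      get(".Random.seed", envir = globalenv())
    } else {
      NULL
    }
  }
  before <- get_seed()
  eval(expr, envir = env)
  !identical(before, get_seed())
}

set.seed(42)
uses_rng <- changes_random_seed(quote(runif(3)))
no_rng <- changes_random_seed(quote(sum(1:10)))

# Resetting the seed before each expression makes random results repeatable.
set.seed(42); first <- runif(3)
set.seed(42); second <- runif(3)
repeatable <- identical(first, second)
```

A test that draws random numbers without declaring it would show up here as an unexpected change of state, which is the kind of situation the slide describes as an error.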


Assertion Test Stats

Tests are run after every nightly build

3 build platforms (Win-32, Win-64, Linux-64)

4 test platforms (Win7-32, Win7-64, Win8-64, Linux-64)

Two compilers on each platform (Intel and gcc or Microsoft)

Two builds of each compiler/platform combination (Release, Debug)

Results are written to a MySQL database

Over 1000 test files for 16 “core” packages in TERR

Developer “unit” assertion tests are run during continuous builds

build runs every 10 minutes if there has been a check-in

build results email goes to any developer who made a check-in for that build


Other Tests for TERR

Manual Tests

run by QC group before a release

Performance Tests

spiff tests - diff tests (à la Rout.save) with a numerical tolerance

mostly adapted from S-PLUS
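A diff with a numerical tolerance can be sketched as follows. This is not the spiff tool; `tolerant_equal` is a hypothetical helper that splits two transcripts into whitespace-separated tokens, requires text tokens to match exactly, and compares numeric tokens with `all.equal()`, whose tolerance is relative.

```r
# Hypothetical tolerance-aware comparison of two output transcripts.
tolerant_equal <- function(expected, actual, tolerance = 1e-8) {
  tok_e <- unlist(strsplit(trimws(expected), "[[:space:]]+"))
  tok_a <- unlist(strsplit(trimws(actual), "[[:space:]]+"))
  if (length(tok_e) != length(tok_a)) return(FALSE)
  num_e <- suppressWarnings(as.numeric(tok_e))
  num_a <- suppressWarnings(as.numeric(tok_a))
  is_num <- !is.na(num_e) & !is.na(num_a)
  # Non-numeric tokens must be identical.
  text_ok <- identical(tok_e[!is_num], tok_a[!is_num])
  # Numeric tokens may differ within the tolerance.
  num_ok <- isTRUE(all.equal(num_e[is_num], num_a[is_num],
                             tolerance = tolerance))
  text_ok && num_ok
}

saved <- c("> mean(x)", "[1] 0.3333333")
fresh <- c("> mean(x)", "[1] 0.3333334")
different <- c("> mean(x)", "[1] 0.3433333")

same_within_tol <- tolerant_equal(saved, fresh, tolerance = 1e-6)
flagged <- !tolerant_equal(saved, different, tolerance = 1e-6)
```

A plain text diff would flag `fresh` as a failure even though the two values differ only in the last printed digit.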


RinR Package

Evaluates R expressions in a different R interpreter

Includes evaluator functions that evaluate expressions in a particular interpreter

REvaluator - R interpreter

TERREvaluator - TERR interpreter

LocalEvaluator - current interpreter

makeREvaluator - creates evaluators e.g. R2.15.3Evaluator

multiREvaluator - evaluate an expression in multiple interpreters
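To give a feel for what an evaluator does, the sketch below runs an expression in a separate interpreter process and reads the result back. It is not how RinR is implemented; `evaluate_in` is a hypothetical function, and it assumes that the other interpreter offers an Rscript-style command line and `saveRDS()`. Here the second interpreter is simply the Rscript that ships with the running R.

```r
# Hypothetical stand-in for an evaluator: write a small script, run it with
# another interpreter's Rscript, and read the serialized result back.
evaluate_in <- function(expr,
                        rscript = file.path(R.home("bin"), "Rscript")) {
  script <- tempfile(fileext = ".R")
  result_file <- tempfile(fileext = ".rds")
  on.exit(unlink(c(script, result_file)))
  expr_text <- paste(deparse(expr), collapse = "\n")
  writeLines(c(
    sprintf("result <- eval(parse(text = %s))", deparse(expr_text)),
    sprintf("saveRDS(result, %s)", deparse(result_file))
  ), script)
  status <- system2(rscript, c("--vanilla", shQuote(script)))
  if (status != 0) stop("the child interpreter reported an error")
  readRDS(result_file)
}

remote <- evaluate_in(quote(quantile(c(0, 1, 1, 2, 3, 5, 8, 13))))
local <- quantile(c(0, 1, 1, 2, 3, 5, 8, 13))
agree <- isTRUE(all.equal(remote, local))
```

Pointing the `rscript` argument at a different installation would evaluate the same expression under that interpreter, which is the pattern the evaluator functions above package up.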


RinR Examples

# From within TERR:
> library(RinR)
> REvaluate(version)

platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          1.0
year           2014
month          04
day            10
svn rev        65387
language       R
version.string R version 3.1.0 (2014-04-10)
nickname       Spring Dance

> .libPaths(c(.libPaths(), REvaluate(.libPaths())))
> .libPaths()
[1] "/home/TERR/TERR-intel.140620/library"
[2] "/home/R/R-3.1.0/lib/R/library"


RinR Examples (page 2)

RCompare is often used during development

> compareQuant <- RCompare(quantile(c(0,1,1,2,3,5,8,13)))
> class(compareQuant)
[1] "sideBySide"
> compareQuant

     R version 3.1.0 (2014-04-10)    TERR version 3.0.0 (2014-06-20)
[1,]   0%  25%  50%  75%  100%         0%  25%  50%  75%  100%
[2,] 0.00 1.00 2.50 5.75 13.00       0.00 1.00 2.50 5.75 13.00
[3,]
[4,]
$all.equal
$all.equal$R version 3.1.0 (2014-04-10) vs. TERR version 3.0.0 (2014-06-20)
[1] TRUE
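The comparison step can be imitated with `all.equal()`. The sketch below is not the RCompare implementation; `compare_side_by_side` is a hypothetical function, and the second "engine" is simulated by perturbing the result slightly so that the example runs in a single interpreter.

```r
# Hypothetical side-by-side comparison. Each evaluator is a function that
# takes an expression and returns its value; the first one is the reference.
compare_side_by_side <- function(expr, evaluators, tolerance = 1.5e-8) {
  results <- lapply(evaluators, function(evaluate) evaluate(expr))
  reference <- results[[1]]
  agrees <- vapply(results[-1], function(r) {
    isTRUE(all.equal(reference, r, tolerance = tolerance))
  }, logical(1))
  list(results = results, all.equal = agrees)
}

evaluators <- list(
  engine_a = function(e) eval(e, envir = globalenv()),
  # Simulated second engine: same answer up to floating point noise.
  engine_b = function(e) eval(e, envir = globalenv()) * (1 + 1e-12)
)

comparison <- compare_side_by_side(
  quote(quantile(c(0, 1, 1, 2, 3, 5, 8, 13))), evaluators
)
bitwise_same <- identical(comparison$results$engine_a,
                          comparison$results$engine_b)
```

Comparing with a tolerance rather than `identical()` matters when engines use different math libraries: here the two results are not bitwise equal, yet they agree within the default tolerance of `all.equal()`.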


Future TERR Testing

Add new tests as new features are added

Incorporating tests from core R


Testing Packages

One of our biggest challenges

First question from customers: Does package xyzzy work with TERR?

What does it mean for a package to work?

How do you test over 5000 packages?

Package versions and dependencies add to the challenge


Package Testing with R

R CMD check does some testing

runs files in tests

compare to Rout.save files, if they exist

runs examples from help files

compare to Rout.save files, if they exist in tests/Examples

Some packages have formal tests that can be run, using these packages:

RUnit : R Unit test framework

scriptests : Transcript-based unit tests that are easy to create and maintain

svUnit : SciViews GUI API - Unit testing

testit : A simple package for testing R packages

testthat : Testthat code. Tools to make testing fun :)


CRAN Test Statistics

CRAN packages from early June, 2014 - 5566 packages

961 (17.3%) have at least one tests directory

230 (4.1%) have target files for comparing test results (*.Rout.save)

394 of the 961 (7.1% of all packages) have actual software tests:

262 (4.7%) use testthat

113 (2.0%) use RUnit

10 (0.2%) use svUnit

5 (0.09%) use testit

4 (0.07%) use scriptests
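The percentages on this slide are all relative to the 5566 CRAN packages, which a short calculation confirms. The variable names below are illustrative only; the counts are the ones quoted above.

```r
# Counts quoted on the slide; percentages are shares of all CRAN packages.
total <- 5566
counts <- c(tests_dir = 961, rout_save = 230, formal_tests = 394,
            testthat = 262, RUnit = 113, svUnit = 10,
            testit = 5, scriptests = 4)
percent_of_cran <- round(100 * counts / total, 1)

# The five framework counts add up to the "actual software tests" figure.
frameworks <- c("testthat", "RUnit", "svUnit", "testit", "scriptests")
framework_total <- sum(counts[frameworks])

# Share of packages with a tests directory that use a test framework.
percent_of_tested <- round(100 * counts[["formal_tests"]] /
                             counts[["tests_dir"]], 1)
print(percent_of_cran)
```

Measured against only the 961 packages that have a tests directory, the share using a test framework is about 41%.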


Package Tests

Without creating our own specific tests, from a package we can run:

tests (if they exist)

example code from the help files

code extracted from any vignettes

Problems with this approach

package tests do not always include output to check results

help file code examples seldom have output to compare

help file code examples show usage, not necessarily testing the code

vignette is also to illustrate usage, not for testing the code


Testing Packages with TERR

Run tests, help file examples and vignette examples in R and TERR

Compare the results of each expression

Instrumented in a package (teaextract) that is run from R

use the parallel package to divide up the testing

runs now down to 23 hours from over 6 days

Currently: 1795 CRAN packages run without any errors in TERR

830 give exactly the same results as R

Many packages have a small fraction that do not run

missing a minor function in TERR

Failures with examples that use random numbers, timestamps
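Dividing the work with the parallel package can be sketched as follows. This is not the teaextract code: `check_package` is a hypothetical placeholder that only checks whether a package is installed, where the real system would run the package's tests and examples in both engines and compare results.

```r
library(parallel)

# Hypothetical per-package job. A fixed seed before each job is one way to
# reduce spurious differences in examples that use random numbers.
check_package <- function(pkg) {
  set.seed(1)
  installed <- nzchar(system.file(package = pkg))
  list(package = pkg, ok = installed)
}

packages <- c("stats", "utils", "tools", "notARealPackage123")

# mclapply() forks worker processes; forking is not available on Windows,
# so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
results <- mclapply(packages, check_package, mc.cores = cores)

status <- vapply(results, function(r) r$ok, logical(1))
names(status) <- packages
print(status)
```

Because each package is checked independently, the list can be split across as many workers as the machine offers, which is the kind of change that brings a run of several days down to under one.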

Improvement of this system is a priority in the QC group


Summary and the Future

Testing R Engines

well covered by R and TIBCO

Testing packages - a big challenge

limited tests in packages, there should be more

use of testthat is a good sign

testCoverage package from Mango Solutions would help

does not solve the problem


Questions?
