DESCRIPTION
As the number of packages available for R continues to grow, maintaining and testing these packages becomes more difficult. This difficulty is compounded as independent implementations of the R language, such as TIBCO Enterprise Runtime for R (TERR), are developed. To address this, we have created a test automation framework for testing packages with both TERR and R. We will describe how the framework automatically creates tests from a package's source files. Issues with testing on multiple platforms will be discussed. Suggestions for improving packages with tests will also be presented.
Software Testing and the R Language
useR! 2014, U.C.L.A.
Stephen Kaluzny
TIBCO Software Inc.
July 3, 2014
Stephen Kaluzny (TIBCO Software Inc.) Software Testing July 3, 2014 1 / 21
Introduction
1 Testing R engines
TIBCO’s approach
Not validating R (Andy Nichols talk on Wednesday)
2 Testing R packages
How does one test a package?
Motivation and Background
Company has been working on S/R engines for 26 years
Originally S and S-PLUS with StatSci / MathSoft / Insightful / TIBCO
Now working with TIBCO Enterprise Runtime for R (TERR)
TERR is an embeddable R engine for enterprises
the analytics engine in TIBCO Spotfire visualization software
"Develop in R, Deploy in TERR"
Importance of Testing
Need to ensure correct results from the software
Is R giving the correct results?
R Core has this covered fairly well
Do your results match R?
There are now several R engines that need testing, including:
TERR
Renjin (JVM)
ORE Statistics Engine (Oracle)
Test with use of different software libraries e.g.
linear algebra
math functions and statistical distributions (Intel Math Library)
R on special hardware
GPU for linear algebra
multi-core systems
Testing the TERR Engine
Developers write unit tests as features (functions) are developed
in C++ and R
Quality Control group writes tests for each feature
often converting tests developed for S-PLUS
Quality Control group writes regression tests for bugs that are fixed
these tests often cover complex areas of the code
We have created several packages to assist with development and testing
Assertion Test Framework
Based on the loop test idea for testing from John Chambers for S version 3
Each individual test is an S expression that evaluates to TRUE if successful
Collection of tests for a function are grouped into a file
Each file is run in a clean environment
Metadata at the top of the file is used by the test framework
Entire test system is controlled by a Jenkins job
can request only particular tests
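The core idea, a file of expressions that must each evaluate to TRUE, can be sketched in base R. This is only an illustration under stated assumptions, not the framework's code: the function name run_assertions and the shape of its result are hypothetical.

```r
# Hypothetical helper (not the TIBCO implementation): evaluate each
# expression in `text` and record whether it evaluated to TRUE.
run_assertions <- function(text) {
  exprs <- parse(text = text, keep.source = FALSE)
  # One fresh environment per file, so leftovers from other files are not visible.
  env <- new.env(parent = globalenv())
  passed <- vapply(exprs, function(e) {
    # An error counts as a failed test instead of stopping the whole run.
    isTRUE(tryCatch(eval(e, envir = env), error = function(cond) FALSE))
  }, logical(1))
  labels <- vapply(exprs, function(e) paste(deparse(e), collapse = " "),
                   character(1))
  data.frame(expression = labels, passed = unname(passed),
             stringsAsFactors = FALSE)
}

tests <- c(
  "all.equal(mean(c(1, 2, 3)), 2)",
  "identical(rev(1:3), 3:1)",
  "{ x <- sqrt(16); x == 4 }",
  "1 + 1 == 3",       # evaluates to FALSE, so the test fails
  "stop('boom')"      # an error is recorded as a failure
)
result <- run_assertions(tests)
print(result)
```

Using isTRUE means that anything other than a single TRUE (FALSE, NA, a character message from all.equal, or an error) is treated as a failure.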
Assertion Test Metadata
## TestName: glm.t
## Package: stats
## FunctionalArea: statistical models
## Tags: models, glm, model.frame
## Description: Basic glm function
## RequiredPackages:
## UsesRandom: FALSE
## Author: ctaylor, meriksso, nmockler
## TestType: unit
## TestFormat: assertion
## NotBeforeVersion:
## NotAfterVersion:
## ExcludePlatforms:
## RunsInSnext: TRUE
## RunsInR: TRUE
## RunsInSplus:
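A test runner needs to read this header before deciding whether to run a file. The following is a minimal sketch of such a reader in base R. The field names are taken from the slide; the function read_test_metadata and the way the fields are used afterwards are assumptions for illustration only.

```r
# Hypothetical reader for "## Key: value" header lines. Only the field
# names come from the slide; the parsing rules are an assumption.
read_test_metadata <- function(lines) {
  header <- grep("^##\\s*[A-Za-z]+:", lines, value = TRUE)
  keys <- sub("^##\\s*([A-Za-z]+):.*$", "\\1", header)
  values <- trimws(sub("^##\\s*[A-Za-z]+:", "", header))
  setNames(as.list(values), keys)
}

example_file <- c(
  "## TestName: glm.t",
  "## Package: stats",
  "## Tags: models, glm, model.frame",
  "## UsesRandom: FALSE",
  "## ExcludePlatforms:",
  "## RunsInR: TRUE",
  "all.equal(1 + 1, 2)"
)
meta <- read_test_metadata(example_file)

# A runner could use the fields to select files by tag or by engine.
tags <- trimws(strsplit(meta$Tags, ",")[[1]])
run_in_r <- identical(meta$RunsInR, "TRUE")
```

Empty fields such as ExcludePlatforms come back as empty strings, which a runner could treat as "no restriction".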
assertionTest Package
Created to support Assertion Test Framework
Not unlike the testthat package
Key Functions:
assertionTest
the main function, run on a file of expressions
atExpectStop and atExpectWarnings
atUsesRandom
resets seed after each expression
error if .Random.seed changes outside of this function
atSource
atSource
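The slides do not show how atUsesRandom detects seed changes. One way to notice that an expression consumed random numbers is to compare .Random.seed before and after evaluating it, as in this base R sketch. The function eval_with_seed_check is hypothetical and is not part of the assertionTest package.

```r
# Hypothetical sketch: reset the seed before each expression and report
# which expressions changed .Random.seed, i.e. used the random generator.
eval_with_seed_check <- function(exprs, seed = 1) {
  vapply(exprs, function(e) {
    set.seed(seed)
    before <- get(".Random.seed", envir = globalenv())
    eval(e, envir = new.env(parent = globalenv()))
    after <- get(".Random.seed", envir = globalenv())
    !identical(before, after)
  }, logical(1))
}

exprs <- expression(mean(1:10), runif(3), sample(5))
used_random <- eval_with_seed_check(exprs)
print(used_random)
```

A runner built on this check could raise an error whenever a test changes the seed without having declared that it uses random numbers.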
Assertion Test Stats
Tests are run after every nightly build
3 build platforms (Win-32, Win-64, Linux-64)
4 test platforms (Win7-32, Win7-64, Win8-64, Linux-64)
Two compilers on each platform (Intel and gcc or Microsoft)
Two builds of each compiler/platform combination (Release, Debug)
Results are written to a MySQL database
Over 1000 test files for 16 "core" packages in TERR
Developer "unit" assertion tests are run during continuous builds
build runs every 10 minutes if there has been a checkin
build results email goes to any developer who made a checkin for that build
Other Tests for TERR
Manual Tests
run by QC group before a release
Performance Tests
spiff test - diff tests (à la Rout.save) with a numerical tolerance
mostly adapted from S-PLUS
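A diff with a numerical tolerance can be sketched in base R as follows. This is not the spiff tool; tolerant_diff is a hypothetical function that compares saved and actual output token by token, requiring text tokens to match exactly and numeric tokens to agree within the relative tolerance of all.equal.

```r
# Hypothetical tolerant diff of two sets of output lines (not the spiff tool).
tolerant_diff <- function(actual, expected, tolerance = 1e-8) {
  if (length(actual) != length(expected)) return(FALSE)
  same_line <- function(a, b) {
    ta <- strsplit(trimws(a), "\\s+")[[1]]
    tb <- strsplit(trimws(b), "\\s+")[[1]]
    if (length(ta) != length(tb)) return(FALSE)
    na <- suppressWarnings(as.numeric(ta))
    nb <- suppressWarnings(as.numeric(tb))
    numeric_pair <- !is.na(na) & !is.na(nb)
    # Tokens that are not numbers on both sides must match exactly.
    if (!all(ta[!numeric_pair] == tb[!numeric_pair])) return(FALSE)
    # Numeric tokens are compared with a relative tolerance.
    isTRUE(all.equal(na[numeric_pair], nb[numeric_pair], tolerance = tolerance))
  }
  all(vapply(seq_along(actual),
             function(i) same_line(actual[i], expected[i]), logical(1)))
}

expected <- c("> quantile(x)",
              "  0%  25%  50%  75% 100% ",
              "0.00 1.00 2.50 5.75 13.00 ")
actual <- c("> quantile(x)",
            "  0%  25%  50%  75% 100% ",
            "0.00 1.00 2.50 5.7500001 13.00 ")
loose <- tolerant_diff(actual, expected, tolerance = 1e-6)
strict <- tolerant_diff(actual, expected, tolerance = 1e-12)
```

With the looser tolerance the small difference in the fourth value is accepted; with the stricter one the comparison fails, which is the behavior a plain text diff would always give.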
RinR Package
Evaluates R expressions in a different R interpreter
Includes evaluator functions that evaluate expressions in a particular interpreter
REvaluator - R interpreter
TERREvaluator - TERR interpreter
LocalEvaluator - current interpreter
makeREvaluator - creates evaluators e.g. R2.15.3Evaluator
multiREvaluator - evaluate an expression in multiple interpreters
RinR Examples
# From within TERR:
> library(RinR)
> REvaluate(version)
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          1.0
year           2014
month          04
day            10
svn rev        65387
language       R
version.string R version 3.1.0 (2014-04-10)
nickname       Spring Dance
> .libPaths(c(.libPaths(), REvaluate(.libPaths())))
> .libPaths()
[1] "/home/TERR/TERR-intel.140620/library"
[2] "/home/R/R-3.1.0/lib/R/library"
RinR Examples (page 2)
RCompare is often used during development
> compareQuant <- RCompare(quantile(c(0,1,1,2,3,5,8,13)))
> class(compareQuant)
[1] "sideBySide"
> compareQuant
     R version 3.1.0 (2014-04-10)  TERR version 3.0.0 (2014-06-20)
[1,]   0%  25%  50%  75%  100%       0%  25%  50%  75%  100%
[2,] 0.00 1.00 2.50 5.75 13.00     0.00 1.00 2.50 5.75 13.00
[3,]
[4,]
$all.equal
$all.equal$R version 3.1.0 (2014-04-10) vs. TERR version 3.0.0 (2014-06-20)
[1] TRUE
Future TERR Testing
Add new tests as new features are added
Incorporating tests from core R
Testing Packages
One of our biggest challenges
First question from customers: Does package xyzzy work with TERR?
What does it mean for a package to work?
How do you test over 5000 packages?
Package versions and dependencies add to the challenge
Package Testing with R
R CMD check does some testing
runs files in tests
compare to Rout.save files, if they exist
runs examples from help files
compare to Rout.save files, if they exist in tests/Examples
Some packages have formal tests that can be run, using these packages:
RUnit : R Unit test framework
scriptests : Transcript-based unit tests that are easy to create and maintain
svUnit : SciViews GUI API - Unit testing
testit : A simple package for testing R packages
testthat : Testthat code. Tools to make testing fun :)
CRAN Test Statistics
CRAN packages from early June, 2014 - 5566 packages
961 (17.3%) have at least one tests directory
230 (4.1%) have target files for comparing test results (*.Rout.save)
394 of the 961 (7.1% of all packages) have actual software tests:
262 (4.7%) use testthat
113 (2.0%) use RUnit
10 (0.2%) use svUnit
5 (0.09%) use testit
4 (0.07%) use scriptests
Package Tests
Without creating our own specific tests, from a package we can run:
tests (if they exist)
example code from the help files
code extracted from any vignettes
Problems with this approach
package tests do not always include output to check results
help file code examples seldom have output to compare
help file code examples show usage, they do not necessarily test the code
vignettes also illustrate usage, they are not meant to test the code
Testing Packages with TERR
Run tests, help file examples and vignette examples in R and TERR
Compare results of each expression
Instrumented in a package (teaextract) that is run from R
use the parallel package to divide up the testing
runs now down to 23 hours from over 6 days
Currently: 1795 CRAN packages run without any errors in TERR
830 give exactly the same results as R
Many packages have a small fraction that do not run
missing a minor function in TERR
Failures with examples that use random numbers, timestamps
Improvement of this system is a priority in the QC group
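The per-expression comparison can be sketched as follows. This is not the teaextract package: compare_engines is a hypothetical function, and the two evaluators are stand-ins that both run locally, where a real setup would use engine-specific evaluators such as those in RinR.

```r
# Hypothetical classification of one expression evaluated by two engines.
compare_engines <- function(expr, eval_a, eval_b) {
  run <- function(evaluator) {
    tryCatch(evaluator(expr), error = function(cond) cond)
  }
  a <- run(eval_a)
  b <- run(eval_b)
  if (inherits(a, "error") || inherits(b, "error")) return("error")
  if (identical(a, b)) return("identical")
  if (isTRUE(all.equal(a, b))) return("equal within tolerance")
  "different"
}

# Stand-in evaluators (assumptions): both evaluate in the current session.
local_eval <- function(expr) eval(expr, envir = new.env(parent = globalenv()))
# Mimics a second engine whose numeric results differ in the last digits.
noisy_eval <- function(expr) {
  value <- local_eval(expr)
  if (is.numeric(value)) value * (1 + 1e-12) else value
}

exprs <- expression(toupper("terr"), sqrt(2), undefined_function(1))
status <- vapply(exprs, compare_engines, character(1),
                 eval_a = local_eval, eval_b = noisy_eval)
print(status)
```

Separating "identical" from "equal within tolerance" mirrors the distinction on the slide between packages that run without errors and packages that give exactly the same results as R. Expressions that depend on random numbers or timestamps would land in "different" unless the seed and clock are controlled.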
Summary and the Future
Testing R Engines
well covered by R and TIBCO
Testing packages - a big challenge
limited tests in packages, there should be more
use of testthat is a good sign
testCoverage package from Mango Solutions would help
does not solve the problem
Questions?