Upload
tom-mens
View
124
Download
4
Embed Size (px)
DESCRIPTION
Citation preview
TOWARDS ANEMPIRICAL ANALYSIS
OF THEMAINTAINABILITY OF
CRAN PACKAGESMaëlick Claes
Tom Mens & Philippe Grosjean
&
17th
December 2013, BENEVOL
Software Engineering Lab Numerical Ecology of Aquatic
Systems Lab
0
INTRODUCTIONR & CRAN
Statistical environment based on the S languagePackage systemPackages contain code (R, C, Fortran,...), documentation, examples, tests,datasets,...Dependency relations between packages
http://www.r-project.org
CRAN"Official" repository containing more than 5000 packagesStrict policy for package acceptancePackage quality regularly checked & archive process.Complaints in the community Hornik 2012, Are there too many R packages?
http://cran.r-project.org
"DESCRIPTION" FILE
PACKAGE DEPENDENCIES
HAS GROWTH BECOME LINEAR?
CRAN CHECKS
R CMD check command tool for checking package correctness and quality
Package status: OK, NOTE, WARNING or ERROR
Every package release must pass the test on two OS to be accepted on
CRAN
R CMD check rerun daily on the whole set of packages
Packages with inactive maintainer and ERROR status are archived when
the next non-minor version of R is released
A combination of R version, OS, architecture and C compiler forms a
flavor
EXAMPLESPackage with ERROR status
Package listing
NUMBER OF PACKAGES FOR EACHFLAVOR
HOW DO FLAVORSIMPACT ERRORS?
EVOLUTION OF ERRORS FOREACH FLAVOR
NUMBER OF ERRORS EXTRACTED
WHAT IS THECAUSE OF ERRORS
IN CRAN ANDHOW ARE THEY
FIXED?
SOURCES OF STATUS CHANGESPackage Update (PU)
ERROR status change coincides with the release of a new package
version
Dependency Update (DU)ERROR status change coincides with the release of a new package
version of a dependency
External Factors (EF)ERROR status change without a new package version or a new
version of any dependency
SOURCES OF NEW ERRORS ANDERROR FIXES
ARE ERRORS CAUSED AND FIXEDBY DEPENDENCY UPDATE REALLY
CAUSED AND FIXED BY IT?
WHAT IS THE SOURCE OF ERRORSFIXED BY PACKAGE UPDATE?
HOW LONG DOESIT TAKE TO FIX AN
ERROR?
NUMBER OF DAYS NEEDED TO FIXERRORS
NUMBER OF DAYS NEEDED TO FIXERRORS WITH PACKAGE UPDATE
CONCLUSION
CONCLUSIONSome flavors are more error prone than others
High amount of errors caused by external factors
Most errors fixed quickly without developer intervention
Maintenance effort needs to be focused on fixing errors caused by others
Need for a more specific tool for problems related to dependency changes
FUTURE WORKRefine errors by looking at package content
Static analysis of the R code
Function dependency graph
Rerun some part of R CMD checkImpact of the maintainers on the time required to fix packages and
amount of errors for each flavor
Do packages containing tests are more error prone?
How does the dependency network evolve over time?
THANKS FOR YOUR ATTENTION
QUESTIONS?