Upload
others
View
15
Download
0
Embed Size (px)
Citation preview
Package ‘compositions’June 14, 2018
Version 1.40-2
Date 2018-05-16
Title Compositional Data Analysis
Author K. Gerald van den Boogaart <[email protected]>,Raimon Tolosana-Delgado, Matevz Bren
Maintainer K. Gerald van den Boogaart <[email protected]>
Depends R (>= 2.2.0), tensorA, robustbase, energy, bayesm
Suggests rgl,combinat
Description Provides functions for the consistent analysis of compositionaldata (e.g. portions of substances) and positive numbers (e.g. concentrations)in the way proposed by J. Aitchison and V. Pawlowsky-Glahn.
License GPL (>= 2)
URL http://www.stat.boogaart.de/compositions
NeedsCompilation yes
Repository CRAN
Date/Publication 2018-06-14 12:58:27 UTC
R topics documented:compositions-package . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12acomparith . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14acompmargin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16acompscalarproduct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17Activity10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18Activity31 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19alr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21AnimalVegetation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23aplus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23aplusarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25apt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1
2 R topics documented:
ArcticLake . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28arrows3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29as.data.frame . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30axis3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31balance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32barplot.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34Bayesite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36binary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37biplot3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39Blood23 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40Boxite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41boxplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42ccomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44ccompgof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45cdt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48ClamEast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50ClamWest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51clo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52clr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53clr2ilr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55ClusterFinder1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56CoDaDendrogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58coloredBiplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61colorsForOutliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63CompLinModCoReg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64compOKriging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66ConfRadius . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67cor.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68Coxite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70cpt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71DiagnosticProb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72dist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73ellipses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74endmemberCoordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76Firework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78fitdirichlet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79fitSameMeanDifferentVarianceModel . . . . . . . . . . . . . . . . . . . . . . . . . . . 80gausstest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81geometricmean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82getdetectionlimit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83Glacial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84gof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85groupparts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88Hongite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89HotellingsTsq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90HouseholdExp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91Hydrochem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91idt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
R topics documented: 3
iit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95ilr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96ilrBase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98ilt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99ipt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100is.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102IsMahalanobisOutlier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103isoPortionLines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104jura . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106kingTetrahedron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107Kongite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110logratioVariogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111lrvgram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113MahalanobisDist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114matmult . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116mean.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117meanrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118Metabolites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119missing.compositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120missingProjector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123missingsummary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125mix.Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126mvar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130norm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131normalize . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133NormalTests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134oneOrDataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135outlierclassifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136outlierplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138outliersInCompositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142pairwiseplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145parametricMat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147perturbe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148plot.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150plot.aplus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153plot3D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155plot3Dacomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156plot3Daplus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158plot3Drmult . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159plot3Drplus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160plotlogratioVariogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162plotmissingsummary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163PogoJump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164powerofpsdmatrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165princomp.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166princomp.aplus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
4 R topics documented:
princomp.rcomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172princomp.rmult . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174princomp.rplus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176print.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178pwlrPlot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180qqnorm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183rAitchison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184ratioLoadings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187rcomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189rcomparithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192rcompmargin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193rDirichlet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194Read standard data files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195replot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197rlnorm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199rMahalanobis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200rmult . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202rmultarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203rmultmatmult . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204rnorm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206robustnessInCompositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208rplus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209rplusarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211rpois . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212runif . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213scalar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215Sediments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217segments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218SerumProtein . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219ShiftOperators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220simplemissingplot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221SimulatedAmounts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222simulatemissings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229Skulls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231SkyeAFM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231split . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232straight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233summary.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235summary.aplus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236summary.rcomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237sumprojector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238Supervisor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240ternaryAxis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241totals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243transformations from ’mixtures’ to ’compositions’ classes . . . . . . . . . . . . . . . . . 244tryDebugger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
compositions-package 5
ult . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246var.acomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250variograms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251varmlm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253vcovAcomp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254vgmFit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255WhiteCells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256Yatquat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257zeroreplace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
Index 260
compositions-package Compositional Data Analysis
Description
"compositions" is a package for the analysis of compositional and multivariate positive data (gen-erally called "amounts"), based on several alternative approaches.
Details
The DESCRIPTION file:
Package: compositionsVersion: 1.40-2Date: 2018-05-16Title: Compositional Data AnalysisAuthor: K. Gerald van den Boogaart <[email protected]>, Raimon Tolosana-Delgado, Matevz BrenMaintainer: K. Gerald van den Boogaart <[email protected]>Depends: R (>= 2.2.0), tensorA, robustbase, energy, bayesmSuggests: rgl,combinatDescription: Provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations) in the way proposed by J. Aitchison and V. Pawlowsky-Glahn.License: GPL (>= 2)URL: http://www.stat.boogaart.de/compositions
Index of help topics:
structureacomp Aitchison compositionsacompmargin Marginal compositions in Aitchison CompositionsActivity10 Activity patterns of a statistician for 20 daysActivity31 Activity patterns of a statistician for 20 daysalr Additive log ratio transform
6 compositions-package
AnimalVegetation Animal and vegetation measurement+.aplus vectorial arithmetic for data sets with aplus
classaplus Amounts analysed in log-scaleapt Additive planar transformArcticLake Artic lake sediment samples of different water
deptharrows3D arrows in 3D, based on package rglas.data.frame.acomp Convert "compositions" classes to data framesaxis3D Drawing a 3D coordiante system to a plot, based
on package rglbalance Compute balances for a compositional dataset.barplot.acomp Bar charts of amountsBayesite Permeabilities of bayesitebinary Treating binary and g-adic numbersbiplot3D Three-dimensional biplots, based on package rglBlood23 Blood samplesBoxite Compositions and depth of 25 specimens of
boxiteboxplot.acomp Displaying compositions and amounts with
box-plotscdt Centered default transformClamEast Color-size compositions of 20 clam colonies
from East BayClamWest Color-size compositions of 20 clam colonies
from West Bayclo Closure of a compositionclr Centered log ratio transformclr2ilr Convert between clr and ilr, and between cpt
and ipt.ClusterFinder1 Heuristics to find subpopulations of outliersCoDaDendrogram Dendrogram representation of acomp or rcomp
objectscoloredBiplot A biplot providing somewhat easier access to
details of the plot.colorsForOutliers1 Create a color/char palette or for groups of
outliersCompLinModCoReg Compositional Linear Model of CoregionalisationcompOKriging Compositional Ordinary Krigingcompositions-package library(compositions)ConfRadius Helper to compute confidence ellipsoidscor.acomp Correlations of amounts and compositionsCoxite Compositions, depths and porosities of 25
specimens of coxitecpt Centered planar transformDiagnosticProb Diagnostic probabilitiesdist Distances in variouse approachesellipses Draw ellipses
compositions-package 7
endmemberCoordinates Recast amounts as mixtures of end-membersFirework Firework mixturesgeometricmean The geometric meangetDetectionlimit Gets the detection limit stored in the data setGlacial Compositions and total pebble counts of 92
glacial tillsgroupparts Group amounts of partsHongite Compositions of 25 specimens of hongiteHouseholdExp Household ExpendituresHydrochem Hydrochemical composition data set of Llobregat
river basin water (NE Spain)idt Isometric default transformiit Isometric identity transformilr Isometric log ratio transformilrBase The canonical basis in the clr plane used for
ilr and ipt transforms.ilt Isometric log transformipt Isometric planar transformis.acomp Check for compositional data typeIsMahalanobisOutlier Checking for outliersisoPortionLines Isoportion- and Isoproportion-linesjuraset The jura datasetkingTetrahedron Ploting composition into rotable tetrahedronKongite Compositions of 25 specimens of kongitelines.rmult Draws connected lines from point to point.logratioVariogram Empirical variograms for compositionsMahalanobisDist Compute Mahalanobis distances based von robust
Estimationsmean.acomp Mean amounts and mean compositionsmeanRow The arithmetic mean of rows or columnsMetabolites Steroid metabolite patterns in adults and
childrenmissingProjector Returns a projector the the observed space in
case of missings.missingsInCompositions
The policy of treatment of missing values inthe "compositions" package
missingSummary Classify and summarize missing values in adataset
mix.2aplus Transformations from 'mixtures' to'compositions' classes
mix.Read Reads a data file in a mixR formatmvar Metric summary statistics of real, amount or
compositional datanames.acomp The names of the partsnormalize Normalize vectors to norm 1norm.default Vector space normoneOrDataset Treating single compositions as one-row
8 compositions-package
datasetsOutlierClassifier1 Detect and classify compositional outliers.outlierplot Plot various graphics to analyse outliers.outliersInCompositions
Analysing outliers in compositions.pairwisePlot Creates a paneled plot like pairs for two
different datasets.parametricPosdefMat Unique parametrisations for matrices.perturbe Perturbation of compositionsplot3D plot in 3D based on rglplot3D.acomp 3D-plot of compositional dataplot3D.aplus 3D-plot of positive dataplot3D.rmult plot in 3D based on rglplot3D.rplus plot in 3D based on rglplot.acomp Ternary diagramsplot.aplus Displaying amounts in scatterplotsplot.logratioVariogram
Empirical variograms for compositionsplot.missingSummary Plot a Missing SummarypMaxMahalanobis Compute distributions of empirical Mahalanobis
distances based on simulationsPogoJump Honk Kong Pogo-Jumps Championshippower.acomp Power transform in the simplexpowerofpsdmatrix power transform of a matrixprincomp.acomp Principal component analysis for Aitchison
compositionsprincomp.aplus Principal component analysis for amounts in log
geometryprincomp.rcomp Principal component analysis for real
compositionsprincomp.rmult Principal component analysis for real dataprincomp.rplus Principal component analysis for real amountsprint.acomp Printing compositional data.qHotellingsTsq Hotellings T square distributionqqnorm.acomp Normal quantile plots for compositions and
amountsR2 R squarerAitchison Aitchison Distribution+.rcomp Arithmetic operations for compositions in a
real geometryrcomp Compositions as elements of the simplex
embedded in the D-dimensional real spacercompmargin Marginal compositions in real geometryrDirichlet Dirichlet distributionread.geoeas Reads a data file in a geoeas formatrelativeLoadings Loadings of relations of two amountsreplot Modify parameters of compositional plots.rlnorm.rplus The multivariate lognormal distribution
compositions-package 9
+.rmult vectorial arithmetic for datasets in aclassical vector scale
rmult Simple treatment of real vectorsrnorm.acomp Normal distributions on special spacesrobustnessInCompositions
Handling robustness issues and outliers incompositions.
+.rplus vectorial arithmetic for data sets with rplusclass
rplus Amounts i.e. positive numbers analysed asobjects of the real vector space
runif.acomp The uniform distribution on the simplexscalar Parallel scalar productsscale Normalizing datasets by centering and scalingSediments Proportions of sand, silt and clay in sediments
specimenssegments.rmult Draws straight lines from point to point.SerumProtein Serum Protein compositions of blood samplesShiftOperators Shifts of machine operatorssimpleMissingSubplot Ternary diagramsSimulatedAmounts Simulated amount datasetssimulateMissings Artifical simulation of various kinds of
missingsSkulls Measurement of skullsSkyeAFM AFM compositions of 23 aphyric Skye lavassplit.acomp Splitting datasets in groups given by factorsstraight Draws straight lines.summary.acomp Summarizing a compositional dataset in terms of
ratiossummary.aplus Summaries of amountssummary.rcomp Summary of compositions in real geometrysumMissingProjector Compute the global projector to the observed
subspace.Supervisor Proportions of supervisor's statements assigned
to different categoriesternaryAxis Axis for ternary diagramstotals Total sum of amountstryDebugger Empirical variograms for compositionsult Uncentered log transformvar.acomp Variances and covariances of amounts and
compositionsvariation Variation matrices of amounts and compositionsvar.lm Residual variance of a modelvcovAcomp Variance covariance matrix of parameters in
compositional regressionvgmFit Compositional variogram model fittingvgram2lrvgram vgram2lrvgram
10 compositions-package
vgram.sph Variogram functionsWhiteCells White-cell composition of 30 blood samples by
two different methodsYatquat Yatquat fruit evaluationzeroreplace Zero-replacement routine
Further information is available in the following vignettes:
UsingCompositions Using Animal (source, pdf)
To get detailed "getting started" introduction use help.start() or help.start(browser="myfavouritebrowser")Go to "Packages" then "compositions" and then "overview" and then launch the file "UsingCompo-sitions.pdf" from there. Please also check the web-site: http://www.stat.boogaart.de/compositionsfor improved material and our new book expected to appear spring 2009.The package is devoted to the analysis of multiple amounts. Amounts have typically non-negativevalues, and often sum up to 100% or one. These constraints lead to spurious effects on the covari-ance structure, as pointed out by Chayes (1960). The problem is treated rigorously in the monog-raphy by Aitchison (1986), who characterizes compositions as vectors having a relative scale, andidentifies its sample space with the D-part simplex. However still (i.e. 2005) most statistical pack-ages do not provided any support for this scale.The grounding idea of the package exploits the class concept: the analyst gives the data a com-positional or amount class, and then all further analysis are (should be) automatically done in aconsistent way, e.g. x <- acomp(X); plot(x) should plot the data as a composition (in a ternarydiagram) directly without any further interaction of the user.The package provides four different approaches to analyse amounts. These approaches are asso-ciated to four R-classes, representing four different geometries of the sampling space of amounts.These geometries depend on two questions: whether the total sum of the amounts is a relevant in-formation, and which is the meaningful measure of difference of the data.
rplus : (Real Plus) The total amount matters, and amounts should be compared on an absolutebasis. i.e. the difference between 1g and 2g is the same as the difference between 1kg and 1001g,one gram.aplus : (Aitchison Plus) The total amount matters, but amounts should be compared relatively, i.e.the difference between 1mg and 2mg is the same as that of 1g and 2g: the double.acomp : (Aitchison composition) the total amount is constant (or an artifact of the sampling/measurementprocedure), and the meaningful difference is a relative one. This class follows the original proposalsof Aitchison.rcomp : (Real composition) the sum is a constant, and the difference in amount from 0% to 1% andfrom 10% to 11% is regarded as equal. This class represents the raw/naive treatment of composi-tions as elements of the real simplex based on an absolute geometry. This treatment is implicitlyused in most amalgamation problems. However the whole approach suffers from the drawbacksand problems discussed in Chayes (1960) and Aitchison (1986).The aim of the package is to provide all the functionality to do a consistent analysis in all of theseapproaches and to make the results obtained with different geometries as easy to compare as possi-ble.
compositions-package 11
Note
The package compositions has grown a lot in the last year: missings, robust estimations, outlier de-tection and classification, codadendrogram. This makes everything much more complex especiallyfrom the side of programm testing. Thus we would like to urge our users to report all errors andproblems of the lastest version (please check first) to [email protected].
Author(s)
K. Gerald van den Boogaart <[email protected]>, Raimon Tolosana-Delgado, Matevz Bren
Maintainer: K. Gerald van den Boogaart <[email protected]>
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Billheimer, D., P. Guttorp, W.F. and Fagan (2001) Statistical interpretation of species composition,Journal of the American Statistical Association, 96 (456), 1205-1214
Chayes, F. (1960). On correlation between variables of constant sum. Journal of GeophysicalResearch 65~(12), 4185–4193.
Pawlowsky-Glahn, V. and J.J. Egozcue (2001) Geometric approach to statistical analysis on thesimplex. SERRA 15(5), 384-398
Pawlowsky-Glahn, V. (2003) Statistical modelling on coordinates. In: Thi\’o -Henestrosa, S. andMart\’in-Fern\’a ndez, J.A. (Eds.) Proceedings of the 1st International Workshop on CompositionalData Analysis, Universitat de Girona, ISBN 84-8458-111-X, http://ima.udg.es/Activitats/CoDaWork03
Mateu-Figueras, G. and Barcel\’o-Vidal, C. (Eds.) Proceedings of the 2nd International Workshopon Compositional Data Analysis, Universitat de Girona, ISBN 84-8458-222-1, http://ima.udg.es/Activitats/CoDaWork05
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
compositions-package, missingsInCompositions, robustnessInCompositions, outliersInCompositions,
12 acomp
Examples
library(compositions) # load librarydata(SimulatedAmounts) # load data sa.lognormalsx <- acomp(sa.lognormals) # Declare the dataset to be compositional
# and use relative geometryplot(x) # plot.acomp : ternary diagramellipses(mean(x),var(x),r=2,col="red") # Simplex 2sigma predictive regionpr <- princomp(x)straight(mean(x),pr$Loadings)
x <- rcomp(sa.lognormals) # Declare the dataset to be compositional# and use absolute geometry
plot(x) # plot.acomp : ternary diagramellipses(mean(x),var(x),r=2,col="red") # Real 2sigma predictive regionpr <- princomp(x)straight(mean(x),pr$Loadings)
acomp Aitchison compositions
Description
A class providing the means to analyse compositions in the philosophical framework of the Aitchi-son Simplex.
Usage
acomp(X,parts=1:NCOL(oneOrDataset(X)),total=1,warn.na=FALSE,detectionlimit=NULL,BDL=NULL,MAR=NULL,MNAR=NULL,SZ=NULL)
Arguments
X composition or dataset of compositions
parts vector containing the indices xor names of the columns to be used
total the total amount to be used, typically 1 or 100
warn.na should the user be warned in case of NA,NaN or 0 coding different types ofmissing values?
detectionlimit a number, vector or matrix of positive numbers giving the detection limit of allvalues, all columns or each value, respectively
BDL the code for ’Below Detection Limit’ in X
SZ the code for ’Structural Zero’ in X
MAR the code for ’Missing At Random’ in X
MNAR the code for ’Missing Not At Random’ in X
acomp 13
Details
Many multivariate datasets essentially describe amounts of D different parts in a whole. This hassome important implications justifying to regard them as a scale for its own, called a composition.This scale was in-depth analysed by Aitchison (1986) and the functions around the class "acomp"follow his approach.Compositions have some important properties: Amounts are always positive. The amount of everypart is limited to the whole. The absolute amount of the whole is noninformative since it is typicallydue to artifacts on the measurement procedure. Thus only relative changes are relevant. If the rel-ative amount of one part increases, the amounts of other parts must decrease, introducing spuriousanticorrelation (Chayes 1960), when analysed directly. Often parts (e.g H2O, Si) are missing in thedataset leaving the total amount unreported and longing for analysis procedures avoiding spuriouseffects when applied to such subcompositions. Furthermore, the result of an analysis should beindepent of the units (ppm, g/l, vol.%, mass.%, molar fraction) of the dataset.From these properties Aitchison showed that the analysis should be based on ratios or log-ratiosonly. He introduced several transformations (e.g. clr,alr), operations (e.g. perturbe, power.acomp),and a distance (dist) which are compatible with these properties. Later it was found that the setof compostions equipped with perturbation as addition and power-transform as scalar multiplica-tion and the dist as distance form a D-1 dimensional euclidean vector space (Billheimer, Fagan andGuttorp, 2001), which can be mapped isometrically to a usual real vector space by ilr (Pawlowsky-Glahn and Egozcue, 2001).The general approach in analysing acomp objects is thus to perform classical multivariate analysison clr/alr/ilr-transformed coordinates and to backtransform or display the results in such a way thatthey can be interpreted in terms of the original compositional parts.A side effect of the procedure is to force the compositions to sum up to a total , which is done bythe closure operation clo .
Value
a vector of class "acomp" representing one closed composition or a matrix of class "acomp" repre-senting multiple closed compositions each in one row.
Missing Policy
The policy of treatment of zeroes, missing values and values below detecion limit is explained indepth in compositions.missing.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003
14 acomparith
Billheimer, D., P. Guttorp, W.F. and Fagan (2001) Statistical interpretation of species composition,Journal of the American Statistical Association, 96 (456), 1205-1214
Chayes, F. (1960). On correlation between variables of constant sum. Journal of GeophysicalResearch 65~(12), 4185–4193.
Pawlowsky-Glahn, V. and J.J. Egozcue (2001) Geometric approach to statistical analysis on thesimplex. SERRA 15(5), 384-398
Pawlowsky-Glahn, V. (2003) Statistical modelling on coordinates. In: Thi\’o-Henestrosa, S. andMart\’in-Fern\’andez, J.A. (Eds.) Proceedings of the 1st International Workshop on CompositionalData Analysis, Universitat de Girona, ISBN 84-8458-111-X, http://ima.udg.es/Activitats/CoDaWork03
Mateu-Figueras, G. and Barcel\’o-Vidal, C. (Eds.) Proceedings of the 2nd International Workshopon Compositional Data Analysis, Universitat de Girona, ISBN 84-8458-222-1, http://ima.udg.es/Activitats/CoDaWork05
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
clr,rcomp, aplus, princomp.acomp, plot.acomp, boxplot.acomp, barplot.acomp, mean.acomp,var.acomp, variation.acomp, cov.acomp, msd
Examples
data(SimulatedAmounts)plot(acomp(sa.lognormals))
acomparith Power transform in the simplex
Description
The Aitchison Simplex with its two operations perturbation as + and power transform as * is avector space. This vector space is represented by these operations.
Usage
power.acomp(x,s)## Methods for class "acomp"## x*y## x/y
acomparith 15
Arguments
x an acomp composition or dataset of compositions (or a number or a numericvector)
y a numeric vector of size 1 or nrow(x)
s a numeric vector of size 1 or nrow(x)
Details
The power transform is the basic multiplication operation of the Aitchison simplex seen as a vectorspace. It is defined as:
(x ∗ y)i := clo((xyii )i)i
The division operation is just the multiplication with 1/y.
Value
An "acomp" vector or matrix.
Note
For * the arguments x and y can be exchanged. Note that this definition generalizes the powerby a scalar, since y or s may be given as a scalar, or as a vector with as many components as thecomposition in acomp x. The result is then a matrix where each row corresponds to the compositionpowered by one of the scalars in the vector.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Pawlowsky-Glahn, V. and J.J. Egozcue (2001) Geometric approach to statistical analysis on thesimplex. SERRA 15(5), 384-398
http://ima.udg.es/Activitats/CoDaWork03
http://ima.udg.es/Activitats/CoDaWork05
See Also
ilr,clr, alr,
16 acompmargin
Examples
acomp(1:5)* -1 + acomp(1:5)data(SimulatedAmounts)cdata <- acomp(sa.lognormals)plot( tmp <- (cdata-mean(cdata))/msd(cdata) )class(tmp)mean(tmp)msd(tmp)var(tmp)
acompmargin Marginal compositions in Aitchison Compositions
Description
Compute marginal compositions of selected parts, by computing the rest as the geometric mean ofthe non-selected parts.
Usage
acompmargin(X,d=c(1,2),name="*",pos=length(d)+1,what="data")
Arguments
X composition or dataset of compositionsd vector containing the indices xor names of the columns selectedname The new name of the amalgamation columnpos The position where the new amalgamation column should be stored. This de-
faults to the last column.what The role of X either "data" for data (or means) to be transformed or "var" for
(acomp-clr)-variances to be transformed.
Details
The amalgamation column is simply computed by taking the geometric mean of the non-selectedcomponents. This is consistent with the acomp approach and gives clear ternary diagrams. However,this geometric mean is difficult to interpret.
Value
A closed compositions with class "acomp" containing the variables given by d and the the amalga-mation column.
Missing Policy
MNAR has the highest priority, MAR afterwards, and WZERO (BDL,SZ) values are considered as0 and finally reported as BDL.
acompscalarproduct 17
Author(s)
Raimon Tolosana-Delgado, K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Vera Pawlowsky-Glahn (2003) personal communication. Universitat de Girona. [email protected]
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
rcompmargin, acomp
Examples
data(SimulatedAmounts)plot.acomp(sa.lognormals5,margin="acomp")plot.acomp(acompmargin(sa.lognormals5,c("Pb","Zn")))plot.acomp(acompmargin(sa.lognormals5,c(1,2)))
acompscalarproduct inner product for datasets with a vector space structure
Description
acomp and aplus objects are considered as (sets of) vectors. The %*% is considered as the innermultiplication. An inner multiplication with another vector is the scalar product. An inner multipli-cation with a matrix is a matrix multiplication, where the vectors are either considered as row or ascolumn vector.
Usage
## S3 method for class 'acomp'x %*% y## S3 method for class 'aplus'x %*% y
Arguments
x a acomp or aplus object or a matrix interpreted in clr, ilr or ilt coordinates
y a acomp or aplus object or a matrix interpreted in clr, ilr or ilt coordinates
18 Activity10
Details
The operators try to mimic the behavior of %*% on c()-vectors as inner product, applied in parallelto all row-vectors of the dataset. Thus the product of a vector with a vector of the same type resultsin the scalar product of both. For the multiplication with a matrix each vector is considered as arow or column, whatever is more appropriate. The matrix itself is considered as representing alinear mapping (endomorphism) of the vector space to a space of the same type. The mapping isrepresented in clr, ilr or ilt coordinates. Which of the aforementioned coordinate systems is used isjudged from the type of x and from the dimensions of the A .
Value
Either a numeric vector containing the scalar products, or an object of type acomp or aplus contain-ing the vectors transformed with the given matrix.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
%*%.rmult
Examples
x <- acomp(matrix( sqrt(1:12), ncol= 3 ))x%*%xA <- matrix( 1:9,nrow=3)x %*% A %*% xx %*% AA %*% xA <- matrix( 1:4,nrow=2)x %*% A %*% xx %*% AA %*% xx <- aplus(matrix( sqrt(1:12), ncol= 3 ))x%*%xA <- matrix( 1:9,nrow=3)x %*% A %*% xx %*% AA %*% x
Activity10 Activity patterns of a statistician for 20 days
Description
Proportion of a day in activity teaching, consulting, administrating, research, other wakeful activi-ties and sleep for 20 days are given.
Activity31 19
Usage
data(Activity10)
Details
The activity of an academic statistician were divided into following six categories
teac teachingcons consultationadmi administrationrese researchwake other wakeful activitiesslee sleep
Data show the proportions of the 24 hours devoted to each activity, recorded on each of 20 days,selected randomly from working days in alternate weeks, so as to avoid any possible carry-overeffects, such as short-sleep day being compensated by make-up sleep on a succeeding day.
The six activity may be divided into two categories ’work’ comprising activities 1,2,3,4: and’leisure’ comprising activities 5 and 6.
All rows sum to one.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name STATDAY.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 10, pp15.
Activity31 Activity patterns of a statistician for 20 days
Description
Proportion of a day in activity teaching, consulting, administrating, research, other wakeful activi-ties and sleep for 20 days are given.
Usage
data(Activity31)
20 Activity31
Details
The activity of an academic statistician were divided into following six categories
alr 21
teac teachingcons consultationadmi administrationrese researchwake other wakeful activitiesslee sleep
Data shows the proportions of the 24 hours devoted to each activity, recorded on each of 20 days,selected randomly from working days in alternate weeks, so as to avoid any possible carry-overeffects, such as short-sleep day being compensated by make-up sleep on a succeeding day.
The six activity may be divided into two categories ’work’ comprising activities 1,2,3,4: and’leisure’ comprising activities 5 and 6.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name ACTIVITY.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 31.
alr Additive log ratio transform
Description
Compute the additive log ratio transform of a (dataset of) composition(s), and its inverse.
Usage
alr( x ,ivar=ncol(x), ... )alrInv( z, ...,orig=NULL)
Arguments
x a composition, not necessarily closedz the alr-transform of a composition, thus a (D-1)-dimensional real vector... generic arguments. not used.orig a compositional object which should be mimicked by the inverse transformation.
It is especially used to reconstruct the names of the parts.ivar The column to be used as denominator variable. Unfortunately not yet supported
in alrInv. The default works even if x is a vector.
22 alr
Details
The alr-transform maps a composition in the D-part Aitchison-simplex non-isometrically to a D-1dimensonal euclidian vector, treating the last part as common denominator of the others. The datacan then be analysed in this transformation by all classical multivariate analysis tools not relyingon a distance. The interpretation of the results is relatively simple, since the relation to the originalD-1 first parts is preserved. However distance is an extremely relevant concept in most types ofanalysis, where a clr or ilr transformation should be preferred.
The additive logratio transform is given by
alr(x)i := lnxixD
.
Value
alr gives the additive log ratio transform; accepts a compositional dataset alrInv gives a closedcomposition with the given alr-transform; accepts a dataset
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
clr,alr,apt, http://ima.udg.es/Activitats/CoDaWork03
Examples
(tmp <- alr(c(1,2,3)))alrInv(tmp)unclass(alrInv(tmp)) - clo(c(1,2,3)) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(alr(cdata),pch=".")
AnimalVegetation 23
AnimalVegetation Animal and vegetation measurement
Description
Areal compositions by abundance of vegetation and animals for 50 plots in each of regions A andB.
Usage
data(AnimalVegetation)
Details
In a regional ecology study, plots of land of equal area were inspected and the parts of each plotwhich were thick or thin in vegetation and dense or sparse in animals were identified. From thisfield work the areal proportions of each plot were calculated for the four mutually exclusive andexhaustive categories: thick-dense, thick-sparse, thin-dense, thin-sparse. These sets of proportionsare recorded for 50 plots from each of two different regions A and B.
All rows sum to 1, except for some rounding errors.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name ANIVEG.DAT, here in-cluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 25, pp22.
aplus Amounts analysed in log-scale
Description
A class to analyse positive amounts in a logistic framework.
Usage
aplus(X,parts=1:NCOL(oneOrDataset(X)),total=NA,warn.na=FALSE,detectionlimit=NULL,BDL=NULL,MAR=NULL,MNAR=NULL,SZ=NULL)
24 aplus
Arguments
X vector or dataset of positive numbers
parts vector containing the indices xor names of the columns to be used
total a numeric vectors giving the total amounts of each dataset.
warn.na should the user be warned in case of NA,NaN or 0 coding different types ofmissing values?
detectionlimit a number, vector or matrix of positive numbers giving the detection limit of allvalues, all columns or each value, respectively
BDL the code for ’Below Detection Limit’ in X
SZ the code for ’Structural Zero’ in X
MAR the code for ’Missing At Random’ in X
MNAR the code for ’Missing Not At Random’ in X
Details
Many multivariate datasets essentially describe amounts of D different parts in a whole. When thewhole is large in relation to the considered parts, such that they do not exclude each other, or whenthe total amount of each componenten is indeed determined by the phenomenon under investigationand not by sampling artifacts (such as dilution or sample preparation), then the parts can be treatedas amounts rather than as a composition (cf. acomp, rcomp).Like compositions, amounts have some important properties. Amounts are always positive. Anamount of exactly zero essentially means that we have a substance of another quality. Differentamounts - spanning different orders of magnitude - are often given in different units (ppm, ppb, g/l,vol.%, mass %, molar fraction). Often, these amounts are also taken as indicators of other non-measured components (e.g. K as indicator for potassium feldspar), which might be proportionalto the measured amount. However, in contrast to compositions, amounts themselves do matter.Amounts are typically heavily skewed and in many practical cases a log-transform makes their dis-tribution roughly symmetric, even normal.In full analogy to Aitchison’s compositions, vector space operations are introduced for amounts:the perturbation perturbe.aplus as a vector space addition (corresponding to change of units),the power transformation power.aplus as scalar multiplication describing the law of mass action,and a distance dist which is independent of the chosen units. The induced vector space is mappedisometrically to a classical RD by a simple log-transformation called ilt, resembling classical logtransform approaches.The general approach in analysing aplus objects is thus to perform classical multivariate analysison ilt-transformed coordinates (i.e., logs) and to backtransform or display the results in such a waythat they can be interpreted in terms of the original amounts.The class aplus is complemented by the rplus, allowing to analyse amounts directly as real num-bers, and by the classes acomp and rcomp to analyse the same data as compositions disregarding thetotal amounts, focusing on relative weights only.The classes rcomp, acomp, aplus, and rplus are designed as similar as possible in order to allowdirect comparison between results achieved by the different approaches. Especially the acompsimplex transforms clr, alr, ilr are mirrored in the aplus class by the single bijective isometrictransform ilt
aplusarithm 25
Value
a vector of class "aplus" representing a vector of amounts or a matrix of class "aplus" representingmultiple vectors of amounts, each vector in one row.
Missing Policy
The policy of treatment of zeroes, missing values and values below detecion limit is explained indepth in compositions.missing.
Author(s)
Raimon Tolosana-Delgado, K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
ilt,acomp, rplus, princomp.aplus, plot.aplus, boxplot.aplus, barplot.aplus, mean.aplus,var.aplus, variation.aplus, cov.aplus, msd
Examples
data(SimulatedAmounts)plot(aplus(sa.lognormals))
aplusarithm vectorial arithmetic for data sets with aplus class
Description
The positive vectors equipped with the perturbation (defined as the element-wise product) as Abeliansum, and powertransform (defined as the element-wise powering with a scalar) as scalar multipli-cation forms a real vector space. These vector space operations are defined here in a similar way to+.rmult.
Usage
perturbe.aplus(x,y)## S3 method for class 'aplus'x + y## S3 method for class 'aplus'x - y## S3 method for class 'aplus'x * y## S3 method for class 'aplus'
26 aplusarithm
x / y## Methods for aplus## x+y## x-y## -x## x*r## r*x## x/rpower.aplus(x,r)
Arguments
x an aplus vector or dataset of vectors
y an aplus vector or dataset of vectors
r a numeric vector of size 1 or nrow(x)
Details
The operators try to mimic the parallel operation of R for vectors of real numbers to vectors ofamounts, represented as matrices containing the vectors as rows and works like the operators for{rmult}
Value
an object of class "aplus" containing the result of the corresponding operation on the vectors.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
rmult, %*%.rmult
Examples
x <- aplus(matrix( sqrt(1:12), ncol= 3 ))xx+xx + aplus(1:3)x * 1:41:4 * xx / 1:4x / 10power.aplus(x,1:4)
apt 27
apt Additive planar transform
Description
Compute the additive planar transform of a (dataset of) compositions or its inverse.
Usage
apt( x ,...)aptInv( z ,..., orig=NULL)
Arguments
x a composition or a matrix of compositions, not necessarily closed
z the apt-transform of a composition or a matrix of alr-transforms of compositions
... generic arguments, not used.
orig a compositional object which should be mimicked by the inverse transformation.It is especially used to reconstruct the names of the parts.
Details
The apt-transform maps a composition in the D-part real-simplex linearly to a D-1 dimensional eu-clidian vector. Although the transformation does not reach the whole RD−1, resulting covariancematrices are typically of full rank.
The data can then be analysed in this transformation by all classical multivariate analysis tools notrelying on distances. See cpt and ipt for alternatives. The interpretation of the results is easy sincethe relation to the first D-1 original variables is preserved.
The additive planar transform is given by
apt(x)i := clo(x)i, i = 1, . . . , D − 1
Value
apt gives the centered planar transform, aptInv gives closed compositions with the given apt-transforms
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
28 ArcticLake
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
alr,cpt,ipt
Examples
(tmp <- apt(c(1,2,3)))aptInv(tmp)aptInv(tmp) - clo(c(1,2,3)) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(apt(cdata),pch=".")
ArcticLake Artic lake sediment samples of different water depth
Description
Sand, silt and clay compositions of 39 sediment samples of different water depth in an Arctic lake.
Usage
data(ArcticLake)
Details
Sand, silt and clay compositions of 39 sediment samples at different water depth (in meters) inan Arctic lake. The additional feature is a concomitant variable or covariate, water depth, whichmay account for some of the variation in the compositions. In statistical terminology we havea multivariate regression problem with sediment composition as regressand and water depth asregressor.
All row percentage sums to 100, except for rounding errors.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name ARCTIC.DAT, here in-cluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 5, pp5.
arrows3D 29
arrows3D arrows in 3D, based on package rgl
Description
adds 3-dimensional arrows to an rgl plot.
Usage
arrows3D(...)## Default S3 method:arrows3D(x0,x1,...,length=0.25,
angle=30,code=2,col="black",lty=NULL,lwd=2,orth=c(1,0.0001,0.0000001),labs=NULL,size=lwd)
Arguments
x0 a matrix or vector giving the starting points of the arrows
x1 a matrix or vector giving the end points of the arrows
... additional plotting parameters as described in rgl.material
length a number giving the length of the arrowhead
angle numeric giving the angle of the arrowhead
code 0=no arrowhead,1=arrowhead at x0,2=arrowhead at x1,3=double headed
col the color of the arrow
lty Not implemented, here for compatibility reasons with arrows
lwd line width in pixels
orth the flat side of the arrow is not unique by x0 and x1. This ambiguity is solved ina way that the arrow seams as wide as possible from the viewing direction orth.
labs labels to be plotted to the endpoints of the arrows
size size of the plotting symbol
Details
The function is called to plot arrows into an rgl plot. The size of the arrow head is given in a absoluteway. Therefore it is important to give the right scale for the length, to see the arrow head and that itdoes not fill the whole window.
Value
the 3D plotting coordinates of the tips of the arrows displayed, returned invisibly
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
30 as.data.frame
See Also
points3d, plot, plot3D,
Examples
x <- cbind(rnorm(10),rnorm(10),rnorm(10))plot3D(x)x0 <- x*0arrows3D(x0,x)
as.data.frame Convert "compositions" classes to data frames
Description
Convert a compositional object to a dataframe
Usage
## S3 method for class 'acomp'as.data.frame(x,...)## S3 method for class 'rcomp'as.data.frame(x,...)## S3 method for class 'aplus'as.data.frame(x,...)## S3 method for class 'rplus'as.data.frame(x,...)## S3 method for class 'rmult'as.data.frame(x,...)
Arguments
x an object to be converted to a dataframe
... additional arguments are not used
Value
a data frame containing the given data.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
axis3D 31
Examples
data(SimulatedAmounts)as.data.frame(acomp(sa.groups))# The central perpose of providing this command is that the following# works properly:data.frame(acomp(sa.groups),groups=sa.groups.area)
axis3D Drawing a 3D coordiante system to a plot, based on package rgl
Description
Adds a coordinate system to a 3D rgl graphic. In future releases, functionality to add tickmarks willbe (hopefully) provided. Now, it is just a system of arrows giving the directions of the three axes.
Usage
axis3D(axis.origin=c(0,0,0),axis.scale=1,axis.col="gray",vlabs=c("x","y","z"),vlabs.col=axis.col,bbox=FALSE,axis.lwd=2,axis.len=mean(axis.scale)/10,axis.angle=30,orth=c(1,0.0001,0.000001),axes=TRUE,...)
Arguments
axis.origin The location where to put the origin of the coordinate arrows typicall either 0,the minimum or the mean of the dataset
axis.scale either a number or a 3D vector giving the length of the arrows for the axis in thecoordiantes of the plot
axis.col Color to plot the coordinate system
vlabs The names of the axes, plotted at the end
vlabs.col color for the axes labels
bbox boolean, whether to plot a bounding box
axis.lwd line width of the axes
axis.angle angle of the arrow heads
axis.len length of the arrow heads
orth the orth argument of arrows3D
axes a boolean, wether to plot the axes
... these arguments are passed to arrows3D as rgl.material arguments
Details
The function is called to plot a coordiante system consisting of arrows into an rgl plot.
Value
Nothing
32 balance
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
points3d, plot, plot3D,arrows3D
Examples
x <- cbind(rnorm(10),rnorm(10),rnorm(10))plot3D(x)x0 <- x*0arrows3D(x0,x)
balance Compute balances for a compositional dataset.
Description
Compute balances in a compositional dataset.
Usage
balance(X,...)## S3 method for class 'acomp'balance(X,expr,...)## S3 method for class 'rcomp'balance(X,expr,...)## S3 method for class 'aplus'balance(X,expr,...)## S3 method for class 'rplus'balance(X,expr,...)balance01(X,...)## S3 method for class 'acomp'balance01(X,expr,...)## S3 method for class 'rcomp'balance01(X,expr,...)balanceBase(X,...)## S3 method for class 'acomp'balanceBase(X,expr,...)## S3 method for class 'rcomp'balanceBase(X,expr,...)## S3 method for class 'acomp'balanceBase(X,expr,...)## S3 method for class 'rcomp'balanceBase(X,expr,...)
balance 33
Arguments
X compositional dataset (or optionally just its column names for balanceBase)
expr a ~ formula using the column names of X as variables and seperating them by/ and organize by paranthesis (). : and * can be used instead of / when thecorresponding balance should not be created. - can be used as an synonym to /in the real geometries. 1 can be used in the unclosed geometries to level againsta constant.
... for future perposes
Details
For acomp-compositions balances are defined as orthogonal projections representing the log ratioof the geometric means of subsets of elements. Based on a recursive subdivision (provided by theexpr=) this projections provide a (complete or incomplete) basis of the clr-plane. The basis is givenby the balanceBase functions. The transform is given by the balance functions. The balance01functions are a backtransform of the balances to the amount of the first portion if this was the onlybalance in a 2 element composition, providing an "interpretation" for the values of the balances.
The package tries to give similar concepts for the other scales. For rcomp objects the concept ismainly unchanges but augmented by a virtual component 1, which always has portion 1.
For rcomp objects, we choose not a "orthogonal" transformation since such a concept anyway doesnot really exist in the given space, but merily use the difference of one subset to the other. Thebalance01 is than not really a transform of the balance but simply the portion of the first group ofparts in all contrasted parts.
For rplus objects we just used an analog to generalisation from the rcomp defintion as aplusis generalized from acomp. However at this time we have no idea wether this has any usefullinterpretation.
Value
balance a matrix (or vector) with the corresponding balances of the dataset.
balance01 a matrix (or vector) with the corresponding balances in the dataset transformedin the given geometry to a value between 0 and 1.
balanceBase a matrix (or vector) with column vectors giving the transform in the cdt-transformused to achiev the correponding balances.
References
http://ima.udg.es/Activitats/CoDaWork08 Papers of Boogaart and Tolosana http://ima.udg.es/Activitats/CoDaWork05 Paper of Egozcue
See Also
clr,ilr,ipt, ilrBase
34 barplot.acomp
Examples
X <- rnorm(100)Y <- rnorm.acomp(100,acomp(c(A=1,B=1,C=1)),0.1*diag(3))+acomp(t(outer(c(0.2,0.3,0.4),X,"^")))colnames(Y) <- c("A","B","C")
subComps <- function(X,...,all=list(...)) {X <- oneOrDataset(X)nams <- sapply(all,function(x) paste(x[[2]],x[[3]],sep=","))val <- sapply(all,function(x){
a = X[,match(as.character(x[[2]]),colnames(X)) ]b = X[,match(as.character(x[[2]]),colnames(X)) ]c = X[,match(as.character(x[[3]]),colnames(X)) ]return(a/(b+c))})
colnames(val)<-namsval
}pairs(cbind(ilr(Y),X),panel=function(x,y,...) {points(x,y,...);abline(lm(y~x))})pairs(cbind(balance(Y,~A/B/C),X),panel=function(x,y,...) {points(x,y,...);abline(lm(y~x))})
pairwisePlot(balance(Y,~A/B/C),X)pairwisePlot(X,balance(Y,~A/B/C),panel=function(x,y,...) {plot(x,y,...);abline(lm(y~x))})pairwisePlot(X,balance01(Y,~A/B/C))pairwisePlot(X,subComps(Y,A~B,A~C,B~C))
balance(rcomp(Y),~A/B/C)balance(aplus(Y),~A/B/C)balance(rplus(Y),~A/B/C)
barplot.acomp Bar charts of amounts
Description
Compositions and amounts dispalyed as bar plots.
Usage
## S3 method for class 'acomp'barplot(height,...,legend.text=TRUE,beside=FALSE,total=1,plotMissings=TRUE,missingColor="red",missingPortion=0.01)## S3 method for class 'rcomp'barplot(height,...,legend.text=TRUE,beside=FALSE,total=1,plotMissings=TRUE,missingColor="red",missingPortion=0.01)
barplot.acomp 35
## S3 method for class 'aplus'barplot(height,...,legend.text=TRUE,beside=TRUE,total=NULL,plotMissings=TRUE,missingColor="red",missingPortion=0.01)## S3 method for class 'rplus'barplot(height,...,legend.text=TRUE,beside=TRUE,total=NULL,plotMissings=TRUE,missingColor="red",missingPortion=0.01)## S3 method for class 'ccomp'barplot(height,...,legend.text=TRUE,beside=FALSE,total=1,plotMissings=TRUE,missingColor="red",missingPortion=0.01)
Arguments
height an acomp, rcomp, aplus, or rplus object giving amounts to be displayed
... further graphical parameters as in barplot
legend.text same as legend.text in barplot
beside same as beside in barplot
total The total to be used in displaying the composition, typically 1, 100 or the numberof parts. If NULL no normalisation takes place.
plotMissings logical: shall missings be annotate in the plot
missingColor color to draw missings
missingPortion The space portion to be reserved for missings
Details
These functions are essentially light-weighted wrappers for barplot, just adding an adequate de-fault behavior for each of the scales. The missingplot functionality will work well with the defaultsettings.
If plotMissings is true, there will be an additional portion introduced, which is not counted in thetotal. This might make the plots looking less nice, however they make clear to the viewer that it isby no means clear how the rest of the plot should be interpreted and that the missing value reallycasts some unsureness on the rest of the data.
Value
A numeric vector (or matrix, when beside = TRUE) giving the coordinates of all the bar midpointsdrawn, as in barplot
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
acomp, rcomp, rplus aplus, plot.acomp, boxplot.acomp
36 Bayesite
Examples
data(SimulatedAmounts)barplot(mean(acomp(sa.lognormals[1:10,])))barplot(mean(rcomp(sa.lognormals[1:10,])))barplot(mean(aplus(sa.lognormals[1:10,])))barplot(mean(rplus(sa.lognormals[1:10,])))
barplot(acomp(sa.lognormals[1:10,]))barplot(rcomp(sa.lognormals[1:10,]))barplot(aplus(sa.lognormals[1:10,]))barplot(rplus(sa.lognormals[1:10,]))
barplot(acomp(sa.tnormals))
Bayesite Permeabilities of bayesite
Description
Relation of mechanical properties of a new fibreboard with its composition
Usage
data(Bayesite)
Details
In the development of bayesite, a new fibreboard, experiments were conducted to obtain someinsight into the nature of the relationship of its permeability to the mix of its four ingredients
A: short fibres,B: medium fibres,C: long fibres,D: binder,
and the pressure of which these are bonded. The results of 21 such experiments are reported. It isrequired to investigate the dependence of permeability of mixture and bonding pressure.
All 4-part compositions sum to one.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name BAYESITE.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
binary 37
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 21, pp21.
binary Treating binary and g-adic numbers
Description
Allows the access to individual digits in binary (and general g-adic) numbers.
Usage
binary(x,mb=max(maxBit(x,g)),g=2)unbinary(x,g=2)bit(x,b,g=2)## S3 method for class 'numeric'bit(x,b=0:maxBit(x,g),g=2)## S3 method for class 'character'bit(x,b=0:maxBit(x,g),g=2)bit(x,b,g=2) <- value## S3 replacement method for class 'numeric'bit(x,b=0:maxBit(x,g),g=2) <- value## S3 replacement method for class 'character'bit(x,b=0:maxBit(x,g),g=2) <- valuemaxBit(x,g=2)## S3 method for class 'numeric'maxBit(x,g=2)## S3 method for class 'character'maxBit(x,g=2)bitCount(x,mb=max(maxBit(x,g)),g=2)gsi.orSum(...,g=2)whichBits(x,mb=max(maxBit(x,g)),g=2,values=c(TRUE))binary2logical(x,mb=max(maxBit(x,g)),g=2,values=c(TRUE))
Arguments
x a number either represented a g-adic character string or as a integeral numericvalue
b the indicies of the bits to be processes. The least significant bit has index 0.
mb maximal bit. The index of the most significant bit to be treated
g the base of the g-adic representation. 2 corresponds to binary numbers, 8 tooctal numbers, 16 to hexadecimal numbers. g is limited by 36.
value a vector of bit values to be selected or setted.
values a vector of bit values that should be considered as TRUE.
... some binary numbers
38 binary
Details
These routines are primerily intended to manipulate g-adic numbers for user interface purposes andcondensed representation of information. They are not intended for a long number arithmetic.
Value
binary returns a standard binary (or g-adic) character representation of the number
unbinary returns a binary (or g-adic) representation of the number
bit returns the values of the requested bits. The values are returned as a logicalvector for binary numbers an as numeric digit values for other g-adic numbers.
maxBit returns the most significant bit represented in the number. This is the highest bitset in numeric numbers and the highest actually given character in a characterrepresentation.
bitCount returns the g-adic crossfoot of the number. For a binary number this is thenumber of bits set
gsi.orSum Only works for binary numbers and does a parallel or on each of the bits for alist of binary numbers.
whichBits returns the indices of the bits acutally set (or more precisely of the bits withvalue in values)
binary2logical returns the a true false vector of the bits acutally set (or more precisely of thebits with value in values)
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
outlierplot
Examples
(x<-unbinary("10101010"))(y<-binary(x))bit(x,1:3)bit(y,0:3)maxBit(x)maxBit(y)whichBits(x)whichBits(y)binary2logical(y)bit(x)bit(y)gsi.orSum(y,1)bitCount(x)bitCount(y)bit(x,2)<-1x
biplot3D 39
bit(y,2)<-1y
biplot3D Three-dimensional biplots, based on package rgl
Description
Plots variables and cases in the same plot, based on a principal component analysis.
Usage
biplot3D(x,...)## Default S3 method:biplot3D(x,y,var.axes=TRUE,col=c("green","red"),cex=c(2,2),
xlabs = NULL, ylabs = NULL, expand = 1,arrow.len = 0.1,...,add=FALSE)
## S3 method for class 'princomp'biplot3D(x,choices=1:3,scale=1,...,
comp.col=1,comp.labs=paste("Comp.",1:3),scale.scores=lambda[choices]^(1-scale),scale.var=scale.comp, scale.comp=sqrt(lambda[choices]),scale.disp=1/scale.comp)
Arguments
x princomp object or matrix of point locations to be drawn (typically, cases)
choices Which principal components should be used?
scale a scaling parameter like in biplot
scale.scores a vector giving the scaling applied to the scores
scale.var a vector giving the scaling applied to the variables
scale.comp a vector giving the scaling applied to the unit length of each component
scale.disp a vector giving the scaling of the display in the directions of the components
comp.col color to draw the axes of the components, defaults to black
comp.labs labels for the components
... further plotting parameters as defined in rgl.material
y matrix of second point/arrow-head locations (typically, variables)
var.axes logical, TRUE draws arrows and FALSE points for y
col vector/list of two elements the first giving the color/colors for the first data setand the second giving color/colors for the second data set.
cex vector/list of two elements the first giving the size for the first data set and thesecond giving size for the second data set.
40 Blood23
xlabs labels to be plotted at x-locationsylabs labels to be plotted at y-locationsexpand the relative expansion of the y data set with respect to xarrow.len The length of the arrows as defined in arrows3D
add logical, adding to existing plot or making a new one?
Details
This "biplot" is a triplot, relating data, variables and principal components. The relative scaling ofthe components is still experimental, meant to mimic the behavior of classical biplots.
Value
the 3D plotting coordinates of the tips of the arrows of the variables displayed, returned invisibly
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
gsi
Examples
data(SimulatedAmounts)pc <- princomp(acomp(sa.lognormals5))pcsummary(pc)plot(pc) #plot(pc,type="screeplot")biplot3D(pc)
Blood23 Blood samples
Description
Percentage of different leukocytes in blood samples of ten patients, determined by four differentmethods.
Usage
data(Blood23)
Details
In a comparative study of four different methods of assessing leukocytes composition of a bloodsample, aliquots of blood samples from ten patients were assessed by the four methods. Data showthe percentage in the 40 analyses of:
Boxite 41
P: polymorphonuclear leukocytes,S: small lymphocytes,L: large mononuclears.
All rows sum to 100.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name BLOOD.DAT, here in-cluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 23, pp22.
Boxite Compositions and depth of 25 specimens of boxite
Description
A mineral compositions of 25 rock specimens of boxite type. Each composition consists of thepercentage by weight of five minerals, albite, blandite, cornite, daubite, endite, as well as the depthof location.
Usage
data(Boxite)
Details
A mineral compositions of 25 rock specimens of boxite type are given. Each composition consists ofthe percentage by weight of five minerals, albite, blandite, cornite, daubite, endite, and the recordeddepth of location of each specimen. We abbreviate the minerals names to A, B, C, D, E.
All row percentage sums to 100, except the 6-th 102.3 and the 10-th 99.9.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name BOXITE.DAT, here in-cluded under the GNU Public Library Licence Version 2 or newer.
42 boxplot
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 3, pp4.
boxplot Displaying compositions and amounts with box-plots
Description
For the different interpretations of amounts or compositional data, a different type of boxplot isfeasible. Thus different boxplots are drawn.
Usage
## S3 method for class 'acomp'boxplot(x,fak=NULL,...,
xlim=NULL,ylim=NULL,log=TRUE,panel=vp.logboxplot,dots=!boxes,boxes=TRUE,notch=FALSE,plotMissings=TRUE,mp=~simpleMissingSubplot(missingPlotRect,
missingInfo,c("NM","TM",cn)))
## S3 method for class 'rcomp'boxplot(x,fak=NULL,...,
xlim=NULL,ylim=NULL,log=FALSE,panel=vp.boxplot,dots=!boxes,boxes=TRUE,notch=FALSE,plotMissings=TRUE,mp=~simpleMissingSubplot(missingPlotRect,
missingInfo,c("NM","TM",cn)))## S3 method for class 'aplus'boxplot(x,fak=NULL,...,log=TRUE,
plotMissings=TRUE,mp=~simpleMissingSubplot(missingPlotRect,
missingInfo,names(missingInfo)))
## S3 method for class 'rplus'boxplot(x,fak=NULL,...,ylim=NULL,log=FALSE,
plotMissings=TRUE,mp=~simpleMissingSubplot(missingPlotRect,
missingInfo,names(missingInfo)))
vp.boxplot(x,y,...,dots=FALSE,boxes=TRUE,xlim=NULL,ylim=NULL,log=FALSE,notch=FALSE,plotMissings=TRUE,mp=~simpleMissingSubplot(missingPlotRect,
missingInfo,c("NM","TM",cn)),
boxplot 43
missingness=attr(y,"missingness") )vp.logboxplot(x,y,...,dots=FALSE,boxes=TRUE,xlim,ylim,log=TRUE,notch=FALSE,
plotMissings=TRUE,mp=~simpleMissingSubplot(missingPlotRect,
missingInfo,c("NM","TM",cn)),missingness=attr(y,"missingness"))
Arguments
x a data set
fak a factor to split the data set, not yet implemented in aplus and rplus
xlim x-limits of the plot.
ylim y-limits of the plot.
log logical indicating whether ploting should be done on log scale
panel the panel function to be used or a list of multiple panel functions
... further graphical parameters
dots a logical indicating whether the points should be drawn
boxes a logical indicating whether the boxes should be drawn
y used by pairs
notch logical, should the boxes be notched?
plotMissings Logical indicating that missings should be displayed.
mp A formula providing a function call, which will be evaluated within each panelwith missings to plot the missingness situation. The call can use the variablesmissingPlotRect, which provides a rectangle to plot the information to in apar("usr") like specification. In the rX is the current data
missingness The missingness information as a result from missingType of the full data in-formation the panels could base there missing plots on.
Details
boxplot.aplus and boxplot.rplus are wrappers of bxp, which just take into account the possiblelogarithmic scale of the data.
boxplot.acomp and boxplot.rcomp generate a matrix of box-plots, where each cell represents thedifference between the row and column variables. Such difference is respectively computed as alog-ratio and a rest.
vp.boxplot and vp.logboxplot are only used as panel functions. They should not be directlycalled.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
44 ccomp
See Also
plot.acomp, qqnorm.acomp
Examples
data(SimulatedAmounts)boxplot(acomp(sa.lognormals))boxplot(rcomp(sa.lognormals))boxplot(aplus(sa.lognormals))boxplot(rplus(sa.lognormals))# And now with missing!!!boxplot(acomp(sa.tnormals))
ccomp Count compositions
Description
A class providing the means to analyse count compositions understood as Poisson or multinomialrealisation, where the portions are given by an unkown Aitchison compositions.
Usage
ccomp(X,parts=1:NCOL(oneOrDataset(X)),total=NA,warn.na=FALSE,detectionlimit=NULL,BDL=NULL,MAR=NULL,MNAR=NULL,SZ=NULL)
Arguments
X composition or dataset of compositions
parts vector containing the indices xor names of the columns to be used
total the total amount to be used, typically 1 or 100
warn.na should the user be warned in case of NA,NaN or 0 coding different types ofmissing values?
detectionlimit a number, vector or matrix of positive numbers giving the detection limit of allvalues, all columns or each value, respectively
BDL the code for ’Below Detection Limit’ in X
SZ the code for ’Structural Zero’ in X
MAR the code for ’Missing At Random’ in X
MNAR the code for ’Missing Not At Random’ in X
ccompgof 45
Details
A count composition contains an indirect observation of an Aitchison composition by a Poisson ormultinomial variable. A count composition can only contain integer counts. It is assumed that thetotal sum is a an artefact and does not contain information on the actual composition. But it doescontain information on the precision of the relative observation.
Value
a vector of class "ccomp" representing count composition or a matrix of class "ccomp" representingmultiple count compositions each in one row.
Missing Policy
The policy of treatment of zeroes, missing values and values below detecion limit is explained indepth in compositions.missing.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
barplot.ccomp ccompMultinomialGOF.test ccompPoissonGOF.test cdt.ccomp cdtInv.ccompfitSameMeanDifferentVarianceModel groupparts.ccomp idt.ccomp idtInv.ccomp is.ccompmean.ccomp names<-.ccomp names.ccomp plot.ccomp PoissonGOF.test rmultinom.ccomp rnorm.ccomprpois.ccomp split.ccomp totals.ccomp
Examples
data(SimulatedAmounts)plot(acomp(sa.lognormals))
ccompgof Compositional Goodness of fit test
Description
Goodness of fit tests for count compositional data.
Usage
PoissonGOF.test(x,lambda=mean(x),R=999,estimated=missing(lambda))ccompPoissonGOF.test(x,simulate.p.value=TRUE,R=1999)ccompMultinomialGOF.test(x,simulate.p.value=TRUE,R=1999)
46 ccompgof
Arguments
x a dataset integer numbers (PoissonGOF) or count compositions (compPoissonGOF)
lambda the expected value to check against
R The number of replicates to compute the distribution of the test statistic
estimated states whether the lambda parameter should be considered as estimated for thecomputation of the p-value.
simulate.p.value
should all p-values be infered by simulation.
Details
The compositional goodness of fit testing problem is essentially a multivariate goodness of fit test.However there is a lack of standardized multivariate goodness of fit tests in R. Some can be foundin the energy-package.
In principle there is only one test behind the Goodness of fit tests provided here, a two sample testwith test statistic. ∑
ij k(xi, yi)√∑ij k(xi, xi)
∑ij k(yi, yi)
The idea behind that statistic is to measure the cos of an angle between the distributions in a scalarproduct given by
(X,Y ) = E[k(X,Y )] = E[
∫K(x−X)K(x− Y )dx]
where k and K are Gaussian kernels with different spread. The bandwith is actually the standardde-viation of k.The other goodness of fit tests against a specific distribution are based on estimating the parametersof the distribution, simulating a large dataset of that distribution and apply the two sample goodnessof fit test.
Value
A classical "htest" object
data.name The name of the dataset as specified
method a name for the test used
alternative an empty string
replicates a dataset of p-value distributions under the Null-Hypothesis got from nonpara-metric bootstrap
p.value The p.value computed for this test
Missing Policy
Up to now the tests can not handle missings.
ccompgof 47
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
fitDirichlet,rDirichlet, runif.acomp, rnorm.acomp,
Examples
## Not run:x <- runif.acomp(100,4)y <- runif.acomp(100,4)
erg <- acompGOF.test(x,y)#continueergunclass(erg)erg <- acompGOF.test(x,y)
x <- runif.acomp(100,4)y <- runif.acomp(100,4)dd <- replicate(1000,acompGOF.test(runif.acomp(100,4),runif.acomp(100,4))$p.value)hist(dd)
dd <- replicate(1000,acompGOF.test(runif.acomp(20,4),runif.acomp(100,4))$p.value)hist(dd)dd <- replicate(1000,acompGOF.test(runif.acomp(10,4),runif.acomp(100,4))$p.value)
hist(dd)dd <- replicate(1000,acompGOF.test(runif.acomp(10,4),runif.acomp(400,4))$p.value)hist(dd)dd <- replicate(1000,acompGOF.test(runif.acomp(400,4),runif.acomp(10,4),bandwidth=4)$p.value)hist(dd)
dd <- replicate(1000,acompGOF.test(runif.acomp(20,4),runif.acomp(100,4)+acomp(c(1,2,3,1)))$p.value)
hist(dd)
x <- runif.acomp(100,4)acompUniformityGOF.test(x)
dd <- replicate(1000,acompUniformityGOF.test(runif.acomp(10,4))$p.value)
48 cdt
hist(dd)
## End(Not run)
cdt Centered default transform
Description
Compute the centered default transform of a (data set of) compositions or amounts (or its inverse).
Usage
cdt(x,...)## Default S3 method:
cdt( x ,...)## S3 method for class 'acomp'
cdt( x ,...)## S3 method for class 'rcomp'
cdt( x ,...)## S3 method for class 'aplus'
cdt( x ,...)## S3 method for class 'rplus'
cdt( x ,...)## S3 method for class 'rmult'
cdt( x ,...)## S3 method for class 'ccomp'
cdt( x ,...)## S3 method for class 'factor'
cdt( x ,...)cdtInv(x,orig,...)## Default S3 method:
cdtInv( x ,orig,...)## S3 method for class 'acomp'
cdtInv( x ,orig,...)## S3 method for class 'rcomp'
cdtInv( x ,orig,...)## S3 method for class 'aplus'
cdtInv( x ,orig,...)## S3 method for class 'rplus'
cdtInv( x ,orig,...)## S3 method for class 'rmult'
cdtInv( x ,orig,...)## S3 method for class 'ccomp'
cdtInv( x ,orig,...)## S3 method for class 'factor'
cdt 49
cdtInv( x ,orig,...)
Arguments
x a classed (matrix of) amount or composition, to be transformed with its centereddefault transform, or its inverse
... generic arguments past to underlying functions.
orig a compositional object which should be mimicked by the inverse transformation.It is used to determine the backtransform to be used and eventually to reconstructthe names of the parts. It is the generic argument. Typically this argument is thedata set that has be transformed in the first place.
Details
The general idea of this package is to analyse the same data with different geometric concepts,in a fashion as similar as possible. For each of the four concepts there exists a unique transformexpressing the geometry in a linear subspace, keeping the relation to the variables. This uniquetransformation is computed by cdt. For acomp the transform is clr, for rcomp it is cpt, for aplusit is ilt, and for rplus it is iit. Each component of the result is identified with a unit vector in thedirection of the corresponding component of the original composition or amount. Keep in mind thatthe transform is not necessarily surjective and thus variances in the image space might be singular.
Value
A corresponding matrix or vector containing the transforms.
Author(s)
R. Tolosana-Delgado, K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
idt, clr, cpt, ilt, iit
Examples
## Not run:# the cdt is defined bycdt <- function(x) UseMethod("cdt",x)cdt.default <- function(x) xcdt.acomp <- clrcdt.rcomp <- cptcdt.aplus <- iltcdt.rplus <- iit
50 ClamEast
## End(Not run)x <- acomp(1:5)(ds <- cdt(x))cdtInv(ds,x)(ds <- cdt(rcomp(1:5)))cdtInv(ds,rcomp(x))
data(Hydrochem)x = Hydrochem[,c("Na","K","Mg","Ca")]y = acomp(x)z = cdt(y)y2 = cdtInv(z,y)par(mfrow=c(2,2))for(i in 1:4){plot(y[,i],y2[,i])}
ClamEast Color-size compositions of 20 clam colonies from East Bay
Description
From East Bay, 20 clam colonies were randomly selected and from each a sample of clams weretaken. Each sample was sieved into three size ranges, large, medium, and small. Then each sizerange was sorted by the shale colour, dark or light.
Usage
data(ClamEast)
Details
The data consist of 20 cases and 2× 3 variables denoted:
dl portion of dark large clams,dm portion of dark medium clams,ds portion of dark small clams,ll portion of light large clams,lm portion of light medium clams,ls portion of light small clams.
All 6-part compositions sum to one, except for rounding errors.
Note
Courtesy of J. Aitchison
ClamWest 51
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name CLAMEAST.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 14, pp18.
ClamWest Color-size compositions of 20 clam colonies from West Bay
Description
From West Bay, 20 clam colonies were randomly selected, and from each a sample of clams weretaken. Each sample was sieved into three size ranges, large, medium, and small. Then each sizerange was sorted by the shale colour, dark or light.
Usage
data(ClamWest)
Details
The data consist of 20 cases and 2× 3 variables denoted:
dl portion of dark large clams,dm portion of dark medium clams,ds portion of dark small clams,ll portion of light large clams,lm portion of light medium clams,ls portion of light small clams.
All 6-part compositions sum up to one.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name CLAMWEST.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 15, pp17.
52 clo
clo Closure of a composition
Description
Closes compositions to sum up to one (or an optional total), by dividing each part by the sum.
Usage
clo( X, parts=1:NCOL(oneOrDataset(X)),total=1,detectionlimit=attr(X,"detectionlimit"),BDL=NULL,MAR=NULL,MNAR=NULL,SZ=NULL,storelimit=!is.null(attr(X,"detectionlimit")))
Arguments
X composition or dataset of compositions
parts vector containing the indices xor names of the columns to be used
total the total amount to which the compositions should be closed; either a singlenumber, or a numeric vector of length gsi.getN(X) specifying a different totalfor each compositional vector in the dataset.
detectionlimit a number, vector or matrix of positive numbers giving the detection limit of allvalues, all variables, or each value
BDL the code for values below detection limit in X
SZ the code for structural zeroes in X
MAR the code for values missed at random in X
MNAR the code for values missed not at random in X
storelimit a boolean indicating wether to store the detection limit as an attribute in the data.It defaults to FALSE if the detection limit is not already stored in the dataset.The attribute is only needed for very advanced analysis. Most times, this willnot be used.
Details
The closure operation is given by
clo(x) :=
xi/ D∑j=1
xj
clo generates a composition without assigning one of the compositional classes acomp or rcomp.Note that after computing the closed-to-one version, obtaining a version closed to any other valueis done by simple multiplication.
clr 53
Value
a composition or a data matrix of compositions, maybe without compositional class. The individualcompositions are forced to sum to 1 (or to the optionally-specified total). The result should have thesame shape as the input (vector, row, matrix).
Missing Policy
How missing values are coded in the output always follows the general rules described in compositions.missing.The BDL values are accordingly scaled during the scaling operations but not taken into acount forthe calculation of the total sum.
Note
clo can be used to unclass compositions.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
clr,acomp,rcomp
Examples
(tmp <- clo(c(1,2,3)))clo(tmp,total=100)data(Hydrochem)plot( clo(Hydrochem,8:9) ) # Giving points on a line
clr Centered log ratio transform
Description
Compute the centered log ratio transform of a (dataset of) composition(s) and its inverse.
Usage
clr( x,... )clrInv( z,... )
54 clr
Arguments
x a composition or a data matrix of compositions, not necessarily closed
z the clr-transform of a composition or a data matrix of clr-transforms of compo-sitions, not necessarily centered (i.e. summing up to zero)
... for generic use only
Details
The clr-transform maps a composition in the D-part Aitchison-simplex isometrically to a D-dimensonaleuclidian vector subspace: consequently, the transformation is not injective. Thus resulting covari-ance matrices are always singular.The data can then be analysed in this transformation by all classical multivariate analysis tools notrelying on a full rank of the covariance. See ilr and alr for alternatives. The interpretation of theresults is relatively easy since the relation between each original part and a transformed variable ispreserved.The centered logratio transform is given by
clr(x) :=
lnxi −1
D
D∑j=1
lnxj
i
Value
clr gives the centered log ratio transform, clrInv gives closed compositions with the given clr-transform
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
ilr,alr,apt
Examples
(tmp <- clr(c(1,2,3)))clrInv(tmp)clrInv(tmp) - clo(c(1,2,3)) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(clr(cdata),pch=".")
clr2ilr 55
clr2ilr Convert between clr and ilr, and between cpt and ipt.
Description
Compute the centered log ratio transform of a (dataset of) isometric log-ratio transform(s) and itsinverse. Equivalently, compute centered and isometric planar transforms from each other. Acts invectors and in bilinear forms.
Usage
clr2ilr( x , V=ilrBase(x) )ilr2clr( z , V=ilrBase(z=z),x=NULL )clrvar2ilr( varx , V=ilrBase(D=ncol(varx)) )ilrvar2clr( varz , V=ilrBase(D=ncol(varz)+1) ,x=NULL)
Arguments
x the clr/cpt-transform of composition(s) (in the ilr2-routines provided only to givecolumn names.)
z the ilr/ipt-transform of composition(s)
varx variance or covariance matrix of clr/cpt-transformed compositions
varz variance or covariance matrix of ilr/ipt-transformed compositions
V a matrix with columns giving the chosen basis of the clr-plane
Details
These functions perform a matrix multiplication with V in an appropriate way.
Value
clr2ilr gives the ilr/ipt transform of the same composition(s),ilr2clr gives the clr/cpt transform of the same composition(s),clrvar2ilr gives the variance-/covariance-matrix of the ilr/ipt transform of the same composi-tional data set,ilrvar2clr gives the variance-/covariance-matrix of the clr/cpt transform of the same composi-tional data set.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
56 ClusterFinder1
References
Egozcue J.J., V. Pawlowsky-Glahn, G. Mateu-Figueras and C. Barcel’o-Vidal (2003) Isometric lo-gratio transformations for compositional data analysis. Mathematical Geology, 35(3) 279-300Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003
See Also
ilr, ipt
Examples
data(SimulatedAmounts)ilrInv(clr2ilr(clr(sa.lognormals)))-clo(sa.lognormals)clrInv(ilr2clr(ilr(sa.lognormals)))-clo(sa.lognormals)ilrvar2clr(var(ilr(sa.lognormals)))-var(clr(sa.lognormals))clrvar2ilr(var(cpt(sa.lognormals)))-var(ipt(sa.lognormals))
ClusterFinder1 Heuristics to find subpopulations of outliers
Description
The ClusterFinder is a heuristic to find subpopulations of outliers essentially by looking for sec-ondary modes in a density estimate.
Usage
ClusterFinder1(X,...)## S3 method for class 'acomp'ClusterFinder1(X,...,sigma=0.3,radius=1,asig=1,minGrp=3,
robust=TRUE)
Arguments
X the dataset to be clustered
... Further arguments to MahalanobisDist(X,...,robust=robust,pairwise=TRUE)
sigma numeric: The Bandwidth of the density estimation kernel in a robustly Maha-lanobis transformed space. (i.e. in the transform, where the main group has unitvariance)
radius The minimum size of a cluster in a robustly Mahalanobis transformed space.(i.e. in the transform, where the main group has unit variance)
ClusterFinder1 57
asig a scaling factor for the geometry of the robustly Mahalanobis transformed spacewhen computing the likelihood of an observation to belong to group (under aGaussian assumption). Higher values
minGrp the minimum size of group to be used. Smaller groups are treated as singleoutliers
robust A robustness description for estimating the variance of the main group. FALSEis probably not a usefull value. However later other robustness techniques thanmcd might be usefull. TRUE just picks the default method of the package.
Details
See outliersInCompositions for a comprehensive introduction into the outlier treatment in compo-sitions.The ClusterFinder is labeled with a number to make clear that this is just an implementation ofsome heuristic and not based on some eternal truth. Other might give better Clusterfinders.Unlike other Clustering Algorithms the basic model of this algorithm assumes that there is onedominating subpopulation and an unkown number of smaller subpopulations with a similar covari-ance structure but a different mean. The algorithm thus first estimates the covariance structure ofthe main population by a robust location scale estimator. Then it uses a simplified (Gaussian) kerneldensity estimator to find nonrandom secondary modes. The it tries to a assign the different obser-vations according to discrimination analysis model to the different modes. Groups under a givensize are considered as single outliers forming a seperate group. In this way the number of clustersis kept low even if there are many erratic measurements in the dataset.The main use of the clusters is descriptive plotting. The advantage of these cluster against othercluster techniques like k-mean or hclust is that it does not tear appart the central mass of the data,as these methods do to make the clusters as compact as possible.
Value
A list
types a factor representing the group assignments, when the small groups are ignored
typesTbl a table giving the number of members in each of these groups
groups a factor representing the found group assignments
isMax a logical vector indicating for each observation,whether it represent a local max-imum in the density estimate.
prob the infered probability to belong to the different groups given as an acomp com-position.
nmembers a tabel giving the number of members of each group
density the density estimated in each observation location
likeli The infered likelihood see this observation, for each of the groups
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
58 CoDaDendrogram
See Also
hclust, kmeans
Examples
data(SimulatedAmounts)cl <- ClusterFinder1(sa.outliers5,sigma=0.4,radius=1)plot(sa.outliers5,col=as.numeric(cl$types),pch=as.numeric(cl$types))legend(1,1,legend=levels(cl$types),xjust=1,col=1:length(levels(cl$types)),
pch=1:length(levels(cl$types)))
CoDaDendrogram Dendrogram representation of acomp or rcomp objects
Description
Function for plotting CoDa-dendrograms of acomp or rcomp objects.
Usage
CoDaDendrogram(X, V = NULL, expr=NULL, mergetree = NULL, signary = NULL,range = c(-4,4), ..., xlim = NULL, ylim = NULL, yaxt = NULL, box.pos = 0,box.space = 0.25, col.tree = "black", lty.tree = 1, lwd.tree = 1,col.leaf = "black", lty.leaf = 1, lwd.leaf = 1, add = FALSE,border=NULL,type = "boxplot")
Arguments
X data set to plot (an rcomp or acomp object)
V basis to use, described as an ilr matrix
expr a formula describing the partition basis, as with balanceBase
mergetree basis to use, described as a merging tree (as in hclust)
signary basis to use, described as a sign matrix (as in the example below)
range minimum and maximum value for all coordinates (horizontal axes)
... further parameters to pass to any function, be it a plotting function or one relatedto the "type" parameter below; likely to produce lots of warnings
xlim minimum and maximum values for the horizontal direction of the plot (relatedto number of parts)
ylim minimum and maximum values for the vertical direction of the plot (related tovariance of coordinates)
yaxt axis type for the vertical direction of the plot (see par)
CoDaDendrogram 59
box.pos if type="boxplot", this is the relative position of the box in the vertical direction:0 means centered on the axis, -1 aligned below the axis and +1 aligned abovethe axis
box.space if type="boxplot", size of the box in the vertical direction as a portion of theminimal variance of the coordinates
col.tree color for the horizontal axes
lty.tree line type for the horizontal axes
lwd.tree line width for the horizontal axes
col.leaf color for the vertical conections between an axis and a part (leaf)
lty.leaf line type for the leaves
lwd.leaf line width for the leaves
add should a new plot be triggered, or is the material to be added to an existingCoDa-dendrogram?
border the color for drawing the rectangles
type what to represent? one of "boxplot","density","histogram","lines","nothing" or"points", or an univocal abbreviation
Details
The object and an isometric basis are represented in a CoDa-dendrogram, as defined by Egozcueand Pawlowsky-Glahn (2005). This is a representation of the following elements:
• aa hierarchical partition (which can be specified either through an ilrBase matrix (see ilrBase),a merging tree structure (see hclust) or a signary matrix (see gsi.merge2signary))
• bthe sample mean of each coordinate of the ilr basis associated to that partition
• cthe sample variance of each coordinate of the ilr basis associated to that partition
• doptionally (potentially!), any graphical representation of each coordinate, as long as thisrepresentation is suitable for a univariate data set (box-plot, histogram, dispersion and kerneldensity are programmed or intended to, but any other may be added with little work).
Each coordinate is represented in a horizontal axis, which limits correspond to the values given inthe parameter range. The vertical bar going up from each one of these coordinate axes representthe variance of that specific coordinate, and the contact point the coordinate mean. Note that to beable to represent an initial dendrogram, the first call to this function must be given a full data set,as means and variances must be computed. This information is afterwards stored in a global list, toadd any sort of new material to all coordinates.The default option is type="boxplot", which produces a box-plot for each coordinate, customiz-able using box.pos and box.space, as well as typical par parameters (col, border, lty, lwd, etc.).To obtain only the first three aspects, the function must be called with type="lines". As exten-sions, one might represent a single datum/few data (e.g., a mean or a random subsample of thedata set) calling the function with add=TRUE and type="points". Other options (calling functionshistogram or density, and admitting their parameters) will be also soon available.Note that the original coda-dendrogram as defined by Egozcue and Pawlowsky-Glahn (2005) workswith acomp objects and ilr bases. Functionality is extended to rcomp objects using calls to idt.
60 CoDaDendrogram
Author(s)
Raimon Tolosana-Delgado, K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Egozcue J.J., V. Pawlowsky-Glahn, G. Mateu-Figueras and C. Barcel’o-Vidal (2003) Isometric lo-gratio transformations for compositional data analysis. Mathematical Geology, 35(3) 279-300
Egozcue, J.J. and V. Pawlowsky-Glahn (2005). CoDa-Dendrogram: a new exploratory tool. In:Mateu-Figueras, G. and Barcel\’o-Vidal, C. (Eds.) Proceedings of the 2nd International Workshopon Compositional Data Analysis, Universitat de Girona, ISBN 84-8458-222-1, http://ima.udg.es/Activitats/CoDaWork05
See Also
ilrBase,balanceBase, rcomp, acomp,
Examples
# first example: take the data set from the example, select only# compositional partsdata(Hydrochem)x = acomp(Hydrochem[,-c(1:5)])gr = Hydrochem[,4] # river groups (useful afterwards)# use an ilr basis coming from a clustering of partsdd = dist(t(clr(x)))hc1 = hclust(dd,method="ward")plot(hc1)mergetree=hc1$mergeCoDaDendrogram(X=acomp(x),mergetree=mergetree,col="red",range=c(-8,8),box.space=1)# add the mean of each rivercolor=c("green3","red","blue","darkviolet")aux = clrInv(t(sapply(split(clr(x),gr),mean)))CoDaDendrogram(X=aux,add=TRUE,col=color,type="points",pch=4)
# second example: box-plots by rivers (filled)CoDaDendrogram(X=acomp(x),mergetree=mergetree,col="black",range=c(-8,8),type="l")xsplit = split(x,gr)for(i in 1:4){CoDaDendrogram(X=acomp(xsplit[[i]]),col=color[i],type="box",box.pos=i-2.5,box.space=0.5,add=TRUE)}
# third example: fewer parts, partition defined by a signary, and empty box-plotsx = acomp(Hydrochem[,c("Na","K","Mg","Ca","Sr","Ba","NH4")])signary = t(matrix( c(1, 1, 1, 1, 1, 1, -1,
1, 1, -1, -1, -1, -1, 0,1, -1, 0, 0, 0, 0, 0,0, 0, -1, 1, -1, -1, 0,0, 0, 1, 0, -1, 1, 0,0, 0, 1, 0, 0, -1, 0),ncol=7,nrow=6,byrow=TRUE))
coloredBiplot 61
CoDaDendrogram(X=acomp(x),signary=signary,col="black",range=c(-8,8),type="l")xsplit = split(x,gr)for(i in 1:4){
CoDaDendrogram(X=acomp(xsplit[[i]]),border=color[i],type="box",box.pos=i-2.5,box.space=1.5,add=TRUE)
CoDaDendrogram(X=acomp(xsplit[[i]]),col=color[i],type="line",add=TRUE)
}
coloredBiplot A biplot providing somewhat easier access to details of the plot.
Description
This function generates a simple biplot out of various sources and allows to give color and symbolto the x-objects individually.
Usage
## Default S3 method:coloredBiplot(x, y, var.axes = TRUE, col,
cex = rep(par("cex"), 2), xlabs = NULL, ylabs = NULL, expand=1,xlim = NULL, ylim = NULL, arrow.len = 0.1, main = NULL, sub = NULL,xlab = NULL, ylab = NULL, xlabs.col = NULL, xlabs.bg = NULL,xlabs.pc=NULL, ...)
## S3 method for class 'princomp'coloredBiplot(x, choices = 1:2, scale = 1,
pc.biplot=FALSE, ...)## S3 method for class 'prcomp'coloredBiplot(x, choices = 1:2, scale = 1,
pc.biplot=FALSE, ...)
Arguments
x a representation of the the co-information to be plotted, given by a result ofprincomp or prcomp; or the first set of coordinates to be plotted
y optional, the second set of coordinates to be potted
var.axes if ’TRUE’ the second set of points have arrows representing them as (unscaled)axes
col one color (to be used for the y set) or a vector of two colors (to be used for x andy sets respectively, if xlabs.col is NULL)
cex the usual cex parameter for plotting; can be a length-2 vector to format differ-ently x and y labels/symbols
62 coloredBiplot
xlabs names to write for the points of the first set
ylabs names to write for the points of the second set
expand expansion factor to apply when plotting the second set of points relative to thefirst. This can be used to tweak the scaling of the two sets to a physically com-parable scale
xlim horizontal axis limits
ylim vertical axis limits
arrow.len length of the arrow heads on the axes plotted if ’var.axes’ is true. The arrowhead can be suppressed by ’arrow.len=0’
main main title
sub subtitle
xlab horizontal axis title
ylab vertical axis title
xlabs.col the color(s) to draw the points of the first set, if xlabs is null
xlabs.bg the filling color(s) to draw the points of the first set, if xlabs is null and xlabs.pcis between 21 and 25.
xlabs.pc the plotting character(s) for the first set, if xlabs is null
scale the way to distribute the singular values on the right or left singular vectors forprincomp and prcomp objects (see biplot)
choices the components to be plotted (see biplot)
pc.biplot should be scaled by sqrt(nrow(X))? (see biplot)
... further parameters for plot
Details
The functions is provided for convenience.
Value
The function is called only for the side effect of plotting. It is a modification of the standard Rroutine ’biplot’.
Author(s)
Raimon Tolosana-Delgado, K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
biplot, plot.acomp
Examples
data(SimulatedAmounts)coloredBiplot(x=princomp(acomp(sa.outliers5)),pc.biplot=FALSE,
xlabs.pc=c(1,2,3), xlabs.col=2:4, col="black")
colorsForOutliers 63
colorsForOutliers Create a color/char palette or for groups of outliers
Description
Conveniance Functions to generate meaningfull color palettes for factors representing differenttypes of outliers.
Usage
colorsForOutliers1(outfac, family=rainbow,extreme="cyan",outlier="red",ok="gray40",unknown="blue")
colorsForOutliers2(outfac,use=whichBits(gsi.orSum(levels(outfac))),codes=c(2^outer(c(24,16,8),1:7,"-")),ok="yellow")
pchForOutliers1(outfac,ok='.',outlier='\004',extreme='\003',unknown='\004',...,other=c('\001','\002','\026','\027','\010','\011','\012','\013','\014','\015','\016',strsplit("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ","")[[1]]))
Arguments
outfac a factor given by an OutlierClassifier (e.g. OutlierClassifier1). colorsForOutliers1is used for the types "best","type","outlier","grade". colorsForOutliers2 isused for type all.
family a function generating a color palette from a numer of colors requested.
extreme The color/char for extrem but not definitivly outlying observations.
outlier The color/char for detected outliers.
unknown The color/char for observation with unclear classification.
other The character codes for other outlier classes.
ok The color/char for nonoutlying usual observations.
use a numerical vector giving the indices of the bits of the output to be represented.The sequence of the bits determins how each bit is represented.
codes The color influences to be used for each bit.
... further codings for other factorlevels
Details
This functions are provided for coveniance to quickly generate a palette of reasonable colors orplotting chars for groups of outliers classified by OutlierClassifier1.
Value
a character vector of colors or a numeric vector of plot chars.
64 CompLinModCoReg
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
OutlierClassifier1
Examples
## Not run:data(SimulatedAmounts)data5 <- acomp(sa.outliers5)olc <- OutlierClassifier1(data5)plot(data5,col=colorsForOutliers1(olc)[olc])olc <- OutlierClassifier1(data5,type="all")plot(data5,col=colorsForOutliers2(olc)[olc])
## End(Not run)
CompLinModCoReg Compositional Linear Model of Coregionalisation
Description
Creates a Variogram model according to the linear model of spatial corregionalisation for a compo-sitional geostatistical analysis.
Usage
CompLinModCoReg(formula,comp,D=ncol(comp),envir=environment(formula))
Arguments
formula A formula without left side providing a formal description of a variogram model.
comp a compositional dataset, needed to provide the frame size
D The dimension of the multivariate dataset
envir The enviroment in which formula should be interpreted.
Details
The linear model of coregionalisation says uses the fact, that sums of valid variogram models arevalid variograms and that skalar variograms multiplid with a positiv definit matrix are valid vectorvariograms. The function computes such a variogram function from a formal description. Theformula is a sum. Each summand is either a product of a matrix description and a scalar variogramdescription or only a scalar variogram description. Scalar variogram descriptions are either formalfunction calls to
CompLinModCoReg 65
• sph(range) for sperical variogram with range range
• exp(range) for an exponential variogram with range range
• gauss(range) for a Gaussian variogram with range range
• gauss(range) for a cardinal sine variogram with (non-effective) range range
• pow(range) for an power variogram with range range
• lin(unit) linear variogram 1 at unit.
• nugget() for adding a nuggeteffect.
alternatively it can be any expression, which will be evaluated in envir and should depende on adataset of distantce vectrs h.The matrix description always comes first. It can be R1 for a rank 1 matrix PSD for a PositiveSemidefinite matrix or \(S\) for a scalar Sill factor to be multipled with identiy. Or any other con-struct evaluating to a matrix or a function of some parameters with default values, that if called isevaluated to a positive semidefinit matrix. R1 and PSD can also be written as calls with providing avector or respectively a matrix providing the parameter.The variogram is created with default parametervalues. The parameters can later be modified bymodifiying the default parameter with assignments like formals(vg)$sPSD1 = parameterPosdefMat(4*diag(5)).We would anyway expect you to fit the model to the data by a command like vgmFit(logratioVariogram(...),CompLinModCoReg(...))
Value
A variogram function.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
What to cite??
See Also
vgram2lrvgram, CompLinModCoReg, vgmFit
Examples
## Not run:data(juraset)X <- with(juraset,cbind(X,Y))comp <- acomp(juraset,c("Cd","Cu","Pb","Co","Cr"))CompLinModCoReg(~nugget()+sph(0.5)+R1*exp(0.7),comp)CompLinModCoReg(~nugget()+R1*sph(0.5)+R1*exp(0.7)+(0.3*diag(5))*gauss(0.3),comp)CompLinModCoReg(~nugget()+R1*sph(0.5)+R1(c(1,2,3,4,5))*exp(0.7),comp)
## End(Not run)
66 compOKriging
compOKriging Compositional Ordinary Kriging
Description
Geostatistical prediction for compositional data with missing values.
Usage
compOKriging(comp,X,Xnew,vg,err=FALSE)
Arguments
comp an acomp compositional dataset
X A dataset of locations
Xnew The locations, where a geostatistical prediction should be computed.
vg A compositional variogram function.
err boolean: If true kriging errors are computed additionally.
Details
The function performes multivariate ordinary kriging of compositions based on transformes ad-dapted to the missings in every case. The variogram is assumed to be a clr variogram.
Value
A list of class "logratioVariogram"
X The new locations as given by Xnew
Z The predicted values as acomp compositions.
err An ncol(Z)xDxD array with the clr kriging errors.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Pawlowsky-Glahn, Vera and Olea, Ricardo A. (2004) Geostatistical Analysis of CompositionalData, Oxford University Press, Studies in Mathematical Geology
Tolosana (2008) ...
Tolosana, van den Boogaart, Pawlowsky-Glahn (2009) Estimating and modeling variograms ofcompositional data with occasional missing variables in R, StatGis09
ConfRadius 67
See Also
vgram2lrvgram, CompLinModCoReg, vgmFit
Examples
## Not run:# Load datadata(juraset)X <- with(juraset,cbind(X,Y))comp <- acomp(juraset,c("Cd","Cu","Pb","Co","Cr"))lrv <- logratioVariogram(comp,X,maxdist=1,nbins=10)plot(lrv)
# Fit a variogram modelvgModel <- CompLinModCoReg(~nugget()+sph(0.5)+R1*exp(0.7),comp)fit <- vgmFit2lrv(lrv,vgModel)fitplot(lrv,lrvg=vgram2lrvgram(fit$vg))
# Define A gridx <- (0:30/30)*6y <- (0:30/30)*6Xnew <- cbind(rep(x,length(y)),rep(y,each=length(x)))
# Krigingerg <- compOKriging(comp,X,Xnew,fit$vg,err=FALSE)par(mar=c(0,0,1,0))pairwisePlot(erg$Z,panel=function(a,b,xlab,ylab) {image(x,y,
structure(log(a/b),dim=c(length(x),length(y))),main=paste("log(",xlab,"/",ylab,")",sep=""));points(X,pch=".")})
# Check interpolation propertiesergR <- compOKriging(comp,X,X,fit$vg,err=FALSE)pairwisePlot(ilr(comp),ilr(ergR$Z))ergR <- compOKriging(comp,X,X+1E-7,fit$vg,err=FALSE)pairwisePlot(ilr(comp),ilr(ergR$Z))ergR <- compOKriging(comp,X,X[rev(1:31),],fit$vg,err=FALSE)pairwisePlot(ilr(comp)[rev(1:31),],ilr(ergR$Z))
## End(Not run)
ConfRadius Helper to compute confidence ellipsoids
Description
Computes the quantile of the Mahalanobis distance needed to draw confidence ellipsoids.
68 cor.acomp
Usage
ConfRadius(model,prob=1-alpha,alpha)
Arguments
model A multivariate linear model
prob The confidence probability
alpha The alpha error allowed, i.e. the complement of the confidence probability
Details
Calculates the radius to be used in confidence ellipses for the parameters based on the HottelingsT 2 distribution.
Value
a scalar
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
lm, mvar, AIC
Examples
data(SimulatedAmounts)model <- lm(ilr(sa.groups)~sa.groups.area)cf = coef(model)plot(ilrInv(cf, x=sa.groups))for(i in 1:nrow(cf)){
vr = vcovAcomp(model)[,,i,i]vr = ilrvar2clr(vr)ellipses(ilrInv(cf[i,]), vr, r=ConfRadius(model, alpha=0.05) )}
cor.acomp Correlations of amounts and compositions
Description
Computes the correlation matrix in the various approaches of compositional and amount data anal-ysis.
cor.acomp 69
Usage
cor(x,y=NULL,...)## Default S3 method:
cor(x, y=NULL, use="everything",method=c("pearson", "kendall", "spearman"),...)
## S3 method for class 'acomp'cor(x,y=NULL,...,robust=getOption("robust"))
## S3 method for class 'rcomp'cor(x,y=NULL,...,robust=getOption("robust"))
## S3 method for class 'aplus'cor(x,y=NULL,...,robust=getOption("robust"))
## S3 method for class 'rplus'cor(x,y=NULL,...,robust=getOption("robust"))
## S3 method for class 'rmult'cor(x,y=NULL,...,robust=getOption("robust"))
Arguments
x a data set, eventually of amounts or compositions
y a second data set, eventually of amounts or compositions
use see cor
method see cor
... further arguments to cor e.g. use
robust A description of a robust estimator. FALSE for the classical estimators. Seemean.acomp for further details.
Details
The correlation matrix does not make much sense for compositions.
In R versions older than v2.0.0, cor was defined in package “base” instead of in “stats”. This mightproduce some misfunction.
Value
The correlation matrix.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
var.acomp
70 Coxite
Examples
data(SimulatedAmounts)meanCol(sa.lognormals)cor(acomp(sa.lognormals5[,1:3]),acomp(sa.lognormals5[,4:5]))cor(rcomp(sa.lognormals5[,1:3]),rcomp(sa.lognormals5[,4:5]))cor(aplus(sa.lognormals5[,1:3]),aplus(sa.lognormals5[,4:5]))cor(rplus(sa.lognormals5[,1:3]),rplus(sa.lognormals5[,4:5]))cor(acomp(sa.lognormals5[,1:3]),aplus(sa.lognormals5[,4:5]))
Coxite Compositions, depths and porosities of 25 specimens of coxite
Description
A mineral compositions of 25 rock specimens of coxite type. Each composition consists of thepercentage by weight of five minerals, albite, blandite, cornite, daubite, endite, the depth of location,and porosity.
Usage
data(Coxite)
Details
A mineral compositions of 25 rock specimens of coxite type. Each composition consists of thepercentage by weight of five minerals, albite, blandite, cornite, daubite, endite,the recorded depth oflocation of each specimen, and porosity. Porosity is the percentage of void space that the specimencontains. We abbreviate the minerals names to A, B, C, D, E.
All row percentage sums to 100.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name BOXITE.DAT, here in-cluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 4, pp4.
cpt 71
cpt Centered planar transform
Description
Compute the centered planar transform of a (dataset of) compositions and its inverse.
Usage
cpt( x,... )cptInv( z,... )
Arguments
x a composition or a data.matrix of compositions, not necessarily closed
z the cpt-transform of a composition or a data matrix of cpt-transforms of compo-sitions. It is checked that the z sum up to 0.
... generic arguments. not used.
Details
The cpt-transform maps a composition in the D-part real-simplex isometrically to a D-1 dimen-sional euclidian vector space, identified with a plane parallel to the simplex but passing through theorigin. However the transformation is not injective and does not even reach the whole plane. Thusresulting covariance matrices are always singular.
The data can then be analysed in this transformed space by all classical multivariate analysis toolsnot relying on a full rank of the covariance matrix. See ipt and apt for alternatives. The interpreta-tion of the results is relatively easy since the relation of each transformed component to the originalparts is preserved.
The centered planar transform is given by
cpt(x)i := clo(x)i −1
D
Value
cpt gives the centered planar transform, cptInv gives closed compositions with the given cpt-transforms.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
72 DiagnosticProb
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
clr,apt,ipt
Examples
(tmp <- cpt(c(1,2,3)))cptInv(tmp)cptInv(tmp) - clo(c(1,2,3)) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(cpt(cdata),pch=".")
DiagnosticProb Diagnostic probabilities
Description
Data record the probabilities assigned by subjective diagnostic of 15 clinicians and 15 statisticians.
Usage
data(DiagnosticProb)
Details
The data consist of 30 cases: 15 diagnostics probabilities assigned by clinicians, 15 diagnosticsprobabilities assigned by statisticians, and 4 variables: probabilities A, B, and C, and type i.e. 1 forclinicians, 2 for statisticians.
In the study of subjective performance in inferential task the subject is faced with the finite set ofmutually exclusive and exhaustive hypothesis, and the basis of specific information presented tohim/her is required to divide the available unit of probability among these probabilities. In thisstudy the task is presented as a problem of differential diagnosis of three mutually exclusive andexhaustive diseases of students, known under the generic title of ’newmath syndrome’,
A - algebritis,B - bilateral paralexia,C - calculus deficiency.
The subject, playing the role of diagnostician, is informed that the three diseases types are equallycommon and is shown the results of 10 diagnostic tests on 60 previous cases of known diagnosis,20 of each type. The subject is then shown the results of the 10 tests for a new undiagnosed casesand asked to assign diagnostic probabilities to the three possible disease types.
dist 73
Data record the subjective assessments of 15 clinicians and 15 statisticians for the same case. Forthis case the objective diagnosis probabilities are known to be $(.08, .05, .87).$
All row probabilities sum to 1, except for some rounding errors.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name DIAGPROB.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 17, pp20.
dist Distances in variouse approaches
Description
Calculates a distance matrix from a data set.
Usage
dist(x,...)## Default S3 method:dist(x,...)
Arguments
x a dataset
... further arguments to dist
Details
The distance is computed based on cdt
Value
a distance matrix
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
74 ellipses
See Also
aplus
Examples
data(SimulatedAmounts)phc <- function(d) { plot(hclust(d))}phc(dist(iris[,1:4]))phc(dist(acomp(sa.lognormals),method="manhattan"))phc(dist(rcomp(sa.lognormals)))phc(dist(aplus(sa.lognormals)))phc(dist(rplus(sa.lognormals)))
ellipses Draw ellipses
Description
Draws ellipses from a mean and a variance into a plot.
Usage
ellipses(mean,...)## S3 method for class 'acomp'
ellipses(mean,var,r=1,...,steps=72,thinRatio=NULL,aspanel=FALSE)
## S3 method for class 'rcomp'ellipses(mean,var,r=1,...,steps=72,
thinRatio=NULL,aspanel=FALSE)## S3 method for class 'aplus'
ellipses(mean,var,r=1,...,steps=72,thinRatio=NULL)## S3 method for class 'rplus'
ellipses(mean,var,r=1,...,steps=72,thinRatio=NULL)## S3 method for class 'rmult'
ellipses(mean,var,r=1,...,steps=72,thinRatio=NULL)
Arguments
mean a compositional dataset or value of means or midpoints of the ellipses
var a variance matrix or a set of variance matrices given by var[i,,] (multiplecovariance matrices are not consitently implemented as of today). The principalaxis of the variance give the axis of the ellipses, whereas the square-root of theeigenvalues times r give the half-diameters of the ellipse.
r a scaling of the half-diameters
... further graphical parameters
ellipses 75
steps the number of discretisation points to draw the ellipses.
thinRatio The ellipse function now be default plots the whole ellipsiod by giving its princi-ple circumferences. However this is not reasonable for the thinner directions. Ifa direction other than the first two eigendirections has an eigenvalue not biggerthan thinRatio*rmax it is not plotted. Thus thinRatio=1 reinstantiates the old be-havior of the function. Later thinratio=NULL will become the default, in whichcase the projection of the ellipse is plotted. However this is not implementedyet.
aspanel Is the function called as slave to draw in a panel of a gsi.pairs plot, or as a userfunction setting up the plots.
Details
The ellipsoid/ellipse drawn is given by the solutions of
(x−mean)tvar−1(x−mean) = r2
in the respective geometry of the parameter space. Note that these ellipses can be added to panelplots (by means of orthogonal projections in the corresponding geometry).
There are actually three possibilities of drawing a a hyperdimensional ellipsoid or ellipse and nonof them is perfect.
• thinRatio=1.1 This works like, what was implemented in the older versions of compositons,but never correctly documented. It draws an ellipse with main axes given by the two largestEigendirections of the var-Matrix given.
• thinRatio=0 Draws all the ellipses given by every pair of eigendirections. In this way weget a visual impression of the high dimensional ellipsoid represend by the variance matrix.However the plots gets fastly cluttered in dimensions, when D>4. A 0<thinRatio<1 can avoidusing eigendirection with small extend (i.e. smaller than thinRatio*largest Eigenvalue.
• thinRatio=NULL Draws in each Panel a two dimensional ellipse representing the marginalvariance in the projection of the plot, if var was to be interpreted as a variance matrix. Thiscan be seen as some sort of projection of the high dimensional ellipsoid, but is not necessarilyits visual outline.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
plot.acomp,
Examples
data(SimulatedAmounts)
plot(acomp(sa.lognormals))tt<-acomp(sa.lognormals); ellipses(mean(tt),var(tt),r=2,col="red")tt<-rcomp(sa.lognormals); ellipses(mean(tt),var(tt),r=2,col="blue")
76 endmemberCoordinates
plot(aplus(sa.lognormals[,1:2]))tt<-aplus(sa.lognormals[,1:2]); ellipses(mean(tt),var(tt),r=2,col="red")tt<-rplus(sa.lognormals[,1:2]); ellipses(mean(tt),var(tt),r=2,col="blue")
plot(rplus(sa.lognormals[,1:2]))tt<-aplus(sa.lognormals[,1:2]); ellipses(mean(tt),var(tt),r=2,col="red")tt<-rplus(sa.lognormals[,1:2]); ellipses(mean(tt),var(tt),r=2,col="blue")tt<-rmult(sa.lognormals[,1:2]); ellipses(mean(tt),var(tt),r=2,col="green")
endmemberCoordinates Recast amounts as mixtures of end-members
Description
Computes the convex combination of amounts as mixtures of endmembers to explain X as well aspossible.
Usage
endmemberCoordinates(X,...)endmemberCoordinatesInv(K,endmembers,...)## Default S3 method:
endmemberCoordinates(X,endmembers=diag(gsi.getD(X)), ...)
## S3 method for class 'acomp'endmemberCoordinates(X,
endmembers=clrInv(diag(gsi.getD(X))),...)## S3 method for class 'aplus'
endmemberCoordinates(X,endmembers,...)## S3 method for class 'rplus'
endmemberCoordinates(X,endmembers,...)## S3 method for class 'rmult'
endmemberCoordinatesInv(K,endmembers,...)## S3 method for class 'acomp'
endmemberCoordinatesInv(K,endmembers,...)## S3 method for class 'rcomp'
endmemberCoordinatesInv(K,endmembers,...)## S3 method for class 'aplus'
endmemberCoordinatesInv(K,endmembers,...)## S3 method for class 'rplus'
endmemberCoordinatesInv(K,endmembers,...)
endmemberCoordinates 77
Arguments
X a data set of amounts or compositions, to be represented in as convex combina-tion of the endmembers in the given geometry
K weights of the endmembers in the convex combination
endmembers a dataset of compositions of the same class as X. The number of endmembersgiven must not exceed the dimension of the space plus one.
... currently unused
Details
The convex combination is performed in the respective geometry. This means that, for rcompobjects, positivity of the result is only guaranteed with endmembers corresponding to extremal in-dividuals of the sample, or completely outside its hull. Note also that, in acomp geometry, theendmembers must necessarily be outside the hull.The main idea behind this functions is that the composition actually observed came from a convexcombination of some extremal compositions, specified by endmembers. Up to now, this is con-sidered as meaningful only in rplus geometry, and under some special circumstances, in rcompgeometry. It is not meaningful in terms of mass conservation in acomp and aplus geometries, be-cause these geometries do not preserve mass: whether such an operation has an interpretation isstill a matter of debate. In rcomp geometry, the convex combination is dependent on the units ofmeasurements, and will be completely different for volume and mass %. Even more, it is valid onlyif the whole composition is observed (!).
Value
The endmemberCoordinates functions give a rmult data set with the weights (a.k.a. barycentriccoordinates) allowing to build X as good as possible as a convex combination (a mixture) fromendmembers. The result is of class rmult because there is no guarantee that the resulting weightsare positive (although they sum up to one).The endmemberCoordinatesInv functions reconstruct the convex combination from the weights Kand the given endmembers. The class of endmembers determines the geometry chosen and the classof the result.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Shurtz, Robert F., 2003. Compositional geometry and mass conservation. Mathematical Geology35 (8), 972–937.
Examples
data(SimulatedAmounts)ep <- aplus(rbind(c(2,1,2),c(2,2,1),c(1,2,2)))# mix the endmembers in "ep" with weights given by "sa.lognormals"dat <- endmemberCoordinatesInv(acomp(sa.lognormals),acomp(ep))par(mfrow=c(1,2))
78 Firework
plot(dat)plot(acomp(ep),add=TRUE,col="red",pch=19)
# compute the barycentric coordinates of the mixture in the "end-member simplex"plot( acomp(endmemberCoordinates(dat,acomp(ep))))
dat <- endmemberCoordinatesInv(rcomp(sa.lognormals),rcomp(ep))plot(dat)plot( rcomp(endmemberCoordinates(dat,rcomp(ep))))
dat <- endmemberCoordinatesInv(aplus(sa.lognormals),aplus(ep))plot(dat)plot( endmemberCoordinates(dat,aplus(ep)))
dat <- endmemberCoordinatesInv(rplus(sa.lognormals),rplus(ep))plot(dat)plot(endmemberCoordinates(rplus(dat),rplus(ep)))
Firework Firework mixtures
Description
Data show two measured properties, brilliance and vorticity, of 81 girandoles composed of differentmixtures of five ingredients: a – e. Of these ingredients, a and b are the primary light-producting, cis principal propellant, and d and e are binding agents for c.
Usage
data(Firework)
Details
The data consist of 81 cases and 7 variables: ingredients a, b, c, d, and e, and the two measuredproperties brilliance and vorticity. The 81 different mixtures form a special experiment design.First the 81 possible quadruples formed from the three values -1, 0, 1 were arranged in ascendingorder. Then for each such quadruple z, the corresponding mixture x(z)=(a,b,c,d,e)=alrInv(z) iscomputed. Thus the No. 4 girandole corresponds to z=(-1,-1,0,-1) and so is composed of a mixturex=(.12,.12,.32,.12,.32) of the five ingredients. All 5-part mixtures sum up to one.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name YATQUAD.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
fitdirichlet 79
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 13, pp17.
fitdirichlet Fitting a Dirichlet distribution
Description
Fits a Dirichtlet Distribution to a dataset by maximum likelihood.
Usage
fitDirichlet(x,elog=mean(ult(x)),alpha0=rep(1,length(elog)),maxIter=20,n=nrow(x))
Arguments
x a dataset of compositions (acomp)
elog the expected log can provided instead of the dataset itself.
alpha0 the start value for alpha parameter in the iteration
maxIter The maximum number of iterations in the Fischer scoring method.
n the number of datapoints used to estimate elog
Details
The fitting is done using a modified version of the Fisher-Scoring method using analytiscal expres-sions for log mean and log variance. The modification is introducted to prevent the algorithm fromleaving the admissible parameter set. It reduced the stepsize to at most have of distance to the limitof the admissible parameter set.
Value
alpha the estimated parameter
loglikelihood the likelihood
df The dimension of the dataset minus the dimension of the parameter
Missing Policy
Up to now the fitting can not handle missings.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
80 fitSameMeanDifferentVarianceModel
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
rDirichlet, acompDirichletGOF.test, runif.acomp, rnorm.acomp,
Examples
x <- rDirichlet.acomp(100,c(1,2,3,4))fitDirichlet(x)
fitSameMeanDifferentVarianceModel
Fit Same Mean Different Variance Model
Description
Fits a model of the same mean, but different variances model to a set of several multivariate normalgroups by maximum likelihood.
Usage
fitSameMeanDifferentVarianceModel(x)
Arguments
x list of rmult type datasets
Details
The function tries to fit a normal model with different variances but the same mean between differentgroups.
Value
mean the estimated mean
vars a list of estimated variance-covariance matrices
N a vector containing the sizes of the groups
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
gausstest 81
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
acompNormalLocation.test
Examples
fitSameMeanDifferentVarianceModel
gausstest Classical Gauss Test
Description
One and two sample Gauss test for equal mean of normal random variates with known variance.
Usage
Gauss.test(x,y=NULL,mean=0,sd=1,alternative = c("two.sided", "less", "greater"))
Arguments
x a numeric vector providing the first datasety optional second datasetmean the mean to compare withsd the known standard deviationalternative the alternative to be used in the test
Details
The Gauss test is in every Text-Book, but not in R, because it is nearly never used. However it isincluded here for educational purposes.
Value
A classical "htest" object
data.name The name of the dataset as specifiedmethod a name for the test usedparameter the mean and variance provided to the testalternative an empty stringp.value The p.value computed for this test
82 geometricmean
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
t.test
Examples
x <- rnorm(100)y <- rnorm(100)Gauss.test(x,y)
geometricmean The geometric mean
Description
Computes the geometric mean.
Usage
geometricmean(x,...)geometricmeanRow(x,...)geometricmeanCol(x,...)gsi.geometricmean(x,...)gsi.geometricmeanRow(x,...)gsi.geometricmeanCol(x,...)
Arguments
x a numeric vector or matrix of data
... further arguments to compute the mean
Details
The geometric mean is defined as:
geometricmean(x) :=
(n∏i=1
xi
)1/n
The geometric mean is actually computed by exp(mean(log(c(unclass(x))),...)).
getdetectionlimit 83
Value
The geometric means of x as a whole (geometricmean), its rows (geometricmeanRow) or its columns(geometricmeanCol).
Missing Policy
The the first three functions take the geometric mean of all non-missing values. This is because theyshould yield a result in term of data analysis.
Contrarily, the gsi.* functions inherit the arithmetic IEEE policy of R through exp(mean(log(c(unclass(x))),...)).Thus, NA codes a not available i.e. not measured, NaN codes a below detection limit, and 0.0 codesa structural zero. If any of the elements involved is 0, NA or NaN the result is of the same type.Here 0 takes precedence over NA, and NA takes precedence over NaN. For example, if a structural0 appears, the geometric mean is 0 regardless of the presence of NaN’s or NA’s in the rest. Valuesbelow detection limit become NaN’s if they are coded as negative values.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
mean.rplus
Examples
geometricmean(1:10)geometricmean(c(1,0,NA,NaN)) # 0X <- matrix(c(1,NA,NaN,0,1,2,3,4),nrow=4)XgeometricmeanRow(X)geometricmeanCol(X)
getdetectionlimit Gets the detection limit stored in the data set
Description
The detection limit of those values below-detection-limit are stored as negative values in composi-tional dataset. This function extracts that information.
Usage
getDetectionlimit(x,dl=attr(x,"detectionlimit"))
Arguments
x a data set
dl a default to replace the information in the dataset
84 Glacial
Details
For a proper treatment of truncated data it would be necessary to know the detection limit even forobserved data. Unfortunately, there is no clear way to encode this information without annoying theuser.
Value
a matrix in the same shape as x, with a positive value (the detection limit) where available, and NAin the other cells.
Author(s)
K.Gerald van den Boogaart
References
Boogaart, K.G. v.d., R. Tolosana-Delgado, M. Bren (2006) Concepts for handling of zeros and miss-ing values in compositional data, in E. Pirard (ed.) (2006)Proceedings of the IAMG’2006 AnnualConference on "Quantitative Geology from multiple sources", September 2006, Liege, Belgium,S07-01, 4pages, ISBN 978-2-9600644-0-7, http://www.stat.boogaart.de/Publications/iamg06_s07_01.pdf
See Also
compositions.missings,zeroreplace
Examples
x <- c(2,-0.5,4,3,-0.5,5,BDLvalue,MARvalue,MNARvalue)getDetectionlimit(x)
Glacial Compositions and total pebble counts of 92 glacial tills
Description
In a pebble analysis of glacial tills, the total number of pebbles in each of 92 samples was countedand the pebbles were sorted into four categories red sandstone, gray sandstone, crystalline andmiscellaneous. The percentages of these four categories and the total pebble counts are recorded.
Usage
data(Glacial)
gof 85
Details
Percentages by weight in 92 samples of pebbles of glacial tills sorted into four categories, redsandstone, gray sandstone, crystalline and miscellaneous. The percentages of these four categoriesand the total pebbles counts are recorded. The glaciologist is interested in describing the pattern ofvariability of his data and whether the compositions are in any way related to abundance. All rowssum to 100, except for some rounding errors.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name GLACIAL.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison: The Statistical Analysis of Compositional Data, 1986, Data 18, pp21.
gof Compositional Goodness of fit test
Description
Goodness of fit tests for compositional data.
Usage
acompGOF.test(x,...)acompNormalGOF.test(x,...,method="etest")## S3 method for class 'formula'acompGOF.test(formula, data,...,method="etest")## S3 method for class 'list'acompGOF.test(x,...,method="etest")gsi.acompUniformityGOF.test(x,samplesize=nrow(x)*20,R=999)acompTwoSampleGOF.test(x,y,...,method="etest",data=NULL)
Arguments
x a dataset of compositions (acomp)
y a dataset of compositions (acomp)
samplesize number of observations in a reference sample specifying the distribution to com-pare with. Typically substantially larger than the sample under investigation
R The number of replicates to compute the distribution of the test statistic
86 gof
method Selecting a method to be used. Currently only "etest" for using an energy test issupported.
... further arguments to the methods
formula an anova model formula defining groups in the dataset
data unused
Details
The compositional goodness of fit testing problem is essentially a multivariate goodness of fit test.However there is a lack of standardized multivariate goodness of fit tests in R. Some can be foundin the energy-package.
In principle there is only one test behind the Goodness of fit tests provided here, a two sample testwith test statistic. ∑
ij k(xi, yi)√∑ij k(xi, xi)
∑ij k(yi, yi)
The idea behind that statistic is to measure the cos of an angle between the distributions in a scalarproduct given by
(X,Y ) = E[k(X,Y )] = E[
∫K(x−X)K(x− Y )dx]
where k and K are Gaussian kernels with different spread. The bandwith is actually the standardde-viation of k.The other goodness of fit tests against a specific distribution are based on estimating the parametersof the distribution, simulating a large dataset of that distribution and apply the two sample goodnessof fit test.
For the moment, this function covers: two-sample tests, uniformity tests and additive logistic nor-mality tests. Dirichlet distribution tests will be included soon.
Value
A classical "htest" object
data.name The name of the dataset as specified
method a name for the test used
alternative an empty string
replicates a dataset of p-value distributions under the Null-Hypothesis got from nonpara-metric bootstrap
p.value The p.value computed for this test
Missing Policy
Up to now the tests can not handle missings.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
gof 87
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
fitDirichlet,rDirichlet, runif.acomp, rnorm.acomp,
Examples
## Not run:x <- runif.acomp(100,4)y <- runif.acomp(100,4)
erg <- acompTwoSampleGOF.test(x,y)#continueergunclass(erg)erg <- acompGOF.test(x,y)
x <- runif.acomp(100,4)y <- runif.acomp(100,4)dd <- replicate(1000,acompGOF.test(runif.acomp(100,4),runif.acomp(100,4))$p.value)hist(dd)
dd <- replicate(1000,acompGOF.test(runif.acomp(20,4),runif.acomp(100,4))$p.value)hist(dd)dd <- replicate(1000,acompGOF.test(runif.acomp(10,4),runif.acomp(100,4))$p.value)
hist(dd)dd <- replicate(1000,acompGOF.test(runif.acomp(10,4),runif.acomp(400,4))$p.value)hist(dd)dd <- replicate(1000,acompGOF.test(runif.acomp(400,4),runif.acomp(10,4),bandwidth=4)$p.value)hist(dd)
dd <- replicate(1000,acompGOF.test(runif.acomp(20,4),runif.acomp(100,4)+acomp(c(1,2,3,1)))$p.value)
hist(dd)
# test uniformity
attach("gsi") # the uniformity test is only available as an internal functionx <- runif.acomp(100,4)gsi.acompUniformityGOF.test.test(x)
dd <- replicate(1000,gsi.acompUniformityGOF.test.test(runif.acomp(10,4))$p.value)hist(dd)detach("gsi")
88 groupparts
## End(Not run)
groupparts Group amounts of parts
Description
Groups parts by amalgamation or balancing of their amounts or proportions.
Usage
groupparts(x,...)## S3 method for class 'acomp'groupparts(x,...,groups=list(...))## S3 method for class 'rcomp'groupparts(x,...,groups=list(...))## S3 method for class 'aplus'groupparts(x,...,groups=list(...))## S3 method for class 'rplus'groupparts(x,...,groups=list(...))## S3 method for class 'ccomp'groupparts(x,...,groups=list(...))
Arguments
x an amount/compositional dataset
... further parameters to use (actually ignored)
groups a list of numeric xor character vectors, each giving a group of parts
Details
In the real geometry grouping is done by amalgamation (i.e. adding the parts). In the Aitchison-geometry grouping is done by taking geometric means. The new parts are named by named formalarguments. Not-mentioned parts remain ungrouped.
Value
a new dataset of the same type with each group represented by a single column
Missing Policy
For the real geometries, SZ and BDL are considered as 0, and MAR and MNAR are kept as missingof the same type. For the relative geometries, a BDL is a special kind of MNAR, whereas a SZis qualitatively different (thus a balance with a SZ has no sense). MAR values transfer their MARproperty to the resulting new variable.
Hongite 89
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Egozcue, J.J. and V. Pawlowsky-Glahn (2005) Groups of Parts and their Balances in CompositionalData Analysis, Mathematical Geology, in press
See Also
aplus
Examples
data(SimulatedAmounts)plot(groupparts(acomp(sa.lognormals5),A=c(1,2),B=c(3,4),C=5))plot(groupparts(aplus(sa.lognormals5),B=c(3,4),C=5))plot(groupparts(rcomp(sa.lognormals5),A=c("Cu","Pb"),B=c(2,5)))hist(groupparts(rplus(sa.lognormals5),1:5))
Hongite Compositions of 25 specimens of hongite
Description
A mineral compositions of 25 rock specimens of hongite type. Each composition consists of thepercentage by weight of five minerals, albite, blandite, cornite, daubite, endite.
Usage
data(Hongite)
Details
A mineral compositions of 25 rock specimens of hongite type. Each composition consists of the per-centage by weight of five minerals, albite, blandite, cornite, daubite, endite, which we convenientlyabbreviate to A, B, C, D, E. All row sums are equal to 100, except for rounding errors.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name HONGITE.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
90 HotellingsTsq
References
Aitchison (1986): The Statistical Analysis of Compositional Data; (Data 1), pp2, 9.
HotellingsTsq Hotellings T square distribution
Description
The Hotellings T square distribution is the distribution of the squared Mahalanobis distances withrespected to estimated variance covariance matrices.
Usage
qHotellingsTsq(p,n,m)pHotellingsTsq(q,n,m)
Arguments
p a (vector of) probabilities
q a vector of quantils
n number of parameters, the p parameter of Hotellings T 2 distribution
m number of dimensions, the m parameter of the Hotellings T 2 distribution
Details
The Hotellings T 2 with paramter p and m is the distribution empirical squared Mahalanobis dis-tances of a m dimensional vector with respect to a variance covariance matrix estimated based onnp degrees of freedom.
Value
qHotellingsT2 a vector of quantils
pHotellingsT2 a vector of probabilities
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
ellipses, ConfRadius, pf
Examples
(q <- qHotellingsTsq(seq(0,0.9,by=0.1),3,25))pHotellingsTsq(q,3,25)
HouseholdExp 91
HouseholdExp Household Expenditures
Description
Household budget survey data on month expenditures of twenty men and twenty women for fourcommodity groups: housing, foodstuffs, other, and services. Amounts in HK\$ are given. There are40 cases and 5 variables for 4 commodity groups and sex.
Usage
data(HouseholdExp)
Details
In a sample survey of people living alone in a rented accommodation, twenty men and twentywomen were randomly selected and asked to record over a period of one month their expenditures onthe following four mutually exclusive and exhaustive commodity groups: Housing, including fueland lights, Foodstuffs, including alcohol and tobacco, Other goods, including clothing, footwearand durable goods, and Services, including transport and vehicles. Amounts in HK\$ are given.There are 40 cases, 20 men and 20 women and 5 variables: 4 for commodity groups Housing,Food, Other, Services and the fifth sex, 1 for men, $-1$ for women. Note that the data has no sumconstraint.
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name HEMF.DAT, here in-cluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison (1986): The Statistical Analysis of Compositional Data, (Data 08), pp13.
Hydrochem Hydrochemical composition data set of Llobregat river basin water(NE Spain)
Description
Contains a hydrochemical amount/compositional data set obtained from several rivers in the Llo-bregat river basin, in northeastern Spain.
Usage
data(Hydrochem)
92 Hydrochem
Format
Data matrix with 485 cases and 19 variables.
Details
This hydrochemical data set contains measurements of 14 components, H, Na, K, Ca, Mg, Sr,Ba, NH4, Cl, HCO3, NO3, SO4, PO4, TOC. From them, hydrogen was derived by inverting therelationship between its stable form in water, H3O+, and pH. Details can be found in Otero et al.(2005). Each of these parameters is measured approximately once each month during 2 years in31 stations, placed along the rivers and main tributaries of the Llobregat river, one of the mediumrivers in northeastern Spain.
The Llobregat river drains an area of 4948.2 km2, and it is 156.6 km long, with two main tribu-taries, Cardener and Anoia. The headwaters of Llobregat and Cardener are in a rather unpollutedarea of the Eastern Pyrenees. Mid-waters these rivers flow through a densely populated and indus-trialized area, where potash mining activity occurs and there are large salt mine tailings stored withno water proofing. There, the main land use is agriculture and stockbreeding. The lower courseflows through one of the most densely populated areas of the Mediterranean region (around the cityof Barcelona) and the river receives large inputs from industry and urban origin, while intensiveagriculture activity is again present in the Llobregat delta. Anoia is quite different. Its headwatersare in an agricultural area, downwaters it flows through an industrialized zone (paper mills, tan-nery and textile industries), and near the confluence with Llobregat the main land use is agricultureagain, mainly vineyards, with a decrease in industry and urban contribution. Given this variety ingeological background and human activities, the sample has been splitted in four groups (higherLlobregat course, Cardener, Anoia and lower Llobregat course), which in turn are splitted into mainriver and tributaries (Otero et al, 2005). Information on these groupings, the sampling locations andsampling time is included in 5 complementary variables.
Author(s)
Raimon Tolosana-Delgado
Source
The dataset is also accessible in Otero et al. (2005), and are here included under the GNU PublicLibrary Licence Version 2 or newer.
References
Otero, N.; R. Tolosana-Delgado, A. Soler, V. Pawlowsky-Glahn and A. Canals (2005). Relative vs.absolute statistical analysis of compositions: A comparative study of surface waters of a Mediter-ranean river. Water Research, 39(7): 1404-1414. doi: 10.1016/j.watres.2005.01.012
Tolosana-Delgado, R.; Otero, N.; Pawlowsky-Glahn, V.; Soler, A. (2005). Latent CompositionalFactors in The Llobregat River Basin (Spain) Hydrogeochemistry. Mathematical Geology 37(7):681-702.
Examples
data(Hydrochem)biplot(princomp(rplus(Hydrochem)))
idt 93
biplot(princomp(rcomp(Hydrochem)))
biplot(princomp(aplus(Hydrochem)))biplot(princomp(acomp(Hydrochem)))
idt Isometric default transform
Description
Compute the isometric default transform of a vector (or dataset) of compositions or amounts in theselected class.
Usage
idt(x,...)## Default S3 method:
idt( x,... )## S3 method for class 'acomp'
idt( x ,...)## S3 method for class 'rcomp'
idt( x ,...)## S3 method for class 'aplus'
idt( x ,...)## S3 method for class 'rplus'
idt( x ,...)## S3 method for class 'rmult'
idt( x ,...)## S3 method for class 'ccomp'
idt( x ,...)## S3 method for class 'factor'
idt( x ,...)idtInv(x,orig,...)## Default S3 method:
idtInv( x ,orig,...)## S3 method for class 'acomp'
idtInv( x ,orig,...)## S3 method for class 'rcomp'
idtInv( x ,orig,...)## S3 method for class 'aplus'
idtInv( x ,orig,...)## S3 method for class 'rplus'
idtInv( x ,orig,...)## S3 method for class 'ccomp'
idtInv( x ,orig,...)## S3 method for class 'rmult'
idtInv( x ,orig,...)
94 idt
Arguments
x a classed amount or composition, to be transformed with its isometric defaulttransform, or its inverse
... generic arguments past to underlying functions
orig a compositional object which should be mimicked by the inverse transforma-tion. It is used to determine the backtransform to be used, and eventually toreconstruct the names of the parts. It is the generic argument. Typically theorig-argument is the dataset that has been transformed in the first place.
Details
The general idea of this package is to analyse the same data with different geometric concepts, ina fashion as similar as possible. For each of the four concepts there exists an isometric transformexpressing the geometry in a full-rank euclidean vector space. Such a transformation is computedby idt. For acomp the transform is ilr, for rcomp it is ipt, for aplus it is ilt, and for rplus it isiit. Keep in mind that the transform does not keep the variable names, since there is no guaranteedone-to-one relation between the original parts and each transformed variable.The inverse idtInv is intended to allow for an "easy" and automatic back-transformation, withoutintervention of the user. The argument orig (the one determining the behaviour of idtInv as ageneric function) tells the function which back-transformation should be applied, and gives thecolumn names of orig to the back-transformed values of x. Therefore, it is very conventient to givethe original classed data set used in the analysis as orig.
Value
A corresponding matrix of row-vectors containing the transforms.
Author(s)
R. Tolosana-Delgado, K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
cdt, ilr, ipt, ilt, cdtInv, ilrInv, iptInv, iltInv, iitInv
Examples
## Not run:# the idt is defined byidt <- function(x) UseMethod("idt",x)idt.default <- function(x) x
iit 95
idt.acomp <- function(x) ilr(x)idt.rcomp <- function(x) ipt(x)idt.aplus <- iltidt.rplus <- iit
## End(Not run)idt(acomp(1:5))idt(rcomp(1:5))
data(Hydrochem)x = Hydrochem[,c("Na","K","Mg","Ca")]y = acomp(x)z = idt(y)y2 = idtInv(z,y)par(mfrow=c(2,2))for(i in 1:4){plot(y[,i],y2[,i])}
iit Isometric identity transform
Description
Compute the isometric identity transform of a vector (dataset) of amounts and its inverse.
Usage
iit( x ,...)iitInv( z ,... )
Arguments
x a vector or data matrix of amounts
z the iit-transform of a vector or data.matrix of iit-transforms of amounts
... generic arguments, to pass to other functions.
Details
The iit-transform maps D amounts (considered in a real geometry) isometrically to a D dimensonaleuclidian vector. The iit is part of the rplus framework. Despite its trivial operation, it is presentto achieve maximal analogy between the aplus and the rplus framework.The data can then be analysed in this transformated space by all classical multivariate analysistools. The interpretation of the results is easy since the relation to the original variables is pre-served. However results may be inconsistent, since the multivariate analysis tools disregard thepositivity condition and the inner laws of amounts.
The isometric identity transform is a simple identity given by
iit(x)i := xi
96 ilr
Value
ilt gives the isometric identity transform, i.e. simply the input stripped of the "rplus" class attribute,iptInv gives amounts with class "rplus" with the given iit, i.e. simply the argument checked to bea valid "rplus" object, and with this class attribute.
Note
iit can be used to unclass amounts.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
ilt, ilr, rplus
Examples
(tmp <- iit(c(1,2,3)))iitInv(tmp)iitInv(tmp) - c(1,2,3) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(iit(cdata))
ilr Isometric log ratio transform
Description
Compute the isometric log ratio transform of a (dataset of) composition(s), and its inverse.
Usage
ilr( x , V = ilrBase(x) ,...)ilrInv( z , V = ilrBase(z=z),..., orig=NULL)
ilr 97
Arguments
x a composition, not necessarily closed
z the ilr-transform of a composition
V a matrix, with columns giving the chosen basis of the clr-plane
... generic arguments. not used.
orig a compositional object which should be mimicked by the inverse transformation.It is especially used to reconstruct the names of the parts.
Details
The ilr-transform maps a composition in the D-part Aitchison-simplex isometrically to a D-1 di-mensonal euclidian vector. The data can then be analysed in this transformation by all classicalmultivariate analysis tools. However the interpretation of the results may be difficult, since there isno one-to-one relation between the original parts and the transformed variables.
The isometric logratio transform is given by
ilr(x) := V tclr(x)
with clr(x) the centred log ratio transform and V ∈ Rd×(d−1) a matrix which columns form anorthonormal basis of the clr-plane. A default matrix V is given by ilrBase(D).
Value
ilr gives the isometric log ratio transform, ilrInv gives closed compositions with the given ilr-transforms
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Egozcue J.J., V. Pawlowsky-Glahn, G. Mateu-Figueras and C. Barcel’o-Vidal (2003) Isometric lo-gratio transformations for compositional data analysis. Mathematical Geology, 35(3) 279-300Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003http://ima.udg.es/Activitats/CoDaWork03
See Also
clr,alr,apt, ilrBase
98 ilrBase
Examples
(tmp <- ilr(c(1,2,3)))ilrInv(tmp)ilrInv(tmp) - clo(c(1,2,3)) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(ilr(cdata))ilrBase(D=3)
ilrBase The canonical basis in the clr plane used for ilr and ipt transforms.
Description
Compute the basis of a clr-plane, to use with isometric log-ratio or planar transform of a (datasetof) compositions.
Usage
ilrBase( x=NULL , z=NULL , D = NULL, method = "basic" )
Arguments
x optional dataset or vector of compositions
z optional dataset or vector containing ilr or ipt coordinates
D number of parts of the simplex
method method to build the basis, one of "basic", "balanced", "optimal" "PBhclust","PBmaxvar" or "PBangprox"
Details
Method "basic" computes a triangular Helmert matrix (corresponding to the original ilr transforma-tion defined by Egozcue et al, 2003). In this case, ilrBase is a wrapper catching the answers ofgsi.ilrBase and is to be used as the more convenient function.
Method "balanced" returns an ilr matrix associated with a balanced partition, splitting the parts ingroups as equal as possible. Transforms ilr and ipt computed with this basis are less affected byany component (as happens with "basic").
The following methods are all data-driven and will fail if x is not given. Some of these methods areextended to non-acomp datasets via the cpt general functionality. Use with care with non-acompobjects!
Method "optimal" is a wrapper to gsi.optimalilrBase, providing the ilr basis with less influ-ence of missing values. It is computed as a hierarchical cluster of variables, with parts previouslytransformed to 1 (if the value is lost) or 0 (if it is recorded).
ilt 99
Methods "PBhclust", "PBmaxvar" and "PBangprox" are principal balance methods (i.e. balancesapproximating principal components in different ways). These are all resolved by calls to gsi.PrinBal.Principal balances functionality should be considered beta!
Value
All methods give a matrix containing by columns the basis elements for the canonical basis ofthe clr-plane used for the ilr and ipt transform. Only one of the arguments x, z or D is needed todetermine the dimension of the simplex.
References
Egozcue J.J., V. Pawlowsky-Glahn, G. Mateu-Figueras and C. Barcel’o-Vidal (2003) Isometric lo-gratio transformations for compositional data analysis. Mathematical Geology, 35(3) 279-300
http://ima.udg.es/Activitats/CoDaWork03
See Also
clr,ilr,ipt
Examples
ilr(c(1,2,3))ilrBase(D=2)ilrBase(c(1,2,3))ilrBase(z= ilr(c(1,2,3)) )round(ilrBase(D=7),digits= 3)ilrBase(D=7,method="basic")ilrBase(D=7,method="balanced")
ilt Isometric log transform
Description
Compute the isometric log transform of a vector (dataset) of amounts and its inverse.
Usage
ilt( x ,...)iltInv( z ,... )
Arguments
x a vector or data matrix of amountsz the ilt-transform of a vector or data matrix of ilt-transforms of amounts... generic arguments, not used.
100 ipt
Details
The ilt-transform maps D amounts (considered in log geometry) isometrically to a D dimensionaleuclidean vector. The ilt is part of the aplus framework.The data can then be analysed in this transformation by all classical multivariate analysis tools. Theinterpretation of the results is easy since the relation to the original variables is preserved.
The isometric log transform is given by
ilt(x)i := lnxi
Value
ilt gives the isometric log transform, i.e. simply the log of the argument, whereas iltInv givesamounts with the given ilt, i.e. simply the exp of the argument.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
ilr, iit, aplus
Examples
(tmp <- ilt(c(1,2,3)))iltInv(tmp)iltInv(tmp) - c(1,2,3) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(ilt(cdata))
ipt Isometric planar transform
Description
Compute the isometric planar transform of a (dataset of) composition(s) and its inverse.
ipt 101
Usage
ipt( x , V = ilrBase(x),... )iptInv( z , V = ilrBase(z=z),...,orig=NULL)uciptInv( z , V = ilrBase(z=z),...,orig=NULL )
Arguments
x a composition or a data matrix of compositions, not necessarily closed
z the ipt-transform of a composition or a data matrix of ipt-transforms of compo-sitions
V a matrix with columns giving the chosen basis of the clr-plane
... generic arguments. not used.
orig a compositional object which should be mimicked by the inverse transformation.It is especially used to reconstruct the names of the parts.
Details
The ipt-transform maps a composition in the D-part real-simplex isometrically to a D-1 dimensonaleuclidian vector. Although the transformation does not reach the whole RD−1, resulting covariancematrices are typically of full rank.The data can then be analysed in this transformation by all classical multivariate analysis tools.However, interpretation of results may be difficult, since the transform does not keep the variablenames, given that there is no one-to-one relation between the original parts and each transformedvariables. See cpt and apt for alternatives.
The isometric planar transform is given by
ipt(x) := V tcpt(x)
with cpt(x) the centred planar transform and V ∈ Rd×(d−1) a matrix which columns form anorthonormal basis of the clr-plane. A default matrix V is given by ilrBase(D)
Value
ipt gives the centered planar transform, iptInv gives closed compositions with with the givenipt-transforms, uciptInv unconstrained iptInv does the same as iptInv but sets illegal values to NArather than giving an error. This is a workaround to allow procedures not honoring the constraintsof the space.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
102 is.acomp
See Also
ilr,ilrBase, cpt
Examples
(tmp <- ipt(c(1,2,3)))iptInv(tmp)iptInv(tmp) - clo(c(1,2,3)) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(ipt(cdata))
is.acomp Check for compositional data type
Description
is.XXX returns TRUE if and only if its argument is of type XXX
Usage
is.acomp(x)is.rcomp(x)is.aplus(x)is.rplus(x)is.rmult(x)is.ccomp(x)
Arguments
x any object to be checked
Details
These functions only check for the class of the object.
Value
TRUE or FALSE
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
acomp, rcomp aplus, rplus
IsMahalanobisOutlier 103
Examples
is.acomp(1:3)is.acomp(acomp(1:3))is.rcomp(acomp(1:3))is.acomp(acomp(1:3)+acomp(1:3))
IsMahalanobisOutlier Checking for outliers
Description
Detect outliers with respect to a normal distribution model.
Usage
IsMahalanobisOutlier(X,...,alpha=0.05,goodOnly=NULL,replicates=1000,corrected=TRUE,robust=TRUE,crit=NULL)
Arguments
X a dataset (e.g. given as acomp, rcomp, aplus, rplus or rmult) object to which idtand MahalanobisDist apply.
... further arguments to MahalanobisDist/gsi.mahOutlier
alpha The confidence level for identifying outliers.
goodOnly an integer vector. Only the specified index of the dataset should be used forestimation of the outlier criteria. This parameter if only a small portion of thedataset is reliable.
replicates The number of replicates to be used in the Monte Carlo simulations for deter-mination of the quantiles. The replicates not given a minimum is computedfrom the alpha level to ensure reasonable precission.
corrected logical. Literatur often proposed to compare the Mahalanobis distances withChisq-Approximations of there distributions. However this does not correct formultiple testing. If corrected is true a correction for multiple testing is used. Inany case we do not use the chisq-approximation, but a simulation based proce-dure to compute confidence bounds.
robust A robustness description as define in robustnessInCompositions
crit The critical value to be used. Typically the routine is called mainly for the pur-pose of finding this value, which it does, when crit is NULL, however sometimeswe might want to specifiy a value used by someone else to reproduce the results.
Details
See outliersInCompositions and robustnessInCompositions for a comprehensive introduction intothe outlier treatment in compositions.
See OutlierClassifier1 for a highlevel method to classify observations in the context of outliers.
104 isoPortionLines
Value
A logical vector giving for each element the result of the alpha-level test for beeing an outlier.TRUE corresponds to a significant result.
Note
For some unkown reasons the computation sometimes produces NaN’s. In this case a warning isissued and a recomputation is tried.
The package robustbase is required for using the robust estimations.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
OutlierClassifier1 , outlierplot, ClusterFinder1
Examples
## Not run:data(SimulatedAmounts)
datas <- list(data1=sa.outliers1,data2=sa.outliers2,data3=sa.outliers3,data4=sa.outliers4,data5=sa.outliers5,data6=sa.outliers6)
opar<-par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))tmp<-mapply(function(x,y){plot(x,col=ifelse(IsMahalanobisOutlier(x),"red","gray"))
title(y)},datas,names(datas))
## End(Not run)
isoPortionLines Isoportion- and Isoproportion-lines
Description
Add lines of equal portion and proportion to ternary diagrams, to serve as reference axis.
Usage
isoPortionLines(...)## S3 method for class 'acomp'isoPortionLines(by=0.2,at=seq(0,1,by=by),...,
parts=1:3,total=1,labs=TRUE,lines=TRUE,unit="")## S3 method for class 'rcomp'
isoPortionLines 105
isoPortionLines(by=0.2,at=seq(0,1,by=by),...,parts=1:3,total=1,labs=TRUE,lines=TRUE,unit="")
isoProportionLines(...)## S3 method for class 'acomp'isoProportionLines(by=0.2,at=seq(0,1,by=by),...,
parts=1:3,labs=TRUE,lines=TRUE)## S3 method for class 'rcomp'isoProportionLines(by=0.2,at=seq(0,1,by=by),...,
parts=1:3,labs=TRUE,lines=TRUE)
Arguments
... graphical arguments
at numeric in [0,1]: which portions/proportions should be marked?
by numeric in (0,1]: steps between protions/proportions
parts numeric vector subset of {1,2,3}: the variables to be marked
total the total amount to be used in labeling
labs logical: plot the labels?
lines logical: plot the lines?
unit mark of the units e.g. "%"
Details
Isoportion lines give lines of the same portion of one of the parts, while isoproportion line giveslines of the same ratio between two parts. The isoproportion lines are straight lines in both theAitchison and the real geometries of the simplex, while the isoportion lines are not straight in anAitchison sense (only in the real one). However, note that both types of lines remain straight in thereal sense when perturbed (von Eynatten et al., 2002).
Note
Currently IsoLines only works with individual plots. This is mainly due to the fact that I have noidea, what the user interface of this function should look like for multipanel plots. This includesphilosophical problems with the meaning of isoportions in case of marginal plots.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
von Eynatten, H., V. Pawlowsky-Glahn, and J.J. Egozcue (2002) Understanding perturbation on thesimplex: a simple method to better visualize and interpret compositional data in ternary diagrams.Mathematical Geology 34, 249-257
106 jura
See Also
plot.acomp
Examples
data(SimulatedAmounts)plot(acomp(sa.lognormals))isoPortionLines()plot(acomp(sa.lognormals),center=TRUE)isoPortionLines()plot(rcomp(sa.lognormals))isoPortionLines()plot(acomp(sa.lognormals))isoProportionLines()plot(acomp(sa.lognormals),center=TRUE)isoProportionLines()plot(rcomp(sa.lognormals))isoProportionLines()
jura The jura dataset
Description
A geochemical dataset from the Swiss Jura.
Usage
data(juraset)data(jura259)
Format
A 359x11 or 259x11 dataframe
Details
The JURA data set provided by J.-P. Dubois, IATE-Paedologie, Ecole Polytechnique Federale deLausanne, 1015 Lausanne, Switzerland. Spatial coordinates and values of categorial and continuousattributes at the 359 sampled sites. The 100 test locartions are denoted with a star. Rock Types: 1:Argovian, 2: Kimmeridgian, 3: Sequanian, 4: Portlandian, 5: Quaternary. Land uses: 1: Forest, 2:Pasture, 3: Meadow , 4: Tillage
X X location coordinateY Y location coordinateRock Categorical: rocktype,Land Categorical: land usageCd element amount,
kingTetrahedron 107
Cu element amount,Pb element amount,Co element amount,Cr element amount,Ni element amount,
All 3-part compositions sum to one.
Source
AI-Geostats
References
Atteia, O., Dubois, J.-P., Webster, R., 1994, Geostatistical analysis of soil contamination in theSwiss Jura: Environmental Pollution 86, 315-327
Webster, R., Atteia, O., Dubois, J.-P., 1994, Coregionalization of trace metals in the soil in the SwissJura: European Journal of Soil Science 45, 205-218
Examples
## Not run:data(juraset)X <- with(juraset,cbind(X,Y))comp <- acomp(juraset,c("Cd","Cu","Pb","Co","Cr"))lrv <- logratioVariogram(comp,X,maxdist=1,nbins=10)plot(lrv)
## End(Not run)
kingTetrahedron Ploting composition into rotable tetrahedron
Description
Plots acomp/rcomp objects into tetrahedron exported in kinemage format.
Usage
kingTetrahedron(X, parts=1:4, file="tmptetrahedron.kin",clu=NULL,vec=NULL, king=TRUE, scale=0.2, col=1,title="Compositional Tetrahedron")
108 kingTetrahedron
Arguments
X a compositional acomp or rcomp object of 4 or more parts
parts a numeric or character vector specifying the 4 parts to be used.
file file.kin for 3D display with the KiNG (Kinemage, Next Generation) interactivesystem for three-dimensional vector graphics.
clu partition determining the colors of points
vec vector of values determining points sizes
king FALSE for Mage; TRUE for King (described below)
scale relative size of points
col color of points if clu=NULL
title The title of the plot
Details
The routine transforms a 4 parts mixture m quadrays into 3-dimensional XYZ coordinates andwrites them as file.kin. For this transformation we apply K. Urner: Quadrays and XYZ at http://www.grunch.net/synergetics/quadxyz.html. The kin file we display as 3-D animation withKiNG viewer a free software available at http://kinemage.biochem.duke.edu. A kinemage isa dynamic, 3-D illustration. The best way to take advantage of that is by rotating it and twistingit around with the mouse click near the center of the graphics window and slowly draging right orleft, up or down. Furthermore by clicking on points with the mouse (left button again), the labelassociated with each point will appear in the bottom left of the graphics area and also the distancefrom this point to the last will be displayed. With the right button drag we can zoom in and out ofthe picture. This animation supports coloring and different sizing of points.
We can display the kin file as 3-D animation also with MAGE viewer a previous version of KiNG,also a free software available at http://kinemage.biochem.duke.edu. For this one has to put king=FALSEas a parameter.
Value
The function is called for its side effect of generating a file for 3D display with the KiNG (Kinemage,Next Generation) interactive system for three-dimensional vector graphics. Works only with KiNGviewer available at http://kinemage.biochem.duke.edu
Note
This routine and the documentation is based on mix.Quad2net from the MixeR-package of VladimirBatagelj and Matevz Bren, and has been contributed by Matevz Bren to this package. Only slightmodifications have been applied to make function compatible with the philosophy and objects ofthe compositions package.
Author(s)
Vladimir Batagelj and Matevz Bren, with slight modifications of K.Gerald van den Boogaart
Kongite 109
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
http://vlado.fmf.uni-lj.si/pub/MixeR/
http://www.grunch.net/synergetics/quadxyz.html
See Also
plot.acomp
Examples
data(SimulatedAmounts)dat <- acomp(sa.groups5)hc <- hclust(dist(dat), method = "complete") # data are clusteredkingTetrahedron(dat,parts=1:4, file="myfirst.kin", clu=cutree(hc,7),scale=0.2)# the 3-D plot is written into Glac1.kin file to be displayed with KiNG viewer.# The seven clusters partition is notated with different colors of points.
Kongite Compositions of 25 specimens of kongite
Description
A mineral compositions of 25 rock specimens of kongite type. Each composition consists of thepercentage by weight of five minerals, albite, blandite, cornite, daubite, endite.
Usage
data(Kongite)
Details
A mineral compositions of 25 rock specimens of hongite type. Each composition consists of the per-centage by weight of five minerals, albite, blandite, cornite, daubite, endite, which we convenientlyabbreviate to A, B, C, D, E. All row percentage sums to 100.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name HONGITE.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
110 lines
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, (Data 2), pp2.
lines Draws connected lines from point to point.
Description
Functions taking coordinates given in various ways and joining the corresponding points with linesegments.
Usage
## S3 method for class 'acomp'lines(x,...,steps=30,aspanel=FALSE)
## S3 method for class 'rcomp'lines(x,...,steps=30,aspanel=FALSE)
## S3 method for class 'aplus'lines(x,...,steps=30,aspanel=FALSE)
## S3 method for class 'rplus'lines(x,...,steps=30,aspanel=FALSE)
## S3 method for class 'rmult'lines(x,...,steps=30,aspanel=FALSE)
Arguments
x a dataset of the given type
... further graphical parameters
steps the number of discretisation points to draw the segments, which might be notvisually straight.
aspanel Logical, indicates use as slave to do acutal drawing only.
Details
The functions add lines to the graphics generated with the corresponding plot functions.Adding to multipaneled plots, redraws the plot completely and is only possible, when the plot hasbeen created with the plotting routines from this library.For the rcomp/rplus geometries the main problem is providing a function that reasonably workswith lines leaving the area. We tried to use a policy of cuting the line at the actual borders of the(high dimensional) simplex. That can lead to very strange visual impression showing lines endingsomewhere in the middle of the plot. However these lines actually hit some border of the simplexthat is not shown in the plot. A hyper dimensional tetrahedron is even more difficult to imagin thana hyperdimensional cube.
logratioVariogram 111
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
See Also
plot.acomp, straight
Examples
data(SimulatedAmounts)
plot(acomp(sa.lognormals))lines(acomp(sa.lognormals),col="red")lines(rcomp(sa.lognormals),col="blue")
plot(aplus(sa.lognormals[,1:2]))lines(aplus(sa.lognormals[,1:2]),col="red")lines(rplus(sa.lognormals)[,1:2],col="blue")
plot(rplus(sa.lognormals[,1:2]))tt<-aplus(sa.lognormals[,1:2]); ellipses(mean(tt),var(tt),r=2,col="red")tt<-rplus(sa.lognormals[,1:2]); ellipses(mean(tt),var(tt),r=2,col="blue")tt<-rmult(sa.lognormals[,1:2]); ellipses(mean(tt),var(tt),r=2,col="green")
logratioVariogram Empirical variograms for compositions
Description
Computes the matrix of logratio variograms.
Usage
logratioVariogram(comp,loc,maxdist=max(dist(loc))/2,nbins=20,dists=seq(0,maxdist,length.out=nbins+1),bins=cbind(dists[-length(dists)],dists[-1]),azimuth=0,azimuth.tol=180)
112 logratioVariogram
Arguments
comp an acomp compositional dataset
loc a matrix or dataframe providing the observation locations of the compositions.Any number of dimension >= 2 is supported.
maxdist the maximum distance to compute the variogram for.
nbins The number of distance bins to compute the variogram for
dists The distances seperating the bins
bins a matrix with lower and upper limit for the distances of each bin. A pair iscounted if min<h<=max. min and max are provided as columns. bins is com-puted from maxdist,nbins and dists. If it is provided, it is used directly.
azimuth For directional variograms the direction, either as an azimuth angle (i.e. a singlereal number) for 2D datasets or a unit vector pointing of the same dimension asthe locations. The angle is clockwise from North in degree.
azimuth.tol The angular tolerance it should be below 90 if a directional variogram is in-tended.
Details
The logratio-variogram is the set of variograms of each of the pairwise logratios. It can be proventhat it carries the same information as a usual multivariate variogram. The great advantage is thatall the funcitions have a direct interpreation and can be estimated even with (MAR) missings in thedataset.
Value
A list of class "logratioVariogram".
vg A nbins x D x D array containing the logratio variograms
h A nbins x D x D array containing the mean distance the value is computed on.
n A nbins x D x D array containing the number of nonmissing pairs used for thecorresponding value.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Tolosana, van den Boogaart, Pawlowsky-Glahn (2009) Estimating and modeling variograms ofcompositional data with occasional missing variables in R, StatGis09
Pawlowsky-Glahn, Vera and Olea, Ricardo A. (2004) Geostatistical Analysis of CompositionalData, Oxford University Press, Studies in Mathematical Geology
See Also
vgram2lrvgram, CompLinModCoReg, vgmFit
lrvgram 113
Examples
## Not run:data(juraset)X <- with(juraset,cbind(X,Y))comp <- acomp(juraset,c("Cd","Cu","Pb","Co","Cr"))lrv <- logratioVariogram(comp,X,maxdist=1,nbins=10)plot(lrv)
## End(Not run)
lrvgram vgram2lrvgram
Description
Transforms model functions for different types of compositional (logratio)(co)variograms.
Usage
cgram2vgram(cgram)vgram2lrvgram(vgram)
Arguments
cgram A (matrix valued) covariance function.
vgram A (matrix valued) variogram functions.
Details
The variogram is given by cgram(0)-cgram(h) and lrvgram(h)[,i,j]==vgram(h)[,i,i]+vgram(h)[,i,j]-2*vgram(h)[,i,j].
The logratio-variogram is the set of variograms of each of the pairwise logratios. It can be proventhat it carries the same information as a usual multivariate variogram. The great advantage is thatall the funcitions have a direct interpreation and can be estimated even with (MAR) missings in thedataset.
Value
A function that takes the same parameters as the input function (through a . . . parameterlist), but pro-vides the correponding variogram values (cgram2vgram) or logratio Variogram (vgram2lrvgram)values.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
114 MahalanobisDist
References
Tolosana, van den Boogaart, Pawlowsky-Glahn (2009) Estimating and modeling variograms ofcompositional data with occasional missing variables in R, StatGis09
See Also
logratioVariogram, CompLinModCoReg, vgmFit
Examples
data(juraset)comp <- acomp(juraset,c("Cd","Cu","Pb","Co","Cr"))vg <- CompLinModCoReg(~nugget()+sph(0.5)+R1*exp(0.7),comp)vg(1:3)vgram2lrvgram(vg)(1:3)
MahalanobisDist Compute Mahalanobis distances based von robust Estimations
Description
MahalanobisDist computes the Mahalanobis distances to the center or to other observations.
Usage
MahalanobisDist(x,center=NULL,cov=NULL,inverted=FALSE,...)## S3 method for class 'rmult'MahalanobisDist(x,center=NULL,cov=NULL,inverted=FALSE,...,
goodOnly=NULL,pairwise=FALSE,pow=1,robust=FALSE,giveGeometry=FALSE)## S3 method for class 'acomp'MahalanobisDist(x,center=NULL,cov=NULL,inverted=FALSE,...,
goodOnly=NULL, pairwise=FALSE,pow=1,robust=FALSE,giveGeometry=FALSE)
Arguments
x the dataset
robust logical or a robust method description (see robustnessInCompositions) speci-fiying how the center and covariance matrix are estimated,if not given.
... Further arguments to solve.
center An estimated for the center (mean) of the dataset. If center is NULL it will beestimated based using the given robust option.
cov An estimated for the spread (covariance matrix) of the dataset. If cov is NULLit will be estimated based using the given robust option.
inverted TRUE if the inverse of the covariance matrix is given.
MahalanobisDist 115
goodOnly An vector of indices to the columns of x that should be used for estimation ofcenter and spread.
pairwise If FALSE the distances to the center are returned as a vector. If TRUE thedistances between the cases are returned as a distance matrix.
pow The power of the Mahalanobis distance to be used. 1 correponds to the squareroot of the squared distance in transformed space, like it is defined in mostbooks. The choice 2 corresponds to what is implemented in many softwarepackage including the mahalanobis function in R.
giveGeometry If true an atrributes "center" and "cov" given the center and the idt-varianceused for the calculations.
Details
The Mahalanobis distance is the distance in a linearly transformed space, where the linear transfor-mation is selected in such a way,that the variance is the unit matrix. Thus the distances are given inmultiples of standard deviation.
Value
Either a vector of Mahalanobis distances to the center, or a distance matrix (like from dist) givingthe pairwise Mahalanobis distances of the data.
Note
Unlike the mahalanobis function this function does not be default compute the square of the ma-halanobis distance. The pow option is provided if the square is needed.The package robustbase is required for using the robust estimations.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
dist, OutlierClassifier1
Examples
data(SimulatedAmounts)data5 <- acomp(sa.outliers5)
cl <- ClusterFinder1(data5,sigma=0.4,radius=1)plot(data5,col=as.numeric(cl$types),pch=as.numeric(cl$types))legend(1,1,legend=levels(cl$types),xjust=1,col=1:length(levels(cl$types)),
pch=1:length(levels(cl$types)))
116 matmult
matmult inner product for matrices and vectors
Description
Multiplies two matrices, if they are conformable. If one argument is a vector, it will be coerced toeither a row or a column matrix to make the two arguments conformable. If both are vectors it willreturn the inner product.
Usage
x %*% y## Default S3 method:x %*% y
Arguments
x,y numeric or complex matrices or vectors
Details
This is a copy of the %*% function. The function is made generic to allow the definition of specificmethods.
Value
The matrix product. Uses ’drop’ to get rid of dimensions which have only one level.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
%*%.rmult
Examples
M <- matrix(c(0.2,0.1,0.0,0.1,0.2,0.0,0.0,0.0,0.2),byrow=TRUE,nrow=3)x <- c(1,1,2)M %*% xx %*% Mx %*% xM %*% Mt(x) %*% M
mean.acomp 117
mean.acomp Mean amounts and mean compositions
Description
Compute the mean in the several approaches of compositional and amount data analysis.
Usage
## S3 method for class 'acomp'mean(x,...,robust=getOption("robust"))
## S3 method for class 'rcomp'mean(x,...,robust=getOption("robust"))
## S3 method for class 'aplus'mean(x,...,robust=getOption("robust"))
## S3 method for class 'rplus'mean(x,...,robust=getOption("robust"))
## S3 method for class 'ccomp'mean(x,...,robust=getOption("robust"))
## S3 method for class 'rmult'mean(x,...,na.action=NULL,robust=getOption("robust"))
Arguments
x a classed dataset of amounts or compositions
... further arguments to mean e.g. trim
na.action na.action
robust A description of a robust estimator. Possible values are FALSE or "pearson" forno robustness, or TRUE or "mcd" for a covMcd based robust location scale esti-mation. Additional control parameters such as list(trim=0.2) or an rrcov.controlobject can be given as an attribute "control".
Details
The different compositional approaches acomp, rcomp, aplus, rplus correpond to different ge-ometries. The mean is calculated in the respective canonical geometry by applying a canonicaltransform (see cdt), taking ordinary meanCol and backtransforming.
The Aitchison geometries imply that mean.acomp and mean.aplus are geometric means, the firstone closed. The real geometry implies that mean.rcomp and mean.rplus are arithmetic means, thefirst one resulting in a closed composition.
In all cases the mean is again an object of the same class.
118 meanrow
Value
The mean is given as a composition or amount vector of the same class as the original dataset.
Missing Policy
For the additive scales (rcomp,rplus) the SZ and BDT are treated as zeros and MAR and MNAR asmissing information. This is not strictly correct for MNAR.For relative scales (acomp,aplus), all four types of missings are treated as missing information. Thiscorresponds to the idea that BDT are truncated values (and have the correspoding effect in takingmeans). For SZ and MAR, only the components in the observed subcomposition are fully relevant.Finally, for MNAR the problem is again that nothing could be done without knowing the MNARmechanism, so the analysis is limited to taking them as MAR, and being careful with the interpreta-tion. Missing and Below Detecion Limit Policy is explained in more detail in compositions.missing.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
clo, meanCol, geometricmean, acomp, rcomp, aplus, rplus
Examples
data(SimulatedAmounts)meanCol(sa.lognormals)mean(acomp(sa.lognormals))mean(rcomp(sa.lognormals))mean(aplus(sa.lognormals))mean(rplus(sa.lognormals))mean(rmult(sa.lognormals))
meanrow The arithmetic mean of rows or columns
Description
Computes the arithmetic mean.
Usage
meanRow(x,..., na.action=get(getOption("na.action")))meanCol(x,..., na.action=get(getOption("na.action")))
Metabolites 119
Arguments
x a numeric vector or matrix of data
... arguments to mean
na.action The na.action to be used: one of na.omit,na.fail,na.pass
Details
Computes the arithmetic means of the rows (meanRow) or columns (meanCol) of x.
Value
The arithmetic means of the rows (meanRow) or columns (meanCol) of x.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
mean.rplus
Examples
data(SimulatedAmounts)meanCol(sa.tnormals)meanRow(sa.tnormals)
Metabolites Steroid metabolite patterns in adults and children
Description
Data shows the urinary excretion (mg/24 hours) of 37 normal adults and 30 normal children of totalcortisol meatbolites, total corticosterone meatbolites, total pregnanetriol and ∆-5-pregnentriol.
Usage
data(Metabolites)
Details
There are 67 cases for 37 adults and 30 children, and 5 columns: Case no., met1, met2, met3 andType, 1 for adults, $-1$ for children. No sum constraint is placed on this data set: since the urinaryexcretion in mg for 24 hours are given.
120 missing.compositions
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name METABOL.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, (Data 9), pp14.
missing.compositions The policy of treatment of missing values in the "compositions" pack-age
Description
This help section discusses some general strategies of working with missing valuess in a composi-tional, relative or vectorial context and shows how the various types of missings are represented andtreated in the "compositions" package, according to each strategy/class of analysis of compositionsor amounts.
Usage
is.BDL(x,mc=attr(x,"missingClassifier"))is.SZ(x,mc=attr(x,"missingClassifier"))is.MAR(x,mc=attr(x,"missingClassifier"))is.MNAR(x,mc=attr(x,"missingClassifier"))is.NMV(x,mc=attr(x,"missingClassifier"))is.WMNAR(x,mc=attr(x,"missingClassifier"))is.WZERO(x,mc=attr(x,"missingClassifier"))has.missings(x,...)## Default S3 method:has.missings(x,mc=attr(x,"missingClassifier"),...)## S3 method for class 'rmult'has.missings(x,mc=attr(x,"missingClassifier"),...)SZvalueMARvalueMNARvalueBDLvalue
Arguments
x A vector, matrix, acomp, rcomp, aplus, rplus object for which we would like toknow the missing status of the entries
missing.compositions 121
mc A missing classifier function, giving for each value one of the values BDL (Be-low Detection Limit), SZ (Structural Zero), MAR (Missing at random), MNAR(Missing not at random), NMV (Not missing value) This functions are intro-duced to allow a different coding of the missings.
... further generic arguments
Details
In the context of compositional data we have to consider at least four types of missing and zerovalues:
• MAR (Missing at random) coded by NaN, the amount was not observed or is otherwise miss-ing, in a way unrelated to its actual value. This is the "nice" type of missing.
• MNAR(Missing not at random) coded by NA, the amount was not observed or is otherwisemissing, but it was missed in a way stochastically dependent on its actual value.
• BDL(Below detection limit) coded by 0.0 or a negative number giving the detection limit; theamount was observed but turned out to be below the detection limit and was thus rounded tozero. This is an informative version of MNAR.
• SZ (Structural zero) coded by -Inf, the amount is absolutely zero due to structural reasons.E.g. a soil sample was dried before the analysis, or the sample was preprocessed so that thefraction is removed. Structural zeroes are mainly treated as MAR even though they are a kindof MNAR.Based on these basic missing types, the following extended types are defined:
• NMV(Not Missing Value) coded by a real number, it is just an actually-observed value.
• WMNAR(Wider MNAR) includes BDL and MNAR.
• WZERO(Wider Zero) includes BDL and SZ
Each function of type is.XXX checks the status of its argument according to the XXX type of valuefrom those above.
Different steps of a statistical analysis and different understanding of the data will lead to differentapproaches with respect to missings and zeros.In the first exploratory step, the problem is to keep the methods working and to make the miss-ing structure visible in the analysis. The user should need as less as possible extra thinking aboutmissings, an get nevertheless a true picture of the data. To achieve this we tried to make the basiclayer of computational functions working consitently with missings and propagating the missing-ness character seamlessly. However some of this only works with acomp, where a closed formmissing theories are available (e.g. proportional imputation [e.g. Mart\’in-Fern\’andez, J.A. etal.(2003)]or estimation with missings [Boogaart&Tolosana 2006]). The main graphics should hinttowards missing and try to add missings to the plot by marking the remaining informaion on theaxes. However one again should be clear that this is only reasonably justified in the relative ge-ometries. Unfortunatly the missing subsystem is currently not fully compatible with the robustnesssubsystem.As a second step, the analyst might want to analyse the missing structure for itself. This is prelimi-narly provided by these functions, since their result can be treated as a boolean data set in any otherR function. Additionally a missingSummary provides some a convenience function to provide afast overview over the different types of missings in the dataset.
122 missing.compositions
In the later inferential steps, the problem is to get results valid with respect to a model. One needs tobe able to look through the data on the true processes behind, without being distracted by artifactsstemming from missing values. For the moment, how analyses react to the presence of missingsdepend on the value of the na.action option. If this is set to na.omit (the default), then cases withmissing values on any variable are completely ignored by the analysis. If this is set to na.pass, thensome of the following applies.The policy on how a missing value is to be introduced into the analysis depends on the purpose ofthe analysis, the type of analysis and the model behind. With respect to this issue this package andprobabily the whole science of compositional data analysis is still very preliminary.The four philosophies work with different approaches to these problems:
• rplus For positive real vectors, one can either identify BDL with a true 0 or impute a valuerelative to the detection limit, with a function like zeroreplace. A structural zero can eitherbe seen as a true zero or as a MAR value.
• rcomp and acomp For these relative geometries, a true zero is an alien. Thus a BDL is nothingelse but a small unkown value. We could either decide to replace the value by an imputation,or go through the whole analysis keeping this lack of information in mind. The main problemof imputation is that by closing to 1, the absolute value of the detection limit is lost, andthe detection limit can correspond to very different portions. Raw differences between all,observed or missed, components (the ground of the rcomp geometry) are completely distortedby the replacement. Contrarily, log-ratios between observed components do not change butratios between missed components dramatically depend on the replacement, e.g. typically thecontent of gold is some orders of magnitude smaller than the contend of silver even arounda gold deposit, but far away from the deposit they both might be far under detection limit,leading to a ratio of 1, just because nothing was observed. SZ in compositions might be eitherseen as defining two sub-populations, one fully defined and one where only a subcompositionis defined. But SZ can also very much be like an MAR, if only a subcomposition is measured.Thus, in general we can simply understand that only a subcomposition is available, i.e. aprojection of the true value onto a sub-space: for each observation, this sub-space might bedifferent. For MAR values, this approach is stricly valid, and yields unbiased estimations(because these projections are stochastically independent of the observed phenomenon). ForMNAR values, the projections depend on the actual value, which strictly speaking yieldsbiased estimations.
• aplus Imputation takes place by simple replacement of the value. However this can lead to adramatic change of ratios and should thus be used only with extra care, by the same reasonsexplained before.More information on how missings are actually processed can be found in the help files ofeach individual functions.
Value
A logical vector or matrix with the same shape as x stating wether or not the value is of the giventype of missing.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana Delgado, Matevz Bren
missingProjector 123
References
Boogaart, K.G. v.d., R. Tolosana-Delgado, M. Bren (2006) Concepts for handling of zeros andmissing values in compositional data, in E. Pirard (ed.) (2006)Proccedings of the IAMG’2006Annual Conference on "Quantitative Geology from multiple sources", September 2006, Liege, Bel-gium, S07-01, 4pages, http://stat.boogaart.de/Publications/iamg06_s07_01.pdf, ISBN:978-2-9600644-0-7
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003Billheimer, D., P. Guttorp, W.F. and Fagan (2001) Statistical interpretation of species composition,Journal of the American Statistical Association, 96 (456), 1205-1214
Mart\’in-Fern\’andez, J.A., C. Barcel\’o-Vidal, and V. Pawlowsky-Glahn (2003) Dealing With Ze-ros and Missing Values in Compositional Data Sets Using Nonparametric Imputation. Mathemati-cal Geology, 35(3) 253-278
See Also
compositions-package, missingsInCompositions, robustnessInCompositions, outliersInCompositions,zeroreplace, rmult, ilr, mean.acomp, acomp, plot.acomp
Examples
require(compositions) # load librarydata(SimulatedAmounts) # load data sa.lognormalsdat <- acomp(sa.missings)datvar(dat)mean(dat)plot(dat)boxplot(dat)barplot(dat)
missingProjector Returns a projector the the observed space in case of missings.
Description
Returns projectors on the observed subspace in the presence of missings.
124 missingProjector
Usage
missingProjector(x,...,by="s")## S3 method for class 'acomp'missingProjector(x,has=is.NMV(x),...,by="s")## S3 method for class 'aplus'missingProjector(x,has=is.NMV(x),...,by="s")## S3 method for class 'rcomp'missingProjector(x,has=!(is.MAR(x)|is.MNAR(x)),...,by="s")## S3 method for class 'rplus'missingProjector(x,has=!(is.MAR(x)|is.MNAR(x)),...,by="s")
Arguments
x a dataset or object of the given class
has a boolean matrix of the same size indicating nonmissing values
... additional arguments for generic purpose only
by the name of the dataset dimension on has for tensorial computation with tensorApackage
Details
See the references for details on that function.
Value
A dataset of N square matrices of dimension DxD (with N and D respectively equal to the numberof rows and columns in x). Each of these matrices gives the projection of a data row onto its ob-served sub-space.The function sumMissingProjector takes all these matrices and sums them, generating a "sum-mary" of observed sub-spaces. This matrix is useful to obtain estimates of the mean (and variance,in the future) still unbiased in the presence of lost values (only of type MAR, stricly-speaking, butanyway useful for any type of missing value, when used with care).
Author(s)
K.G.van den Boogaart
References
Boogaart, K.G. v.d. (2006) Concepts for handling of zeros and missing values in compositionaldata, in E. Pirard (ed.) (2006)Proccedings of the IAMG’2006 Annual Conference on "QuantitativeGeology from multiple sources", September 2006, Liege, Belgium, S07-01, 4pages, http://stat.boogaart.de/Publications/iamg06_s07_01.pdf
See Also
missingsInCompositions
missingsummary 125
Examples
data(SimulatedAmounts)x <- acomp(sa.lognormals)xnew <- simulateMissings(x,dl=0.05,MAR=0.05,MNAR=0.05,SZ=0.05)xnewplot(missingSummary(xnew))
missingProjector(acomp(xnew))missingProjector(rcomp(xnew))missingProjector(aplus(xnew))missingProjector(rplus(xnew))
missingsummary Classify and summarize missing values in a dataset
Description
Routines classifies codes of missing valuesas numbers in objects of the compositions package.
Usage
missingSummary(x,..., vlabs = colnames(x),mc=attr(x,"missingClassifier"),values=eval(formals(missingType)$values))
missingType(x,..., mc=attr(x,"missingClassifier"),values=c("NMV", "BDT", "MAR", "MNAR", "SZ", "Err"))
Arguments
x a dataset which might contain missings
... additional arguments for mc
mc optionally in missingSummary, an alternate routine to be used instead of missingType
vlabs labels for the variables
values the names of the different types of missings. "Err" is a value that can not beclassified e.g. Inf.
Details
The function mainly counts the various types of missing values.
Value
missingType returns a character vector/matrix with the same dimension and dimnames as x givingthe type of every value.missingSummary returns a table giving the number of missings of each type for each variable.
126 mix.Read
Author(s)
K. Gerald van den Boogaart
References
Boogaart, K.G., R. Tolosana-Delgado, M. Bren (2006) Concepts for the handling of zeros andmissings in compositional data, Proceedings of IAMG 2006, Liege
See Also
compositions.missing
Examples
data(SimulatedAmounts)x <- acomp(sa.lognormals)xnew <- simulateMissings(x,dl=0.05,MAR=0.05,MNAR=0.05,SZ=0.05)xnewmissingSummary(xnew)
mix.Read Reads a data file in a mixR format
Description
Reads a data file, which is formatted in a simple compositional file including the first row with title,the second with data labels and afterwards the matrix with the data itself. In the first column of thematrix are cases labels. This is the format used in the mixR package.
Usage
mix.Read(file,eps=1e-6)
Arguments
file a file name
eps the epsilon to be used for checking null values.
Details
The data files must have the adequate structure:
• 1the first row with a title of the data set,
• 2the second row with variables names
• 3the data set in a matrix, rows as cases, variables in columns with the firs colum comprisingcases labels.
mix.Read 127
A mixture object ’m’ consists of m\$tit the title, m\$mat the matrix with the data, m\$sum the valueof the rows total, if constant and m\$sta the status of the mixture object with values:
128 mvar
-2 - matrix contains negative elements,-1 - zero row sum exists,0 - matrix contains zero elements,1 - matrix contains positive elements, rows with different row sum(s),2 - matrix with constant row sum and3 - closed mixture, the row sums are all equal to 1.
Value
A mixture object as a data frame with a title, row total, if constant, status (-2, -1, 0, 1, 2 or 3 – seeabove) and class attributes and the data matrix.
See Also
read.geoeas read.geoEAS read.table
Examples
## Not run:mix.Read("GLACIAL.DAT")mix.Read("ACTIVITY.DAT")
## End(Not run)
mvar Metric summary statistics of real, amount or compositional data
Description
Compute the metric variance, covariance, correlation or standard deviation.
Usage
mvar(x,...)mcov(x,...)mcor(x,...)msd(x,...)## Default S3 method:mvar(x,y=NULL,...)## Default S3 method:mcov(x,y=x,...)## Default S3 method:mcor(x,y,...)## Default S3 method:msd(x,y=NULL,...)
mvar 129
Arguments
x a dataset, eventually of amounts or compositions
y a second dataset, eventually of amounts or compositions
... further arguments to var or cov. Typically a robust=TRUE argument. e.g. use
Details
The metric variance (mvar) is defined by the trace of the variance in the natural geometry of thedata, or also by the generalized variance in natural geometry. The natural geometry is equivalentlygiven by the cdt or idt transforms.
The metric standard deviation (msd) is not the square root of the metric variance, but the square rootof the mean of the eigenvalues of the variance matrix. In this way it can be interpreted in units ofthe original natural geometry, as the radius of a sperical ball around the mean with the same volumeas the 1-sigma ellipsoid of the data set.
The metric covariance (mvar) is the sum over the absolute singular values of the covariance of twodatasets in their respective geometries. It is always positive. The metric covariance of a dataset withitself is its metric variance. The interpretation of a metric covariance is quite difficult, but useful inregression problems.
The metric correlation (mcor) is the metric covariance of the datasets in their natural geometry nor-malized to unit variance matrix. It is a number between 0 and the smaller dimension of both naturalspaces. A number of 1 means perfect correlation in 1 dimension, but only partial correlations inhigher dimensions.
Value
a scalar number, informing of the degree of variation/covariation of one/two datasets.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Daunis-i-Estadella, J., J.J. Egozcue, and V. Pawlowsky-Glahn (2002) Least squares regression inthe Simplex on the simplex, Terra Nostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Pawlowsky-Glahn, V. and J.J. Egozcue (2001) Geometric approach to statistical analysis on thesimplex. SERRA 15(5), 384-398
See Also
var, cov, mean.acomp, acomp, rcomp, aplus, rplus
130 names
Examples
data(SimulatedAmounts)mvar(acomp(sa.lognormals))mvar(rcomp(sa.lognormals))mvar(aplus(sa.lognormals))mvar(rplus(sa.lognormals))
msd(acomp(sa.lognormals))msd(rcomp(sa.lognormals))msd(aplus(sa.lognormals))msd(rplus(sa.lognormals))
mcov(acomp(sa.lognormals5[,1:3]),acomp(sa.lognormals5[,4:5]))mcor(acomp(sa.lognormals5[,1:3]),acomp(sa.lognormals5[,4:5]))mcov(rcomp(sa.lognormals5[,1:3]),rcomp(sa.lognormals5[,4:5]))mcor(rcomp(sa.lognormals5[,1:3]),rcomp(sa.lognormals5[,4:5]))
mcov(aplus(sa.lognormals5[,1:3]),aplus(sa.lognormals5[,4:5]))mcor(aplus(sa.lognormals5[,1:3]),aplus(sa.lognormals5[,4:5]))mcov(rplus(sa.lognormals5[,1:3]),rplus(sa.lognormals5[,4:5]))mcor(rplus(sa.lognormals5[,1:3]),rplus(sa.lognormals5[,4:5]))
mcov(acomp(sa.lognormals5[,1:3]),aplus(sa.lognormals5[,4:5]))mcor(acomp(sa.lognormals5[,1:3]),aplus(sa.lognormals5[,4:5]))
names The names of the parts
Description
The names function provide a transparent way to access the names of the parts regardless of theshape of the dataset or data item.
Usage
## S3 method for class 'acomp'names(x)## S3 method for class 'rcomp'names(x)## S3 method for class 'aplus'names(x)## S3 method for class 'rplus'names(x)## S3 method for class 'rmult'names(x)## S3 method for class 'ccomp'names(x)## S3 replacement method for class 'acomp'
norm 131
names(x) <- value## S3 replacement method for class 'rcomp'names(x) <- value## S3 replacement method for class 'aplus'names(x) <- value## S3 replacement method for class 'rplus'names(x) <- value## S3 replacement method for class 'rmult'names(x) <- value## S3 replacement method for class 'ccomp'names(x) <- value
Arguments
x an amount/amount dataset
value the new names of the parts
... not used, only here for generics
Value
a character vector giving the names of the parts
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
aplus
Examples
data(SimulatedAmounts)tmp <- acomp(sa.lognormals)names(tmp)names(tmp) <- c("x","y","z")tmp
norm Vector space norm
Description
Each of the considered space structures has an associated norm, which is computed for each elementby these functions.
132 norm
Usage
## Default S3 method:norm(X,...)## S3 method for class 'acomp'norm(X,...)## S3 method for class 'rcomp'norm(X,...)## S3 method for class 'aplus'norm(X,...)## S3 method for class 'rplus'norm(X,...)## S3 method for class 'rmult'norm(X,...)
Arguments
X a dataset or a single vector of some type
... currently not used, intended to select a different norm rule in the future
Value
The norms of the given vectors.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
normalize
Examples
data(SimulatedAmounts)tmp <- acomp(sa.lognormals)mvar(tmp)sum(norm( tmp - mean(tmp) )^2)/(nrow(tmp)-1)
normalize 133
normalize Normalize vectors to norm 1
Description
Normalize vectors to norm 1.
Usage
normalize(x,...)## Default S3 method:normalize(x,...)
Arguments
x a dataset or a single vector of some type
... currently not used, intended to select a different norm in the future
Value
The vectors given, but normalized to norm 1.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
norm.rmult
Examples
data(SimulatedAmounts)normalize(c(1,2,3))normalize(acomp(c(1,2,3)))norm(normalize(acomp(sa.groups)))
134 NormalTests
NormalTests Compositional Goodness of fit test
Description
Tests for several groups of additive lognormally distributed compositions.
Usage
acompNormalLocation.test(x, g=NULL, var.equal=FALSE, paired=FALSE,R=ifelse(var.equal,999,0))
Arguments
x a dataset of compositions (acomp) or a list of such
g a factor grouping the data, not used if x is a list already
var.equal a boolean telling wether the variance of the groups should be considered equal
paired true if a paired test should be performed
R number of replicates that should be used to compute p-values. 0 means compar-ing the likelihood statistic with the correponding asymptotic chisq-distribution.
Details
The tests are based on likelihood ratio statistics.
Value
A classical "htest" object
data.name The name of the dataset as specified
method a name for the test used
alternative an empty string
replicates a dataset of p-value distributions under the Null-Hypothesis got from nonpara-metric bootstrap
p.value The p.value computed for this test
Missing Policy
Up to now the tests cannot handle missings.
Note
Do not trust the p-values obtained forcing var.equal=TRUE and R=0. This will include soon equiv-alent spread tests.
oneOrDataset 135
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
fitDirichlet,rDirichlet, runif.acomp, rnorm.acomp,
Examples
x <- runif.acomp(100,4)y <- runif.acomp(100,4)acompNormalLocation.test(list(x,y))
oneOrDataset Treating single compositions as one-row datasets
Description
A dataset is converted to a data matrix. A single data item (i.e. a simple vector) is converted to aone-row data matrix.
Usage
oneOrDataset(W,B=NULL)
Arguments
W a vector, matrix or dataframe
B an optional second vector, matrix or data frame having the intended number ofrows.
Value
A data matrix containing the same data as W. If W is a vector it is interpreded as a single row. If Bis given and length(dim(B))!= 2 and W is a vector, then W is repeated nrow(B) times.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
136 outlierclassifier
Examples
oneOrDataset(c(1,2,3))oneOrDataset(c(1,2,3),matrix(1:12,nrow=4))oneOrDataset(data.frame(matrix(1:12,nrow=4)))
outlierclassifier Detect and classify compositional outliers.
Description
Detects outliers and classifies them according to different possible explanations.
Usage
OutlierClassifier1(X,...)## S3 method for class 'acomp'OutlierClassifier1(X,...,alpha=0.05,
type=c("best","all","type","outlier","grade"),goodOnly=NULL,corrected=TRUE,RedCorrected=FALSE,robust=TRUE)
Arguments
X the dataset as an acomp object... further arguments to MahalanobisDist/gsi.mahOutlieralpha The confidence level for identifying outliers.type What type of classification should be used: best: Which single component
would best explain the outlier. all: Give a binary coding specifying all com-ponents, which could explain the outlier. type: Is it a a normal observation"ok", a single componentn outlier "1" or can it not be explained by a singlewrong component "?". outlier: All outliers are marked as "outlier", others aremarked as "ok". "grade": Proven Outliers are marked as "outlier"s, suspectedoutliers, detected without correction of the p-value are reported as "extreme",the rest is reported as "ok".
goodOnly an integer vector. Only the specified index of the dataset should be used forestimation of the outlier criteria. This parameter if only a small portion of thedataset is reliable.
corrected logical. Literatur often proposed to compare the Mahalanobis distances withChisq-Approximations of there distributions. However this does not correct formultiple testing. If corrected is true a correction for multiple testing is used. Inany case we do not use the chisq-approximation, but a simulation based proce-dure to compute confidence bounds.
RedCorrected logical. If an outlier is detected we can try to find out wether a single componentwould be sufficient to drop the outlier under the outlier detection limit. Since inthis second case we only check a few outliers no second correction step appliesas long as the number of outliers is not very high.
robust A robustness description as define in var.acomp
outlierclassifier 137
Details
See outliersInCompositions for a comprehensive introduction into the outlier treatment in compo-sitions.
See ClusterFinder1 for an alternative method to classify observations in the context of outliers.
Value
A factor classifying the observations in the dataset as "ok" or some type of outlier.
Note
The package robustbase is required for using the robust estimations.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
outlierplot, ClusterFinder1
Examples
## Not run:tmp<-set.seed(1400)A <- matrix(c(0.1,0.2,0.3,0.1),nrow=2)Mvar <- 0.1*ilrvar2clr(A%*%t(A))Mcenter <- acomp(c(1,2,1))data(SimulatedAmounts)datas <- list(data1=sa.outliers1,data2=sa.outliers2,data3=sa.outliers3,
data4=sa.outliers4,data5=sa.outliers5,data6=sa.outliers6)opar<-par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))tmp<-mapply(function(x,y) {outlierplot(x,type="scatter",class.type="grade");
title(y)},datas,names(datas))
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))tmp<-mapply(function(x,y) {
myCls2 <- OutlierClassifier1(x,alpha=0.05,type="all",corrected=TRUE)outlierplot(x,type="scatter",classifier=OutlierClassifier1,class.type="best",Legend=legend(1,1,levels(myCls),xjust=1,col=colcode,pch=pchcode),pch=as.numeric(myCls2));legend(0,1,legend=levels(myCls2),pch=1:length(levels(myCls2)))title(y)
},datas,names(datas))
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="ecdf",main=names(datas)[i])
138 outlierplot
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="portion",main=names(datas)[i])par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="nout",main=names(datas)[i])par(opar)
## End(Not run)
outlierplot Plot various graphics to analyse outliers.
Description
A collection of plots emphasing different aspects of possible outliers.
Usage
outlierplot(X,...)## S3 method for class 'acomp'outlierplot(X,colcode=colorsForOutliers1,pchcode=pchForOutliers1,type=c("scatter","biplot","dendrogram","ecdf","portion","nout","distdist"),legend.position,pch=19,...,clusterMethod="ward",myCls=classifier(X,alpha=alpha,type=class.type,corrected=corrected),classifier=OutlierClassifier1,alpha=0.05,class.type="best",Legend,pow=1,main=paste(deparse(substitute(X))),corrected=TRUE,robust=TRUE,princomp.robust=FALSE,
mahRange=exp(c(-5,5))^pow,flagColor="red",meanColor="blue",grayColor="gray40",goodColor="green",mahalanobisLabel="Mahalanobis Distance")
Arguments
X The dataset as an acomp object
colcode A color palette for factor given by the myCls, or function to create it from thefactor. Use colorForOutliers2 if class.method="all" is used.
pchcode A function to create a plot character palette for the factor returned by the myClscall
outlierplot 139
type The type of plot to be produced. See details for more precise definitions.legend.position
The location of the legend. Must!!! be given to draw a classical legend.
pch A default plotting char
... Further arguments to the used plotting function
clusterMethod The clustering method for hclust based outlier grouping.
myCls A factor presenting the groups of outliers
classifier The routine to create a factor presenting the groups of outliers heuristically. It isonly used in the default argument to myCls.
alpha The confidence level to be used for outlier classification tests
class.type The type of classification that should be generated by classifier
Legend The content will be substituted and stored as list entry legend in the result of thefunction. It can than be evaluated to actually create a seperate legend on anotherdevice (e.g. for publications).
pow The power of Mahalanobis distances to be used.
main The title of the graphic
corrected Literature typically proposes to compare the Mahalanobis distances with thedistribution of a random Mahalanobis distance. However it would be neededto correct this for (dependent) multiple testing, since we always test the wholedataset, which means comparing against the distribution of the maximum Ma-halanobis distance. This argument switches to this second behavior, giving lessoutliers.
robust A robustness description as define in robustnessInCompositions
princomp.robust
Either a logical determining wether or not the principal component analysisshould be done robustly or a principal component object for the dataset.
mahRange The range of Mahalanobis distances displayed. This is fixed to make views com-parable among datasets. However if the preset default is not enough a warningis issued and a red mark is drawn in the plot
flagColor The color to draw critical situations.
meanColor The color to draw typical curves.
goodColor The color to draw confidence bounds.
grayColor The color to draw less important things.mahalanobisLabel
The axis label to be used for axes displaying Mahalanobis distances.
Details
See outliersInCompositions for a comprehensive introduction into the outlier treatment in compo-sitions.
• type="scatter" Produces an appropriate standard plot such as a tenary diagram with the out-liers marked by there codes according to the given classifier and colorcoding and pch coding.This shows the actual values of the identified outliers.
140 outlierplot
• type="biplot" Creates a biplot based on a nonrobust principal component analysis showingthe outliers classified through outliers in the given color scheme. We use the nonrobust prin-cipal component analyis since it rotates according to a good visibility of the extreme values.This shows the position of the outliers in the usual principal components analysis. Howevernote that a coloredBiplot is used rather than the usual one.
• type="dendrogram" Shows a dendrogram based on robust Mahalanobis distance based hier-achical clustering, where the observations are labeled with the identified outlier classes.This plot can be used to see how good different categories of outliers cluster.
• type="ecdf" This plot provides a cummulated distribution function of the Mahalanobis dis-tances along with an expeced curve and a lower confidence limit. The empirical cdf is plottedin the default color. The expected cdf is displayed in meanColor. The alpha-quantile – i.e.a lower prediction bound – for the cdf is given in goodColor. A line in grayColor show theminium portion of observations above some limit to be outliers, based on the portion of ob-servations necessary to move down to make the empirical distribution function get above itslower prediction limit under the assumption of normality.This plot shows the basic construction for the minimal number of outlier computation done intype="portion".
• type="portion" This plot focusses on numbers of outliers. The horizontal axis give Maha-lanobis distances and the vertical axis number of observations. In meanColor we see a curveof an estimated number of outliers above some limit, generated by estimating the portion ofoutliers with a Mahalanobis distance over the given limit by max(0,1-ecdf/cdf). The minimumnumber of outliers is computed by replacing cdf by its lower confidence limit and displayedin goodColor. The Mahalanobis distances of the individual data points are added as a stackedstripchart, such that the influence of individual observations can be seen.The true problem of outlier detection is to detect "near" outliers. Near outliers are outliersso near to the dataset that they could well be extrem observation. These near outliers wouldprovide no problem unless they are not many showing up in groups. Graphic allows at least tocount them and to show there probable Mahalanobis distance such, however it still does not al-low to conclude that an individual observation is an outlier. However still the outlier candidatescan be identified comparing their mahalanobis distance (returned by the plot as$mahalanobis)with a cutoff inferred from this graphic.
• type="nout" This is a simplification of the previous plot simply providing the number ofoutliers over a given limit.??? MORE DOCUMENTATION NEEDED ???
• type="distdist" Plots a scatterplot of the the classical and robust Mahalanobis distancewith the given classification for colors and plot symbols. Furthermore it plots a horizontal linegiving the 0.95-Quantil of the distribution of the maximum robust Mahalanobis distance ofnormally distributed dataset.
Value
a list respresenting the criteria computed to create the plots. The content of the list depends on theplotting type selected.
Note
The package robustbase is required for using the robust estimations.
outlierplot 141
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
OutlierClassifier1, ClusterFinder1
Examples
## Not run:data(SimulatedAmounts)outlierplot(acomp(sa.outliers5))
datas <- list(data1=sa.outliers1,data2=sa.outliers2,data3=sa.outliers3,data4=sa.outliers4,data5=sa.outliers5,data6=sa.outliers6)
opar<-par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))tmp<-mapply(function(x,y) {outlierplot(x,type="scatter",class.type="grade");
title(y)},datas,names(datas))
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))tmp<-mapply(function(x,y) {
myCls2 <- OutlierClassifier1(x,alpha=0.05,type="all",corrected=TRUE)outlierplot(x,type="scatter",classifier=OutlierClassifier1,class.type="best",Legend=legend(1,1,levels(myCls),xjust=1,col=colcode,pch=pchcode),pch=as.numeric(myCls2));legend(0,1,legend=levels(myCls2),pch=1:length(levels(myCls2)))title(y)
},datas,names(datas))# To slowpar(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="ecdf",main=names(datas)[i])par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="portion",main=names(datas)[i])par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="nout",main=names(datas)[i])for( i in 1:length(datas) )
outlierplot(datas[[i]],type="distdist",main=names(datas)[i])par(opar)
## End(Not run)
142 outliersInCompositions
outliersInCompositions
Analysing outliers in compositions.
Description
The Philosophy behind outlier treatment in library(compositions).
Details
Outliers are omnipresent in all kinds of data analysis. To avoid catastrophic misinterpreations ro-bust statistics has developed some methods to avoid the distracting influence of the outliers. Theintroduction of robust methods into the compositions package is described in robustnessInCompo-sitions.
However sometimes we are interested directly in the analysis of outliers. The central philosophy ofthe the outlier classification subsystem in compositions is that outlier are in most cases not simplyerroneous observations, but rather products of some systematic anomality. This can e.g. be an errorin an individual component, a secondary process or a minor undetected but different subpopulation.The package provides various concepts to investigate possible reasons for outliers in compositionaldatasets.
• Proven Outliers The package relies on an additive–lognormal reference distribution in the sim-plex (and the correponding normal distribution in each other scale). The central tool for thedetection of outliers is the Mahalanobis distance of the observation from a robustly estimatedcenter based on a robustly estimated covariance. The robust estimation can be influencedby the given robust attributes. An outlier is considered as proven if its Mahalanobis dis-tance is larger that the (1-alpha) quantile of the distribution of the maximum Mahalanobisdistance of a dataset of the same size with a corresponding (additive)(log)normal distribu-tion. This relies heavily on the presumption that the robust estimation is invariant under lineartransformation, but make no assumptions about the actually used robust estimation method.The corresponding distributions are thus only defined with respect to a specific implementa-tion of the robust estimation algorithm. See OutlierClassifier1(...,type="outlier"),outlierplot(...,type=c("scatter","biplot"),class.type="outlier"), qMaxMahalanobis(...).
• Extrem Values / Possible outliers Some cases of the dataset might have unusually high Maha-lanobis distances, e.g. such that we would expect the probility of a random case to have sucha value or higher might be below alpha. In Literature these cases are often rendered as out-liers, because this level is approximated by the correponding chisq-based criterion proposed.However we consider these only as extrem values, but however provide tools to detect and plotthem. See OutlierClassifier1(...,type="grade"), outlierplot(...,type=c("scatter","biplot"),class.type="grade"),qEmpiricalMahalanobis(...)
• Single Component Outliers Some Outliers can be explained by a single component, e.g. be-cause this single measurement error was wrong. These sort of outliers is detected when wereduce the dataset to a subcomposition with one component less and realise that our formeroutlier is now a fairly normal member of the dataset, maybe not even extrem. Thus a out-lier is considered as as single component outlier, when it does not appear extrem in any ofthe subcompositions with one component less. For other outliers we can prove that they are
outliersInCompositions 143
still extrem for all subcomposition with one component removed. Thus these have to be asmulticomponent outliers, that can not be explained by a single measurment error. For remain-ing single component outliers, we can ask which component is able to explain the outlyingcharacter. See OutlierClassifier1(...,type=c("best","type","all")).
• Counting hidden outliers If outliers are not outlying far enough to be detected by the test foroutlyingness are only at first sight harmless. One outlier is within the reasonable bounds ofwhat a normal distribution could have delivered should not harm the analysis and might noteven detectable in any way. However if there is more than one they could akt together to dis-rupt our analysis and more interestingly there might be some joint reason, which than mightmake them an interesting object of investigation in themselfs. Thus the package providesmethods (e.g. outlierplot(...,type="portions")), to prove the existence of such out-liers, to give a lower bound for there number and to provide us with suspects, with an associ-ated outlyingness probability. See outlierplot(...,type="portions"), outlierplot(...,type="nout"),pQuantileMahalanobis(...)
• Finding atypical subpopulations When we assume smaller subpopulation we need a tool find-ing these clusters. However usual cluster analysis tends to ignore the subgroups, split the mainmass and then associate the subgroups prematurely to the next part of the main mass. For thistask we have developed special tools to find clusters of atypical populations clearly inducingsecondary modes, without ripping apart the central nonoutlying mass. See ClusterFinder1.
• Identifying multiple distracting processes Outliers that are not due to a seperate subpopu-lation or due to a single component error, might still belong together for beeing influencedby the same secondary process distorting the composition to a different degrees. Out pro-posal is to cluster the direction of the outliers from the center, e.g. by a command like:take<-OutlierClassifier1(data,type="grade")!="ok" hc<-hclust(dist(normalize(acomp(scale(data)[take,]))),method="compact")and to plot by a command like: plot(hc) and plot(acomp(data[take,]),col=cutree(hc,1.5))
With these tools we hope to provide a systematic approach to identify various types of outliers in aexploratory analysis.
Note
The package robustbase is required for using the robust estimations and the outlier subsystem ofcompositions. To simplify installation it is not listed as required, but it will be loaded, wheneverany sort of outlierdetection or robust estimation is used.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
K. Gerald van den Boogaart, Raimon Tolosana-Delgado, Matevz-Bren (2009) Robustness, classifi-cation and visualization of outliers in compositional data, in prep.
See Also
compositions-package, missingsInCompositions, robustnessInCompositions, outliersInCompositions,outlierplot, OutlierClassifier1, ClusterFinder1
144 outliersInCompositions
Examples
## Not run:# To slowtmp<-set.seed(1400)A <- matrix(c(0.1,0.2,0.3,0.1),nrow=2)Mvar <- 0.1*ilrvar2clr(A%*%t(A))Mcenter <- acomp(c(1,2,1))typicalData <- rnorm.acomp(100,Mcenter,Mvar) # main populationcolnames(typicalData)<-c("A","B","C")data1 <- acomp(rnorm.acomp(100,Mcenter,Mvar))data2 <- acomp(rbind(typicalData+rbinom(100,1,p=0.1)*rnorm(100)*acomp(c(4,1,1))))data3 <- acomp(rbind(typicalData,acomp(c(0.5,1.5,2))))colnames(data3)<-colnames(typicalData)tmp<-set.seed(30)rcauchy.acomp <- function (n, mean, var){
D <- gsi.getD(mean)-1perturbe(ilrInv(matrix(rnorm(n*D)/rep(rnorm(n),D), ncol = D) %*% chol(clrvar2ilr(var))), mean)
}data4 <- acomp(rcauchy.acomp(100,acomp(c(1,2,1)),Mvar/4))colnames(data4)<-colnames(typicalData)data5 <- acomp(rbind(unclass(typicalData)+outer(rbinom(100,1,p=0.1)*runif(100),c(0.1,1,2))))data6 <- acomp(rbind(typicalData,rnorm.acomp(20,acomp(c(4,4,1)),Mvar)))datas <- list(data1=data1,data2=data2,data3=data3,data4=data4,data5=data5,data6=data6)tmp <-c()opar<-par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))tmp<-mapply(function(x,y) {outlierplot(x,type="scatter",class.type="grade");
title(y)},datas,names(datas))
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))tmp<-mapply(function(x,y) {
myCls2 <- OutlierClassifier1(x,alpha=0.05,type="all",corrected=TRUE)outlierplot(x,type="scatter",classifier=OutlierClassifier1,class.type="best",Legend=legend(1,1,levels(myCls),xjust=1,col=colcode,pch=pchcode),pch=as.numeric(myCls2));legend(0,1,legend=levels(myCls2),pch=1:length(levels(myCls2)))title(y)
},datas,names(datas))
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="ecdf",main=names(datas)[i])par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="portion",main=names(datas)[i])par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))for( i in 1:length(datas) )
outlierplot(datas[[i]],type="nout",main=names(datas)[i])par(opar)
pairwiseplot 145
moreData <- acomp(rbind(data3,data5,data6))take<-OutlierClassifier1(moreData,type="grade")!="ok"hc<-hclust(dist(normalize(acomp(scale(moreData)[take,]))),method="complete")plot(hc)plot(acomp(moreData[take,]),col=cutree(hc,1.5))
## End(Not run)
pairwiseplot Creates a paneled plot like pairs for two different datasets.
Description
Creates a plot for each element of two lists or each column of each dataset against each of thesecond.
Usage
pairwisePlot(X,Y,...)## Default S3 method:pairwisePlot(X,Y=X,...,
xlab=deparse(substitute(X)),ylab=deparse(substitute(Y)),nm=c(length(Y),length(X)),panel=plot,add.line=FALSE, line.col=2,add.robust=FALSE,rob.col=4)
Arguments
X a list, a data.frame, or a matrix representing the first set of things to be displayed.
Y a list, a data.frame, or a matrix representing the second set of things to be dis-played.
... furter parameters to the panel function
xlab The sequence of labels for the elements of X. Alternatively the labels can begiven as colnames or names of X. This option takes precedence if specified.
ylab The sequence of labels for the elements of Y. Alternatively the labels can begiven as colnames or names of Y. This option takes precedence if specified.
nm the parameter to be used in the call par(mfrow=nm). If NULL no parameter issetted and a sequence of plots can be generated.
panel The panel function to plot the individual panels. If the panel function admitsa formula interface, it is called as panel(y~x, xlab=xlab,ylab=ylab,...),otherwise as panel(x, y,xlab=xlab,ylab=ylab,...). Thus the panel func-tion must be capable of taking these arguments. It must also set up its own plot.There is no negotiation on coordinate system.
add.line logical, to control the addition of a regression line in each panel
line.col in case the regression line is added, which color should be used? defaults to red.
146 pairwiseplot
add.robust logical, to control the addition of a robust regression line in each panel. Ignoredif covariable is a factor. This is nowadays based on lmrob, but this can changein the future.
rob.col in case the robust regression line is added, which color should be used? Defaultsto blue.
Details
This is a light-weight convenience function to plot several aspects of one dataset against severalaspects of another dataset. It is far more straight-forward than e.g. the pairs function and does notdo any internal computation rather than organizing the names. Of course, the rows of the two datasets must be the same.
The current implementation may display a warning about the function panel dispatching methodsfor generic plot. It can be ignored without harm.
Optionally, classical and/or robust regression lines can be drawn, though only for non-factor covari-ables.
It may be convenient to use par capabilities to fit the device characteristics to the plot, in particulararguments mar and oma.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Boogaart, K.G. v.d. , R. Tolosana (2008) Mixing Compositions and Other scales, Proceedings ofCodaWork 08.
http://ima.udg.es/Activitats/CoDaWork03
http://ima.udg.es/Activitats/CoDaWork05
http://ima.udg.es/Activitats/CoDaWork08
See Also
plot.aplus, balance, pwlrPlot
Examples
X <- rnorm(100)Y <- rnorm.acomp(100,acomp(c(A=1,B=1,C=1)),0.1*diag(3))+acomp(t(outer(c(0.2,0.3,0.4),X,"^")))
pairs(cbind(ilr(Y),X),panel=function(x,y,...) {points(x,y,...);abline(lm(y~x))})pairs(cbind(balance(Y,~A/B/C),X),
panel=function(x,y,...) {points(x,y,...);abline(lm(y~x))})pairwisePlot(balance(Y,~A/B/C),X)pairwisePlot(X,balance(Y,~A/B/C),
panel=function(x,y,...) {plot(x,y,...);abline(lm(y~x))})
parametricMat 147
pairwisePlot(X,balance01(Y,~A/B/C))
# A function to extract a portion representation of subcompsitions# with two elements:subComps <- function(X,...,all=list(...)) {
X <- oneOrDataset(X)nams <- sapply(all,function(x) paste(x[[2]],x[[3]],sep=","))val <- sapply(all,function(x){
a = X[,match(as.character(x[[2]]),colnames(X)) ]b = X[,match(as.character(x[[2]]),colnames(X)) ]c = X[,match(as.character(x[[3]]),colnames(X)) ]return(a/(b+c))
})colnames(val)<-namsval
}
pairwisePlot(X,subComps(Y,A~B,A~C,B~C))
## using Hydrochemical data set as illustration of mixed possibilitiesdata(Hydrochem)xc = acomp(Hydrochem[,c("Ca","Mg","Na","K")])fk = Hydrochem$RiverpH = -log10(Hydrochem$H)covars = data.frame(pH, River=fk)pairwisePlot(clr(xc), pH)pairwisePlot(clr(xc), pH, col=fk)pairwisePlot(pH, ilr(xc), add.line=TRUE)pairwisePlot(covars, ilr(xc), add.line=TRUE, line.col="magenta")pairwisePlot(clr(xc), covars, add.robust=TRUE)
parametricMat Unique parametrisations for matrices.
Description
Helper functions to parametrize positive semidefinite matrices in multivariate variogram models.
Usage
parametricRank1Mat(p)parametricPosdefMat(p)parameterRank1Mat(A)parameterPosdefMat(A)parametricRank1ClrMat(p)parametricPosdefClrMat(p)parameterRank1ClrMat(A)parameterPosdefClrMat(A)
148 perturbe
Arguments
A a positiv definit matrix of the given type
p a vector of parameters describing the matrix, as returned by the parameter func-tions.
Details
The rank 1 matrix is parametrised by the first eigenvector scaled by the square root of the eigenvalue.The positiv semidefinit matrix the entries of a upper right triangular matrix R with t(R)%*%R==A.The clr matrices are work with the parameters of the corresponding ilr matrix.
Value
A or p, depending on what is not given.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
vgram2lrvgram, CompLinModCoReg, vgmFit
Examples
parametricRank1Mat(c(0,0,2))parametricPosdefMat(c(0,0,1,0,0,0))parameterRank1Mat(matrix(1,nr=3,nc=3))parameterPosdefMat(diag(5))
perturbe Perturbation of compositions
Description
The perturbation is the addition operation in the Aitchison geometry of the simplex.
Usage
perturbe(x,y)## Methods for class "acomp"## x + y## x - y## - x
perturbe 149
Arguments
x compositions of class acomp
y compositions of class acomp
Details
The perturbation is the basic addition operation of the Aitichson simplex as a vector space. It isdefined by:
(x+ y)i = clo((xiyi)i)i
perturbe and + compute this operation. The only difference is that + checks the class of its argu-ment, while perturbe does not check the type of the arguments and can thus directly be applied toa composition in any form (unclassed, acomp, rcomp).The - operation is the inverse of the addition in the usual way and defined by:
(x− y)i := clo((xi/yi)i)i
and as unary operation respectively as:
(−x)i := clo((1/yi)i)i
Value
An acomp vector or matrix.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
acomp, *.aplus, +.rplus
Examples
tmp <- -acomp(1:3)tmp + acomp(1:3)
150 plot.acomp
plot.acomp Ternary diagrams
Description
Displaying compositions in ternary diagrams
Usage
## S3 method for class 'acomp'plot(x,...,labels=names(x),
aspanel=FALSE,id=FALSE,idlabs=NULL,idcol=2,center=FALSE,scale=FALSE,pca=FALSE,col.pca=par("col"),margin="acomp",add=FALSE,triangle=!add,col=par("col"),axes=FALSE,plotMissings=TRUE,lenMissingTck=0.05,colMissingTck="red",mp=~simpleMissingSubplot(c(0,1,0.95,1),
missingInfo,c("NM","TM",cn)),robust=getOption("robust"))
## S3 method for class 'rcomp'plot(x,...,labels=names(x),
aspanel=FALSE,id=FALSE,idlabs=NULL,idcol=2,center=FALSE,scale=FALSE,pca=FALSE,col.pca=par("col"),margin="rcomp",add=FALSE,triangle=!add,col=par("col"),axes=FALSE,plotMissings=TRUE,lenMissingTck=0.05,colMissingTck="red",mp=~simpleMissingSubplot(c(0,1,0.95,1),
missingInfo,c("NM","TM",cn)),robust=getOption("robust"))
## S3 method for class 'ccomp'plot(x,...)
Arguments
x a dataset of a compositional class
... further graphical parameters passed (see par)
margin the type of marginalisation to be computed, when displaying the individual pan-els. Possible values are: "acomp", "rcomp" and any of the variable names/columnnumbers in the composition. If one of the columns is selected each panel dis-plays a subcomposition given by the row part, the column part and the givenpart. If one of the classes is given the corresponding margin acompmargin orrcompmargin is used.
add a logical indicating whether the information should just be added to an existingplot. If FALSE a new plot is created
triangle a logical indicating whether the triangle should be drawn
plot.acomp 151
col the color to plot the datalabels the names of the partsaspanel logical indicating that only a single panel should be drawn and not the whole
plot. Internal use onlyid logical, if TRUE one can identify the points like with the identify command.idlabs a character vector providing the labels to be used with the identification, when
id=TRUE
idcol color of the idlabs labelscenter a logical indicating whether a the data should be centered prior to the plot. Cen-
tering is done in the choosen geometry. See scale
scale a logical indicating whether a the data should be scaled prior to the plot. Scalingis done in the choosen geometry. See scale
pca a logical indicating whether the first principal component should be displayedin the plot. Currently, the direction of the principal component of the displayedsubcomposition is displayed as a line. In a future, the projected principal com-ponenent of the whole dataset should be displayed.
col.pca The color to draw the principal component.axes Either a logical wether to plot the axes, or numerical enumerating the axes sides
to be used e.g. 1 for only plotting the lower axes, or a list of parameters toternaryAxis.
plotMissings logical indicating that missingness should be represented graphically. Compo-nentes with one missing subcomponent in the plot are represented by tickmarksat the three axis. Components with two or three missing components are onlyrepresented in a special panel drawn according to the mp parameter if missingsare present. Missings of type BDL (below detection limit) are always plotted,even if plotMissings is false, but in this case this fact is not specially marked.In rcomp geometry an actuall 0 in the data is never treated as missing.
lenMissingTck length of the tick-marks to be plotted for missing values. If 0 no tickmarksare plotted. Negative lengths point outside. length 1 draws right through tothe opposit corner. Missing ticks in acomp geometry are inclined showing theline of possible values in acomp geometry. Missingticks in rcomp-geometryare vertical to the axis representing the fact that only the other component isunkown. That these lines can leave the plot is one of the odd consequences ofrcomp geometry.
colMissingTck colors to draw the missing tick-marks. NULL means to take the colors specifiedfor the observations.
mp A formula providing a call to a function plotting informations on the missings.The call is evaluted in the environment of the panel plotting function and hasaccess (among others) to: cn the names of the components in the current plot, xthe dataset of the current plot, y the transformed dataset, (c60,s60) coordinatesof the upper vertex of the triangle. missingInfo is a table giving the numberof observations of the types NM=Non Missing, TM=Totally missing (i.e. atleast two components of the subcomposition are missing), and the three singlecomponent missing possibilities for the three components.
robust A robustness description. See robustnessInCompositions for details. The optionis used for centering, scaling and principle components.
152 plot.acomp
Details
The data is displayed in ternary diagrams. Thus, it does not work for two-part compositions. Com-positions of three parts are displayed in a single ternary diagram. For compositions of more thanthree components, the data is arranged in a scatterplot matrix through the command pairs.In this case, the third component in each of the panels is chosen according to setting of margin=.Possible values of margin= are: "acomp", "rcomp" and any of the variable names/column numbersin the composition. If one of the columns is selected each panel displays a subcomposition givenby the row part, the column part and the given part. If one of the classes is given the correspondingmargin acompmargin or rcompmargin is used.Ternary diagrams can be read in multiple ways. Each corner of the triangle corresponds to an ex-treme composition containing only the part displayed in that corner. Points on the edges correspondto compositions containing only the parts in the adjacent corners. The relative amounts are dis-played by the distance to the opposite corner (so-called barycentric coordinates). The individualportions of any point can be infered by drawing a line through the investigated point, and parallelto the edge opposite to the corner of the part of interest. The portion of this part is constant alongthe line. Thus we can read it on the sides of the ternary diagram, where the line crosses its borders.Note that these isoPortionLines remain straight under an arbitrary perturbation.ccomp ternary diagrams are always jittered to avoid overplotting.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Billheimer, D., P. Guttorp, W.F. and Fagan (2001) Statistical interpretation of species composition,Journal of the American Statistical Association, 96 (456), 1205-1214
Pawlowsky-Glahn, V. and J.J. Egozcue (2001) Geometric approach to statistical analysis on thesimplex. SERRA 15(5), 384-398
http://ima.udg.es/Activitats/CoDaWork03
http://ima.udg.es/Activitats/CoDaWork05
See Also
plot.aplus, plot3D (for 3D plot), kingTetrahedron (for 3D-plot model export), qqnorm.acomp,boxplot.acomp
plot.aplus 153
Examples
data(SimulatedAmounts)plot(acomp(sa.lognormals))plot(acomp(sa.lognormals),axes=TRUE)plot(rcomp(sa.lognormals))plot(rcomp(sa.lognormals5))
plot(acomp(sa.lognormals5),pca=TRUE,col.pca="red")plot(rcomp(sa.lognormals5),pca=TRUE,col.pca="red",axes=TRUE)
plot.aplus Displaying amounts in scatterplots
Description
This function displays multivariate unclosed amout datasets classes "aplus" and "rplus" in a wayrespecting the choosen geometry eventually in log scale.
Usage
## S3 method for class 'aplus'plot(x,...,labels=colnames(X),cn=colnames(X),
aspanel=FALSE,id=FALSE,idlabs=NULL,idcol=2,center=FALSE,scale=FALSE,pca=FALSE,col.pca=par("col"),add=FALSE,logscale=TRUE,xlim=NULL,ylim=xlim,col=par("col"),plotMissings=TRUE,lenMissingTck=0.05,colMissingTck="red",mp=~simpleMissingSubplot(missingPlotRect,missingInfo,
c("NM","TM",cn)),robust=getOption("robust"))
## S3 method for class 'rplus'plot(x,...,labels=colnames(X),cn=colnames(X),
aspanel=FALSE,id=FALSE,idlabs=NULL,idcol=2,center=FALSE,scale=FALSE,pca=FALSE,col.pca=par("col"),add=FALSE,logscale=FALSE,xlim=NULL,ylim=xlim,col=par("col"),plotMissings=TRUE,lenMissingTck=0.05,colMissingTck="red",mp=~simpleMissingSubplot(missingPlotRect,missingInfo,
c("NM","TM",cn)),robust=getOption("robust"))
## S3 method for class 'rmult'plot(x,...,labels=colnames(X),cn=colnames(X),
aspanel=FALSE,id=FALSE,idlabs=NULL,idcol=2,center=FALSE,scale=FALSE,pca=FALSE,col.pca=par("col"),add=FALSE,logscale=FALSE,col=par("col"),robust=getOption("robust"))
154 plot.aplus
Arguments
x a dataset with class aplus, rplus or rmult
... further graphical parameters passed (see par)
add a logical indicating whether the information should just be added to an existingplot. If FALSE, a new plot is created
col the color to plot the data
plotMissings logical indicating that missingness should be represented graphically. Compo-nentes with one missing subcomponent in the plot are represented by tickmarksat the two axis. Cases with two missing components are only represented ina special panel drawn according to the mp parameter if missings are present.Missings of type BDL (below detection limit) are always plotted in nonlogarit-mic plots, even if plotMissings is false, but in this case this fact is not speciallymarked.
lenMissingTck length of the tick-marks (in portion of the plotting region) to be plotted for miss-ing values. If 0 no tickmarks are plotted. Negative lengths point outside of theplot. A length of 1 runs right through the whole plot.
colMissingTck colors to draw the missing tick-marks. NULL means to take the colors specifiedfor the observations.
mp A formula providing a call to a function plotting informations on the missings.The call is evaluted in the environment of the panel plotting function and hasaccess (among others) to: cn the names of the components in the current plot,x the dataset of the current plot, missingInfo is a table giving the numberof observations of the types NM=Non Missing, TM=Totally missing (i.e. twocomponents of the subcomposition are missing), and the two single componentmissing possibilities.
labels the labels for names of the parts
cn the names of the parts to be used in a single panel. Internal use only
aspanel logical indicating that only a single panel should be drawn and not the wholeplot. Internal use only
id a logical. If TRUE one can identify the points like with the identify command
idlabs A character vector providing the labels to be used with the identification, whenid=TRUE
idcol color of the idlabs labels
center a logical indicating whether the data should be centered prior to the plot. Cen-tering is done in the chosen geometry. See scale
scale a logical indicating whether the data should be scaled prior to the plot. Scalingis done in the chosen geometry. See scale
pca a logical indicating whether the first principal component should be displayedin the plot. Currently, the direction of the principal component of the displayedsubcomposition is displayed as a line. In a future, the projected principal com-ponenent of the whole dataset should be displayed.
col.pca the color to draw the principal component.
logscale logical indicating whether a log scale should be used
plot3D 155
xlim 2xncol(x)-matrix giving the xlims for the columns of x
ylim 2xncol(x)-matrix giving the ylims for the columns of x
robust A robustness description. See robustnessInCompositions for details. The optionis used for centering, scaling and principle components.
Details
TO DO: fix pca bug
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
plot.aplus, qqnorm.acomp,boxplot.acomp
Examples
data(SimulatedAmounts)plot(aplus(sa.lognormals))plot(rplus(sa.lognormals))plot(aplus(sa.lognormals5))plot(rplus(sa.lognormals5))
plot3D plot in 3D based on rgl
Description
3-dimensional plots, which can be rotated and zoomed in/out
Usage
plot3D(x,...)## Default S3 method:plot3D(x,...,add=FALSE,bbox=TRUE,axes=FALSE,
cex=1,size=cex,col=1)
Arguments
x an object to be plotted, e.g. a data frame or a data matrix
... additional plotting parameters as described in rgl.material
add logical, adding or new plot
bbox logical, whether to add a bounding box
axes logical, whether to plot an axes of coordinates
156 plot3Dacomp
cex size of the plotting symbolsize size of the plotting symbol, only size or cex should be usedcol the color used for dots, defaults to black.
Details
The function provides a generic interface for 3-dimensional plotting in analogy to the 2d-plottinginterface of plot, using rgl package.
Value
the 3D plotting coordinates of the objects displayed, returned invisibly
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
points3d, plot, plot3D.rmult,
plot3D.acomp,plot3D.rcomp„ plot3D.aplus,plot3D.rplus
Examples
x <- cbind(rnorm(10),rnorm(10),rnorm(10))plot3D(x)data(SimulatedAmounts)plot3D(sa.lognormals,cex=4,col=1:nrow(sa.lognormals))
plot3Dacomp 3D-plot of compositional data
Description
3D-plot of compositional data. The plot is mainly an exploratory tool, not intended for exact displayof data.
Usage
## S3 method for class 'acomp'plot3D(x, parts=1:min(ncol(X),4),...,
lwd=2, axis.col="gray", add=FALSE, cex=2,vlabs=colnames(x), vlabs.col=axis.col, center=FALSE,scale=FALSE, log=FALSE, bbox=FALSE, axes=TRUE, size=cex,col=1)
## S3 method for class 'rcomp'plot3D(x,parts=1:min(ncol(X),4),...,
lwd=2,axis.col="gray",add=FALSE,cex=2,vlabs=colnames(x),vlabs.col=axis.col,center=FALSE,scale=FALSE,log=FALSE,bbox=FALSE,axes=TRUE,size=cex,col=1)
plot3Dacomp 157
Arguments
x an aplus object to be plotted
parts a numeric xor character vector of length 3 coding the columns to be plotted
... additional plotting parameters as described in rgl.material
add logical, adding or new plot
cex size of the plotting symbols
lwd line width
axis.col color of the axis
vlabs the column names to be plotted, if missing defaults to the column names of theselected columns of X
vlabs.col color of the labels
center logical, should the data be centered
scale logical, should the data be scaled
log logical, indicating wether to plot in log scale
bbox logical, whether to add a bounding box
axes logical, whether plot a coordinate cross
size size of the plotting symbols
col the color used for dots, defaults to black.
Details
The routine behaves different when 3 or four components should be plotted. In case of four com-ponents:If log is TRUE the data is plotted in ilr coordinates. This is the isometric view of the data.If log is FALSE the data is plotted in ipt coordinates and a tetrahedron is plotted around it ifcoors == TRUE. This can be used to do a tetrahedron plot.In case of three components:If log is TRUE the data is plotted in clr coordinates. This can be used to visualize the clr plane.If log is FALSE the data is plotted as is, showing the embedding of the three-part simplex in thethree-dimensional space.In all cases: If coors is true, coordinate arrows are plotted of length 1 in the origin of the space,except in the tetrahedron case.
Value
Called for its side effect of a 3D plot of an acomp object in an rgl plot. It invisibly returns the 3Dplotting coordinates of the objects displayed
Note
The function kingTetrahedron provides an alternate way of tetrahedron plots, based on a moreadvanced viewer, which must be downloaded separately.
158 plot3Daplus
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
kingTetrahedron points3d, plot3D, plot, plot3D.rmult,
plot3D.acomp,plot3D.rcomp„ plot3D.aplus,plot3D.rplus
Examples
data(SimulatedAmounts)plot3D(acomp(sa.lognormals5),1:3,col="green")plot3D(acomp(sa.lognormals5),1:3,log=TRUE,col="green")plot3D(acomp(sa.lognormals5),1:4,col="green")plot3D(acomp(sa.lognormals5),1:4,log=TRUE,col="green")
plot3Daplus 3D-plot of positive data
Description
3D-plot of positive data typically in log-log-log scale. The plot is mainly an exploratory tool, andnot intended for exact display of data.
Usage
## S3 method for class 'aplus'plot3D(x,parts=1:3,...,
vlabs=NULL,add=FALSE,log=TRUE,bbox=FALSE,axes=TRUE,col=1)
Arguments
x an aplus object to be plotted
parts a numeric xor character vector of length 3 coding the columns to be plotted
... additional plotting parameters as described in rgl.material
add logical, adding or new plot
vlabs the column names to be plotted, if missing defaults to the column names of theselected columns of X
log logical, indicating wether to plot in log scale
bbox logical, whether to add a bounding box
axes logical, plot a coordinate system
col the color used for dots, defaults to black.
plot3Drmult 159
Details
If log is TRUE the data is plotted in ilt coordinates. If coors is true, coordinate arrows are plottedof length 1 and in the (aplus-)mean of the dataset.If log is FALSE the data is plotted with plot.rplus
Value
Called for its side effect of a 3D plot of an aplus object in an rgl plot. It invisibly returns the 3Dplotting coordinates of the objects displayed
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
points3d, plot3D, plot, plot3D.rmult,
plot3D.acomp,plot3D.rcomp„ plot3D.aplus,plot3D.rplus
Examples
data(SimulatedAmounts)plot3D(aplus(sa.lognormals),size=2)
plot3Drmult plot in 3D based on rgl
Description
3-dimensional plots, which can be rotated and zoomed in/out
Usage
## S3 method for class 'rmult'plot3D(x,parts=1:3,...,
center=FALSE,scale=FALSE,add=FALSE,axes=!add,cex=2,vlabs=colnames(x),size=cex,bbox=FALSE,col=1)
Arguments
x an object to be plotted, e.g. a data frame or a data matrix
parts the variables in the rmult object to be plotted
... additional plotting parameters as described in rgl.material
center logical, center the data? This might be necessary to stay within the openGL-arithmetic used in rgl.
160 plot3Drplus
scale logical, scale the data? This might be necessary to stay within the openGL-arithmetic used in rgl.
add logical, adding or new plot
bbox logical, whether to add a bounding box
axes logical, whether to plot a coordinate cross
cex size of the plotting symbol (as expanding factor)
vlabs labels for the variables
size size of the plotting symbol, only size or cex should be used
col the color used for dots, defaults to black.
Details
The function provides a generic interface for 3-dimensional plotting in analogy to the 2d-plottinginterface of plot, using rgl package.
Value
the 3D plotting coordinates of the objects displayed, returned invisibly
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
points3d, plot, plot3D.rmult,
plot3D.acomp,plot3D.rcomp, plot3D.aplus,plot3D.rplus
Examples
x <- cbind(rnorm(10),rnorm(10),rnorm(10))plot3D(x)data(SimulatedAmounts)plot3D(rmult(sa.lognormals),cex=4,col=1:nrow(sa.lognormals))
plot3Drplus plot in 3D based on rgl
Description
3-dimensional plots, which can be rotated and zoomed in/out
Usage
## S3 method for class 'rplus'plot3D(x,parts=1:3,...,vlabs=NULL,add=FALSE,bbox=FALSE,
cex=1,size=cex,axes=TRUE,col=1)
plot3Drplus 161
Arguments
x an rplus object to be plotted
parts the variables in the rplus object to be plotted
... additional plotting parameters as described in rgl.material
vlabs the labels used for the variable axes
add logical, adding or new plot
bbox logical, whether to add a bounding box
cex size of the plotting symbol (as character expansion factor)
size size of the plotting symbol, only size or cex should be used
axes logical, whether to plot a coordinate cross
col the color used for dots, defaults to black.
Details
The function plots rplus objects in a 3D coordinate system, in an rgl plot.
Value
the 3D plotting coordinates of the objects displayed, returned invisibly
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
points3d, plot, plot3D.rmult,
plot3D.acomp,plot3D.rcomp„ plot3D.aplus,plot3D
Examples
x <- cbind(rnorm(10),rnorm(10),rnorm(10))plot3D(rplus(exp(x)))data(SimulatedAmounts)plot3D(rplus(sa.lognormals),cex=4,col=1:nrow(sa.lognormals))
162 plotlogratioVariogram
plotlogratioVariogram Empirical variograms for compositions
Description
Plots a logratioVariogram.
Usage
## S3 method for class 'logratioVariogram'plot(x,...,type="l",lrvg=NULL,
fcols=2:length(lrvg),oma=c(4, 4, 4, 4),gap=0,ylim=NULL)
Arguments
x The logratioVariogram created by logratioVariogram
... further parameters for plot.default
type as in plot.default
lrvg a model function for a logratiovariogram or a list of several, to be added to theplot.
fcols the colors for the different lrvg variograms
oma The outer margin of the paneled plot
gap The distance of the plot panals used to determin mar
ylim The limits of the Y-axis. If zero it is automatically computed.
Details
see logratioVariogram
Value
Nothing.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
vgram2lrvgram, CompLinModCoReg
plotmissingsummary 163
Examples
## Not run:data(juraset)X <- with(juraset,cbind(X,Y))comp <- acomp(juraset,c("Cd","Cu","Pb","Co","Cr"))lrv <- logratioVariogram(comp,X,maxdist=1,nbins=10)fff <- CompLinModCoReg(~nugget()+sph(0.5)+R1*exp(0.7),comp)fit <- vgmFit(lrv,fff)fitfff(1:3)plot(lrv,lrvg=vgram2lrvgram(fit$vg))
## End(Not run)
plotmissingsummary Plot a Missing Summary
Description
Plots a missing summary as a barplot
Usage
## S3 method for class 'missingSummary'plot(x,...,main="Missings",legend.text=TRUE,
col=c("gray","lightgray","yellow","red","white","magenta"))as.missingSummary(x,...)
Arguments
x a missingSummary table with columns representing different types of missing
... further graphical parameters to barplot
main as in barplot
legend.text as in barplot
col as in barplot
Details
The different types of missings are drawn in quasi-self-understandable colors: normal gray forNMV, and lightgray as for BDT (since they contain semi-numeric information), yellow (slight warn-ing) for MAR, red (serious warning) for MNAR, white (because they are non-existing) for SZ, andmagenta for the strange case of errors.
Value
called for its side effect. The return value is not defined.
164 PogoJump
Author(s)
K.Gerald van den Boogaart
References
See compositions.missings for more details.
See Also
missingSummary
Examples
data(SimulatedAmounts)x <- acomp(sa.lognormals)xnew <- simulateMissings(x,dl=0.05,MAR=0.05,MNAR=0.05,SZ=0.05)xnewplot(missingSummary(xnew))
PogoJump Honk Kong Pogo-Jumps Championship
Description
Yat, yee, sam measurements (in meters) for the final jumps of the 1985 Honk Kong Pogo-JumpsChampionship, 4 jumps of the 7 finalists.
Usage
data(PogoJump)
Details
The data consist of 28 cases: 4 jumps of the 7 finalists, and 4 variables: Yat, Yee, Sam measurementsin meters, and finalist – 1 to 7.
Pogo-Jumps is similar to the triple jump except that the competitor is mounted on a pogo-stik. Aftera pogo-up towards the starting board the total jump distance achieved in three consecutive bounces,known as the yat, yee and sam, is recorded.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name HKPOGO.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
powerofpsdmatrix 165
References
Aitchison, J. (1982) The Statistical Analysis of Compositional Data, (Data 19) pp21.
powerofpsdmatrix power transform of a matrix
Description
Computes the power of a positive semi-definite symmetric matrix.
Usage
powerofpsdmatrix( M , p,...)
Arguments
M a matrix, preferably symmetric
p a single number giving the power
... further arguments to the singular value decomposition
Details
for a symmetric matrix the computed result can actually be considered as a version of the givenpower of the matrix fullfilling the relation:
MpMq = Mp+q
The symmetry of the matrix is not checked.
Value
U%*% D^p %*% t(P) where the UDP is the singular value decomposition of M.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
Examples
data(SimulatedAmounts)d <- ilr(sa.lognormals)var( d %*% powerofpsdmatrix(var(d),-1/2)) # Unit matrix
166 princomp.acomp
princomp.acomp Principal component analysis for Aitchison compositions
Description
A principal component analysis is done in the Aitchison geometry (i.e. clr-transform) of the sim-plex. Some gimics simplify the interpretation of the computed components as compositional per-turbations.
Usage
## S3 method for class 'acomp'princomp(x,...,scores=TRUE,center=attr(covmat,"center"),
covmat=var(x,robust=robust,giveCenter=TRUE),robust=getOption("robust"))
## S3 method for class 'princomp.acomp'print(x,...)## S3 method for class 'princomp.acomp'
plot(x,y=NULL,..., npcs=min(10,length(x$sdev)),type=c("screeplot","variance","biplot","loadings","relative"),main=NULL,scale.sdev=1)
## S3 method for class 'princomp.acomp'predict(object,newdata,...)
Arguments
x a acomp-dataset (in princomp) or a result from princomp.acomp
y not used
scores a logical indicating whether scores should be computed or not
npcs the number of components to be drawn in the scree plot
type type of the plot: "screeplot" is a lined screeplot, "variance" is a boxplot-like screeplot, "biplot" is a biplot, "loadings" displays the loadings as abarplot.acomp
scale.sdev the multiple of sigma to use plotting the loadings
main title of the plot
object a fitted princomp.acomp object
newdata another compositional dataset of class acomp
... further arguments to pass to internally-called functions
covmat provides the covariance matrix to be used for the principle component analysis
center provides the be used for the computation of scores
robust Gives the robustness type for the calculation of the covariance matrix. SeerobustnessInCompositions for details.
princomp.acomp 167
Details
As a metric euclidean space the Aitchison simplex has its own principal component analysis, thatshould be performed in terms of the covariance matrix and not in terms of the meaningless correla-tion matrix.To aid the interpretation we added some extra functionality to a normal princomp(clr(x)). Firstof all the result contains as additional information the compositional representation of the returnedvectors in the space of the data: the center as a composition Center, and the loadings in terms ofa composition to perturbe with, either positively (Loadings) or negatively (DownLoadings). TheUp- and DownLoadings are normalized to the number of parts in the simplex and not to one tosimplify the interpretation. A value of about one means no change in the specific component. Toavoid confusion the meaningless last principal component is removed.The plot routine provides screeplots (type = "s",type= "v"), biplots (type = "b"),plots of the effect of loadings (type = "b") in scale.sdev*sdev-spread, and loadings of pairwise(log-)ratios (type = "r").The interpretation of a screeplot does not differ from ordinary screeplots. It shows the eigenvaluesof the covariance matrix, which represent the portions of variance explained by the principal com-ponents.The interpretation of the biplot strongly differs from a classical one. The relevant variables are notthe arrows drawn (one for each component), but rather the links (i.e., the differences) between twoarrow heads, which represents the log-ratio between the two components represented by the arrows.The compositional loading plot is introduced with this package. The loadings of all component canbe seen as an orthogonal basis in the space of clr-transformed data. These vectors are displayed bya barplot with their corresponding composition. For a better interpretation the total of these com-positons is set to the number of parts in the composition, such that a portion of one means no effect.This is similar to (but not exactly the same as) a zero loading in a real principal component analysis.The loadings plot can work in two different modes: if scale.sdev is set to NA it displays the compo-sition beeing represented by the unit vector of loadings in the clr-transformed space. If scale.sdevis numeric we use this composition scaled by the standard deviation of the respective component.The relative plot displays the relativeLoadings as a barplot. The deviation from a unit bar showsthe effect of each principal component on the respective ratio.
Value
princomp gives an object of type c("princomp.acomp","princomp") with the following content:
sdev the standard deviation of the principal components
loadings the matrix of variable loadings (i.e., a matrix which columns contain the eigen-vectors). This is of class "loadings". The last eigenvector is removed since itshould contain the irrelevant scaling.
center the clr-transformed vector of means used to center the dataset
Center the acomp vector of means used to center the dataset
scale the scaling applied to each variable
n.obs number of observations
scores if scores = TRUE, the scores of the supplied data on the principal components.Scores are coordinates in a basis given by the principal components and thus notcompositions
call the matched call
168 princomp.acomp
na.action not clearly understood
Loadings compositions that represent a perturbation with the vectors represented by theloadings of each of the factors
DownLoadings compositions that represent a perturbation with the inverse of the vectors repre-sented by the loadings of each of the factors
predict returns a matrix of scores of the observations in the newdata dataset. The other routines are mainly called for their side effect of plotting or printing and return theobject x.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Aitchison, J. and M. Greenacre (2002) Biplots for Compositional Data Journal of the Royal Statis-tical Society, Series C (Applied Statistics) 51 (4) 375-392
http://ima.udg.es/Activitats/CoDaWork03
http://ima.udg.es/Activitats/CoDaWork05
See Also
clr,acomp, relativeLoadings princomp.aplus, princomp.rcomp, barplot.acomp, mean.acomp,var.acomp
Examples
data(SimulatedAmounts)pc <- princomp(acomp(sa.lognormals5))pcsummary(pc)plot(pc) #plot(pc,type="screeplot")plot(pc,type="v")plot(pc,type="biplot")plot(pc,choice=c(1,3),type="biplot")plot(pc,type="loadings")plot(pc,type="loadings",scale.sdev=-1) # Downwardplot(pc,type="relative",scale.sdev=NA) # The directionsplot(pc,type="relative",scale.sdev=1) # one sigma Upwardplot(pc,type="relative",scale.sdev=-1) # one sigma Downwardbiplot(pc)screeplot(pc)
princomp.aplus 169
loadings(pc)relativeLoadings(pc,mult=FALSE)relativeLoadings(pc)relativeLoadings(pc,scale.sdev=1)relativeLoadings(pc,scale.sdev=2)
pc$Loadingspc$DownLoadingsbarplot(pc$Loadings)pc$sdev^2cov(predict(pc,sa.lognormals5))
princomp.aplus Principal component analysis for amounts in log geometry
Description
A principal component analysis is done in the Aitchison geometry (i.e. ilt-transform). Some gimicssimplify the interpretation of the computed components as perturbations of amounts.
Usage
## S3 method for class 'aplus'princomp(x,...,scores=TRUE,center=attr(covmat,"center"),
covmat=var(x,robust=robust,giveCenter=TRUE),robust=getOption("robust"))
## S3 method for class 'princomp.aplus'print(x,...)## S3 method for class 'princomp.aplus'plot(x,y=NULL,..., npcs=min(10,length(x$sdev)),
type=c("screeplot","variance","biplot","loadings","relative"),main=NULL,scale.sdev=1)
## S3 method for class 'princomp.aplus'predict(object,newdata,...)
Arguments
x an aplus dataset (for princomp) or a result from princomp.aplus
y not used
scores a logical indicating whether scores should be computed or not
npcs the number of components to be drawn in the scree plot
type type of the plot: "screeplot" is a lined screeplot, "variance" is a boxplot-like screeplot, "biplot" is a biplot, "loadings" displays the loadings as abarplot.acomp
scale.sdev the multiple of sigma to use when plotting the loadings
main title of the plot
170 princomp.aplus
object a fitted princomp.aplus object
newdata another amount dataset of class aplus
... further arguments to pass to internally-called functions
covmat provides the covariance matrix to be used for the principle component analysis
center provides the be used for the computation of scores
robust Gives the robustness type for the calculation of the covariance matrix. Seevar.rmult for details.
Details
As a metric euclidean space, the positive real space described in aplus has its own principal com-ponent analysis, that can be performed either in terms of the covariance matrix or the correlationmatrix. However, since all parts in a composition or in an amount vector share a natural scaling,they do not need the standardization (which in fact would produce a loss of important information).For this reason, princomp.aplus works on the covariance matrix.To aid the interpretation we added some extra functionality to a normal princomp(ilt(x)). First ofall the result contains as additional information the amount representation of returned vectors in thespace of the data: the center as an amount Center, and the loadings in terms of amounts to perturbewith, either positively (Loadings) or negatively (DownLoadings). The Up- and DownLoadings arenormalized to the number of parts and not to one to simplify the interpretation. A value of aboutone means no change in the specific component.The plot routine provides screeplots (type = "s",type= "v"), biplots (type = "b"), plotsof the effect of loadings (type = "b") in scale.sdev*sdev-spread, and loadings of pairwise (log-)ratios (type = "r").The interpretation of a screeplot does not differ from ordinary screeplots. It shows the eigenvaluesof the covariance matrix, which represent the portions of variance explained by the principal com-ponents.The interpretation of the the biplot uses, additionally to the classical one, a compositional concept:The differences between two arrowheads can be interpreted as log-ratios between the two compo-nents represented by the arrows.The amount loading plot is introduced with this package. The loadings of all component can beseen as an orthogonal basis in the space of ilt-transformed data. These vectors are displayed by abarplot with their corresponding amounts. A portion of one means no change of this part. This isequivalent to a zero loading in a real principal component analysis.The loadings plot can work in two different modes. If scale.sdev is set to NA it displays the amountvector being represented by the unit vector of loadings in the ilt-transformed space. If scale.sdevis numeric we use this amount vector scaled by the standard deviation of the respective component.The relative plot displays the relativeLoadings as a barplot. The deviation from a unit bar showsthe effect of each principal component on the respective ratio. The interpretation of the ratios plotmay only be done in an Aitchison-compositional framework (see princomp.acomp).
Value
princomp gives an object of type c("princomp.acomp","princomp") with the following content:
sdev the standard deviation of the principal components
loadings the matrix of variable loadings (i.e., a matrix which columns contain the eigen-vectors). This is of class "loadings".
princomp.aplus 171
center the ilt-transformed vector of means used to center the dataset
Center the aplus vector of means used to center the dataset
scale the scaling applied to each variable
n.obs number of observations
scores if scores = TRUE, the scores of the supplied data on the principal components.Scores are coordinates in a basis given by the principal components and thus notcompositions
call the matched call
na.action not clearly understood
Loadings vectors of amounts that represent a perturbation with the vectors represented bythe loadings of each of the factors
DownLoadings vectors of amounts that represent a perturbation with the inverses of the vectorsrepresented by the loadings of each of the factors
predict returns a matrix of scores of the observations in the newdata dataset. The other routines are mainly called for their side effect of plotting or printing and return theobject x.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
ilt,aplus, relativeLoadings princomp.acomp, princomp.rplus, barplot.aplus, mean.aplus,
Examples
data(SimulatedAmounts)pc <- princomp(aplus(sa.lognormals5))pcsummary(pc)plot(pc) #plot(pc,type="screeplot")plot(pc,type="v")plot(pc,type="biplot")plot(pc,choice=c(1,3),type="biplot")plot(pc,type="loadings")plot(pc,type="loadings",scale.sdev=-1) # Downwardplot(pc,type="relative",scale.sdev=NA) # The directionsplot(pc,type="relative",scale.sdev=1) # one sigma Upwardplot(pc,type="relative",scale.sdev=-1) # one sigma Downwardbiplot(pc)screeplot(pc)loadings(pc)relativeLoadings(pc,mult=FALSE)relativeLoadings(pc)relativeLoadings(pc,scale.sdev=1)relativeLoadings(pc,scale.sdev=2)
172 princomp.rcomp
pc$Loadingspc$DownLoadingsbarplot(pc$Loadings)pc$sdev^2cov(predict(pc,sa.lognormals5))
princomp.rcomp Principal component analysis for real compositions
Description
A principal component analysis is done in real geometry (i.e. cpt-transform) of the simplex. Somegimics simplify the interpretation of the obtained components.
Usage
## S3 method for class 'rcomp'princomp(x,...,scores=TRUE,center=attr(covmat,"center"),
covmat=var(x,robust=robust,giveCenter=TRUE),robust=getOption("robust"))
## S3 method for class 'princomp.rcomp'print(x,...)## S3 method for class 'princomp.rcomp'
plot(x,y=NULL,...,npcs=min(10,length(x$sdev)),type=c("screeplot","variance","biplot","loadings","relative"),main=NULL,scale.sdev=1)
## S3 method for class 'princomp.rcomp'predict(object,newdata,...)
Arguments
x an rcomp dataset (for princomp) or a result from princomp.rcompy not usedscores a logical indicating whether scores should be computed or notnpcs the number of components to be drawn in the scree plottype type of the plot: "screeplot" is a lined screeplot, "variance" is a boxplot-like
screeplot, "biplot" is a biplot, "loadings" displays the loadings as a barplotscale.sdev the multiple of sigma to use when plotting the loadingsmain title of the plotobject a fitted princomp.rcomp objectnewdata another compositional dataset of class rcomp... further arguments to pass to internally-called functionscovmat provides the covariance matrix to be used for the principle component analysiscenter provides the be used for the computation of scoresrobust Gives the robustness type for the calculation of the covariance matrix. See
var.rmult for details.
princomp.rcomp 173
Details
Mainly a princomp(cpt(x)) is performed. To avoid confusion, the meaningless last principal com-ponent is removed.The plot routine provides screeplots (type = "s",type= "v"), biplots (type = "b"), plots ofthe effect of loadings (type = "b") in scale.sdev*sdev-spread, and loadings of pairwise differ-ences (type = "r").The interpretation of a screeplot does not differ from ordinary screeplots. It shows the eigenvaluesof the covariance matrix, which represent the portions of variance explained by the principal com-ponents.The interpretation of the biplot strongly differs from a classical one. The relevant variables arenot the arrows drawn (one for each component), but rather the links (i.e., the differences) betweentwo arrow heads, which represents the difference between the two components represented by thearrows, or the transfer of mass from one to the other.The compositional loading plot is more or less a standard one. The loadings are displayed by abarplot as positve and negative changes of amounts.The loading plot can work in two different modes: If scale.sdev is set to NA it displays the com-position being represented by the unit vector of loadings in cpt-transformed space. If scale.sdevis numeric we use this composition scaled by the standard deviation of the respective component.The relative plot displays the relativeLoadings as a barplot. The deviation from a unit bar showsthe effect of each principal component on the respective differences.
Value
princomp gives an object of type c("princomp.rcomp","princomp") with the following content:
sdev the standard deviation of the principal components.
loadings the matrix of variable loadings (i.e., a matrix which columns contain the eigen-vectors). This is of class "loadings". The last eigenvalue is removed since itshould contain the irrelevant scaling.
Loadings the loadings as an rmult-object
center the cpt-transformed vector of means used to center the dataset
Center the rcomp vector of means used to center the dataset
scale the scaling applied to each variable
n.obs number of observations
scores if scores = TRUE, the scores of the supplied data on the principal components.Scores are coordinates in a basis given by the principal components and thus notcompositions
call the matched call
na.action not clearly understood
predict returns a matrix of scores of the observations in the newdata dataset. The other routinesare mainly called for their side effect of plotting or printing and return the object x.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
174 princomp.rmult
See Also
cpt,rcomp, relativeLoadings princomp.acomp, princomp.rplus,
Examples
data(SimulatedAmounts)pc <- princomp(rcomp(sa.lognormals5))pcsummary(pc)plot(pc) #plot(pc,type="screeplot")plot(pc,type="v")plot(pc,type="biplot")plot(pc,choice=c(1,3),type="biplot")plot(pc,type="loadings")plot(pc,type="loadings",scale.sdev=-1) # Downwardplot(pc,type="relative",scale.sdev=NA) # The directionsplot(pc,type="relative",scale.sdev=1) # one sigma Upwardplot(pc,type="relative",scale.sdev=-1) # one sigma Downwardbiplot(pc)screeplot(pc)loadings(pc)relativeLoadings(pc,mult=FALSE)relativeLoadings(pc)relativeLoadings(pc,scale.sdev=1)relativeLoadings(pc,scale.sdev=2)
pc$sdev^2cov(predict(pc,sa.lognormals5))
princomp.rmult Principal component analysis for real data
Description
Performs a principal component analysis for datasets of type rmult.
Usage
## S3 method for class 'rmult'princomp(x,cor=FALSE,scores=TRUE,
covmat=var(rmult(x[subset,]),robust=robust,giveCenter=TRUE),center=attr(covmat,"center"), subset = rep(TRUE, nrow(x)),..., robust=getOption("robust"))
Arguments
x a rmult-dataset
... Further arguments to call princomp.default
princomp.rmult 175
cor logical: shall the computation be based on correlations rather than covariances?
scores logical: shall scores be computed?
covmat provides the covariance matrix to be used for the principle component analysis
center provides the be used for the computation of scores
subset A rowindex to x giving the columns that should be used to estimate the variance.
robust Gives the robustness type for the calculation of the covariance matrix. Seevar.rmult for details.
Details
The function just does princomp(unclass(x),...,scale=scale) and is only here for conve-nience.
Value
An object of type princomp with the following fields
sdev the standard deviation of the principal components.
loadings the matrix of variable loadings (i.e., a matrix whose columns contain the eigen-vectors). This is of class "loadings".
center the mean that was substracted from the data set
scale the scaling applied to each variable
n.obs number of observations
scores if scores = TRUE, the scores of the supplied data on the principal components.Scores are coordinates in a basis given by the principal components.
call the matched call
na.action Not clearly understood
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
princomp.rplus
Examples
data(SimulatedAmounts)pc <- princomp(rmult(sa.lognormals5))pcsummary(pc)plot(pc)screeplot(pc)screeplot(pc,type="l")biplot(pc)biplot(pc,choice=c(1,3))
176 princomp.rplus
loadings(pc)plot(loadings(pc))pc$sdev^2cov(predict(pc,sa.lognormals5))
princomp.rplus Principal component analysis for real amounts
Description
A principal component analysis is done in real geometry (i.e. using iit-transform).
Usage
## S3 method for class 'rplus'princomp(x,...,scores=TRUE,center=attr(covmat,"center"),
covmat=var(x,robust=robust,giveCenter=TRUE),robust=getOption("robust"))
## S3 method for class 'princomp.rplus'print(x,...)## S3 method for class 'princomp.rplus'
plot(x,y=NULL,...,npcs=min(10,length(x$sdev)),type=c("screeplot","variance","biplot","loadings","relative"),main=NULL,scale.sdev=1)
## S3 method for class 'princomp.rplus'predict(object,newdata,...)
Arguments
x an rplus-dataset (for princomp) or a result from princomp.rplus
y not used
scores a logical indicating whether scores should be computed or not
npcs the number of components to be drawn in the scree plot
type type of the plot: "screeplot" is a lined screeplot, "variance" is a boxplot-likescreeplot, "biplot" is a biplot, "loadings" displays the loadings as a barplot
scale.sdev the multiple of sigma to use when plotting the loadings
main title of the plot
object a fitted princomp.rplus object
newdata another amount dataset of class rcomp
... further arguments to pass to internally-called functions
covmat provides the covariance matrix to be used for the principle component analysis
center provides the be used for the computation of scores
robust Gives the robustness type for the calculation of the covariance matrix. Seevar.rmult for details.
princomp.rplus 177
Details
Mainly a princomp(iit(x)) is performed. Note all parts in a composition or in an amount vectorshare a natural scaling. Therefore, they do not need any preliminary standardization (which in factwould produce a loss of important information). For this reason, princomp.rplus works on thecovariance matrix.The plot routine provides screeplots (type = "s",type= "v"), biplots (type = "b"), plots ofthe effect of loadings (type = "b") in scale.sdev*sdev-spread, and loadings of pairwise differ-ences (type = "r").The interpretation of a screeplot does not differ from ordinary screeplots. It shows the eigenvaluesof the covariance matrix, which represent the portions of variance explained by the principal com-ponents.The interpretation of the biplot uses, additionally to the classical interperation, a compositional con-cept: the differences between two arrowheads can be interpreted as the shift of mass between thetwo components represented by the arrows.The amount loading plot is more or less a standard loadings plot. The loadings are displayed by abarplot as positive and negative changes of amounts.The loadings plot can work in two different modes: If scale.sdev is set to NA it displays the amountvector being represented by the unit vector of loadings in the iit-transformed space. If scale.sdevis numeric we use this amount vector scaled by the standard deviation of the respective component.The relative plot displays the relativeLoadings as a barplot. The deviation from a unit bar showsthe effect of each principal component on the respective differences.
Value
princomp gives an object of type c("princomp.rcomp","princomp") with the following content:
sdev the standard deviation of the principal components
loadings the matrix of variable loadings (i.e., a matrix which columns contain the eigen-vectors). This is of class "loadings"
Loadings the loadings as an "rmult"-object
center the iit-transformed vector of means used to center the dataset
Center the rplus vector of means used to center the dataset (center and Center haveno difference, except that the second has a class)
scale the scaling applied to each variable
n.obs number of observations
scores if scores = TRUE, the scores of the supplied data on the principal components.Scores are coordinates in a basis given by the principal components and thus notcompositions
call the matched call
na.action not clearly understood
predict returns a matrix of scores of the observations in the newdata dataset.The other routines are mainly called for their side effect of plotting or printing and return the objectx.
178 print.acomp
See Also
iit,rplus, relativeLoadings princomp.rcomp, princomp.aplus,
Examples
data(SimulatedAmounts)pc <- princomp(rplus(sa.lognormals5))pcsummary(pc)plot(pc) #plot(pc,type="screeplot")plot(pc,type="v")plot(pc,type="biplot")plot(pc,choice=c(1,3),type="biplot")plot(pc,type="loadings")plot(pc,type="loadings",scale.sdev=-1) # Downwardplot(pc,type="relative",scale.sdev=NA) # The directionsplot(pc,type="relative",scale.sdev=1) # one sigma Upwardplot(pc,type="relative",scale.sdev=-1) # one sigma Downwardbiplot(pc)screeplot(pc)loadings(pc)relativeLoadings(pc,mult=FALSE)relativeLoadings(pc)relativeLoadings(pc,scale.sdev=1)relativeLoadings(pc,scale.sdev=2)
pc$sdev^2cov(predict(pc,sa.lognormals5))
print.acomp Printing compositional data.
Description
Prints compositional objects with appropriate missing encodings.
Usage
## S3 method for class 'acomp'print(x,...,replace0=TRUE)## S3 method for class 'aplus'print(x,...,replace0=TRUE)## S3 method for class 'rcomp'print(x,...,replace0=FALSE)## S3 method for class 'rplus'print(x,...,replace0=FALSE)
print.acomp 179
Arguments
x a compositional object
... further arguments to print.default
replace0 logical: Shall 0 be treated as "Below detection Limit" with unkown limit.
Details
Missings are displayed with an appropriate encoding:
• MARMissing at random: The value is missing independently of its true value.
• MNARMissing NOT at random: The value is missing dependently of its true value, but with-out a known systematic. Maybe a better name would be: Value dependen missingness.
• BDTbelow detection limit (with unspecified detection limit): The value is missing because itwas below an unkown detection limit.
• <Detectionlimitbelow detection limit (with specified detection limit): The value is below thedisplayed detection limit.
• SZStructural Zero: A true value is either bound to be zero or does not exist for structuralnonrandom reasons. E.g. the portion of pregnant girls at a boys school.
• ERRError: An illegal encoding value was found in the object.
Value
An invisible version of x.
Missing Policy
The policy of treatment of zeroes, missing values and values below detecion limit is explained indepth in compositions.missings.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Boogaart, K.G. v.d., R. Tolosana-Delgado, M. Bren (2006) Concepts for handling of zeros and miss-ing values in compositional data, in: E. Pirard (ed.) (2006)Proceedings of the IAMG’2006 AnnualConference on "Quantitative Geology from multiple sources", September 2006, Liege, Belgium„S07-01, 4pages, ISBN 978-2-9600644-0-7
See Also
clr,acomp, plot.acomp, boxplot.acomp, barplot.acomp, mean.acomp, var.acomp, variation.acomp,zeroreplace
180 pwlrPlot
Examples
data(SimulatedAmounts)mydata <- simulateMissings(sa.groups5,dl=0.01,knownlimit=TRUE,
MAR=0.05,MNARprob=0.05,SZprob=0.05)mydata[1,1]<-BDLvalueprint(aplus(mydata))print(aplus(mydata),digits=3)print(acomp(mydata))print(rplus(mydata))print(rcomp(mydata))
pwlrPlot Plots of pairwise logratio against a covariable.
Description
Creates a matrix of plots, with each pairwise logratio against a covariable. The covariable can benumeric or factor, and play the role of X or Y axis.
Usage
pwlrPlot(x,y,...,add.line=FALSE,line.col=2,add.robust=FALSE,rob.col=4)
Arguments
x a vector, a column of a data.frame, or an acomp representing the first set ofthings to be displayed. Either x or y must be an acomp object, and the othermust be a covariable. Both factors and continuous covariables allowed here.
y a vector, a column of a data.frame, or an acomp representing the first set ofthings to be displayed. Either x or y must be an acomp object, and the othermust be a covariable. Factors to be used here with caution.
... furter parameters to the panel function
add.line logical, to control the addition of a regression line in each panel. Ignored ifcovariable is a factor.
line.col in case the regression line is added, which color should be used? Defaults to red.
add.robust logical, to control the addition of a robust regression line in each panel. Ignoredif covariable is a factor. This is nowadays based on lmrob, but this can changein the future.
rob.col in case the robust regression line is added, which color should be used? Defaultsto blue.
pwlrPlot 181
Details
This function generates a matrix of plots of all possible pairwise logratios of the acomp argument,plotted against a covariable. The covariable can be a factor or a numeric vector, or a column ofa matrix or data.frame. Covariable and composition can both be represented in X or Y axis: afactor on X axis generates a boxplot; a factor on Y axis generates a spineplot; if the covariable isnumeric, a default scatterplot is generated. All dot arguments are passed to these plotting functions.In any of these cases, the diagram shows the logratio of the component in the row divided by thecomponent in the column. In the case of a numeric covariable, both classical and robust regressionlines can be added.
Author(s)
Raimon Tolosana-Delgado, K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Boogaart, K.G. v.d. , R. Tolosana (2008) Mixing Compositions and Other scales, Proceedings ofCodaWork 08.
http://ima.udg.es/Activitats/CoDaWork03
http://ima.udg.es/Activitats/CoDaWork05
http://ima.udg.es/Activitats/CoDaWork08
See Also
plot.aplus, pairwisePlot, boxplot, spineplot, plot.default
Examples
data(Hydrochem)xc = acomp(Hydrochem[,c("Ca","Mg","Na","K")])fk = Hydrochem$RiverpH = -log10(Hydrochem$H)## x=acomp, y=factorpwlrPlot(xc, fk, border=2:5)## x=factor, y=acomppwlrPlot(fk,xc, col=2:5)## x=acomp, y=numeric, with colors by riverpwlrPlot(xc, pH, col=as.integer(fk)+1)## x=numeric, y=acomp, with linepwlrPlot(pH, xc, add.robust=TRUE)
182 qqnorm
qqnorm Normal quantile plots for compositions and amounts
Description
The plots allow to check the normal distribution of multiple univaritate marginals by normal quantile-quantile plots. For the different interpretations of amount data a different type of normality is as-sumed and checked. When an alpha-level is given the marginal displayed in each panel is checkedfor normality.
Usage
## S3 method for class 'acomp'qqnorm(y,fak=NULL,...,panel=vp.qqnorm,alpha=NULL)## S3 method for class 'rcomp'qqnorm(y,fak=NULL,...,panel=vp.qqnorm,alpha=NULL)## S3 method for class 'aplus'qqnorm(y,fak=NULL,...,panel=vp.qqnorm,alpha=NULL)## S3 method for class 'rplus'qqnorm(y,fak=NULL,...,panel=vp.qqnorm,alpha=NULL)vp.qqnorm(x,y,...,alpha=NULL)
Arguments
y a dataset
fak a factor to split the dataset, not yet implemented in aplus and rplus
panel the panel function to be used or a list of multiple panel functions
alpha the alpha level of a test for normality to be performed for each of the dis-played marginals. The levels are adjusted for multiple testing with a Bonferroni-correction (i.e. dividing each of the alpha-level by the number of test performed)
... further graphical parameters
x used by pairs only. Internal use
Details
qqnorm.rplus and qqnorm.rcomp display qqnorm plots of individual amounts (on the diagonal),of pairwise differences of amounts (above the diagonal) and of pairwise sums of amounts (belowthe diagonal).qqnorm.aplus displays qqnorm-plots of individual log-amounts (on the diagonal), of pairwise log-ratios of amounts (above the diagonal) and of pairwise sums of log amount (below the diagonal).qqnorm.acomp displays qqnorm-plots of pairwise log-ratios of amounts in all of diagonal panels.Nothing is displayed on the diagonal.In all cases a joint normality of the original data in the selected framework would imply normalityin all displayed marginal distributions (although the reciprocal is in general not true!).The marginal normality can be checked in each of the plots using a shapiro.test, by specifying an
R2 183
alpha level. The alpha level is corrected for multiple testing. Plots displaying a marginal distributionsignificantly deviating from a normal distribution at that alpha level are marked by a red exclamationmark.vp.qqnorm is internally used as a panel function to make high dimensional plots.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
plot.acomp, boxplot.acomp, rnorm.acomp, rnorm.rcomp, rnorm.aplus, rnorm.rplus
Examples
data(SimulatedAmounts)qqnorm(acomp(sa.lognormals),alpha=0.05)qqnorm(rcomp(sa.lognormals),alpha=0.05)qqnorm(aplus(sa.lognormals),alpha=0.05)qqnorm(rplus(sa.lognormals),alpha=0.05)
R2 R square
Description
The R2 measure of determination for linear models
Usage
R2(object,...)## S3 method for class 'lm'R2(object,...,adjust=TRUE,ref=0)## Default S3 method:R2(object,...,ref=0)
Arguments
object a statistical model
... further not yet used parameters
adjust Logical, whether the estimate of R2 should be adjusted for the degrees of free-dom of the model.
ref A reference model for computation of a relative R2.
184 rAitchison
Details
The R2 measure of determination is defined as:
R2 = 1− var(residuals)
var(data)
and provides the portion of variance explained by the model. It is a number between 0 and 1, where1 means the model perfectly explains the data and 0 means that the model has no better explanationof the data than a constant mean. In case of multivariate models metric variances are used.
If a reference model is given by ref, the variance of the residuals of that models rather than thevariance of the data is used. The value of such a relative R2 estimates how much of the residualvariance is explained.
If adjust=TRUE the unbiased estiamators for the variances are used, to avoid the automatisme thata more parameters automatically lead to a higher R2.
Value
The R2 measure of determination.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
lm, mvar, AIC
Examples
data(Orange)R2(lm(circumference~age,data=Orange))R2(lm(log(circumference)~log(age),data=Orange))
rAitchison Aitchison Distribution
Description
The Aitchison distribution is a class of distributions the simplex, containing the normal and theDirichlet as subfamilies.
rAitchison 185
Usage
dAitchison(x,theta=alpha+sigma %*% clr(mu),beta=-1/2*gsi.svdinverse(sigma),alpha=mean(theta),mu=clrInv(c(sigma%*%(theta-alpha))),sigma=-1/2*gsi.svdinverse(beta),grid=30,realdensity=FALSE,expKappa=AitchisonDistributionIntegrals(theta,beta,
grid=grid,mode=1)$expKappa)rAitchison(n,
theta=alpha+sigma %*% clr(mu),beta=-1/2*gsi.svdinverse(sigma),alpha=mean(theta),mu=clrInv(c(sigma%*%(theta-alpha))),sigma=-1/2*gsi.svdinverse(beta), withfit=FALSE)
AitchisonDistributionIntegrals(theta=alpha+sigma %*% clr(mu),beta=-1/2*gsi.svdinverse(sigma),alpha=mean(theta),mu=clrInv(c(sigma%*%(theta-alpha))),sigma=-1/2*gsi.svdinverse(beta),grid=30,mode=3)
Arguments
x acomp-compositions the density should be computed for.
n integer: number of datasets to be simulated
theta numeric vector: Location parameter vector
beta matrix: Spread parameter matrix (clr or ilr)
alpha positiv scalar: departure from normality parameter (positive scalar)
mu acomp-composition, normal reference mean parameter composition
sigma matrix: normal reference variance matrix (clr or ilr)
grid integer: number of discretisation points along each side of the simplex
realdensity logical: if true the density is given with respect to the Haar measure of the realsimplex, if false the density is given with respect to the Aitchison measure ofthe simplex.
mode integer: desired output: -1: Compute nothing, only transform parameters,0: Compute only oneIntegral and kappaIntegral,1: compute also the clrMean,2: compute also the clrSqExpectation,3: same as 2, but compute clrVar instead of clrSqExpectation
186 rAitchison
expKappa The closing divisor of the density
withfit Should a pre-spliting of the Aitchison density be used for simulation?
Details
The Aitchison Distribution is a joint generalisation of the Dirichlet Distribution and the additivelog-normal distribution (or normal on the simplex). It can be parametrized by Ait(theta,beta) orby Ait(alpha,mu,Sigma). Actually, beta and Sigma can be easily transformed into each other, suchthat only one of them is necessary. Parameter theta is a vector in RD, alpha is its sum, mu is acomposition in SD, and beta and sigma are symmetric matrices, which can either be expressed inilr or clr space. The parameters are transformed as
β = −1/2Σ−1
θ = clr(µ)Σ + α(1, . . . , 1)t
The distribution exists, if either, α ≥ 0 and Sigma is positive definite (or beta negative definite) inilr-coordinates, or if each theta is strictly positive and Sigma has at least one positive eigenvalue (orbeta has at least one negative eigenvalue). The simulation procedure currently only works with thefirst case.AitchisonDistributionIntegral is a convenience function to compute the parameter transformationand several functions of these parameters. This is done by numerical integration over a multinomialsimplex lattice of D parts with grid many elements (see xsimplex).The density of the Aitchison distribution is given by:
f(x, θ, β) = exp((θ − 1)t log(x) + ilr(x)tβilr(x))/exp(κAit(θ,β))
with respect to the classical Haar measure on the simplex, and as
f(x, θ, β) = exp(θt log(x) + ilr(x)tβilr(x))/exp(κAit(θ,β))
with respect to the Aitchison measure of the simplex. The closure constant expKappa is computednumerically, in AitchisonDistributionIntegrals.
The random composition generation is done by rejection sampling based on an optimally fittedadditive logistic normal distribution. Thus, it only works if the correponding Sigma in ilr would bepositive definite.
Value
dAitchison Returns the density of the Aitchison distribution evaluated at x as a numericvector.
rAitchison Returns a sample of size n of simulated compostions as an acomp object.AitchisondistributionIntegrals
Returns a list with
• thetathe theta parameter given or computed• betathe beta parameter given or computed• alphathe alpha parameter given or computed• muthe mu parameter given or computed• sigmathe sigma parameter given or computed
ratioLoadings 187
• expKappathe integral over the density without closing constant. I.e. theinverse of the closing constant and the exp of κAit(θ,β)
• kappaIntegralThe expected value of the mean of the logs of the componentsas numerically computed
• clrMeanThe mean of the clr transformed random variable, computed nu-merically
• clrSqExpectationThe expectation of clr(X)clr(X)t computed numerically.• clrVarThe variance covariance matrix of clr(X), computed numerically
Note
The simulation procedure currently only works with a positive definite Sigma. You need a relativelyhigh grid constant for precise values in the numerical integration.
Author(s)
K.Gerald v.d. Boogaart, R. Tolosana-Delgado http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
runif.acomp, rnorm.acomp, rDirichlet.acomp
Examples
(erg<-AitchisonDistributionIntegrals(c(-1,3,-2),ilrvar2clr(-diag(c(1,2))),grid=20))
(myvar<-with(erg, -1/2*ilrvar2clr(solve(clrvar2ilr(beta)))))(mymean<-with(erg,myvar%*%theta))
with(erg,myvar-clrVar)with(erg,mymean-clrMean)
ratioLoadings Loadings of relations of two amounts
Description
In a compositional dataset the relation of two objects can be interpreted safer than a single amount.These functions compute, display and plot the corresponding pair-information for the various prin-cipal component analysis results.
188 ratioLoadings
Usage
relativeLoadings(x,...)## S3 method for class 'princomp.acomp'relativeLoadings(x,...,log=FALSE,scale.sdev=NA,
cutoff=0.1)## S3 method for class 'princomp.aplus'relativeLoadings(x,...,log=FALSE,scale.sdev=NA,
cutoff=0.1)## S3 method for class 'princomp.rcomp'relativeLoadings(x,...,scale.sdev=NA,
cutoff=0.1)## S3 method for class 'princomp.rplus'relativeLoadings(x,...,scale.sdev=NA,
cutoff=0.1)## S3 method for class 'relativeLoadings.princomp.acomp'print(x,...,cutoff=attr(x,"cutoff"),
digits=2)## S3 method for class 'relativeLoadings.princomp.aplus'print(x,...,cutoff=attr(x,"cutoff"),
digits=2)## S3 method for class 'relativeLoadings.princomp.rcomp'print(x,...,cutoff=attr(x,"cutoff"),
digits=2)## S3 method for class 'relativeLoadings.princomp.rplus'print(x,...,cutoff=attr(x,"cutoff"),
digits=2)## S3 method for class 'relativeLoadings.princomp.acomp'plot(x,...)## S3 method for class 'relativeLoadings.princomp.aplus'plot(x,...)## S3 method for class 'relativeLoadings.princomp.rcomp'plot(x,...)## S3 method for class 'relativeLoadings.princomp.rplus'plot(x,...)
Arguments
x a result from an amount PCA princomp.acomp/princomp.aplus/princomp.rcomp/princomp.rplus
log a logical indicating to use log-ratios instead of ratios
scale.sdev if not NA, a number specifying the multiple of a standard deviation, used to scalethe components
cutoff a single number. Changes under that (log)-cutoff are not displayed
digits the number of digits to be displayed
... further parameters to internally-called functions
rcomp 189
Details
The relative loadings of components allow a direct interpretation of the effects of principal com-ponents. For acomp/aplus classes the relation is induced by a ratio, which can optionally be log-transformed. For the rcomp/rplus-classes the relation is induced by a difference, which is meaning-less when the units are different.
Value
The value is a matrix of type "relativeLoadings.princomp.*", containing the ratios in the com-positions represented by the loadings (optionally scaled by the standard deviation of the componentsand scale.sdev).
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
princomp.acomp, princomp.aplus, princomp.rcomp, princomp.rplus, barplot
Examples
data(SimulatedAmounts)pc <- princomp(acomp(sa.lognormals5))pcsummary(pc)relativeLoadings(pc,log=TRUE)relativeLoadings(pc)relativeLoadings(pc,scale.sdev=1)relativeLoadings(pc,scale.sdev=2)
plot(relativeLoadings(pc,log=TRUE))plot(relativeLoadings(pc))plot(relativeLoadings(pc,scale.sdev=1))plot(relativeLoadings(pc,scale.sdev=2))
rcomp Compositions as elements of the simplex embedded in the D-dimensional real space
Description
A class providing a way to analyse compositions in the philosophical framework of the Simplex assubset of the RD.
190 rcomp
Usage
rcomp(X,parts=1:NCOL(oneOrDataset(X)),total=1,warn.na=FALSE,detectionlimit=NULL,BDL=NULL,MAR=NULL,MNAR=NULL,SZ=NULL)
Arguments
X composition or dataset of compositions
parts vector containing the indices xor names of the columns to be used
total the total amount to be used, typically 1 or 100
warn.na should the user be warned in case of NA,NaN or 0 coding different types ofmissing values?
detectionlimit a number, vector or matrix of positive numbers giving the detection limit of allvalues, all columns or each value, respectively
BDL the code for ’Below Detection Limit’ in X
SZ the code for ’Structural Zero’ in X
MAR the code for ’Missing At Random’ in X
MNAR the code for ’Missing Not At Random’ in X
Details
Many multivariate datasets essentially describe amounts of D different parts in a whole. This hassome important implications justifying to regard them as a scale on its own, called a "composition".The functions around the class "rcomp" follow the traditional (often statistically inconsistent) ap-proach regarding compositions simply as a multivariate vector of positive numbers summing upto 1. This space of D positive numbers summing to 1 is traditionally called the D-1-dimensionalsimplex.
The compositional scale was in-depth analysed by Aitchison (1986) and he found serious reasonswhy compositional data should be analysed with a different geometry. The functions around theclass "acomp" follow his approach. However the Aitchison approach based on log-ratios is some-times criticized (e.g. Rehder and Zier, 2002). It cannot deal with absent parts (i.e. zeros). It issensitive to large measurement errors in small amounts. The Aitchison operations cannot representsimple mixture of different compositions. The used transformations are not uniformly continuous.Straight lines and ellipses in Aitchison space look strangely in ternary diagrams. As all uncriticalstatistical analysis, blind application of logratio-based analysis is sometimes misleading. Thereforeit is sometimes useful to analyse compositional data directly as a multivariate dataset of portionssumming to 1. However a clear warning must be given that the utilisation of almost any kind ofclassical multivariate analysis introduce some kinds of artifacts (e.g. Chayes 1960) when appliedto compositional data. So, extra care and considerable expert knowlegde is needed for the properinterpretation of results achieved in this non-Aitchison approach. The package tries to lead the useraround these artifacts as much as possible and gives hints to major pitfalls in the help. Howevermeaningless results cannot be fully avoided in this (rather inconsistent) approach.A side effect of the procedure is to force the compositions to sum to one, which is done by theclosure operation clo .The classes rcomp, acomp, aplus, and rplus are designed in a fashion as similar as possible, in orderto allow direct comparison between results achieved by the different approaches. Especially the
rcomp 191
acomp logistic transforms clr, alr, ilr are mirrored by analogous linear transforms cpt, apt, iptin the rcomp class framework.
Value
a vector of class "rcomp" representing a closed composition or a matrix of class "rcomp" repre-senting multiple closed compositions, by rows.
Missing Policy
Missing and Below Detecion Limit Policy is explained in deeper detail in compositions.missing.
Author(s)
Raimon Tolosana-Delgado, K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Rehder, S. and U. Zier (2001) Letter to the Editor: Comment on “Logratio Analysis and Composi-tional Distance” by J. Aitchison, C. Barcel\’o-Vidal, J.A. Mart\’in-Fern\’a ndez and V. Pawlowsky-Glahn, Mathematical Geology, 33 (7), 845-848.
Zier, U. and S. Rehder (2002) Some comments on log-ratio transformation and compositional dis-tance, Terra Nostra, Schriften der Alfred Wegener-Stiftung, 03/2003
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
cpt, apt, ipt, acomp, rplus, princomp.rcomp, plot.rcomp, boxplot.rcomp, barplot.rcomp,mean.rcomp, var.rcomp, variation.rcomp, cov.rcomp, msd, convex.rcomp, +.rcomp
Examples
data(SimulatedAmounts)plot(rcomp(sa.tnormals))
192 rcomparithm
rcomparithm Arithmetic operations for compositions in a real geometry
Description
The real compositions form a manifold of the real vector space. The induced operations +,-,*,/ giveresults valued in the real vector space, but possibly outside the simplex.
Usage
convex.rcomp(x,y,alpha=0.5)## Methods for class "rcomp"## x+y## x-y## -x## x*r## r*x## x/r
Arguments
x an rcomp composition or dataset of compositionsy an rcomp composition or dataset of compositionsr a numeric vector of size 1 or nrow(x)alpha a numeric vector of size 1 or nrow(x) with values between 0 and 1
Details
The functions behave quite like +.rmult.The convex combination is defined as: x*alpha + (1-alpha)*y
Value
rmult-objects containing the given operations on the simplex as subset of the RD. Only the convexcombination convex.rcomp results in an rcomp-object again, since only this operation is closed.
Note
For * the arguments x and y can be exchanged.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
+.rmult, +.acomp,cpt, rcomp, rmult
rcompmargin 193
Examples
rcomp(1:5)* -1 + rcomp(1:5)data(SimulatedAmounts)cdata <- rcomp(sa.lognormals)plot( tmp <- (cdata-mean(cdata))/msd(cdata) )class(tmp)mean(tmp)msd(tmp)var(tmp)plot(convex.rcomp(rcomp(c(1,1,1)),sa.lognormals,0.1))
rcompmargin Marginal compositions in real geometry
Description
Compute marginal compositions by amalgamating the rest (additively).
Usage
rcompmargin(X,d=c(1,2),name="+",pos=length(d)+1,what="data")
Arguments
X composition or dataset of compositions
d vector containing the indices xor names of the columns to be kept
name The new name of the amalgamation column
pos The position where the new amalgamation column should be stored. This de-faults to the last column.
what The role of X either "data" for data (or means) to be transformed or "var" forvariances to be transformed.
Details
The amalgamation column is simply computed by adding the non-selected components after closingthe composition. This is consistent with the rcomp approach and is widely used because of its easyinterpretation. However, it often leads to difficult-to-read ternary diagrams and is inconsistent withthe acomp approach.
With the argument what="var" the function transformes an rcomp variance to the resulting varianceof the resulting composition.
Value
A closed compositions with class "rcomp" containing the selected variables given by d and the theamalgamation column.
194 rDirichlet
Missing Policy
MNAR has the highest priority, MAR next and WZERO (BDL,SZ),- values are considered as 0 andreported as BDL in the End.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon olosana-Delgado
References
References missing
See Also
acompmargin, rcomp
Examples
data(SimulatedAmounts)plot.rcomp(sa.tnormals5,margin="rcomp")plot.rcomp(rcompmargin(sa.tnormals5,c("Cd","Zn")))plot.rcomp(rcompmargin(sa.tnormals5,c(1,2)))
rDirichlet Dirichlet distribution
Description
The Dirichlet distribution on the simplex.
Usage
rDirichlet.acomp(n,alpha)rDirichlet.rcomp(n,alpha)
Arguments
n number of datasets to be simulated
alpha parameters of the Dirichlet distribution
Details
The Dirichlet distribution is the result of closing a vector of equally-scaled Gamma-distributed vari-ables. It the conjugate prior distribution for a vector of probabilities of a multinomial distribution.Thus, it generalizes the beta distribution for more than two parts.
Read standard data files 195
Value
a generated random dataset of class "acomp" or "rcomp", drawn from a Dirichlet distribution withthe given parameter alpha. The names of alpha are used to name the parts.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Mateu-Figueras, G.; Pawlowsky-Glahn, V. (2005). The Dirichlet distribution with respect to theAitchison measure on the simplex, a first approach. In: Mateu-Figueras, G. and Barcel\’o-Vidal, C.(Eds.) Proceedings of the 2nd International Workshop on Compositional Data Analysis, Universitatde Girona, ISBN 84-8458-222-1, http://ima.udg.es/Activitats/CoDaWork05
See Also
rnorm.acomp
Examples
tmp <- rDirichlet.acomp(10,alpha=c(A=2,B=0.2,C=0.2))plot(tmp)
Read standard data files
Reads a data file in a geoeas format
Description
Reads a data file, which must be formatted either as a geoEAS file (described below).
Usage
read.geoeas(file)read.geoEAS(file)
Arguments
file a file name, with a specific format
196 Read standard data files
Details
The data files must be in the adequate format: "read.geoEAS" and "read.geoeas" read geoEASformat.
The geoEAS format has the following structure:
• a first row with a description of the data set
• the number of variables (=nvars)
• "nvars" rows, each containing the name of a variable
• the data set, in a matrix of "nvars" columns and as many rows as individuals
Value
A data set, with a "title" attribute.
Note
Labels and title should not contain tabs. This might produce an error when reading.
Author(s)
Raimon Tolosana-Delgado
References
Missing references
See Also
read.table
Examples
## Files can be found in the test-subdirectory of the package### Not run:
read.geoeas("TRUE.DAT")read.geoEAS("TRUE.DAT")
## End(Not run)
replot 197
replot Modify parameters of compositional plots.
Description
Display only a subset of the plots.
Usage
replot(...,dev=dev.cur(),plot=TRUE,envir=NULL,add=FALSE)replotable(expr,add=FALSE)noreplot(expr,dev=dev.cur())
Arguments
expr A (unquoted) expression that does the plotting. replotable will make the gen-erated plot replotable and noreplot will do the inverse and avoid that the plotsoverwrites the current database entry.
... Plot parameters to be modified. E.g. onlyPaneldev The device that currently contains the plot. It will be plotted in the current
device.plot logical or call or list of calls. If plot is TRUE, the new version of the plot is
plotted in the current environment (and typically stores itself here).If plot is FALSE the modified plot is simply stored, rather than actually plotted(in its own old plotting environment).If the parameter is something else, it is stored to the internal plot database forthe given device (but not plotted or evaluated).
envir a new enviroment to be assigned to the plot. Rarely needed.add either a logical to indicating that the plot adds something to the plot. Or a num-
ber / name of the added thing to be modified.
Details
Some of the plot routines of compositions internally store their call as a mean for replaying theplot when information is added or parameters are modified. The stored call can be modified by thisfunction, which pretty much works like a simplified version of update.
replot allows to redo the plot typically in a different device or with different parameters. Thefunction provides this functionallity at a totally different level than dev.copy and allows for themodification of high level parameters on the fly.
Plots can be stored in the internal database by calling replot with a parameter plot set to thecall of that plot. Plotting functions without this functionallity can be filtered through replotable().However in this case all parameter names should be given explicitly.
There are actually three levels of possible replay: The dev.copy level on which graphic actions arereplayed. The gsi.pairs function level that organizes panels plots and uses an internal replottingfacility to allow modification of the parameter, e.g. addings lines .... And than there is the high levelof the actual function call generating the plot.
198 replot
Value
replot returns an invisible copy of the modified call. replotable and noreplot return the resultof expression.
Note
The function works by revaluating the call in its environment. Thus the plot will change!!! if thedata has changed.
The function always handles the latest plot from the package. If another plot ignorant of the replotsystem has meanwhile be used it will be ignored.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
plot.acomp, plot.aplus, boxplot.acomp
Examples
data(SimulatedAmounts)plot(acomp(sa.lognormals5))straight(acomp(c(1,1,1,1,1)),acomp(c(1,2,3,4,5)))replot(onlyPanel=c(2,3))oldPlot <- replot(plot=FALSE) # get the plotting callreplotable(plot(x=1:10)) # To make a graphic replottablereplot(col=1:10)replot(plot=oldPlot) # Restore the old plot (without plotting)replot(onlyPanel=NULL) # View the whole plot againreplot(pch=20) # Acctually plot itreplot(col=20) # since the actual plot is gsi.pairs not a plot.acomp
## Not run:# The following line in a plotting function stores the plot for replotting.replot(plot=match.call()) # Store current call as plotreplot() # simply plot once againreplot(dev=otherdev) # redo a plot from an other device here.replot(onlyPanel=c(3,4)) # modify the plot (and replot it)replot(onlyPanel=c(3,4),dev=7,plot=FALSE) # modify a stored plot
## End(Not run)
rlnorm 199
rlnorm The multivariate lognormal distribution
Description
Generates random amounts with a multivariate lognormal distribution, or gives the density of thatdistribution at a given point.
Usage
rlnorm.rplus(n,meanlog,varlog)dlnorm.rplus(x,meanlog,varlog)
Arguments
n number of datasets to be simulated
meanlog the mean-vector of the logs
varlog the variance/covariance matrix of the logs
x vectors in the sample space
Value
rlnorm.rplus gives a generated random dataset of class "rplus" following a lognormal distribu-tion with logs having mean meanlog and variance varlog.dlnorm.rplus gives the density of the distribution with respect to the Lesbesgue measure on R+ asa subset of R.
Note
The main difference between rlnorm.rplus and rnorm.aplus is that rlnorm.rplus needs a loggedmean. The additional difference for the calculation of the density by dlnorm.rplus and dnorm.aplusis the reference measure (a log-Lebesgue one in the second case).
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
rnorm.acomp
200 rMahalanobis
Examples
MyVar <- matrix(c(0.2,0.1,0.0,0.1,0.2,0.0,0.0,0.0,0.2),byrow=TRUE,nrow=3)MyMean <- c(1,1,2)
plot(rlnorm.rplus(100,log(MyMean),MyVar))plot(rnorm.aplus(100,MyMean,MyVar))x <- rnorm.aplus(5,MyMean,MyVar)dnorm.aplus(x,MyMean,MyVar)dlnorm.rplus(x,log(MyMean),MyVar)
rMahalanobis Compute distributions of empirical Mahalanobis distances based onsimulations
Description
Decissions about outliers are often made based on Mahalanobis distances with respect to robustlyestimated variances. These function deliver the necessary distributions.
Usage
rEmpiricalMahalanobis(n,N,d,...,sorted=FALSE,pow=1,robust=TRUE)pEmpiricalMahalanobis(q,N,d,...,pow=1,replicates=100,resample=FALSE,
robust=TRUE)qEmpiricalMahalanobis(p,N,d,...,pow=1,replicates=100,resample=FALSE,
robust=TRUE)rMaxMahalanobis(n,N,d,...,pow=1,robust=TRUE)pMaxMahalanobis(q,N,d,...,pow=1,replicates=998,resample=FALSE,
robust=TRUE)qMaxMahalanobis(p,N,d,...,pow=1,replicates=998,resample=FALSE,
robust=TRUE)rPortionMahalanobis(n,N,d,cut,...,pow=1,robust=TRUE)pPortionMahalanobis(q,N,d,cut,...,replicates=1000,resample=FALSE,pow=1,
robust=TRUE)qPortionMahalanobis(p,N,d,cut,...,replicates=1000,resample=FALSE,pow=1,
robust=TRUE)pQuantileMahalanobis(q,N,d,p,...,replicates=1000,resample=FALSE,
ulimit=TRUE,pow=1,robust=TRUE)
Arguments
n Number of simulations to do.
q A vector giving quantiles of the distribution
rMahalanobis 201
p A vector giving probabilities. (only a single probility for pQuantileMahalanobis)
N Number of cases in the dataset.
d degrees of freedom (i.e. dimension) of the dataset.
cut A cutting limit. The random variable is the portion of Mahalanobis distanceslower equal to the cutting limit.
... further arguments passed to MahalanobisDist
pow the power of the Mahalanobis distance to be used. Higher powers can be usedto stretch the outlierregion visually.
robust logical or a robust method description (see robustnessInCompositions) speci-fiying how the center and covariance matrix are estimated,if not given.
sorted Specifies a transformation to be applied to the whole sequence of Mahalanobisdistances: FALSE is no transformation, TRUE sorts the entries in ascending or-der, a numeric vector picks the given entries from the entries sorted in ascendingorder; alternatively a function such as max can be given to directly transform thedata.
replicates the number of datasets in the Monte-Carlo-Computations used in these routines.
resample a logical forcing a resampling of the Monte-Carlo-Sampling. See details.
ulimit logical: is this an upper limit of a joint confidence bound or a lower limit.
Details
All the distribution correspond to the distribution under the Null-Hypothesis of multivariate jointGaussian distribution of the dataset.
The set of empirically estimated Mahalanobis distances of a dataset is in the first step a randomvector with exchangable but dependent entries. The distribution of this vector is given by therEmpiricalMahalanobis if no sorted argument is given. Please be advised that this is not a fixeddistribution in a mathematical sense, but an implementation dependent distribution incorporatingthe performance of underlying robust spread estimator. As long as no sorted argument is givenpEmpiricalMahalanobis and qEmpiricalMahalanobis represent the distribution function and thequantile function of a randomly picked element of this vector.If a sorted attribute is given, it specifies a transformation is applied to each of the vector prior to pro-cessing. Three important special cases are provided by seperate functions. The MaxMahalanobisfunctions correspond to picking only the larges value. The PortionMahalanobis functions corre-spond to reporting the portion of Mahalanobis distances over a cutoff. The QuantileMahalanobisdistribution correponds to the distribution of the p-quantile of the dataset.The Monte-Carlo-Simulations of these distributions are rather slow, since for each datum we needto simulate a whole dataset and to apply a robust covariance estimator to it, which typically itselfinvolves Monte-Carlo-Algorithms. Therefore each type of simulations is only done the first timeneeded and stored for later use in the environment gsi.pStore. With the resampling argument aresampling of the cashed dataset can be forced.
Value
The r* functions deliver a vector (or a matrix of row-vectors) of simulated value of the given distri-butions. A total of n values (or row vectors) is returned.The p* functions deliver a vector (of the same length as x) of probabilities for random variable of
202 rmult
the given distribution to be under the given quantil values q.The q* functions deliver a vector of quantiles corresponding to the length of the vector p providingthe probabilities.
Note
Unlike the mahalanobis function this function does not be default compute the square of the ma-halanobis distance. The pow option is provided if the square is needed.The package robustbase is required for using the robust estimations.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
dist, OutlierClassifier1
Examples
rEmpiricalMahalanobis(10,25,2,sorted=TRUE,pow=1,robust=TRUE)pEmpiricalMahalanobis(qchisq(0.95,df=10),11,1,pow=2,replicates=1000)(xx<-pMaxMahalanobis(qchisq(0.95,df=10),11,1,pow=2))qEmpiricalMahalanobis(0.95,11,2)rMaxMahalanobis(10,25,4)qMaxMahalanobis(xx,11,1)
rmult Simple treatment of real vectors
Description
A class to analyse real multivariate vectors.
Usage
rmult(X,parts=1:NCOL(oneOrDataset(X)),orig=attr(X,"orig"),missingProjector=attr(X,"missingProjector"))
## S3 method for class 'rmult'print(x,...)
Arguments
X vector or dataset of numbers considered as elements of a R-vector
parts vector containing the indices xor names of the columns to be used
x an rmult object
orig the original untransformed dataset
rmultarithm 203
missingProjector
the Projector on the observed subspace
... further generic arguments passed to print.default
Details
The rmult class is a simple convenience class to treat data in the scale of real vectors just like datain the scale of real numbers. A major aspect to take into account is that the internal arithmetic of Ris different for these vectors.
Value
a vector of class "rmult" representing one vector or a matrix of class "rmult", representing multi-ple vectors by rows.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
+.rmult, scalar, norm.rmult, %*%.rmult, rplus, acomp,
Examples
plot(rnorm.rmult(30,mean=0:4,var=diag(1:5)+10))
rmultarithm vectorial arithmetic for datasets in a classical vector scale
Description
vector space operations computed for multiple vectors in parallel
Usage
## Methods for class "rmult"## x+y## x-y## -x## x*r## r*x## x/r
204 rmultmatmult
Arguments
x an rmult vector or dataset of vectors
y an rmult vector or dataset of vectors
r a numeric vector of size 1 or nrow(x)
Details
The operators try to mimic the parallel operation of R on vectors of real numbers on vectors ofvectors represented as matrices containing the vectors as rows.
Value
an object of class "rmult" containing the result of the corresponding operation on the vectors.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
rmult, %*%.rmult
Examples
x <- rmult(matrix( sqrt(1:12), ncol= 3 ))xx+xx + rmult(1:3)x * 1:41:4 * xx / 1:4x / 10
rmultmatmult inner product for datasets with vector scale
Description
An rmult object is considered as a sequence of vectors. The %*% is considered as the inner multi-plication. An inner multiplication with another vector is the scalar product. an inner multiplicationwith a matrix is a matrix multiplication, where the rmult-vectors are either considered as row or ascolumn vector.
Usage
## S3 method for class 'rmult'x %*% y
rmultmatmult 205
Arguments
x an rmult vector or dataset of vectors, a numeric vector of length (gsi.getD(y)),or a matrix
y an rmult vector or dataset of vectors , a numeric vector of length (gsi.getD(x)),or a matrix
Details
The operators try to minic the behavior of %*% on c()-vectors as inner product applied in parallelto all vectors of the dataset. Thus the product of a vector with another rmult object or unclassedvector v results in the scalar product. For the multiplication with a matrix each vector is consideredas a row or column, whatever is more appropriate.
Value
an object of class "rmult" or a numeric vector containing the result of the corresponding innerproducts.
Note
The product x %*% A %*% y is associative.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
rmult, %*%.rmult
Examples
x <- rmult(matrix( sqrt(1:12), ncol= 3 ))x%*%xA <- matrix( 1:9,nrow=3)x %*% A %*% xx %*% AA %*% xx %*% 1:3x %*% 1:31:3 %*% x
206 rnorm
rnorm Normal distributions on special spaces
Description
rnorm.X generates multivariate normal random variates in the space X .
Usage
rnorm.acomp(n,mean,var)rnorm.rcomp(n,mean,var)rnorm.aplus(n,mean,var)rnorm.rplus(n,mean,var)rnorm.rmult(n,mean,var)rnorm.ccomp(n,mean,var,lambda)dnorm.acomp(x,mean,var)dnorm.aplus(x,mean,var)dnorm.rmult(x,mean,var)
Arguments
n number of datasets to be simulated
mean The mean of the dataset to be simulated
var The variance covariance matrix
lambda The expected total count
x vectors in the sampling space
Details
The normal distributions in the variouse spaces dramatically differ. The normal distribution inthe rmult space is the commonly known multivariate joint normal distribution. For rplus thisdistribution has to be somehow truncated at 0. This is here done by setting negative values to 0.The normal distribution of rcomp is seen as a normal distribution within the simplex as a geometricalportion of the real vector space. The variance is thus forced to be singular and restricted to the affinesubspace generated by the simplex. The necessary truncation of negative values is currently doneby setting them explicitly to zero and reclosing afterwards.The "acomp" and "aplus" are itself metric vector spaces and thus a normal distribution is definedin them just as in the real space. The resulting distribution corresponds to a multivariate lognormalin the case of "aplus" and in Aitchisons normal distribution in the simplex in the case of "acomp"(TO DO: Is that right??).For the vector spaces rmult, aplus, acomp it is further possible to provide densities wiht repect totheir Lebesgue measure. In the other cases this is not possible since the resulting distributions arenot absolutly continues with respect to such a measure due to the truncation.For count compositions ccomp a rnorm.acomp is realized and used as a parameter to a Poissondistribution (see rpois.ccomp).
rnorm 207
Value
a random dataset of the given class generated by a normal distribution with the given mean andvariance in the given space.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Pawlowsky-Glahn, V. and J.J. Egozcue (2001) Geometric approach to statistical analysis on thesimplex. SERRA 15(5), 384-398
Aitchison, J, C. Barcel’o-Vidal, J.J. Egozcue, V. Pawlowsky-Glahn (2002) A consise guide to the al-gebraic geometric structure of the simplex, the sample space for compositional data analysis, TerraNostra, Schriften der Alfred Wegener-Stiftung, 03/2003
See Also
runif.acomp, rlnorm.rplus, rDirichlet.acomp
Examples
MyVar <- matrix(c(0.2,0.1,0.0,0.1,0.2,0.0,0.0,0.0,0.2),byrow=TRUE,nrow=3)MyMean <- c(1,1,2)
plot(rnorm.acomp(100,MyMean,MyVar))plot(rnorm.rcomp(100,MyMean,MyVar))plot(rnorm.aplus(100,MyMean,MyVar))plot(rnorm.rplus(100,MyMean,MyVar))plot(rnorm.rmult(100,MyMean,MyVar))x <- rnorm.aplus(5,MyMean,MyVar)dnorm.acomp(x,MyMean,MyVar)dnorm.aplus(x,MyMean,MyVar)dnorm.rmult(x,MyMean,MyVar)
208 robustnessInCompositions
robustnessInCompositions
Handling robustness issues and outliers in compositions.
Description
The seamless transition to robust estimations in library(compositions).
Details
A statistical method is called nonrobust if an arbitrary contamination of a small portion of thedataset can produce results radically different from the results without the contamination. In thissense many classical procedures relying on distributional models or on moments like mean andvariance are highly nonrobust.We consider robustness as an essential prerequierement of all statistical analysis. However in thecontext of compositional data analysis robustness is still in its first years.As of Mai 2008 we provide a new approach to robustness in the package. The central idea is thatrobustness should be more or less automatic and that there should be no necessity to change the codeto compare results obtained from robust procedures and results from there more efficient nonrobustcounterparts.To achieve this all routines that rely on distributional models (such as e.g. mean, variance, principlecomponent analysis, scaling) and routines relying on those routines get a new standard argument ofthe form:fkt(...,robust=getOption("robust"))which defaults to a new option "robust". This option can take several values:
• FALSEThe classical estimators such as arithmetic mean and persons product moment varianceare used and the results are to be considered nonrobust.
• TRUEThe default for robust estimation in the package is used. At this time this is covMcd inthe robustbase-package. This default might change in future.
• "pearson"This is a synonym for FALSE and explicitly states that no robustness should be used.
• "mcd"Minimum Covariance Determinant. This option explicitly selects the use of covMcd inthe robustbase-package as the main robustness engine.
More options might follow later. To control specific parameters of the model the string can getan attribute named "control" which contains additional options for the robustness engine used. Inthis moment the control attribute of mcd is a control object of covMcd. The control argument of"pearson" is a list containing addition options to the mean, like trim.The standard value for getOption("robust") is FALSE to avoid situation in which the user thinks heuses a classical technique. Robustness must be switched on explicitly. Either by setting the optionwith options(robust=TRUE) or by giving the argument. This default might change later if theauthors come to the impression that robust estimation is now considered to be the default.
For those not only interested in avoiding the influence of the outliers, but in an analysis of theoutliers we added a subsystem for outlier classification. This subsystem is described in outliersIn-Compositions and also relies on the robust option. However evidently for these routines the factory
rplus 209
default for the robust option is always TRUE, because it is only applicable in an outlieraware con-text.
We hope that in this way we can provide a seamless transition from nonrobust analysis to a robustanalysis.
Note
IMPORTANT: The robust argument only works with the classes of the package. Only your compo-sitional analysis is suddenly robust.The package robustbase is required for using the robust estimations and the outlier subsystem ofcompositions. To simplify installation it is not listed as required, but it will be loaded, wheneverany sort of outlierdetection or robust estimation is used.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
var.acomp, mean.acomp, robustbase, compositions-package, missings, outlierplot, OutlierClassifier1,ClusterFinder1
Examples
A <- matrix(c(0.1,0.2,0.3,0.1),nrow=2)Mvar <- 0.1*ilrvar2clr(A%*%t(A))Mcenter <- acomp(c(1,2,1))typicalData <- rnorm.acomp(100,Mcenter,Mvar) # main populationcolnames(typicalData)<-c("A","B","C")data5 <- acomp(rbind(unclass(typicalData)+outer(rbinom(100,1,p=0.1)*runif(100),c(0.1,1,2))))
mean(data5)mean(data5,robust=TRUE)var(data5)var(data5,robust=TRUE)Mvarbiplot(princomp(data5))biplot(princomp(data5,robust=TRUE))
rplus Amounts i.e. positive numbers analysed as objects of the real vectorspace
Description
A class to analyse positive amounts in a classical (non-logarithmic) framework.
210 rplus
Usage
rplus(X, parts=1:NCOL(oneOrDataset(X)), total=NA, warn.na=FALSE,detectionlimit=NULL, BDL=NULL, MAR=NULL, MNAR=NULL, SZ=NULL)
Arguments
X vector or dataset of positive numbers considered as amounts
parts vector containing the indices xor names of the columns to be used
total a numeric vectors giving the total amount of each dataset
warn.na should the user be warned in case of NA,NaN or 0 coding different types ofmissing values?
detectionlimit a number, vector or matrix of positive numbers giving the detection limit of allvalues, all columns or each value, respectively
BDL the code for ’Below Detection Limit’ in X
SZ the code for ’Structural Zero’ in X
MAR the code for ’Missing At Random’ in X
MNAR the code for ’Missing Not At Random’ in X
Details
Many multivariate datasets essentially describe amounts of D different parts in a whole. When thewhole is large in relation to the considered parts, such that they do not exclude each other, and whenthe total amount of each componenten is actually determined by the phenomenon under investiga-tion and not by sampling artifacts (such as dilution or sample preparation) then the parts can betreated as amounts rather than as a composition (cf. rcomp, aplus).In principle, amounts are just real-scaled numbers with the single restriction that they are nonnega-tive. Thus they can be analysed by any multivariate analysis method. This class provides a simpleaccess interface to do so. It tries to keep in mind the positivity property of amounts and the specialpoint zero. However there are strong arguments why an analyis based on log-scale might be muchmore adapted to the problem. This log-approach is provided by the class aplus.
The classes rcomp, acomp, aplus, and rplus are designed in a fashion as similar as possible in orderto allow direct comparison between results obtained by the different approaches. In particular,the aplus logistic transform ilt is mirrored by the simple identity transform iit. In terms ofcomputer science, this identity mapping is actually mapping an object of type "rplus" to a class-lessdatamatrix.
Value
a vector of class "rplus" representing a vector of amounts or a matrix of class "rplus" representingmultiple vectors of amounts, by rows.
Missing Policy
Missing and Below Detecion Limit Policy is in mored detailed explained in compositions.missing.
rplusarithm 211
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
van den Boogaart, K.G. and R. Tolosana-Delgado (2008) "compositions": a unified R package to an-alyze Compositional Data, Computers & Geosciences, 34 (4), pages 320-338, doi:10.1016/j.cageo.2006.11.017.
See Also
iit,rcomp, aplus, princomp.rplus, plot.rplus, boxplot.rplus, barplot.rplus, mean.rplus,var.rplus, variation.rplus, cov.rplus, msd
Examples
data(SimulatedAmounts)plot(rplus(sa.lognormals))
rplusarithm vectorial arithmetic for data sets with rplus class
Description
The positive quadrant forms a manifold of the real vector space. The induced operations +,-,*,/ giveresults valued in this real vector space (not necessarily inside the manifold).
Usage
mul.rplus(x,r)## Methods for class rplus## x+y## x-y## -x## x*r## r*x## x/r
Arguments
x an rplus composition or dataset of compositions
y an rplus composition or dataset of compositions
r a numeric vector of size 1 or nrow(x)
212 rpois
Details
The functions behave quite like +.rmult.
Value
rmult-objects containing the given operations on the rcomp manifold as subset of theRD. Only theaddition and multiplication with positive numbers are internal operation and results in an rplus-object again.
Note
For * the arguments x and y can be exchanged.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
+.rmult, +.acomp,cpt, rcomp, rmult
Examples
rplus(1:5)* -1 + rplus(1:5)data(SimulatedAmounts)cdata <- rplus(sa.lognormals)plot( tmp <- (cdata-mean(cdata))/msd(cdata) )class(tmp)mean(tmp)msd(tmp)var(tmp)
rpois Simulate count compositions without overdispersion
Description
Generates multinomial or multi-Poission random variates based on an Aitchison composition.
Usage
rpois.ccomp(n,p,lambda)rmultinom.ccomp(n,p,N)
runif 213
Arguments
n number of datasets to be simulated
p The composition representing the probabilites/portions of the individual parts
lambda scalar or vector giving the expected total count
N scalar or vector giving the total count
Details
A count composition is a realisation of a multinomial or multivariate Poisson distribution.
Value
a random dataset ccount dataset
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
rnorm.ccomp
Examples
p <- acomp(c(3,3,3))rpois.ccomp(10,p,40)rmultinom.ccomp(10,p,40)
runif The uniform distribution on the simplex
Description
Generates random compositions with a uniform distribution on the (rcomp) simplex.
Usage
runif.acomp(n,D)runif.rcomp(n,D)
Arguments
n number of datasets to be simulated
D number of parts
214 scalar
Value
a generated random dataset of class "acomp" or "rcomp" drawn from a uniform distribution on thesimplex of D parts.
Note
The only difference between both routines is the class of the dataset returned.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
rDirichlet.acomp
Examples
plot(runif.acomp(10,3))plot(runif.rcomp(10,3))
scalar Parallel scalar products
Description
Computes scalar products of datasets of vectors or vectorial quantities.
Usage
scalar(x,y)## Default S3 method:scalar(x,y)
Arguments
x a vector or a matrix with rows considered as vectors
y a vector or a matrix with rows considered as vectors
scale 215
Details
The scalar product of two vectors is defined as:
scalar(x, y) :=∑
(xiyi)
Value
a numerical vector containing the scalar products of the vectors given by x and y. If both x andy contain more than one vector the function uses parallel operation like it would happen with anordinary product of vectors.
Note
The computation of the scalar product implicitly applies the cdt transform, which implies thatthe scalar products corresponding to the given geometries are returned for acomp, rcomp, aplus,rplus-objects. Even a useful scalar product for factors is induced in this way.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
Examples
scalar(acomp(c(1,2,3)),acomp(c(1,2,3)))scalar(rmult(c(1,2,3)),rmult(c(1,2,3)))
scale Normalizing datasets by centering and scaling
Description
The dataset is standardized by optional scaling and centering.
Usage
scale(x, center = TRUE, scale = TRUE,...)## Default S3 method:scale(x,center=TRUE, scale=TRUE,...)## S3 method for class 'acomp'scale(x,center=TRUE, scale=TRUE,...,robust=getOption("robust"))## S3 method for class 'rcomp'scale(x,center=TRUE, scale=TRUE,...,robust=getOption("robust"))## S3 method for class 'aplus'scale(x,center=TRUE, scale=TRUE,...,robust=getOption("robust"))## S3 method for class 'rplus'scale(x,center=TRUE, scale=TRUE,...,robust=getOption("robust"))## S3 method for class 'rmult'scale(x,center=TRUE, scale=TRUE,...,robust=getOption("robust"))
216 scale
Arguments
x a dataset or a single vector of some type
center logical value or the center to be substracted.
scale logical value or a scaling factor to for multiplication.
robust A robustness description. See robustnessInCompositions for details.
... added for generic generality
Details
scaling is defined in different ways for the different data types. It is always performed as an opera-tion in the enclosing vector space. In all cases an independent scaling of the different coordinates isnot always appropriate. This is only done for rplus and rmult geometries. The other three geome-tries are treated with a global scaling, keeping the relative variations of every part/amount.
The scaling factors can be a matrix (for cdt or idt space), a scalar, or for the r* geometries vectorfor scaling the entries individually. However scaling the entries individually does not make sensein the a* geometries. The operation achieve in the r*-geometries is indeed the centering of thea*-geometries.
Value
a vector or data matrix, as x and with the same class, but acordingly transformed.
Note
Note that the "rcomp" and "rplus" objects does not preserve their geometry during scaling and aretherefore reported as "rmult" objects.
See the documentation in package base for details on scale and scale.default. These functionsare only modified to allow the additional robustness parameter.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
split{base}
Examples
data(SimulatedAmounts)plot(scale(acomp(sa.groups)))## Not run:
plot(scale(rcomp(sa.groups)))
## End(Not run)plot(scale(aplus(sa.groups)))## Not run:
plot(scale(rplus(sa.groups)))
Sediments 217
## End(Not run)plot(scale(rmult(sa.groups)))
Sediments Proportions of sand, silt and clay in sediments specimens
Description
Data provide sand, silt and clay compositions of 21 sediments specimens, 10 of which are identifiedas offshore, 7 as near shore and 4 new samples.
Usage
data(Sediments)
Format
• sandnumeric: the portion of sand
• siltnumeric: the protion of silt
• claynumeric: the portion of clay
• typenumeric: 1 for offshore, 2 for near shore and 3 for new samples
Details
The data comprise 21 cases: 10 offshore, 7 near shore and 4 new samples, and 4 variables: sand,silt and clay proportions and in addition the type of sediments specimens – 1 for offshore, 2 for nearshore and 3 for new samples.
All 3-part compositions sum to one.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name YATQUAD.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, (Data 20) pp17.
218 segments
segments Draws straight lines from point to point.
Description
The function draws lines from a points x to a point y in the given geometry.
Usage
## S3 method for class 'acomp'segments(x0,y,...,steps=30,aspanel=FALSE)
## S3 method for class 'rcomp'segments(x0,y,...,steps=30,aspanel=FALSE)
## S3 method for class 'aplus'segments(x0,y,...,steps=30,aspanel=FALSE)
## S3 method for class 'rplus'segments(x0,y,...,steps=30,aspanel=FALSE)
## S3 method for class 'rmult'segments(x0,y,...,steps=30,aspanel=FALSE)
Arguments
x0 dataset of points (of the given type) to draw the line from
y dataset of points (of the given type) to draw the line to
... further graphical parameters
steps the number of discretisation points to draw the segments, since the representa-tion might not visually be a straight line
aspanel Logical, indicates use as slave to do acutal drawing only.
Details
The functions add lines to the graphics generated with the corresponding plot functions.
Adding to multipaneled plots redraws the plot completely, and is only possible when the plot hasbeen created with the plotting routines from this library.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
plot.acomp,lines.acomp
SerumProtein 219
Examples
data(SimulatedAmounts)
plot(acomp(sa.lognormals))segments.acomp(acomp(c(1,2,3)),acomp(c(2,3,1)),col="red")segments.rcomp(acomp(c(1,2,3)),acomp(c(2,3,1)),col="blue")
plot(aplus(sa.lognormals[,1:2]))segments.aplus(aplus(c(10,20)),aplus(c(20,10)),col="red")segments.rplus(rplus(c(10,20)),rplus(c(20,10)),col="blue")
plot(rplus(sa.lognormals[,1:2]))segments.aplus(aplus(c(10,20)),aplus(c(20,10)),col="red")segments.rplus(rplus(c(10,20)),rplus(c(20,10)),col="blue")
SerumProtein Serum Protein compositions of blood samples
Description
Data recording the proportions of the 4 serum proteins from blood samples of 30 patients, 14 withknown disease A, 16 with known disease B, and 6 new cases.
Usage
data(SerumProtein)
Format
• anumeric a protein type
• bnumeric a protein type
• cnumeric a protein type
• dnumeric a protein type
• Type1 deasease A, 2 disease B, 3 new cases
Details
The data consist of 36 cases: 14 with known disease A, 16 with known disease B, and 6 new casesand 5 v variables: a, b, c, and d for 4 serum proteins and Type for the diseases: 1 for disease A,2 for disease B, and 3 for new cases. All row serum proteins proportions sums to 1 except somerounding errors.
220 ShiftOperators
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name SERPROT.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data (Data 16) pp20.
ShiftOperators Shifts of machine operators
Description
Compsitions of eight-hours shifts of 27 machine operators.
Usage
data(ShiftOperators)
Details
A study of the activities of 27 machine operators during their eight-hours shifts has been conducted,and proportions of time spend in the following categories:
A: high-quality production,B: low-quality production,C: machine setting,D: machine repair,
are recorded. Of particular interest are any insights which such data might give of relationshipsbetween productive and nonproductive parts of such shifts. All compositions sum up one except forrounding error.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name SHIFT.DAT, here in-cluded under the GNU Public Library Licence Version 2 or newer.
simplemissingplot 221
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, (Data 22), pp22.
simplemissingplot Ternary diagrams
Description
Displaying compositions in ternary diagrams
Usage
simpleMissingSubplot(loc, portions, labels=NULL,col=c("white","yellow","red","green","blue"),..., border="gray60", vertical=NULL, xpd=NA)
Arguments
loc a vector of the form c(x1,x2,y1,y2) giving the drawing rectangle for the subplotin coordinates as in par("usr"). I.e. if the plot is logrithmic the base 10 logarithmis to be used:
• x1left boundary of drawing rectangle• y1lower boundary of drawing rectangle• x2right boundary of drawing rectangle• y2upper boundary of drawing rectangle
portions The portions of different missing categories
labels The labels for the categories.
col The colors to plot the different categories.
... further graphical parameters passed to text
border The color to draw the borders of the rectangles.
vertical Should a horizontal or a vertical plot be produced. If NULL the choice is doneautomatically according to the size of the recangle provided.
xpd extended plot region. See par("xpd").
Details
This function is typically not called directly,however it could in principle be used to add to plots.The user will modify the function call only to modify the appearance of the missing plot.
The labels are only plotted for nonzero portions. In this way it is always possible to realize thepresence of a given missing type, even if it is a too small portion to be actually displayed. In caseof overplotting of different labels a further investigation using missingSummary should be used.
222 SimulatedAmounts
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
plot.aplus
Examples
data(SimulatedAmounts)plot(acomp(sa.missings))plot(acomp(sa.missings),mp=~simpleMissingSubplot(c(0,0.1,0.2,1),
missingInfo[c(1,3:5,2)],c("Not Missing",paste("Missing Only:",cn),"Totally Missing"),col=c("gray","red","green","blue","darkgray"))
)ms <- missingSummary(sa.missings)for( i in 1:3 )
simpleMissingSubplot(c(0.9+0.03*(i-1),0.9+0.03*i,0.2,1), ms[i,])
SimulatedAmounts Simulated amount datasets
Description
Several simulated datasets intended as reference examples for various conceptual and statisticalmodels of compositions and amounts.
Usage
data(SimulatedAmounts)
Format
Data matrices with 60 cases and 3 or 5 variables.
Details
The statistical analysis of amounts and compositions is set to discussion. Four essentially differ-ent approaches are provided in this package around the classes "rplus", "aplus", "rcomp", "acomp".There is no absolutely "right" approach, since there is a conection between these approaches and theprocesses originating the data. We provide here simulated standard datasets and the correspondingsimulation procedures following these several models to provide "good" analysis examples and toshow how these models actually look like in data.
The data sets are simulated according to correlated lognormal distributions (sa.lognormals, sa.lognormal5),winsorised correlated normal distributions (sa.tnormals, sa.tnormal5), Dirichlet distribution on thesimplex (sa.dirichlet, sa.dirichlet5), uniform distribution on the simplex (sa.uniform, sa.uniform5),
SimulatedAmounts 223
and a grouped dataset (sa.groups, sa.groups5) with three groups (given in sa.groups.area and sa.groups5.area)all distributed accordingly with a lognormal distribution with group-dependent means.
We can imagine that amounts evolve in nature e.g. in part of the soil they are diluted and transportedin a transport medium, usually water, which comes from independent source (the rain, for instance)and this new composition is normalized by taking a sample of standard size. For each of the datasetssa.X there is a corresponding sa.X.dil dataset which is build by simulating exactly that processon the corresponding sa.X dataset . The amounts in the sa.X.dil are given in ppm. This idea of atransport medium is a major argument for a compositional approach, because the total amount givenby the sum of the parts is induced by the dilution given by the medium and thus non-informativefor the original process investigated.
If we imagine now these amounts flowing into a river and sedimenting, the different contributionsare accumulated along the river and renormalized to a unit portion on taking samples again. Foreach of the dataset sa.X.dil there is a corresponding sa.X.mix dataset which is built from thecorresponding sa.X dataset by simulating exactly that accumulation process. Mixing of differentcompositions is a major argument against the log based approaches (aplus, acomp) since mixing isa highly nonlinear operation in terms of log-ratios.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
Source
The datasets are simulated for this package and are under the GNU Public Library Licence Version2 or newer.
References
http://www.stat.boogaart.de/compositions/data
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Rehder, S. and U. Zier (2001) Letter to the Editor: Comment on ”Logratio Analysis and Composi-tional Distance” by J. Aitchison, C. Barcel\’o -Vidal, J.A. Mart\’in-Fern\’andez and V. Pawlowsky-Glahn, Mathematical Geology, 33 (7), 845-848.
Zier, U. and S. Rehder (2002) Some comments on log-ratio transformation and compositional dis-tance, Terra Nostra, Schriften der Alfred Wegener-Stiftung, 03/2003
Examples
data(SimulatedAmounts)plot.acomp(sa.lognormals)plot.acomp(sa.lognormals.dil)
224 SimulatedAmounts
plot.acomp(sa.lognormals.mix)plot.acomp(sa.lognormals5)plot.acomp(sa.lognormals5.dil)plot.acomp(sa.lognormals5.mix)
plot(acomp(sa.missings))plot(acomp(sa.missings5))
#library(MASS)plot.rcomp(sa.tnormals)plot.rcomp(sa.tnormals.dil)plot.rcomp(sa.tnormals.mix)plot.rcomp(sa.tnormals5)plot.rcomp(sa.tnormals5.dil)plot.rcomp(sa.tnormals5.mix)
plot.acomp(sa.groups,col=as.numeric(sa.groups.area),pch=20)plot.acomp(sa.groups.dil,col=as.numeric(sa.groups.area),pch=20)plot.acomp(sa.groups.mix,col=as.numeric(sa.groups.area),pch=20)plot.acomp(sa.groups5,col=as.numeric(sa.groups.area),pch=20)plot.acomp(sa.groups5.dil,col=as.numeric(sa.groups.area),pch=20)plot.acomp(sa.groups5.mix,col=as.numeric(sa.groups.area),pch=20)
plot.acomp(sa.uniform)plot.acomp(sa.uniform.dil)plot.acomp(sa.uniform.mix)plot.acomp(sa.uniform5)plot.acomp(sa.uniform5.dil)plot.acomp(sa.uniform5.mix)
plot.acomp(sa.dirichlet)plot.acomp(sa.dirichlet.dil)plot.acomp(sa.dirichlet.mix)plot.acomp(sa.dirichlet5)plot.acomp(sa.dirichlet5.dil)plot.acomp(sa.dirichlet5.mix)
# The data was simulated with the following commands:
#library(MASS)dilution <- function(x) {clo(cbind(x,exp(rnorm(nrow(x),5,1))))[,1:ncol(x)]*1E6}seqmix <- function(x) {clo(apply(x,2,cumsum))*1E6}
vars <- c("Cu","Zn","Pb")vars5 <- c("Cu","Zn","Pb","Cd","Co")
sa.lognormals <- structure(exp(matrix(rnorm(3*60),ncol=3) %*%chol(matrix(c(1,0.8,-0.2,0.8,1,
-0.2,-0.2,-0.2,1),ncol=3))+matrix(rep(c(1:3),each=60),ncol=3)),
dimnames=list(NULL,vars))
SimulatedAmounts 225
plot.acomp(sa.lognormals)pairs(sa.lognormals)
sa.lognormals.dil <- dilution(sa.lognormals)plot.acomp(sa.lognormals.dil)pairs(sa.lognormals.dil)
sa.lognormals.mix <- seqmix(sa.lognormals.dil)plot.acomp(sa.lognormals.mix)pairs(sa.lognormals.mix)
sa.lognormals5 <- structure(exp(matrix(rnorm(5*60),ncol=5) %*%chol(matrix(c(1,0.8,-0.2,0,0,
0.8,1,-0.2,0,0,-0.2,-0.2,1,0,0,0,0,0,5,4.9,0,0,0,4.9,5),ncol=5))+
matrix(rep(c(1:3,-2,-2),each=60),ncol=5)),dimnames=list(NULL,vars5))
plot.acomp(sa.lognormals5)pairs(sa.lognormals5)
sa.lognormals5.dil <- dilution(sa.lognormals5)plot.acomp(sa.lognormals5.dil)pairs(sa.lognormals5.dil)
sa.lognormals5.mix <- seqmix(sa.lognormals5.dil)plot.acomp(sa.lognormals5.mix)pairs(sa.lognormals5.mix)
sa.groups.area <- factor(rep(c("Upper","Middle","Lower"),each=20))sa.groups <- structure(exp(matrix(rnorm(3*20*3),ncol=3) %*%
chol(0.5*matrix(c(1,0.8,-0.2,0.8,1,-0.2,-0.2,-0.2,1),ncol=3))+
matrix(rep(c(1,2,2.5,2,2.9,5,4,2,5),each=20),ncol=3)),
dimnames=list(NULL,c("clay","sand","gravel")))
plot.acomp(sa.groups,col=as.numeric(sa.groups.area),pch=20)pairs(sa.lognormals,col=as.numeric(sa.groups.area),pch=20)
sa.groups.dil <- dilution(sa.groups)plot.acomp(sa.groups.dil,col=as.numeric(sa.groups.area),pch=20)pairs(sa.groups.dil,col=as.numeric(sa.groups.area),pch=20)
sa.groups.mix <- seqmix(sa.groups.dil)plot.acomp(sa.groups.mix,col=as.numeric(sa.groups.area),pch=20)pairs(sa.groups.mix,col=as.numeric(sa.groups.area),pch=20)
226 SimulatedAmounts
sa.groups5.area <- factor(rep(c("Upper","Middle","Lower"),each=20))sa.groups5 <- structure(exp(matrix(rnorm(5*20*3),ncol=5) %*%
chol(matrix(c(1,0.8,-0.2,0,0,0.8,1,-0.2,0,0,-0.2,-0.2,1,0,0,0,0,0,5,4.9,0,0,0,4.9,5),ncol=5))+
matrix(rep(c(1,2,2.5,2,2.9,5,4,2.5,0,-2,-1,-1,-1,-2,-3),
each=20),ncol=5)),dimnames=list(NULL,
vars5))
plot.acomp(sa.groups5,col=as.numeric(sa.groups5.area),pch=20)pairs(sa.groups5,col=as.numeric(sa.groups5.area),pch=20)
sa.groups5.dil <- dilution(sa.groups5)plot.acomp(sa.groups5.dil,col=as.numeric(sa.groups5.area),pch=20)pairs(sa.groups5.dil,col=as.numeric(sa.groups5.area),pch=20)
sa.groups5.mix <- seqmix(sa.groups5.dil)plot.acomp(sa.groups5.mix,col=as.numeric(sa.groups5.area),pch=20)pairs(sa.groups5.mix,col=as.numeric(sa.groups5.area),pch=20)
sa.tnormals <- structure(pmax(matrix(rnorm(3*60),ncol=3) %*%chol(matrix(c(1,0.8,-0.2,0.8,1,
-0.2,-0.2,-0.2,1),ncol=3))+matrix(rep(c(0:2),each=60),ncol=3),0),
dimnames=list(NULL,c("clay","sand","gravel")))
plot.rcomp(sa.tnormals)pairs(sa.tnormals)
sa.tnormals.dil <- dilution(sa.tnormals)plot.acomp(sa.tnormals.dil)pairs(sa.tnormals.dil)
sa.tnormals.mix <- seqmix(sa.tnormals.dil)plot.acomp(sa.tnormals.mix)pairs(sa.tnormals.mix)
sa.tnormals5 <- structure(pmax(matrix(rnorm(5*60),ncol=5) %*%chol(matrix(c(1,0.8,-0.2,0,0,
0.8,1,-0.2,0,0,
SimulatedAmounts 227
-0.2,-0.2,1,0,0,0,0,0,0.05,0.049,0,0,0,0.049,0.05),ncol=5))+
matrix(rep(c(0:2,0.1,0.1),each=60),ncol=5),0),dimnames=list(NULL,
vars5))
plot.rcomp(sa.tnormals5)pairs(sa.tnormals5)
sa.tnormals5.dil <- dilution(sa.tnormals5)plot.acomp(sa.tnormals5.dil)pairs(sa.tnormals5.dil)
sa.tnormals5.mix <- seqmix(sa.tnormals5.dil)plot.acomp(sa.tnormals5.mix)pairs(sa.tnormals5.mix)
sa.dirichlet <- sapply(c(clay=0.2,sand=2,gravel=3),rgamma,n=60)colnames(sa.dirichlet) <- vars
plot.acomp(sa.dirichlet)pairs(sa.dirichlet)
sa.dirichlet.dil <- dilution(sa.dirichlet)plot.acomp(sa.dirichlet.dil)pairs(sa.dirichlet.dil)
sa.dirichlet.mix <- seqmix(sa.dirichlet.dil)plot.acomp(sa.dirichlet.mix)pairs(sa.dirichlet.mix)
sa.dirichlet5 <- sapply(c(clay=0.2,sand=2,gravel=3,humus=0.1,plant=0.1),rgamma,n=60)colnames(sa.dirichlet5) <- vars5
plot.acomp(sa.dirichlet5)pairs(sa.dirichlet5)
sa.dirichlet5.dil <- dilution(sa.dirichlet5)plot.acomp(sa.dirichlet5.dil)pairs(sa.dirichlet5.dil)
sa.dirichlet5.mix <- seqmix(sa.dirichlet5.dil)plot.acomp(sa.dirichlet5.mix)pairs(sa.dirichlet5.mix)
sa.uniform <- sapply(c(clay=1,sand=1,gravel=1),rgamma,n=60)colnames(sa.uniform) <- vars
228 SimulatedAmounts
plot.acomp(sa.uniform)pairs(sa.uniform)
sa.uniform.dil <- dilution(sa.uniform)plot.acomp(sa.uniform.dil)pairs(sa.uniform.dil)
sa.uniform.mix <- seqmix(sa.uniform.dil)plot.acomp(sa.uniform.mix)pairs(sa.uniform.mix)
sa.uniform5 <- sapply(c(clay=1,sand=1,gravel=1,humus=1,plant=1),rgamma,n=60)colnames(sa.uniform5) <- vars5
plot.acomp(sa.uniform5)pairs(sa.uniform5)
sa.uniform5.dil <- dilution(sa.uniform5)plot.acomp(sa.uniform5.dil)pairs(sa.uniform5.dil)
sa.uniform5.mix <- seqmix(sa.uniform5.dil)plot.acomp(sa.uniform5.mix)pairs(sa.uniform5.mix)
tmp<-set.seed(1400)A <- matrix(c(0.1,0.2,0.3,0.1),nrow=2)Mvar <- 0.1*ilrvar2clr(A %*% t(A))Mcenter <- acomp(c(1,2,1))typicalData <- rnorm.acomp(100,Mcenter,Mvar) # main populationcolnames(typicalData)<-c("A","B","C")# A dataset without outlierssa.outliers1 <- acomp(rnorm.acomp(100,Mcenter,Mvar))# A dataset with 10% data with a large error in the first componentsa.outliers2 <- acomp(rbind(typicalData+rbinom(100,1,p=0.1)*rnorm(100)*acomp(c(4,1,1))))# A dataset with a single outliersa.outliers3 <- acomp(rbind(typicalData,acomp(c(0.5,1.5,2))))colnames(sa.outliers3)<-colnames(typicalData)tmp<-set.seed(30)rcauchy.acomp <- function (n, mean, var){
D <- gsi.getD(mean)-1perturbe(ilrInv(matrix(rnorm(n*D)/rep(rnorm(n),D), ncol = D) %*% chol(clrvar2ilr(var))), mean)
}# A dataset with a Cauchy type distributionsa.outliers4 <- acomp(rcauchy.acomp(100,acomp(c(1,2,1)),Mvar/4))colnames(sa.outliers4)<-colnames(typicalData)# A dataset with like sa.outlier2 but a differently strong distortionssa.outliers5 <- acomp(rbind(unclass(typicalData)+outer(rbinom(100,1,p=0.1)*runif(100),c(0.1,1,2))))# A dataset with a second populationsa.outliers6 <- acomp(rbind(typicalData,rnorm.acomp(20,acomp(c(4,4,1)),Mvar)))
simulatemissings 229
# Missingssa.missings <- simulateMissings(sa.lognormals,dl=0.05,MAR=0.05,MNAR=0.05,SZ=0.05)sa.missings[5,2]<-BDLvalue
sa.missings5 <- simulateMissings(sa.lognormals5,dl=0.05,MAR=0.05,MNAR=0.05,SZ=0.05)sa.missings5[5,2]<-BDLvalue
objects(pattern="sa.*")
simulatemissings Artifical simulation of various kinds of missings/polluted data
Description
These are simulation mechanisms to check that missing techniques perform in sensible ways. Theyjust generate additional missings of the various types in a given dataset, according to a specificprocess.
Usage
simulateMissings(x, dl=NULL, knownlimit=FALSE,MARprob=0.0, MNARprob=0.0, mnarity=0.5, SZprob=0.0)
observeWithAdditiveError(x, sigma=dl/dlf, dl=sigma*dlf, dlf=3,keepObs=FALSE, digits=NA, obsScale=1,class="acomp")
Arguments
x a dataset that should get the missings
dl the detection limit described in clo, to impose an artificial detection limit
knownlimit a boolean indicating wether the actual detection limit is still known in the dataset.
MARprob the probability of occurence of ’Missings At Random’ values
MNARprob the probability of occurrence of ’Missings Not At Random’. The tendency isthat small values have a higher probability to be missed.
mnarity a number between 0 and 1 giving the strength of the influence of the actual valuein becoming a MNAR. 0 means a MAR like behavior and 1 means that it is justthe smallest values that is lost
SZprob the probability to obtain a structural zero. This is done at random like a MAR.
sigma the standard deviation of the normal distributed extra additive error
dlf the distance from 0 at which a datum will be considered BDL
keepObs should the (closed) data without additive error be returned as an attribute?
digits rounding to be applied to the data with additive error (see Details)
230 simulatemissings
obsScale rounding to be applied to the data with additive error (see Details). Should be apower of 10.
class class of the output object
Details
Without any additional parameters no missings are generated. The procedure to generate MNARaffects all variables.
Function "simulateMissings" is a multipurpose simulator, where each class of missing value istreated separately, and where detection limits are specified as thresholds.
Function "observeWithAdditiveError" simulates data within a very specific framework, where anadditive error of sd=sigma is added to the input data x, and BDLs are generated if a datum is lessthan dfl times sigma. Afterwards, the resulting data are rounded as round(data/obsScale,digits)*obsScale,i.e. a certain observation scale obsScale is chosen, and at that scale, only some digits are kept.This framework is typical of chemical analyses, and it generates both BDLs and pollution/roundingof (apparently) "right" data.
Value
A dataset like x but with some additional missings.
Author(s)
K.Gerald van den Boogaart
References
van den Boogaart, K., R. Tolosana-Delgado, and M. Bren (2011). The Compositional Meaning of aDetection Limit. In Proceedings of the 4th International Workshop on Compositional Data Analysis(2011).
van den Boogaart, K.G., R. Tolosana-Delgado and M. Templ (2014) Regression with compositionalresponse having unobserved components or below detection limit values. Statistical Modelling (inpress).
See compositions.missings for more details.
See Also
compositions.missings
Examples
data(SimulatedAmounts)x <- acomp(sa.lognormals)xnew <- simulateMissings(x,dl=0.05,MAR=0.05,MNAR=0.05,SZ=0.05)acomp(xnew)plot(missingSummary(xnew))
SkyeAFM 231
Skulls Measurement of skulls
Description
Measurement in degrees of the angles N, A, B in skulls of English seventeenth-century people andNaquada people.
Usage
data(Skulls)
Details
As a part of a study of seventeenth-century English skulls three angles of a triangle in the cranium
N: nasial angle,A: alveolar angle,B: basilar angle,
were measured for 22 female and 29 male skulls. These, together with similar measurement of 22female and 29 male skulls of the Naqada race are presented. The general objective is to investigatepossible sex and race differences in skull shape. The angles sums in the row are all equal to 180degrees.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name SKULLS.DAT, here in-cluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, (Data 24), pp22.
SkyeAFM AFM compositions of 23 aphyric Skye lavas
Description
AFM compositions of 23 aphyric Skye lavas. AFM diagrams formed from the relative proportionsof A: alkali or Na2O + K2O, F: Fe2O3, and M: MgO, are common in geochemistry.
232 split
Usage
data(SkyeAFM)
Details
AFM compositions of 23 aphyric Skye lavas. AFM diagrams formed from the relative proportionsof tive proportions of A: alkali or Na2O + K2O, F: Fe2O3, and M: MgO. Adapted from Thompson,Esson and Duncan: Major element chemical variations in the Eocene lavas of the Isle of Skye,Scotland. All row percentage sums to 100.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name SKYEAFM.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, (Data 6) pp12.
Thompson, Esson and Duncan: Major element chemical variations in the Eocene lavas of the Isleof Skye, Scotland. 1972, J.Petrology, 13, 219-235.
See Also
Skye
split Splitting datasets in groups given by factors
Description
Splits data sets of compositions in groups given by factors, and gives the same class as the data tothe result.
Usage
## S3 method for class 'acomp'split(x,f,drop=FALSE,...)## S3 method for class 'rcomp'split(x,f,drop=FALSE,...)## S3 method for class 'aplus'split(x,f,drop=FALSE,...)## S3 method for class 'rplus'split(x,f,drop=FALSE,...)
straight 233
## S3 method for class 'rmult'split(x,f,drop=FALSE,...)## S3 method for class 'ccomp'split(x,f,drop=FALSE,...)
Arguments
x a dataset or a single vector of some type
f a factor that defines the grouping or a list of factors
drop drop=FALSE also gives (empty) datsets for empty categories
... Further arguments passed to split.default. Currently (and probably) without anyuse.
Value
a list of objects of the same type as x.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
split
Examples
data(SimulatedAmounts)split(acomp(sa.groups),sa.groups.area)lapply( split(acomp(sa.groups),sa.groups.area), mean)
straight Draws straight lines.
Description
The function draws lines in a given direction d through points x.
Usage
straight(x,...)## S3 method for class 'acomp'
straight(x,d,...,steps=30,aspanel=FALSE)## S3 method for class 'rcomp'
straight(x,d,...,steps=30,aspanel=FALSE)## S3 method for class 'aplus'
234 straight
straight(x,d,...,steps=30,aspanel=FALSE)## S3 method for class 'rplus'
straight(x,d,...,steps=30,aspanel=FALSE)## S3 method for class 'rmult'
straight(x,d,...,steps=30,aspanel=FALSE)
Arguments
x dataset of points of the given type to draw the line through
d dataset of directions of the line
... further graphical parameters
steps the number of discretisation points to draw the segments, since the representa-tion might not visually be a straight line
aspanel Logical, indicates use as slave to do acutal drawing only.
Details
The functions add lines to the graphics generated with the corresponding plot functions.Adding to multipaneled plots redraws the plot completely, and is only possible when the plot hasbeen created with the plotting routines from this library.Lines end when they leave the space (e.g. the simplex), which sometimes leads to the impressionof premature end (specially in rcomp geometry).
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
plot.acomp,lines.acomp
Examples
data(SimulatedAmounts)
plot(acomp(sa.lognormals))straight(mean(acomp(sa.lognormals)),
princomp(acomp(sa.lognormals))$Loadings[1,],col="red")
straight(mean(rcomp(sa.lognormals)),princomp(rcomp(sa.lognormals))$loadings[,1],col="blue")
plot(aplus(sa.lognormals[,1:2]))straight(mean(aplus(sa.lognormals[,1:2])),
princomp(aplus(sa.lognormals[,1:2]))$Loadings[1,],col="red")
straight(mean(rplus(sa.lognormals[,1:2])),princomp(rplus(sa.lognormals[,1:2]))$loadings[,1],
summary.acomp 235
col="blue")
plot(rplus(sa.lognormals[,1:2]))straight(mean(aplus(sa.lognormals[,1:2])),
princomp(aplus(sa.lognormals[,1:2]))$Loadings[1,],col="red")
straight(mean(rplus(sa.lognormals[,1:2])),princomp(rplus(sa.lognormals[,1:2]))$loadings[,1],col="blue")
summary.acomp Summarizing a compositional dataset in terms of ratios
Description
Summaries in terms of compositions are quite different from classical ones. Instead of analysingeach variable individually, we must analyse each pair-wise ratio in a log geometry.
Usage
## S3 method for class 'acomp'summary( object, ... ,robust=getOption("robust"))
Arguments
object a data matrix of compositions, not necessarily closed
... not used, only here for generics
robust A robustness description. See robustnessInCompositions for details. The pa-rameter can be null for avoiding any estimation.
Details
It is quite difficult to summarize a composition in a consistent and interpretable way. We tried toprovide such a summary here, based on the idea of the variation matrix.
Value
The result is an object of type "summary.acomp"
mean the mean.acomp composition
mean.ratio a matrix containing the geometric mean of the pairwise ratios
variation the variation matrix of the dataset ({variation.acomp})
expsd a matrix containing the one-sigma factor for each ratio, computed as exp(sqrt(variation.acomp(W))).To obtain a two-sigma-factor, one has to take its squared value (power 1.96, ac-tually).
236 summary.aplus
invexpsd the inverse of the preceding one, giving the reverse bound. Additionally, it canbe "almost" intepreted as a correlation coefficient, with values near one indicat-ing high proportionality between the components.
min a matrix containing the minimum of each of the pairwise ratios
q1 a matrix containing the 1-Quartile of each of the pairwise ratios
median a matrix containing the median of each of the pairwise ratios
q1 a matrix containing the 3-Quartile of each of the pairwise ratios
max a matrix containing the maximum of each of the pairwise ratios
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, R. Tolosana-Delgado
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
See Also
acomp
Examples
data(SimulatedAmounts)summary(acomp(sa.lognormals))
summary.aplus Summaries of amounts
Description
Summary of a vector of amounts, according to its underlying geometry.
Usage
## S3 method for class 'aplus'summary( object, ...,
digits=max(3, getOption("digits")-3), robust=NULL)## S3 method for class 'rplus'
summary( object, ..., robust=NULL)## S3 method for class 'rmult'
summary( object, ..., robust=NULL)
summary.rcomp 237
Arguments
object an aplus/rplus set of amounts
digits the number of significant digits to be used. The argument can also be used withrplus/rmult.
... not used, only here for generics
robust A robustness description. See robustnessInCompositions for details. The op-tion is currently not supported. If support is added the default will change togetOption(robust).
Details
The obtained value is the same as for the classical summary summary, although in the case of aplusobjects, the statistics have been computed in a logarithmic geometry, and exponentiated afterwards(which just changes the mean, equivalent to the geometric mean of the data set).
Value
A matrix containing summary statistics (minimum, the three quantiles, the mean and the maximum)of each component.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
aplus,rplus,summary.acomp, summary.rcomp
Examples
data(SimulatedAmounts)summary(aplus(sa.lognormals))summary(aplus(sa.tnormals))summary(rplus(sa.lognormals))summary(rplus(sa.tnormals))summary(rmult(sa.lognormals))
summary.rcomp Summary of compositions in real geometry
Description
Compute a summary of a composition based on real geometry.
238 sumprojector
Usage
## S3 method for class 'rcomp'summary( object, ... ,robust=NULL)
Arguments
object an rcomp dataset of compositions
... further arguments to summary
robust A robustness description. See robustnessInCompositions for details. The op-tion is currently not supported. If support is added the default will change togetOption(robust).
Details
The data is applied a clo operation before the computation. Note that the statistics obtained willnot keep any consistency if computed with all the parts available or only with a subcomposition.
Value
A matrix containing summary statistics. The value is the same as for the classical summary summaryapplied to a closed dataset.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
rcomp, summary.aplus, summary.acomp
Examples
data(SimulatedAmounts)summary(rcomp(sa.lognormals))summary(rcomp(sa.tnormals))
sumprojector Compute the global projector to the observed subspace.
Description
Routines to compute the global projector to the observed subspace, down-weighting the subspaceswith more missing values.
sumprojector 239
Usage
sumMissingProjector(x,...)## S3 method for class 'acomp'sumMissingProjector(x,has=is.NMV(x),...)## S3 method for class 'aplus'sumMissingProjector(x,has=is.NMV(x),...)## S3 method for class 'rcomp'sumMissingProjector(x,has=!(is.MAR(x)|is.MNAR(x)),...)## S3 method for class 'rplus'sumMissingProjector(x,has=!(is.MAR(x)|is.MNAR(x)),...)## S3 method for class 'rmult'sumMissingProjector(x,has=is.finite(x),...)
Arguments
x a dataset of some type containing missings
has the values to be regarded as non missing
... further generic arguments that might be useful for other functions.
Details
The function missingProjector generates a list of N square matrices of dimension DxD (with Nand D respectively equal to the number of rows and columns in x). Each of these matrices gives theprojection of a data row onto its observed sub-space. Then, the function sumMissingProjectortakes all these matrices and sums them in a efficient way, generating a "summary" of observedsub-spaces.
Value
The matrix of rotation/re-weighting of the original data set, down-weighting the subspaces withmore missing values. This matrix is useful to obtain estimates of the mean (and variance, in thefuture) still unbiased in the presence of lost values (only of type MAR, stricly-speaking, but anywayuseful for any type of missing value, when used with care). This matrix is the Fisher Information inthe presence of missing values.
Missing Policy
No missing policy is given by the routine itself. Its treatment of missing values depends on the"has" argument.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
240 Supervisor
References
Boogaart, K.G. v.d., R. Tolosana-Delgado, M. Bren (2006) Concepts for handling of zeros and miss-ing values in compositional data, in E. Pirard (ed.) (2006)Proccedings of the IAMG’2006 AnnualConference on "Quantitative Geology from multiple sources", September 2006, Liege, Belgium,S07-01, 4pages, http://stat.boogaart.de/Publications/iamg06_s07_01.pdf
See Also
missingProjector, clr,rcomp, aplus, princomp.acomp, plot.acomp, boxplot.acomp, barplot.acomp,mean.acomp, var.acomp, variation.acomp, cov.acomp, msd
Examples
data(SimulatedAmounts)sumMissingProjector(acomp(sa.lognormals))sumMissingProjector(acomp(sa.tnormals))
Supervisor Proportions of supervisor’s statements assigned to different categories
Description
The results of a study of a single supervisor in his relationship to three supervisee are recorded.Instructions in a technical subject took place in sessions of one hour and with only one supervisee atthe time. Each supervisee attended six sessions (once every two weeks in a twelve-week period). Allof 18 sessions were recorded and for each session the ’statements’ of the supervisor were classifiedinto four categories. Thus for each session the proportion of statements in the four categories areset out in a two-way table according to the fortnight (6) and the supervisee (3).
Usage
data(Supervisor)
Format
A 18x13 matrix
Details
For each session the ’statements’ of the supervisor were classified into four categories
• C:commanding, posing a specific instruction to the supervisee,• D:demanding, posing a specific question to the supervisee,• E:exposing, providing the supervisee with an explanation,• F:faulting, pointing out faulty technique to the supervisee.
Thus for each session the proportion of statements in the four categories are set out in a two-waytable according to the fortnight (6) and the supervisee (3). The C, D, E, F values in the rows summostly to 1, except for some rounding errors.
ternaryAxis 241
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name SUPERVIS.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison, J. (1986): The Statistical Analysis of Compositional Data, (Data 7) pp12.
ternaryAxis Axis for ternary diagrams
Description
Displaying compositions in ternary diagrams
Usage
ternaryAxis(side=1:3,at=seq(0.2,0.8,by=0.2),labels=if(is.list(at)) lapply(at,format) else format(at),...,tick=TRUE,pos=0,font.axis=par("font.axis"),font.lab=par("font.lab"),lty="solid",lwd=1,len.tck=0.025,dist.lab=0.03,dist.axis=0.03,lty.tck="solid",col.axis=par("col.axis"),col.lab=par("col.lab"),cex.axis=par("cex.axis"),cex.lab=par("cex.lab"),Xlab=NULL,Ylab=NULL,Zlab=NULL,small=TRUE,xpd=NA,aspanel=FALSE)
Arguments
side a vector giving the sides to draw the axis on. 1=under the plot, 2=the upperright axis, 3=the upper left axis. -1 is the portion axis of the first component, -2is the portion axis of the second component, -3 is the portion axis of the thirdcomponent. An empty vector or 0 suppresses axis plotting, but still plots theXlab, Ylab and Zlab parameters.
at a vector or a list of vectors giving the positions of the tickmarks.
242 ternaryAxis
labels a vector giving the labels or a list of things that can serve as graphics annotations.Each element of the list is than sean as the labels for one of axes. IMPORTANT:if plotting formulae enclose the list of labels into a list.
tick a logical whether to draw the tickmark lines
pos the portion of the opposite component to draw the axis on. Proportion axssshrinks, when pos>0 !
font.axis the font for the axis annotations
font.lab the font for the variable labels
lty the line type of the axis line. (see par). NA supresses plotting.
lty.tck the line type of the tickmarks. NA suppresses plotting.
len.tck the line length of the tickmarks.
dist.axis the distance of the variable labels from the axes. Positve values point outwardfrom the plot.
dist.lab the distance of the axes labels from the axes. Positve values point outward fromthe plot.
lwd the line widths of axis line and tickmarks. (see par)
col.axis the color to plot the axis line, the tickmarks and the axes labels.
col.lab the color to plot the variable labels.
cex.axis The character size to plot the axes labels. (see par)
cex.lab The character size for the variable labels
Xlab the label for the lower left component.
Ylab the label for the lower right component.
Zlab the label for the upper component.
small wether to plot the lower labels under the corners
xpd Extended plotting region. See (see par).
aspanel Is this called as a slave to acutally plot the axis (TRUE), or as a user level func-tion to instatiate the axis (FALSE).
... further graphical that might be of use for other functions, but are silently ignoredhere
Details
This function has two uses. If called with aspanel=TRUE it acutally draws the axes to a panel. Inother cases it tries to modify the axes argument of the current plot to add the axis. I.e. it will forcea replotting of the plot with the new axes settings. Thus an old axes is removed.
To ensure that various axes can be drawn with various parameters most of the arguments can take avector or list of the same length as side providing the different parameters for each of the axes tobe drawn.
There are two types of axes: Proportion axes (1:3) and portions axes (-1:-3). The best place to drawa Proportion axes is pos=0, which is the standard for axis in ternary diagrams. Portion axes are bestdrawn at pos=0.5 in the middle of the plot.
totals 243
Author(s)
K.Gerald v.d. Boyogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
See Also
plot.aplus, plot3D (for 3D plot), kingTetrahedron (for 3D-plot model export), qqnorm.acomp,boxplot.acomp
Examples
data(SimulatedAmounts)plot(acomp(sa.lognormals),axes=TRUE)ternaryAxis(side=1:3,pos=0,col.axis="red",col.lab="green")ternaryAxis(side=1:3,at=1:9/10,
labels=expression(9:1,4:1,7:3,3:2,1:1,2:3,3:7,1:4,1:9),pos=0,col.axis="red",col.lab="green")
ternaryAxis(side=rep(-1:-3,3),labels=paste(seq(20,80,by=20),"%"),pos=rep(c(0,0.5,1),each=3),col.axis=1:3,col.lab="green")
ternaryAxis(side=rep(1:3,3),at=1:9/10,labels=expression(9:1,4:1,7:3,3:2,1:1,2:3,3:7,1:4,1:9),pos=rep(c(0,0.5,1),each=3))
plot(acomp(sa.lognormals5),axes=TRUE)ternaryAxis(side=1:3,pos=0,col.axis="red",col.lab="green")ternaryAxis(side=1:3,at=1:9/10,
labels=expression(9:1,4:1,7:3,3:2,1:1,2:3,3:7,1:4,1:9),pos=0,col.axis="red",col.lab="green")
totals Total sum of amounts
Description
Calculates the total amount by summing the individual parts.
Usage
totals(x,...)## S3 method for class 'acomp'totals(x,...,missing.ok=TRUE)## S3 method for class 'rcomp'totals(x,...,missing.ok=TRUE)## S3 method for class 'aplus'totals(x,...,missing.ok=TRUE)## S3 method for class 'rplus'totals(x,...,missing.ok=TRUE)## S3 method for class 'ccomp'totals(x,...,missing.ok=TRUE)
244 transformations from ’mixtures’ to ’compositions’ classes
Arguments
x an amount/amount dataset
... not used, only here for generic purposes
missing.ok if TRUE ignores missings; if FALSE issues an error if the total cannot be calcu-lated due to missings.
Value
a numeric vector of length equal to ncol(x) containing the total amounts
Missing Policy
if missing.ok=TRUE missings are just regarded as 0, if missing.ok=FALSE WZERO values is stillregarded as 0 and other sorts lead to NA in the respective totals.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
aplus
Examples
data(SimulatedAmounts)totals(acomp(sa.lognormals))totals(rcomp(sa.lognormals,total=100))totals(aplus(sa.lognormals))totals(rplus(sa.lognormals))aplus(acomp(sa.lognormals),total=totals(aplus(sa.lognormals)))
transformations from ’mixtures’ to ’compositions’ classes
Transformations from ’mixtures’ to ’compositions’ classes
Description
Transformations from ’mixtures’ of the "mixR" library to ’compositions’ classes ’aplus’, ’acomp’,’rcomp’, ’rplus’ and ’rmult’.
tryDebugger 245
Usage
mix.2aplus(X)mix.2acomp(X)mix.2rcomp(X)mix.2rplus(X)mix.2rmult(X)
Arguments
X mixture object to be converted
Details
A ’compositions’ object is obtained from the mixtute object m, having the same data matrix asmixture object m i.e. m\$mat.
Value
A ’compositions’ object of the class ’aplus’, ’acomp’, ’rcomp’, ’rplus’ or ’rmult’.
See Also
aplus acomp rcomp rplus rmult
Examples
## Not run:m <- mix.Read("Glac.dat") # reads the Glacial data set from Aitchison (1986)m <- mix.Extract(m,c(1,2,3,4)) # mix object with closed four parts subcompositionap <- mix.2aplus(m) # ap is a 'compositions' object of the aplus classac <- mix.2acomp(m) # ac is a 'compositions' object of the acomp class
## End(Not run)
tryDebugger Empirical variograms for compositions
Description
An R-debugger that also works with errors in parameters.
Usage
tryDebugger(dump = last.dump)
246 ult
Arguments
dump An R dump object created by ’dump.frames’.
Details
Works like debugger, with the small exception that it also works in situations of nasty errors, likerecursive parameter evaluation, missing parameters, and additional errors in arguments.
Value
Nothing.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
debugger
Examples
## Not run:f <- function(x,y=y) {y}f(1)tryDebugger() # worksdebugger() # Does not allow to browse anything
## End(Not run)
ult Uncentered log transform
Description
Compute the uncentered log ratio transform of a (dataset of) composition(s) and its inverse.
Usage
ult( x ,...)ultInv( z ,...)Kappa( x ,...)
var.acomp 247
Arguments
x a composition or a data matrix of compositions, not necessarily closed
z the ult-transform of a composition or clr-transforms of compositions (or a datamatrix), not necessarily centered
... for generic use only
Details
The ult-transform is simply the elementwise log of the closed composition. The ult has someimportant properties in the scope of Information Theory of probability vectors (but might be mostlymisleading for exploratory analysis of compositions).
Value
ult gives the uncentered log transform,ultInv gives closed compositions with the given ult/clr-transformsKappa gives the difference between the clr and the ult transforms. It is quite linked to informationmeasures.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
ilr,alr,apt
Examples
(tmp <- ult(c(1,2,3)))ultInv(tmp)ultInv(tmp) - clo(c(1,2,3)) # 0data(Hydrochem)cdata <- Hydrochem[,6:19]pairs(ult(cdata),pch=".")Kappa(c(1,2,3))
var.acomp Variances and covariances of amounts and compositions
Description
Compute the (co)variance matrix in the several approaches of compositional and amount data anal-ysis.
248 var.acomp
Usage
var(x,...)## Default S3 method:
var(x, y=NULL, na.rm=FALSE, use, ...)## S3 method for class 'acomp'
var(x,y=NULL,...,robust=getOption("robust"),use="all.obs",giveCenter=FALSE)
## S3 method for class 'rcomp'var(x,y=NULL,...,robust=getOption("robust"),
use="all.obs",giveCenter=FALSE)## S3 method for class 'aplus'
var(x,y=NULL,...,robust=getOption("robust"),use="all.obs",giveCenter=FALSE)
## S3 method for class 'rplus'var(x,y=NULL,...,robust=getOption("robust"),
use="all.obs",giveCenter=FALSE)## S3 method for class 'rmult'
var(x,y=NULL,...,robust=getOption("robust"),use="all.obs",giveCenter=FALSE)
cov(x,y=x,...)## Default S3 method:
cov(x, y=NULL, use="everything",method=c("pearson", "kendall", "spearman"), ...)
## S3 method for class 'acomp'cov(x,y=NULL,...,robust=getOption("robust"),
use="all.obs",giveCenter=FALSE)## S3 method for class 'rcomp'
cov(x,y=NULL,...,robust=getOption("robust"),use="all.obs",giveCenter=FALSE)
## S3 method for class 'aplus'cov(x,y=NULL,...,robust=getOption("robust"),
use="all.obs",giveCenter=FALSE)## S3 method for class 'rplus'
cov(x,y=NULL,...,robust=getOption("robust"),use="all.obs",giveCenter=FALSE)
## S3 method for class 'rmult'cov(x,y=NULL,...,robust=getOption("robust"),
use="all.obs",giveCenter=FALSE)
Arguments
x a dataset, eventually of amounts or compositions
y a second dataset, eventually of amounts or compositions
na.rm see var
use see var
method see cov
var.acomp 249
... further arguments to var e.g. use
robust A description of a robust estimator. FALSE for the classical estimators. SeerobustnessInCompositions for further details.
giveCenter If TRUE the center used in the variance calculation is reported as a "center"attribute. This is especially necessary for robust estimations, where a reasonablecenter can not be computed independently for the me variance calculation.
Details
The basic functions of var, cov are turned to S3-generics. The original versions are copied to thedefault method. This allows us to introduce generic methods to handle variances and covariancesof other data types, such as amounts or compositions.If classed amounts or compositions are involved, they are transformed with their correspondingtransforms, using the centered default transform (cdt). That implies that the variances have to beinterpreded in a log scale level for acomp and aplus.We should be aware that variance matrices of compositions (acomp and rcomp) are singular. Theycan be transformed to the correponding nonsingular variances of ilr or ipt-space by clrvar2ilr.
In R versions older than v2.0.0, var and cov were defined in package “base” instead of in “stats”.This might produce some misfunction.
Value
The variance matrix of x or the covariance matrix of x and y.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
cdt, clrvar2ilr, clo, mean.acomp, acomp, rcomp, aplus, rplus, variation
Examples
data(SimulatedAmounts)meanCol(sa.lognormals)var(acomp(sa.lognormals))var(rcomp(sa.lognormals))var(aplus(sa.lognormals))var(rplus(sa.lognormals))cov(acomp(sa.lognormals5[,1:3]),acomp(sa.lognormals5[,4:5]))cov(rcomp(sa.lognormals5[,1:3]),rcomp(sa.lognormals5[,4:5]))cov(aplus(sa.lognormals5[,1:3]),aplus(sa.lognormals5[,4:5]))cov(rplus(sa.lognormals5[,1:3]),rplus(sa.lognormals5[,4:5]))cov(acomp(sa.lognormals5[,1:3]),aplus(sa.lognormals5[,4:5]))
svd(var(acomp(sa.lognormals)))
250 variation
variation Variation matrices of amounts and compositions
Description
Compute the variation matrix in the various approaches of compositional and amount data analysis.Pay attention that this is not computing the variance or covariance matrix!
Usage
variation(x,...)## S3 method for class 'acomp'
variation(x, ...,robust=getOption("robust"))## S3 method for class 'rcomp'
variation(x, ...,robust=getOption("robust"))## S3 method for class 'aplus'
variation(x, ...,robust=getOption("robust"))## S3 method for class 'rplus'
variation(x, ...,robust=getOption("robust"))## S3 method for class 'rmult'
variation(x, ...,robust=getOption("robust"))
Arguments
x a dataset, eventually of amounts or compositions
... currently unused
robust A description of a robust estimator. FALSE for the classical estimators. SeerobustnessInCompositions for further details.
Details
The variation matrix was defined in the acomp context of analysis of compositions as the matrixof variances of all possible log-ratios among components (Aitchison, 1986). The generalization torcomp objects is simply to reproduce the variance of all possible differences between components.The amount (aplus, rplus) and rmult objects should not be treated with variation matrices, becausethis was intended to skip the existence of a closure (which does not exist in the case of amounts).
Value
The variation matrix of x.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
variograms 251
See Also
cdt, clrvar2ilr, clo, mean.acomp, acomp, rcomp, aplus, rplus
Examples
data(SimulatedAmounts)meanCol(sa.lognormals)variation(acomp(sa.lognormals))variation(rcomp(sa.lognormals))variation(aplus(sa.lognormals))variation(rplus(sa.lognormals))variation(rmult(sa.lognormals))
variograms Variogram functions
Description
Valid scalar variogram model functions.
Usage
vgram.sph( h , nugget = 0, sill = 1, range= 1,... )vgram.exp( h , nugget = 0, sill = 1, range= 1,... )vgram.gauss( h , nugget = 0, sill = 1, range= 1,... )vgram.cardsin( h , nugget = 0, sill = 1, range= 1,... )vgram.lin( h , nugget = 0, sill = 1, range= 1,... )vgram.pow( h , nugget = 0, sill = 1, range= 1,... )vgram.nugget( h , nugget = 1,...,tol=1E-8 )
Arguments
h a vector providing distances, a matrix of distance vectors in its rows or a data.frameof distance vectors.
nugget The size of the nugget effect (i.e. the limit to 0). At zero itself the value isalways 0.
sill The sill (i.e. the limit to infinity)
range The range parameter. I.e. the distance in which sill is reached or if this does notexist, where the value is in some sense nearly the sill.
... not used
tol The distance that is considered as nonzero.
252 variograms
Details
The univariate variograms are used in the CompLinCoReg as building blocks of multivariate vari-ogram models.
• sphSpherical variogram
• expExponential variogram
• gaussThe Gaussian variogram.
• gaussThe cardinal sine variogram.
• linLinear Variogram. Increases over the sill, which is reached at range.
• powThe power variogram. Increases over the sill, which is reached at range.
• nuggetThe pure nugget effect variogram.
Value
A vector of size NROW(h), giving the variogram values.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
References
Cressie, N.C. (1993) Spatial statistics
Tolosana, van den Boogaart, Pawlowsky-Glahn (2009) Estimating and modeling variograms ofcompositional data with occasional missing variables in R, StatGis09
See Also
vgram2lrvgram, CompLinModCoReg, vgmFit
Examples
## Not run:data(juraset)X <- with(juraset,cbind(X,Y))comp <- acomp(juraset,c("Cd","Cu","Pb","Co","Cr"))lrv <- logratioVariogram(comp,X,maxdist=1,nbins=10)plot(lrv)
## End(Not run)
varmlm 253
varmlm Residual variance of a model
Description
Computes the unbiased estimate for the variance of the residuals of a model.
Usage
## S3 method for class 'mlm'var(x,...)## S3 method for class 'lm'var(x,...)
Arguments
x a linear model object
... Unused, for generic purposes only.
Details
The difference of this command to var(resid(X)) is that this command correctly adjusts for thedegrees of freedom of the model.
Value
var.lm returns a scalar giving the estimated variance of the residuals
var.mlm returns a the estimated variance covariance matrix of the residuals
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de, Raimon Tolosana-Delgado
See Also
vcov
Examples
data(Orange)var(lm(circumference~age,data=Orange))var(lm(cbind(circumference,age)~age,data=Orange))
254 vcovAcomp
vcovAcomp Variance covariance matrix of parameters in compositional regression
Description
The variance covariance tensor structured according of linear models with ilr(acomp(...)) responses.
Usage
vcovAcomp(object,...)
Arguments
object a statistical model
... further optional parameters for vcov
Details
The prediction error in compositional linear regression models is a complicated object. The functionshould help to organize it.
Value
An array with 4 dimensions. The first 2 are the index dimensions of the ilr transform. The later 2are the index of the parameter.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
vcov
Examples
data(SimulatedAmounts)model <- lm(ilr(sa.groups)~sa.groups.area)vcovAcomp(model)[,,1,1]
vgmFit 255
vgmFit Compositional variogram model fitting
Description
Fits a parametric variogram model to an empirical logratio-Variogram
Usage
vgmFit2lrv(emp,vg,...,mode="log",psgn=rep(-1,length(param)),print.level=1)vgmFit(emp,vg,...,mode="log",psgn=rep(-1,length(param)),print.level=1)vgmGof(p = vgmGetParameters(vg), emp, vg, mode = "log")vgmGetParameters(vg,envir=environment(vg))vgmSetParameters(vg,p)
Arguments
emp An empirical logratio-Variogram as e.g. returned by logratioVariogram
vg A compositional clr-variogram (or ilt-vagriogram) model function.
... further parameters to nlm
mode either "ls" or "log" for selection of either using either least squares or leastsquares on logarithmic values.
psgn Contains a parameter code for each of the parameters. -1 means the parametershould be used as is. 0 means the parameter is nonnegativ and 1 means theparameter is striktly positiv. This allows to provide parameter limits if the fittingprocedure fails.
print.level The print.level of nlm. 0 for no printing. 1 for a rough information about thesucess and 2 for step by step printing.
p Is the parameter of the variogram model in linearized form as e.g. returned byvgmGetParameters.
envir The environment the default parameters of the model should be evaluated in.
Details
The function is mainly a wrapper to nlm specifying the an objective function for modell fitting,taking the starting values of fitting procedure from the default arguments and writing the resultsback. Variogram model fitting is more an art than a straight forward procedure. Fitting procedurestypically only find a right optimum if reasonable starting parameters are provided. The fit shouldbe visually checked afterwards.The meaning of psgn is subject to change. We will probably provide a more automatic procedurelater.vgmFit is a copy of vgmFit2lrv, but deprecated. The name will later be used for other functionality.
256 WhiteCells
Value
vgmFit2lrv returns a list of two elements.
nlm The result of nlm containing covergence codes.
vg A version of vg but with default parameters modified according to the fitting.
vgmGof returns a scalar quantifiying the goodness of fit, of a model and an empirical variogram.vgmGetParameters extracts the default values of a variogram model function to a parameter vector.It returns a numeric vector.vgmSetParameters does the inverse operation and modifies the default according to the new valuesin p. It returns vg with modifiend default parameter values.
Author(s)
K.Gerald v.d. Boogaart http://www.stat.boogaart.de
See Also
vgram2lrvgram, CompLinModCoReg, logratioVariogram
Examples
## Not run:data(juraset)X <- with(juraset,cbind(X,Y))comp <- acomp(juraset,c("Cd","Cu","Pb","Co","Cr"))lrv <- logratioVariogram(comp,X,maxdist=1,nbins=10)fff <- CompLinModCoReg(~nugget()+sph(0.5)+R1*exp(0.7),comp)fit <- vgmFit(lrv,fff)fitfff(1:3)plot(lrv,lrvg=vgram2lrvgram(fit$vg))
## End(Not run)
WhiteCells White-cell composition of 30 blood samples by two different methods
Description
In 30 blood samples portions of three kinds of white cells
• G:granulocytes,
• L:lymphocytes,
• M:monocytes,
were determined with two methods, time-consuming microscopic and automatic image analysis.The resulting 30 pairs of 3-part compositions are recorded.
Yatquat 257
Usage
data(WhiteCells)
Format
A 30x6 matrix
Details
In an experiment each of 30 blood samples was halved, one half being assigned randomly to onemethod, the other half to the other method. We have 60 cases of 3-part compositions but theseare essentially 30 pairs of related compositions. All 3-part portions sums to one, except for somerounding errors.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name WCELLS.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, 1986 (Data 9) pp16.
Yatquat Yatquat fruit evaluation
Description
The quality of yatquat tree fruit is assessed in terms of relative proportions by volume of flesh,skin and stone. In an experiment an arboriculturist uses 40 trees, randomly allocated 20 to thehormone treatment and leaves untreated the remaining 20 trees. Data provides fruit compositionsof the present season and the preceding season, as well as the treatment: 1 for the treated trees, -1for untreated trees.
Usage
data(Yatquat)
Format
A 40x7 data matrix
258 zeroreplace
Details
The yatquat tree produces each season a single large fruit. Data provides fruit compositions of thepresent season, the compositions of the fruit of the same 40 trees for the preceding season whennone of the trees were treated, and in addition the Type: 1 for the treated trees, -1 for untreatedtrees. For each of the 40 cases we have two 3-part composition on flesh, skin and stone. Thecolumn names are:
prFL portion of fruit flesh in the present season,prSK portion of fruit skin in the present season,prST portion of fruit stone in the present season,Type 1 for treated, $-1$ for untreated trees,paFL portion of fruit flesh in the preceding season,paSK portion of fruit skin in the preceding season,paST portion of fruit stone in the preceding season,
All 3-part compositions sum to one.
Note
Courtesy of J. Aitchison
Source
Aitchison: CODA microcomputer statistical package, 1986, the file name YATQUAT.DAT, hereincluded under the GNU Public Library Licence Version 2 or newer.
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data, (Data 12) pp17.
Examples
#data(Yatquat)#plot(acomp(Yatquat[,1:3]),col=as.numeric(Yatquat[,4])+2)#plot(acomp(Yatquat[,5:7]),col=as.numeric(Yatquat[,4])+2)
zeroreplace Zero-replacement routine
Description
A function to automatically replace rounded zeroes/BDLs in a composition.
Usage
zeroreplace(x,d=NULL,a=2/3)
zeroreplace 259
Arguments
x composition or dataset of compositions
d vector containing the detection limits of each part
a fraction of the detection limit to be used in replacement
Details
If d is given, zeroes from each column of x are replaced by the corresponding detection limit con-tained there, scaled down by the value of a (usually a scalar, although if it is a vector it will berecycled with a warning). The variable d should be a vector of length equal to ncol(x) or a matrixof the same shape as x.If d=NULL, then the detection limit is extracted from the data set, if it is available there (i.e., if thereare negative numbers). If no negative number is present in the data set, and no value is given for d,the result will be equal to x. See compositions.missings for more details on the missing policy.
Value
an object of the same class as x, where all WZERO values have been replaced. Output contains afurther attribute (named Losts), with a logical array of the same dimensions as x, showing whichelements were replaced (TRUE) and which were kept unchanged (FALSE).
References
Aitchison, J. (1986) The Statistical Analysis of Compositional Data Monographs on Statistics andApplied Probability. Chapman & Hall Ltd., London (UK). 416p.
Mart\’in-Fern\’andez, J.A.; Barcel\’o-Vidal, C. and Pawlowsky-Glahn, V. (2003) Dealing With Ze-ros and Missing Values in Compositional Data Sets Using Nonparametric Imputation. Mathemati-cal Geology, 35 , 253-278
http://ima.udg.es/Activitats/CoDaWork03
http://ima.udg.es/Activitats/CoDaWork05
See Also
compositions.missings,getDetectionlimit
Examples
data(SimulatedAmounts)x <- acomp(sa.lognormals)xnew <- simulateMissings(x,dl=0.05,knownlimit=FALSE)xnewxrep <- zeroreplace(xnew,0.05)xrep
Index
∗Topic IOmix.Read, 126Read standard data files, 195
∗Topic NAgetdetectionlimit, 83missingProjector, 123missingsummary, 125
∗Topic aplotreplot, 197ternaryAxis, 241
∗Topic classesacomp, 12ccomp, 44is.acomp, 102print.acomp, 178transformations from ’mixtures’
to ’compositions’ classes, 244∗Topic cluster
ClusterFinder1, 56∗Topic datagen
simulatemissings, 229∗Topic datasets
Activity10, 18Activity31, 19AnimalVegetation, 23ArcticLake, 28Bayesite, 36Blood23, 40Boxite, 41ClamEast, 50ClamWest, 51Coxite, 70DiagnosticProb, 72Firework, 78Glacial, 84Hongite, 89HouseholdExp, 91Hydrochem, 91jura, 106
Kongite, 109Metabolites, 119PogoJump, 164Sediments, 217SerumProtein, 219ShiftOperators, 220SimulatedAmounts, 222Skulls, 231SkyeAFM, 231Supervisor, 240WhiteCells, 256Yatquat, 257
∗Topic debuggingtryDebugger, 245
∗Topic distributionrAitchison, 184rMahalanobis, 200rnorm, 206rpois, 212runif, 213
∗Topic dynamickingTetrahedron, 107
∗Topic hplotCoDaDendrogram, 58pairwiseplot, 145plot.acomp, 150pwlrPlot, 180simplemissingplot, 221
∗Topic htestccompgof, 45fitdirichlet, 79fitSameMeanDifferentVarianceModel,
80gausstest, 81gof, 85NormalTests, 134
∗Topic logicbinary, 37
∗Topic multivariate
260
INDEX 261
acomparith, 14acompmargin, 16acompscalarproduct, 17alr, 21aplus, 23aplusarithm, 25apt, 27arrows3D, 29as.data.frame, 30axis3D, 31balance, 32barplot.acomp, 34biplot3D, 39boxplot, 42cdt, 48clo, 52clr, 53clr2ilr, 55coloredBiplot, 61colorsForOutliers, 63CompLinModCoReg, 64compOKriging, 66ConfRadius, 67cor.acomp, 68cpt, 71dist, 73ellipses, 74endmemberCoordinates, 76groupparts, 88HotellingsTsq, 90idt, 93iit, 95ilr, 96ilrBase, 98ilt, 99ipt, 100IsMahalanobisOutlier, 103isoPortionLines, 104lines, 110logratioVariogram, 111lrvgram, 113MahalanobisDist, 114matmult, 116mean.acomp, 117missing.compositions, 120mvar, 128names, 130norm, 131
normalize, 133oneOrDataset, 135outlierclassifier, 136outlierplot, 138parametricMat, 147perturbe, 148plot.aplus, 153plot3D, 155plot3Dacomp, 156plot3Daplus, 158plot3Drmult, 159plot3Drplus, 160plotlogratioVariogram, 162plotmissingsummary, 163powerofpsdmatrix, 165princomp.acomp, 166princomp.aplus, 169princomp.rcomp, 172princomp.rmult, 174princomp.rplus, 176qqnorm, 182R2, 183ratioLoadings, 187rcomp, 189rcomparithm, 192rcompmargin, 193rDirichlet, 194rlnorm, 199rmult, 202rmultarithm, 203rmultmatmult, 204rplus, 209rplusarithm, 211scalar, 214scale, 215segments, 218split, 232straight, 233summary.acomp, 235summary.aplus, 236summary.rcomp, 237sumprojector, 238totals, 243ult, 246var.acomp, 247variation, 250variograms, 251varmlm, 253
262 INDEX
vcovAcomp, 254vgmFit, 255zeroreplace, 258
∗Topic packagecompositions-package, 5
∗Topic robustoutliersInCompositions, 142robustnessInCompositions, 208
∗Topic univargeometricmean, 82meanrow, 118
*.acomp (acomparith), 14*.aplus, 149*.aplus (aplusarithm), 25*.rcomp (rcomparithm), 192*.rmult (rmultarithm), 203*.rplus (rplusarithm), 211+.acomp, 192, 212+.acomp (perturbe), 148+.aplus (aplusarithm), 25+.rcomp, 191+.rcomp (rcomparithm), 192+.rmult, 25, 192, 203, 212+.rmult (rmultarithm), 203+.rplus, 149+.rplus (rplusarithm), 211-.acomp (perturbe), 148-.aplus (aplusarithm), 25-.rcomp (rcomparithm), 192-.rmult (rmultarithm), 203-.rplus (rplusarithm), 211/.acomp (acomparith), 14/.aplus (aplusarithm), 25/.rcomp (rcomparithm), 192/.rmult (rmultarithm), 203/.rplus (rplusarithm), 211%*% (matmult), 116%*%.acomp (acompscalarproduct), 17%*%.aplus (acompscalarproduct), 17%*%.rmult (rmultmatmult), 204%*%, 116%*%.rmult, 18, 26, 116, 203–205
acomp, 10, 12, 15–17, 24, 25, 35, 49, 52, 53,60, 94, 102, 117, 118, 122, 123, 129,149, 167, 168, 179–181, 190, 191,193, 203, 215, 223, 236, 245,249–251
acomparith, 14
acompGOF.test (gof), 85acompmargin, 16, 150, 152, 194acompNormalGOF.test (gof), 85acompNormalLocation.test, 81acompNormalLocation.test (NormalTests),
134acompscalarproduct, 17acompTwoSampleGOF.test (gof), 85Activity10, 18Activity31, 19AIC, 68, 184AitchisonDistributionIntegrals
(rAitchison), 184alr, 13, 15, 21, 22, 24, 28, 54, 97, 191, 247alrInv (alr), 21AnimalVegetation, 23aplus, 10, 14, 23, 35, 49, 74, 89, 94, 95, 100,
102, 117, 118, 122, 129, 131, 170,171, 210, 211, 215, 223, 237, 240,244, 245, 249–251
aplusarithm, 25apt, 22, 27, 54, 71, 72, 97, 101, 191, 247aptInv (apt), 27ArcticLake, 28arrows3D, 29, 31, 32, 40as.data.frame, 30as.missingSummary (plotmissingsummary),
163axis3D, 31
balance, 32, 146balance01 (balance), 32balanceBase, 58, 60balanceBase (balance), 32barplot, 35, 163, 172, 176, 189barplot.acomp, 14, 34, 166, 168, 169, 179,
240barplot.aplus, 25, 171barplot.aplus (barplot.acomp), 34barplot.ccomp, 45barplot.ccomp (barplot.acomp), 34barplot.rcomp, 191barplot.rcomp (barplot.acomp), 34barplot.rplus, 211barplot.rplus (barplot.acomp), 34Bayesite, 36BDL (missing.compositions), 120BDLvalue (missing.compositions), 120binary, 37
INDEX 263
binary2logical (binary), 37biplot, 39, 62biplot3D, 39bit (binary), 37bit<- (binary), 37bitCount (binary), 37Blood23, 40Boxite, 41boxplot, 42, 181boxplot.acomp, 14, 35, 152, 155, 179, 183,
198, 240, 243boxplot.aplus, 25boxplot.rcomp, 191boxplot.rplus, 211bxp, 43
ccomp, 44ccompgof, 45ccompMultinomialGOF.test, 45ccompMultinomialGOF.test (ccompgof), 45ccompPoissonGOF.test, 45ccompPoissonGOF.test (ccompgof), 45cdt, 48, 73, 94, 117, 129, 215, 249, 251cdt.ccomp, 45cdtInv, 94cdtInv (cdt), 48cdtInv.ccomp, 45cgram2vgram (lrvgram), 113ClamEast, 50ClamWest, 51clo, 13, 52, 118, 190, 229, 238, 249, 251clr, 13–15, 22, 24, 33, 49, 53, 53, 72, 97, 99,
157, 168, 179, 191, 240clr2ilr, 55clrInv (clr), 53clrvar2ilr, 249, 251clrvar2ilr (clr2ilr), 55ClusterFinder1, 56, 104, 137, 141, 143, 209CoDaDendrogram, 58coloredBiplot, 61, 140colorsForOutliers, 63colorsForOutliers1 (colorsForOutliers),
63colorsForOutliers2 (colorsForOutliers),
63CompLinModCoReg, 64, 65, 67, 112, 114, 148,
162, 252, 256compOKriging, 66
composition.missing(missing.compositions), 120
composition.missings(missing.compositions), 120
compositions (compositions-package), 5compositions-package, 5, 11, 123, 143, 209compositions.missing, 13, 25, 45, 53, 118,
126, 191, 210compositions.missing
(missing.compositions), 120compositions.missings, 84, 164, 179, 230,
259compositions.missings
(missing.compositions), 120ConfRadius, 67, 90convex.rcomp, 191convex.rcomp (rcomparithm), 192cor, 69cor (cor.acomp), 68cor.acomp, 68cov, 129, 248, 249cov (var.acomp), 247cov.acomp, 14, 240cov.aplus, 25cov.rcomp, 191cov.rplus, 211covMcd, 117, 208Coxite, 70cpt, 27, 28, 49, 71, 101, 102, 174, 191, 192,
212cptInv (cpt), 71
dAitchison (rAitchison), 184Data01 (Hongite), 89Data02 (Kongite), 109Data03 (Boxite), 41Data04 (Coxite), 70Data05 (ArcticLake), 28Data06 (SkyeAFM), 231Data07 (Supervisor), 240DATA08 (HouseholdExp), 91Data09 (Metabolites), 119DATA10 (Activity10), 18Data11 (WhiteCells), 256Data12 (Yatquat), 257Data13 (Firework), 78Data14 (ClamEast), 50Data15 (ClamWest), 51Data16 (SerumProtein), 219
264 INDEX
Data17 (DiagnosticProb), 72DATA18 (Glacial), 84Data19 (PogoJump), 164Data20 (Sediments), 217Data21 (Bayesite), 36Data22 (ShiftOperators), 220Data23 (Blood23), 40Data24 (Skulls), 231Data25 (AnimalVegetation), 23DATA31 (Activity31), 19debugger, 246density, 59dev.copy, 197DiagnosticProb, 72dist, 13, 24, 73, 73, 115, 202dlnorm.rplus (rlnorm), 199dnorm.acomp (rnorm), 206dnorm.aplus (rnorm), 206dnorm.rmult (rnorm), 206
ellipses, 74, 90endmemberCoordinates, 76endmemberCoordinatesInv
(endmemberCoordinates), 76
Firework, 78fitDirichlet, 47, 87, 135fitDirichlet (fitdirichlet), 79fitdirichlet, 79fitSameMeanDifferentVarianceModel, 45,
80
gadic (binary), 37Gauss.test (gausstest), 81gausstest, 81geometricmean, 82, 118geometricmeanCol (geometricmean), 82geometricmeanRow (geometricmean), 82getDetectionlimit, 259getDetectionlimit (getdetectionlimit),
83getdetectionlimit, 83Glacial, 84gof, 85groupparts, 88groupparts.ccomp, 45gsi, 40gsi.acompUniformityGOF.test (gof), 85gsi.geometricmean (geometricmean), 82
gsi.geometricmeanCol (geometricmean), 82gsi.geometricmeanRow (geometricmean), 82gsi.merge2signary, 59gsi.orSum (binary), 37gsi.pStore, 201gsi.pStore (rMahalanobis), 200
has.missings (missing.compositions), 120hclust, 58, 59, 139HEMF (HouseholdExp), 91histogram, 59Hongite, 89HotellingsTsq, 90HouseholdExp, 91Hydrochem, 91
identify, 151, 154idt, 49, 59, 93, 103, 129idt.ccomp, 45idtInv (idt), 93idtInv.ccomp, 45iit, 49, 94, 95, 100, 178, 210, 211iitInv, 94iitInv (iit), 95ilr, 13, 15, 22, 24, 33, 54, 56, 94, 96, 96, 99,
100, 102, 123, 157, 191, 247ilr2clr (clr2ilr), 55ilrBase, 33, 59, 60, 97, 98, 102ilrBaseList (ilrBase), 98ilrInv, 94ilrInv (ilr), 96ilrvar2clr (clr2ilr), 55ilt, 24, 25, 49, 94, 96, 99, 159, 170, 171, 210iltInv, 94iltInv (ilt), 99ipt, 27, 28, 33, 56, 71, 72, 94, 99, 100, 157,
191iptInv, 94iptInv (ipt), 100is.acomp, 102is.aplus (is.acomp), 102is.BDL (missing.compositions), 120is.ccomp, 45is.ccomp (is.acomp), 102is.MAR (missing.compositions), 120is.MNAR (missing.compositions), 120is.MNV (missing.compositions), 120is.NMV (missing.compositions), 120is.rcomp (is.acomp), 102
INDEX 265
is.rmult (is.acomp), 102is.rplus (is.acomp), 102is.SZ (missing.compositions), 120is.WMNAR (missing.compositions), 120is.WZERO (missing.compositions), 120IsMahalanobisOutlier, 103isoPortionLines, 104, 152isoProportionLines (isoPortionLines),
104
jura, 106jura259 (jura), 106juraset (jura), 106
Kappa (ult), 246kingTetrahedron, 107, 152, 157, 158, 243kmeans, 58Kongite, 109
lines, 110lines.acomp, 218, 234lm, 68, 184lmrob, 146, 180logratioVariogram, 111, 114, 162, 255, 256lrvgram, 113
mahalanobis, 115, 202MahalanobisDist, 114, 201MAR (missing.compositions), 120MARvalue (missing.compositions), 120matmult, 116maxBit (binary), 37mcor (mvar), 128mcov (mvar), 128mean, 117, 119mean.acomp, 14, 69, 117, 123, 129, 168, 179,
209, 235, 240, 249, 251mean.aplus, 25, 171mean.aplus (mean.acomp), 117mean.ccomp, 45mean.ccomp (mean.acomp), 117mean.col (meanrow), 118mean.rcomp, 191mean.rcomp (mean.acomp), 117mean.rmult (mean.acomp), 117mean.row (meanrow), 118mean.rplus, 83, 119, 211mean.rplus (mean.acomp), 117meanCol, 117, 118
meanCol (meanrow), 118meanRow (meanrow), 118meanrow, 118Metabolites, 119missing.compositions, 120missingProjector, 123, 239, 240missings, 209missings (missing.compositions), 120missingsInCompositions, 11, 123, 124, 143missingsInCompositions
(missing.compositions), 120missingSummary, 121, 164missingSummary (missingsummary), 125missingsummary, 125missingType, 43missingType (missingsummary), 125mix.2acomp (transformations from
’mixtures’ to ’compositions’classes), 244
mix.2aplus (transformations from’mixtures’ to ’compositions’classes), 244
mix.2rcomp (transformations from’mixtures’ to ’compositions’classes), 244
mix.2rmult (transformations from’mixtures’ to ’compositions’classes), 244
mix.2rplus (transformations from’mixtures’ to ’compositions’classes), 244
mix.Read, 126MNAR (missing.compositions), 120MNARvalue (missing.compositions), 120msd, 14, 25, 191, 211, 240msd (mvar), 128mul.rplus (rplusarithm), 211mvar, 68, 128, 184
na.fail, 119na.omit, 119na.pass, 119names, 130names.ccomp, 45names<-.acomp (names), 130names<-.aplus (names), 130names<-.ccomp (names), 130names<-.rcomp (names), 130names<-.rmult (names), 130
266 INDEX
names<-.rplus (names), 130nlm, 255, 256NMV (missing.compositions), 120noreplot (replot), 197norm, 131norm.rmult, 133, 203normalize, 132, 133NormalTests, 134
observeWithAdditiveError(simulatemissings), 229
observeWithDetectionLimit(simulatemissings), 229
observeWithDetectionlimit(simulatemissings), 229
oneOrDataset, 135outlierclassifier, 136OutlierClassifier1, 63, 64, 103, 104, 115,
141–143, 202, 209OutlierClassifier1 (outlierclassifier),
136outlierplot, 38, 104, 137, 138, 142, 143, 209outliersInCompositions, 11, 57, 103, 123,
137, 139, 142, 143, 208
pairs, 152pairwisePlot, 181pairwisePlot (pairwiseplot), 145pairwiseplot, 145par, 58, 59, 146, 150, 154, 242parameterPosdefClrMat (parametricMat),
147parameterPosdefMat (parametricMat), 147parameterRank1ClrMat (parametricMat),
147parameterRank1Mat (parametricMat), 147parametricMat, 147parametricPosdefClrMat (parametricMat),
147parametricPosdefMat (parametricMat), 147parametricRank1ClrMat (parametricMat),
147parametricRank1Mat (parametricMat), 147pchForOutliers1 (colorsForOutliers), 63pEmpiricalMahalanobis (rMahalanobis),
200perturbe, 13, 148perturbe.aplus, 24perturbe.aplus (aplusarithm), 25
pf, 90pHotellingsTsq (HotellingsTsq), 90plot, 30, 32, 156, 158–161plot.acomp, 14, 35, 44, 62, 75, 106, 109, 111,
123, 150, 179, 183, 198, 218, 234,240
plot.aplus, 25, 146, 152, 153, 155, 181, 198,222, 243
plot.ccomp, 45plot.ccomp (plot.acomp), 150plot.default, 181plot.logratioVariogram
(plotlogratioVariogram), 162plot.missingSummary
(plotmissingsummary), 163plot.princomp.acomp (princomp.acomp),
166plot.princomp.aplus (princomp.aplus),
169plot.princomp.rcomp (princomp.rcomp),
172plot.princomp.rplus (princomp.rplus),
176plot.rcomp, 191plot.rcomp (plot.acomp), 150plot.relativeLoadings.princomp.acomp
(ratioLoadings), 187plot.relativeLoadings.princomp.aplus
(ratioLoadings), 187plot.relativeLoadings.princomp.rcomp
(ratioLoadings), 187plot.relativeLoadings.princomp.rplus
(ratioLoadings), 187plot.rmult (plot.aplus), 153plot.rplus, 211plot.rplus (plot.aplus), 153plot3D, 30, 32, 152, 155, 158, 159, 161, 243plot3D.acomp, 156, 158–161plot3D.acomp (plot3Dacomp), 156plot3D.aplus, 156, 158–161plot3D.aplus (plot3Daplus), 158plot3D.rcomp, 156, 158–161plot3D.rcomp (plot3Dacomp), 156plot3D.rmult, 156, 158–161plot3D.rmult (plot3Drmult), 159plot3D.rplus, 156, 158–160plot3D.rplus (plot3Drplus), 160plot3Dacomp, 156
INDEX 267
plot3Daplus, 158plot3Drmult, 159plot3Drplus, 160plotlogratioVariogram, 162plotmissingsummary, 163pMaxMahalanobis (rMahalanobis), 200PogoJump, 164points3d, 30, 32, 156, 158–161PoissonGOF.test, 45PoissonGOF.test (ccompgof), 45power.acomp, 13power.acomp (acomparith), 14power.aplus, 24power.aplus (aplusarithm), 25powerofpsdmatrix, 165pPortionMahalanobis (rMahalanobis), 200pQuantileMahalanobis, 143pQuantileMahalanobis (rMahalanobis), 200predict.princomp.acomp
(princomp.acomp), 166predict.princomp.aplus
(princomp.aplus), 169predict.princomp.rcomp
(princomp.rcomp), 172predict.princomp.rplus
(princomp.rplus), 176princomp.acomp, 14, 166, 170, 171, 174, 188,
189, 240princomp.aplus, 25, 168, 169, 178, 188, 189princomp.default, 174princomp.rcomp, 168, 172, 178, 188, 189, 191princomp.rmult, 174princomp.rplus, 171, 174, 175, 176, 188,
189, 211print.acomp, 178print.aplus (print.acomp), 178print.princomp.acomp (princomp.acomp),
166print.princomp.aplus (princomp.aplus),
169print.princomp.rcomp (princomp.rcomp),
172print.princomp.rplus (princomp.rplus),
176print.rcomp (print.acomp), 178print.relativeLoadings.princomp.acomp
(ratioLoadings), 187print.relativeLoadings.princomp.aplus
(ratioLoadings), 187print.relativeLoadings.princomp.rcomp
(ratioLoadings), 187print.relativeLoadings.princomp.rplus
(ratioLoadings), 187print.rmult (rmult), 202print.rplus (print.acomp), 178pwlrPlot, 146, 180pwlrplot (pwlrPlot), 180
qEmpiricalMahalanobis, 142qEmpiricalMahalanobis (rMahalanobis),
200qHotellingsTsq (HotellingsTsq), 90qMaxMahalanobis, 142qMaxMahalanobis (rMahalanobis), 200qPortionMahalanobis (rMahalanobis), 200qqnorm, 182qqnorm.acomp, 44, 152, 155, 243
R2, 183rAitchison, 184ratioLoadings, 187rcomp, 10, 14, 24, 35, 49, 52, 53, 60, 94, 102,
117, 118, 122, 129, 173, 174, 189,192–194, 210–212, 215, 234, 238,240, 245, 249, 251
rcomparithm, 192rcompmargin, 17, 150, 152, 193rDirichlet, 47, 80, 87, 135, 194rDirichlet.acomp, 187, 207, 214Read standard data files, 195read.geoEAS, 128read.geoEAS (Read standard data files),
195read.geoeas, 128read.geoeas (Read standard data files),
195read.table, 128, 196relativeLoadings, 167, 168, 170, 171, 173,
174, 177, 178relativeLoadings (ratioLoadings), 187rEmpiricalMahalanobis (rMahalanobis),
200replot, 197replotable (replot), 197rgl.material, 29, 31, 39, 155, 157–159, 161rlnorm, 199rlnorm.rplus, 207
268 INDEX
rlnorm.rplus (rlnorm), 199rMahalanobis, 200rMaxMahalanobis (rMahalanobis), 200rmult, 26, 77, 123, 177, 192, 202, 204, 205,
212, 245rmultarithm, 203rmultinom.ccomp, 45rmultinom.ccomp (rpois), 212rmultmatmult, 204rnorm, 206rnorm.acomp, 47, 80, 87, 135, 183, 187, 195,
199rnorm.acomp (rnorm), 206rnorm.aplus, 183rnorm.aplus (rnorm), 206rnorm.ccomp, 45, 213rnorm.ccomp (rnorm), 206rnorm.rcomp, 183rnorm.rcomp (rnorm), 206rnorm.rmult (rnorm), 206rnorm.rplus, 183rnorm.rplus (rnorm), 206robust (robustnessInCompositions), 208robustnessInCompositions, 11, 103, 114,
123, 139, 142, 143, 151, 155, 166,201, 208, 216, 235, 237, 238, 249,250
rplus, 10, 24, 25, 35, 49, 94–96, 102, 117,118, 122, 129, 177, 178, 191, 203,209, 215, 237, 245, 249–251
rplusarithm, 211rpois, 212rpois.ccomp, 45, 206rpois.ccomp (rpois), 212rPortionMahalanobis (rMahalanobis), 200Rsquare (R2), 183runif, 213runif.acomp, 47, 80, 87, 135, 187, 207
sa.dirichlet (SimulatedAmounts), 222sa.dirichlet5 (SimulatedAmounts), 222sa.groups (SimulatedAmounts), 222sa.groups5 (SimulatedAmounts), 222sa.lognormals (SimulatedAmounts), 222sa.lognormals5 (SimulatedAmounts), 222sa.missings (SimulatedAmounts), 222sa.missings5 (SimulatedAmounts), 222sa.outliers1 (SimulatedAmounts), 222sa.outliers2 (SimulatedAmounts), 222
sa.outliers3 (SimulatedAmounts), 222sa.outliers4 (SimulatedAmounts), 222sa.outliers5 (SimulatedAmounts), 222sa.outliers6 (SimulatedAmounts), 222sa.tnormals (SimulatedAmounts), 222sa.tnormals5 (SimulatedAmounts), 222sa.uniform (SimulatedAmounts), 222sa.uniform5 (SimulatedAmounts), 222scalar, 203, 214scale, 151, 154, 215, 216scale.default, 216Sediments, 217segments, 218SerumProtein, 219shapiro.test, 182ShiftOperators, 220simplemissingplot, 221simpleMissingSubplot
(simplemissingplot), 221SimulatedAmounts, 222simulateMissings (simulatemissings), 229simulatemissings, 229Skulls, 231Skye, 232SkyeAFM, 231solve, 114spineplot, 181split, 216, 232, 233split.ccomp, 45straight, 111, 233stripchart, 140summary, 237, 238summary.acomp, 235, 237, 238summary.aplus, 236, 238summary.rcomp, 237, 237summary.rmult (summary.aplus), 236summary.rplus (summary.aplus), 236sumMissingProjector, 124sumMissingProjector (sumprojector), 238sumprojector, 238Supervisor, 240SZ (missing.compositions), 120SZvalue (missing.compositions), 120
t.test, 82ternaryAxis, 241text, 221totals, 243totals.ccomp, 45
INDEX 269
transformations from ’mixtures’ to’compositions’ classes, 244
tryDebugger, 245
uciptInv (ipt), 100ult, 246ultInv (ult), 246unbinary (binary), 37update, 197
var, 129, 248, 249var (var.acomp), 247var.acomp, 14, 69, 136, 168, 179, 209, 240,
247var.aplus, 25var.lm (varmlm), 253var.mlm (varmlm), 253var.rcomp, 191var.rmult, 170, 172, 175, 176var.rplus, 211variation, 249, 250variation.acomp, 14, 179, 235, 240variation.aplus, 25variation.rcomp, 191variation.rplus, 211variograms, 251varmlm, 253vcov, 253, 254vcovAcomp, 254vgmFit, 65, 67, 112, 114, 148, 252, 255vgmFit2lrv (vgmFit), 255vgmGetParameters (vgmFit), 255vgmGof (vgmFit), 255vgmSetParameters (vgmFit), 255vgram.cardsin (variograms), 251vgram.exp (variograms), 251vgram.gauss (variograms), 251vgram.lin (variograms), 251vgram.nugget (variograms), 251vgram.pow (variograms), 251vgram.sph (variograms), 251vgram2lrvgram, 65, 67, 112, 148, 162, 252,
256vgram2lrvgram (lrvgram), 113vp.boxplot (boxplot), 42vp.logboxplot (boxplot), 42vp.qqnorm (qqnorm), 182
whichBits (binary), 37
WhiteCells, 256
xsimplex, 186
Yatquat, 257
zeroreplace, 84, 122, 123, 179, 258