
3/26/02 11:24 AM File = /tt/file_convert/5e5d40fda812a874934c50ff/document.doc

John Miyamoto

Useful R Commands (Top); Useful R Information Sources (Bottom)

Contents (Ctrl-left click on a link to jump to the corresponding section)

Section Topic
1 Useful R Functions
2 Notes re plot and lines
3 Regular expressions in R
4 Special characters like "\n"
5 Useful Links and Documents (Bookmarks: 's2', 'Links', 'Docs')
6 Notes re the Built-In R Editor
7 Possible Editors for R Programming
8 Notes on R-Code/Computing (see also 'e:\r\notes' and 'e:\r\docs')

9. Useful R Functions

Function Purpose

"imputation" See packages 'cat', 'norm', and 'mix' for various imputation functions. See also the 'transcan' function in the 'Hmisc' package, and http://hesweb1.med.virginia.edu/biostat/rms.

%in%, match

If X and Y are vectors, X %in% Y is a vector. length(X %in% Y) = length(X). The i-th element of X %in% Y equals TRUE if X[i] is an element of Y and is FALSE otherwise. The match function serves similar purposes but is more easily modified.
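A minimal sketch of how %in% and match differ in what they return. The vectors X and Y are made up for illustration.

```r
X <- c("a", "b", "z")
Y <- c("b", "c", "a")

# %in% answers "is each element of X found in Y?" with a logical vector.
in.y <- X %in% Y

# match answers "where in Y is each element of X?" with positions (NA if absent).
pos.in.y <- match(X, Y)

# %in% gives the same answer as asking whether match found a position.
same <- identical(in.y, !is.na(pos.in.y))
```

Because match returns positions rather than TRUE/FALSE, it can also be used to look up values in a second vector that is aligned with Y.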

.libPaths .libPaths gets/sets the library trees within which packages are looked for. Usage: '.libPaths(new)', where 'new' is a character vector with the locations of R library trees.

.Platform, R.version, Sys.info()

.Platform is a list with some details of the platform under which R was built. This provides a means to write OS-portable R code. E.g., '.Platform$OS.type' gives the operating system family ("unix" or "windows") that the current version was built for. See also 'R.version' and 'Sys.info()'.

<<-, ->> The operators <<- and ->> cause a search to be made through the enclosing environments for an existing definition of the variable being assigned. If such a variable is found then its value is redefined, otherwise assignment takes place globally. Note that their semantics differ from those in the S language, but they are useful in conjunction with the scoping rules of R.
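A small sketch of <<- working together with R's scoping rules. The function name make.counter and the variable count are hypothetical, chosen only for this example.

```r
make.counter <- function() {
  count <- 0                 # lives in the environment created by this call
  function() {
    # <<- finds 'count' in the enclosing environment and redefines it there,
    # instead of creating a new local variable.
    count <<- count + 1
    count
  }
}

counter <- make.counter()
first.count <- counter()     # 1
second.count <- counter()    # 2
```

With an ordinary <- inside the inner function, each call would start again from the enclosing value of 0 and always return 1.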

all.equal all.equal(x,y) is a utility to compare R objects x and y, testing "near equality". If they are different, comparison is still made to some extent, and a report of the differences is returned. Don't use 'all.equal' directly in if expressions; either use 'identical' or combine the two, as shown in the documentation for 'identical'. See also 'identical' and '=='.
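A short illustration of why all.equal should not go straight into an if expression. Wrapping it in isTRUE is one common way to get a single TRUE/FALSE; the values are arbitrary.

```r
x <- 0.1 + 0.2
y <- 0.3

exact <- x == y                      # FALSE: floating-point rounding
same.object <- identical(x, y)       # FALSE for the same reason
near <- isTRUE(all.equal(x, y))      # TRUE: equal within a small tolerance

# When objects differ, all.equal returns a character report, not FALSE,
# which is why it is unsafe as a bare condition in 'if'.
report <- all.equal(1, 1.1)
report.is.text <- is.character(report)
```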

AnalyzeFMRI Package for analyzing fMRI images.

arrows Draw arrows between pairs of points. Also useful for creating error bars (set 'angle' to 90).

barplot Makes bar plots. Note that 'x.loc <- barplot(....)' stores the x-axis midpoints of each bar in a vector.

barplot2 Makes error bars in barplots (in 'gregmisc' package, 'gplots' functions).

basename 'basename' removes all of the path up to the last path separator (if any). See also 'dirname'.

bmp, jpeg, png

Output a graphics object to file in bmp, jpeg (jpg) and png formats.

body Get or set the body of a function. 'body(v.names)' returns the code that defines the 'v.names' function.
tmpfun <- function(a, b=2){}
body(tmpfun) <- quote({z <- a + b; z^2})
tmpfun(3)
[1] 25

bquote bquote quotes its argument except that terms wrapped in .() are evaluated in the environment specified by the 'where' argument.
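A minimal sketch of the .() substitution; the variable n and the expression are invented for the example.

```r
n <- 5

# Everything is quoted except .(n), which is replaced by the value of n.
expr <- bquote(x + .(n))
expr.text <- deparse(expr)            # "x + 5"

# The quoted expression can be evaluated later, once x is available.
value <- eval(expr, list(x = 1))      # 6
```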

bringToTop bringToTop brings the specified screen device's window to the front of the window stack (and gives it focus). With argument -1, it brings the console to the top. This function is useful for manipulating the screen appearance of R output (including graphics output).

browser Interrupt the execution of an expression and allow the inspection of the environment where browser was called from. Entering 'Q' exits from the browser.

by Function by is an object-oriented wrapper for tapply applied to data frames. A data frame is split by row into data frames subsetted by the values of one or more factors, and function FUN is applied to each subset in turn.
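A sketch of by on a toy data frame; the data frame scores and its columns are hypothetical.

```r
scores <- data.frame(group = c("a", "a", "b", "b"),
                     score = c(1, 3, 10, 20))

# Split 'scores' by group and apply the function to each sub-data-frame.
group.means <- by(scores, scores$group, function(d) mean(d$score))

# Strip the 'by' structure to get a plain numeric vector (in level order: a, b).
plain.means <- as.numeric(group.means)
```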

choose(n, k) Functions choose and lchoose return binomial coefficients and their logarithms.

choose.files Use a Windows file dialog to choose a list of zero or more files interactively.

citation Gives examples and explanation for how to cite R and R packages.

colors, rainbow, heat.colors, terrain.colors, topo.colors, cm.colors

'colors' returns the built-in color names which R knows about. The other functions create a vector of n “contiguous” colors. Conceptually, all of these functions actually use (parts of) a line cut out of the 3-dimensional color space, parametrized by hsv(h,s,v, gamma), where gamma=1 for the foo.colors function, and hence, equispaced hues in RGB space tend to cluster at the red, green and blue primaries. Some applications such as contouring require a palette of colors which do not “wrap around” to give a final color close to the starting one.

colSums, rowSums, colMeans, rowMeans

Compute row and column sums and means for numeric arrays. These functions are equivalent to use of apply with FUN = mean or FUN = sum with appropriate margins, but are a lot faster. As they are written for speed, they blur over some of the subtleties of NaN and NA. If na.rm = FALSE and either NaN or NA appears in a sum, the result will be one of NaN or NA, but which might be platform-dependent.
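A quick check of the equivalence with apply, on a made-up matrix containing one NA.

```r
m <- matrix(c(1, 2, 3, 4, NA, 6), nrow = 2)   # columns: (1,2), (3,4), (NA,6)

col.sums.fast  <- colSums(m, na.rm = TRUE)
col.sums.apply <- apply(m, 2, sum, na.rm = TRUE)

row.means <- rowMeans(m, na.rm = TRUE)        # NA is dropped from row 1
```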

convert.mod.w2o, Win2OpenBUGS

These functions convert WinBUGS model files to OpenBUGS model files. convert.mod.w2o allows the user to write a comment at the head of the resulting OpenBUGS model file; typically, this comment indicates the WinBUGS model that was the source of the OpenBUGS model file, although the user can write any comment that he or she chooses.

count.fields count.fields counts the number of fields, as separated by 'sep', in each of the lines of the file read. This used to be used by 'read.table'. Useful for debugging.

csimint, csimtest, simint, simtest

Multiple comparison procedures for anova (oneway anova only?) in the package multcomp.

curve Draws a curve corresponding to the given function or expression (in x) over the interval [from,to].

density The (S3) generic function density computes kernel density estimates. Its default method does so with the given kernel and bandwidth for univariate observations. Computes an estimated density function from a univariate sample of data.

diag Extract or replace the diagonal of a matrix, or construct a diagonal matrix.

dir Retrieve file information on a Windows machine. (Like "dir" in DOS.)

dirname 'dirname' returns the part of the path up to (but excluding) the last path separator, or "." if there is no path separator.

do.call Executes a function call from the name of the function and a list of arguments to be passed to it. E.g., do.call("rbind", list.of.frames) applies 'rbind' to the dataframes in the list.
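The rbind case can be sketched as follows; the two small data frames are invented for the example.

```r
list.of.frames <- list(
  data.frame(id = 1:2, y = c(0.5, 1.5)),
  data.frame(id = 3:4, y = c(2.5, 3.5))
)

# Equivalent to rbind(list.of.frames[[1]], list.of.frames[[2]]),
# but works for a list of any length.
combined <- do.call("rbind", list.of.frames)
```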

dput Writes an ASCII text representation of an R object to a file or connection, or uses one to recreate the object. dput opens file and deparses the object x into that file. The object name is not written (contrary to dump). If x is a function the associated environment is stripped. Hence scoping information can be lost. See also 'dump'.

dump This function takes a vector of names of R objects and produces text representations of the objects on a file or connection. A dump file can usually be sourced into another R (or S) session. See also 'dput'.

duplicated Determines which elements of a vector or data frame are duplicates of elements with smaller subscripts, and returns a logical vector indicating which elements (rows) are duplicates.

eigen Computes eigenvalues and eigenvectors.

escape: \n, \t, \b, \r, \\, \"

Escape characters in character strings: '\n' is newline; '\t' is tab; '\b' is backspace; '\r' is carriage return; '\\' is a single backslash; '\"' is a double quote. See ?Quotes for more escape characters.
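A small sketch showing that each escape sequence counts as a single character; the string is arbitrary.

```r
# a, tab, b, backslash, c, double quote, d, newline
s <- "a\tb\\c\"d\n"

n.chars <- nchar(s)   # each escape is one character

# print() shows the escapes; cat() writes the characters they stand for.
print(s)
cat(s)
```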

eval, evalq, eval.parent, with

Evaluate an R expression in a specified environment.

eval(parse(text=string)) If 'string' is a character vector that represents a valid R command, e.g., 'string <- "tm1 <- 3*4" ', then 'eval(parse(text=string))' has the same effect as executing the command that is represented syntactically in 'string'. 'source()' builds on 'parse()' and 'eval()', i.e.,
parse() : character --> expression
eval() : expression --> [evaluated result]
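The two steps can be run separately to see what each one produces, using the string from the entry above.

```r
string <- "tm1 <- 3*4"

expr <- parse(text = string)   # character --> expression (nothing is run yet)
eval(expr)                     # expression --> evaluated; this creates tm1
```

After the eval call, tm1 exists and holds 12, exactly as if the command had been typed at the prompt.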

expand.grid Create a data frame from all combinations of the supplied vectors or factors. E.g., 'expand.grid(1:3, 1:4)' creates a dataframe with 12 rows, containing all combinations of {1, 2, 3} with {1, 2, 3, 4}.

file.choose Choose a file interactively. User specifies file to be opened or created by means of a dialog box.

file.create, file.exists, file.remove, file.rename, file.append, file.copy, dir.create

Functions for creating files, testing for existence of files, removing files, renaming files, appending to files, copying files, and creating directories.

file.exists Checks whether a file exists in a specified directory.

file.info Utility function to extract information about files on the user's file systems.

file.path Construct the path to a file from components in a platform-independent way. Useful for writing platform-independent code that includes references to specific files.
Usage: file.path(..., fsep = .Platform$file.sep)
Arguments: ... = character vectors; fsep = the path separator to use.
Value: A character vector of the arguments concatenated term-by-term and separated by fsep if all arguments have positive length; otherwise, an empty character vector.
E.g., file.path("c:/mydata", "project.rda")
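A sketch combining file.path with basename and dirname; the directory and file names are hypothetical.

```r
p <- file.path("data", "raw", "project.rda")   # components joined with fsep

file.part <- basename(p)   # everything after the last separator
dir.part  <- dirname(p)    # everything before the last separator

# Term-by-term behaviour: one path per element of the longest argument.
several <- file.path("data", c("a.csv", "b.csv"))
```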

filter 'ts' package: Computes running means.

fitdistr MASS package: Maximum-likelihood fitting of univariate distributions, allowing parameters to be held fixed if desired.

fractions Computes the integer ratio that produces the decimal representation of the ratio, e.g., fractions(.3333) returns "1/3".

get Search for an R object with a given name and return it. Useful for retrieving an R object from a different environment on the search path or for retrieving an R object with a name that is constructed by a function or program.

getwd() Returns the absolute filename representing the current working directory of the R process.

gl Generate factors by specifying the pattern of their levels. E.g.,
> gl(3,4)
 [1] 1 1 1 1 2 2 2 2 3 3 3 3
Levels: 1 2 3

glm( ..., family = binomial)

Logistic regression. Note that the summary function reports coefficients on the log odds scale. Exponentiate them (exp) to transform back to odds ratios.
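A sketch of the back-transformation using the built-in mtcars data; the choice of model (transmission type predicted from weight) is only for illustration.

```r
# am is 0/1 (automatic/manual); wt is weight in 1000 lbs.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

log.odds <- coef(fit)            # what summary(fit) reports
odds.ratios <- exp(log.odds)     # multiplicative change in odds per unit of wt
```

An odds ratio below 1 for wt means the odds of a manual transmission go down as weight goes up.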

head Returns the first part of a vector, matrix, data frame or function, e.g., the first few rows of a dataframe ('tail' returns the last part). Useful for viewing or extracting a small subset of cases.

help.search Searches the R help system for documentation matching a given character string in the (file) name, alias, title, concept or keyword entries (or any combination thereof), using either fuzzy matching or regular expression matching. Names and titles of the matched help entries are displayed nicely. See also 'RSiteSearch'.

identical The safe and reliable way to test two objects for being exactly equal. It returns TRUE in this case, FALSE in every other case. See also 'all.equal' and '=='.

ifelse(test, yes, no) 'ifelse' returns a value with the same shape as test which is filled with elements selected from either 'yes' or 'no' depending on whether the element of 'test' is TRUE or FALSE.
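A minimal sketch of the element-wise selection, including what happens to a missing value; the vector is made up.

```r
test <- c(5, -3, NA, 0)

# Each element of the result comes from 'yes' or 'no' according to the
# corresponding element of the condition; NA in the condition stays NA.
sign.label <- ifelse(test > 0, "positive", "not positive")
```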

image Creates a grid of colored or gray-scale rectangles with colors corresponding to the values in z. This can be used to display three-dimensional or spatial data aka “images”. This is a generic function. The functions heat.colors, terrain.colors and topo.colors create heat-spectrum (red to white) and topographical color schemes suitable for displaying ordered data, with n giving the number of colors desired.

interaction.plot Makes a line plot for a 2-way anova design with one factor on the X axis and the other factor with separate lines.

isoMDS (in MASS), sammon (in MASS), cmdscale (in mva), xgvis (in xgobi)

Alternative functions that perform nonmetric multidimensional scaling (mds).

layout Used for making multiple graphs on a single screen (like mfrow or mfcol). Unlike mfrow or mfcol, row and column heights can be specified for individual rows or columns.

library('package name') Loads 'package name'.
library(help = MASS) Lists all objects in MASS along with short description.
ls("package:MASS") Lists all objects in MASS if it is attached.

lines Add lines and points to a plot. See Section 10 for more info.

list.files This function produces a character vector containing the names of files in the named directory. dir is an alias. key = filenames

logtrans MASS package: Find and optionally plot the marginal likelihood for alpha for a transformation model of the form log(y + alpha) ~ x1 + x2 + ....

make.names Make syntactically valid names out of character vectors.

make.unique Makes the elements of a character vector unique by appending sequence numbers to duplicates. Useful for creating unique names of variables or factor levels.
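A short sketch of both functions on made-up labels.

```r
raw.names <- c("my var", "2nd", "ok")
valid.names <- make.names(raw.names)     # spaces become ".", a leading digit gets "X"

dup.names <- c("a", "a", "b", "a")
unique.names <- make.unique(dup.names)   # later duplicates get ".1", ".2", ...
```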

match.call match.call returns a call in which all of the arguments are specified by their names. The most common use is to get the call of the current function, with all arguments named, i.e., use 'match.call' if you want to return the current values of the parameters that were input to a function.

merge Analogous to the SPSS command, JOIN MATCH / TABLE. Combines input dataframes on common values for columns with identical names or user-specified columns.
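A sketch of merge on two toy data frames that share the column 'id'; the data are invented.

```r
left  <- data.frame(id = c(1, 2, 3), x = c("a", "b", "c"))
right <- data.frame(id = c(2, 3, 4), y = c(TRUE, FALSE, TRUE))

# Default: keep only ids present in both inputs.
inner <- merge(left, right)

# all = TRUE: keep every id; columns missing from one input are filled with NA.
outer <- merge(left, right, all = TRUE)
```

Use the 'by', 'by.x' and 'by.y' arguments when the key columns do not have identical names.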

mvrnorm MASS package: Generates multivariate normal random variables.

mvtnorm 'mvtnorm' package generates probabilities under a multivariate normal density function. Can be used to generate data with a given variance/covariance matrix (use 'rmvnorm'). Also, look at 'mvrnorm' in the MASS package.

n.gt.1 JM function that counts the number of cases in each condition (combination of factor levels) in a multifactor between subjects anova. The output is TRUE if the condition has more than 1 observation and FALSE if it has 0 or 1 observation. This function is useful in combination with 'studres' in the MASS library because this function sometimes yields a value of 0 where it should be NA when a cell has only 1 observation.

na.omit Omits cases from a dataframe if any variables have missing data.

onecode JM function that creates a oneway anova factor from multiple input factors. 'onecode' is useful whenever it is easier to treat a multifactor between subjects anova as a oneway anova.

oneway.test Computes a oneway anova (between subjects). Default is to NOT assume homogeneity of variance.

optimize The function optimize searches the interval from lower to upper for a minimum or maximum of the function f with respect to its first argument.
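A minimal sketch of a one-dimensional search; the function f and its extra argument a are hypothetical.

```r
# f is minimized with respect to its first argument, x; 'a' is passed through.
f <- function(x, a) (x - a)^2

fit <- optimize(f, interval = c(0, 5), a = 2)

x.min <- fit$minimum     # location of the minimum (close to 2)
f.min <- fit$objective   # value of f at that location (close to 0)
```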

p.adjust Multiple comparison procedure: Given a set of p values, returns p values adjusted using Holm method (default), Hochberg method, or Bonferroni method (in the base package).

'package functions': update.packages, available.packages, old.packages, new.packages, download.packages, install.packages, contrib.url

These functions can be used to automatically compare the version numbers of installed packages with the newest available version on the repositories and update outdated packages on the fly.

package.skeleton 'package.skeleton' automates some of the setup for a new source package. It creates directories, saves functions and data to appropriate places, and creates skeleton help files and ‘README’ files describing further steps in packaging.

pairs() Produces all pairs of scatter plots.

pairwise.prop.test, pairwise.t.test, pairwise.table, pairwise.wilcox.test

Multiple comparison procedures for various tests (in the ctest package).

par("mai") A numerical vector of the form c(bottom, left, top, right) which gives the margin size specified in inches. (Use 'mai' or 'mar' but not both.)

par("mar") A numerical vector of the form 'c(bottom, left, top, right)' which gives the number of lines of margin to be specified on the four sides of the plot. The default is 'c(5, 4, 4, 2) + 0.1'.

par("mex") 'mex' is a character size expansion factor which is used to describe coordinates in the margins of plots. Note that this does not change the font size, rather specifies the size of font used to convert between 'mar' and 'mai', and between 'oma' and 'omi'. (Note that JM has been misusing this parameter until 8/21/2005.)

par("usr") Gives the min and max on the x and y axis for the current plot.

par("xpd") Logical. If FALSE, all plotting is clipped to the plot region, if TRUE, all plotting is clipped to the figure region, and if NA, all plotting is clipped to the device region. Can be used to put legend (or any other text) outside of the plotting region.

permn In 'combinat' package. Generates all permutations of the elements of x, in a minimal-change order. If x is a positive integer, returns all permutations of the elements of seq(x). If argument "fun" is not null, applies a function given by the argument to each point. "..." are passed unchanged to the function given by argument fun, if any.

plot Create plots. See Section 11 for more info.

power.prop.test Computes power of test for equality of proportions, or determines parameters to obtain target power.

power.t.test Compute power of test, or determine parameters to obtain target power.

proc.time proc.time determines how much time (in seconds) the currently running R process has already consumed. Value: A numeric vector of length 5, containing the user, system, and total elapsed times for the currently running R process, and the cumulative sum of user and system times of any child processes spawned by it. See also 'system.time' and 'gc'.

ptukey, qtukey

Functions for the studentized range distribution.

qr qr computes the QR decomposition of a matrix. The QR decomposition plays an important role in many statistical techniques. In particular it can be used to solve the equation Ax = b for given matrix A, and vector b. It is useful for computing regression coefficients and in applying the Newton-Raphson algorithm. The functions qr.coef, qr.resid, and qr.fitted return the coefficients, residuals and fitted values obtained when fitting y to the matrix with QR decomposition qr. qr.qy and qr.qty return Q %*% y and t(Q) %*% y, where Q is the Q matrix.

range range usually returns the min and max of a vector, but see the following example for an illustration of how range deals with missing and infinite elements:
x <- c(NA, 1:3, -1:1/0); x
range(x)
range(x, na.rm = TRUE)
range(x, finite = TRUE)

rank.seq JM function for creating sequential ranks (all tied values get the same rank and ranks are sequential from 1 to length(unique(input))).

read.fwf Read a “table” of fixed width formatted data into a data.frame.

read.table, read.csv, read.csv2, read.delim, read.delim2

Reads a file in table format and creates a data frame from it, with cases corresponding to lines and variables to fields in the file. This list includes related functions for reading data from particular types of files (comma separated, tab separated, etc.).

readLines(con = stdin(), n = -1, ok = TRUE)

Read text lines from a connection. If the con is a character string, the function calls 'file' to obtain a file connection which is opened for the duration of the function call. ('readLines("clipboard")' reads lines from the clipboard.) If the connection is open it is read from its current position. If it is not open, it is opened for the duration of the call and then closed again. .... Whatever mode the connection is opened in, any of LF, CRLF or CR will be accepted as the EOL marker for a line.

'relevel', 'relevel(x, ref, ...)'

The levels of a factor are re-ordered so that the level specified by ref is first and the others are moved down. This is useful for contr.treatment contrasts which take the first level as the reference. Also useful for changing the number codes for factor levels to something other than the default codes, which increase in alphabetical order of the levels.
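A small sketch of relevel; the factor cond and its levels are invented.

```r
cond <- factor(c("drug", "placebo", "drug", "control"))
default.levels <- levels(cond)          # alphabetical: control, drug, placebo

# Make "placebo" the first (reference) level; the others keep their order.
cond2 <- relevel(cond, ref = "placebo")
new.levels <- levels(cond2)

# The underlying integer codes change with the level order; the labels do not.
codes.before <- as.integer(cond)
codes.after  <- as.integer(cond2)
```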

reshape This function reshapes a data frame between ‘wide’ (multivariate) format with repeated measurements in separate columns of the same record and ‘long’ (univariate) format with the repeated measurements in separate records. Useful for creating a univariate organization of repeated measures files.
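A sketch of a wide-to-long conversion. The data frame and column names are hypothetical; the example relies on reshape's default of splitting names like "score.1" at the "." to find the measure name and the time index.

```r
wide <- data.frame(id = 1:2, score.1 = c(5, 6), score.2 = c(7, 8))

# One row per id in 'wide' becomes one row per id-by-time combination in 'long'.
long <- reshape(wide,
                direction = "long",
                varying = c("score.1", "score.2"),
                idvar = "id")
```

The result has one 'score' column plus a 'time' column recording which of the original columns each value came from.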

return Used to return values or lists of values from within the body of a function. (I think this is the same as asserting the object names of the value as the very last line of the function.)

RSiteSearch Search for key words or phrases in the R-help mailing list archives, or R manuals and help pages, using the search engine at http://search.r-project.org and view them in a web browser. See also 'help.search'.

scale Centers variables and normalizes them, i.e., converts them to z-scores. E.g., 'scale(x)' returns the z-scores that correspond to the vector 'x'.

seek Function that returns the current position for reading from or writing to a file.

segments Draw line segments between pairs of points. See also arrows.

set.seed, .Random.seed

set.seed uses its single integer argument to set as many seeds as are required. It is intended as a simple way to get quite different seeds by specifying small integer arguments, and also as a way to get valid seed sets for the more complicated methods (especially "Mersenne-Twister" and "Knuth-TAOCP"). .Random.seed saves the seed set for the uniform random-number generator, at least for the system generators. It does not necessarily save the state of other generators, and in particular does not save the state of the Box–Muller normal generator. If you want to reproduce work later, call set.seed rather than set .Random.seed.
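A minimal sketch of reproducing random draws with set.seed; the seed value 42 is arbitrary.

```r
set.seed(42)
first.draw <- runif(3)

set.seed(42)               # same seed, so the generator restarts from the same state
second.draw <- runif(3)

third.draw <- runif(3)     # no reseeding: the stream simply continues
```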

setwd Sets the working directory (the directory under which files will be saved by 'save').

source 'source' causes R to accept its input from the named file (the name must be quoted). Input is read from that file until the end of the file is reached. See also 'sys.source'

split.screen, screen, erase.screen, close.screen

split.screen defines a number of regions within the current device which can, to some extent, be treated as separate graphics devices. It is useful for generating multiple plots on a single device. Screens can themselves be split, allowing for quite complex arrangements of plots. screen is used to select which screen to draw in. erase.screen is used to clear a single screen, which it does by filling with the background colour. close.screen removes the specified screen definition(s).

stopifnot(...) If any of the expressions in ... are not TRUE, stop is called, producing an error message indicating the first element of ... which was not true. stopifnot(A, B) is conceptually equivalent to { if(!all(A)) stop(...) ; if(!all(B)) stop(...) }.

str, ls.str, lsf.str

Compactly display the internal structure of an R object, a “diagnostic” function and an alternative to summary (and to some extent, dput). Ideally, only one line for each “basic” structure is displayed. It is especially well suited to compactly display the (abbreviated) contents of (possibly nested) lists. The idea is to give reasonable output for any R object. It calls args for (non-primitive) function objects. ls.str and lsf.str are useful “versions” of ls, calling str on each object. They are not foolproof and should rather not be used for programming, but are provided for their usefulness.

strftime, strptime

Converts objects to and from the classes "POSIXlt" and "POSIXct" (not strftime) and character vectors that represent times. Not completely sure about this, but I believe that strptime takes a character representation of a date & time, and converts it to a POSIXlt representation that R can use, e.g., to compute time differences. I believe that strftime is the inverse, i.e., it takes a POSIXlt representation of a date & time and converts it to a character representation. The user can specify the type of character representation at input or output. E.g., strftime(Sys.time()) gives the current date and time as a character vector. E.g., strptime(c("06/02/11 12:14 PM", "06/02/09 12:14 AM"), "%m/%d/%y %I:%M %p")
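A sketch of the round trip described above. It uses a 24-hour format and an explicit "UTC" time zone (both choices made here, not in the notes) so that the result does not depend on the local time zone or on locale-specific AM/PM strings.

```r
fmt <- "%Y-%m-%d %H:%M"

# character --> POSIXlt
stamp   <- strptime("2011-06-02 12:14", fmt, tz = "UTC")
earlier <- strptime("2009-06-02 00:14", fmt, tz = "UTC")

# Date-times can now be subtracted.
gap.days <- as.numeric(difftime(stamp, earlier, units = "days"))

# POSIXlt --> character, in a different layout
stamp.text <- strftime(stamp, "%Y/%m/%d %H:%M", tz = "UTC")
```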

strsplit Split the elements of a character vector x into substrings according to the presence of substring split within them. NB: The output of strsplit is a list.
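Because the output is a list, it usually needs sapply (or similar) to get back to a vector. A sketch with made-up strings:

```r
parts <- strsplit(c("a-b", "c-d-e"), split = "-", fixed = TRUE)

# One list element per input string, each a character vector of pieces.
n.parts <- sapply(parts, length)
first.parts <- sapply(parts, function(p) p[1])
```

fixed = TRUE treats 'split' as a literal string rather than a regular expression.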

strtrim Trim character strings to specified display widths.

strwrap Each character string in the input is first split into paragraphs (on lines containing whitespace only). The paragraphs are then formatted by breaking lines at word boundaries. The target columns for wrapping lines and the indentation of the first and all subsequent lines of a paragraph can be controlled independently. (Useful for formatting text to screen.)

subset Return subsets of vectors or data frames which meet conditions. It achieves the same purpose as entering boolean conditions in dimensions of a matrix or dataframe, but its syntax can be simpler or more transparent. Important: 'subset' drops cases that are missing on the index variable, whereas brackets include these cases as rows of NA's. See:

(tmm <- matrix(cbind(1:6, c(2,2, NA, 3, NA, 4), 5:10), ncol=3))
s.value <- 2
tmm[tmm[,2] == s.value,]
subset(tmm, tmm[,2] == s.value)

substr, substring

Extract or replace substrings in a character vector.

summary, summary.lm, coefs

Functions for extracting summary statistical information from 'lm' output.

suppressWarnings(expr) suppressWarnings evaluates its expression in a context that ignores all warnings.

svd Compute the singular-value decomposition of a rectangular matrix.

symbols This function draws symbols on a plot. One of six symbols (circles, squares, rectangles, stars, thermometers, and boxplots) can be plotted at a specified set of x and y coordinates. Specific aspects of the symbols, such as relative size, can be customized by additional parameters.

sys.call(which = 0), sys.frame(which = 0), sys.nframe(), sys.function(n = 0), sys.parent(n = 1), sys.calls(), sys.frames(), sys.parents(), sys.on.exit(), sys.status()

These functions provide access to environments (“frames” in S terminology) associated with functions further up the calling stack. .GlobalEnv is given number 0 in the list of frames. Each subsequent function evaluation increases the frame stack by 1 and the environment for evaluation of that function is returned by sys.frame with the appropriate index.

Sys.getenv(x) Returns the values of the environment variables named by x. E.g. Sys.getenv("R_HOME") returns the value of the R_HOME variable. Sys.getenv() returns the values of all environment variables.

Sys.info() Information about the R program and OS.

Sys.sleep Suspend execution of R expressions for a given number of seconds.

sys.source Parses expressions in the given file, and then successively evaluates them in the specified environment. E.g., create an environment on the search path and populate it:
sys.source("myfuns.R", envir=attach(NULL, name="myfuns"))

Sys.time() Sys.time returns the system's idea of the current time and Sys.timezone returns the current time zone.

system.time(expr) Return CPU (and other) times that expr used. See also 'proc.time'. Use with 'gc'.

tolower, toupper Convert characters from upper to lower case (tolower) or from lower to upper case (toupper).

traceback Call traceback() immediately after running a function with an error. traceback() will show the sequence of function calls that led up to the error (not obvious if it was another function internal to the function that was called at the command line).

try 'try' is a wrapper to run an expression that might fail and allow the user's code to handle error-recovery. 'try' evaluates an expression and traps any errors that occur during the evaluation. 'try' establishes a handler for errors that uses the default error handling protocol. It also establishes a 'tryRestart' restart that can be used by 'invokeRestart'.
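A minimal sketch of error recovery with try. The function risky is hypothetical; the point is only that the script keeps running after the error and can inspect what went wrong:

```r
# Hypothetical function that fails for negative input
risky <- function(x) {
  if (x < 0) stop("negative input")
  sqrt(x)
}

# silent = TRUE suppresses printing of the error message
res <- try(risky(-1), silent = TRUE)
if (inherits(res, "try-error")) {
  message("caught: ", conditionMessage(attr(res, "condition")))
  res <- NA_real_   # fall back to a missing value
}

# When no error occurs, try simply returns the value of the expression
ok <- try(risky(4), silent = TRUE)
print(res)
print(ok)
```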

TukeyHSD Computes Tukey HSD confidence intervals for a oneway design.

union(x, y), intersect(x, y), setdiff(x, y), setequal(x, y), is.element(el, set) Set operations on vectors that are interpreted as sets.
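A short sketch of the set functions on two made-up numeric vectors. Duplicates are dropped because the vectors are treated as sets:

```r
x <- c(1, 2, 3, 3, 5)
y <- c(3, 4, 5)

u <- union(x, y)       # elements in x or y
i <- intersect(x, y)   # elements in both
d <- setdiff(x, y)     # elements in x but not in y
same <- setequal(c(1, 2, 2), c(2, 1))   # order and duplicates are ignored
has3 <- is.element(3, y)

print(u); print(i); print(d); print(same); print(has3)
```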

uniroot The function uniroot searches the interval from lower to upper for a root (i.e., zero) of the function f with respect to its first argument.
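A minimal sketch of uniroot: the function f and the interval are chosen for illustration so that f changes sign between lower and upper. Extra arguments (here a) are passed through to f:

```r
# f has a root at sqrt(a); the search is over its first argument x
f <- function(x, a) x^2 - a

sol <- uniroot(f, lower = 0, upper = 3, a = 2)
print(sol$root)     # approximate root
print(sol$f.root)   # value of f at the reported root
```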

update update will update and (by default) re-fit a model. It does this by extracting the call stored in the object, updating the call and (by default) evaluating that call. Sometimes it is useful to call update with only one argument, for example if the data frame has been corrected.
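A sketch of both uses of update, based on the built-in mtcars data. The variable names and the "corrected" value are assumptions made for the example:

```r
d <- mtcars
fit1 <- lm(mpg ~ wt, data = d)

# Change the formula: keep the response and existing terms, add hp
fit2 <- update(fit1, . ~ . + hp)

# Pretend a data-entry error in d was corrected, then re-fit with one argument;
# update re-evaluates the stored call, so it picks up the modified d
d$mpg[1] <- 22
fit3 <- update(fit1)

print(coef(fit1))
print(coef(fit2))
print(coef(fit3))
```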

vcov(object, ...) Variance/covariance matrix of a fitted object: Returns a matrix of the estimated covariances between the parameter estimates in the linear or non-linear predictor of the model.
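For illustration, a sketch showing that the square roots of the diagonal of vcov reproduce the standard errors reported by summary (the model on mtcars is an arbitrary choice):

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

V <- vcov(fit)          # 3 x 3 covariance matrix of the coefficient estimates
se <- sqrt(diag(V))     # standard errors are the square roots of the diagonal

print(V)
print(se)
```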

which Give the TRUE indices of a logical object, allowing for array indices. E.g.,
> which(3 == c(1, 3, 1, 4, 3, 9))
[1] 2 5
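The "allowing for array indices" part can be sketched with the arr.ind argument. The matrix below reuses the same six values, filled column by column:

```r
m <- matrix(c(1, 3, 1, 4, 3, 9), nrow = 2)

lin <- which(m == 3)                   # linear (column-major) indices
pos <- which(m == 3, arr.ind = TRUE)   # one row per match: row and col

print(lin)
print(pos)
```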

wilcox.test Performs one and two sample Wilcoxon tests on vectors of data; the latter is also known as ‘Mann-Whitney’ test.

wilcox: dwilcox, pwilcox, qwilcox, rwilcox

Distribution functions for the Wilcoxon rank sum statistic. Note that the description in the R help file, "Distribution of the Wilcoxon Rank Sum Statistic", suggests that qwilcox also gives quantiles for the rank sum (which the Wilcoxon rank sum test is based on). In fact, however, it gives quantiles for the u-statistic (which the Mann-Whitney test is based upon).
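A sketch of the distinction drawn above, using made-up samples without ties. It assumes the usual relation between the two statistics: the rank sum of the first sample equals U plus m(m+1)/2, where m is the size of the first sample:

```r
x <- c(1.1, 2.3, 0.7, 3.0)
y <- c(2.9, 3.5, 4.1, 5.2, 3.8, 4.4)
m <- length(x)
n <- length(y)

# Rank sum of the first sample in the pooled data
rank_sum <- sum(rank(c(x, y))[seq_along(x)])

# The statistic reported by wilcox.test is on the U scale
W <- unname(wilcox.test(x, y)$statistic)

# qwilcox is also on the U scale; shift it to get a rank-sum quantile
u_q <- qwilcox(0.05, m, n)
rank_sum_q <- u_q + m * (m + 1) / 2

print(c(rank_sum = rank_sum, W = W, u_q = u_q, rank_sum_q = rank_sum_q))
```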

with(data, expr, ...) Evaluate an R expression in an environment constructed from data. Example: 'with(mydata, plot(x, y, xlab=label(x), ylab=label(y)))'
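A minimal sketch of with on the built-in mtcars data, which avoids repeating the data frame name inside the expression (the choice of variables is arbitrary):

```r
# Column names are looked up inside mtcars
mean_ratio <- with(mtcars, mean(mpg / wt))

# Equivalent without with()
mean_ratio2 <- mean(mtcars$mpg / mtcars$wt)

print(mean_ratio)
```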

write.foreign 'foreign' package: This function exports data frames to other statistical packages by writing the data as free-format text and writing a separate file of instructions for the other package to read the data. Can be used to transfer data to SPSS.

write.table Write dataframe to file with designated delimiter. E.g., write.table(mydata, file='fname', sep='\t', row.names=FALSE), where mydata is the data frame and 'fname' is the name of the output file.

write.table write.table prints its required argument x (after converting it to a data frame if it is not one already) to file. The entries in each line (row) are separated by the value of sep. Set 'sep = ","' for comma-delimited output, and 'sep="\t"' for tab delimited output. Example:
'write.table(x, file="clipboard", sep="\t", row.names=FALSE, col.names=FALSE)'
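A sketch of a tab-delimited round trip through a temporary file. The data frame is made up, and a temporary file is used here instead of "clipboard" because clipboard support depends on the platform:

```r
df <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

f <- tempfile(fileext = ".txt")
write.table(df, file = f, sep = "\t", row.names = FALSE, quote = FALSE)

# read.delim expects a header line and tab-separated fields
back <- read.delim(f)
unlink(f)   # remove the temporary file

print(back)
```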

End of Table of Useful R Functions

----------------------------------------------------------------------

12. Notes re plot and lines

The type argument controls how the points or lines are drawn [1]. The main options are:
* type="p" plots points
* type="l" plots a line (the data must be in the correct order!)
* type="n" plots nothing, just creates the axes for later use
* type="b" plots both lines and points, lines miss the points (interesting!)
* type="o" plots overlaid lines and points
* type="h" plots histogram-like vertical lines (interesting!)
* type="s" plots step-like lines

[1] From R. Ripley's R course, http://www.stats.ox.ac.uk/~ruth/, see her "R Programming 2010-11".
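The options listed above can be compared side by side with a sketch like the following. The data are made up, and pdf(NULL) is used only so that the code runs without a display; drop that line (and dev.off()) to draw on screen:

```r
x <- 1:10
y <- c(2, 5, 3, 8, 6, 9, 4, 7, 10, 1)
types <- c("p", "l", "b", "o", "h", "s")

pdf(NULL)                    # null graphics device: nothing is written to disk
op <- par(mfrow = c(2, 3))   # 2 x 3 grid of panels
for (tp in types) {
  plot(x, y, type = tp, main = paste0('type = "', tp, '"'))
}

# type = "n" sets up the axes only; lines() and points() then add to them
plot(x, y, type = "n", main = 'type = "n" plus lines and points')
lines(x, y)
points(x, y, pch = 16)

par(op)
dev.off()
```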

----------------------------------------------------------------------

13. Regular expressions in R

Helpful source re regular expressions: http://docs.python.org/library/re.html
Helpful source re regular expressions: http://manual.calibre-ebook.com/regexp.html

Description

This help page documents the regular expression patterns supported by grep and related functions regexpr, gregexpr, sub and gsub, as well as by strsplit.

Details

A ‘regular expression’ is a pattern that describes a set of strings. Three types of regular expressions are used in R, extended regular expressions, used by grep(extended = TRUE) (its default), basic regular expressions, as used by grep(extended = FALSE), and Perl-like regular expressions used by grep(perl = TRUE). Other functions which use regular expressions (often via the use of grep) include apropos, browseEnv, help.search, list.files, ls and strsplit. These will all use extended regular expressions, unless strsplit is called with argument extended = FALSE or perl = TRUE. Patterns are described here as they would be printed by cat: do remember that backslashes need to be doubled in entering R character strings from the keyboard.

Extended Regular Expressions

This section covers the regular expressions allowed if extended = TRUE in grep, regexpr, gregexpr, sub, gsub and strsplit. They use the glibc 2.3.5 implementation of the POSIX 1003.2 standard.

Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.

The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any metacharacter with special meaning may be quoted by preceding it with a backslash. The metacharacters are . \ | ( ) [ { ^ $ * + ?.

A character class is a list of characters enclosed by [ and ] which matches any single character in that list; if the first character of the list is the caret ^, then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit, and [^abc] matches anything except the characters a, b or c. A range of characters may be specified by giving the first and last characters, separated by a hyphen. (Character ranges are interpreted in the collation order of the current locale.)

Certain named classes of characters are predefined. Their interpretation depends on the locale (see locales); the interpretation below is that of the POSIX locale.

[:alnum:] Alphanumeric characters: [:alpha:] and [:digit:].

[:alpha:] Alphabetic characters: [:lower:] and [:upper:].

[:blank:] Blank characters: space and tab.

[:cntrl:] Control characters. In ASCII, these characters have octal codes 000 through 037, and 177 (DEL). In another character set, these are the equivalent characters, if any.

[:digit:] Digits: 0 1 2 3 4 5 6 7 8 9.

[:graph:] Graphical characters: [:alnum:] and [:punct:].

[:lower:] Lower-case letters in the current locale.

[:print:] Printable characters: [:alnum:], [:punct:] and space.

[:punct:] Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~.

[:space:] Space characters: tab, newline, vertical tab, form feed, carriage return, and space.

[:upper:] Upper-case letters in the current locale.

[:xdigit:] Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f.

For example, [[:alnum:]] means [0-9A-Za-z], except the latter depends upon the locale and the character encoding, whereas the former is independent of locale and character set. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket list.) Most metacharacters lose their special meaning inside lists. To include a literal ], place it first in the list. Similarly, to include a literal ^, place it anywhere but first. Finally, to include a literal -, place it first or last. (Only these and \ remain special inside character classes.)

The period . matches any single character. The symbol \w is documented to be a synonym for [[:alnum:]] and \W is its negation. However, \w also matches underscore in the GNU grep code used in R.

The caret ^ and the dollar sign $ are metacharacters that respectively match the empty string at the beginning and end of a line. The symbols \< and \> respectively match the empty string at the beginning and end of a word. The symbol \b matches the empty string at the edge of a word, and \B matches the empty string provided it is not at the edge of a word.

A regular expression may be followed by one of several repetition quantifiers:

? The preceding item is optional and will be matched at most once.

* The preceding item will be matched zero or more times.

+ The preceding item will be matched one or more times.

{n} The preceding item is matched exactly n times.

{n,} The preceding item is matched n or more times.

{n,m} The preceding item is matched at least n times, but not more than m times.

Repetition is greedy, so the maximal possible number of repeats is used. Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated subexpressions. Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either subexpression. For example, abba|cde matches either the string abba or the string cde. Note that alternation does not work inside character classes, where | has its literal meaning. Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole subexpression may be enclosed in parentheses to override these precedence rules. The backreference \N, where N is a single digit, matches the substring previously matched by the Nth parenthesized subexpression of the regular expression. Before R 2.1.0 R attempted to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. (POSIX allows this behaviour as an extension but we no longer support it.)
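The character classes, quantifiers, alternation and backreferences described above can be tried out with a sketch like the following. It assumes a recent version of R, where extended regular expressions are the default and no extended argument needs to be given; the test strings are made up. Note the doubled backslashes in the R strings:

```r
x <- c("abba", "cde", "ab12", "A_9", "hello world", "aa-bb")

# Named character class: strings made up of letters only
only_letters <- grepl("^[[:alpha:]]+$", x)

# Quantifier {n}: two digits in a row
two_digits <- grepl("[[:digit:]]{2}", x)

# Alternation, with parentheses so that ^ and $ apply to both alternatives
abba_or_cde <- grepl("^(abba|cde)$", x)

# Backreferences in the replacement: swap the two parenthesized groups
swapped <- sub("(a+)-(b+)", "\\2-\\1", "aa-bb")

# Backreference in the pattern: a character followed by itself
doubled <- grepl("(.)\\1", c("hello", "world"))

# Repetition is greedy: a{1,2} takes two a's when it can
greedy <- regmatches("aaa", regexpr("a{1,2}", "aaa"))

print(only_letters); print(two_digits); print(abba_or_cde)
print(swapped); print(doubled); print(greedy)
```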

Basic Regular Expressions

This section covers the regular expressions allowed if extended = FALSE in grep, regexpr, gregexpr, sub, gsub and strsplit. In basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \). Thus the metacharacters are . \ [ ^ $ *.

Perl Regular Expressions

The perl = TRUE argument to grep, regexpr, gregexpr, sub, gsub and strsplit switches to the PCRE library that ‘implements regular expression pattern matching using the same syntax and semantics as Perl 5.6 or later, with just a few differences’.

For complete details please consult the man pages for PCRE (especially man pcrepattern and man pcreapi) on your system or from the sources at ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/. If PCRE support was compiled from the sources within R, the PCRE version is 6.2 as described here (version >= 4.0 is required even if R is configured to use the system's PCRE library).

All the regular expressions described for extended regular expressions are accepted except \< and \>: in Perl all backslashed metacharacters are alphanumeric and backslashed symbols always are interpreted as a literal character. { is not special if it would be the start of an invalid interval specification. There can be more than 9 backreferences.

The construct (?...) is used for Perl extensions in a variety of ways depending on what immediately follows the ?.

Perl-like matching can work in several modes, set by the options (?i) (caseless, equivalent to Perl's /i), (?m) (multiline, equivalent to Perl's /m), (?s) (single line, so a dot matches all characters, even new lines: equivalent to Perl's /s) and (?x) (extended, whitespace data characters are ignored unless escaped and comments are allowed: equivalent to Perl's /x). These can be concatenated, so for example, (?im) sets caseless multiline matching. It is also possible to unset these options by preceding the letter with a hyphen, and to combine setting and unsetting such as (?im-sx). These settings can be applied within patterns, and then apply to the remainder of the pattern. Additional options not in Perl include (?U) to set ‘ungreedy’ mode (so matching is minimal unless ? is used, when it is greedy). Initially none of these options are set.

If you want to remove the special meaning from a sequence of characters, you can do so by putting them between \Q and \E. This is different from Perl in that $ and @ are handled as literals in \Q...\E sequences in PCRE, whereas in Perl, $ and @ cause variable interpolation.

The escape sequences \d, \s and \w represent any decimal digit, space character and ‘word’ character (letter, digit or underscore in the current locale) respectively, and their upper-case versions represent their negation. Unlike POSIX and earlier versions of Perl and PCRE, vertical tab is not regarded as a whitespace character.

Escape sequence \a is BEL, \e is ESC, \f is FF, \n is LF, \r is CR and \t is TAB. In addition \cx is cntrl-x for any x, \ddd is the octal character ddd (for up to three digits unless interpretable as a backreference), and \xhh specifies a character in hex.

Outside a character class, \b matches a word boundary, \B is its negation, \A matches at start of a subject (even in multiline mode, unlike ^), \Z matches at end of a subject or before newline at end, \z matches at end of a subject, and \G matches at first matching position in a subject. \C matches a single byte, including a newline.

The same repetition quantifiers as extended POSIX are supported. However, if a quantifier is followed by ?, the match is ‘ungreedy’, that is as short as possible rather than as long as possible (unless the meanings are reversed by the (?U) option.)

The sequence (?# marks the start of a comment which continues up to the next closing parenthesis. Nested parentheses are not permitted. The characters that make up a comment play no part at all in the pattern matching. If the extended option is set, an unescaped # character outside a character class introduces a comment that continues up to the next newline character in the pattern.

The pattern (?:...) groups characters just as parentheses do but does not make a backreference.

Patterns (?=...) and (?!...) are zero-width positive and negative lookahead assertions: they match if an attempt to match the ... forward from the current position would succeed (or not), but use up no characters in the string being processed. Patterns (?<=...) and (?<!...) are the lookbehind equivalents: they do not allow repetition quantifiers nor \C in ....

Named subpatterns, atomic grouping, possessive quantifiers and conditional and recursive patterns are not covered here.
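A few of the Perl-style features above, sketched with perl = TRUE on made-up strings: greedy versus ungreedy quantifiers, the (?i) option, and lookahead/lookbehind assertions.

```r
s <- "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs from the first "<" to the last ">", so everything is removed
greedy <- sub("<.*>", "", s, perl = TRUE)

# Ungreedy: .*? stops at the first ">", so only the first tag is removed
ungreedy <- sub("<.*?>", "", s, perl = TRUE)

# (?i) switches on caseless matching for the rest of the pattern
caseless <- grepl("(?i)^HELLO", "hello world", perl = TRUE)

# Positive lookahead: digits only when followed by " kg"; " kg" is not consumed
kg <- gsub("\\d+(?= kg)", "N", "5 kg and 12 lb and 30 kg", perl = TRUE)

# Negative lookbehind and lookahead: a 7 that is not part of a longer number
lone7 <- gsub("(?<!\\d)7(?!\\d)", "seven", "7 17 70 7", perl = TRUE)

print(greedy); print(ungreedy); print(caseless); print(kg); print(lone7)
```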

Note

Prior to R 2.1.0 the implementation used was that of GNU grep 2.4.2: as from R 2.1.0 it is that of glibc 2.3.x. The latter is more strictly compliant and rejects some extensions that used to be allowed. The change was made both because bugs were becoming apparent in the previous code and to allow support of multibyte character sets.

Author(s)

This help page is based on the documentation of GNU grep 2.4.2 (from which the C code used by R used to be taken), the pcre man page from PCRE 3.9 and the pcrepattern man page from PCRE 4.4.

See Also

grep, apropos, browseEnv, help.search, list.files, ls and strsplit.

http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html

----------------------------------------------------------------------

14. Special characters like "\n"

Single quotes need to be escaped by backslash in single-quoted strings, and double quotes in double-quoted strings.

\n newline
\r carriage return
\t tab
\b backspace
\a alert (bell)
\f form feed
\v vertical tab
\\ backslash \
\nnn character with given octal code (1, 2 or 3 digits)
\xnn character with given hex code (1 or 2 hex digits)
\unnnn Unicode character with given code (1–4 hex digits)
\Unnnnnnnn Unicode character with given code (1–8 hex digits)
\" inserts a quote character, e.g., cat("He said \"Boo.\"\n")
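A short sketch showing that each escape sequence stands for a single character, and that print shows the escaped form while cat shows the actual characters. The file path is a made-up example:

```r
n_newline <- nchar("\n")     # one character, not two
n_backslash <- nchar("\\")   # "\\" in source code is a single backslash

# Octal and hex escapes for the letter A
octal_A <- "\101"
hex_A <- "\x41"

p <- "C:\\temp\\data.txt"    # hypothetical Windows path
print(p)                     # shows the escaped form with doubled backslashes
cat(p, "\n")                 # shows the string as it really is
cat("He said \"Boo.\"\n")
```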

----------------------------------------------------------------------

15. Useful Links and Documents

Topic
    Source and Comment

fMRI data, how to analyze in R
    e:\r\news\Rnews 2002.2(1).pdf
    See discussion of the AnalyzeFMRI package.

Glynn's R tech notes
    http://research.stowers-institute.org/efg/R/index.htm
    Useful notes on R usage

graphics: general info
    Paul Murrell's book, R Graphics. Go to the link
    http://www.stat.auckland.ac.nz/%7Epaul/RGraphics/rgraphics.html

math notation in plots
    e:\r\news\Rnews 2002.2(3).pdf
    See Ligges, pages 32-34, re math annotation.

R packages, how to create
    http://www.ugcs.caltech.edu/manuals/math/R-2.2.1-exts/index.html
    See also 'e:\r\notes\packages notes.doc'.

ROC analysis
    http://www.bioconductor.org/repository/release1.5/package/Source/
    http://www.bioconductor.org/repository/devel/vignette/ROCnotes.pdf

sending messages and grades to every student in a UW course
    'd:\r\notes\grading.doc'

End of Useful Links and Documents

----------------------------------------------------------------------

16. Notes re the Built-In R Editor

o Execute one command: Either put cursor within the command (one line command), or highlight the entire command (one line or multiline command). Type Ctrl-R.
o Execute multiple commands: Completely highlight all commands. Type Ctrl-R.

17. Possible Editors for R Programming

* Page with list of many free, good editors: http://www.sciviews.org/_rgui/projects/Editors.html
* Emacs
* Crimson (small, simple, free). http://www.crimsoneditor.com
* Tinn-R. http://www.sciviews.org/Tinn-R/ (free, sounds good, customized for R)

----------------------------------------------------------------------

18. Notes on R-Code/Computing

Keywords followed by Note

'\', escape character, paths in Windows

roger bos wrote:
> Sometimes even the easy stuff is difficult (for me)... I want to get
> input from different places to paste together an excel filename (so
> you know I'm using windows) that I can open with RODBC. I know about
> using double "\" since its an escape character, but I get either 2 or
> none, I can't get just one "\" where I need it. See example code
> below. I am using R 2.1.0, but plan to upgrade soon. Thanks in
> advance to anyone who can help.
> Roger
>
> rankPath <- "R:\New Ranks\SMC\SMC"
> rankDate <- "20050819"
> rankFile <- paste(rankPath,rankDate,".xls", sep="")
> rankFile
> [1] "R:New RanksSMCSMC20050819.xls"
>
> rankPath <- "R:\\New Ranks\\SMC\\SMC"
> rankDate <- "20050819"
> rankFile <- paste(rankPath,rankDate,".xls", sep="")
> rankFile
> [1] "R:\\New Ranks\\SMC\\SMC20050819.xls"

This is perfect, "\" is *printed* escaped, hence for file access you can perfectly use this character vector.
Uwe Ligges

... and you can see that the "\\" is correct by cat(rankFile) instead of print(rankFile), which is what entering the variable at the prompt actually does.
-- Bert Gunter

**JM: The '\\' is printed in the second example, but it denotes the correct path and file as far as R is concerned.

'update packages'

Problem: Has anyone written a script to inspect a previous installation, then get & install the same packages into the new installation?

# Solution from Uwe Ligges.
x <- installed.packages()[,1] # Run this in the previous version of R.
install.packages(x) # Run this in the updated (new) version of R.

# Solution from Ripley:
This is one reason we normally recommend that you install into a separate library. Then update.packages(checkBuilt = TRUE) is all that is needed. However,

> foo <- installed.packages()
> as.vector(foo[is.na(foo[, "Priority"]), 1])

will give you a character vector which you can feed to install.packages(), so it's not complex to do manually.

'Type I sum of squares', 'Type II sum of squares', 'Type III sum of squares'

The 'base' function 'anova' computes Type I SS's. The 'car' function 'Anova' computes Type II and Type III SS's.
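To illustrate what Type I (sequential) sums of squares means in practice, here is a sketch using only 'anova' on the built-in mtcars data; the choice of predictors is arbitrary. Because each term is adjusted only for the terms before it, the SS for a predictor changes with the order of the terms, while the residual SS does not. ('Anova' from 'car' is not used here, since that package may not be installed.)

```r
fit_a <- lm(mpg ~ wt + hp, data = mtcars)   # wt entered first
fit_b <- lm(mpg ~ hp + wt, data = mtcars)   # hp entered first

ss_a <- anova(fit_a)
ss_b <- anova(fit_b)

# Sequential SS for wt: unadjusted in fit_a, adjusted for hp in fit_b
print(ss_a["wt", "Sum Sq"])
print(ss_b["wt", "Sum Sq"])

# The residual SS is the same in both orderings
print(ss_a["Residuals", "Sum Sq"])
print(ss_b["Residuals", "Sum Sq"])
```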

Keywords followed by Note************************************************************************