Basic R Easy


  • 7/28/2019 Basic R Easy

    UCLA Department of Statistics

    Statistical Consulting Center

    Basic R

    Brigid Wilson

    [email protected]

    April 5, 2010

    Brigid Wilson [email protected]

    Basic R UCLA SCC


    Outline

    I. Preliminaries

    II. Variable Assignment

    III. Working with Vectors

    IV. Working with Matrices

    V. From Vectors to Matrices

    VI. More on Handling Missing Data

    VII. The Help System

    VIII. Datasets in R

    IX. Overview of Plots

    X. R Environment

    XI. Common Bugs and Fixes

    XII. Online Resources for R

    XIII. Exercises

    XIV. Upcoming Mini-Courses


    Part I

    Preliminaries


    Software Installation

    Installing R on Mac

    1 Go to http://cran.r-project.org and select MacOS X.

    2 Select to download the latest version: 2.10.1

    3 Install and Open. The R window should look like this:


    Installing R on Windows

    1 Go to http://cran.r-project.org and select Windows.

    2 Select base to install the R system.

    3 Click on the large download link. There is other information available on the page.

    4 Install and Open. The R window should look like this:


    Part II

    Variable Assignment


    Creating Variables

    Creating Variables I

    To use R as a calculator, type an equation and hit ENTER. (Note how R prints the result.) Your output should look like this:

        2 + 5
        [1] 7


    Creating Variables II

    To create variables in R, use either <- or =.
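A minimal sketch of the two assignment operators, <- and =. The variable names and values are illustrative assumptions:

```r
# Both operators bind a value to a name at the top level.
a = 5                     # assignment with =
b <- 5                    # assignment with <-
print(a)
print(b)
print(identical(a, b))    # both names now hold the same value
```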


    Creating Variables III

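    Caution! Be careful when using < to compare a variable with a negative number: without spaces, R reads a<-2 as the assignment a <- 2.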


    Creating Variables IV

    Use spaces so that R will not be confused. It is better to use parentheses instead.

    1 a
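A minimal sketch of why spacing matters when comparing against a negative number. The values are illustrative assumptions, not the original listing:

```r
a <- 5
spaced  <- a < -2      # with a space: comparison, is a less than -2?
grouped <- a < (-2)    # parentheses make the intent explicit
print(spaced)
print(grouped)

b <- 5
b<-2                   # no spaces: parsed as the assignment b <- 2
print(b)               # b has been overwritten
```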


    Creating Variables V

    Caution! It is important not to name your variables after existing variables or functions. For example, a bad habit is to name your data frames data. data is a function used to load some datasets.

    If you give a variable the same name as an existing constant, that constant is masked by the variable: the name now returns your value, and the built-in value comes back only after you remove the variable with rm(). So, it is possible to define a new value for pi.


    Creating Variables VI

    Caution! On the other hand, if you give a variable the same name as an existing function, R will treat the identifier as a variable if used as a variable, and will treat it as a function when it is used as a function:

    c
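A minimal sketch of this behaviour, assuming a variable named c is created alongside the built-in function c(). The value 2 is an illustrative assumption:

```r
c <- 2                  # a variable that shares its name with the function c()
combined <- c(c, c)     # c(...) still calls the function; the arguments are the variable
print(combined)
rm(c)                   # remove the variable so that only the function remains
```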


    Creating Variables VII

    Caution!

    As we have seen, you can get away with using the same name for a variable as an existing function, but you will be in trouble if you give a name to a function and a function with that name already exists.


    Part III

    Working with Vectors


    Creating Vectors

    Creating Vectors I

    Scalars are the most basic vectors. To create vectors of length greater than one, use the concatenation function c():

        d = c(3, 4, 7); d
        [1] 3 4 7

    The More You Know...

    The semicolon ; is used to combine multiple statements on one line.


    Creating Vectors II

    To create a null vector:

        x = c(); x
        NULL


    Creating Vectors III

    To create a vector with equal spacing, use the sequence function seq():

        e = seq(from = 1, to = 3, by = 0.5); e
        [1] 1.0 1.5 2.0 2.5 3.0

    To create a vector of a given length, use the repeat function rep():

        f = rep(NA, 6); f
        [1] NA NA NA NA NA NA
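A short sketch of two common variations on seq() and rep(), using the standard length.out, times and each arguments. The values are illustrative:

```r
# seq() can take the number of points instead of the step size.
e2 <- seq(from = 1, to = 3, length.out = 5)   # same values as by = 0.5

# rep() can repeat a whole vector, or each element in turn.
r1 <- rep(c(1, 2), times = 3)   # 1 2 1 2 1 2
r2 <- rep(c(1, 2), each = 3)    # 1 1 1 2 2 2

print(e2)
print(r1)
print(r2)
```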


    Some Vector Functions

    Some Useful Vector Functions I

    To find the length of the vector, use length():

        length(d)
        [1] 3

    To find the maximum value of the vector, use the maximum function max():

        max(d)
        [1] 7


    Some Useful Vector Functions II

    To find the minimum value of the vector, use the minimum function min():

        min(d)
        [1] 3

    To find the mean of the vector, use mean():

        mean(d)
        [1] 4.666667


    Some Useful Vector Functions III

    To sort the vector, use sort():

        g <- c(2, 6, 7, 4, 5, 2, 9, 3, 6, 4, 3)
        sort(g, decreasing = TRUE)
        [1] 9 7 6 6 5 4 4 3 3 2 2

    Caution!

    Although T and F work in place of TRUE and FALSE, it is not recommended: T and F are ordinary variables that can be reassigned, while TRUE and FALSE are reserved words.


    Some Useful Vector Functions IV

    To find the unique elements of the vector, use unique():

        unique(g)
        [1] 2 6 7 4 5 9 3

    Alternatively, to find the elements of the vector that repeat, use duplicated():

        duplicated(g)
        [1] FALSE FALSE FALSE FALSE FALSE  TRUE
        [7] FALSE FALSE  TRUE  TRUE  TRUE
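Because duplicated() returns a logical vector, it can be used as an index. A small sketch using the same g. The names repeats and once are illustrative:

```r
g <- c(2, 6, 7, 4, 5, 2, 9, 3, 6, 4, 3)

repeats <- g[duplicated(g)]       # second and later occurrences of a value
once    <- g[!(g %in% repeats)]   # values that occur exactly once

print(repeats)
print(once)
```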


    Some Useful Vector Functions V

    To determine if a value is missing (NA), use is.na. This is useful for finding missing values and removing them, or doing something else with them.

    1 a


    Some Useful Vector Functions VI

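    Caution! Functions such as mean() return NA when the vector contains a missing value, unless you set na.rm=TRUE:

        mean(a)
        [1] NA

        mean(a, na.rm=TRUE)
        [1] 1.5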


    Some Useful Vector Functions VII

    To get the number of missing values in a vector,

        sum(is.na(a))
        [1] 1

    There are other ways to handle missing values. See ?na.action.
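A minimal sketch that ties the missing-value tools together. The vector v and its values are hypothetical, chosen only for illustration:

```r
# Hypothetical vector with one missing value.
v <- c(1, 2, 3, NA, 6)

n_missing <- sum(is.na(v))          # TRUE counts as 1, FALSE as 0
complete  <- v[!is.na(v)]           # keep only the non-missing entries
m_all     <- mean(v)                # NA, because one element is missing
m_clean   <- mean(v, na.rm = TRUE)  # mean of the non-missing entries

print(n_missing)
print(complete)
print(m_all)
print(m_clean)
```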


    Some Useful Vector Functions VIII

    One final common function you can use on vectors (and other objects) is summary.

        summary(a)
           Min. 1st Qu.  Median    Mean 3rd Qu.    Max.     NAs
           1.00    1.75    2.50    3.00    3.75    6.00    1.00

    There are many, many other functions you can use on vectors!


    Comparisons in R

    Symbol   Meaning
    !        logical NOT
    &        logical AND
    |        logical OR
    <        less than
    <=       less than or equal to
    >        greater than
    >=       greater than or equal to
    ==       logical equals
    !=       not equal
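A brief sketch of these operators applied element by element to a vector. The vector x and its values are illustrative assumptions:

```r
x <- c(3, 4, 7)

in_range <- x >= 4 & x < 7     # element-wise AND
outside  <- x < 4 | x == 7     # element-wise OR
negated  <- !in_range          # element-wise NOT

print(in_range)
print(outside)
print(negated)
```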


    Subsetting with Vectors

    Subsetting with Vectors I

    To find out what is stored in a given element of the vector, use [ ]:

        d[2]
        [1] 4

    To see if the elements of a vector equal a certain number, use ==:

        d == 3
        [1]  TRUE FALSE FALSE


    Subsetting with Vectors II

    To see which elements of a vector do not equal a certain number, use !=:

        d != 3
        [1] FALSE  TRUE  TRUE


    Subsetting with Vectors III

    To obtain the element number of the vector when a condition is satisfied, use which():

        which(d == 4)
        [1] 2

    To store the result, type: a = which(d == 4); a


    Subsetting with Vectors IV

    We can also tell R what we do not want when subsetting by using the minus - sign. To obtain everything but the 2nd element,

    1 d
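A minimal sketch of negative indexing. It assumes d holds the odd integers 1, 3, 5, 7, 9, a choice that is consistent with the output 3 5 7 9 shown for d[d >= 2] but is still an assumption:

```r
# Assumption: d holds the odd integers from 1 to 9.
d <- seq(1, 9, by = 2)

without_second <- d[-2]         # everything but the 2nd element
without_ends   <- d[-c(1, 5)]   # drop several elements at once

print(without_second)
print(without_ends)
```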


    Subsetting with Vectors V

    We can use subsetting to explicitly tell R what observations we want to use. To get all elements of d greater than or equal to 2,

        d[d >= 2]
        [1] 3 5 7 9

    R will return values of d where the expression within brackets is TRUE. Think of these statements as: give me all d such that d >= 2.


    Exercise 1

    Create a vector of the positive odd integers less than 100

    Remove the values greater than 60 and less than 80

    Find the variance of the remaining set of values


    Part IV

    Working with Matrices

    Creating Matrices

    Creating Matrices I

    To create a matrix, use the matrix() function:

        mat <- matrix(10:15, nrow = 3, ncol = 2); mat
             [,1] [,2]
        [1,]   10   13
        [2,]   11   14
        [3,]   12   15

    Some Matrix Functions

    Some Useful Matrix Functions I

    To add two matrices, use +

        mat + mat
             [,1] [,2]
        [1,]   20   26
        [2,]   22   28
        [3,]   24   30


    Some Useful Matrix Functions II

    To find the transpose of a matrix, use t():

        t(mat)
             [,1] [,2] [,3]
        [1,]   10   11   12
        [2,]   13   14   15


    Some Useful Matrix Functions III

    To find the dimensions of a matrix, use dim():

        dim(mat)
        [1] 3 2

    Alternatively, we can find the number of rows and columns of the matrix with nrow() and ncol().


    Some Useful Matrix Functions IV

    To multiply two matrices, use %*%.

    Note: If you use * instead, you will be performing element-wise multiplication, not matrix multiplication.

        mat %*% t(mat)
             [,1] [,2] [,3]
        [1,]  269  292  315
        [2,]  292  317  342
        [3,]  315  342  369
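A short sketch contrasting the two operators on the same mat. The result names are illustrative:

```r
mat <- matrix(10:15, nrow = 3, ncol = 2)

elementwise <- mat * mat         # 3 x 2: each entry multiplied by itself
product     <- mat %*% t(mat)    # 3 x 3: rows of mat times columns of t(mat)

print(dim(elementwise))
print(dim(product))
```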

    Subsetting with Matrices

    Subsetting with Matrices I

    To see what is stored in the first element of the matrix, use [ ]:

        mat[1, 1]
        [1] 10

    To see what is stored in the first row of the matrix:

        mat[1, ]
        [1] 10 13


    Subsetting with Matrices II

    To see what is stored in the second column of the matrix:

        mat[, 2]
        [1] 13 14 15


    Subsetting with Matrices III

    To extract elements 1 and 3 from the second column, use c() and [ ]:

        mat[c(1, 3), 2]
        [1] 13 15


    Part V

    From Vectors to Matrices

    Creating Matrices from Vectors

    Creating Matrices from Vectors I

    To stack two vectors, one below the other, use rbind():

        mat1 <- rbind(d, d); mat1
          [,1] [,2] [,3]
        d    3    4    7
        d    3    4    7


    Creating Matrices from Vectors II

    To stack two vectors, one next to the other, use cbind():

        mat2 <- cbind(d, d); mat2
             d d
        [1,] 3 3
        [2,] 4 4
        [3,] 7 7


    Part VI

    More on Handling Missing Data

    Missing Data in Matrices

    Missing Data in Matrices I

    Start by creating a matrix with missing data:

        h = matrix(c(NA, 3, 1, 7, -8, NA), nrow = 3, ncol = 2, byrow = TRUE); h
             [,1] [,2]
        [1,]   NA    3
        [2,]    1    7
        [3,]   -8   NA


    Missing Data in Matrices II

    To see if any of the elements of a matrix are missing, use is.na():

        is.na(h)
              [,1]  [,2]
        [1,]  TRUE FALSE
        [2,] FALSE FALSE
        [3,] FALSE  TRUE


    Missing Data in Matrices III

    To see how many missing values there are, use sum() and is.na() (TRUE=1, FALSE=0):

        sum(is.na(h))
        [1] 2

    To obtain the element number of the matrix of the missing value(s), use which() and is.na(). Element numbers count down the columns, so element 6 is row 3 of column 2:

        which(is.na(h))
        [1] 1 6
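When row and column positions are more useful than element numbers, which() accepts arr.ind = TRUE. A small sketch on the same h. The name pos is illustrative:

```r
h <- matrix(c(NA, 3, 1, 7, -8, NA), nrow = 3, ncol = 2, byrow = TRUE)

# One row per missing value, giving its (row, col) position.
pos <- which(is.na(h), arr.ind = TRUE)
print(pos)
```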


    Missing Data in Matrices IV

    To keep only the rows without missing value(s), use na.omit()

        na.omit(h)
             [,1] [,2]
        [1,]    1    7
        attr(,"na.action")
        [1] 1 3
        attr(,"class")
        [1] "omit"


    Exercise 2

    Find the matrix product of A and B if

    Matrix A =

        2 3 7
        1 6 2
        3 5 1

    Matrix B =

        3 2 9
        0 7 8
        5 8 2


    Part VII

    The Help System

    Help with a Function

    Help with a Function I

    To get help with a function in R, use ? followed by the name of the function.

        ?read.table

    help(function name) also works.

    Help with a Package

    Help with a Package I

    To get help with a package, use help(package="name").

        help(package = "MASS")

    Searching for Help

    Searching for Help I

    To search R packages for help with a topic, use help.search().

        help.search("regression")


    Part VIII

    Datasets in R

    Importing Datasets into R

    Data from the Internet I

    When downloading data from the internet, use read.table(). In the arguments of the function:

    header if TRUE, tells R to include variable names when importing

    sep tells R how the entries in the data set are separated

        sep="," when entries are separated by COMMAS
        sep="\t" when entries are separated by TAB
        sep=" " when entries are separated by SPACE
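A self-contained sketch of the header and sep arguments. It writes a small, made-up comma-separated file to a temporary location and reads it back, so the file contents and the names path and dat are all assumptions:

```r
# Hypothetical file contents, written to a temporary file for the demo.
path <- tempfile(fileext = ".csv")
writeLines(c("name,price", "AAA,1.5", "BBB,2.5"), path)

# header = TRUE takes the first line as variable names; sep = "," splits on commas.
dat <- read.table(path, header = TRUE, sep = ",")
print(dat)
print(names(dat))

unlink(path)   # clean up the temporary file
```

For a tab-separated file the same call would use sep = "\t".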


    Data from the Internet II

        stock.data <- read.table("http://www.google.com/finance/historical?q=NASDAQ:AAPL&output=csv", header = TRUE, sep = ",")


    Importing Data from Your Computer I

    1 Check what folder R is working with now:

        getwd()

    2 Tell R in what folder the data set is stored (if different from (1)). Suppose your data set is on your desktop:

        setwd("~/Desktop")

    3 Now use the read.table() command to read in the data, substituting the name of the file for the website.


    Using Data Available in R I

    To use a data set available in one of the R packages, install that package (if needed). Load the package into R, using the library() function.

        library(alr3)

    Extract the data set you want from that package, using the data() function. In our case, the data set is called UN2.

        data(UN2)

    Working with Datasets in R

    Working with Datasets in R I

    To use the variable names when working with data, use attach():

        data(UN2)
        attach(UN2)


    Working with Datasets in R II

    After the variable names have been attached, to see the variable names, use names():

        names(UN2)

    To see the descriptions of the variables, use ?:

        ?UN2


    Working with Datasets in R III

    After modifying variables, use detach() and attach() to save the results. attach() works on a snapshot of the data frame, so changes made after attaching are not visible by bare variable name until you attach again:

        # Make a copy of the data set
        UN2.copy <- UN2
        # Change the 10th observation for logFertility
        UN2.copy[10, 2] <- 999
        # Swap the attached data frame for the modified copy
        detach(UN2)
        attach(UN2.copy)


    Working with Datasets in R IV

    To get an overview of the data set and its variables, use the summary() function:

        # Check that the change has been made
        summary(UN2)
        summary(UN2.copy)


    Working with Datasets in R V

    Caution!

    Avoid using attach() if possible. Many strange things can occur if you accidentally attach the same data frame multiple times, or forget to detach. Instead, you can refer to a variable using $. To access the Locality variable in data frame UN2, use UN2$Locality. You can also get around this by using the with function, or the data argument if your function of choice takes one.

    attach at your own risk!
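A minimal sketch of the three alternatives to attach(). It uses mtcars, a data set that ships with R, as a stand-in because UN2 requires the alr3 package; the column choices are illustrative assumptions:

```r
# mtcars is a stand-in here; it is not the UN2 data.
m1  <- mean(mtcars$mpg)               # $ picks out one column
m2  <- with(mtcars, mean(mpg))        # with() evaluates the expression inside the data frame
fit <- lm(mpg ~ wt, data = mtcars)    # many modelling functions accept a data argument

print(m1)
print(m2)
print(coef(fit))
```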


    Working with Datasets in R VI

    To get the mean of all the variables in the data set, use mean():

        mean(UN2, na.rm = TRUE)
            logPPgdp logFertility       Purban
           10.993094     1.018016    55.538860
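In R 3.0.0 and later, mean() applied to a whole data frame returns NA with a warning instead of column means. A sketch of two equivalent replacements, shown on the built-in mtcars data as a stand-in for an all-numeric data frame such as UN2:

```r
# mtcars is a stand-in; the two columns are chosen only for illustration.
num <- mtcars[, c("mpg", "wt")]

cm <- colMeans(num, na.rm = TRUE)          # column means of a numeric data frame
sm <- sapply(num, mean, na.rm = TRUE)      # apply mean() to each column in turn

print(cm)
print(sm)
```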


    Working with Datasets in R VII

    To get the correlation matrix of all the (numerical) variables in the data set, use cor():

        cor(UN2[, 1:2])
                      logPPgdp logFertility
        logPPgdp      1.000000    -0.677604
        logFertility -0.677604     1.000000

    You can similarly obtain the variance-covariance matrix using var().


    Exercise 3

    Load the Animals dataset from the MASS package

    Examine the documentation for this dataset

    Find the correlation coefficient of brain weight and body weight in this dataset


    Part IX

    Overview of Plots in R

    Creating Plots

    Basic scatterplot I

    To make a plot in R, you can use plot():

        plot(x = UN2$logPPgdp,
             y = UN2$logFertility,
             main = "Fertility vs. PerCapita GDP",
             xlab = "log PerCapita GDP, in $US",
             ylab = "Log Fertility")

    Basic scatterplot II

    [Figure: scatterplot "Fertility vs. PerCapita GDP"; x-axis: log PerCapita GDP, in $US; y-axis: Log Fertility]

    Histogram I

    To make a histogram in R, you can use hist():

        hist(UN2$logPPgdp,
             main = "Distribution of PerCapita GDP",
             xlab = "log PerCapita GDP")

    Histogram II

    [Figure: histogram "Distribution of PerCapita GDP"; x-axis: log PerCapita GDP; y-axis: Frequency]

    Boxplot I

    To make a boxplot in R, you can use boxplot():

        boxplot(UN2$logPPgdp,
                main = "Boxplot of PerCapita GDP")

    Boxplot II

    [Figure: boxplot "Boxplot of PerCapita GDP"]

    Matrix of Scatterplots I

    To make scatterplots of all the numeric variables in your dataset in R, you can use pairs():

        pairs(UN2)

    Matrix of Scatterplots II

    [Figure: scatterplot matrix of logPPgdp, logFertility and Purban]

    Overlaying with points I

    To add more points to an existing plot, use points(). Here, we will first plot fertility vs. PerCapita GDP where % urban is less than 50.

        attach(UN2)
        plot(logPPgdp[Purban < 50],
             logFertility[Purban < 50],
             main = "Fertility vs. PPGDP, by % Urban",
             xlab = "log PPGDP",
             ylab = "log Fertility")

    Overlaying with points II

    We then add in the points where % urban is 50 or more and mark these points with a different color.

        points(logPPgdp[Purban >= 50],
               logFertility[Purban >= 50],
               col = "red")

    Overlaying with points III

    [Figure: scatterplot "Fertility vs. PPGDP, by % Urban"; x-axis: log PPGDP; y-axis: log Fertility]

    Overlaying

    Caution!

    Once a plot is constructed using plot, whatever is contained in the plot cannot be modified. To overlay things on a rendered plot, use one of the following

    1 abline - add a line with slope b, intercept a or horizontal/vertical.

    2 points - add points.

    3 lines - add lines.
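The three overlay functions can be sketched together as follows. The example uses the built-in mtcars data as a stand-in, and the column choices, cutoff and colors are illustrative assumptions:

```r
# mtcars is a stand-in data set; it is not the UN2 data.
plot(mtcars$wt, mtcars$mpg, xlab = "wt", ylab = "mpg")

fit <- lm(mpg ~ wt, data = mtcars)
abline(fit)                               # fitted line: intercept a, slope b
abline(h = mean(mtcars$mpg), lty = 2)     # horizontal reference line

heavy <- mtcars$wt > 4                    # logical index for the overlay
points(mtcars$wt[heavy], mtcars$mpg[heavy], col = "red", pch = 19)

ord <- order(mtcars$wt)                   # lines() joins points in the order given
lines(mtcars$wt[ord], mtcars$mpg[ord], col = "grey")
```

Each call draws on top of what is already there, which is why the base plot has to come first.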

    Saving Plots as a PDF

    Saving Plots as a PDF I

    Note: The files will be saved in the folder specified with setwd().

    To save a plot in R as a PDF, you can use pdf():

        pdf("myplot.pdf")
        pairs(UN2)
        dev.off()
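A self-contained sketch of the same open, draw, close pattern. It writes to a temporary file rather than the working directory, and the file location, plot and page size are assumptions for illustration:

```r
out <- tempfile(fileext = ".pdf")    # hypothetical output location
pdf(out, width = 6, height = 4)      # open the PDF device (size in inches)
plot(1:10, main = "Example")         # draw as usual; output goes to the file
dev.off()                            # close the device so the file is written
print(file.exists(out))
```

Forgetting dev.off() is a common reason for a PDF that is empty or cannot be opened.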


    Part X

    R Environment

    Exploring R Objects

    Exploring R Objects I

    To see the names of the objects available to be saved (in your current workspace), use ls().

        ls()
        [1] "UN2" "a" "b" "d" "data" "e" "f" "h" "mat1" "mat2"

    Brigid Wilson [email protected]

    Basic R UCLA SCC R Environment Exporting R Objects to Other Formats

    Exploring R Objects

    Exploring R Objects II


    To remove objects from your workspace, use rm().

    rm(d)
    ls()

    [1] "UN2" "a" "b" "data" "e" "f" "h" "mat1" "mat2"


    Exploring R Objects III


    To remove all the objects from your workspace, type:

    rm(list = ls())
    ls()

    character(0)
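Because rm(list = ls()) wipes the whole workspace, it can help to try these commands somewhere safe first. The sketch below uses a separate scratch environment (an assumption for illustration; the slides work in the main workspace) with made-up objects a, b, and d.

```r
# Work in a scratch environment so the real workspace is untouched
scratch <- new.env()
assign("a", 1, envir = scratch)
assign("b", 2, envir = scratch)
assign("d", 3, envir = scratch)

# List the objects in the scratch environment
before <- ls(envir = scratch)

# Remove a single object by name
rm("d", envir = scratch)
after_one <- ls(envir = scratch)

# Remove everything that is left
rm(list = ls(envir = scratch), envir = scratch)
after_all <- ls(envir = scratch)
```

Without the envir argument, ls() and rm() act on the current workspace, as in the slides.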


    Saving and Loading R Objects

    Saving and Loading R Objects I


    To save (to the current directory) all the objects in the workspace, use save.image().

    save.image("basicR.RData")

    To load (from the current directory), use load().

    load("basicR.RData")


    Saving and Loading R Objects II


    To save (to the current directory) a single object in the workspace, use save().

    save(stock.data, file = "stocks.RData")

    To load (from the current directory), use load().

    load("stocks.RData")
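A small round trip, as a sketch: the contents of stock.data below are made up (the slides do not show the real object), and the file goes to a temporary folder instead of the current directory.

```r
# Hypothetical stand-in for the stock.data object
stock.data <- data.frame(day = 1:3, price = c(10.5, 11.2, 10.9))

# Save the single object to a temporary file
rdata_file <- file.path(tempdir(), "stocks.RData")
save(stock.data, file = rdata_file)

# Keep a copy for comparison, then remove the original
original <- stock.data
rm(stock.data)

# load() restores the object under its original name
# and returns the names of the objects it restored
loaded_names <- load(rdata_file)
loaded_names
all.equal(stock.data, original)
```

Note that load() overwrites any existing object with the same name without asking.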


    Exporting R Objects to Other Formats I


    To save (to the current directory) certain objects in the workspace to be used in Excel, use write.csv().

    write.csv(stock.data,
      file = "stockdata.csv")
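By default write.csv() also writes the row names as an extra first column, which often shows up as an unnamed column in Excel. A sketch that turns this off and reads the file back to check it; the data frame contents and the temporary file location are assumptions for illustration.

```r
# Hypothetical stand-in for the stock.data object
stock.data <- data.frame(day = 1:3, price = c(10.5, 11.2, 10.9))

# Write to a temporary file, without the row-name column
csv_file <- file.path(tempdir(), "stockdata.csv")
write.csv(stock.data, file = csv_file, row.names = FALSE)

# Read the file back in to confirm what was written
roundtrip <- read.csv(csv_file)
names(roundtrip)
nrow(roundtrip)
```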


    Saving R Commands

    Saving R Commands I


    To see all of the commands you typed in an R session, click on the Yellow and Green Tablet icon.


    Saving R Commands II


    To save all of the commands you typed in an R session, use:

    savehistory(file = "history.log")


    Saving R Commands III


    Alternatively, use a .r file to store your commands.

    1 Go to: File -> New Document

    2 Type your commands

    3 Save the file as "code.r"

    4 Go back to the R Console

    5 To run all the commands, use:

    source("code.r")

    The More You Know...

    Use the # sign to write comments in your code. Use them!
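The script workflow above can be sketched end to end. Here the script is created with writeLines() in a temporary folder instead of through the File menu, and its contents (x and x_mean) are made up for illustration.

```r
# Create a small script file; normally you would type this in the editor
script_file <- file.path(tempdir(), "code.r")
writeLines(c(
  "# Compute the mean of a small vector",
  "x <- c(2, 4, 6)",
  "x_mean <- mean(x)"
), script_file)

# Run every command in the file, in order
source(script_file)

# Objects created by the script are now in the workspace
x_mean
```

The comment line inside the script is ignored by R when the file is sourced.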


    Part XI

    Common Bugs and Fixes


    Syntax Error

    Error: syntax error


    Possible causes:

    Incorrect spelling (of the function, variable, etc.)

    Including a + when copying code from the Console

    Having an extra parenthesis at the end of a function

    Having an extra bracket when subsetting
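One way to see a syntax error without stopping a session is to parse the code as text. This is a sketch: is_valid() is a hypothetical helper written for this example, not a built-in function.

```r
# An extra closing parenthesis at the end of a function call
bad_code  <- "mean(c(1, 2, 3)))"
good_code <- "mean(c(1, 2, 3))"

# Hypothetical helper: TRUE if R can parse the code, FALSE otherwise
is_valid <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

is_valid(bad_code)
is_valid(good_code)
```

Counting opening and closing parentheses and brackets is usually the quickest fix for this kind of error.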


    Error When Performing Operations

    Error in ... : requires numeric matrix/vector arguments

    Possible causes:

    1 Objects are data frames, not matrices

    2 Elements of the vectors are characters

    Possible solutions:

    1 Coerce (a copy of) the data set to be a matrix, with the as.matrix() command

    2 Coerce (a copy of) the vector to have numeric entries, with the as.numeric() command
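A sketch of both causes and fixes, using made-up objects df and chars. Matrix multiplication on a data frame fails; the error is caught with tryCatch() so the example keeps running.

```r
# Cause 1: the object is a data frame, not a matrix
df <- data.frame(x = c(1, 2), y = c(3, 4))
result <- tryCatch(df %*% df, error = function(e) conditionMessage(e))
result  # the error message, stored as text

# Fix 1: coerce a copy to a matrix
m <- as.matrix(df)
m %*% m

# Cause 2: the vector holds characters, not numbers
chars <- c("1.5", "2", "3")

# Fix 2: coerce a copy to numeric
nums <- as.numeric(chars)
sum(nums)
```

as.numeric() returns NA (with a warning) for entries that cannot be read as numbers, so check the result for missing values.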


    Part XII

    Online Resources for R


    Useful Links for R

    CRAN


    http://cran.stat.ucla.edu/


    R-Seek Search Engine


    http://www.rseek.org


    UCLA Statistics Bootcamp Resources


    R Bootcamp is a day-long introduction to R. Handouts and datasets from Bootcamp 2008 can be found on Ryan Rosario's website: http://www.stat.ucla.edu/rosario/boot08/


    UCLA Statistics Information Portal


    http://info.stat.ucla.edu/grad/


    UCLA Statistical Consulting Center E-consulting and Walk-in Consulting


    http://scc.stat.ucla.edu


    Part XIII

    Exercises


    Exercise 1


    Create a vector of the positive odd integers less than 100

    Remove the values greater than 60 and less than 80

    Find the variance of the remaining set of values


    Exercise 2


    Find the matrix product of A and B if

    Matrix A=

    2 3 7
    1 6 2
    3 5 1

    Matrix B=

    3 2 9
    0 7 8
    5 8 2


    Exercise 3


    Load the Animals dataset from the MASS package

    Examine the documentation for this dataset

    Find the correlation coefficient of brain weight and body weight in this dataset


    Solutions I


    1 e1
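The solution code is cut off here. One possible approach to Exercise 1 is sketched below; it is not necessarily the original solution. It assumes "greater than 60 and less than 80" means the values strictly between 60 and 80.

```r
# Positive odd integers less than 100
e1 <- seq(1, 99, by = 2)

# Drop the values strictly between 60 and 80
e1 <- e1[!(e1 > 60 & e1 < 80)]

# Sample variance of what remains
var(e1)
```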


    1 A
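The solution code is cut off here as well. A possible approach to Exercise 2, not necessarily the original one, assuming each matrix in the exercise is read row by row as three rows of three values:

```r
# Fill the matrices row by row (byrow = TRUE)
A <- matrix(c(2, 3, 7,
              1, 6, 2,
              3, 5, 1), nrow = 3, byrow = TRUE)
B <- matrix(c(3, 2, 9,
              0, 7, 8,
              5, 8, 2), nrow = 3, byrow = TRUE)

# %*% is the matrix product; * would multiply element by element
AB <- A %*% B
AB
```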


    library(MASS)
    data(Animals)
    ?Animals
    cor(Animals)
                  body        brain
    body   1.000000000 -0.005341163
    brain -0.005341163  1.000000000


    Upcoming Mini-Courses


    April 7 - R Programming II: Data Manipulation and Functions

    April 12 - LaTeX I: Writing a Document, Paper, or Thesis

    April 14 - LaTeX II: Bibliographies, Style and Math in LaTeX

    April 19 - LaTeX III: Sweave, Embedding R in LaTeX

    For a schedule of all mini-courses offered please visit http://scc.stat.ucla.edu/mini-courses.
