Introduction to R
BIST Biostatistics course, May 6th, 2016, Sarah Bonnin
CRG Bioinformatics Core Facility
06/05/16 1
Outline
• R: what, when, why?
• Getting started
• Data structures
• Libraries/Packages
• Basic commands
• Functions
• RStudio
R: what, when, why?
What is R?
• Integrated suite of software facilities for data manipulation, calculation and graphical display.
• Simple and effective programming language which includes conditionals, loops, etc.
• Implementation of the S programming language (Bell Laboratories)
• Created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand
• Now developed by the "R Development Core Team"
• GNU project (free software, mass collaboration)
→ Open source!
https://www.r-project.org/
When to use R?
• Data analysis
• Statistical modeling
• Simulation
• Graphics
Why use R?
• Flexible
• Powerful
• Interactive
• Very active community of developers and users!
Getting started
R syntax
function(arguments)
example:
list.files(path=".")
(list.files is the command, path="." is the argument)
a <- function(arguments)
example:
myobject <- list.files(path=".")
(myobject is the object, <- is the assignment operator)
• Case sensitive: List.files ≠ list.files
• Comment lines start with #
• Commands are separated by ; or a new line
• Arguments in a function are separated by ,
Getting help
• help(function)
• ?function
• Google!
Starting R interactively from a terminal
• Starting session: R
• Ending session: q()
Data structures
• Vectors
• Factors
• Matrices
• Data frames
• Lists
Data types
• Every value has a data type that tells what sort of value it is.
• Most common data types:
– Numeric (numbers)
– Character (text)
– Logical (TRUE/FALSE)
• Checking object type with mode():
– mode("a") returns [1] "character"
– mode(10) returns [1] "numeric"
– mode(FALSE) returns [1] "logical"
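The type checks above can be tried directly in the console. This is a minimal sketch; class() and the 10L integer literal are not on the slide and are shown only as related base R features.

```r
# mode() reports the basic type of a value
mode("a")      # "character"
mode(10)       # "numeric"
mode(FALSE)    # "logical"

# class() is a related base R function (an addition, not from the slide)
class(10)      # "numeric"
class(10L)     # "integer": the L suffix creates an integer
mode(10L)      # "numeric": integers are still numeric for mode()
```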
Vectors
• Sequence of data elements of the same type
• Elements of an atomic vector are of one type only, either:
– Numeric (1, 2, 5.3, 6, -2, 4)
– Character ("one", "two", "three")
– Logical (TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
• Assignment of values to a vector using the c command (combining elements):
myvector <- c(0.5, 2, 10, 3, 8)
myvector <- 1:5   # same as c(1, 2, 3, 4, 5)
• Checking if an object is a vector: is.vector(myvector)
• Fetch elements of a vector: subscripts
a contains: 1 9 4 8 0 11 7
a[1]
a[2:4]
a[c(5,7)]
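A short sketch of the subscripts above, using the vector shown on the slide. The variable names first, middle, picked and without_first are made up for illustration, and negative subscripts are an addition not covered on the slide.

```r
# Vector from the slide
a <- c(1, 9, 4, 8, 0, 11, 7)

first  <- a[1]        # 1
middle <- a[2:4]      # 9 4 8
picked <- a[c(5, 7)]  # 0 7

# Negative subscripts drop elements (not shown on the slide)
without_first <- a[-1]   # 9 4 8 0 11 7

# 1:5 creates the integers 1 to 5
is.vector(1:5)        # TRUE
```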
Factors
• Vector object used to specify a discrete classification of the components in a dataset:
→ Categorical variables
• Used mainly in statistical modeling, but also in some graphical functions.
• Similar to vectors, but their values are limited to a fixed set of possible values.
• Both numeric and character variables can be made into factors.
myfactor <- factor(c("a","a","b","c",1,2))
myfactor
[1] a a b c 1 2
Levels: 1 2 a b c
Levels are character vectors
• Checking if an object is a factor:
– is.factor(myfactor)
• Transforming a vector into a factor:
– as.factor(myvector)
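The factor functions above can be sketched as follows. The treatment vector is hypothetical example data, and table() is an addition not shown on the slides.

```r
# Character vector of categories (hypothetical example data)
treatment <- c("control", "drug", "drug", "control", "placebo")

f <- as.factor(treatment)
is.factor(f)     # TRUE
levels(f)        # "control" "drug" "placebo"
table(f)         # counts per level: control 2, drug 2, placebo 1

# Mixing numbers and text in c() turns everything into text first,
# which is why the slide's factor has the levels "1" "2" "a" "b" "c"
myfactor <- factor(c("a", "a", "b", "c", 1, 2))
levels(myfactor)
```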
Matrices
• A matrix is a vector with 2 dimensions
• All columns in a matrix must have:
– the same type (numeric, character, logical)
– the same length
a <- matrix(c(1,0,34,5,13,44,12,4,3,8,6,9,22,7,76), nrow=5, ncol=3)
• Fetch rows, columns or single elements of a matrix using subscripts:
a[row indices, column indices]
a contains:
 1  44   6
 0  12   9
34   4  22
 5   3   7
13   8  76
a[1,]     # first row
a[2:3,]   # rows 2 and 3
a[,3]     # third column
a[5,1]    # element in row 5, column 1
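A runnable sketch of the matrix subscripts above, with the values each call returns for the matrix defined on the slide.

```r
a <- matrix(c(1, 0, 34, 5, 13, 44, 12, 4, 3, 8, 6, 9, 22, 7, 76),
            nrow = 5, ncol = 3)   # filled column by column

a[1, ]     # first row: 1 44 6
a[2:3, ]   # rows 2 and 3, returned as a 2 x 3 matrix
a[, 3]     # third column: 6 9 22 7 76
a[5, 1]    # single element: 13
dim(a)     # 5 3
```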
Data frames
• Structures of 2 dimensions.
• More general than matrices.
• Different columns can have different types but must have the same length.
a <- data.frame(c(1,3,34,5,13),
c("etc","ok","yes","no","well"), c(TRUE,TRUE,FALSE,FALSE,TRUE))
• Fetch rows, columns or single elements of a data frame using subscripts:
a contains:
 1  etc   TRUE
 3  ok    TRUE
34  yes   FALSE
 5  no    FALSE
13  well  TRUE
a[1,]
a[2:3,]
a[,3]
a[5,1]
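The same subscripts applied to a data frame, as a minimal sketch. The column names value, answer and flag are hypothetical; the slide leaves the columns unnamed.

```r
# Column names are hypothetical; the slide leaves the columns unnamed.
a <- data.frame(value  = c(1, 3, 34, 5, 13),
                answer = c("etc", "ok", "yes", "no", "well"),
                flag   = c(TRUE, TRUE, FALSE, FALSE, TRUE))

a[1, ]      # first row, still a data frame
a[2:3, ]    # rows 2 and 3
a[, 3]      # third column as a logical vector
a[5, 1]     # single element: 13
sapply(a, class)   # one type per column
```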
Matrices and data frames: dimension names
a <- matrix(c(1,0,34,5,13,44), nrow=3, ncol=2, dimnames=list(c("row1","row2","row3"), c("col1","col2")))
b <- data.frame(col1=c(1,0,34),
col2=c(5,13,44), row.names=c("row1","row2","row3"))
• Changing column and/or row names:
colnames(x) <- c("a","b","c")
rownames(x) <- 1:5
• Changing column and row names at once (row names first, then column names):
dimnames(x) <- list(1:5, c("a","b","c"))
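A small sketch of renaming dimensions, assuming a hypothetical 5 x 3 matrix x. Note that dimnames() expects the row names as the first list element and the column names as the second.

```r
x <- matrix(1:15, nrow = 5, ncol = 3)   # hypothetical 5 x 3 matrix

colnames(x) <- c("a", "b", "c")
rownames(x) <- 1:5

# dimnames() takes a list: row names first, then column names
dimnames(x) <- list(c("r1", "r2", "r3", "r4", "r5"), c("a", "b", "c"))
x["r2", "b"]   # 7
```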
Matrices and data frames: accessing by dimension names
       col1  col2
row1      1     5
row2      0    13
row3     34    44

Matrix (a)           Data frame (b)
a[,"col1"]           b[,"col1"]
a["col1"]            b["col1"]
a$col1               b$col1
a["row1",]           b["row1",]
a["row1"]            b["row1"]
a["row1","col1"]     b["row1","col1"]
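Not every form listed above works for both structures. The sketch below tries each one in base R; try() is used only so that the failing forms do not stop the script, and the variable names matrix_dollar and df_row_only are made up for illustration.

```r
a <- matrix(c(1, 0, 34, 5, 13, 44), nrow = 3, ncol = 2,
            dimnames = list(c("row1", "row2", "row3"), c("col1", "col2")))
b <- as.data.frame(a)

# Forms that work for both structures
a[, "col1"];  b[, "col1"]              # column as a vector
a["row1", ];  b["row1", ]              # one row
a["row1", "col1"];  b["row1", "col1"]  # single element: 1

# Forms that only work for data frames
b$col1        # column as a vector
b["col1"]     # one-column data frame

# On a matrix, $ is an error and a single name returns NA
matrix_dollar <- try(a$col1, silent = TRUE)
inherits(matrix_dollar, "try-error")   # TRUE
is.na(a["col1"])                       # TRUE

# On a data frame, a single name is looked up among the columns only
df_row_only <- try(b["row1"], silent = TRUE)
inherits(df_row_only, "try-error")     # TRUE
```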
Matrices and data frames: checks and conversions
• Checking if the object is a matrix…
is.matrix(mymatrix)
…or a data frame:
is.data.frame(mydataframe)
• Converting a data frame into a matrix…
as.matrix(mydataframe)
…or vice versa:
as.data.frame(mymatrix)
Lists
• Linear structures.
• A component of a list can be any data structure (matrix, vector, data frame, another list)
• Create a list:
x <- list()   # Empty list that can be filled later on
x <- list(c("fir","sec","th"),   # 1st element of the list
c(0,20,4,2),   # 2nd element of the list
TRUE,   # 3rd element of the list
matrix(c(2,5,4,6,3,7), nrow=2, ncol=3))   # 4th element of the list
• Accessing elements of a list x:
x[[1]]
x[[2]][3]
x[[4]][1,2]
x contains:
[[1]]  "fir", "sec", "th"
[[2]]  0, 20, 4, 2
[[3]]  TRUE
[[4]]  2 4 3
       5 6 7
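A runnable sketch of list creation and access by position, using the list from the slides. The comparison of single and double brackets and the list y are additions for illustration.

```r
x <- list(c("fir", "sec", "th"),
          c(0, 20, 4, 2),
          TRUE,
          matrix(c(2, 5, 4, 6, 3, 7), nrow = 2, ncol = 3))

x[[1]]         # "fir" "sec" "th"
x[[2]][3]      # 4
x[[4]][1, 2]   # 4 (row 1, column 2 of the matrix)

# Single brackets return a shorter list, not the element itself
class(x[1])    # "list"
class(x[[1]])  # "character"

# Filling an empty list later on (y is a hypothetical name)
y <- list()
y[[1]] <- "first"
length(y)      # 1
```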
Naming elements of a list
x <- list(charvector=c("fir","sec","th"), numvector=c(0,20,4,2), oneboolean=TRUE, onematrix=matrix(c(2,5,4,6,3,7), nrow=2, ncol=3))
or
names(x) <- c("charvector","numvector","oneboolean","onematrix")
Accessing elements of list x by name
x[["charvector"]] or x$charvector
x[["numvector"]][3] or x$numvector[3]
x[["onematrix"]][1,2] or x$onematrix[1,2]
x contains:
charvector:  "fir", "sec", "th"
numvector:   0, 20, 4, 2
oneboolean:  TRUE
onematrix:   2 4 3
             5 6 7
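Access by name can be sketched as follows. Double brackets (or $) are needed to reach the element itself; single brackets such as x["numvector"] return a one-element list instead.

```r
x <- list(charvector = c("fir", "sec", "th"),
          numvector  = c(0, 20, 4, 2),
          oneboolean = TRUE,
          onematrix  = matrix(c(2, 5, 4, 6, 3, 7), nrow = 2, ncol = 3))

names(x)
x$numvector[3]            # 4
x[["numvector"]][3]       # 4, same element
x[["onematrix"]][1, 2]    # 4
is.list(x["numvector"])   # TRUE: single brackets keep the list wrapper
```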
Length and dimension of lists
• Lists have a length but only one dimension.
• Elements within a list can have lengths and dimensions, depending on their data structure.
length(x)        returns 4
length(x[[2]])   returns 4
dim(x[[4]])      returns 2 3
Checking and converting data types and structures
Checking             Converting
is.matrix(x)         as.matrix(x)
is.data.frame(x)     as.data.frame(x)
is.vector(x)         as.vector(x)
is.list(x)           as.list(x)
is.factor(x)         as.factor(x)
is.character(x)      as.character(x)
is.numeric(x)        as.numeric(x)
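A few of the conversions in the table, as a minimal sketch. The factor example illustrates a common pitfall that is not on the slide: as.numeric() on a factor returns the internal level codes, not the labels.

```r
as.numeric("12.5")        # 12.5
as.character(10)          # "10"
is.numeric("12.5")        # FALSE: it is still text until converted

# Text that is not a number becomes NA (with a warning)
suppressWarnings(as.numeric("abc"))   # NA

# Converting a factor of numbers: go through character first,
# otherwise the internal level codes are returned
f <- factor(c("10", "20", "10"))
as.numeric(f)                 # 1 2 1
as.numeric(as.character(f))   # 10 20 10
```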
Library/packages
• Packages are collections of R functions, data, and compiled code in a well-defined format.
• The directory where packages are stored is called the library.
Definitions from http://www.statmethods.net/interface/packages.html
Standard packages
• About 25 standard packages are supplied with R by default (examples: base, stats, graphics).
• On May 4th 2016, 8340 packages were available!
• Anyone can contribute to the R package repository
Installing and loading packages
• Packages can be installed from:
– the interactive session:
install.packages("ggplot2")
install.packages("ggplot2", repos="http://cran.r-project.org")
– the terminal:
R CMD INSTALL ggplot2_version.tar.gz
• And loaded:
library("ggplot2")
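One common pattern is to install a package only when it is missing. The helper below, load_or_install, is a hypothetical name and just a sketch; it is called here with "stats", which ships with R, so nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is missing, then load it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = "https://cran.r-project.org")
  }
  library(pkg, character.only = TRUE)
}

load_or_install("stats")   # "stats" ships with R, so nothing is downloaded
```

For a CRAN package the call would look like load_or_install("ggplot2"), which needs an internet connection the first time.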
Listing packages
• Installed packages: library()
acepack         ace() and avas() for selecting regression transformations
affy            Methods for Affymetrix Oligonucleotide Arrays
affyio          Tools for parsing Affymetrix data files
AnnotationDbi   Annotation Database Interface
Biobase         Biobase: Base functions for Bioconductor
BiocGenerics    S4 generic functions for Bioconductor
…
• Loaded packages: search()
[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "package:methods"   "Autoloads"         "package:base"
Listing functions from packages
• ls("package:yourpackage")
• ls("package:VennDiagram")
[1] "add.title"           "adjust.venn"
[3] "calculate.overlap"   "decide.special.case"
[5] "draw.pairwise.venn"  "draw.quad.venn"
[7] "draw.quintuple.venn" "draw.single.venn"
…
Accessing a package's functions
• Usually, calling the function by its name is enough:
add.title()
• If 2 packages have a function with the same name, make sure you are using the right one:
VennDiagram::add.title()
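The package::function form can be tried with the stats package, which is always available. In this sketch a same-named function is defined in the workspace purely to illustrate masking; it is removed again at the end.

```r
# Functions exported by an attached package
head(ls("package:stats"))

# Explicit package::function call
stats::median(c(1, 3, 2))    # 2

# A same-named function in the workspace masks the package version
# (hypothetical function, for illustration only)
median <- function(x) "my own median"
median(c(1, 3, 2))           # "my own median"
stats::median(c(1, 3, 2))    # still 2
rm(median)                   # remove the masking function again
```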
Basic commands
Getting and changing working directory
• getwd()
– Returns the current working directory
• setwd("/home/mydir/")
– Changes the current working directory to "/home/mydir"
Information about objects
• summary(x)
→ for numerical data: min, max, median, etc.
→ for character data stored as factors: count of occurrences of each item
Example data frame x:
col1  col2   col3
1     ok     TRUE
3     yes    FALSE
5     no     FALSE
6     maybe  FALSE
Output of summary(x):
      col1          col2        col3
 Min.   :1.00   maybe:1   Mode :logical
 1st Qu.:2.50   no   :1   FALSE:3
 Median :4.00   ok   :1   TRUE :1
 Mean   :3.75   yes  :1   NA's :0
 3rd Qu.:5.25
 Max.   :6.00
• str(x):
→ Internal structure of an R object
'data.frame': 4 obs. of 3 variables:
 $ col1: num 1 3 5 6
 $ col2: Factor w/ 4 levels "maybe","no","ok",..: 3 4 2 1
 $ col3: logi TRUE FALSE FALSE FALSE
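The example data frame can be rebuilt as in the sketch below. Since R 4.0.0, data.frame() keeps text columns as character by default, so stringsAsFactors = TRUE is set here to obtain the factor column shown in the output above; the exact layout of the printed summary may differ between R versions.

```r
x <- data.frame(col1 = c(1, 3, 5, 6),
                col2 = c("ok", "yes", "no", "maybe"),
                col3 = c(TRUE, FALSE, FALSE, FALSE),
                stringsAsFactors = TRUE)   # text column stored as a factor

summary(x)            # per-column summaries
str(x)                # compact view of the internal structure
mean(x$col1)          # 3.75
levels(x$col2)        # "maybe" "no" "ok" "yes"
as.integer(x$col2)    # 3 4 2 1, the level codes shown by str()
```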
Size/dimensions of objects
number of elements in a vector, factor or list     length(x)
dimensions of a matrix or data frame               dim(x)
number of rows of a matrix or data frame           nrow(x)
number of columns of a matrix or data frame        ncol(x)
Elementary arithmetic operators
addition          +
subtraction       -
division          /
multiplication    *
exponentiation    ^ or **
Logical operators
less than                   <
less than or equal to       <=
greater than                >
greater than or equal to    >=
equality                    ==
inequality                  !=
intersection ("and")        &
union ("or")                |
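The operators above, tried on a few numbers. The vector v is a made-up example; comparisons work element by element and can be used to filter a vector.

```r
7 + 2     # 9
7 - 2     # 5
7 / 2     # 3.5
7 * 2     # 14
2 ^ 3     # 8
2 ** 3    # 8

v <- c(1, 5, 10)    # hypothetical example vector
v > 4               # FALSE TRUE TRUE
v >= 5 & v != 10    # FALSE TRUE FALSE
v < 2 | v == 10     # TRUE FALSE TRUE
v[v > 4]            # 5 10
```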
Obtaining summary statistics
average/mean    mean(x)
median          median(x)
minimum         min(x)
maximum         max(x)
variance        var(x)
correlation     cor(x, y)
Some common arithmetic functions
natural logarithm             log(x)
exponential function e^x      exp(x)
sine                          sin(x)
cosine                        cos(x)
tangent                       tan(x)
absolute value                abs(x)
square root                   sqrt(x)
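A quick sketch of the statistics and arithmetic functions above on a small numeric vector. The second variable y is hypothetical and is only there because cor() needs two variables (or a matrix).

```r
x <- c(0, 2, 1, 6.3, 2.2, 10, 8, 5.4)
y <- 2 * x + 1      # hypothetical second variable, a linear function of x

mean(x)      # 4.3625
median(x)    # 3.8
min(x)       # 0
max(x)       # 10
var(x)       # sample variance
cor(x, y)    # 1, because y is a linear function of x

exp(log(5))  # 5
sqrt(16)     # 4
abs(-3)      # 3
cos(0)       # 1
```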
Objects stored in the global environment
• Listing: ls() or objects()
• Removing one object from the environment: rm(x)
• Removing several objects: rm(x, y)
• Removing all objects from the environment: rm(list=ls())
Reading and writing files
Reading a file into an object
a <- read.table("file.txt")
Writing an object to a file
write.table(a, "file.txt")
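A round trip through a file can be sketched as below. The data frame, the temporary file and the chosen options (tab separator, header, no quotes, no row names) are assumptions for illustration; read.table() and write.table() have many more arguments.

```r
# Hypothetical example data
a <- data.frame(id = c(1, 3, 5), label = c("ok", "yes", "no"))

# Write to a temporary file so that no existing file is overwritten
myfile <- tempfile(fileext = ".txt")
write.table(a, myfile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back; header = TRUE uses the first line as column names
b <- read.table(myfile, header = TRUE, sep = "\t")
b

unlink(myfile)   # delete the temporary file
```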
Saving objects or session
• Save objects x and y into the "myobjects.RData" file:
– save(x, y, file="myobjects.RData")
• Load objects x and y into the current session:
– load("myobjects.RData")
• Save the current workspace (all objects):
– save.image(file=".RData")
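Saving and restoring objects, sketched with a temporary file instead of "myobjects.RData" so that nothing in the working directory is touched.

```r
x <- c(1, 2, 3)
y <- "some text"

myfile <- tempfile(fileext = ".RData")
save(x, y, file = myfile)

rm(x, y)        # the objects are gone from the session...
load(myfile)    # ...and restored under their original names
x
y

unlink(myfile)  # delete the temporary file
```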
Command history
• Last 25 commands:
– history()
• All previous commands:
– history(max.show=Inf)
• Save command history:
– savehistory()
• Load command history:
– loadhistory()
Information about the current session
sessionInfo()
Reports:
• R version
• Platform and OS version
• Packages attached
Functions in R
Knowing the source code of a function
• Type the name of the function without ()
For example:
sort
function (x, decreasing = FALSE, ...)
{
    if (!is.logical(decreasing) || length(decreasing) != 1L)
        stop("'decreasing' must be a length-1 logical vector.\nDid you intend to set 'partial'?")
    UseMethod("sort")
}
User-written functions
• Functions are pieces of code written to carry out one or more specified tasks and to repeat them easily.
• R allows you to create your own functions.
Functions' structure
myfunction <- function(arg1, arg2, ...) {
  commands
  return(something)
}
• myfunction: the function's name
• function: creates the function
• arg1, arg2, ...: myfunction's list of arguments
• commands: what myfunction does
• return(something): value/object myfunction returns
Objects in a function are local to that function!
from https://www.datacamp.com/community/tutorials/functions-in-r-a-tutorial
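Objects in functions
myfunction <- function(arg1) {
  a <- arg1
  return(a + 1)
}
If an object a already exists in the session, the function does not change it:
> a <- 12
> myfunction(10)
[1] 11
> a
[1] 12
If no object a exists in the session, the a created inside the function is not visible outside:
> myfunction(10)
[1] 11
> a
Error: object 'a' not found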
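The first case above as a runnable sketch; the variable result is an addition used only to hold the returned value.

```r
myfunction <- function(arg1) {
  a <- arg1        # this a exists only inside the function
  return(a + 1)
}

a <- 12                    # a in the global environment
result <- myfunction(10)
result                     # 11
a                          # 12: unchanged by the function call
```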
Example of a function
Function "somestats":
• takes as an argument: one numeric vector.
• computes: median and mean of that vector.
• returns: a vector containing these 2 values.
somestats <- function(vector_input) {
  my_mean <- mean(vector_input)
  my_median <- median(vector_input)
  return(c(my_mean, my_median))
}
x <- c(0, 2, 1, 6.3, 2.2, 10, 8, 5.4)
somestats(x)
[1] 4.3625 3.8000
Saving and sourcing functions
Content of the file myfunctions.R:
somestats <- function(vector_input) {
  my_mean <- mean(vector_input)
  my_median <- median(vector_input)
  vector_output <- c(my_mean, my_median)
  return(vector_output)
}
source("myfunctions.R")
→ Content of myfunctions.R is loaded in the user workspace (global environment)
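Sourcing can be tried without creating a file by hand. In this sketch the function definition is written to a temporary script (standing in for myfunctions.R) and then loaded with source().

```r
# Write the function definition to a temporary script file
script <- tempfile(fileext = ".R")
writeLines(c(
  "somestats <- function(vector_input) {",
  "  my_mean <- mean(vector_input)",
  "  my_median <- median(vector_input)",
  "  return(c(my_mean, my_median))",
  "}"
), script)

source(script)   # defines somestats in the global environment
somestats(c(0, 2, 1, 6.3, 2.2, 10, 8, 5.4))   # 4.3625 3.8000

unlink(script)   # delete the temporary file
```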
RStudio
What is RStudio?
• Free and open source IDE (Integrated Development Environment) for R
• Available for Windows, Mac OS and Linux
• Written in C++
• First public beta version released in 2011
• User-friendly environment
Screen: 4 windows
1. Console
2. R script
3. Environment and history
4. Files, plots, packages, help
1. Console
Console: interactive environment.
→ Where the work is done. A very intuitive terminal!
2. R script
• Saved script
• Can be run entirely
• Can be run by line/block
3. Environment and history
Global environment: all objects present in the session
You can also:
• Save the current workspace
• Load a saved workspace
• Import datasets
• List functions available per package
History: commands used during the session. Can be run again, copied into a file, etc.
4. Plot area
Displays the graphs
You can:
• Go back to the previous plot
• Zoom
• Export into pdf, jpeg, png, tiff, etc.
4. Files
Directories and files in the current working directory
4. Packages
List of packages:
• Available
• Loaded in current session (ticked)
You can:
• Install new packages
• Update packages to newest versions
4. Help
Help tab:
• R and RStudio documentation
• Packages and functions help pages displayed here when requested
RMD: R Markdown
• Format for writing reproducible, dynamic reports with R.
• Work directly inserted into formatted documents (HTML, PDF and Word)
• Easy to use!
.Rmd extension files
```{r}
YOUR CODE
```
Chunk options:
• echo=TRUE, eval=TRUE: code + output in report
• echo=FALSE, eval=TRUE: only output in report
knitr package: dynamic report generation in R
Knit to HTML → Creates a .html file
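To see what a minimal .Rmd file looks like, the sketch below writes one from R. The file name report.Rmd, its title and its content are made up for illustration. Rendering it outside RStudio would need the rmarkdown package, so that call is left as a comment.

```r
# Three backticks, built programmatically so they can be written to the file
fence <- paste(rep("`", 3), collapse = "")

rmd_file <- file.path(tempdir(), "report.Rmd")   # hypothetical file name
writeLines(c(
  "---",
  "title: \"My report\"",
  "output: html_document",
  "---",
  "",
  "Mean of 1 to 10:",
  "",
  paste0(fence, "{r, echo=FALSE}"),   # only the output appears in the report
  "mean(1:10)",
  fence
), rmd_file)

file.exists(rmd_file)   # TRUE

# If the rmarkdown package is installed, this would create report.html:
# rmarkdown::render(rmd_file)
```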
Useful resources
• Quick R: http://www.statmethods.net/
• R bloggers: http://www.r-bloggers.com/
• Cookbook for R: http://www.cookbook-r.com/
• R forge forum: https://r-forge.r-project.org/forum/forum.php?forum_id=78
• R help: http://r.789695.n4.nabble.com/R-help-f789696.html