Introduction to R for Data Science
Lecturers
dipl. ing Branko Kovač
Data Analyst at CUBE / Data Science Mentor at Springboard
Data Science zajednica Srbije
dr Goran S. Milovanović
Data Scientist at DiploFoundation
Data Science zajednica Srbije
Lists in R
• Lists can contain elements (objects) of various types/classes
• Lists can be recursive: a list of lists
• In R we use lists a lot; however, computing over lists is seldom the most efficient way
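The points above can be illustrated with a small sketch. The object names (`person`, `nested`) and their values are invented for this illustration and do not come from the session script.

```r
# Hypothetical example objects, not taken from the original slides
person <- list(name = "Ana", age = 31, scores = c(9.5, 8.0, 10))

# Each element keeps its own type
sapply(person, class)

# A recursive list: a list whose elements are lists themselves
nested <- list(first = person, second = list(name = "Marko", age = 28))
nested$second$name   # reach into the inner list
length(nested)       # counts top-level elements only
```

Note that `length()` reports only the top-level elements, no matter how deep the nesting goes.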
Intro to R for Data Science
Session 2: Lists & Functions
# Introduction to R for Data Science
# SESSION 3 :: 12 May, 2016
# It's time to speak about lists
num_vct <- c(2:5) # just another num vector
chr_vct <- c("data", "science") # char vector
data_frame <- data.frame(x = c("a", "b", "c", "d"), y = c(1:4)) # simple df
lista <- list(data_frame, num_vct, chr_vct) # and this is a list
lista # this is our list
Lists in R
• Subsetting lists
• Think of an element (a node) of a list as a “container” which is always a list itself
• Subsetting with [[ ]] and [ ] – careful! [ ] returns a list, while [[ ]] returns the element itself
str(lista) # about a list
length(lista)
as.list(chr_vct) # another way to create a list
# Lists manipulation
names(lista) <- c("data", "numbers", "words")
lista[3] # 3rd element?
lista[[3]] # 3rd element?
is.list(lista[3]) # is this a list?
is.list(lista[[3]]) # and this?
class(lista[[3]]) # also a list? Don't be so sure!
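A compact way to see the difference between the two subsetting operators is sketched here with a made-up list `box` (an illustrative name, not one from the session script):

```r
# Hypothetical list used only for this illustration
box <- list(words = c("data", "science"), numbers = 2:5)

one <- box["words"]     # single brackets: a list of length 1
two <- box[["words"]]   # double brackets: the character vector inside

class(one)              # "list"
class(two)              # "character"

# Single brackets can select several elements at once
box[c("words", "numbers")]
```

In short, `[ ]` hands back the container, while `[[ ]]` hands back what is inside it.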
Lists in R
• More subsetting
• Adding and removing a node
• unlist()
lista$words # we can also extract an element this way
lista[["words"]] # or even like this
lista[["words"]][1] # digging even deeper
lista$new_elem <- c(TRUE, FALSE, FALSE, TRUE) # add new element
length(lista) # now list has 4 elements
lista$new_elem <- NULL # but we can remove it easily
new_vect <- unlist(lista) # creating a vector from list
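One thing worth knowing about unlist() is that the result is an atomic vector, so elements of different types are coerced to a common type. A minimal sketch, using a hypothetical list `mixed` that is not part of the session script:

```r
# Hypothetical list mixing numbers and strings
mixed <- list(numbers = c(2, 3), words = c("data", "science"))

flat <- unlist(mixed)
class(flat)          # "character": the numbers were converted to strings
names(flat)          # names are built from the list names plus a position index
flat[["numbers1"]]   # "2"
```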
Functions in R
# Functions
# (w. less formalism but tips & tricks added)
# elementary: a definition
fun <- function(x) x+10
fun(5)
# taking two arguments
fun2 <- function(x,y) x+y
fun2(3,4)
# using "{" and "}" to enclose multiple R
# expressions in the function body
fun <- function(x,y) {
  a <- sum(x)
  b <- sum(y)
  a-b
}
r <- c(5,4,3)
q <- c(1,1,1)
fun(r,q)
fun(c(5,4,3),c(1,1,1))
# NOTE: "{" and "}" are generally used in R
# to mark the beginning and the end of a block
# a function is a function:
is.function(fun)
is.function(log) # log is built-in
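As a small sketch of how a braced function body behaves: the value of the last evaluated expression is what the function returns, and return() can make that explicit. The function names below are made up for illustration.

```r
# Hypothetical helper: difference between the sums of two vectors
diff_of_sums <- function(x, y) {
  a <- sum(x)
  b <- sum(y)
  a - b              # last expression: this is the returned value
}

# Same logic with an explicit return()
diff_of_sums2 <- function(x, y) {
  return(sum(x) - sum(y))
}

diff_of_sums(c(5, 4, 3), c(1, 1, 1))    # 9
diff_of_sums2(c(5, 4, 3), c(1, 1, 1))   # 9
```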
# Functional programming ("Everything is a function...")
"^"(2,2)
"^"(2,3)
# magic! - how do you do that?
2^2
2^3
# the difference between "operators" and "functions" in R: none
# Everything is a function:
"+"(2,2) # Four?
2+2 # yeah, right
# - Oh but I love this
"-"("+"(3,5),2)
"&"(">"(2,2),T)
"&"(">"(3,2),T)
# punishment: write all your lab code for this week in this fashion...
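Because operators are ordinary functions, they can also be passed to other functions. The following is only an illustrative sketch; the values are arbitrary and the calls used (Reduce(), sapply()) are standard base R functions.

```r
# Backticks work as well as quotes for calling an operator by name
`+`(2, 2)                    # 4

# Because "+" is a function, it can be handed to other functions
Reduce(`+`, c(1, 2, 3, 4))   # 10
sapply(c(1, 2, 3), `^`, 2)   # 1 4 9

# Even subsetting is a function call
`[`(c(10, 20, 30), 2)        # 20
```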
Functions in R
• Functional programming
# Step 1: here's a list:
aList <- list(c(1,2,3), c(4,5,6), c(7,8,9), c(10,11,12))
# Step 2: I want to apply the following function:
myFun <- function(x) {x[1]+x[2]-x[3]}
# to all elements of the aList list, and get the result as a list again.
# Here it is:
res <- lapply(aList, function(x) {x[1]+x[2]-x[3]})
unlist(res) # to get a vector
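The same computation can also be written by passing the named function directly, and base R offers sapply() and vapply() as alternatives that return a vector without a separate unlist() step. This is a sketch of those alternatives rather than part of the original script; the names `res_list`, `res_vec` and `res_safe` are made up here.

```r
# Same data and function as in the step-by-step example
aList <- list(c(1,2,3), c(4,5,6), c(7,8,9), c(10,11,12))
myFun <- function(x) { x[1] + x[2] - x[3] }

res_list <- lapply(aList, myFun)               # always a list
res_vec  <- sapply(aList, myFun)               # simplified to a numeric vector
res_safe <- vapply(aList, myFun, numeric(1))   # like sapply, but the result type is checked

res_vec   # 0 3 6 9
```

vapply() is the stricter choice: it stops with an error if any call returns something other than a single number.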
Lists and Functions in R
• Two things that come in handy: lapply() and apply()
# Now say I've got a matrix
myMat <- matrix(c(1,2,3,4,5,6,7,8,9), nrow=3, ncol=3)
# now, I want the sums of all rows:
rsMyMat <- apply(myMat, 1, function(x) {sum(x)})
rsMyMat
is.list(rsMyMat) # just beautiful
# for columns:
csMyMat <- apply(myMat, 2, function(x) {sum(x)})
Lists and Functions in R
• Two things that come in handy: lapply() and apply()
# with existing functions, such as sum(), this will do:
rsMyMat1 <- apply(myMat, 1, sum)
rsMyMat1
csMyMat1 <- apply(myMat, 2, sum)
csMyMat1
# try also...
rowSums(myMat)
colSums(myMat)
Lists and Functions in R
• Two things that come in handy: lapply() and apply()
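To make the role of the second argument of apply() concrete, here is a short sketch on the same 3 x 3 matrix. The calls with max() and range() are additions for illustration and are not part of the session script.

```r
myMat <- matrix(c(1,2,3,4,5,6,7,8,9), nrow = 3, ncol = 3)  # filled column by column

apply(myMat, 1, sum)     # MARGIN = 1, rows:    12 15 18
apply(myMat, 2, sum)     # MARGIN = 2, columns:  6 15 24

# apply() is not limited to sum(); any function of a vector works
apply(myMat, 1, max)     # largest value in each row: 7 8 9
apply(myMat, 2, range)   # a 2 x 3 matrix: min and max of each column
```

When the applied function returns more than one value per row or column, as range() does, apply() collects the results into a matrix instead of a vector.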