purrr DRAFT

purrr - files.speakerdeck.com… · blah blah blah ok I admit it: FP not actually front of mind when I use purrr. what does purrr help me do? iterate in a data-structure-informed

Page 1

purrr

DRAFT

Page 2

DRAFT

these are not slides from a talk!

I refer to them before and during live coding while teaching STAT 545 and DSCI 523

don’t expect them to stand on their own

more material developing here:

https://jennybc.github.io/purrr-tutorial/index.html

Page 3

what is purrr?

functional programming

blah blah blah

ok I admit it:

FP not actually front of mind when I use purrr

Page 4

what does purrr help me do?

iterate in a data-structure-informed way

tolerate list-columns in data frames

with consistent UI across a large family of fxns

and return values that are ready for further computation

Page 5

for every X

do Y

return combined results like Z

Page 6

for every X

do Y

return combined results like Z

X and Z will make reference to actual R data structures

Y will be a function, possibly anonymous

like for i in 1 to n … but much higher level
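A minimal sketch of the "for every X, do Y, return Z" pattern, set beside the explicit loop it replaces. The list `word_sets` is invented purely for illustration.

```r
library(purrr)

# X: a named list of character vectors (made-up data)
word_sets <- list(a = c("x", "y"), b = "z", c = character(0))

# the explicit loop: allocate the result, then fill it in by index
n_loop <- integer(length(word_sets))
for (i in seq_along(word_sets)) {
  n_loop[i] <- length(word_sets[[i]])
}
names(n_loop) <- names(word_sets)

# the higher-level version: for every element (X), do length() (Y),
# give me an integer vector (Z)
n_map <- map_int(word_sets, length)
```

Both versions produce the same named integer vector; the purrr call just says what is wanted instead of how to fill it in.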

Page 7

iterate in a data-structure-informed way

for every GitHub username

do GET https://api.github.com/users/username

and give me HTTP responses in a list

https://jennybc.github.io/purrr-tutorial/ex03_github-api-json.html

Page 8

iterate in a data-structure-informed way

for every HTTP response

extract the “name” element

and give me a character vector

https://jennybc.github.io/purrr-tutorial/ex03_github-api-json.html
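A sketch of the "extract the name element, give me a character vector" step. To stay runnable offline it uses a hand-made list standing in for parsed API responses; the `users` object and every value in it are assumptions, not real GitHub data.

```r
library(purrr)

# stand-in for parsed responses from the GitHub users endpoint
# (invented values, no network call is made)
users <- list(
  list(login = "user_a", name = "Ada Example", id = 1L, location = "Vancouver"),
  list(login = "user_b", name = "Bo Example",  id = 2L, location = "Toronto")
)

# a string as .f is shorthand for "extract the element with this name"
user_names <- map_chr(users, "name")
```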

Page 9: purrr - files.speakerdeck.com€¦ · blah blah blah ok I admit it: FP not actually front of mind when I use purrr. what does purrr help me do? iterate in a data-structure-informed

iterate in a data-structure-informed way

for every HTTP response

extract the elements "login", "name", "id", "location"

and give me a data frame

https://jennybc.github.io/purrr-tutorial/ex03_github-api-json.html
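One way the "give me a data frame" step might look, again on an invented stand-in for the parsed responses. This sketch converts each response to a one-row tibble and stacks the rows; the linked tutorial may do it differently.

```r
library(purrr)
library(dplyr)
library(tibble)

# stand-in for parsed API responses (invented values)
users <- list(
  list(login = "user_a", name = "Ada Example", id = 1L, location = "Vancouver"),
  list(login = "user_b", name = "Bo Example",  id = 2L, location = "Toronto")
)

wanted <- c("login", "name", "id", "location")

# for every response: keep the wanted elements as a one-row tibble,
# then stack the rows into a single data frame
user_df <- users %>%
  map(~ as_tibble(.x[wanted])) %>%
  bind_rows()
```

This assumes every response has all four elements with length one; real API data often has missing values that need handling first.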

Page 10

iterate in a data-structure-informed way

for every row in a data frame

create a MIME object

and give me a list

https://jennybc.github.io/purrr-tutorial/ex20_bulk-gmail.html

Page 11

iterate in a data-structure-informed way

for every MIME object

send an email

and return send status as a list

https://jennybc.github.io/purrr-tutorial/ex20_bulk-gmail.html

Page 12

iterate in a data-structure-informed way

for every tuple (string, pos of substring starts, pos of substring ends)

extract the substrings

and give me a list of character vectors

https://jennybc.github.io/purrr-tutorial/ex10_trump-tweets.html

Page 13

inspect

query

modify

Page 14

inspect

str()

str(my_list, max.level = 1)

str(my_list[[i]], list.len = 10)

listviewer::jsonedit()
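A small sketch of those inspection calls on a nested list. `my_list` and its contents are invented; `listviewer::jsonedit()` is left out because it opens an interactive widget.

```r
# a small nested list standing in for parsed JSON (invented values)
my_list <- list(
  list(login = "user_a", id = 1L, repos = list("r1", "r2", "r3")),
  list(login = "user_b", id = 2L, repos = list("r4"))
)

# top level only: how many elements, and what is each one?
str(my_list, max.level = 1)

# drill into one element, capping how many components are displayed
i <- 1
str(my_list[[i]], list.len = 2)

# keep the top-level summary as text, e.g. for a quick check
top_level <- capture.output(str(my_list, max.level = 1))
```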

Page 15

map(.x, .f, ...)

Page 16

map(.x, .f, ...)

.x is a vector

“for every X” = for every element of .x

remember lists are vectors

remember data frames are lists

Page 17

map(.x, .f, ...)

.f is a function

possibly specified with shortcuts

all shown in the worked examples

“do Y” = .f(.x[[i]], …)
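A sketch of the different ways `.f` can be written. The lists `x` and `nums` are invented; the shortcuts shown (formula, string, integer position) are the ones I understand purrr to support.

```r
library(purrr)

x    <- list(list(a = 1, b = 2), list(a = 10, b = 20))
nums <- list(c(1, NA, 3), c(4, 5))

# .f as an existing function; extra arguments travel through ...
means <- map_dbl(nums, mean, na.rm = TRUE)

# .f as an anonymous function
by_fun <- map_dbl(x, function(el) el[["a"]] + el[["b"]])

# .f as a formula: .x refers to the current element
by_formula <- map_dbl(x, ~ .x[["a"]] + .x[["b"]])

# .f as a string or a position: extract by name or by index
by_name <- map_dbl(x, "a")
by_pos  <- map_dbl(x, 2)
```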

Page 18

“give me a Z”

map(.x, .f, …) can be thought of as map_list(.x, .f, …)

Page 19

“give me a Z”

map_lgl(.x, .f, ...)

map_chr(.x, .f, ...)

map_int(.x, .f, ...)

map_dbl(.x, .f, …)

return an atomic vector of requested type
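A sketch of how the suffix picks Z, using a made-up list `x`. The last call illustrates the type-safety point: when `.f` returns the wrong type, the typed variant fails instead of silently coercing.

```r
library(purrr)

x <- list(1:3, 4:6, 7:10)

as_list <- map(x, length)                          # always a list
as_int  <- map_int(x, length)                      # integer vector
as_dbl  <- map_dbl(x, mean)                        # double vector
as_lgl  <- map_lgl(x, ~ any(.x > 5))               # logical vector
as_chr  <- map_chr(x, ~ paste(.x, collapse = ""))  # character vector

# asking for integers from a function that returns strings is an error
bad <- tryCatch(map_int(x, ~ "oops"), error = function(e) "error")
```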

Page 20

“give me a Z”

map_df(.x, .f, ..., .id = NULL)

basically: map() then dplyr::bind_rows()
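A sketch of the "map() then bind_rows()" equivalence, using the built-in `mtcars` data; the helper `summarise_piece()` is hypothetical. As far as I know, recent purrr releases mark `map_df()` as superseded, so the two-step spelling is the safer one to lean on.

```r
library(purrr)
library(dplyr)

# X: a named list of data frames (built-in data, split by cylinder count)
by_cyl <- split(mtcars, mtcars$cyl)

# Y: summarise each piece as a one-row data frame (hypothetical helper)
summarise_piece <- function(df) {
  data.frame(n = nrow(df), mean_mpg = mean(df$mpg))
}

# two steps: map() gives a list, bind_rows() stacks it;
# .id stores the list names in a column
two_step <- by_cyl %>%
  map(summarise_piece) %>%
  bind_rows(.id = "cyl")

# one step, as on the slide
one_step <- map_df(by_cyl, summarise_piece, .id = "cyl")
```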

Page 21

“give me a Z”

walk(.x, .f, …)

can be thought of as map_nothing(.x, .f, …)
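A sketch of `walk()` for side effects. The function `shout()` is made up; note that `map_nothing()` is only a mental model, not a real function.

```r
library(purrr)

# called for its side effect (printing), not for its return value
shout <- function(x) cat("processing", x, "\n")

# walk() calls shout() on every element and returns .x invisibly,
# so it can sit in the middle of a pipeline
out <- walk(c("a", "b", "c"), shout)
```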

Page 22

“for every X”

map2(.x, .y, .f, …)

X = (element i of .x, element i of .y)

pmap(.l, .f, …)

X = tuple of the i-th elements of the lists in .l

remember a data frame is a list!
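A sketch of parallel iteration with invented inputs. The `pmap_chr()` call passes a data frame as `.l`, which works because a data frame is a list of equal-length columns; the column names are matched to the argument names of `.f`.

```r
library(purrr)

# map2(): walk two vectors in parallel
greetings <- map2_chr(
  c("hello", "goodbye"), c("Ann", "Bob"),
  ~ paste(.x, .y)
)

# pmap(): walk any number of vectors in parallel, here the columns
# of a data frame, so each "X" is one row
args_df <- data.frame(
  x     = c("a", "b"),
  times = c(2, 3),
  stringsAsFactors = FALSE
)
repeated <- pmap_chr(args_df, function(x, times) strrep(x, times))
```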

Page 23

how might you be doing such things today?

maybe you don’t, because you don’t know how 😔

for loops

apply(), [slvmt]apply(), split(), by()

the plyr package: [adl][adl_]ply()

with dplyr: df %>% group_by() %>% do()

Page 24

this is not my first R rodeo

I have gone through intense, evangelical phases of iterating with base “apply” functions and plyr

I highly recommend you give purrr a try

Page 25

relationship to base R approaches

there’s nothing you can do with purrr that you cannot do with base

specifically: map() is basically lapply()

main reasons to use purrr:

- shortcuts facilitate anonymous functions for .f

- greater encouragement for type-safety

- consistent API across large family of functions
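A sketch comparing base and purrr on a made-up list `x`, touching each of the three reasons. The empty-input case is one concrete example of the type-safety point.

```r
library(purrr)

x <- list(a = 1:3, b = 4:6)

# map() is basically lapply()
same <- identical(map(x, rev), lapply(x, rev))

# shortcut versus spelled-out anonymous function
m1 <- map_dbl(x, ~ sum(.x^2))
m2 <- vapply(x, function(v) sum(v^2), numeric(1))

# type-safety: sapply() changes its return type with the input,
# map_int() returns an integer vector or fails
s_empty <- sapply(list(), length)    # an empty list
m_empty <- map_int(list(), length)   # an empty integer vector
```

`vapply()` gives the same guarantee in base R; the purrr argument is mostly that the typed variants are shorter and named consistently.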

Page 26

tolerate list-columns in data frames

tidyverse lifestyle ~ work in a data frame when possible

what about stuff that can’t be stored as an atomic vector?

- stick it in a list-column

but list-columns are awful!

- get better at inspecting lists

- get better at computing on lists

use purrr::map() and friends

- probably inside dplyr::mutate()
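A sketch of computing on a list-column inside `mutate()`. The tibble `df` and its `stuff` column are invented for illustration.

```r
library(purrr)
library(dplyr)
library(tibble)

# a data frame with a list-column holding things of mixed type and length
df <- tibble(
  id    = c("a", "b", "c"),
  stuff = list(1:3, letters[1:5], NULL)
)

# each map_*() call turns the list-column into an ordinary atomic column
df <- df %>%
  mutate(
    n    = map_int(stuff, length),
    type = map_chr(stuff, ~ class(.x)[1])
  )
```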

Page 27

tolerate list-columns in data frames

tidyverse lifestyle ~ work in a data frame when possible

ok there’s a whole section I want to write here, with more worked examples on the site, etc.

but that’s not happening this round

what follows are a few hints of what I will say

Page 28

every time someone asks:

how can I iterate over a list, but also access the index i or the list names at the same time?

they should probably be working inside a data frame, with a list column and a variable for i or the names

use tibble::enframe() on your vexing_list and have at it with mutate(new_var = map_*(vexing_list, f)) or map2() or pmap()
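A sketch of the enframe approach on an invented `vexing_list`. By default `enframe()` produces columns called `name` and `value`, so the list ends up in `value`; the added columns `i`, `total` and `label` are made up for the example.

```r
library(purrr)
library(dplyr)
library(tibble)

vexing_list <- list(alpha = 1:3, beta = 10:11, gamma = 5L)

df <- enframe(vexing_list) %>%   # columns: name, value (a list-column)
  mutate(
    i     = row_number(),                                # the index
    total = map_int(value, sum),                         # compute on the list
    label = map2_chr(name, total, ~ paste0(.x, "=", .y)) # name and result together
  )
```

Once the names and the index are ordinary columns, there is nothing special about using them alongside the list elements.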

Page 29

Great example is Gapminder

draw on

http://r4ds.had.co.nz/many-models.html

and

STAT 545 Gapminder materials (translate from plyr and dplyr)

natural to nest at country level and put data in list-column

fit models, etc. by mutating the data list-column

extract model summaries by mutating the fits w broom fxns
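A sketch of the nest, fit, extract workflow. To stay self-contained it swaps in the built-in `mtcars` data, nested by `cyl`, where the slide has Gapminder nested by country, and it pulls one coefficient with `coef()` where the slide mentions broom functions.

```r
library(dplyr)
library(tidyr)
library(purrr)

fits <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%   # one row per group; `data` is a list-column of data frames
  mutate(
    # fit a model per group by mutating the data list-column
    fit   = map(data, ~ lm(mpg ~ wt, data = .x)),
    # extract a summary by mutating the list-column of fits
    slope = map_dbl(fit, ~ coef(.x)[["wt"]])
  ) %>%
  ungroup()
```

The result has one row per group, with the data, the fitted model and the slope side by side.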

Page 30

more far out example is

https://jennybc.github.io/purrr-tutorial/ex24_xml-wrangling.html

where I put XML nodesets in a data frame

each row is one row of a Google Sheet

I proceed to wrangle it on the way to get cell contents

Page 31

also, just to be clear:

no one in their right mind enjoys having list-columns in a data frame

but the benefits often outweigh the costs

especially if you have the right tools and a productive mindset

it’s always a temporary state

goal is always to get back to something simpler

Page 32

ok this is where things just peter out 😬

and we go back to live coding

Page 33

tweets

My economic policy speech will be carried live at 12:15 P.M. Enjoy!

Join me in Fayetteville, North Carolina tomorrow evening at 6pm. Tickets now available at: https://t.co/Z80d4MYIg8

The media is going crazy. They totally distort so many things on purpose. Crimea, nuclear, "the baby" and so much more. Very dishonest!

I see where Mayor Stephanie Rawlings-Blake of Baltimore is pushing Crooked hard. Look at the job she has done in Baltimore. She is a joke!

Bernie Sanders started off strong, but with the selection of Kaine for V.P., is ending really weak. So much for a movement! TOTAL DISRESPECT

Crooked Hillary Clinton is unfit to serve as President of the U.S. Her temperament is weak and her opponents are strong. BAD JUDGEMENT!

The Cruz-Kasich pact is under great strain. This joke of a deal is falling apart, not being honored and almost dead. Very dumb!

match_first

[[1]] [1] -1
[[2]] [1] -1
[[3]] [1] 20
[[4]] [1] 134
[[5]] [1] 28 95
[[6]] [1] 87 114
[[7]] [1] 50 112 123

match_last

[[1]] [1] -3
[[2]] [1] -3
[[3]] [1] 24
[[4]] [1] 137
[[5]] [1] 33 98
[[6]] [1] 90 119
[[7]] [1] 53 115 126

substring(text, first, last)

pmap(list(text = tweets, first = match_first, last = match_last), substring)

https://jennybc.github.io/purrr-tutorial/ex10_trump-tweets.html
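A sketch of how the three inputs to that `pmap()` call might be built and used. It runs on three short strings, not the full tweets, and the pattern `regex` is a shortened assumption, so the positions differ from the ones shown above.

```r
library(purrr)

# short stand-ins for the tweets (shortened or invented text)
tweets <- c(
  "no match in this one",
  "The media is going crazy.",
  "started off strong, ending really weak"
)
regex <- "crazy|weak|strong"   # assumed pattern, for illustration only

# gregexpr() gives match starts, with match lengths as an attribute
matches      <- gregexpr(regex, tweets)
match_first  <- map(matches, as.vector)
match_length <- map(matches, attr, which = "match.length")
match_last   <- map2(match_first, match_length, ~ .x + .y - 1)

# for every tuple (text, first, last): extract the substrings
hits <- pmap(
  list(text = tweets, first = match_first, last = match_last),
  substring
)
```

The list names `text`, `first` and `last` match the arguments of `substring()`, which is why the function can be passed as is. A tweet with no match gets first = -1 and last = -3, and `substring()` returns an empty string for it.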