for() loop in R to handle repeated tasks
map() family of functions in the purrr package to handle repeated tasks
map() family of functions for repeated tasks
We just learned the rule of “don’t repeat yourself more than two times” and to instead automate our procedures with functions.
We previously used across() to help eliminate copy-paste when working with data frames. This is a form of iteration in programming: across() “iterates” over variables, applying a function to manipulate one variable and then doing the same for the next variable.
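As a reminder of what that iteration looks like, here is a minimal sketch, assuming the dplyr package is installed. The data frame scores and its columns are hypothetical, invented only for illustration.

```r
library(dplyr)

# Hypothetical toy data frame, invented for illustration
scores <- data.frame(
  id   = 1:3,
  quiz = c(8, 9, 7),
  exam = c(80, 95, 70)
)

# across() applies the same function to each selected column in turn,
# so we write the rescaling step once instead of once per column
scaled <- mutate(
  scores,
  across(c(quiz, exam), function(col) col / max(col))
)
scaled
```

The id column is left alone because it was not selected inside across().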
This week, we are adding to our toolbox of ways to do efficient iteration rather than repeating code.
while() and for() loops are a common form of iteration that can be extremely useful when logically thinking through a problem. If you are unfamiliar with loops or have not seen them in R, read the R4DS section linked below.
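For a quick picture before turning to the reading, here is a small sketch of both loop types in base R. The variable names are made up for illustration.

```r
# A for() loop runs its body once for each element of a vector
squares <- numeric(5)                # pre-allocate space for the result
for (i in seq_along(squares)) {
  squares[i] <- i^2
}
squares   # 1 4 9 16 25

# A while() loop keeps running until its condition becomes FALSE
total <- 0
n <- 0
while (total < 20) {
  n <- n + 1
  total <- total + n                 # running sum 1 + 2 + ... + n
}
n         # 6, because 1 + 2 + ... + 6 = 21 is the first sum to reach 20
```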
Compared with some other programming languages, loops in R can be computationally expensive, particularly when they repeat work that a vectorized function could do in a single call. Thus, we will avoid them in this course whenever a vectorized or functional alternative is available!
As we discussed at the beginning of the quarter, one of the beautiful things about R is that many functions are vectorized. This means that these functions are built to work with vectors and, specifically, to apply some operation to each element of a vector separately. This is actually a form of iteration!
x <- c(10, 34, 3)
y <- c(3, 35, 1)
x + y
[1] 13 69 4
Since addition is vectorized, we don’t have to use a for loop to add each element of the two vectors x and y together.
In languages that don’t have implicit support for vectorized computations, you would instead have to do:
result <- rep(NA, 3)
for (i in 1:3) {
  result[i] <- x[i] + y[i]
}
result
[1] 13 69 4
In other words, we would map the function + to each pair of entries of x and y. For atomic vectors, if a function is vectorized, it will do this automatically!
But what if we want to map a function to each element of a list? Or what if a function isn’t vectorized? This is where the map family of the purrr package comes in.
purrr
The purrr package in R provides functions that allow us to apply some task (function) to all elements of a list. This supports very computationally efficient iteration!
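To make this concrete, here is a minimal sketch, assuming the purrr package is installed. The list measurements is hypothetical, invented only for illustration.

```r
library(purrr)

# Hypothetical list of numeric vectors with different lengths
measurements <- list(
  a = c(2, 4, 6),
  b = c(10, 20),
  c = 5
)

# map() applies mean() to each element in turn and always returns a list,
# keeping the names of the input
means <- map(measurements, mean)
means
```

A single call to mean(measurements) would not work here, because mean() expects one numeric vector rather than a list of them.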
Note that there are base functions in R that solve similar problems (apply(), lapply(), tapply(), etc.), but purrr is much easier to use and has more consistent behavior. For that reason, we will not be working with the apply family of functions in this course. I will also say that, having learned the map family of functions in purrr myself, I would not go back to apply: the map functions are much better!
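One way to see what "more consistent behavior" can mean is the following sketch, assuming purrr is installed. The list words is hypothetical, and this is only one illustration of the difference, not a full comparison.

```r
library(purrr)

# Hypothetical input, invented for illustration
words <- list("apple", "fig", "banana")

# lapply() and map() agree here: both return a list of results
same_result <- identical(lapply(words, nchar), map(words, nchar))

# sapply() chooses its return type based on the input, so an empty
# input comes back as an empty list instead of an integer vector
empty_sapply <- sapply(list(), nchar)

# map_int() promises an integer vector no matter what the input is
empty_map <- map_int(list(), nchar)
counts <- map_int(words, nchar)
counts   # 5 3 6
```

With the map functions, the name of the function tells you the type of the result before you run the code.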
If you feel a bit shaky on lists, please read the review below before continuing.
A list is a 1-dimensional data structure that has no restrictions on what type of content is stored within it. A list is a “vector”, but it is not an atomic vector - that is, it does not necessarily contain things that are all the same type.
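The difference between an atomic vector and a list can be sketched in base R as follows. The object names are made up for illustration.

```r
# c() builds an atomic vector, so mixed inputs are coerced to one type
atomic <- c(1, "a", TRUE)
atomic          # "1" "a" "TRUE"
class(atomic)   # "character"

# list() keeps each element's own type
mixed <- list(1, "a", TRUE)
class(mixed[[1]])   # "numeric"
class(mixed[[2]])   # "character"
class(mixed[[3]])   # "logical"

# A list still counts as a vector, just not an atomic one
is.vector(mixed)    # TRUE
is.atomic(mixed)    # FALSE
```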
mylist <- list(
  logicals = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  numeric_vec = 1:12,
  third_thing = letters[1:2]
)
mylist
$logicals
[1] TRUE TRUE FALSE FALSE TRUE
$numeric_vec
[1] 1 2 3 4 5 6 7 8 9 10 11 12
$third_thing
[1] "a" "b"
List components may have names (or not), be homogeneous (or not), have the same length (or not).
Indexing necessarily differs between R and Python, and since the list types are also somewhat different (e.g. lists cannot be named in Python), we will treat list indexing in the two languages separately.

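Suppose we have a list named pepper.
When we index with single brackets, as in pepper[1], the return value is always a list containing the selected element(s).
When we index with double brackets, as in pepper[[1]], the return value is the selected element.
To access the contents of that element, we index twice, as in pepper[[1]][[1]].
There are 3 ways to index a list: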
With single brackets:
mylist[1]
$logicals
[1] TRUE TRUE FALSE FALSE TRUE
mylist[2]
$numeric_vec
[1] 1 2 3 4 5 6 7 8 9 10 11 12
mylist[c(TRUE, FALSE, TRUE)]
$logicals
[1] TRUE TRUE FALSE FALSE TRUE
$third_thing
[1] "a" "b"
With double brackets:
mylist[[1]]
[1] TRUE TRUE FALSE FALSE TRUE
mylist[["third_thing"]]
[1] "a" "b"
With the $ operator, as in x$name. This is equivalent to using x[["name"]]. Note that this does not work on unnamed entries in the list.
mylist$third_thing
[1] "a" "b"
To access the contents of a list object, we have to use double-indexing:
mylist[["third_thing"]][[1]]
[1] "a"
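A quick way to check which kind of object each indexing style returns is to look at its class. This sketch uses only base R and rebuilds mylist so that it runs on its own; the names as_list and as_element are made up for illustration.

```r
mylist <- list(
  logicals = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  numeric_vec = 1:12,
  third_thing = letters[1:2]
)

as_list    <- mylist["third_thing"]     # single brackets: a list of length 1
as_element <- mylist[["third_thing"]]   # double brackets: the vector itself

class(as_list)      # "list"
class(as_element)   # "character"

# $ and [[ ]] with a name give the same result
identical(as_element, mylist$third_thing)   # TRUE
```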
The map family of functions is so called because all of the function names start with the word map. We can think of these functions as “mapping” a function to all elements of a list. You will learn more about these functions in the reading.
map() family
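As a preview of the reading, here is a minimal sketch of a few members of the family, assuming purrr is installed. The variants shown (map_int(), map_chr(), map_lgl()) are not named in the text above; they are included only as examples, and mylist is rebuilt so the block runs on its own.

```r
library(purrr)

mylist <- list(
  logicals = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  numeric_vec = 1:12,
  third_thing = letters[1:2]
)

# map() always returns a list
as_list <- map(mylist, length)

# The suffixed variants return an atomic vector of the named type
n_elements <- map_int(mylist, length)      # integer vector: 5 12 2
types      <- map_chr(mylist, class)       # "logical" "integer" "character"
is_num     <- map_lgl(mylist, is.numeric)  # FALSE TRUE FALSE
```

Each result keeps the names of mylist, so n_elements[["numeric_vec"]] is 12.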