Week 4 Starter Notes

Download .qmd

Warning

Download the data that we will be using this week at the followings links and add them to a data/ directory

IMDb data

Rainfall data

Eras Tour data

Data

Once in R, data frames can be saves as .Rdata files using the syntax:

save(data_frame1, data_frame2, ..., file = "path/file-name.Rdata")

and then loaded into R using the syntax:

load("path/file-name.Rdata")

This can be preferable when saving intermediate datasets in an analysis because .Rdata files are much smaller and more memory efficient than .csv files. Additionally, you can save and load multiple data frames at once!

You can see that this is useful here, where we have 7 related data frames saved in imdb_data.Rdata, which are then loaded in one line below.

We will use 7 datasets that describe movies from IMDb.

Relationship between data sets in IMDb movie data.
load(file = "data/imdb_data.Rdata")

On Thursday will also look at joining datasets created from the Lab 2 Rodent data. Note that you will need to change the file path to be appropriate for your directory strucure!

rodent <- read_csv("../../labs/lab2/surveys.csv")

species <- rodent |> 
  select(genus:taxa, species_id) |> 
  distinct()

measurements <- rodent |> 
  select(genus, species, sex:weight) |> 
  rename(genus_name = genus)

… and daily rainfall observed in SLO in January 2023. Data source

slo_rainfall <- read_excel("data/2023-rainfall-slo.xlsx")

slo_rainfall <- slo_rainfall |> 
  mutate(across(Sunday:Saturday, as.numeric))