PA 2: Using Data Visualization to Find the Penguins

Download the .qmd template and save it in a reasonable location.

Today you will be exploring different types of visualizations to uncover which species of penguins reside on different islands.

Some advice:

Getting Started

We will be creating visualizations using the ggplot2 package.

For this activity, we will be exploring the penguins data from the palmerpenguins package, which has fantastic documentation with really awesome artwork. So, you will need to install the palmerpenguins package.

install.packages("palmerpenguins")

install.packages() in the console NOT in your .qmd file!

You should type this into your console and NOT include it in a code chunk in your .qmd file. Recall that we only have to install a package once, but we have to load it each time we open R. Each time you render your .qmd file, ALL the code chunks are run. Therefore, installing a package in a code chunk would cause R to unnecessarily reinstall the package on every render. Not good.

Creating a Setup Code Chunk

  1. Insert a code chunk at the beginning of your document (directly under the YAML).
  2. Name the code chunk setup.
  3. Use the hashpipe #| to specify a code chunk option that prevents any messages (e.g., from loading in packages) from appearing.
  4. Load in the tidyverse or ggplot2 package.
  5. Load in the palmerpenguins package.

Code chunk name: setup

Naming your code chunk “setup” has special properties in a .qmd: this code chunk will run automatically when you try to run a subsequent code chunk. This ensures that all packages and any other specifications for your document are loaded before later chunks need them, so those chunks will not throw errors about missing packages.
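As a rough sketch, the body of a setup chunk could look something like the lines below. This is only one possibility: `message: false` is assumed here as the option that hides package-loading messages, and ggplot2 is loaded on its own, although loading the tidyverse would work just as well. In your .qmd these lines would sit inside an R code chunk placed directly under the YAML.

```r
#| label: setup
#| message: false

# Load the plotting package (library(tidyverse) would also work)
library(ggplot2)

# Load the package that provides the penguins data
library(palmerpenguins)
```

Notice that there is no install.packages() call here; the chunk only loads packages that are already installed.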

Dataset: penguins

I like to start by seeing the dataset I will be working with, so I am going to pull the penguins data into my R environment. Do you see it in the top right Environment tab?

data(penguins)

You may notice that a dataset called penguins_raw also loaded. We will ignore this and focus on the penguins dataset.

  6. Get to know your data. What are the variables and what units are they measured in? What does each row represent?
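If you are not sure where to start, here is a minimal sketch of a few base R functions that can help you get a first look at a dataset. These are just suggestions, not the only way to explore the data; the help page mentioned in the last comment is where the units are documented.

```r
library(palmerpenguins)

# How many rows and columns are there?
dim(penguins)

# What are the variables called?
names(penguins)

# What type is each variable, and what do the first few values look like?
str(penguins)

# Quick numeric and categorical summaries of every column
summary(penguins)

# Typing ?penguins in the console opens the documentation for the dataset
```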

Graphics

Note

Make sure to give your plots reader-friendly axis labels!
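One way to do this is with the labs() function. The sketch below uses a boxplot of body mass by sex purely as an illustration (this plot is not part of the activity, and the label text is my own choice), so you can see the pattern without it being one of the plots you are asked to make.

```r
library(ggplot2)
library(palmerpenguins)

# Illustration only: a boxplot that is not one of the assigned plots
p_labeled <- ggplot(penguins, aes(x = sex, y = body_mass_g)) +
  geom_boxplot(na.rm = TRUE) +
  # Replace the raw variable names with labels a reader can understand
  labs(
    x = "Sex",
    y = "Body mass (g)",
    title = "Body mass of Palmer penguins by sex"
  )

p_labeled
```

Including the units in the label (here, grams) saves your reader from having to look them up.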

Note

Make sure your final report does not display any warnings or messages from RStudio!
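A sketch of one way to handle this at the chunk level is shown below, assuming you want to silence a single chunk with the `warning` and `message` options. The chunk label and the variables plotted are made up for illustration. geom_point() gives a warning when some rows have missing measurements, and `warning: false` keeps that warning out of the rendered report.

```r
#| label: flipper-mass-scatter
#| warning: false
#| message: false

library(ggplot2)
library(palmerpenguins)

# Rows with missing measurements trigger a warning when the plot is drawn;
# the chunk options above stop it from appearing in the rendered document
p_quiet <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point() +
  labs(x = "Flipper length (mm)", y = "Body mass (g)")

p_quiet
```

Hiding a warning does not make it go away, so it is still worth reading the warning once in the console to understand why it appears.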

  7. Use https://excalidraw.com/ (or pen and paper, a tablet, etc.) to create a game plan for a barchart of species, where species is mapped to the fill color of the barchart. Save or take a screenshot of your game plan – you will be uploading this to Canvas with your practice activity submission.

  8. Use ggplot2 to create the barchart you outlined above.
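If the structure of a barchart in ggplot2 feels unfamiliar, here is a minimal sketch of the general pattern. It deliberately uses island rather than species, so treat it as a template to adapt rather than the answer. The label text is my own choice.

```r
library(ggplot2)
library(palmerpenguins)

# Template only: island is used here in place of species
p_bar <- ggplot(penguins, aes(x = island, fill = island)) +
  # geom_bar() counts the rows in each category for you
  geom_bar() +
  labs(x = "Island", y = "Number of penguins", fill = "Island")

p_bar
```

Because geom_bar() does the counting, only the x aesthetic is needed; mapping the same variable to fill colors each bar by its category.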

  9. Use ggplot2 to create a scatterplot of the relationship between the bill length (bill_length_mm) and bill depth (bill_depth_mm).
  10. Building on your code from (9), add an aesthetic to differentiate the species by color.
  11. Building on your code from (10), add the location of the penguins (island) to your visualization. There is more than one way you could address this; however, one method will make it easier to answer the questions below, so you might want to read those questions first!
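Here is a hedged sketch of how a scatterplot can be built up in layers: two numeric variables on the axes, a third variable mapped to color, and a fourth variable shown with facets. The variables (flipper length, body mass, sex, and year) are chosen only for illustration and are different from the ones in the steps above. Faceting is just one of the possible ways to add another variable; mapping it to shape is another.

```r
library(ggplot2)
library(palmerpenguins)

# Illustration only: different variables from the ones in the activity
p_facet <- ggplot(
  penguins,
  aes(x = flipper_length_mm, y = body_mass_g, color = sex)
) +
  # na.rm = TRUE silently drops rows with missing measurements
  geom_point(na.rm = TRUE) +
  # One panel per study year
  facet_wrap(~ year) +
  labs(
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    color = "Sex"
  )

p_facet
```

Facets give each group its own panel, which can make it easier to see which categories do or do not appear in each group.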

Canvas Quiz

Use the plots you created to answer the following questions on Canvas.

  1. Which species of penguins is represented least in the Palmer Penguins dataset?

  2. Which species of penguins are found on every island?

  3. Which species of penguins are found only on Dream Island?

  4. Which species of penguins are found only on Biscoe Island?