Using lubridate to Work with Dates

Thursday, May 1

Today we will…

  • Midterm Exam - What to expect
  • New Material
    • Working with Date & Time Variables
  • PA 5.2: Jewel Heist

Midterm Info

Midterm Exam - Thu 5/8

  • This is a three-part exam
  • The first two sections are completed during the 1-hour-50-minute class period
    1. You will first complete a General Questions section on paper and without your computer.
    2. After you turn that in, you will complete a Short Answer section with your computer.
  • The third section is “take-home” and due 48 hours after the end of class.
    1. The Take-Home Analysis is completed out of class (should not take more than 3 hours)

Midterm Exam - Thu 5/8

  • Review the “What to Expect” document thoroughly as it includes
    • detailed expectations
    • the dataset you will be working with
  • Set yourself up with a dedicated directory that has the data in it
  • Make sure to bring to the exam:
    • something to write with (black/blue pen or pencils)
    • your laptop (& a charging cord)

Caution

While the coding tasks are open-resource, you will likely run out of time if you have to look everything up. Know what functions you might need and where to find documentation for implementing these functions.

Midterm Preparation Suggestions

  • Review course slides & Check-Ins
  • Quiz each other on the uses of different functions
  • Try to re-do parts of the PAs or LAs from scratch
  • Start working with the data
    • HAVE CODE SET UP THAT READS IN THE DATA
    • Ask some questions about the data and try to answer them
  • Save example code for things you find tricky in a place you can find it
  • Get sleep and feed yourself! 🛌🥞🥙🍛

Date + Time Variables

lubridate

  • Convert a date-like variable (“May 8, 1995”) to a date or date-time object.

  • Find the weekday, month, year, etc. from a date-time object.

  • Convert between time zones.

The image shows the hex logo for the lubridate R package. The logo is a green hexagon with a stylized calendar in the center. The calendar has a small clock icon overlapping its bottom left corner, symbolizing time-related functions. The text 'lubridate' appears prominently below the calendar icon within the hexagon. Lubridate is commonly used in R for working with date and time data.

Note

The lubridate package installs and loads with the tidyverse.

Why are dates and times tricky?

When parsing dates and times, we have to consider complicating factors like…

  • Daylight Saving Time.
    • One day a year is 23 hours; one day a year is 25 hours.
    • Some places use it, some don’t.
  • Leap years – most years have 365 days, some have 366.
  • Time zones.
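
A minimal sketch of these complications, assuming lubridate is loaded. It uses the 2025 US daylight saving dates (March 9 and November 2) as the illustration; the variable names are made up for this example.

```r
library(lubridate)

# Leap years: 2024 has 366 days, 2025 has 365
leap_year(2024)
leap_year(2025)

# Daylight saving time: measure midnight to midnight in a zone that observes it
spring_start <- ymd_hms("2025-03-09 00:00:00", tz = "America/Los_Angeles")
spring_end   <- ymd_hms("2025-03-10 00:00:00", tz = "America/Los_Angeles")
spring_hours <- as.numeric(difftime(spring_end, spring_start, units = "hours"))
spring_hours   # 23

fall_start <- ymd_hms("2025-11-02 00:00:00", tz = "America/Los_Angeles")
fall_end   <- ymd_hms("2025-11-03 00:00:00", tz = "America/Los_Angeles")
fall_hours <- as.numeric(difftime(fall_end, fall_start, units = "hours"))
fall_hours     # 25

# Phoenix does not observe daylight saving time, so the same day has 24 hours
phx_start <- ymd_hms("2025-03-09 00:00:00", tz = "America/Phoenix")
phx_end   <- ymd_hms("2025-03-10 00:00:00", tz = "America/Phoenix")
phx_hours <- as.numeric(difftime(phx_end, phx_start, units = "hours"))
phx_hours      # 24
```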

date-time Objects

There are multiple data types for dates and times.

  • A date:
    • date or Date
  • A date and a time (identifies a unique instant in time):
    • dttm
    • POSIXct – stores date-times as the number of seconds since January 1, 1970 (“Unix Epoch”)
    • POSIXlt – stores date-times as a list with elements for second, minute, hour, day, month, year, etc.
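
One way to see the difference between these types is to look underneath them with base R. This is only a sketch; the dates are arbitrary values chosen so the numbers are easy to check.

```r
# A Date is a count of days since 1970-01-01
d <- as.Date("1970-01-11")
class(d)         # "Date"
as.numeric(d)    # 10

# POSIXct is a single number: seconds since the Unix Epoch
ct <- as.POSIXct("1970-01-02 00:00:00", tz = "UTC")
class(ct)
as.numeric(ct)   # 86400

# POSIXlt is a list of components
lt <- as.POSIXlt(ct)
lt$mday          # 2
lt$mon           # 0  (months are counted from 0)
lt$year          # 70 (years are counted from 1900)
```

Because POSIXct is just one number per value, it is usually the more convenient type to store in a data frame column.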

Creating date-time Objects

Big Picture

There are a lot of different ways to create date-time objects!

Create a date from individual components:

make_date(year = 1995, month = 05, day = 08)
[1] "1995-05-08"
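
The same idea extends to whole columns, which helps when a dataset stores the year, month, and day separately. A small sketch, assuming a hypothetical data frame called trips; make_datetime() is the lubridate companion that also accepts hours, minutes, seconds, and a time zone.

```r
library(lubridate)

# Hypothetical data with the date parts in separate columns
trips <- data.frame(
  year  = c(2024, 2024, 2025),
  month = c(2, 12, 5),
  day   = c(29, 31, 1),
  hour  = c(6, 23, 14)
)

# make_date() and make_datetime() are vectorized over their arguments
trips$date <- make_date(trips$year, trips$month, trips$day)
trips$departure <- make_datetime(
  trips$year, trips$month, trips$day, trips$hour,
  tz = "America/Los_Angeles"
)
trips
```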

Create a date-time Object from a String

mdy("August 29, 1991")
[1] "1991-08-29"
dmy("29-August-1991", 
    tz = "America/Denver")
[1] "1991-08-29 MDT"
dmy_hms("29-August-1991 9:32:12", 
        tz = "America/Denver")
[1] "1991-08-29 09:32:12 MDT"
as_datetime("91-08-29", 
            format = "%y-%m-%d")
[1] "1991-08-29 UTC"
# parse_datetime() comes from readr (also part of the tidyverse)
parse_datetime("8/29/1991", 
               format = "%m/%d/%Y")
[1] "1991-08-29 UTC"
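
In the lubridate parsers above, the letters in the function name give the order of the components in the string (y = year, m = month, d = day). The sketch below uses made-up strings to show why picking the right order matters.

```r
library(lubridate)

# The same calendar day written three ways; the function name gives the order
a <- ymd("1991-08-29")
b <- mdy("08/29/1991")
c_date <- dmy("29.08.1991")
a == b
b == c_date

# An ambiguous string means different days under different orders
mdy("03/04/2025")   # March 4
dmy("03/04/2025")   # April 3

# A string that does not fit the order comes back as NA (with a warning)
bad <- suppressWarnings(mdy("29/08/1991"))
is.na(bad)
```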

Creating date-time Objects

Common Mistake with Dates

as_datetime(2023-02-6)
[1] "1970-01-01 00:33:35 UTC"
my_date <- 2023-02-6
my_date
[1] 2015


What’s wrong here?


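Make sure you use quotes!

  • Without quotes, R treats 2023-02-6 as subtraction: 2023 - 2 - 6 = 2015.
  • as_datetime() then reads 2015 as seconds after the Unix Epoch: 2,015 seconds ≈ 33.6 minutes.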
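
A side-by-side sketch of the unquoted and quoted versions; the variable names are chosen only for this illustration.

```r
library(lubridate)

# Unquoted: R does the subtraction first, then treats the result as seconds
unquoted <- 2023-02-6
unquoted                 # 2015
as_datetime(unquoted)    # a time on 1970-01-01

# Quoted: the string is parsed as a calendar date
quoted <- as_datetime("2023-02-06")
quoted
year(quoted)             # 2023
```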

Extracting date-time Components

bday <- ymd_hms("1995-02-27 07:03:12", 
                tz = "America/Chicago")
bday
[1] "1995-02-27 07:03:12 CST"


year(bday)
[1] 1995
month(bday)
[1] 2
day(bday)
[1] 27
wday(bday)
[1] 2
wday(bday, 
     label = TRUE, 
     abbr = FALSE)
[1] Monday
7 Levels: Sunday < Monday < Tuesday < Wednesday < Thursday < ... < Saturday
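
A few more accessors follow the same pattern. This sketch reuses the bday object from above; the printed month label assumes an English locale.

```r
library(lubridate)

bday <- ymd_hms("1995-02-27 07:03:12", tz = "America/Chicago")

hour(bday)                   # 7
minute(bday)                 # 3
month(bday, label = TRUE)    # an ordered factor (Feb) instead of the number 2
yday(bday)                   # 58, the day of the year
wday(bday, week_start = 1)   # 1, when Monday is counted as the first day
```

By default wday() counts from Sunday, which is why the earlier call returned 2 for a Monday.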

Subtraction with date-time Objects

Doing subtraction gives you a difftime object.

difftime objects do not always have the same units – it depends on the scale of the objects you are working with.

How old am I?

today() - mdy("02-27-1995")
Time difference of 11021 days


How long did it take me to type this slide?

begin <- mdy_hms("10/21/2024 20:40:34")
finish <- mdy_hms("10/21/2024 20:43:11")

finish - begin
Time difference of 2.616667 mins
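
Since the units are chosen automatically, it can be safer to ask for the units you want before using the number in a calculation. A sketch using the same two timestamps:

```r
library(lubridate)

begin  <- mdy_hms("10/21/2024 20:40:34")
finish <- mdy_hms("10/21/2024 20:43:11")

elapsed <- finish - begin
units(elapsed)                            # "mins", chosen automatically

# Ask for specific units instead of relying on the automatic choice
difftime(finish, begin, units = "secs")   # 157 secs
elapsed_secs <- as.numeric(elapsed, units = "secs")
elapsed_secs                              # 157
```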

Durations and Periods

Durations will always give the time span in an exact number of seconds.

as.duration(
  today() - mdy("02-27-1995")
            )
[1] "952214400s (~30.17 years)"

Periods give the time span in human-readable units (days, months, years), whose exact length in seconds can vary.

as.period(
  today() - mdy("02-27-1995")
  )
[1] "11021d 0H 0M 0S"

Durations and Periods

We can also add time to date-time objects:

  • days(), years(), etc. will add a period of time.
  • ddays(), dyears(), etc. will add a duration of time.

Because durations count an exact number of seconds (a duration year is always 365.25 days), adding them can land on an unexpected date or clock time.


When is my 99th birthday?

mdy("02/27/1995") + years(99)
[1] "2094-02-27"
mdy("02/27/1995") + dyears(99)
[1] "2094-02-26 18:00:00 UTC"
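
The same difference shows up at a much smaller scale around a daylight saving change. A sketch, assuming the 2025 US spring-forward date (March 9); the variable names are illustrative.

```r
library(lubridate)

# Noon on the day before US clocks spring forward in 2025
before <- ymd_hms("2025-03-08 12:00:00", tz = "America/Los_Angeles")

plus_period   <- before + days(1)    # same clock time on the next day
plus_duration <- before + ddays(1)   # exactly 86,400 seconds later

plus_period      # 2025-03-09 12:00:00 PDT
plus_duration    # 2025-03-09 13:00:00 PDT
```

Periods are usually what you want for calendar questions ("same time tomorrow"), while durations fit questions about elapsed time.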

Time Zones…

…are complicated!


Specify time zones in the form:

  • {continent}/{city} – “America/Denver”, “Africa/Nairobi”
  • {ocean}/{city} – “Pacific/Auckland”

What time zone does R think I’m in?

Sys.timezone()
[1] "America/Los_Angeles"
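
If you are unsure how a time zone is spelled, base R's OlsonNames() lists the names R recognizes. A short sketch (the exact number of zones depends on your system):

```r
# OlsonNames() is base R; it returns every time zone name R recognizes
zones <- OlsonNames()
length(zones)

"America/Denver" %in% zones   # TRUE
"Denver" %in% zones           # FALSE: the region prefix is required

# Search for a city when you are unsure of the prefix
auckland <- grep("Auckland", zones, value = TRUE)
auckland
```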

Time Zones

You can change the time zone of a date-time in two ways:

x <- ymd_hms("2024-10-24 18:00:00", 
             tz = "Europe/Copenhagen")

with_tz()

Keeps the instant in time the same, but changes the visual representation.

x |> 
  with_tz()
[1] "2024-10-24 09:00:00 PDT"
x |> 
  with_tz(tzone = "Asia/Kolkata")
[1] "2024-10-24 21:30:00 IST"

force_tz()

Keeps the clock time the same, but changes the instant in time by forcing a new time zone.

x |> 
  force_tz()
[1] "2024-10-24 18:00:00 PDT"
x |> 
  force_tz(tzone = "Asia/Kolkata")
[1] "2024-10-24 18:00:00 IST"
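
One way to check which function you need is to compare the result with the original. A sketch using the same x as above; shown and forced are names made up for this example.

```r
library(lubridate)

x <- ymd_hms("2024-10-24 18:00:00", tz = "Europe/Copenhagen")

shown  <- with_tz(x, tzone = "Asia/Kolkata")
forced <- force_tz(x, tzone = "Asia/Kolkata")

x == shown                              # TRUE: same instant, new clock reading
x == forced                             # FALSE: same clock reading, new instant
difftime(forced, x, units = "hours")    # -3.5 hours
```

6:00 pm in Kolkata happens 3.5 hours before 6:00 pm in Copenhagen, which is why the forced version is an earlier instant.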

Common Mistake with Dates

When you read data in or create a new date-time object, the default time zone (if not specified) is UTC (Coordinated Universal Time)*.

So, make sure you specify your desired time zone!

x <- mdy("11/20/1993")
tz(x)
[1] "UTC"
x <- mdy("11/20/1993", 
         tz = "America/Los_Angeles")
tz(x)
[1] "America/Los_Angeles"

*For everyday purposes, UTC is the same as GMT (Greenwich Mean Time)
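
The default matters most when the timestamps in your data were recorded on a local clock. A sketch with a hypothetical data frame called visits, assuming the times were written down in Los Angeles:

```r
library(lubridate)

# Hypothetical timestamps recorded on a wall clock in Los Angeles
visits <- data.frame(stamp = c("2024-07-04 09:15:00", "2024-12-25 18:30:00"))

# Without tz, the strings are read as UTC clock times
visits$as_utc <- ymd_hms(visits$stamp)
# With tz, they are read as local clock times
visits$as_local <- ymd_hms(visits$stamp, tz = "America/Los_Angeles")

tz(visits$as_utc)      # "UTC"
tz(visits$as_local)    # "America/Los_Angeles"

# The two readings are different instants:
# 7 hours apart in July (PDT) and 8 hours apart in December (PST)
gap <- difftime(visits$as_local, visits$as_utc, units = "hours")
gap
```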

Tips for Working with Dates

  • Always check that you are getting the results you expect!
  • Pay attention to time zones
  • Use the lubridate cheatsheet

PA 5.2: Jewel Heist

  • Use dates from clues to find the jewel thief!
  • Make sure to pay attention to time zones ⏰

Movie poster for The Pink Panther, a movie and TV show series. Shows a detective following paw prints with a magnifying glass, but a shadow of the pink panther looming over them looking mischievous.

LA 5: Murder in SQL City

  • This lab looks different!
  • You will need a number of steps to follow the clues - it won’t be done in one pipeline
  • Read the instructions carefully
  • At the end, try to delete any code or output that you don’t actually need
  • Check with others if you are stuck! You can see if they get the witness or clues answers at that step.


To do…

  • PA 5.2: Jewel Heist
    • due Friday (5/2) at 11:59pm
  • Lab 5: Murder in SQL City
    • due Monday (5/5) at 11:59pm
  • Read Chapter 6: Version Control
    • Check-in 6.1 - 6.2 due Tuesday (5/6) before class
  • Project Checkpoint 1: Group Formation Survey
    • due Tuesday (5/6) at 11:59pm