PA 5.1: Scrambled Message

stringr + Regular Expressions

Download starter qmd file

library(tidyverse)

Working with the Secret Message Data

message_data <- read_csv("https://github.com/earobinson95/stat331-calpoly/raw/master/practice-activities/data/scrambled_message.txt")

If you preview the data, you’ll notice it is a very simple data frame with only one column, Word. This column contains strings (or words).

Since Word is stored as a column in a data frame, you will need to use some of the dplyr functions you have learned to work with the data. For example…

  • if you need to find words that have a certain pattern, then you will likely want to pair a str_XXXX() function with filter().

  • if you need to make changes to the words, then you will likely want to pair a str_XXXX() function with mutate().

  • if you want to summarize patterns in the words, then you will likely want to pair a str_XXXX() function with summarize().
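The three pairings above can be sketched on a small made-up data frame. This is only an illustration: the `toy` tibble and its words are hypothetical stand-ins for `message_data`, and the particular stringr functions (`str_detect()`, `str_to_upper()`) are just examples of what could fill in for `str_XXXX()`.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data standing in for message_data (not the real message)
toy <- tibble(Word = c("apple", "banana", "cherry!", "kiwi"))

# filter(): keep only the rows whose Word contains the pattern "an"
with_an <- toy %>%
  filter(str_detect(Word, "an"))

# mutate(): change the words, here by converting them to upper case
shouted <- toy %>%
  mutate(Word = str_to_upper(Word))

# summarize(): collapse to one row, here the number of words containing "a"
counts <- toy %>%
  summarize(n_with_a = sum(str_detect(Word, "a")))
```

The common thread is that the stringr function works on the Word column as a vector, while the dplyr verb decides whether the result selects rows, rewrites the column, or reduces it to a single value.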

Important

In this activity, a “word” is a set of characters with no white space. That is, even though many of the elements of the scrambled message vector are nonsense, and some have punctuation, you can consider each observation to be a “word”.

Warm-up exercises

  1. Print out every word in the scrambled message that starts with the letter “m”. Hint: What symbol do you use to indicate the beginning of a string?

  2. Print out every word in the scrambled message that ends with a letter “z”. Hint: What symbol do you use to indicate the end of a string?

  3. Print out the punctuation symbols that are present in the scrambled message. Your output should just be the unique symbols! Hint: You don’t need to type out every punctuation symbol! There is a shortcut way to look for any punctuation symbol.

  4. Print out the words that have more than 10 characters. Hint: This should take two steps!

  5. Print out the longest word in the scrambled message. Hint: If you already have the length of each word, you can grab the max easily!
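As a rough sketch of the tools the hints point toward, the example below uses a made-up `toy` tibble and different target letters from the exercises, so it illustrates the techniques rather than answering the questions. The column name `n_chars` and the use of `slice_max()` are choices made for this illustration, not requirements of the activity.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data (not the real scrambled message)
toy <- tibble(Word = c("salad", "maze!", "oops?", "extraordinarily", "cat"))

# ^ anchors the pattern to the start of the string
starts_s <- toy %>%
  filter(str_detect(Word, "^s"))

# $ anchors the pattern to the end of the string
ends_t <- toy %>%
  filter(str_detect(Word, "t$"))

# [[:punct:]] matches any punctuation character;
# extract every match, flatten the list, then keep the unique symbols
puncts <- toy %>%
  pull(Word) %>%
  str_extract_all("[[:punct:]]") %>%
  unlist() %>%
  unique()

# Two steps for word length: compute it with mutate(), then filter() on it
long_words <- toy %>%
  mutate(n_chars = str_length(Word)) %>%
  filter(n_chars > 10)

# With the lengths in hand, slice_max() keeps the row(s) with the largest value
longest <- toy %>%
  mutate(n_chars = str_length(Word)) %>%
  slice_max(n_chars, n = 1)
```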

Decode the Message

Complete the following steps to decode the message.

  1. Remove any spaces before or after each word.

  2. No word should be longer than 16 characters. Drop all extra characters off the end of each word.

  3. Every time you see the word “ugh”, with any number of h’s, followed by a punctuation mark, delete this.

  4. Replace all instances of exactly 2 a’s with exactly 2 e’s.

  5. Replace all z’s with t’s.

  6. For every word that ends in b, change the b to a y.

  7. For every word that starts with k (or K), change the k to a v.

  8. Recombine all your words into a message with a stringr function.
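One way the decoding steps could be chained is sketched below on a made-up `toy` tibble, not the real message. The specific function choices are assumptions made for illustration: `str_sub()` for truncation, `str_c()` with `collapse` for recombining, and the pattern `a{2}` for "2 a's". Note that `a{2}` matches two consecutive a's without checking what surrounds them, so a run of three a's would also be partly replaced.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data (not the real scrambled message)
toy <- tibble(
  Word = c("  Kasz ", "responsibilitiesqqq", "haveughhh!", "baan", "happb")
)

decoded <- toy %>%
  mutate(
    Word = str_trim(Word),                            # drop leading/trailing spaces
    Word = str_sub(Word, 1, 16),                      # keep at most 16 characters
    Word = str_remove_all(Word, "ugh+[[:punct:]]"),   # "ugh", 1+ h's, then punctuation
    Word = str_replace_all(Word, "a{2}", "ee"),       # two consecutive a's become ee
    Word = str_replace_all(Word, "z", "t"),           # every z becomes t
    Word = str_replace(Word, "b$", "y"),              # a final b becomes y
    Word = str_replace(Word, "^[kK]", "v")            # a leading k or K becomes v
  )

# Recombine the words into a single string separated by spaces
message_text <- decoded %>%
  pull(Word) %>%
  str_c(collapse = " ")
```

The order of the replacements matters in this sketch: truncating before removing "ugh" gives a different result than removing first, so it is worth following the steps in the order listed.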

Note

Canvas Quiz Submission

What is the name of the movie the quote is from?