Today we will…
dplyr verbs to have more functionality.

dplyr verbs

| name | manuf | type | calories | protein | fat | sodium | fiber | carbo | sugars | potass | vitamins | shelf | weight | cups | rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% Bran | N | cold | 70 | 4 | 1 | 130 | 10 | 5 | 6 | 280 | 25 | 3 | 1 | 0.33 | 68.40297 |
| 100% Natural Bran | Q | cold | 120 | 3 | 5 | 15 | 2 | 8 | 8 | 135 | 0 | 3 | 1 | 1.00 | 33.98368 |
| All-Bran | K | cold | 70 | 4 | 1 | 260 | 9 | 7 | 5 | 320 | 25 | 3 | 1 | 0.33 | 59.42551 |
| All-Bran with Extra Fiber | K | cold | 50 | 4 | 0 | 140 | 14 | 8 | 0 | 330 | 25 | 3 | 1 | 0.50 | 93.70491 |
| Almond Delight | R | cold | 110 | 2 | 2 | 200 | 1 | 14 | 8 | -1 | 25 | 3 | 1 | 0.75 | 34.38484 |

dplyr

We have already covered a lot, but not everything you might want…
Today we will cover functions that help with the following tasks:
pull()

What is the mean potassium for cold cereals?
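A minimal sketch of one way to answer this question. The tibble `cereal_small` is a hypothetical toy subset (values copied from the cereal tables in these slides), so the result is not the answer for the full dataset.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data (not the full dataset).
cereal_small <- tribble(
  ~name,       ~type,  ~potass,
  "100% Bran", "cold", 280,
  "All-Bran",  "cold", 320,
  "Cheerios",  "cold", 105,
  "Maypo",     "hot",   95
)

# pull() returns the column as a plain vector, so mean() can use it directly.
mean_cold_potass <- cereal_small |>
  filter(type == "cold") |>
  pull(potass) |>
  mean()

mean_cold_potass
```

Without `pull()`, the pipeline would hand `mean()` a data frame rather than a numeric vector.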
$ operator in a pipeline

pull() to the rescue!

pull() extracts a data frame column as a vector

summarize()

How many cereals does each manuf have in this dataset?
count()

How many cereals does each manuf have in this dataset?
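The two approaches might look like the sketch below. `cereal_small` is a hypothetical toy subset built from a few rows of the table above, so the counts are illustrative only.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data.
cereal_small <- tribble(
  ~name,                       ~manuf,
  "100% Bran",                 "N",
  "100% Natural Bran",         "Q",
  "All-Bran",                  "K",
  "All-Bran with Extra Fiber", "K",
  "Almond Delight",            "R"
)

# Long form: group, then count the rows in each group.
manuf_counts_long <- cereal_small |>
  group_by(manuf) |>
  summarize(n = n())

# Short form: count() does the grouping and counting in one step.
manuf_counts <- cereal_small |>
  count(manuf)

manuf_counts
```

Both versions should return one row per manufacturer with a column `n` holding the number of cereals.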
if_else()

For each cereal, label the potass as “high” or “low”.
One if-else statement:
if_else(<CONDITION>, <TRUE OUTPUT>, <FALSE OUTPUT>)

| name | manuf | type | calories | protein | fat | sodium | fiber | carbo | sugars | potass | po_category | vitamins | shelf | weight | cups | rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% Bran | N | cold | 70 | 4 | 1 | 130 | 10.0 | 5.0 | 6 | 280 | high | 25 | 3 | 1.00 | 0.33 | 68.40297 |
| 100% Natural Bran | Q | cold | 120 | 3 | 5 | 15 | 2.0 | 8.0 | 8 | 135 | high | 0 | 3 | 1.00 | 1.00 | 33.98368 |
| All-Bran | K | cold | 70 | 4 | 1 | 260 | 9.0 | 7.0 | 5 | 320 | high | 25 | 3 | 1.00 | 0.33 | 59.42551 |
| All-Bran with Extra Fiber | K | cold | 50 | 4 | 0 | 140 | 14.0 | 8.0 | 0 | 330 | high | 25 | 3 | 1.00 | 0.50 | 93.70491 |
| Almond Delight | R | cold | 110 | 2 | 2 | 200 | 1.0 | 14.0 | 8 | -1 | low | 25 | 3 | 1.00 | 0.75 | 34.38484 |
| Apple Cinnamon Cheerios | G | cold | 110 | 2 | 2 | 180 | 1.5 | 10.5 | 10 | 70 | low | 25 | 1 | 1.00 | 0.75 | 29.50954 |
| Apple Jacks | K | cold | 110 | 2 | 0 | 125 | 1.0 | 11.0 | 14 | 30 | low | 25 | 2 | 1.00 | 1.00 | 33.17409 |
| Basic 4 | G | cold | 130 | 3 | 2 | 210 | 2.0 | 18.0 | 8 | 100 | low | 25 | 3 | 1.33 | 0.75 | 37.03856 |
| Bran Chex | R | cold | 90 | 2 | 1 | 200 | 4.0 | 15.0 | 6 | 125 | high | 25 | 1 | 1.00 | 0.67 | 49.12025 |
| Bran Flakes | P | cold | 90 | 3 | 0 | 210 | 5.0 | 13.0 | 5 | 190 | high | 25 | 3 | 1.00 | 0.67 | 53.31381 |
| Cap'n'Crunch | Q | cold | 120 | 1 | 2 | 220 | 0.0 | 12.0 | 12 | 35 | low | 25 | 2 | 1.00 | 0.75 | 18.04285 |
| Cheerios | G | cold | 110 | 6 | 2 | 290 | 2.0 | 17.0 | 1 | 105 | high | 25 | 1 | 1.00 | 1.25 | 50.76500 |
| Cinnamon Toast Crunch | G | cold | 120 | 1 | 3 | 210 | 0.0 | 13.0 | 9 | 45 | low | 25 | 2 | 1.00 | 0.75 | 19.82357 |
| Clusters | G | cold | 110 | 3 | 2 | 140 | 2.0 | 13.0 | 7 | 105 | high | 25 | 3 | 1.00 | 0.50 | 40.40021 |
| Cocoa Puffs | G | cold | 110 | 1 | 1 | 180 | 0.0 | 12.0 | 13 | 55 | low | 25 | 2 | 1.00 | 1.00 | 22.73645 |
| Corn Chex | R | cold | 110 | 2 | 0 | 280 | 0.0 | 22.0 | 3 | 25 | low | 25 | 1 | 1.00 | 1.00 | 41.44502 |
| Corn Flakes | K | cold | 100 | 2 | 0 | 290 | 1.0 | 21.0 | 2 | 35 | low | 25 | 1 | 1.00 | 1.00 | 45.86332 |
| Corn Pops | K | cold | 110 | 1 | 0 | 90 | 1.0 | 13.0 | 12 | 20 | low | 25 | 2 | 1.00 | 1.00 | 35.78279 |
| Count Chocula | G | cold | 110 | 1 | 1 | 180 | 0.0 | 12.0 | 13 | 65 | low | 25 | 2 | 1.00 | 1.00 | 22.39651 |
| Cracklin' Oat Bran | K | cold | 110 | 3 | 3 | 140 | 4.0 | 10.0 | 7 | 160 | high | 25 | 3 | 1.00 | 0.50 | 40.44877 |
| Cream of Wheat (Quick) | N | hot | 100 | 3 | 0 | 80 | 1.0 | 21.0 | 0 | -1 | low | 0 | 2 | 1.00 | 1.00 | 64.53382 |
| Crispix | K | cold | 110 | 2 | 0 | 220 | 1.0 | 21.0 | 3 | 30 | low | 25 | 3 | 1.00 | 1.00 | 46.89564 |
| Crispy Wheat & Raisins | G | cold | 100 | 2 | 1 | 140 | 2.0 | 11.0 | 10 | 120 | high | 25 | 3 | 1.00 | 0.75 | 36.17620 |
| Double Chex | R | cold | 100 | 2 | 0 | 190 | 1.0 | 18.0 | 5 | 80 | low | 25 | 3 | 1.00 | 0.75 | 44.33086 |
| Froot Loops | K | cold | 110 | 2 | 1 | 125 | 1.0 | 11.0 | 13 | 30 | low | 25 | 2 | 1.00 | 1.00 | 32.20758 |
| Frosted Flakes | K | cold | 110 | 1 | 0 | 200 | 1.0 | 14.0 | 11 | 25 | low | 25 | 1 | 1.00 | 0.75 | 31.43597 |
| Frosted Mini-Wheats | K | cold | 100 | 3 | 0 | 0 | 3.0 | 14.0 | 7 | 100 | low | 25 | 2 | 1.00 | 0.80 | 58.34514 |
| Fruit & Fibre Dates; Walnuts; and Oats | P | cold | 120 | 3 | 2 | 160 | 5.0 | 12.0 | 10 | 200 | high | 25 | 3 | 1.25 | 0.67 | 40.91705 |
| Fruitful Bran | K | cold | 120 | 3 | 0 | 240 | 5.0 | 14.0 | 12 | 190 | high | 25 | 3 | 1.33 | 0.67 | 41.01549 |
| Fruity Pebbles | P | cold | 110 | 1 | 1 | 135 | 0.0 | 13.0 | 12 | 25 | low | 25 | 2 | 1.00 | 0.75 | 28.02576 |
| Golden Crisp | P | cold | 100 | 2 | 0 | 45 | 0.0 | 11.0 | 15 | 40 | low | 25 | 1 | 1.00 | 0.88 | 35.25244 |
| Golden Grahams | G | cold | 110 | 1 | 1 | 280 | 0.0 | 15.0 | 9 | 45 | low | 25 | 2 | 1.00 | 0.75 | 23.80404 |
| Grape Nuts Flakes | P | cold | 100 | 3 | 1 | 140 | 3.0 | 15.0 | 5 | 85 | low | 25 | 3 | 1.00 | 0.88 | 52.07690 |
| Grape-Nuts | P | cold | 110 | 3 | 0 | 170 | 3.0 | 17.0 | 3 | 90 | low | 25 | 3 | 1.00 | 0.25 | 53.37101 |
| Great Grains Pecan | P | cold | 120 | 3 | 3 | 75 | 3.0 | 13.0 | 4 | 100 | low | 25 | 3 | 1.00 | 0.33 | 45.81172 |
| Honey Graham Ohs | Q | cold | 120 | 1 | 2 | 220 | 1.0 | 12.0 | 11 | 45 | low | 25 | 2 | 1.00 | 1.00 | 21.87129 |
| Honey Nut Cheerios | G | cold | 110 | 3 | 1 | 250 | 1.5 | 11.5 | 10 | 90 | low | 25 | 1 | 1.00 | 0.75 | 31.07222 |
| Honey-comb | P | cold | 110 | 1 | 0 | 180 | 0.0 | 14.0 | 11 | 35 | low | 25 | 1 | 1.00 | 1.33 | 28.74241 |
| Just Right Crunchy Nuggets | K | cold | 110 | 2 | 1 | 170 | 1.0 | 17.0 | 6 | 60 | low | 100 | 3 | 1.00 | 1.00 | 36.52368 |
| Just Right Fruit & Nut | K | cold | 140 | 3 | 1 | 170 | 2.0 | 20.0 | 9 | 95 | low | 100 | 3 | 1.30 | 0.75 | 36.47151 |
| Kix | G | cold | 110 | 2 | 1 | 260 | 0.0 | 21.0 | 3 | 40 | low | 25 | 2 | 1.00 | 1.50 | 39.24111 |
| Life | Q | cold | 100 | 4 | 2 | 150 | 2.0 | 12.0 | 6 | 95 | low | 25 | 2 | 1.00 | 0.67 | 45.32807 |
| Lucky Charms | G | cold | 110 | 2 | 1 | 180 | 0.0 | 12.0 | 12 | 55 | low | 25 | 2 | 1.00 | 1.00 | 26.73451 |
| Maypo | A | hot | 100 | 4 | 1 | 0 | 0.0 | 16.0 | 3 | 95 | low | 25 | 2 | 1.00 | 1.00 | 54.85092 |
| Muesli Raisins; Dates; & Almonds | R | cold | 150 | 4 | 3 | 95 | 3.0 | 16.0 | 11 | 170 | high | 25 | 3 | 1.00 | 1.00 | 37.13686 |
| Muesli Raisins; Peaches; & Pecans | R | cold | 150 | 4 | 3 | 150 | 3.0 | 16.0 | 11 | 170 | high | 25 | 3 | 1.00 | 1.00 | 34.13976 |
| Mueslix Crispy Blend | K | cold | 160 | 3 | 2 | 150 | 3.0 | 17.0 | 13 | 160 | high | 25 | 3 | 1.50 | 0.67 | 30.31335 |
| Multi-Grain Cheerios | G | cold | 100 | 2 | 1 | 220 | 2.0 | 15.0 | 6 | 90 | low | 25 | 1 | 1.00 | 1.00 | 40.10596 |
| Nut&Honey Crunch | K | cold | 120 | 2 | 1 | 190 | 0.0 | 15.0 | 9 | 40 | low | 25 | 2 | 1.00 | 0.67 | 29.92429 |
| Nutri-Grain Almond-Raisin | K | cold | 140 | 3 | 2 | 220 | 3.0 | 21.0 | 7 | 130 | high | 25 | 3 | 1.33 | 0.67 | 40.69232 |
| Nutri-grain Wheat | K | cold | 90 | 3 | 0 | 170 | 3.0 | 18.0 | 2 | 90 | low | 25 | 3 | 1.00 | 1.00 | 59.64284 |
| Oatmeal Raisin Crisp | G | cold | 130 | 3 | 2 | 170 | 1.5 | 13.5 | 10 | 120 | high | 25 | 3 | 1.25 | 0.50 | 30.45084 |
| Post Nat. Raisin Bran | P | cold | 120 | 3 | 1 | 200 | 6.0 | 11.0 | 14 | 260 | high | 25 | 3 | 1.33 | 0.67 | 37.84059 |
| Product 19 | K | cold | 100 | 3 | 0 | 320 | 1.0 | 20.0 | 3 | 45 | low | 100 | 3 | 1.00 | 1.00 | 41.50354 |
| Puffed Rice | Q | cold | 50 | 1 | 0 | 0 | 0.0 | 13.0 | 0 | 15 | low | 0 | 3 | 0.50 | 1.00 | 60.75611 |
| Puffed Wheat | Q | cold | 50 | 2 | 0 | 0 | 1.0 | 10.0 | 0 | 50 | low | 0 | 3 | 0.50 | 1.00 | 63.00565 |
| Quaker Oat Squares | Q | cold | 100 | 4 | 1 | 135 | 2.0 | 14.0 | 6 | 110 | high | 25 | 3 | 1.00 | 0.50 | 49.51187 |
| Quaker Oatmeal | Q | hot | 100 | 5 | 2 | 0 | 2.7 | -1.0 | -1 | 110 | high | 0 | 1 | 1.00 | 0.67 | 50.82839 |
| Raisin Bran | K | cold | 120 | 3 | 1 | 210 | 5.0 | 14.0 | 12 | 240 | high | 25 | 2 | 1.33 | 0.75 | 39.25920 |
| Raisin Nut Bran | G | cold | 100 | 3 | 2 | 140 | 2.5 | 10.5 | 8 | 140 | high | 25 | 3 | 1.00 | 0.50 | 39.70340 |
| Raisin Squares | K | cold | 90 | 2 | 0 | 0 | 2.0 | 15.0 | 6 | 110 | high | 25 | 3 | 1.00 | 0.50 | 55.33314 |
| Rice Chex | R | cold | 110 | 1 | 0 | 240 | 0.0 | 23.0 | 2 | 30 | low | 25 | 1 | 1.00 | 1.13 | 41.99893 |
| Rice Krispies | K | cold | 110 | 2 | 0 | 290 | 0.0 | 22.0 | 3 | 35 | low | 25 | 1 | 1.00 | 1.00 | 40.56016 |
| Shredded Wheat | N | cold | 80 | 2 | 0 | 0 | 3.0 | 16.0 | 0 | 95 | low | 0 | 1 | 0.83 | 1.00 | 68.23588 |
| Shredded Wheat 'n'Bran | N | cold | 90 | 3 | 0 | 0 | 4.0 | 19.0 | 0 | 140 | high | 0 | 1 | 1.00 | 0.67 | 74.47295 |
| Shredded Wheat spoon size | N | cold | 90 | 3 | 0 | 0 | 3.0 | 20.0 | 0 | 120 | high | 0 | 1 | 1.00 | 0.67 | 72.80179 |
| Smacks | K | cold | 110 | 2 | 1 | 70 | 1.0 | 9.0 | 15 | 40 | low | 25 | 2 | 1.00 | 0.75 | 31.23005 |
| Special K | K | cold | 110 | 6 | 0 | 230 | 1.0 | 16.0 | 3 | 55 | low | 25 | 1 | 1.00 | 1.00 | 53.13132 |
| Strawberry Fruit Wheats | N | cold | 90 | 2 | 0 | 15 | 3.0 | 15.0 | 5 | 90 | low | 25 | 2 | 1.00 | 1.00 | 59.36399 |
| Total Corn Flakes | G | cold | 110 | 2 | 1 | 200 | 0.0 | 21.0 | 3 | 35 | low | 100 | 3 | 1.00 | 1.00 | 38.83975 |
| Total Raisin Bran | G | cold | 140 | 3 | 1 | 190 | 4.0 | 15.0 | 14 | 230 | high | 100 | 3 | 1.50 | 1.00 | 28.59278 |
| Total Whole Grain | G | cold | 100 | 3 | 1 | 200 | 3.0 | 16.0 | 3 | 110 | high | 100 | 3 | 1.00 | 1.00 | 46.65884 |
| Triples | G | cold | 110 | 2 | 1 | 250 | 0.0 | 21.0 | 3 | 60 | low | 25 | 3 | 1.00 | 0.75 | 39.10617 |
| Trix | G | cold | 110 | 1 | 1 | 140 | 0.0 | 13.0 | 12 | 25 | low | 25 | 2 | 1.00 | 1.00 | 27.75330 |
| Wheat Chex | R | cold | 100 | 3 | 1 | 230 | 3.0 | 17.0 | 3 | 115 | high | 25 | 1 | 1.00 | 0.67 | 49.78744 |
| Wheaties | G | cold | 100 | 3 | 1 | 200 | 3.0 | 17.0 | 3 | 110 | high | 25 | 1 | 1.00 | 1.00 | 51.59219 |
| Wheaties Honey Gold | G | cold | 110 | 2 | 1 | 200 | 1.0 | 16.0 | 8 | 60 | low | 25 | 1 | 1.00 | 0.75 | 36.18756 |
.after – specifies the location of the newly created column
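A sketch of code that could produce a column like `po_category`. The cutoff `potass > 100` is an assumption inferred from the table above (100 is labeled "low", 105 is labeled "high"), and `cereal_small` is a hypothetical toy subset.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data.
cereal_small <- tribble(
  ~name,            ~potass, ~vitamins,
  "100% Bran",      280,     25,
  "Almond Delight",  -1,     25,
  "Basic 4",        100,     25,
  "Cheerios",       105,     25
)

# Assumed cutoff: more than 100 is "high", everything else is "low".
# .after places the new column immediately after potass.
labeled <- cereal_small |>
  mutate(po_category = if_else(potass > 100, "high", "low"),
         .after = potass)

labeled
```

Note that the missing-value code `-1` falls into "low" under this rule, which matches the Almond Delight row in the table.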
case_when()

For each cereal, label the amount of sugar as “low”, “medium”, “high”, or “very high”.
A series of if-else statements.
| name | sugars | sugar_level |
|---|---|---|
| 100% Bran | 6 | high |
| 100% Natural Bran | 8 | high |
| All-Bran | 5 | medium |
| All-Bran with Extra Fiber | 0 | low |
| Almond Delight | 8 | high |
| Apple Cinnamon Cheerios | 10 | high |
| Apple Jacks | 14 | very high |
| Basic 4 | 8 | high |
| Bran Chex | 6 | high |
| Bran Flakes | 5 | medium |
| Cap'n'Crunch | 12 | very high |
| Cheerios | 1 | low |
| Cinnamon Toast Crunch | 9 | high |
| Clusters | 7 | high |
| Cocoa Puffs | 13 | very high |
| Corn Chex | 3 | medium |
| Corn Flakes | 2 | low |
| Corn Pops | 12 | very high |
| Count Chocula | 13 | very high |
| Cracklin' Oat Bran | 7 | high |
| Cream of Wheat (Quick) | 0 | low |
| Crispix | 3 | medium |
| Crispy Wheat & Raisins | 10 | high |
| Double Chex | 5 | medium |
| Froot Loops | 13 | very high |
| Frosted Flakes | 11 | very high |
| Frosted Mini-Wheats | 7 | high |
| Fruit & Fibre Dates; Walnuts; and Oats | 10 | high |
| Fruitful Bran | 12 | very high |
| Fruity Pebbles | 12 | very high |
| Golden Crisp | 15 | very high |
| Golden Grahams | 9 | high |
| Grape Nuts Flakes | 5 | medium |
| Grape-Nuts | 3 | medium |
| Great Grains Pecan | 4 | medium |
| Honey Graham Ohs | 11 | very high |
| Honey Nut Cheerios | 10 | high |
| Honey-comb | 11 | very high |
| Just Right Crunchy Nuggets | 6 | high |
| Just Right Fruit & Nut | 9 | high |
| Kix | 3 | medium |
| Life | 6 | high |
| Lucky Charms | 12 | very high |
| Maypo | 3 | medium |
| Muesli Raisins; Dates; & Almonds | 11 | very high |
| Muesli Raisins; Peaches; & Pecans | 11 | very high |
| Mueslix Crispy Blend | 13 | very high |
| Multi-Grain Cheerios | 6 | high |
| Nut&Honey Crunch | 9 | high |
| Nutri-Grain Almond-Raisin | 7 | high |
| Nutri-grain Wheat | 2 | low |
| Oatmeal Raisin Crisp | 10 | high |
| Post Nat. Raisin Bran | 14 | very high |
| Product 19 | 3 | medium |
| Puffed Rice | 0 | low |
| Puffed Wheat | 0 | low |
| Quaker Oat Squares | 6 | high |
| Quaker Oatmeal | -1 | NA |
| Raisin Bran | 12 | very high |
| Raisin Nut Bran | 8 | high |
| Raisin Squares | 6 | high |
| Rice Chex | 2 | low |
| Rice Krispies | 3 | medium |
| Shredded Wheat | 0 | low |
| Shredded Wheat 'n'Bran | 0 | low |
| Shredded Wheat spoon size | 0 | low |
| Smacks | 15 | very high |
| Special K | 3 | medium |
| Strawberry Fruit Wheats | 5 | medium |
| Total Corn Flakes | 3 | medium |
| Total Raisin Bran | 14 | very high |
| Total Whole Grain | 3 | medium |
| Triples | 3 | medium |
| Trix | 12 | very high |
| Wheat Chex | 3 | medium |
| Wheaties | 3 | medium |
| Wheaties Honey Gold | 8 | high |
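A sketch of a `case_when()` call that reproduces the labels in the table above. The cutoffs (below 3, below 6, below 11) are assumptions inferred from the table, and `sugar_small` is a hypothetical toy subset.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data.
sugar_small <- tribble(
  ~name,                     ~sugars,
  "Quaker Oatmeal",          -1,
  "Corn Flakes",              2,
  "All-Bran",                 5,
  "100% Bran",                6,
  "Apple Cinnamon Cheerios", 10,
  "Frosted Flakes",          11
)

# Conditions are checked in order; the first one that is TRUE wins.
# The cutoffs are inferred from the table, not taken from the original code.
sugar_labeled <- sugar_small |>
  mutate(sugar_level = case_when(
    sugars == -1 ~ NA_character_,   # -1 codes a missing value
    sugars < 3   ~ "low",
    sugars < 6   ~ "medium",
    sugars < 11  ~ "high",
    .default = "very high"
  ))

sugar_labeled
```

Because the conditions are evaluated top to bottom, each later condition only needs an upper bound.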
group_by() + slice()

For each manuf, find the cereal with the most fiber.
| name | manuf | type | calories | protein | fat | sodium | fiber | carbo | sugars | potass | vitamins | shelf | weight | cups | rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Maypo | A | hot | 100 | 4 | 1 | 0 | 0.0 | 16 | 3 | 95 | 25 | 2 | 1.00 | 1.00 | 54.85092 |
| Total Raisin Bran | G | cold | 140 | 3 | 1 | 190 | 4.0 | 15 | 14 | 230 | 100 | 3 | 1.50 | 1.00 | 28.59278 |
| All-Bran with Extra Fiber | K | cold | 50 | 4 | 0 | 140 | 14.0 | 8 | 0 | 330 | 25 | 3 | 1.00 | 0.50 | 93.70491 |
| 100% Bran | N | cold | 70 | 4 | 1 | 130 | 10.0 | 5 | 6 | 280 | 25 | 3 | 1.00 | 0.33 | 68.40297 |
| Post Nat. Raisin Bran | P | cold | 120 | 3 | 1 | 200 | 6.0 | 11 | 14 | 260 | 25 | 3 | 1.33 | 0.67 | 37.84059 |
| Quaker Oatmeal | Q | hot | 100 | 5 | 2 | 0 | 2.7 | -1 | -1 | 110 | 0 | 1 | 1.00 | 0.67 | 50.82839 |
| Bran Chex | R | cold | 90 | 2 | 1 | 200 | 4.0 | 15 | 6 | 125 | 25 | 1 | 1.00 | 0.67 | 49.12025 |
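One way to get a result like the table above is `slice_max()` on grouped data. This is a sketch on a hypothetical toy subset (`cereal_small`). Setting `with_ties = FALSE` is an assumption made here so that each manufacturer contributes exactly one row.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data.
cereal_small <- tribble(
  ~name,                       ~manuf, ~fiber,
  "All-Bran",                  "K",     9,
  "All-Bran with Extra Fiber", "K",    14,
  "100% Bran",                 "N",    10,
  "Shredded Wheat",            "N",     3,
  "Cheerios",                  "G",     2,
  "Total Raisin Bran",         "G",     4
)

# Within each manufacturer, keep the single row with the largest fiber value.
top_fiber <- cereal_small |>
  group_by(manuf) |>
  slice_max(order_by = fiber, n = 1, with_ties = FALSE) |>
  ungroup()

top_fiber
```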
slice()

Find the 3 cereals with the highest fiber and potass.
| name | manuf | type | calories | protein | fat | sodium | fiber | carbo | sugars | potass | vitamins | shelf | weight | cups | rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All-Bran with Extra Fiber | K | cold | 50 | 4 | 0 | 140 | 14 | 8 | 0 | 330 | 25 | 3 | 1 | 0.50 | 93.70491 |
| 100% Bran | N | cold | 70 | 4 | 1 | 130 | 10 | 5 | 6 | 280 | 25 | 3 | 1 | 0.33 | 68.40297 |
| All-Bran | K | cold | 70 | 4 | 1 | 260 | 9 | 7 | 5 | 320 | 25 | 3 | 1 | 0.33 | 59.42551 |
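A sketch of ordering by two variables at once, assuming dplyr 1.1.0 or later, where `slice_max()` accepts several ordering variables wrapped in a tibble. `cereal_small` is a hypothetical toy subset. Ties in `fiber` would be broken by `potass`.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data.
cereal_small <- tribble(
  ~name,                       ~fiber, ~potass,
  "Cheerios",                   2,     105,
  "All-Bran",                   9,     320,
  "Post Nat. Raisin Bran",      6,     260,
  "100% Bran",                 10,     280,
  "All-Bran with Extra Fiber", 14,     330
)

# Order by fiber first, then potass, and keep the top 3 rows.
top_three <- cereal_small |>
  slice_max(order_by = tibble(fiber, potass), n = 3)

top_three
```

An equivalent approach would be `arrange(desc(fiber), desc(potass))` followed by `slice_head(n = 3)`.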
For each type of cereal, calculate the mean nutrient levels.
SO MUCH COPY-PASTE!
There are 9 different nutrient columns in the dataset! There has to be a better way…
across()

For each type of cereal, calculate the mean nutrient levels.
| type | calories | protein | fat | sodium | fiber | carbo | sugars | potass |
|---|---|---|---|---|---|---|---|---|
| cold | 107.1622 | 2.486486 | 1.013513 | 165.06757 | 2.189189 | 14.7027 | 7.1756757 | 97.21622 |
| hot | 100.0000 | 4.000000 | 1.000000 | 26.66667 | 1.233333 | 12.0000 | 0.6666667 | 68.00000 |
So much better!
across()

Within the summarize() function, we use the across() function with two main arguments:

- .cols – to specify the columns to apply functions to.
- .fns – to specify the functions to apply.

Inside the function supplied to .fns, use .x as a placeholder for the variable being passed in.

Use lambda functions: ~ <FUN_NAME>(.x, <ARGS>)
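Putting these pieces together might look like the sketch below. `cereal_small` is a hypothetical toy subset with only three nutrient columns, so the means will not match the full-data table.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data.
cereal_small <- tribble(
  ~name,            ~type,  ~calories, ~protein, ~fat,
  "100% Bran",      "cold",  70,       4,        1,
  "All-Bran",       "cold",  70,       4,        1,
  "Cheerios",       "cold", 110,       6,        2,
  "Maypo",          "hot",  100,       4,        1,
  "Quaker Oatmeal", "hot",  100,       5,        2
)

# .cols picks the columns; .fns is a lambda in which .x stands for each column.
nutrient_means <- cereal_small |>
  group_by(type) |>
  summarize(across(.cols = calories:fat,
                   .fns = ~ mean(.x, na.rm = TRUE)))

nutrient_means
```

The single `across()` call replaces one `mean()` line per nutrient column.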
across() select()

For each type of cereal, calculate the means of all numeric variables.
| type | calories | protein | fat | sodium | fiber | carbo | sugars | potass | vitamins | shelf | weight | cups | rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cold | 107.1622 | 2.486486 | 1.013513 | 165.06757 | 2.189189 | 14.7027 | 7.1756757 | 97.21622 | 29.054054 | 2.229730 | 1.030811 | 0.8182432 | 42.09522 |
| hot | 100.0000 | 4.000000 | 1.000000 | 26.66667 | 1.233333 | 12.0000 | 0.6666667 | 68.00000 | 8.333333 | 1.666667 | 1.000000 | 0.8900000 | 56.73771 |
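One way to target every numeric column is the selection helper `where(is.numeric)`. This is a sketch on a hypothetical toy subset (`cereal_small`). The slide's original code is not shown, so this is not necessarily how the table above was produced.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data.
cereal_small <- tribble(
  ~name,            ~type,  ~calories, ~protein, ~rating,
  "100% Bran",      "cold",  70,       4,        68.40297,
  "All-Bran",       "cold",  70,       4,        59.42551,
  "Maypo",          "hot",  100,       4,        54.85092,
  "Quaker Oatmeal", "hot",  100,       5,        50.82839
)

# where(is.numeric) selects every numeric column; name is character, so it is skipped.
numeric_means <- cereal_small |>
  group_by(type) |>
  summarize(across(.cols = where(is.numeric),
                   .fns = ~ mean(.x, na.rm = TRUE)))

numeric_means
```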
across()

Break it down:

.x and add a ~ in front for the .fns argument. You have created a lambda function!

.cols argument

across(): Related Functions

These functions are used with filter() to select rows based on a logical statement applied to multiple columns.
- if_any() – returns a logical vector (one element for each row) that is TRUE if the logical statement is true for any column in the supplied columns
- if_all() – returns a logical vector (one element for each row) that is TRUE if the logical statement is true for all columns in the supplied columns

if_any() Example

Remember the warnings you got in PA3 when converting some columns to numeric? If you look at the original data, you can see this is because missing values were indicated with the string "NULL".
| INSTNM | CITY | STABBR | ZIP | ADM_RATE | SAT_AVG | UGDS | TUITIONFEE_IN | TUITIONFEE_OUT | CONTROL | REGION |
|---|---|---|---|---|---|---|---|---|---|---|
| Alabama A & M University | Normal | AL | 35762 | 0.9027 | 929 | 4824 | 9857 | 18236 | 1 | 5 |
| University of Alabama at Birmingham | Birmingham | AL | 35294-0110 | 0.9181 | 1195 | 12866 | 8328 | 19032 | 1 | 5 |
| Amridge University | Montgomery | AL | 36117-3553 | NULL | NULL | 322 | 6900 | 6900 | 2 | 5 |
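A sketch of how `if_any()` and `if_all()` could be used to find or drop those rows. `colleges_small` is a hypothetical toy version of the table above, with the affected columns stored as character (which is why "NULL" can appear in them).

```r
library(dplyr)

# Hypothetical toy version of the colleges data.
colleges_small <- tribble(
  ~INSTNM,                               ~ADM_RATE, ~SAT_AVG,
  "Alabama A & M University",            "0.9027",  "929",
  "University of Alabama at Birmingham", "0.9181",  "1195",
  "Amridge University",                  "NULL",    "NULL"
)

# Rows where ANY of the selected columns equals the string "NULL".
has_null <- colleges_small |>
  filter(if_any(.cols = ADM_RATE:SAT_AVG, .fns = ~ .x == "NULL"))

# Rows where ALL of the selected columns differ from "NULL".
no_null <- colleges_small |>
  filter(if_all(.cols = ADM_RATE:SAT_AVG, .fns = ~ .x != "NULL"))

has_null
no_null
```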
ggplot()

Plot the mean protein per cup for each manuf.

cereal |>
  mutate(manuf = case_when(manuf == "A" ~ "American Home Food Products",
                           manuf == "G" ~ "General Mills",
                           manuf == "K" ~ "Kelloggs",
                           manuf == "N" ~ "Nabisco",
                           manuf == "P" ~ "Post",
                           manuf == "Q" ~ "Quaker Oats",
                           manuf == "R" ~ "Ralston Purina")) |>
  filter(type == "cold") |>
  mutate(pro_per_cup = protein / cups) |>
  group_by(manuf) |>
  summarise(mean_pro_per_cup = mean(pro_per_cup)) |>
  ggplot(aes(x = manuf,
             y = mean_pro_per_cup)) +
  geom_point(size = 6) +
  labs(x = "Manufacturer",
       subtitle = "Mean Protein per Cup") +
  theme_bw() +
  theme(axis.title.y = element_blank(),
        axis.title.x = element_text(size = 24),
        plot.subtitle = element_text(size = 24),
        axis.text = element_text(size = 20),
        axis.text.x = element_text(angle = 13)) +
  scale_y_continuous(limits = c(0,6))

ggplot()
How would you make this plot from the diamonds dataset in ggplot2?
diamonds |>
  mutate(category = case_when(price < 1000 ~ "<$1k",
                              price <= 5000 ~ "$1k-$5k",
                              .default = ">$5k")) |>
  ggplot(mapping = aes(x = cut,
                       fill = cut)) +
  geom_bar() +
  facet_wrap(vars(category)) +
  labs(subtitle = "Number of Diamonds",
       x = "Cut",
       y = "",
       fill = "Cut") +
  theme(axis.text.x = element_blank(),
        axis.title = element_text(size = 14),
        legend.title = element_text(size = 14),
        legend.text = element_text(size = 14),
        strip.text = element_text(size = 14),
        title = element_text(size = 14))

Just like when creating graphics with ggplot, wrangling data with dplyr involves thinking through many steps and writing many layers of code.
This might involve…
dplyr verbs and variable names.

head of the dataframe.

What is the median grams of sugars per shelf and the number of cereals per shelf, when we drop the missing values (coded as sugars = -1)?
The person with the nearest birthday: explain out loud to your neighbor how you would do this manipulation.
1. What do we mean by data ethics?
2. Why do we (as statisticians, data scientists, folks working with data) need to think about data ethics?
With the people next to you discuss:
Source: Data Feminism by Catherine D’Ignazio and Lauren Klein (2020)
Heather Kraus suggests asking 5 questions of your data:
I would love to discuss these with you in office hours!
dplyr cheatsheet

kable() for formatting in labs

knitr package at the beginning of your file

kable() outputs a markdown version of your data
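A minimal sketch of `kable()` in use. The data frame `cereal_small` is a hypothetical toy example, and `format = "pipe"` is set explicitly here (an assumption, not something the slides specify) so the output is a markdown pipe table regardless of where the code is run.

```r
library(knitr)

# Hypothetical toy data frame to format.
cereal_small <- data.frame(
  name = c("100% Bran", "All-Bran"),
  manuf = c("N", "K"),
  calories = c(70, 70)
)

# kable() returns the table as lines of markdown text.
md_table <- kable(cereal_small, format = "pipe")

md_table
```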