Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Function in R / dplyr similar to VLOOKUP when using mutate()

I want to use dplyr’s mutate and subtract a variable from another one, that I have to find based on two criteria.

Here’s an example of what I want to do:

# Generating the data
mode_of_travel <- c("car", "car", "plane", "plane", "train", "train")
variant <- c("slow","fast","slow","fast","slow","fast") 
speed <- c(5, 7, 10, 14, 6, 7)
df <- data.frame(mode_of_travel, variant, speed)
# Data result
  mode_of_travel variant speed
1            car    slow     5
2            car    fast     7
3          plane    slow    10
4          plane    fast    14
5          train    slow     6
6          train    fast     7

and now I want to find the difference in speed between every mode and variant and the corresponding "slow" variant:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

# Computing the speed difference between the slow and the fast variant
df %>% mutate(speed_difference = speed - case_when(mode_of_travel == "car" ~ 7,
                                                  mode_of_travel == "plane" ~ 5,
                                                  mode_of_travel == "train" ~ 4))

so the output looks like this:

  mode_of_travel variant speed speed_difference
1            car    slow     5                0
2            car    fast     7                2
3          plane    slow    10                0
4          plane    fast    14                4
5          train    slow     6                0
6          train    fast     7                1

But of course I don’t want to manually do this via the "case_when" function and typing in the value.
How can this be done properly?

Thanks 🙂

>Solution :

I would describe this as within each mode_of_travel group you want to subtract the "slow" variant speed from the current row’s speed:

df %>%
  group_by(mode_of_travel) %>%
  mutate(speed_difference = speed - speed[variant == "slow"]) %>%
  ungroup()
# # A tibble: 6 × 4
#   mode_of_travel variant speed speed_difference
#   <chr>          <chr>   <dbl>            <dbl>
# 1 car            slow        5                0
# 2 car            fast        7                2
# 3 plane          slow       10                0
# 4 plane          fast       14                4
# 5 train          slow        6                0
# 6 train          fast        7                1

This code assumes there is exactly one "slow" variant within each mode of travel group.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading