Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Add column in first data frame based upon two columns in second data frame

I am trying to add a column to a first data frame based upon a second data frame.
Basically, in the data frame 1, I have values, that are existing in data frame 2 but with additional information that I would like to extract into data frame 1.

Down below a MWE:

Class <- c("Fruit", "Fruit", "Vegetable", "Vegetable")
Food_a <- c("Apple", "Banana", "Onion", "Carrot")
dt_a <- data.frame(Class, Food_a)

dt_a
      Class Food_a
1     Fruit  Apple
2     Fruit Banana
3 Vegetable  Onion
4 Vegetable Carrot
   
Food_b <- c("Apple", "Bread", "Onion", "Banana")
dt_b <- data.frame(Food_b)

  Food_b
1  Apple
2  Bread
3  Onion
4 Banana

Desired output:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  Food_b Class
1  Apple Fruit
2  Bread NA
3  Onion Vegetable
4 Banana Fruit

I have looked at other questions on SO but I cannot seem to get them to work on my example. The closes one I could find were these two: R dplyr: Find a specific value in a column, then replace the adjacent cell in the subsequent columns to the right with that value and Replace NA in column with value in adjacent column.

As I have two data frames, I am unsure how to apply this.

I thought about using ifelse or case_when but I don’t know how to link these to the desired Class
Here is what I came up with so far:

dt_c <- dt_b %>% mutate(Class=(ifelse(dt_b$Food_b %in% dt_a$Food_a, dt_a$Class, NA)))

  Food_b     Class
1  Apple     Fruit
2  Bread      <NA>
3  Onion Vegetable
4 Banana Vegetable

I would be very grateful for any help and suggestions!

>Solution :

As you are using the tidyverse you just need a left_join

dt_b %>% left_join(dt_a %>% rename(Food_b = Food_a))

Joining, by = "Food_b"

  Food_b     Class
1  Apple     Fruit
2  Bread      <NA>
3  Onion Vegetable
4 Banana     Fruit

The rename is to ensure there is a column with the same name in each dataframe for joining. You can also use a by argument in the left_join to achieve the same thing (dt_b %>% left_join(dt_a, by = c("Food_b" = "Food_a"))).

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading