How can I calculate a new column for a dataframe in R based on another dataframe?

Advertisements

I’m having trouble adding a new column to a dataframe based on values from another dataframe. I’m going to try to provide a minimal, replicable example; of course my real dataframes have lots of other columns, but I don’t think they matter here.

So let’s say I have two dataframes, DF1 and DF2.

ID <- c(1, 2, 3, 4, 5)
material <- c("a", "b", "c", "d", "a")
DF1 <- data.frame(ID, material)

ID <- c("a", "b", "c", "d,")
group <- c("x", "x", "y", "z")
DF2 <- data.frame(ID, group)

ID is a unique key in each dataframe.
I’d like to perform an operation where I match DF1$material to DF2$ID,
and add a column to DF1 that contains the associated value from DF2$group.

So in the above example the result would look like:

ID <- c(1, 2, 3, 4, 5)
material <- c("a", "b", "c", "d", "a")
group <- c("x", "x", "y", "z", "x")
DF1 <- data.frame(ID, material, group)

I’d preferably like to use tidyverse/dplyr to achieve this.
I’ve tried inner_join and left_join, as well as mutate() with an ifelse() statement.
The joins create a ton of duplicate columns, and the ifelse() produces a warning:
"longer object length is not a multiple of shorter object length".

I’m sure there’s an obvious solution here. How can I do this?

>Solution :

We may use the different column

library(dplyr)
left_join(DF1, DF2, by = c("material" = "ID"))
   ID material group
1  1        a     x
2  2        b     x
3  3        c     y
4  4        d     z
5  5        a     x

Leave a ReplyCancel reply