R Multiple Dataframe Column Matches to Populate Column

September 30, 2022

I have a dataframe ‘df1’ that looks like:

structure(list(MAPS_code = c("SARI", "SABO", "SABO", "SABO", 
"ISLA"), Location_code = c("LCP-", "LCP-", "LCP-", "LCP-", "LCP-"
), Contact = c("Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall"), Lat = c(NA, NA, NA, 
NA, NA), Long = c(NA, NA, NA, NA, NA)), row.names = c(NA, 5L), class = "data.frame")

And a second dataframe ‘df2’ that looks like:

structure(list(MAPS_code = c("SAFR", "SAGA", "ELPU", "ISLA", 
"SABO", "SATE", "QUST", "SARI", "PANA", "COPA", "LOAN", "GAPA", 
"MELI", "CAGO", "PINO", "GABO", "RIJA", "FILA", "AMIS"), Lat = c(8.765833, 
8.751389, 8.768611, 8.835833, 8.801111, 8.808333, 8.815, 8.827778, 
8.781667, 8.778333, 8.783333, 8.800833, 8.790278, 8.754444, 8.844444, 
8.801389, 8.786667, 8.785278, 8.952222), Long = c(-82.94277, 
-82.951111, -82.95, -82.963056, -82.917222, -82.924444, -82.923889, 
-82.924167, -82.896944, -82.955833, -82.938611, -82.972222, -82.967222, 
-82.925833, -82.97, -82.972222, -82.964722, -82.976111, -82.833333
), Contact = c("Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall"
), Location = c("LCP-", "LCP-", "LCP-", "LCP-", "LCP-", "LCP-", 
"LCP-", "LCP-", "LCP-", "LCP-", "LCP-", "LCP-", "LCP-", "LCP-", 
"LCP-", "LCP-", "LCP-", "LCP-", "LCP-")), class = "data.frame", row.names = c(NA, 
-19L))

How can I make populate each row of ‘Lat’ and ‘Long’ of df1 from ‘Lat’ and ‘Long’ of df2 when ‘Contact’, ‘Location’, and ‘MAPS_code’ of the corresponding rows match between df1 and df2? So that the result of df1 looks like:

structure(list(MAPS_code = c("SARI", "SABO", "SABO", "SABO", 
"ISLA"), Location_code = c("LCP-", "LCP-", "LCP-", "LCP-", "LCP-"
), Contact = c("Chase Mendenhall", "Chase Mendenhall", "Chase Mendenhall", 
"Chase Mendenhall", "Chase Mendenhall"), Lat = c("8.827778", "8.801111", "8.801111
", "8.801111", "8.835833"), Long = c("-82.92417", "-82.91722", "-82.91722", "-82.91722", "-82.96306")), row.names = c(NA, 5L), class = "data.frame")

>Solution :

This is very similar to @Anoushiravan R’s answer, but I’d suggest removing unnecessary variables before joining, rather than wrestling with the variable names generated by duplicate columns:

df1 %>%
    select(-Lat, -Long) %>%
    left_join(df2, by=c('MAPS_code', 'Location_code'='Location', 'Contact'))


  MAPS_code Location_code          Contact      Lat      Long
1      SARI          LCP- Chase Mendenhall 8.827778 -82.92417
2      SABO          LCP- Chase Mendenhall 8.801111 -82.91722
3      SABO          LCP- Chase Mendenhall 8.801111 -82.91722
4      SABO          LCP- Chase Mendenhall 8.801111 -82.91722
5      ISLA          LCP- Chase Mendenhall 8.835833 -82.96306