I have two data frames. One of them contains an ID column, while the other does not. Both have a NumID column that can be used as a reference, and a date column that can be used too. I would like to use NumID and the first date for each ID in df to append an ID column to df2.
library(lubridate)
library(tidyverse)
library(purrr)
date <- rep_len(seq(dmy("01-01-2011"), dmy("25-01-2011"), by = "days"), 25)
ID <- rep(c("A","B", "C"), 25)
NumID <- rep(c("00001", "00002", "00003"), 25)
df <- data.frame(date = date,
                 ID,
                 NumID)
date2 <- c("01-01-2011", "2011-01-02", "2011-01-03")
NumID2 <- c("00001", "00002", "00003")
df2 <- data.frame(date = date2,
                  NumID = NumID2)
My expected output would look something like this:
ID2 <- c("A","B", "C")
expected <- data.frame(date = date2,
                       NumID = NumID2,
                       ID = ID2)
>Solution :
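The date column of ‘df2’ mixes two formats (day-month-year and year-month-day). One option is to convert it to Date class with parse_date from the parsedate package, so that it matches the Date column in ‘df’, and then do a join on the shared columns (date and NumID)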
library(parsedate)
library(dplyr)
df2$date <- as.Date(parse_date(df2$date))
# or use `lubridate::parse_date_time` with formats
# df2$date <- as.Date(lubridate::parse_date_time(df2$date, c("dmy", "ymd")))
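# join on both shared columns; naming them avoids the "Joining with `by = ...`" message
left_join(df2, df, by = c("date", "NumID"))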
-output
        date NumID ID
1 2011-01-01 00001  A
2 2011-01-02 00002  B
3 2011-01-03 00003  C
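If you would rather not depend on parsedate, the same conversion can be sketched in base R by trying each candidate format in turn. The helper name `parse_mixed` and the format list are my own assumptions for illustration, not part of any package. The order of the formats matters: with "%Y-%m-%d" first, a string such as "01-01-2011" can be misread as year 1, so the day-first format is tried before it.

```r
# Hypothetical helper: parse a character vector whose elements use
# different date formats, trying each format in the order given.
parse_mixed <- function(x, formats = c("%d-%m-%Y", "%Y-%m-%d")) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)                          # only elements not parsed yet
    out[todo] <- as.Date(x[todo], format = fmt) # NA where the format does not fit
  }
  out
}

date2 <- c("01-01-2011", "2011-01-02", "2011-01-03")
parsed <- parse_mixed(date2)
print(parsed)
```

Anything that matches none of the formats stays NA, so it is worth checking `anyNA(parsed)` before joining.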
Or with a chain (%>%)
df2 %>%
  mutate(date = as.Date(parse_date(date))) %>%
  left_join(df, by = c("date", "NumID"))
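As an aside, if each NumID always corresponds to exactly one ID (which holds for the example data, but is an assumption about your real data), the dates are not needed at all. A minimal sketch: build a small lookup table from ‘df’ and join on NumID only. The names `lookup` and `result` are just illustrative.

```r
library(dplyr)

# Rebuild the example data (the 25 dates are recycled to 75 rows, as in the question)
df <- data.frame(
  date  = rep(seq(as.Date("2011-01-01"), as.Date("2011-01-25"), by = "day"), 3),
  ID    = rep(c("A", "B", "C"), 25),
  NumID = rep(c("00001", "00002", "00003"), 25)
)
df2 <- data.frame(
  date  = c("01-01-2011", "2011-01-02", "2011-01-03"),
  NumID = c("00001", "00002", "00003")
)

# One row per NumID/ID pair; assumes NumID maps to a single ID
lookup <- distinct(df, NumID, ID)

# The date column in df2 is left untouched because it is not a join key here
result <- left_join(df2, lookup, by = "NumID")
print(result)
```

If a NumID could map to more than one ID, this join would duplicate rows in ‘df2’, and the join on date and NumID shown above is the safer choice.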