I have 2 df.
df_tailnum, a list of unique tailnum
TailNum
<chr>
1 N657AW
2 N834AW
3 N605AW
4 N312AW
5 N309AW
6 N733UW
7 N177UW
8 N404US
9 N722UW
10 N104UW
And df_planes, with information on the names of the tailnum
tailnum type manufacturer issue_date model status aircraft_type engine_type year
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 N997AT Corpo… BOEING 01/02/2003 717-… Valid Fixed Wing M… Turbo-Fan 2002
2 N997DL Corpo… MCDONNELL DO… 03/11/1992 MD-88 Valid Fixed Wing M… Turbo-Fan 1992
3 N998AT Corpo… BOEING 01/23/2003 717-… Valid Fixed Wing M… Turbo-Fan 2002
4 N998DL Corpo… MCDONNELL DO… 04/02/1992 MD-88 Valid Fixed Wing M… Turbo-Jet 1992
5 N999CA Forei… CANADAIR 07/09/2008 CL-6… Valid Fixed Wing M… Turbo-Jet 1998
6 N999DN Corpo… MCDONNELL DO… 04/02/1992 MD-88 Valid Fixed Wing M… Turbo-Jet 1992
I would like to know how do I add value from year column in df_planes to df_tailnum as a new column base on the corresponding tailnum?
something like this:
TailNum Year
<chr>
1 N657AW
2 N834AW
3 N605AW
4 N312AW
5 N309AW
6 N733UW
7 N177UW
8 N404US
9 N722UW
10 N104UW
I tried
df_tailnum <- df_planes %>%
if_else(df_planes$tailnum == df_tailnum$TailNum,select(df_planes$year))
mutate()
which I realise doesn’t make sense.
Please help.
>Solution :
-
if_elseis a vector-based function, not a verb for working on the whole data at once. At best it should be within amutateorsummarizecall, not on its own. (This is breaking partially because of the%>%dumping adata.frameinto its first argument.) -
(Almost) Never use
df_planes$within a dplyr pipe. -
This is an advanced operation known as "merge" or "join". It’s a good one to know, but if you’re just starting with
data.frames and R (and not yet SQL or similar), then it might be a big. See How to join (merge) data frames (inner, outer, left, right) and What's the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN? for good discussions on how it works (both in R and in general).
Try these:
dplyr
library(dplyr)
left_join(df_tailnum, df_planes, by = c(TailNum = "tailnum"))
# TailNum type manufacturer issue_date model status aircraft_type engine_type year
# 1 N657AW <NA> <NA> <NA> <NA> <NA> <NA> <NA> NA
# 2 N834AW <NA> <NA> <NA> <NA> <NA> <NA> <NA> NA
### ...
left_join(df_tailnum, select(df_planes, tailnum, year), by = c(TailNum = "tailnum"))
# TailNum year
# 1 N657AW NA
# 2 N834AW NA
# 3 N605AW NA
### ...
Nothing is found because your sample data has nothing in common between them, but the method is sound.
base R
merge(df_tailnum, df_planes, by.x = "TailNum", by.y = "tailnum", all.x = TRUE)