This is my first post so hopefully I have included all the necessary information. I don’t think it replicates previously answered questions.
Eventually I am trying to combine numerous datasets together to create a single one with all the necessary variables for our project. So far I have been using dplyr to combine these sets based on by = c(‘Country’ and ‘Year’). This has been successful when the data are presented in the same way in both e.g. a column with country, a column for year, and then the variables in the following columns (see below).
Country | Year | Temperature |
---|---|---|
Germany | 2010 | 20 |
Germany | 2009 | 19 |
Germany | 2008 | 18 |
Australia | 2010 | 24 |
Australia | 2009 | 23 |
Australia | 2018 | 22 |
Spain | 2010 | 23 |
Spain | 2009 | 24 |
Spain | 2008 | 21 |
However, I can’t seem to work out how to combine them when they are in different formats e.g. one as the example above and the other as in the example below (containing data on annual rainfall mm).
Country | 2010 | 2009 | 2008 |
---|---|---|---|
Germany | 800 | 900 | 1000 |
Australia | 700 | 600 | 750 |
Spain | 800 | 650 | 550 |
I would like a combined table in the format of the first example but now with an additional column for annual rainfall taken from the second df.
Any suggestions would be greatly appreciated as I have been trying without success for quite a while. I thought some form of Pivot (?) but unfortunately I don’t have any examples of my attempts to include because I am at a complete loss.
>Solution :
Your second dataframe is in wide format, so you should convert it into long format
library(dplyr, tidyr)
df2 <- read.table(text="Country 2010 2009 2008
Germany 800 900 1000
Australia 700 600 750
Spain 800 650 550", header=TRUE)
df2 %>%
pivot_longer(-Country, names_to = "Year", values_to = "Temperature") %>%
mutate(Year = as.integer(sub("X", "", Year)))
Country Year Temperature
<chr> <int> <int>
1 Germany 2010 800
2 Germany 2009 900
3 Germany 2008 1000
4 Australia 2010 700
5 Australia 2009 600
6 Australia 2008 750
7 Spain 2010 800
8 Spain 2009 650
9 Spain 2008 550
Then you can use merge
as you said