I have grouped data like follows:
group id var
1 1 1 1
2 1 2 0
3 1 3 1
4 2 1 0
5 2 2 1
6 3 1 0
I’d like to create N new var columns (var1, var2,…) where N is the max number of individuals in a group (in this case N = 3 as largest group is group 1 which has three ids.
var1 should contain the var value for id = 1 in each group; likewise var2 should contain the var value for each id = 2 in each group. Where the number of columns exceeds the number in a group it should return NA.
The sought output for the above example would be:
group id var var1 var2 var3
1 1 1 1 1 0 1
2 1 2 0 1 0 1
3 1 3 1 1 0 1
4 2 1 0 0 1 NA
5 2 2 1 0 1 NA
6 3 1 0 0 NA NA
Example Code
a <- data.frame(group = c(1,1,1,2,2,3),
id = c(1,2,3,1,2,1),
var = c(1,0,1,0,1,0))
>Solution :
Looks like you want to pivot the dataframe wider: you can use pivot_wider(). If you want to keeps the original rows, you can also use right_join().
library(tidyverse)
a %>% right_join(
pivot_wider(a,values_from = var, names_from = id, names_prefix = "var"),
by='group'
)
Output:
group id var var1 var2 var3
1 1 1 1 1 0 1
2 1 2 0 1 0 1
3 1 3 1 1 0 1
4 2 1 0 0 1 NA
5 2 2 1 0 1 NA
6 3 1 0 0 NA NA
However, maybe is this a bit redundant? Just using pivot_wider() keeps the same info because varx are the original id column:
library(tidyverse)
a %>%
pivot_wider(values_from = var, names_from = id, names_prefix = "var")
# A tibble: 3 x 4
group var1 var2 var3
<dbl> <dbl> <dbl> <dbl>
1 1 1 0 1
2 2 0 1 NA
3 3 0 NA NA