How do I condense one column of a data frame by checking another column?

March 14, 2024

I’m very new to coding so I apologize if my question is terribly worded. I have a data frame with two columns like this,

record        x

1             NA
1              2
1              1
2             NA
3              9
3              1
4              2
5             NA
6             4
6             NA

and I need to create a new data frame or matrix which has rows equal to the unique values of record, and columns which contain each value of x for that given unique value of instance (and 0 if NA), like this:

1   0   2   1  
2   0   0   0
3   9   1   0
4   2   0   0
5   0   0   0
6   4   0   0

Is there an easy way to alter the data like this?

Currently I’m trying to do this with nested for loops, with the first running a number of times equal to the number of unique values of record. I then check the total number of occurrences of that value, then try another for loop which iterates that number of times? I’m not really sure where I’m going but if there is a function or easier way to think about it I would greatly appreciate your help.

>Solution :

We can use pivot.wider

library(dplyr)
library(tidyr)
df %>% 
  group_by(record) %>% 
  mutate(id = 1:n(),
         x = coalesce(x, 0)) %>% 
  pivot_wider(names_from = id, values_from = x, values_fill = 0) 
  
# A tibble: 6 × 4
# Groups:   record [6]
  record   `1`   `2`   `3`
   <int> <dbl> <dbl> <dbl>
1      1     0     2     1
2      2     0     0     0
3      3     9     1     0
4      4     2     0     0
5      5     0     0     0
6      6     4     0     0

Sample data:

df <- read.table(text="record        x
1             NA
1              2
1              1
2             NA
3              9
3              1
4              2
5             NA
6             4
6             NA", header=TRUE)