Condensing a data frame in R to one row

I have a list of data frames. Each data frame in the list looks something like this: the columns x_, y_, and t_ have the same values in every row, and the only things that differ are the var1, var2, and var3 values:

x_ y_ t_ var1 var2 var3
 1  1  1    5   NA   NA
 1  1  1   NA    9   NA
 1  1  1   NA   NA   20

Here is the code for an example of the data frame above:

df <- data.frame(x_ = c(1,1,1),
                 y_ = c(1,1,1),
                 t_ = c(1,1,1),
                 var1 = c(5, NA, NA),
                 var2 = c(NA, 9, NA),
                 var3 = c(NA, NA, 20))

I would like to condense each data frame into a single row, so that it looks something like this:

x_ y_ t_ var1 var2 var3
 1  1  1    5    9   20

Is there a good way to do this?

Solution:

One potential option is to fill the NAs in every column (down, then up) and then remove the duplicate rows, e.g.

library(tidyverse)
library(vctrs)

df <- data.frame(x_ = c(1,1,1),
                 y_ = c(1,1,1),
                 t_ = c(1,1,1),
                 var1 = c(5, NA, NA),
                 var2 = c(NA, 9, NA),
                 var3 = c(NA, NA, 20))

df2 <- df %>%
  mutate(across(everything(),
                ~vec_fill_missing(.x, direction = "downup")))
df2
#>   x_ y_ t_ var1 var2 var3
#> 1  1  1  1    5    9   20
#> 2  1  1  1    5    9   20
#> 3  1  1  1    5    9   20

df2 %>%
  distinct()
#>   x_ y_ t_ var1 var2 var3
#> 1  1  1  1    5    9   20
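
The question mentions a list of data frames, while the example above works on a single one. A minimal sketch of applying the same fill-then-deduplicate steps to every element of a list follows. The helper name `condense_df` and the second data frame in `df_list` are made up for illustration and are not part of the original answer.

```r
library(dplyr)
library(vctrs)

# Hypothetical helper: fill NAs within each column (down, then up),
# then drop the rows that have become duplicates.
condense_df <- function(d) {
  d %>%
    mutate(across(everything(),
                  ~vec_fill_missing(.x, direction = "downup"))) %>%
    distinct()
}

# Example list; the second data frame uses invented values.
df_list <- list(
  data.frame(x_ = c(1, 1, 1), y_ = c(1, 1, 1), t_ = c(1, 1, 1),
             var1 = c(5, NA, NA), var2 = c(NA, 9, NA), var3 = c(NA, NA, 20)),
  data.frame(x_ = c(2, 2, 2), y_ = c(3, 3, 3), t_ = c(4, 4, 4),
             var1 = c(NA, 7, NA), var2 = c(8, NA, NA), var3 = c(NA, NA, 6))
)

# Apply the helper to each data frame; the result is again a list.
condensed <- lapply(df_list, condense_df)
```

If a single combined data frame is more convenient than a list, `bind_rows(condensed)` should stack the one-row results.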

If a column is NA in every row, there is nothing to fill it with, so with this approach it stays NA in the final distinct row:

df3 <- data.frame(x_ = c(1,1,1),
                  y_ = c(1,1,1),
                  t_ = c(1,1,1),
                  var1 = c(5, NA, NA),
                  var2 = c(NA, 9, NA),
                  var3 = c(NA, NA, NA))

df4 <- df3 %>%
  mutate(across(everything(),
                ~vec_fill_missing(.x, direction = "downup")))
df4
#>   x_ y_ t_ var1 var2 var3
#> 1  1  1  1    5    9   NA
#> 2  1  1  1    5    9   NA
#> 3  1  1  1    5    9   NA

df4 %>%
  distinct()
#>   x_ y_ t_ var1 var2 var3
#> 1  1  1  1    5    9   NA

Created on 2023-03-17 with reprex v2.0.2
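
One caveat with filling and deduplicating: if a column holds two different non-NA values, the filled rows are no longer identical and `distinct()` will return more than one row. As a possible alternative (a sketch, not part of the original answer), you could group by the key columns and keep the first non-NA value of every other column. This assumes x_, y_, and t_ identify the group and that taking the first non-NA value is acceptable for your data.

```r
library(dplyr)

df3 <- data.frame(x_ = c(1, 1, 1),
                  y_ = c(1, 1, 1),
                  t_ = c(1, 1, 1),
                  var1 = c(5, NA, NA),
                  var2 = c(NA, 9, NA),
                  var3 = c(NA, NA, NA))

# One row per (x_, y_, t_) group: for each remaining column keep the
# first non-NA value, or NA when the whole column is missing.
res <- df3 %>%
  group_by(x_, y_, t_) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]),
            .groups = "drop")
```

Because each group is summarised to exactly one row, this always yields a single row per key combination, at the cost of silently discarding any later conflicting values.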
