Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Transform long-to-wide database spreading data on multiple columns in R

I am struggling in transforming a long dataframe to wide version, but with some complications.
I have a column, ID, with duplicated entry – each refers to a timepoint for an individual. I have also other columns (namely, visit, var1 and var2), in which data regarding each timepoint are reported. Here is a reproducible example:

df <- data.frame(ID=c(1,1,1,1,2,2,2,3,3,3),
                 visit=c(1,4,5,7,1,3,4,2,5,6),
                 var1=c("AF","no","no","no","AG","AG","no","BA","BA","BA"),
                 var2=c("good","good","good","bad","good","good","bad","good","good","good"))

And the output:

   ID visit var1 var2
1   1     1   AF good
2   1     4   no good
3   1     5   no good
4   1     7   no  bad
5   2     1   AG good
6   2     3   AG good
7   2     4   no  bad
8   3     2   BA good
9   3     5   BA good
10  3     6   BA good

I really need to end up with a dataframe which contains only one row per ID, and multiple columns for each of the other variables, with a numeric suffix for example (such as visit_1, visit_2, visit_3 etc.).

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

The output I have in mind is like this:

  ID visit_1 visit_2 visit_3 visit_4 var1_1 var1_2 var1_3 var1_4 var2_1 var2_2 var2_3 var2_4
1  1       1       4       5       7     AF     no     no     no   good   good   good    bad
2  2       1       3       4      NA     AG     AG     no   <NA>   good   good    bad   <NA>
3  3       2       5       6      NA     BA     BA     BA   <NA>   good   good   good   <NA>.

in which essentially each entry of the columns visit, var1 and var2 has been placed in a separate, sequential column based on the ID column.

I have tried data.frame::dcast and tidyr::spread, and also pivot_wider() but it seems like these formula will end up with multiple column based on the actual values rather than producing fixed columns instead. For example, with pivot_wider():

df %>% pivot_wider(names_from = ID, values_from = c("visit","var1","var2"))

It returns me an error saying

Warning messages:
1: Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 
2: Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 
3: Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates 

Anyone can help?

>Solution :

You have to add a sequential ID for each group:

library(tidyverse)
df %>% 
  group_by(ID) %>% 
  mutate(count = row_number()) %>% 
  pivot_wider(ID, names_from = count, values_from = c(visit, var1, var2))

# A tibble: 3 x 13
# Groups:   ID [3]
#     ID visit_1 visit_2 visit_3 visit_4 var1_1 var1_2 var1_3 var1_4 var2_1 var2_2 var2_3 var2_4
#  <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <chr>  <chr>  <chr>  <chr>  <chr>  <chr>  <chr>  <chr> 
#1     1       1       4       5       7 AF     no     no     no     good   good   good   bad   
#2     2       1       3       4      NA AG     AG     no     NA     good   good   bad    NA    
#3     3       2       5       6      NA BA     BA     BA     NA     good   good   good   NA    
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading