Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

how to put multiple cols into long format base on suffix of variable names

I have wide df with multiple measurements. I would like to change wide to long. How should I do this. I know how to do 2 cols, but not multiples. Could someone guide me on this?
Input on the top, and ideal output on the bottom:

enter image description here

enter image description here

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df<-structure(list(Subject = c("Tom", "Tom", "Tom", "Tom", "Tom", 
"Tom", "Tom", "Tom", "Jerry", "Jerry", "Jerry", "Jerry", "Jerry"
), VSWGT_RAW = c(54.8, 55.6, 54.6, NA, 55.5, 55.7, 58.3, 60.7, 
66.4, 66.4, NA, 65.5, 65.2), VSWGT_UN = c("kg", "kg", "kg", NA, 
"kg", "kg", "kg", "kg", "kg", "kg", NA, "kg", "kg"), VSWGT_STD = c(54.8, 
55.6, 54.6, NA, 55.5, 55.7, 58.3, 60.7, 66.4, 66.4, NA, 65.5, 
65.2), VSWGT_STD_UN = c("kg", "kg", "kg", NA, "kg", "kg", "kg", 
"kg", "kg", "kg", NA, "kg", "kg"), VSHT_RAW = c(161, NA, NA, 
NA, NA, NA, NA, NA, NA, 181, NA, NA, NA), VSHT_UN = c("cm", NA, 
NA, NA, NA, NA, NA, NA, NA, "cm", NA, NA, NA), VSHT_STD = c(161, 
NA, NA, NA, NA, NA, NA, NA, NA, 181, NA, NA, NA), VSHT_STD_UN = c("cm", 
NA, NA, NA, NA, NA, NA, NA, NA, "cm", NA, NA, NA), VSRRATE_RAW = c("19", 
"13", "ND", "15", "14", "16", "9", "16", "12", "30", "14", "15", 
"16"), VSSYSBP = c(115, 115, 104, 103, 111, 113, 138, 146, 102, 
87, 103, 132, 132), date = structure(c(1655078400, 1655337600, 
1660521600, 1655424000, 1655942400, 1656547200, 1657152000, 1657756800, 
1674691200, 1673827200, 1674777600, 1675296000, 1675900800), class = c("POSIXct", 
"POSIXt"), tzone = "UTC")), row.names = c(NA, -13L), class = c("tbl_df", 
"tbl", "data.frame"))

>Solution :

You can use the names_sep argument of pivot_longer, in combination with using the special .value token in names_to. I did have to change two things:

  • Change variable names to only include 1 underscore, for easier separation.
  • Make sure types match.
library(tidyverse)
    
df %>% 
  rename(VSWGT_STD.UN = VSWGT_STD_UN, VSHT_STD.UN = VSHT_STD_UN) %>% 
  mutate(VSRRATE_RAW = as.numeric(VSRRATE_RAW)) %>% 
  pivot_longer(-c(Subject, date), names_sep = '_', names_to = c('test', '.value'))

Yields:

# A tibble: 52 × 7
   Subject date                test      RAW UN      STD STD.UN
   <chr>   <dttm>              <chr>   <dbl> <chr> <dbl> <chr> 
 1 Tom     2022-06-13 00:00:00 VSWGT    54.8 kg     54.8 kg    
 2 Tom     2022-06-13 00:00:00 VSHT    161   cm    161   cm    
 3 Tom     2022-06-13 00:00:00 VSRRATE  19   NA     NA   NA    
 4 Tom     2022-06-13 00:00:00 VSSYSBP  NA   NA     NA   NA    
 5 Tom     2022-06-16 00:00:00 VSWGT    55.6 kg     55.6 kg    
 6 Tom     2022-06-16 00:00:00 VSHT     NA   NA     NA   NA    
 7 Tom     2022-06-16 00:00:00 VSRRATE  13   NA     NA   NA    
 8 Tom     2022-06-16 00:00:00 VSSYSBP  NA   NA     NA   NA    
 9 Tom     2022-08-15 00:00:00 VSWGT    54.6 kg     54.6 kg    
10 Tom     2022-08-15 00:00:00 VSHT     NA   NA     NA   NA    
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading