Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

R pivot_wider x multiple columns with names as ColName + row number

Thank you for looking, I struggled to think of a good name for the question.

I have looked at many different versions of my question here and elsewhere but haven’t found the exact answer I need.

I have data consisting of ID numbers, codes, and dates. There are often more than 1 row per ID, there is no set number of times an ID can appear. The data now are in long format, I need to pivot them to wide format. I need the result to be 1 row per ID, with the Code, and Date fields widened. I want to keep them the same names but add a +N to the end of the field name where the highest +N is the max number of times a single ID repeats…If an ID repeats 9 times, the pivot will produce 9 new columns for Code and 9 for Date, as Code1, Code2…Code9.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I can do it, see below, but the names are a total mess. I am unable to get the names to be neat the way I describe.

EG Data:

eg_data <- data.frame(
  ID = c('1','1','1','2', '2', '3', '4', '4') , 
  FName = c('John','John','John','Gina', 'Gina', 'Tom', 'Bobby', 'Bobby') , 
  LName = c('Smith','Smith','Smith','Jones', 'Jones', 'Anderson', 'Kennedy', 'Kennedy') , 
  Code = c('ECV','EDC','EER','ECV', 'ECV', 'EER', 'EDC', 'EER') , 
  Date = c('2022-04-23','2021-12-21','2020-01-25','2022-05-18', '2020-05-26', '2021-01-21', '2020-05-14', '2020-06-25')) 

What I’ve done – yes, it works, but it requires renaming every_single_column that gets widened. There must be a better way:

eg_data %>%
  group_by(ID) %>%
  mutate(
    Count = row_number(), 
    CodeName = paste0('ContactCode', Count),
    DateName = paste0('ContactDate', Count)) %>%
  ungroup() %>%
  select(-Count) %>%
  pivot_wider(
    names_from = c(CodeName, DateName), values_from = c(Code, Date)) -> LongNamesYuck

Desired Output (using two of the IDs above for brevity)

desired_format <- data.frame(
  ID = c('1','2'),
  FName = c('John', 'Gina'),
  LName = c('Smith', 'Jones'),
  Code1 = c('ECV', 'ECV'),
  Code2 = c('EDC', 'ECV'),
  Code3 = c('EER', NA),
  Date1 = c('2022-04-23', '2022-05-18'),
  Date2 = c('2021-12-21', '2020-05-26'),
  Date3 = c('2020-01-25', NA))

The example below gets close, but it uses the values from one of the fields as new field names, and I don’t want that.

How to `pivot_wider` multiple columns without combining the names?

Any help is appreciated, thank you.

>Solution :

We can make it compact i.e. using rowid (from data.table instead of two lines group_by/mutate), and then use pivot_wider with names_from on the sequence (‘rn’) column and the values_from as a vector of column names (quoted/unquoted)

library(dplyr)
library(tidyr)
library(data.table)
out <- eg_data %>%
  mutate(rn = rowid(ID, FName, LName)) %>% 
  pivot_wider(names_from = rn, values_from = c(Code, Date), names_sep = "")

-checking the output with desired_format

> desired_format
  ID FName LName Code1 Code2 Code3      Date1      Date2      Date3
1  1  John Smith   ECV   EDC   EER 2022-04-23 2021-12-21 2020-01-25
2  2  Gina Jones   ECV   ECV  <NA> 2022-05-18 2020-05-26       <NA>
> out[1:2,]
# A tibble: 2 × 9
  ID    FName LName Code1 Code2 Code3 Date1      Date2      Date3     
  <chr> <chr> <chr> <chr> <chr> <chr> <chr>      <chr>      <chr>     
1 1     John  Smith ECV   EDC   EER   2022-04-23 2021-12-21 2020-01-25
2 2     Gina  Jones ECV   ECV   <NA>  2022-05-18 2020-05-26 <NA>      
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading