Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How do I rearrange this data frame in R?

I am sure this has been answered, but I am having difficulty having things make sense in my specific use-case.

I have this table:

> data.frame(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
>            Class1 = c("AB", "AB", "BC", "BC", "BC", "CD", "CD", "AB", "BC"),
>            Value1 = c(80, 90, 50, 45, 65, 30, 75, 70, 80),
>            Class2 = c("BC", "BC", "CD", "DQ", "CD", "AB", "AB", NA, NA),
>            Value2 = c(10, 5, 30, 30, 20, 10, 5, NA, NA), 
>            Class3 = c("AF", "DQ", "AB", "CD", "AB", NA, NA, NA, NA),
>            Value3 = c(5, 5, 20, 15, 15, NA, NA, NA, NA), 
>            Class4 = c("DQ", NA, NA, NA, NA, NA, NA, NA, NA),
>            Value4 = c(5, NA, NA, NA, NA, NA, NA, NA, NA))

  ID Class1 Value1 Class2 Value2 Class3 Value3 Class4 Value4
1  1     AB     80     BC     10     AF      5     DQ      5
2  2     AB     90     BC      5     DQ      5   <NA>     NA
3  3     BC     50     CD     30     AB     20   <NA>     NA
4  4     BC     45     DQ     30     CD     15   <NA>     NA
5  5     BC     65     CD     20     AB     15   <NA>     NA
6  6     CD     30     AB     10   <NA>     NA   <NA>     NA
7  7     CD     75     AB      5   <NA>     NA   <NA>     NA
8  8     AB     70   <NA>     NA   <NA>     NA   <NA>     NA
9  9     BC     80   <NA>     NA   <NA>     NA   <NA>     NA

I want to pull out all the class values as rows and their corresponding value associated with the correct ID. I need the final table to look like this:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

  Class   ID_1   ID_2   ID_3   ID_4   ID_5   ID_6   ID_7   ID_8   ID_9
1    AB     80     90     20      0     15     10      5     70      0
2    BC     10      5     50     45     65      0      0      0     80
3    CD      0      0     30     15     20     30     75      0      0
4    DQ      5      5      0     30      0      0      0      0      0
5    AF      5      0      0      0      0      0      0      0      0

I have made it as far as transposing the table and creating the correct column titles, but I cannot figure out how to extract the Classes to it’s own column? This may be the completely wrong direction, as well…

> data <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
>            Class1 = c("AB", "AB", "BC", "BC", "BC", "CD", "CD", "AB", "BC"),
>            Value1 = c(80, 90, 50, 45, 65, 30, 75, 70, 80),
>            Class2 = c("BC", "BC", "CD", "DQ", "CD", "AB", "AB", NA, NA),
>            Value2 = c(10, 5, 30, 30, 20, 10, 5, NA, NA), 
>            Class3 = c("AF", "DQ", "AB", "CD", "AB", NA, NA, NA, NA),
>            Value3 = c(5, 5, 20, 15, 15, NA, NA, NA, NA), 
>            Class4 = c("DQ", NA, NA, NA, NA, NA, NA, NA, NA),
>            Value4 = c(5, NA, NA, NA, NA, NA, NA, NA, NA))
> data_t <- as.data.frame(t(data), row) 
> colnames(data_t) <- paste0("ID_", unlist(data_t[1, ]))
> data_t <- data_t[-1, ]

> data_t
  ID_1 ID_2 ID_3 ID_4 ID_5 ID_6 ID_7 ID_8 ID_9
2   AB   AB   BC   BC   BC   CD   CD   AB   BC
3   80   90   50   45   65   30   75   70   80
4   BC   BC   CD   DQ   CD   AB   AB <NA> <NA>
5   10    5   30   30   20   10    5 <NA> <NA>
6   AF   DQ   AB   CD   AB <NA> <NA> <NA> <NA>
7    5    5   20   15   15 <NA> <NA> <NA> <NA>
8   DQ <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
9    5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>

>Solution :

Not straightforward but we can use the pivot functions of tidyverse to get the job done:

library(tidyverse)

reshaped_data <- data %>% 
  pivot_longer(
    cols = -ID,
    names_to = c(".value", "group"),
    names_pattern = "(Class|Value)(\\d+)",
    values_drop_na = TRUE
  ) %>%
  group_by(Class, ID) %>%
  summarise(Value = sum(Value, na.rm = TRUE)) %>%
  pivot_wider(names_from = ID, values_from = Value, names_prefix = "ID_") %>%
  replace(is.na(.), 0)

print(reshaped_data)

Which give you the required output:

  Class    ID_1  ID_2  ID_3  ID_5  ID_6  ID_7  ID_8  ID_4  ID_9
1 AB       80    90    20    15    10     5    70     0     0
2 AF        5     0     0     0     0     0     0     0     0
3 BC       10     5    50    65     0     0     0    45    80
4 CD        0     0    30    20    30    75     0    15     0
5 DQ        5     5     0     0     0     0     0    30     0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading