Home dplyr: How to rearrange this dataframe and create new columns by extracting parts of other columns

Questions

dplyr: How to rearrange this dataframe and create new columns by extracting parts of other columns

February 8, 2022

Let’s say I have this dataframe

> a
  T..Gene.names Intensity.Mut_125 Intensity.Mut_250 Intensity.Mut.1000 Intensity.Mut.500
1          NCAN               NaN           25.6628            23.8427               NaN
2          AMBP           22.8276           27.0801            25.4740           23.5596
3          CHGB           25.4463           30.0065            27.8181           27.3170
4           APP           25.0346           29.7784            27.0848           24.7314

I need to re-arrange my dataframe so each a$T..Gene.names correspond to a new column. Then, I need a new column called a$sample that extracts the word between Intensity and the number (either 125, 250, 500, 1000 or 2000). An issue is that this word and following number is separated by either . or _ Finally, I need a column named a$volume that correspond to the number. NA should be converted to 0.

I tried several attempts with pivot_longer and pivot_wider but this is above my current skill level.

Expected output

sample     volume       NCAN         AMBP        CHGB        APP
   Mut        125          0      22.8276     25.4463    25.0346
   Mut        250    25.6638      27.0801     30.0065    29.7784
   Mut        500          0      23.5596     27.3170    24.7314
   Mut       1000    23.8427      25.4740     27.8181    27.0848

I prefer a dplyr-solution

a <- structure(list(T..Gene.names = c("NCAN", "AMBP", "CHGB", "APP"
), Intensity.Mut_125 = c(NaN, 22.8276, 25.4463, 25.0346), Intensity.Mut_250 = c(25.6628, 
27.0801, 30.0065, 29.7784), Intensity.Mut.1000 = c(23.8427, 25.474, 
27.8181, 27.0848), Intensity.Mut.500 = c(NaN, 23.5596, 27.317, 
24.7314)), row.names = c(NA, 4L), class = "data.frame")

>Solution :

reshape2::recast(a, variable~T..Gene.names,fill = 0) %>%
  separate(variable, c('type','sample', 'volume'))

       type sample volume    AMBP     APP    CHGB    NCAN
1 Intensity    Mut    125 22.8276 25.0346 25.4463  0.0000
2 Intensity    Mut    250 27.0801 29.7784 30.0065 25.6628
3 Intensity    Mut   1000 25.4740 27.0848 27.8181 23.8427
4 Intensity    Mut    500 23.5596 24.7314 27.3170  0.0000