Let’s say I have this dataframe
> a
T..Gene.names Intensity.Mut_125 Intensity.Mut_250 Intensity.Mut.1000 Intensity.Mut.500
1 NCAN NaN 25.6628 23.8427 NaN
2 AMBP 22.8276 27.0801 25.4740 23.5596
3 CHGB 25.4463 30.0065 27.8181 27.3170
4 APP 25.0346 29.7784 27.0848 24.7314
I need to re-arrange my dataframe so each a$T..Gene.names correspond to a new column. Then, I need a new column called a$sample that extracts the word between Intensity and the number (either 125, 250, 500, 1000 or 2000). An issue is that this word and following number is separated by either . or _ Finally, I need a column named a$volume that correspond to the number. NA should be converted to 0.
I tried several attempts with pivot_longer and pivot_wider but this is above my current skill level.
Expected output
sample volume NCAN AMBP CHGB APP
Mut 125 0 22.8276 25.4463 25.0346
Mut 250 25.6638 27.0801 30.0065 29.7784
Mut 500 0 23.5596 27.3170 24.7314
Mut 1000 23.8427 25.4740 27.8181 27.0848
I prefer a dplyr-solution
a <- structure(list(T..Gene.names = c("NCAN", "AMBP", "CHGB", "APP"
), Intensity.Mut_125 = c(NaN, 22.8276, 25.4463, 25.0346), Intensity.Mut_250 = c(25.6628,
27.0801, 30.0065, 29.7784), Intensity.Mut.1000 = c(23.8427, 25.474,
27.8181, 27.0848), Intensity.Mut.500 = c(NaN, 23.5596, 27.317,
24.7314)), row.names = c(NA, 4L), class = "data.frame")
>Solution :
reshape2::recast(a, variable~T..Gene.names,fill = 0) %>%
separate(variable, c('type','sample', 'volume'))
type sample volume AMBP APP CHGB NCAN
1 Intensity Mut 125 22.8276 25.0346 25.4463 0.0000
2 Intensity Mut 250 27.0801 29.7784 30.0065 25.6628
3 Intensity Mut 1000 25.4740 27.0848 27.8181 23.8427
4 Intensity Mut 500 23.5596 24.7314 27.3170 0.0000