Subtracting column values by a named vector in R

I found a post very similar to my problem (subtract a constant vector from each row in a matrix in r), but I was hoping I could solve this using dplyr.

I have a data.frame that looks like this:

set.seed(1)
toy_df <- data.frame(Patient.ID = letters[1:5],
                     Patient.Age = rnorm(5,35,4),
                     Protein.A = rnorm(5,100,10),
                     Protein.B = rnorm(5,100,10),
                     Protein.D = rnorm(5,100,10),
                     Protein.E = rnorm(5,100,10))

I calculated a per-protein value (the median plus twice the median absolute deviation) using this approach:

medianDeviation <- apply(X = toy_df[,grepl("^Protein\\.", names(toy_df))], MARGIN = 2, FUN = function(x) median(x) + (2*mad(x)))

This creates a named vector with one value per protein (the median plus two MADs). Now I want to subtract each value from the corresponding protein column in toy_df.

I asked chatGPT for a solution, and it suggested this:

result <- toy_df %>% mutate(across(names(medianDeviation), ~ . - medianDeviation[.col]))

It looks promising, but for some reason, it is not working. I think the problem lies in the "medianDeviation[.col]"; however, I can’t find any alternative. Any suggestions?

Solution:

You could directly use:

mutate(toy_df, across(starts_with('Protein'), ~.x - median(.x) - 2*mad(.x)))

  Patient.ID Patient.Age  Protein.A Protein.B  Protein.D  Protein.E
1          a    32.49418 -20.518532 -18.76128 -16.764619  -5.878928
2          b    35.73457  -7.439558 -29.98066 -16.477185  -7.247338
3          c    31.65749  -4.930601 -40.09150  -6.876920 -14.323052
4          d    41.38112  -6.556035 -56.02609  -8.103071 -34.962218
5          e    36.31803 -15.367732 -22.62978 -10.376269  -8.870444

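or keep your original across() call and replace the body of the lambda with the line below. Inside across(), .col is only defined in the .names glue specification, whereas cur_column() returns the name of the column currently being processed:

. - medianDeviation[cur_column()]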
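For completeness, here is a minimal end-to-end sketch of the cur_column() approach, assuming dplyr 1.0.0 or later is installed. The names protein_cols, result and expected are introduced here for illustration only, and the sweep() call is a base R cross-check rather than part of the answer above.

```r
library(dplyr)

# Rebuild the toy data from the question
set.seed(1)
toy_df <- data.frame(Patient.ID = letters[1:5],
                     Patient.Age = rnorm(5, 35, 4),
                     Protein.A = rnorm(5, 100, 10),
                     Protein.B = rnorm(5, 100, 10),
                     Protein.D = rnorm(5, 100, 10),
                     Protein.E = rnorm(5, 100, 10))

# Names of the protein columns (illustrative helper variable)
protein_cols <- grep("^Protein\\.", names(toy_df), value = TRUE)

# Named vector: median + 2 * MAD for each protein column
medianDeviation <- sapply(toy_df[protein_cols],
                          function(x) median(x) + 2 * mad(x))

# Inside across(), cur_column() gives the name of the column being
# processed, so it can index the named vector; [[ ]] drops the name
result <- toy_df %>%
  mutate(across(all_of(names(medianDeviation)),
                ~ .x - medianDeviation[[cur_column()]]))

# Base R cross-check: subtract the vector column-wise with sweep()
expected <- sweep(as.matrix(toy_df[protein_cols]), 2,
                  medianDeviation[protein_cols])
```

Comparing result[protein_cols] with expected (for example via all.equal() on the unnamed matrices) is one way to confirm that both routes agree; the non-protein columns are left untouched.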