Subtracting vectors by group from two dataframes

I have two dataframes in R.
The first dataframe contains several feature columns plus a group column (a factor) that says which group each sample (row) belongs to. The second dataframe contains the same number of columns and has one row per unique group. From each sample in the first dataframe I want to subtract the corresponding vector from the second dataframe, where rows are matched on the group column that both dataframes share.

Here is an example of the main dataset:

df_repr <- structure(list(f1 = c(-3.9956064225704, 
-0.52380279948658, 0.61089389331505, -3.47273625634875, -4.486918671214, 
-6.1761970731672, -4.62305749757367, -4.42540643005429, -3.61613137597131, 
-3.29821425516253), f2 = c(-1.57918114753228, 
-4.10523012500727, -1.80270009366593, -0.00905317702835884, -0.899585192079915, 
-2.89341515186212, 0.0132542126386332, -3.32639898550135, -0.867793877742314, 
0.0911950321630834), f3 = c(-6.02532301769732, 
-4.90073348094302, -3.73159604513274, -3.55290209472808, -6.63194560195811, 
2.69409789701296, -4.17675978927128, -3.84141885970095, -1.20571283849034, 
1.54287440902102), group = structure(c(1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor")), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -10L))

Here is an example dataframe with vectors to be subtracted from each row of the corresponding group of the first dataframe:

to_subtract <- structure(list(group = structure(1:2, .Label = c("A", 
"B"), class = "factor"), f1 = c(-2.78048744402161, 
-2.33583431665818), f2 = c(-2.56086962108741, 
-0.689157827347865), f3 = c(-3.60224982918457, 
-0.782365376308658)), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame"))

# # A tibble: 2 × 4
#   group    f1     f2     f3
#   <fct> <dbl>  <dbl>  <dbl>
# 1 A     -2.78 -2.56  -3.60
# 2 B     -2.34 -0.689 -0.782

I tried to do it like this:

df_repr %>%
  group_by(group) %>%
  mutate(across(where(is.numeric),
         ~ . - to_subtract[to_subtract$group == unique(.$group), -1]))

But I get the following error:

Error in `mutate()`:
ℹ️ In argument: `across(...)`.
ℹ️ In group 1: `group = A`.
Caused by error in `across()`:
! Can't compute column `f1`.
Caused by error in `f1$group`:
! $ operator is invalid for atomic vectors

Expected output for this example:

       f1     f2      f3 group
    <dbl>  <dbl>   <dbl> <fct>
 1 -1.22   0.982 -2.42   A    
 2  2.26  -1.54  -1.30   A    
 3  3.39   0.758 -0.129  A    
 4 -0.692  2.55   0.0493 A    
 5 -1.71   1.66  -3.03   A    
 6 -3.84  -2.20   3.48   B    
 7 -2.29   0.702 -3.39   B    
 8 -2.09  -2.64  -3.06   B    
 9 -1.28  -0.179 -0.423  B    
10 -0.962  0.780  2.33   B 

Solution:

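You can use the powerjoin package. With conflict = `-`, power_left_join() joins on group and, for every column name present in both dataframes, subtracts the to_subtract column from the df_repr column: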

library(powerjoin)

power_left_join(df_repr, to_subtract, by = "group", conflict = `-`)

# A tibble: 10 × 4
   group     f1     f2      f3
   <fct>  <dbl>  <dbl>   <dbl>
 1 A     -1.22   0.982 -2.42
 2 A      2.26  -1.54  -1.30  
 3 A      3.39   0.758 -0.129
 4 A     -0.692  2.55   0.0493
 5 A     -1.71   1.66  -3.03
 6 B     -3.84  -2.20   3.48
 7 B     -2.29   0.702 -3.39
 8 B     -2.09  -2.64  -3.06  
 9 B     -1.28  -0.179 -0.423
10 B     -0.962  0.780  2.33
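
If adding a package is not an option, the same lookup-then-subtract idea can be sketched in base R with match(). This is only a sketch under assumptions: the names df, offsets, feature_cols and idx are hypothetical, and the values are made-up toy numbers rather than the data from the question. The same indexing should also work on tibbles.

```r
# Toy data with made-up values (not the question's data), same layout.
df <- data.frame(
  f1 = c(1, 2, 3, 4),
  f2 = c(10, 20, 30, 40),
  group = factor(c("A", "A", "B", "B"))
)
offsets <- data.frame(
  group = factor(c("A", "B")),
  f1 = c(0.5, 1),
  f2 = c(5, 10)
)

# Columns to adjust: everything in offsets except the key.
feature_cols <- setdiff(names(offsets), "group")

# For every row of df, find the row of offsets that holds its group.
idx <- match(df$group, offsets$group)
stopifnot(!anyNA(idx))  # every group needs a matching row in offsets

# Expand offsets to one row per sample, then subtract position by position.
result <- df
result[feature_cols] <- df[feature_cols] - offsets[idx, feature_cols]
result
```

Because the feature columns are selected by name, the column order in the two dataframes does not have to agree.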

Another approach uses dplyr::group_modify():

library(dplyr)

df_repr %>%
  group_by(group) %>%
  # .x is the data for one group, .y is a one-row tibble with that group's key
  group_modify(~ mutate(.x, across(f1:f3, \(val) {
    val - filter(to_subtract, group == .y$group)[[cur_column()]]
  }))) %>%
  ungroup()
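
As for the error in the question: inside across(), the lambda receives one column at a time as a plain atomic vector, so `.$group` has no group element to extract and `$` fails. The current group and column name are available through the dplyr context functions cur_group() and cur_column() instead. Below is a minimal sketch of the original attempt repaired along those lines; it assumes dplyr 1.0 or later, the names df and offsets are hypothetical, and the values are made-up toy numbers rather than the data from the question.

```r
library(dplyr)

# Toy data with made-up values (not the question's data), same layout.
df <- tibble(
  f1 = c(1, 2, 3, 4),
  f2 = c(10, 20, 30, 40),
  group = factor(c("A", "A", "B", "B"))
)
offsets <- tibble(
  group = factor(c("A", "B")),
  f1 = c(0.5, 1),
  f2 = c(5, 10)
)

result <- df %>%
  group_by(group) %>%
  mutate(across(
    where(is.numeric),
    # .x is the current column within the current group (an atomic vector).
    # cur_column() gives its name; cur_group() gives a one-row tibble of keys.
    ~ .x - offsets[[cur_column()]][
      as.character(offsets$group) == as.character(cur_group()$group)
    ]
  )) %>%
  ungroup()

result
```

The keys are compared as character so that the lookup does not depend on both factors having identical level sets.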