R – How to operate on different columns for each row based on an extra-column containing the names of the columns to be used for operation

byMR

March 1, 2022

I am new to R. I would like to calculate the mean of each row of a data frame, using a different subset of columns for each row. Two extra columns give the names of the "start" and "end" columns that delimit the range to average in that row.

Take this example:

# a to d are numeric so that means can be computed; e and f hold column names
dframe <- data.frame(a=c(2,3,4,2), b=c(1,3,6,2), c=c(4,5,6,3), d=c(4,2,8,5), e=c("a", "c", "a", "b"), f=c("c", "d", "d", "c"))
dframe

Which provides the following dataframe:

  a b c d e f
1 2 1 4 4 a c
2 3 3 5 2 c d
3 4 6 6 8 a d
4 2 2 3 5 b c

Columns e and f name the first and last columns to include when calculating the mean of each row.
For example, in row 1 the mean is calculated over columns a, b and c: (2+1+4)/3 -> 2.3
So I would like to obtain the following output:

  a b c d e f mean
1 2 1 4 4 a c  2.3
2 3 3 5 2 c d  3.5
3 4 6 6 8 a d    6
4 2 2 3 5 b c  2.5
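The expected values can be checked by hand with plain arithmetic. This is only a sanity check, with the numbers copied from the example table; the name expected is illustrative.

```r
# Each vector lists the values in the columns from e to f for that row.
expected <- c(
  mean(c(2, 1, 4)),     # row 1: columns a to c
  mean(c(5, 2)),        # row 2: columns c to d
  mean(c(4, 6, 6, 8)),  # row 3: columns a to d
  mean(c(2, 3))         # row 4: columns b to c
)
round(expected, 2)
```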

I have worked out how to create the column indices, and I would then like to use rowMeans, but I cannot find the correct arguments.

dframe %>%
  mutate(e_indice = match(e, colnames(dframe)))%>%
  mutate(f_indice = match(f, colnames(dframe)))%>%
  mutate(mean = rowMeans(????, na.rm = TRUE))
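For reference, the index idea can also be completed in base R without dplyr. This is a minimal sketch, assuming columns a to d are numeric. The names e_indice and f_indice follow the attempt above, and row_means is an illustrative name.

```r
# Example data, with a to d stored as numbers (an assumption of this sketch).
dframe <- data.frame(
  a = c(2, 3, 4, 2), b = c(1, 3, 6, 2), c = c(4, 5, 6, 3), d = c(4, 2, 8, 5),
  e = c("a", "c", "a", "b"), f = c("c", "d", "d", "c")
)

# Positions of the start and end columns for every row.
e_indice <- match(dframe$e, colnames(dframe))
f_indice <- match(dframe$f, colnames(dframe))

# For each row i, average the cells between its start and end column.
row_means <- vapply(
  seq_len(nrow(dframe)),
  function(i) mean(unlist(dframe[i, e_indice[i]:f_indice[i]]), na.rm = TRUE),
  numeric(1)
)

dframe$mean <- row_means
dframe
```

The loop handles one row at a time, so it is easy to follow but may be slow on very large data frames.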

Thanks a lot for your help

Solution:

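One dplyr option, assuming columns a to d are numeric (as the <dbl> types in the output indicate), could be the following. Note that cur_data() has been deprecated since dplyr 1.1.0 in favour of pick(), so newer versions emit a warning.

library(dplyr)

dframe %>%
    rowwise() %>%
    # per row: average the columns from the one named in e to the one named in f
    mutate(mean = rowMeans(cur_data()[match(e, names(.)):match(f, names(.))]))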

      a     b     c     d e     f      mean
  <dbl> <dbl> <dbl> <dbl> <chr> <chr> <dbl>
1     2     1     4     4 a     c      2.33
2     3     3     5     2 c     d      3.5 
3     4     6     6     8 a     d      6   
4     2     2     3     5 b     c      2.5
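A vectorised base R alternative is to blank out the cells outside each row's range and then call rowMeans once. This is a sketch under the assumption that a to d are numeric; the names vals, start, end and pos are illustrative and not part of the answer above.

```r
# Example data, with a to d stored as numbers (an assumption of this sketch).
dframe <- data.frame(
  a = c(2, 3, 4, 2), b = c(1, 3, 6, 2), c = c(4, 5, 6, 3), d = c(4, 2, 8, 5),
  e = c("a", "c", "a", "b"), f = c("c", "d", "d", "c")
)

# Numeric part of the data as a matrix.
vals <- as.matrix(dframe[c("a", "b", "c", "d")])

# Start and end column positions for every row.
start <- match(dframe$e, colnames(vals))
end   <- match(dframe$f, colnames(vals))

# col() gives each cell's column number; start and end recycle down the rows,
# so each cell is compared with the limits of its own row.
pos <- col(vals)
vals[pos < start | pos > end] <- NA

# NA cells are skipped, so only the e-to-f range contributes to each mean.
dframe$mean <- rowMeans(vals, na.rm = TRUE)
dframe
```

Because na.rm = TRUE also drops values that were already missing in the data, rows with genuine NA entries inside the range are averaged over the remaining cells.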