I have a large dataset with many x and y columns; a reduced example with 5 of each is shown below.
library(dplyr)
set.seed(505)
df <- data.frame(
x1 = sample(-2 : 4,size = 10, replace = TRUE),
x2 = sample(-2 : 4,size = 10, replace = TRUE),
x3 = sample(-2 : 4,size = 10, replace = TRUE),
x4 = sample(-2 : 4,size = 10, replace = TRUE),
x5 = sample(-2 : 4,size = 10, replace = TRUE),
y1 = sample(-2 : 4,size = 10, replace = TRUE),
y2 = sample(-2 : 4,size = 10, replace = TRUE),
y3 = sample(-2 : 4,size = 10, replace = TRUE),
y4 = sample(-2 : 4,size = 10, replace = TRUE),
y5 = sample(-2 : 4,size = 10, replace = TRUE))
My task can be achieved with the code below:
df |> mutate(
sum1 = x1 * as.numeric(x1 > -1 & y1 > -1) +
x2 * as.numeric(x2 > -1 & y2 > -1) +
x3 * as.numeric(x3 > -1 & y3 > -1) +
x4 * as.numeric(x4 > -1 & y4 > -1) +
x5 * as.numeric(x5 > -1 & y5 > -1))
Because the real data has 25 x and y variables, writing out every term by hand does not scale. Is there a more concise way to achieve the same result?
>Solution :
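# Column names for the x and y groups (extend 1:5 to cover the full dataset)
xs <- paste0("x", 1:5)
ys <- paste0("y", 1:5)
# Keep each x only where both it and its paired y exceed -1, then sum across each row
rowSums(df[xs] * (df[xs] > -1 & df[ys] > -1))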
# Alternatively: set x to 0 wherever x or its paired y is <= -1, one column pair at a time
foo <- \(x, y) replace(x, pmin(x, y) <= -1, 0)
rowSums(mapply(foo, df[xs], df[ys]))
# [1] 2 3 1 4 6 3 0 0 1 2
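To check the logic by eye, here is a minimal sketch on a tiny hand-made data frame. The name `toy`, its values, the helper `keep`, and `n` are all made up for illustration and are not part of the original data. It also shows one way to store the result as a column, on the assumption that this is what the `mutate()` call in the question was for.

```r
# Hypothetical toy data: two x/y pairs, three rows, small enough to verify by hand
toy <- data.frame(
  x1 = c(3, -2, 0),
  x2 = c(1,  4, 2),
  y1 = c(0,  1, -2),
  y2 = c(-1, 2, 3)
)

n  <- 2                          # number of x/y pairs in this toy example
xs <- paste0("x", seq_len(n))
ys <- paste0("y", seq_len(n))

# Logical matrix: TRUE where both members of a pair exceed -1.
# When multiplied, TRUE/FALSE behave as 1/0, so rejected x values drop out.
keep <- toy[xs] > -1 & toy[ys] > -1

# Store the row totals as a new column (unname() drops the row-name labels)
toy$sum1 <- unname(rowSums(toy[xs] * keep))
toy$sum1
# [1] 3 4 2

# Same result as the term-by-term formula from the question
explicit <- with(toy, x1 * (x1 > -1 & y1 > -1) + x2 * (x2 > -1 & y2 > -1))
stopifnot(isTRUE(all.equal(toy$sum1, explicit)))
```

Because the column names are built with `paste0()`, changing `n` is the only edit needed to cover more pairs, provided the columns follow the same `x1..xn` / `y1..yn` naming.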