My data set contains ID and many columns that have ID on their name.
data = data.frame(ID = rep(1:3,2),
col1 = 1:6,
col2 = 7:12,
col3 = 13:18)
print(data)
I’m trying to make a variable based on its ID from different columns.
data$new = c(1, 8, 15, 4, 11, 18)
I tried it with dplyr using get(paste0()), but it calls the entire vector, not the specific element.
data <- data %>% mutate(new = get(paste0("col",ID)))
I also tried sapply but it again gives the vector not the element
data$new <- sapply(data$ID, function(x) get(paste0("col",ID),data)
How can I pick up the element from different columns by its ID?
>Solution :
You can use your original approach with rowwise() if you like
i.e.
data %>% rowwise() %>% mutate(new = get(paste0("col",ID)))
You could also do something like:
data$new <- apply(data, 1, \(x) x[x[1]+1])
the above only works if you only have ID and the col<num> columns, and they are in order. If that is not the case you could do this:
cols = sort(grep("col",names(data),value=T))
data$new <- apply(data[,c("ID", cols)], 1, \(x) x[x[1]+1])
And, while I wouldn’t necessarily recommend it for your current situation, often its easier to work on some issues like this when the data are in long format:
bind_cols(
select(data, ID, starts_with("col")),
tidyr::pivot_longer(data,-ID,names_prefix = "col") %>% filter(name==ID) %>% select(new=value)
)
All the above provide the same result