I have a data frame with about 100 columns. I'd like to get the positions of certain columns so that I can select them later. Many columns share the same name and differ only by their suffix, which indicates the year.
abcd_2011 <- c(1,2,3,4)
xy_2011 <- c(5,6,7,8)
rew_2011 <- c(2,4,6,8)
abcd_2015 <- c(4,7,9,1)
xy_2015 <- c(5,9,1,2)
rew_2015 <- c(4,4,8,7)
df <- data.frame(abcd_2011, xy_2011, rew_2011, abcd_2015, xy_2015, rew_2015)
I managed to do it statically.
k.keep <- grep(c("^abcd_.*2011|xy_.*2011|^rew_"), colnames(df))
However, I'd like the 2011 part to be dynamic, so that I only have to change it once if I ever want to select another year. As you can see above, simply grepping for the year doesn't work, since I need all years of some columns (rew).
Something like the following (which doesn’t work of course).
k.keep <- grep(c("^abcd_.*k.year|xy_.*k.year|^rew_"), colnames(df))
Any help is appreciated.
Solution:
You can define a function with a year argument and use paste0 to build the pattern that is passed to grep.
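myfn <- function(year){
  # Build the pattern with the requested year; rew_ columns are kept for all years
  # (df is taken from the calling environment)
  k.keep <- grep(paste0("^abcd_.*", year, "|xy_.*", year, "|^rew_"), colnames(df))
  return(k.keep)
}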
> myfn(year = 2011)
[1] 1 2 3 6
> myfn(year = 2015)
[1] 3 4 5 6
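If the list of prefixes is likely to change as well, the same paste0 idea can be taken one step further by building the whole pattern from vectors. The following is only a sketch: the helper name select_cols and its arguments year_prefixes and all_year_prefixes are made up for illustration, and it assumes the year always sits at the end of the column name, which is why it anchors the pattern with $.

```r
# Hypothetical helper: assemble the regex from prefix vectors instead of hard-coding it.
select_cols <- function(data, year, year_prefixes, all_year_prefixes) {
  # Year-specific columns: prefix, anything, then the year at the end of the name
  year_part <- paste0("^", year_prefixes, "_.*", year, "$")
  # Columns wanted for every year: match on the prefix only
  all_part <- paste0("^", all_year_prefixes, "_")
  pattern <- paste(c(year_part, all_part), collapse = "|")
  grep(pattern, colnames(data))
}

# Example data from the question
df <- data.frame(
  abcd_2011 = c(1, 2, 3, 4), xy_2011 = c(5, 6, 7, 8), rew_2011 = c(2, 4, 6, 8),
  abcd_2015 = c(4, 7, 9, 1), xy_2015 = c(5, 9, 1, 2), rew_2015 = c(4, 4, 8, 7)
)

k.year <- 2011  # change the year here only
k.keep <- select_cols(df, k.year, c("abcd", "xy"), "rew")

# Use the positions to subset; drop = FALSE keeps a data frame even for one column
df_sub <- df[, k.keep, drop = FALSE]
```

With k.year set to 2011, this builds the pattern ^abcd_.*2011$|^xy_.*2011$|^rew_ and k.keep holds the same positions as myfn(year = 2011) above.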