Get position of columns by using grep

I have a data frame with about 100 columns. I’d like to get the positions of certain columns so that I can select them later. Many columns share the same name and differ only in their suffix, which indicates the year.

abcd_2011 <- c(1,2,3,4)
xy_2011 <- c(5,6,7,8)
rew_2011 <- c(2,4,6,8)
abcd_2015 <- c(4,7,9,1)
xy_2015 <- c(5,9,1,2)
rew_2015 <- c(4,4,8,7)

df <- data.frame(abcd_2011, xy_2011, rew_2011, abcd_2015, xy_2015, rew_2015)

I managed to do it statically.

k.keep <- grep(c("^abcd_.*2011|xy_.*2011|^rew_"), colnames(df))

However, I’d like the year (2011) to be dynamic, so that I only have to change it in one place if I ever want to select another year. As you can see above, simply grepping for the year doesn’t work, because for some columns (rew) I need all years.
Something like the following (which doesn’t work, of course):

k.keep <- grep(c("^abcd_.*k.year|xy_.*k.year|^rew_"), colnames(df))

Any help is appreciated.

Solution:

You can define a function that takes the year as an argument and uses paste0 to build the pattern passed to grep.

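myfn <- function(year) {
  # Build the pattern from the year: abcd_ and xy_ columns for that year, plus every rew_ column
  pattern <- paste0("^abcd_.*", year, "|xy_.*", year, "|^rew_")
  # Return the positions of the matching columns (df is read from the calling environment)
  grep(pattern, colnames(df))
}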

> myfn(year = 2011)
[1] 1 2 3 6
> myfn(year = 2015)
[1] 3 4 5 6
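
Since the goal is to select the columns later, the following is a minimal sketch of how the positions might be used. The helper name year_cols and its data argument are hypothetical and not part of the answer above. It also assumes that column names have exactly the form prefix_year, so the pattern is anchored with ^ and $ instead of using .* as in the original.

```r
# Example data from the question
df <- data.frame(
  abcd_2011 = c(1, 2, 3, 4), xy_2011 = c(5, 6, 7, 8), rew_2011 = c(2, 4, 6, 8),
  abcd_2015 = c(4, 7, 9, 1), xy_2015 = c(5, 9, 1, 2), rew_2015 = c(4, 4, 8, 7)
)

# Hypothetical helper (not from the original answer): takes the data frame
# as an argument instead of reading a global df.
# Assumption: names look exactly like "abcd_2011", so ^ and $ anchors are safe.
year_cols <- function(data, year) {
  # abcd_ and xy_ columns for the given year, plus every rew_ column
  pattern <- sprintf("^(abcd|xy)_%s$|^rew_", year)
  grep(pattern, colnames(data))
}

k.year <- 2011                        # change the year in one place
k.keep <- year_cols(df, k.year)       # positions of the matching columns
df_sub <- df[, k.keep, drop = FALSE]  # select those columns by position
colnames(df_sub)
# [1] "abcd_2011" "xy_2011"   "rew_2011"  "rew_2015"
```

Passing the data frame explicitly keeps the function independent of whatever object happens to be called df in the workspace. If the real column names contain extra text between the prefix and the year, the looser .* pattern from the answer would be the safer choice.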