I have the following vector.
column_names <- c("6Li", "7Li", "10B", "11B", "7Li.1",
"205Pb", "206Pb", "207Pb", "238U",
"206Pb.1", "238U.1")
Notice that some of the values are duplicates with ".1" appended. I want to extract all of these strings along with the original strings they duplicate, so that only the following are returned.
#[1] "7Li" "7Li.1" "206Pb" "238U" "206Pb.1" "238U.1"
Assume you don’t know the index positions, so you cannot simply index these values out with column_names[c(2,5,7,9,10,11)]. How can I use pattern matching to extract these values?
>Solution :
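There is likely a more elegant solution, but in base R you could try a combination of grep/gsub and paste. The inner grep finds the ".1" entries, gsub strips the suffix, and paste joins the resulting base names into a single alternation pattern for the outer grep: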
idx <- grep(paste(gsub("\\.1", "", column_names[grep("\\.1", column_names)]), collapse = "|"), column_names)
# [1] 2 5 7 9 10 11
column_names[idx]
# [1] "7Li" "7Li.1" "206Pb" "238U" "206Pb.1" "238U.1"
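Note that the pattern built above ("7Li|206Pb|238U") is unanchored, so it would also match a name such as "17Li" if one were present. A possible alternative, sketched here under the assumption that duplicates are always marked with a trailing ".1", is to compare base names exactly instead of by regex. The names base_names, dup_bases, and result are illustrative and not part of the original answer.

```r
column_names <- c("6Li", "7Li", "10B", "11B", "7Li.1",
                  "205Pb", "206Pb", "207Pb", "238U",
                  "206Pb.1", "238U.1")

# Strip a trailing ".1" (anchored with $) to get each element's base name
base_names <- sub("\\.1$", "", column_names)

# Base names that occur more than once, i.e. those with a ".1" twin
dup_bases <- unique(base_names[duplicated(base_names)])

# Keep every element whose base name is one of the duplicated ones
result <- column_names[base_names %in% dup_bases]
result
# [1] "7Li"     "7Li.1"   "206Pb"   "238U"    "206Pb.1" "238U.1"
```

Because %in% tests whole strings, this version does not pick up partial matches, and the original order of column_names is preserved.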