Using grepl with OR condition to filter string

Currently I have a df like so:

df <- data.frame(
  player = c('Player To Have 1 Or More Shots On Target', 'Player To Have 1 Or More Shots On Target', 
             'Player To Have 2 Or More Shots On Target', 'Player To Have 3 Or More Shots On Target',
             'Player To Have 1 Or More Shots On Target in 1st Half'))

Output:

                                                player
1             Player To Have 1 Or More Shots On Target
2             Player To Have 1 Or More Shots On Target
3             Player To Have 2 Or More Shots On Target
4             Player To Have 3 Or More Shots On Target
5 Player To Have 1 Or More Shots On Target in 1st Half

I would like to use grepl (or a suitable alternative) to capture only the rows for 1, 2, 3, 4, etc. shots on target, disregarding anything else, such as row 5, which also contains 'in 1st Half'.

In the example above, I wish to capture all of the first 4 rows (the original data has many more rows). I tried the following, which works:

df2 <- dplyr::filter(df, grepl("Player To Have 1 Or More Shots On Target", player))

How can the above be amended to accept multiple digits in place of the "1"? E.g. I would like to capture 1, 2, 3, 4, etc. shots.

I tried something like:

number_of_shots <- c("1","2")
df2 <- dplyr::filter(df, grepl("Player To Have", number_of_shots, "Or More Shots On Target", player))

But I get the following error:

Error in `dplyr::filter()`:
ℹ In argument: `grepl(...)`.
Caused by error:
! `..1` must be of size 5 or 1, not size 2.

>Solution :

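Regular expressions can be used:

  • Anchor the pattern with ^ (start of string) and $ (end of string) so that the whole string must match; this excludes row 5, which continues with 'in 1st Half'.
  • Use [0-9] to match any single digit from 0 to 9, or [1-4] to match only 1 to 4. Use .* to match anything.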

library(dplyr)  # provides %>% and filter()

df <- data.frame(
  player = c('Player To Have 1 Or More Shots On Target', 'Player To Have 1 Or More Shots On Target',
             'Player To Have 10 Or More Shots On Target', 'Player To Have 3 Or More Shots On Target',
             'Player To Have 1 Or More Shots On Target in 1st Half'))

# match a single digit 0-9 (row 3, with "10", is not matched)
df %>%
  filter(grepl('^Player To Have [0-9] Or More Shots On Target$', player))

# match anything between "Have" and "Or More" (row 3 is matched too)
df %>%
  filter(grepl('^Player To Have .* Or More Shots On Target$', player))
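
If the allowed numbers live in a vector, as in the question's number_of_shots attempt, one possible approach is to collapse that vector into a regex alternation and paste it into the anchored pattern. The attempt in the question fails because grepl() does not concatenate its arguments: the second argument is taken as the vector to search, so the result has length 2. The sketch below is only an illustration in base R; the values in number_of_shots and the names alternatives, pattern_or, pattern_digits, or_rows and digit_rows are assumptions made for this example, not part of the original answer.

```r
# Same example data as in the answer above
df <- data.frame(
  player = c('Player To Have 1 Or More Shots On Target', 'Player To Have 1 Or More Shots On Target',
             'Player To Have 10 Or More Shots On Target', 'Player To Have 3 Or More Shots On Target',
             'Player To Have 1 Or More Shots On Target in 1st Half'))

# Hypothetical set of allowed values (an assumption for this sketch)
number_of_shots <- c("1", "3")

# Collapse the values into an alternation group: "(1|3)"
alternatives <- paste0("(", paste(number_of_shots, collapse = "|"), ")")

# Build one anchored pattern string, so grepl() receives a single pattern
pattern_or <- paste0("^Player To Have ", alternatives, " Or More Shots On Target$")

# Keep only the rows whose whole string matches one of the allowed values
or_rows <- df[grepl(pattern_or, df$player), , drop = FALSE]

# Alternative: accept any whole number, including multi-digit ones such as 10
pattern_digits <- "^Player To Have [0-9]+ Or More Shots On Target$"
digit_rows <- df[grepl(pattern_digits, df$player), , drop = FALSE]
```

The same pattern strings should also work inside dplyr::filter(df, grepl(pattern_or, player)), since only the pattern changes and not the filtering step.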