How to remove parentheses and everything between them from a string in R, when the parentheses could be anywhere in the string?

I have a data set that gives me tennis scores as character strings, such as "6-4 3-6 6-2" and "7-6(6) 6-2". I want to add up all of the games played in a match, so I need to remove the hyphens, the spaces, and the tiebreak score, which in the second example appears as (6). Then I want to convert the remaining digits to doubles and add up every individual number to get the total games played in the match. For the first and second examples, the totals would be 27 and 21 respectively.

So far I can remove the dashes and spaces with the stringr package, using str_replace_all(score, c("-" = "", " " = "")), which gives me a string of digits. I can’t figure out how to remove the tiebreak scores, because the value between the parentheses could be anything. I need to replace "(...)" with "", where any string could be inside the parentheses (in my case it is only ever one number). The parentheses could also appear anywhere in the string.

>Solution :

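games <- c("6-4 3-6 6-2", "7-6(6) 6-2")

# Remove every "(...)" group; [^)]* cannot run past the first ")"
gsub("\\([^)]*\\)", "", games) |>
  # Split on hyphens and spaces, keeping multi-digit numbers whole
  strsplit(split = "[- ]+") |>
  # Sum the games in each match
  sapply(function(x) sum(as.numeric(x), na.rm = TRUE))
[1] 27 21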
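
The same idea can be wrapped in a small function that also copes with more than one tiebreak per match and with double-digit game counts. This is only a sketch in base R: the name total_games and the third sample score are made up for illustration, and it assumes every number left after the parentheses are removed is a game count.

```r
# total_games() is a hypothetical helper name, not part of any package.
total_games <- function(score) {
  # Drop every "(...)" group. [^)]* stops at the first closing parenthesis,
  # so two tiebreaks in one string are removed separately.
  no_tiebreak <- gsub("\\([^)]*\\)", "", score)
  # Extract each run of digits, so "10" stays one number instead of "1" and "0".
  digits <- regmatches(no_tiebreak, gregexpr("[0-9]+", no_tiebreak))
  # Sum the numbers found in each score string.
  vapply(digits, function(x) sum(as.numeric(x)), numeric(1))
}

# The third score is invented to show two tiebreaks and a double-digit set.
scores <- c("6-4 3-6 6-2", "7-6(6) 6-2", "7-6(6) 6-7(4) 10-8")
total_games(scores)
```

Extracting the digit runs directly, instead of splitting on separators, means the function does not need to know which characters sit between the numbers.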