fetch_url <- "https://www.website.com?splitArr=[43]&splitArrPitch=&position=P&statType=player&startDate=2023-03-28&endDate=2023-04-11&players=&filter=&groupBy=season"
start_date <- stringr::str_extract(fetch_url, "(?<=startDate=)[^&]+")
end_date <- stringr::str_extract(fetch_url, "(?<=endDate=)[^&]+")
stat_type <- stringr::str_extract(fetch_url, "(?<=statType=)[^&]+")
split_arr <- stringr::str_extract(fetch_url, "(?<=splitArr\\[)[^]]+")
We can successfully extract start_date, end_date, and stat_type from this string, but we are struggling to get the '43' for split_arr. How can we update the code to do this?
As a second example, for a fetch_url such as https://www.website.com?splitArr=&splitArrPitch=&position=P&statType=player&startDate=2023-03-28&endDate=&players=&filter=&groupBy=season, both split_arr and end_date should come back as an empty string ''.
We are close, since we already have the first three variables, but the [] brackets around the splitArr value are making this one harder to grab.
>Solution :
In the URL there is an = between splitArr and the [, so the lookbehind (?<=splitArr\\[) never matches. Instead of a regex lookaround, we can also wrap the digits in a capture group, (...), and select it with the group argument:
library(stringr)
str_extract(fetch_url, "splitArr=\\[(\\d+)", group = 1)
[1] "43"
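The call above covers the first URL, but for the second example (splitArr= with nothing after it) the bracket is absent, so that pattern finds no match. The following is only a sketch of one way to handle both cases: get_param is a hypothetical helper name, not part of stringr, and it assumes parameter names are plain alphanumerics and values are either bare or wrapped in a single pair of []. It uses str_match, which should also work on older stringr versions that lack the group argument.

```r
library(stringr)

# Hypothetical helper: returns the value of one query parameter,
# dropping an optional leading "[" and stopping at "]" or "&".
# Assumes `name` contains no regex metacharacters.
get_param <- function(url, name) {
  pattern <- paste0("[?&]", name, "=\\[?([^&\\]]*)")
  # Column 1 is the full match, column 2 is the capture group
  str_match(url, pattern)[, 2]
}

url_full  <- "https://www.website.com?splitArr=[43]&splitArrPitch=&position=P&statType=player&startDate=2023-03-28&endDate=2023-04-11&players=&filter=&groupBy=season"
url_empty <- "https://www.website.com?splitArr=&splitArrPitch=&position=P&statType=player&startDate=2023-03-28&endDate=&players=&filter=&groupBy=season"

get_param(url_full, "splitArr")   # "43"
get_param(url_full, "endDate")    # "2023-04-11"
get_param(url_empty, "splitArr")  # ""
get_param(url_empty, "endDate")   # ""
```

Because the capture group uses * rather than +, a parameter that is present but empty gives '' instead of NA; a parameter that is missing from the URL entirely would still give NA.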