I’m currently cleaning up URLs and want to extract everything before the last slash ("/").
This is an example string:
https://www.businessinsider.de/gruenderszene/plus-angebot/?tpcc=onsite_gs_header_nav&verification_code=DOVCGF75J8LSID
and the part I want to extract is: https://www.businessinsider.de/gruenderszene/plus-angebot
With a regex engine that supports lookahead, this is simple: .*(?=\/)
You can see it here on regex101.com
Can you help me replicate this in BigQuery, which does not support lookahead or lookbehind?
>Solution :
I would phrase this as a regex replacement that removes the last path separator and everything after it:
SELECT url, REGEXP_REPLACE(url, r'/[^/]+$', '') AS url_out
FROM yourTable;
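To try this without a real table, here is a minimal self-contained sketch. The second URL is a made-up example, and the column names via_replace and via_extract are only illustrative. It uses [^/]* instead of [^/]+ so that a URL ending in a bare slash is trimmed as well, and it shows REGEXP_EXTRACT with a greedy capture group as a possible alternative:

```sql
-- First URL is the one from the question; the second is hypothetical.
SELECT
  url,
  -- Remove the last slash and everything after it.
  -- "*" (rather than "+") also matches when nothing follows the slash.
  REGEXP_REPLACE(url, r'/[^/]*$', '') AS via_replace,
  -- Greedy ".*" runs up to the last slash; the capture group is returned.
  REGEXP_EXTRACT(url, r'^(.*)/') AS via_extract
FROM UNNEST([
  'https://www.businessinsider.de/gruenderszene/plus-angebot/?tpcc=onsite_gs_header_nav&verification_code=DOVCGF75J8LSID',
  'https://example.com/a/b/'
]) AS url;
```

Both columns should give https://www.businessinsider.de/gruenderszene/plus-angebot for the first row and https://example.com/a/b for the second. One difference to keep in mind: when a value contains no slash at all, REGEXP_REPLACE returns the input unchanged, while REGEXP_EXTRACT returns NULL.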
If you specifically want to target a final path separator that is immediately followed by a query string, use:
SELECT url, REGEXP_REPLACE(url, r'/\?[^/]+$', '') AS url_out
FROM yourTable;
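The difference between the two patterns shows up on URLs that have no slash directly before the "?". The sketch below compares them side by side; the second URL and the column names general and targeted are assumptions made for illustration only:

```sql
-- First URL is the one from the question; the second is hypothetical.
SELECT
  url,
  -- General pattern: strips the last slash and whatever follows it.
  REGEXP_REPLACE(url, r'/[^/]+$', '') AS general,
  -- Targeted pattern: only strips a trailing "/?..." query string.
  REGEXP_REPLACE(url, r'/\?[^/]+$', '') AS targeted
FROM UNNEST([
  'https://www.businessinsider.de/gruenderszene/plus-angebot/?tpcc=onsite_gs_header_nav&verification_code=DOVCGF75J8LSID',
  'https://example.com/a/b?x=1'
]) AS url;
```

For the first row both patterns give the same result. For the second row the general pattern cuts the last path segment as well (leaving https://example.com/a), whereas the targeted pattern finds no match and returns the URL unchanged.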