Alternative for Positive Lookahead on BigQuery – Match everything before the last delimiter

I’m currently cleaning up URLs, and I want to extract everything before the last slash ("/").

This is an example string:
https://www.businessinsider.de/gruenderszene/plus-angebot/?tpcc=onsite_gs_header_nav&verification_code=DOVCGF75J8LSID

and the part I want to extract is: https://www.businessinsider.de/gruenderszene/plus-angebot

With a regex engine that supports lookahead, this is simple: .*(?=\/)

You can see it here on regex101.com

Can you help me replicate this in BigQuery, which doesn’t support lookahead/lookbehind?

Solution:

I would phrase this as a regex replacement that removes the last path separator and everything after it:

SELECT url, REGEXP_REPLACE(url, r'/[^/]+$', '') AS url_out
FROM yourTable;
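
To try the pattern without a real table, here is a minimal self-contained sketch. The sample_urls CTE and its second row are made up for illustration; only the first URL comes from the question. It also shows a variant using * instead of +, on the assumption that URLs ending in a bare slash should be trimmed as well.

```sql
-- Hypothetical inline sample data; only the first URL is from the question.
WITH sample_urls AS (
  SELECT 'https://www.businessinsider.de/gruenderszene/plus-angebot/?tpcc=onsite_gs_header_nav&verification_code=DOVCGF75J8LSID' AS url
  UNION ALL
  SELECT 'https://www.example.com/a/b/'  -- made-up URL ending in a bare slash
)
SELECT
  url,
  -- Removes the last "/" plus at least one character after it.
  REGEXP_REPLACE(url, r'/[^/]+$', '') AS url_out,
  -- Variant: "*" also removes a final "/" that has nothing after it.
  REGEXP_REPLACE(url, r'/[^/]*$', '') AS url_out_trailing
FROM sample_urls;
```

With +, the second row should come back unchanged because nothing follows its final slash; with *, the trailing slash is removed too. Either pattern would also cut into the scheme of a URL that has no path at all (for example, https://www.example.com would become https:/), so such rows may need to be filtered out first.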

If you specifically want to target a final path separator that is immediately followed by a query string (that is, by "?"), then use:

SELECT url, REGEXP_REPLACE(url, r'/\?[^/]+$', '') AS url_out
FROM yourTable;
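
A closer analogue of the original lookahead is to capture rather than replace. In BigQuery, when the pattern contains a single capturing group, REGEXP_EXTRACT returns the text matched by that group, so a greedy .* followed by a literal slash yields everything before the last slash. A minimal sketch, assuming the same placeholder table yourTable with a url column:

```sql
-- Greedy (.*) extends to the last "/" in the string; only the group is returned.
-- Returns NULL when url contains no "/" at all.
SELECT url, REGEXP_EXTRACT(url, r'^(.*)/') AS url_out
FROM yourTable;
```

Unlike the REGEXP_REPLACE versions, which leave a non-matching URL unchanged, this returns NULL when there is no match, so which form fits better depends on how unmatched rows should be handled.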