We have URLs in the following formats. I want to extract only the digit values between the strings I specified. I tried a pattern like this: (?<=\/sub.example.com\/)(.*)(?=\?[Uu]rl|$) but it does not give the result I want.
https://sub.example.com/79084/t/64931?Url=https%3a%2f%2fwww.test.com%2fpath%2fotherpath%2f
https://sub.example.com/79084/t/64931
Expected results:
[ 79084, 64931 ]
I need to exclude /t/
>Solution :
Using JavaScript's support for variable-length lookbehind, you can use this regex:
(?<=\/sub\.example\.com\/(?:[^\/]*\/)*)\d+(?=(?:\/[^\/]*)*(?:\?[Uu]rl|$))
Note that it will match every all-digit path component after the domain name, e.g. https://sub.example.com/79084/t/64931/1234/6789 will produce 4 matches, one for each number.
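As a quick check, the regex can be run against the sample URLs from the question with `String.prototype.matchAll`. This is only a minimal sketch: the helper name `extractNumbers` is made up for illustration, and it assumes an engine that supports lookbehind (for example a recent Node.js or a Chromium-based browser).

```javascript
// Regex from the answer, with the g flag so that every match is returned.
const re = /(?<=\/sub\.example\.com\/(?:[^\/]*\/)*)\d+(?=(?:\/[^\/]*)*(?:\?[Uu]rl|$))/g;

// Hypothetical helper: collects every match and converts it to a number.
function extractNumbers(url) {
  return Array.from(url.matchAll(re), (m) => Number(m[0]));
}

const withQuery =
  "https://sub.example.com/79084/t/64931?Url=https%3a%2f%2fwww.test.com%2fpath%2fotherpath%2f";
const withoutQuery = "https://sub.example.com/79084/t/64931";
const longerPath = "https://sub.example.com/79084/t/64931/1234/6789";

console.log(extractNumbers(withQuery));    // [ 79084, 64931 ]
console.log(extractNumbers(withoutQuery)); // [ 79084, 64931 ]
console.log(extractNumbers(longerPath));   // [ 79084, 64931, 1234, 6789 ]
```

The digits inside the encoded query string (such as the 3 in %3a) are not matched, because each of them is preceded by % rather than by a / that follows the domain.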
RegEx Breakup:
(?<=\/sub\.example\.com\/(?:[^\/]*\/)*): Lookbehind to assert presence of sub.example.com/ followed by 0 or more repeats of path components separated with /
\d+: Match 1+ digits
(?=(?:\/[^\/]*)*(?:\?[Uu]rl|$)): Must be followed by 0 or more repeats of path components separated with / and that must be followed by ?Url or line end.
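If lookbehind is not available in the target engine, roughly the same logic can be written without it. The sketch below is an alternative technique, not the regex from this answer: it parses the URL with the standard `URL` class and keeps the path components that consist only of digits. The function name `numericSegments` is hypothetical, and because it ignores the query string entirely it is somewhat more permissive than the regex on edge cases.

```javascript
// Alternative sketch (not the regex above): parse the URL, then filter path components.
function numericSegments(url) {
  const { hostname, pathname } = new URL(url);

  // Only handle the host used in the question.
  if (hostname !== "sub.example.com") return [];

  return pathname
    .split("/")                                  // "/79084/t/64931" -> ["", "79084", "t", "64931"]
    .filter((segment) => /^\d+$/.test(segment))  // keep all-digit components, which drops "t"
    .map(Number);
}

console.log(numericSegments("https://sub.example.com/79084/t/64931")); // [ 79084, 64931 ]
```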