I haven’t found any helpful Regex tools to help me figure this complicated pattern out.
I have the following string:
Myfirstname Mylastname, Department of Mydepartment, Mytitle, The University of Me; 4-1-1, Hong,Bunk, Tokyo 113-8655, Japan E-mail:my.email@example.jp, Tel:00-00-222-1171, Fax:00-00-225-3386
I am trying to learn enough Regex patterns to remove the substrings one at a time:
E-mail:my.email@example.jp
Tel:00-00-222-1171
Fax:00-00-225-3386
So I think the correct pattern would be to remove a given word (e.g., "E-mail", "Tel") and everything after it, up to the following comma.
Is this type of dynamic pattern possible in Regex?
I am performing the match in Python; however, I don’t think that matters much.
Also, I know the data string looks comma separated, and it is. However, there is no guarantee that the order of those fields is preserved. That’s why I’m trying to use a Regex match.
>Solution :
How about this regex:
<YOUR_WORD>.*?(?=(,|($)))
Explanation:
- It looks for the word specified in the <YOUR_WORD> placeholder
- It looks for any kind of character afterwards
- The search stops when it hits one of the two options:
  - It finds the character ,
  - It finds an end of the line
So:
E-mail.*?(?=(,|($)))
Will result in:
E-mail:my.email@example.jp
And
Fax.*?(?=(,|($)))
Will result in:
Fax:00-00-225-3386
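As a rough Python sketch of the matches above (the helper name find_field is made up for illustration, and the pattern is written without the redundant groups, which should behave the same):

```python
import re

text = (
    "Myfirstname Mylastname, Department of Mydepartment, Mytitle, "
    "The University of Me; 4-1-1, Hong,Bunk, Tokyo 113-8655, Japan "
    "E-mail:my.email@example.jp, Tel:00-00-222-1171, Fax:00-00-225-3386"
)

def find_field(word, s):
    """Return the text from `word` up to (not including) the next comma or end of string."""
    # re.escape keeps any regex metacharacters in the word literal.
    pattern = re.escape(word) + r".*?(?=,|$)"
    m = re.search(pattern, s)
    return m.group(0) if m else None

for word in ("E-mail", "Tel", "Fax"):
    print(find_field(word, text))
```

Note that the lookahead does not consume the comma, so the match stops just before it.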
If there are edge cases this misses, I would like to know, along with whether handling them affects performance or is even necessary.
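One edge case worth noting for the removal itself: because the lookahead leaves the comma in place, deleting only the matched text leaves a stray separator behind (for example ", , Fax:..."). A possible sketch, assuming it is acceptable to also swallow the comma and whitespace in front of the field; the helper name remove_field and the added \b word boundary are my own additions, not part of the pattern above:

```python
import re

text = (
    "Myfirstname Mylastname, Department of Mydepartment, Mytitle, "
    "The University of Me; 4-1-1, Hong,Bunk, Tokyo 113-8655, Japan "
    "E-mail:my.email@example.jp, Tel:00-00-222-1171, Fax:00-00-225-3386"
)

def remove_field(word, s):
    """Remove the first `word...` field plus the separator directly before it."""
    # ,?\s*  : optional comma and whitespace preceding the field (assumption)
    # \b     : avoid matching the word in the middle of a longer word (assumption)
    # .*?(?=,|$) : same idea as the answer, stop before the next comma or at the end
    pattern = r",?\s*\b" + re.escape(word) + r".*?(?=,|$)"
    return re.sub(pattern, "", s, count=1)

cleaned = text
for word in ("E-mail", "Tel", "Fax"):
    cleaned = remove_field(word, cleaned)
print(cleaned)
```

Since each field is looked up by its word rather than its position, this should not depend on the order of the fields.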