Using regular expressions in Python to find a specific word

I have the following lines (their order can vary, and there can be other similar lines as well). I would like to replace "sid" with "tempvalue", taking into account that "sid" can be surrounded by any character except letters and digits. How can I do that in Python using a regular expression?

lines = [
 "VAR0=sid_host1; -",
 "VAR1=sid; -",
 "VAR2=psid; -",
 "VAR3=sid_host1; -",
 "VAR4=psid_host2; -",
 "VAR5 = (file=/dir1/sid_host1/sid/trace/alert_sid.log)(database=sid)"
]

For line 0 the desired result is: "VAR0=tempvalue_host1; -"

for line 1: "VAR1=tempvalue; -"

for line 3: "VAR3=tempvalue_host1; -"

for line 5: "VAR5 = (file=/dir1/tempvalue_host1/tempvalue/trace/alert_tempvalue.log)(database=tempvalue)"

Other lines must remain untouched.

>Solution :

I think we can just do a regex replace-all with re.sub, using the pattern (?<![^\W_])sid(?![^\W_]):

import re

lines = [
    "VAR0=sid_host1; -",
    "VAR1=sid; -",
    "VAR2=psid; -",
    "VAR3=sid_host1; -",
    "VAR4=psid_host2; -",
    "VAR5 = (file=/dir1/sid_host1/sid/trace/alert_sid.log)(database=sid)"
]

lines = [re.sub(r'(?<![^\W_])sid(?![^\W_])', 'tempvalue', x) for x in lines]
print(lines)

['VAR0=tempvalue_host1; -',
 'VAR1=tempvalue; -',
 'VAR2=psid; -',
 'VAR3=tempvalue_host1; -',
 'VAR4=psid_host2; -',
 'VAR5 = (file=/dir1/tempvalue_host1/tempvalue/trace/alert_tempvalue.log)(database=tempvalue)']

Explanation of regex:

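  • (?<![^\W_]) asserts that the preceding character is not a letter or digit, i.e. it is a non-word character, an underscore, or the start of the string
  • sid matches the literal text sid
  • (?![^\W_]) asserts that the following character is not a letter or digit, i.e. it is a non-word character, an underscore, or the end of the string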
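
The three parts listed above can be assembled for any literal word, not just sid. Here is a minimal sketch, assuming a hypothetical helper named replace_word (it is not part of the original answer). It uses re.escape so that a word containing regex metacharacters is still matched literally:

```python
import re


def replace_word(text, word, replacement):
    """Replace `word` in `text` unless a letter or digit touches it.

    `replace_word` is a hypothetical helper name used for illustration only.
    """
    # (?<![^\W_]) : no letter/digit immediately before the word
    # (?![^\W_])  : no letter/digit immediately after the word
    pattern = r'(?<![^\W_])' + re.escape(word) + r'(?![^\W_])'
    # A callable replacement keeps any backslashes in `replacement` literal.
    return re.sub(pattern, lambda m: replacement, text)


print(replace_word("VAR0=sid_host1; -", "sid", "tempvalue"))  # VAR0=tempvalue_host1; -
print(replace_word("VAR2=psid; -", "sid", "tempvalue"))       # VAR2=psid; -
```

The second call leaves the line untouched because the letter p sits directly in front of sid.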

Note that we are basically building our own custom word boundaries, which treat an underscore the same way as \W. Normally the underscore counts as a word character, so a plain \b would find no boundary between sid and _host1.
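
To see why the built-in \b is not enough here, the short comparison below runs both patterns on line 0 from the question. It is only an illustrative sketch; the variable names with_b and with_custom are assumptions introduced for this example:

```python
import re

line = "VAR0=sid_host1; -"

# \b treats "_" as a word character, so there is no boundary
# between "sid" and "_host1" and nothing is replaced.
with_b = re.sub(r'\bsid\b', 'tempvalue', line)

# The custom boundaries only reject letters and digits,
# so an underscore next to "sid" is allowed.
with_custom = re.sub(r'(?<![^\W_])sid(?![^\W_])', 'tempvalue', line)

print(with_b)       # VAR0=sid_host1; -
print(with_custom)  # VAR0=tempvalue_host1; -
```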
