String comparison between elements in list

I am trying to compare elements between two lists. One list is a predefined pattern against which new lists have to be compared. The comparison should be done between elements at the same index in both lists.
Example: list1[0] must only be compared with list2[0], list1[1] must only be compared with list2[1], etc. The output should be True only if all the elements match.
The issue I am facing is that one element in the predefined pattern has a dynamic part, which I have to ignore when comparing. How can I achieve this?

pattern = ['Hi', 'my' , 'name is <xxxxxxxxxxx> age <yy>']

This is the defined pattern. Here the content inside <> is dynamic and has to be ignored.

Comparing with list2 = ['Hi', 'my' , 'name is soku age 21'] should be true.
Comparing with list3 = ['Hi', 'my', 'soku'] should be false.

How can I achieve this, given that a normal element-to-element string comparison won't work?

Another example

pattern = ['A', 'B', 'C_<xxxx>_AB']
list1 = ['A', 'B', 'C_aaaa112=22_AB']

This should be true

>Solution :

One approach is to use all and re.fullmatch:

import re

pattern = ['Hi', 'my', r'name is .+ age \d{2}']  # raw string so \d is not treated as a string escape
list2 = ['Hi', 'my', 'name is soku age 21']
list3 = ['Hi', 'my', 'soku']

print(all(re.fullmatch(p, l) for p, l in zip(pattern, list2)))
print(all(re.fullmatch(p, l) for p, l in zip(pattern, list3)))

Output

True
False

As an alternative you could use the following pattern:

pattern = ['Hi', 'my', r'name is \S+ age \d{2}']

to avoid matching whitespace characters.

The pattern:

.+

matches one or more of any character except a newline (so spaces are included), while

\S+

matches one or more non-whitespace characters. Moreover, the pattern:

\d{2}

matches exactly two consecutive digits.
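To make the difference concrete, here is a small sketch that runs both variants against inputs that are not in the question (the two-word name and the one-digit age are made up for illustration):

```python
import re

# Both variants from above, written as raw strings.
loose = r'name is .+ age \d{2}'    # .+ may span spaces
strict = r'name is \S+ age \d{2}'  # \S+ stops at the first space

# Hypothetical inputs, chosen only to show where the patterns differ.
two_word_name = 'name is soku san age 21'
one_digit_age = 'name is soku age 7'

print(bool(re.fullmatch(loose, two_word_name)))   # True
print(bool(re.fullmatch(strict, two_word_name)))  # False
print(bool(re.fullmatch(strict, one_digit_age)))  # False
```

Which variant is appropriate depends on whether the dynamic part is allowed to contain spaces.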

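To build the pattern dynamically from user input, you could do something like the following (note that every <...> placeholder becomes .+, so the age is no longer restricted to two digits):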

pattern = ['Hi', 'my', 'name is <xxxxxxxxxxx> age <yy>']
regex_pattern = [re.sub(r"<.+?>", r".+", s) for s in pattern]
print(all(re.fullmatch(p, l) for p, l in zip(regex_pattern, list2)))
print(all(re.fullmatch(p, l) for p, l in zip(regex_pattern, list3)))

Output

True
False
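
One caveat with the substitution above is that the literal parts of the template are used as regex source unchanged, so a character such as "." or "(" in the template would be interpreted as regex syntax. Also, zip stops at the shorter list, so lists of different lengths can still compare as True. A possible sketch that addresses both, assuming placeholders always look like <...> and contain no nested angle brackets (build_regex and matches are hypothetical helper names, not part of the answer above):

```python
import re

def build_regex(template):
    # Split the template on <...> placeholders and escape the literal pieces,
    # so characters like '.' or '(' in the template are matched literally.
    literal_parts = re.split(r"<[^<>]+>", template)
    # Each placeholder becomes ".+" (one or more of any character).
    return ".+".join(re.escape(part) for part in literal_parts)

def matches(pattern, candidate):
    # zip() silently truncates to the shorter list, so check lengths first.
    if len(pattern) != len(candidate):
        return False
    return all(
        re.fullmatch(build_regex(p), c) is not None
        for p, c in zip(pattern, candidate)
    )

pattern = ['Hi', 'my', 'name is <xxxxxxxxxxx> age <yy>']
print(matches(pattern, ['Hi', 'my', 'name is soku age 21']))  # True
print(matches(pattern, ['Hi', 'my', 'soku']))                 # False
print(matches(['A', 'B', 'C_<xxxx>_AB'], ['A', 'B', 'C_aaaa112=22_AB']))  # True
```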