Check if a string is contained within the string elements of an array

We can replace:

if "ab" == "ab" or "ab" == "ac" or "ab" == "ad": # ...

With this:

if "ab" in ("ab", "ac", "ad"): # ...

All’s well and good.

But now, if we replace the equality (==) operator with the membership (in) operator to test for substrings, we get:

if "ab" in "aba" or "ab" in "ac" or "ab" in "ad": # ...

Is there a better way to do this without a for loop or this many or operators? I know I could do something like this, with search_string holding "ab":

if any(search_string in x for x in ("aba", "ac", "ad")): # ...

But is there a simpler way to accomplish this objective, without using a for clause?

Solution:

No, there is no built-in type in Python that lets you search through a collection to test for substrings. What you have is arguably the simplest implementation of a substring search. There is nothing wrong with using any() and a generator expression here.
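If the any() expression from the question appears in several places, it can be wrapped in a small helper. This is only a minimal sketch; the name contains_substring and its parameters are hypothetical, not part of the standard library:

```python
def contains_substring(needle, haystacks):
    """Return True if needle is a substring of at least one haystack."""
    # any() stops at the first match, so later strings are not scanned.
    return any(needle in haystack for haystack in haystacks)


print(contains_substring("ab", ("aba", "ac", "ad")))  # True, "ab" is in "aba"
print(contains_substring("ab", ("ac", "ad")))         # False
```

The generator expression still iterates, but the loop is hidden behind a name that states the intent.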

Don’t get hung up on the fact that you are using the same in operator here. It is the container on the right-hand side of the in operator that implements the actual search, not in itself (apart from fallbacks for objects that don’t define __contains__: iteration via __iter__, or the legacy __getitem__ sequence protocol). Python’s tuple, list, dict, set and frozenset types all define a __contains__ implementation that lets you search for something inside the collection that is equal to the left-hand side operand (for dict, among its keys), which is what makes if "ab" in ("ab", "ac", "ad"): possible. The str type defines __contains__ as a substring test instead, which is why the same operator behaves differently in "ab" in "aba".
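To see that the right-hand operand decides what in means, here is a small sketch. LoggingContainer is a hypothetical class written only for this illustration; it records every membership test it receives:

```python
class LoggingContainer:
    """Hypothetical container that records each membership test."""

    def __init__(self, *items):
        self._items = items
        self.lookups = []

    def __contains__(self, needle):
        # Called by Python for every `needle in container` expression.
        self.lookups.append(needle)
        return needle in self._items  # delegates to tuple.__contains__


box = LoggingContainer("ab", "ac", "ad")
print("ab" in box)   # True
print("zz" in box)   # False
print(box.lookups)   # ['ab', 'zz']

# Same operator, different right-hand types, different semantics:
print("ab" in "aba")     # True: str tests for a substring
print("ab" in ("aba",))  # False: tuple tests elements for equality
```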

If you want to have the same ‘clean’ expression for finding matches for substring containment, implement your own:

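from collections.abc import Container

class SubstringContainer(Container):
    """Container whose membership test matches substrings of its values."""

    def __init__(self, *values):
        self._values = values

    def __contains__(self, needle):
        # True if needle is a substring of at least one stored string
        return any(needle in value for value in self._values)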

and you could then use:

if "ab" in SubstringContainer("aba", "ac", "ad"): # ...

You might even be able to optimise the __contains__ implementation by pre-computing some kind of index. Personally, I’d not bother unless a more involved algorithm, encapsulated by such a class, gave a significant performance boost.
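As one possible reading of "some kind of index", the sketch below pre-computes every substring of every stored value into a set. IndexedSubstringContainer is a hypothetical name, and the approach is an assumption chosen for illustration rather than a recommended design: a string of length n has on the order of n² substrings, so it only makes sense for short strings that are queried many times.

```python
class IndexedSubstringContainer:
    """Hypothetical variant that pre-computes all substrings of its values."""

    def __init__(self, *values):
        self._has_values = bool(values)
        # Trades memory for speed: every non-empty substring is stored once.
        self._index = {
            value[start:end]
            for value in values
            for start in range(len(value))
            for end in range(start + 1, len(value) + 1)
        }

    def __contains__(self, needle):
        if needle == "":
            # "" is a substring of any string, so it matches if anything is stored.
            return self._has_values
        # Average-case constant-time set lookup instead of scanning each value.
        return needle in self._index


container = IndexedSubstringContainer("aba", "ac", "ad")
print("ab" in container)  # True
print("cd" in container)  # False
```

Whether this beats the plain any() version depends on the data, so measuring both before adopting it would be sensible.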
