Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to identify the elements that do not match a given sequence in an array?

Consider the given sequence sequence = [1, 1, 0].

An example of an array that follows the sequence throughout is [1, 1, 0, 1, 1, 0, 1, 1, 0].

I have an array that follows the sequence throughout except with a few outliers. An example of such an array is [1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0]. In this array, the elements that are outliers are at the 6th and 16th indices. If these outliers are removed, the array would follow the sequence throughout.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I want to write a python code that prints the indices of the outliers in the array. The array is assumed to have only 0s and 1s.

The code should look like below:

sequence = [1, 1, 0]
array_with_outliers = [...] //given
//code to identify outliers
print(indices_of_outliers) 

My Failed Attempts

I tried to break the array_with_outliers into chunks of 3 elements. I then iterated through the chunks to see if they matched the sequence. If chunk n didn’t match the sequence, I popped the elements upto index 3(n-1) in the array_with_outliers and chunked the array_with_outliers again, carrying out the same process recursively till the array_with_outliers was empty. The code was too inefficient, however, and my laptop wasn’t able to run it.

I also tried to iterate the array_with_outliers and see if each element matched its expected value based on the sequence. I computed the expected value based on the index and the number of outliers identified so far. However, the problem was that I was individually comparing each element to its expected value, and not the chunk as a whole, which created overlapping problems.

>Solution :

You can do this in linear time by just iterating through the array, skipping matching chunks, and accumulating non-matching indices one at a time until you find the next matching chunk.

def find_outliers(needle: list[int], haystack: list[int]) -> list[int]:
    """Find indices in haystack where
    the sequence of repeated needles is broken."""
    outliers: list[int] = []
    i = 0
    while i < len(haystack):
        if haystack[i:i+len(needle)] == needle:
            i += len(needle)
        else:
            outliers.append(i)
            i += 1
    return outliers

assert not find_outliers(
    [1, 1, 0],
    [1, 1, 0, 1, 1, 0, 1, 1, 0]
)

assert find_outliers(
    [1, 1, 0],
    [1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
) == [6, 16]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading