Why aren't two `zip` objects equal if the underlying data is equal?

Suppose we create two zips from lists and tuples, and compare them, like so:

>>> x1 = [1, 2, 3]
>>> y1 = [4, 5, 6]
>>> x2 = (1, 2, 3)
>>> y2 = (4, 5, 6)
>>> w1 = zip(x1, y1)
>>> w2 = zip(x2, y2)
>>> w1 == w2
False

But calling list on each zip produces the same result:

>>> list(w1)
[(1, 4), (2, 5), (3, 6)]
>>> list(w2)
[(1, 4), (2, 5), (3, 6)]

Why don’t they compare equal, if the contents are equal?

Solution:

The two zip objects don’t compare equal because the zip class doesn’t define any comparison logic, so it falls back on the default object behavior, which only considers object identity. Under that default, an object can only ever compare equal to itself; its contents don’t matter.
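As a rough illustration of that default, here is a minimal sketch using a hypothetical class named Pair (not part of the question) that defines no __eq__, so it inherits the same identity-based comparison that zip ends up with. The last line relies on CPython’s behavior, where zip does not add its own __eq__ to its class dictionary:

```python
class Pair:
    """Hypothetical class with no __eq__; it inherits object's identity comparison."""

    def __init__(self, a, b):
        self.a = a
        self.b = b


p1 = Pair(1, 2)
p2 = Pair(1, 2)

same_object = (p1 == p1)    # True: an object compares equal to itself
same_contents = (p1 == p2)  # False: distinct objects, contents are ignored

# Assumption: on CPython, zip defines no __eq__ of its own and inherits object's.
zip_defines_eq = "__eq__" in vars(zip)  # False
```

By contrast, list and tuple do define __eq__, which is why the results of list(w1) and list(w2) compare equal.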

So, the zip objects will not compare equal even if they are constructed the same way, from the same immutable data:

>>> x = (1, 2, 3)
>>> y = (4, 5, 6)
>>> zip(x, y) == zip(x, y) # separate objects, therefore not equal
False

That said, zip objects don’t "contain" the values they produce: they are one-shot iterators that pull items from their inputs on demand, which is why they can’t be reused once consumed. The only robust way to verify that two zips will give the same results when iterated is to actually do that iteration.
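One way to "do that iteration" without building full lists is sketched below. The helper name iter_equal and the sentinel _MISSING are made up for this example; the approach assumes both iterables are finite and that it is acceptable to consume them:

```python
from itertools import zip_longest

_MISSING = object()  # sentinel that marks "one iterable ended early"


def iter_equal(a, b):
    """Hypothetical helper: True if a and b yield equal items in the same order.

    Consumes both iterables, stopping at the first mismatch.
    """
    return all(p == q for p, q in zip_longest(a, b, fillvalue=_MISSING))


x1, y1 = [1, 2, 3], [4, 5, 6]
x2, y2 = (1, 2, 3), (4, 5, 6)

same = iter_equal(zip(x1, y1), zip(x2, y2))         # True
shorter = iter_equal(zip(x1, y1), zip(x2[:2], y2))  # False: lengths differ

# A zip is exhausted after one pass, so a second pass yields nothing.
w = zip(x1, y1)
first_pass = list(w)   # [(1, 4), (2, 5), (3, 6)]
second_pass = list(w)  # []
```

If the zips are needed again after the check, they have to be rebuilt from the original data, or the results stored in a list first.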

Internally, the zip object just holds some iterators over other data. One might ask: why not compare those iterators, to see if they "point at" the same position in the same underlying data? But that cannot work either: there are arbitrarily many ways to implement an iterator, and an iterator doesn’t in general know anything about what it’s iterating over. Many iterators don’t "know where" they are in the underlying data, and many aren’t at a position in any underlying data at all; they calculate values on the fly.
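To make the "calculated on the fly" case concrete, here is a small sketch that zips two iterators with no stored data behind them. The generator function squares is invented for this example; itertools.count and itertools.islice are standard library:

```python
from itertools import count, islice


def squares():
    """Hypothetical generator: yields 0, 1, 4, 9, ... computed on demand."""
    n = 0
    while True:
        yield n * n
        n += 1


# Neither input has an underlying container or a position within one.
w = zip(count(1), squares())
first_three = list(islice(w, 3))  # [(1, 0), (2, 1), (3, 4)]

# Two generators from the same function still compare by identity only.
generators_equal = (squares() == squares())  # False
```

Since both inputs here are infinite, even the "just iterate and compare" approach would never finish for such zips, which is another reason zip cannot offer a general equality test.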
