I have two elements, x and Y. x is a very long list of characters. Y is an even longer list, such that every element of it is a set. For every element in x, I would like to find the index of the set in Y that contains it.
- The sets are disjoint – there is no element that appears in more than one set.
- All the elements from
xare covered by the sets inY. - There is not a
yinYsuch that it contains an element that is not onx.
Y = [{"a", "b"}, {"c", "d"}, {"e", "f"}]
x = ["a", "b", "c", "d", "e", "f", "a", "f"]
# Desired results:
[('a', 0),
('b', 0),
('c', 1),
('d', 1),
('e', 2),
('f', 2),
('a', 0),
('f', 2)]
Frankly, I am ok with any shape of such results, as long as the computation is efficient.
>Solution :
You can create an index first to speed up the search:
Y = [{"a", "b"}, {"c", "d"}, {"e", "f"}]
x = ["a", "b", "c", "d", "e", "f", "a", "f"]
index = {v: i for i, s in enumerate(Y) for v in s}
out = [(v, index[v]) for v in x]
print(out)
Prints:
[("a", 0), ("b", 0), ("c", 1), ("d", 1), ("e", 2), ("f", 2), ("a", 0), ("f", 2)]