Extract differences of values of 2 given dictionaries – values are tuples of strings

August 19, 2022

I have two dictionaries, shown below, whose values are tuples of strings. For each key, I need to extract the strings that appear in one dictionary's tuple but not in the other's:

dict_a = {"s": ("mmmm", "iiiii", "p11"), "yyzz": ("oo", "i9")}
dict_b = {"s": ("mmmm",), "h": ("pp",), "g": ("rr",)}

The desired output:

{"s": ("iiiii", "p11"), "yyzz": ("oo", "i9")}

The order of the strings in the output doesn’t matter.

Here is one approach I tried, but it doesn’t produce the expected result:

>>> [item for item in dict_a.values() if item not in dict_b.values()]
[('mmmm', 'iiiii', 'p11'), ('oo', 'i9')]

Solution:

If order doesn’t matter, convert your dictionary values to sets, and subtract these:

{k: set(v) - set(dict_b.get(k, ())) for k, v in dict_a.items()}

The comprehension iterates over every key-value pair in dict_a and builds a new dictionary with the same keys. Each new value is the set difference between the original value and the corresponding value from dict_b, if there is one:

>>> dict_a = {"s": ("mmmm", "iiiii", "p11"), "yyzz": ("oo", "i9")}
>>> dict_b = {"s": ("mmmm",), "h": ("pp",), "g": ("rr",)}
>>> {k: set(v) - set(dict_b.get(k, ())) for k, v in dict_a.items()}
{'s': {'p11', 'iiiii'}, 'yyzz': {'oo', 'i9'}}
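
For readers who find the one-liner dense, the same logic can be spelled out as an explicit loop. This is only an equivalent sketch of the comprehension; the names `result`, `key`, `values` and `other` are chosen here for illustration:

```python
dict_a = {"s": ("mmmm", "iiiii", "p11"), "yyzz": ("oo", "i9")}
dict_b = {"s": ("mmmm",), "h": ("pp",), "g": ("rr",)}

result = {}
for key, values in dict_a.items():
    # Fall back to an empty tuple when dict_b has no entry for this key.
    other = dict_b.get(key, ())
    # Keep only the strings that are in dict_a's tuple but not in dict_b's.
    result[key] = set(values) - set(other)

print(result)
```

Keys that exist only in dict_b ("h" and "g") never appear in the result, because the loop iterates over dict_a alone.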

The output will have sets, but these can be converted back to tuples if necessary (the order of the strings within each tuple is then arbitrary):

{k: tuple(set(v) - set(dict_b.get(k, ()))) for k, v in dict_a.items()}

The dict_b.get(k, ()) call ensures there is always a tuple to give to set().
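
A small sketch of that fallback in isolation may help. The variable names `present` and `missing` are illustrative only:

```python
dict_b = {"s": ("mmmm",), "h": ("pp",), "g": ("rr",)}

present = dict_b.get("s", ())     # key exists: the stored tuple is returned
missing = dict_b.get("yyzz", ())  # key absent: the default () is returned

print(present)       # ('mmmm',)
print(missing)       # ()
print(set(missing))  # set()
```

Subtracting an empty set leaves the dict_a value unchanged, which is why "yyzz" keeps both of its strings.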

If you use the set.difference() method, you don’t even need to turn the dict_b value into a set, because the method accepts any iterable:

{k: tuple(set(v).difference(dict_b.get(k, ()))) for k, v in dict_a.items()}

Demo of the latter two options:

>>> {k: tuple(set(v) - set(dict_b.get(k, ()))) for k, v in dict_a.items()}
{'s': ('p11', 'iiiii'), 'yyzz': ('oo', 'i9')}
>>> {k: tuple(set(v).difference(dict_b.get(k, ()))) for k, v in dict_a.items()}
{'s': ('p11', 'iiiii'), 'yyzz': ('oo', 'i9')}