Pythonic way of joining lists of dictionaries on a key

Suppose I have two lists of dictionaries, l1 and l2.

l1 = [
    { "id": 0, "foo": 0 },
    { "id": 1, "foo": 1 },
    { "id": 2, "foo": 2 },
    ...
]

l2 = [
    { "id": 0, "bar": 0 },
    { "id": 1, "bar": 1 },
    { "id": 2, "bar": 2 },
    ...
]

Is there a Pythonic way of joining the two lists together on a key, say "id"?

Expected output:

[
    { "id": 0, "foo": 0, "bar": 0 },
    { "id": 1, "foo": 1, "bar": 1 },
    { "id": 2, "foo": 2, "bar": 2 },
    ...
]

This can be achieved with a list comprehension, but it runs in O(NM) time, and the merged dictionaries end up with a redundant key-value pair if l1 and l2 use different names for the join key.

[
    {**d1, **d2}
    for d1 in l1 for d2 in l2
    if d1["id"] == d2["id"]
]
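
To make the redundant-pair problem concrete, here is a small sketch. The key name "key" used in l2 is an assumption for illustration only; it does not appear in the question.

```python
# Hypothetical data: l1 uses "id" as the join key, l2 uses "key".
l1 = [{"id": 0, "foo": 0}, {"id": 1, "foo": 1}]
l2 = [{"key": 0, "bar": 0}, {"key": 1, "bar": 1}]

# The nested comprehension compares every d1 with every d2: O(N*M).
joined = [
    {**d1, **d2}
    for d1 in l1 for d2 in l2
    if d1["id"] == d2["key"]
]

print(joined)
# [{'id': 0, 'foo': 0, 'key': 0, 'bar': 0}, {'id': 1, 'foo': 1, 'key': 1, 'bar': 1}]
```

Each merged dictionary carries the join value twice, once under "id" and once under "key".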

Alternatively, at some cost to readability, it can be solved in O(N + M) time:

# Create a mapping from the key of d1 to d1.
# This dictionary will combine the entries of d1 and d2.
d = { d1["id"]: d1 for d1 in l1 }

# Insert d2 entries into their corresponding dictionaries.
for d2 in l2:
    key = d2["id"]
    d[key].update({
        k: v
        for (k, v) in d2.items()
        if k != "id"
    })

# Convert the dictionary back into a list of dictionaries.
result = list(d.values())
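
Two properties of this version are worth noting: it updates the dictionaries of l1 in place, and d[key] raises KeyError when l2 contains an id that l1 lacks. A minimal sketch of both, using made-up sample data in which id 2 appears only in l2:

```python
l1 = [{"id": 0, "foo": 0}, {"id": 1, "foo": 1}]
l2 = [{"id": 0, "bar": 0}, {"id": 2, "bar": 2}]  # id 2 is missing from l1

d = {d1["id"]: d1 for d1 in l1}

unmatched = []
for d2 in l2:
    try:
        d[d2["id"]].update({k: v for k, v in d2.items() if k != "id"})
    except KeyError:
        # No dictionary in l1 has this id, so there is nothing to update.
        unmatched.append(d2["id"])

print(unmatched)  # [2]
print(l1[0])      # {'id': 0, 'foo': 0, 'bar': 0}  (the input was mutated)
```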

Is there a better solution?

Solution:

"Pythonic" doesn’t mean "use list comprehensions instead of for loops". For loops are very Pythonic. Just use an intermediate dict as an index, fill it with the .setdefault grouping idiom, and use itertools.chain to iterate over both lists in a single loop, which keeps the code clean:

import itertools

index = {}

for d in itertools.chain(l1, l2):
    index.setdefault(d['id'], {}).update(d)

result = list(index.values())

You could also use a defaultdict instead of a plain dict with .setdefault. In this case I probably would, since the defaultdict is only an intermediate data structure:

import itertools
import collections

index = collections.defaultdict(dict)

for d in itertools.chain(l1, l2):
    index[d["id"]].update(d)

result = list(index.values())
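
As a usage sketch, the loop can be wrapped in a small helper. The function name join_on and the sample data are assumptions made for illustration, not part of the answer. Note that this behaves like a full outer join: an id that appears in only one list is still kept, later lists overwrite earlier values for the same field, and the input dictionaries are left unmodified.

```python
import collections
import itertools


def join_on(key, *lists):
    """Merge the dictionaries from all lists that share the same value for key."""
    index = collections.defaultdict(dict)
    for d in itertools.chain.from_iterable(lists):
        # Fields are copied into a fresh dict, so the inputs are not mutated.
        index[d[key]].update(d)
    return list(index.values())


l1 = [{"id": 0, "foo": 0}, {"id": 1, "foo": 1}]
l2 = [{"id": 0, "bar": 0}, {"id": 2, "bar": 2}]  # id 1 and id 2 have no partner

result = join_on("id", l1, l2)
print(result)
# [{'id': 0, 'foo': 0, 'bar': 0}, {'id': 1, 'foo': 1}, {'id': 2, 'bar': 2}]
```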