How to map multiple nested dictionaries to Pandas DataFrame

August 10, 2022

I have a Pandas DataFrame that looks something like this:

import pandas as pd

df = pd.DataFrame({
    "class":["a","b","c","b","a","b","c","c","a","c","b","c","c","a"],
    "country":["aus","usa","fra","aus","ger","usa","usa","ger","aus","usa","fra","fra","usa","aus"],
    "category":["w","x","y","x","z","z","w","x","y","y","x","z","y","z"]
})

    class   country category
0   a       aus     w
1   b       usa     x
2   c       fra     y
3   b       aus     x
4   a       ger     z
5   b       usa     z
6   c       usa     w
7   c       ger     x
8   a       aus     y
9   c       usa     y
10  b       fra     x
11  c       fra     z
12  c       usa     y
13  a       aus     z

Furthermore, I have a dictionary for each class, each of which is a nested dictionary containing a number for each category for each country:

class_a_numbers = {'aus': {'w': 100, 'x': 50, 'y': 80, 'z': 10},
 'fra': {'w': 120, 'x': 60, 'y': 90, 'z': 10},
 'ger': {'w': 105, 'x': 75, 'y': 85, 'z': 0},
 'usa': {'w': 140, 'x': 100, 'y': 110, 'z': 50}}

class_b_numbers = {'aus': {'w': 1000, 'x': 500, 'y': 800, 'z': 100},
 'fra': {'w': 1200, 'x': 600, 'y': 900, 'z': 100},
 'ger': {'w': 1050, 'x': 750, 'y': 850, 'z': 0},
 'usa': {'w': 1400, 'x': 1000, 'y': 1100, 'z': 500}}

class_c_numbers = {'aus': {'w': 10, 'x': 5, 'y': 9, 'z': 2},
 'fra': {'w': 13, 'x': 5, 'y': 8, 'z': 1},
 'ger': {'w': 9, 'x': 6, 'y': 3, 'z': 1},
 'usa': {'w': 12, 'x': 12, 'y': 10, 'z': 3}}

I would like to create a number column in df and populate it with the appropriate number from the appropriate dictionary, such that the outcome will look like this:

    class   country category    number
0   a       aus     w           100
1   b       usa     x           1000
2   c       fra     y           8
3   b       aus     x           500
4   a       ger     z           0
5   b       usa     z           500
6   c       usa     w           12
7   c       ger     x           6
8   a       aus     y           80
9   c       usa     y           10
10  b       fra     x           600
11  c       fra     z           1
12  c       usa     y           10
13  a       aus     z           10

My real dataset is quite large, so efficiency is an important factor. Normally, I would use map from a dictionary, but I can’t figure out how to make it work with these three nested dictionaries.

>Solution :

df.apply to the rescue:

numbers = {
    'a': class_a_numbers,
    'b': class_b_numbers,
    'c': class_c_numbers,
}

df["number"] = df.apply(lambda row: numbers[row["class"]][row["country"]][row["category"]], axis=1)