Convert a pandas dataframe to a multi-key dictionary where key order is irrelevant

I would like to convert a pandas dataframe to a multi-key dictionary, using 2 or more columns as the dictionary key, and I would like the order of these keys to be irrelevant.

Here's an example of converting a pandas dataframe to a regular multi-key dictionary, where order is relevant.

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(5, 3)), columns=list('ABC'))

df_dict = df.set_index(['B', 'C']).to_dict()['A']
print(df_dict)
{(33, 21): 85, (61, 46): 88, (78, 12): 48, (89, 18): 65, (91, 19): 41}

So df_dict[(33, 21)] returns 85, but df_dict[(21, 33)] raises a KeyError.

Potential Solutions

This Stack Overflow question covers ways to build order-irrelevant dictionaries using sorted, tuple, Counter, and/or frozenset:

Multiples-keys dictionary where key order doesn't matter
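
For reference, here is a minimal sketch of how those building blocks can produce hashable, order-insensitive keys. The variable names are illustrative only and are not taken from the linked question. Note that a Counter is itself unhashable, so this sketch freezes its items.

```python
from collections import Counter

# Three hashable, order-insensitive key forms for the pair (33, 21).
pair, swapped = (33, 21), (21, 33)

sorted_key = tuple(sorted(pair))              # canonical ordering
set_key = frozenset(pair)                     # unordered, drops duplicates
count_key = frozenset(Counter(pair).items())  # unordered, keeps duplicates

# Each form gives the same key for both orderings.
assert sorted_key == tuple(sorted(swapped))
assert set_key == frozenset(swapped)
assert count_key == frozenset(Counter(swapped).items())

# frozenset collapses repeated values; a sorted tuple does not.
assert frozenset((5, 5)) == frozenset((5,))
assert tuple(sorted((5, 5))) == (5, 5)
```

Which form fits best depends on whether repeated values within a key need to stay distinct.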

However, I don't see an obvious way to combine these data types and functions with the pandas conversion methods.

The next idea would be to convert the dictionary keys after the dataframe has been converted.

I tried this:

new_d = {frozenset(key): value for key, value in df_dict}

But got this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-6a3244440ac2> in <module>()
----> 1 new_d = {frozenset(key): value for key, value in df_dict}
      2 new_d

<ipython-input-49-6a3244440ac2> in <dictcomp>(.0)
----> 1 new_d = {frozenset(key): value for key, value in df_dict}
      2 new_d

TypeError: 'int' object is not iterable
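
The traceback follows from how dictionaries iterate: looping over df_dict directly yields only its keys, so each tuple key such as (33, 21) is unpacked into key=33 and value=21, and frozenset(33) raises the TypeError. A minimal sketch of the same comprehension iterating over .items() instead, using the sample dictionary printed earlier rather than a fresh random dataframe:

```python
# Sample dictionary copied from the printed output above.
df_dict = {(33, 21): 85, (61, 46): 88, (78, 12): 48, (89, 18): 65, (91, 19): 41}

# .items() yields (key, value) pairs, so key is the whole tuple.
new_d = {frozenset(key): value for key, value in df_dict.items()}

print(new_d[frozenset((33, 21))])  # 85
print(new_d[frozenset((21, 33))])  # 85
```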

Solution:

Why not create the dictionary directly from the dataframe?

d = dict(zip(df[['B', 'C']].apply(frozenset, axis=1), df['A']))
d
{frozenset({72, 12}): 34, frozenset({98, 76}): 82, frozenset({67, 7}): 35, frozenset({60, 70}): 18, frozenset({8, 53}): 81}
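
Lookups against a dictionary built this way need the query wrapped in frozenset as well. The sketch below assumes a small fixed dataframe (hypothetical values, chosen so the result is reproducible) and a hypothetical helper named lookup; neither is part of the original answer.

```python
import pandas as pd

# Hypothetical fixed data instead of random values.
df = pd.DataFrame({'A': [85, 88, 48], 'B': [33, 61, 78], 'C': [21, 46, 12]})

# Same construction as the answer: one frozenset key per row of B and C.
d = dict(zip(df[['B', 'C']].apply(frozenset, axis=1), df['A']))

def lookup(*keys):
    """Hypothetical helper: fetch a value regardless of key order."""
    return d[frozenset(keys)]

assert lookup(33, 21) == lookup(21, 33) == 85
```

Two caveats apply to frozenset keys: a row whose B and C values are equal collapses to a single-element key, and two rows holding the same pair in either order map to the same key, so the later row overwrites the earlier one.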