How to make a pandas dataframe from multiple dictionaries?

My analysis returns one dictionary of measurements per sample. I would like to collect these in a dataframe with one row per sample (that is, one row per dictionary). Every dictionary has the same keys. Is there an efficient way to add each dictionary as a row of a dataframe?

sample_1 = {"area": 2, "perimeter": 3, "diameter": 5}
sample_2 = {"area": 6, "perimeter": 3, "diameter": 8}

I want to combine these in a dataframe. The columns should be area, perimeter and diameter, and the rows should be the samples. I have over 5000 samples and 20 variables stored in the dictionaries.

I have tried pd.DataFrame.from_dict, but that would mean turning each dictionary into its own dataframe and then merging them all.

I cannot change the measuring function to return a dataframe, so the dictionaries are what I have to work with.

Solution:

Combine all your samples in a list:

import pandas as pd

sample_1 = {'area': 2, 'perimeter': 3, 'diameter': 5}
sample_2 = {'area': 6, 'perimeter': 3, 'diameter': 8}

# Each dictionary in the list becomes one row; the keys become the columns
samples = [sample_1, sample_2]

out = pd.DataFrame(samples)

If you can, it’s even better to drop the intermediate variable names:

samples = [
    {'area': 2, 'perimeter': 3, 'diameter': 5},
    {'area': 6, 'perimeter': 3, 'diameter': 8},
]

out = pd.DataFrame(samples)

Output:

   area  perimeter  diameter
0     2          3         5
1     6          3         8
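
With thousands of samples, the list would normally be filled in a loop rather than typed out. The following is a minimal sketch of that pattern, not the asker's actual code: `measure` is a hypothetical stand-in for the analysis function that returns one dictionary per sample, and its values are made up for illustration.

```python
import pandas as pd


def measure(sample_id):
    # Hypothetical placeholder for the real analysis function.
    # It returns one dictionary of measurements per sample.
    return {
        "area": sample_id * 2,
        "perimeter": sample_id + 1,
        "diameter": sample_id * 3,
    }


# Collect all dictionaries first, then build the dataframe in one call.
records = [measure(sample_id) for sample_id in range(1, 4)]
df = pd.DataFrame(records)
```

Building the dataframe once from the full list is generally preferable to growing it row by row, because each row-wise append copies the existing data.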

If your samples have meaningful names:

sample_1 = {'area': 2, 'perimeter': 3, 'diameter': 5}
sample_2 = {'area': 6, 'perimeter': 3, 'diameter': 8}

samples = {'A': sample_1, 'B': sample_2}

out = pd.DataFrame.from_dict(samples, orient='index')

Output:

   area  perimeter  diameter
A     2          3         5
B     6          3         8
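
If the names and the dictionaries are held in two separate lists instead of one nested dictionary, the names can be passed through the `index` argument of the `pd.DataFrame` constructor. This is a small sketch under that assumption; the variable names `names`, `records` and `by_name` are illustrative only.

```python
import pandas as pd

# Assumed layout: sample names and measurement dictionaries in parallel lists.
names = ["A", "B"]
records = [
    {"area": 2, "perimeter": 3, "diameter": 5},
    {"area": 6, "perimeter": 3, "diameter": 8},
]

# The index argument labels the rows; the dictionary keys become the columns.
by_name = pd.DataFrame(records, index=names)
```

The two lists must have the same length and the same order, since rows are matched to names by position.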