Creating data frame from dictionary in function but function not returning entire data frame

I am relatively new to Python (I worked in R for a while), and I feel that I am fundamentally misunderstanding something about Python here. Below is a minimal reproducible example; from it, I would like a data frame with each integer key as a variable. The function below outputs only the "2" integer as a variable. If I indent the "return df" so that it sits inside the loop, I get the "0" integer as a variable, with its contents as observations. If I use print instead and indent it so that it sits under the "df" line, I get what I want, but it’s not in a single data frame. Can someone explain what is going on here?

Expected output would be:

[Image of the expected output; not reproduced here]

import pandas as pd

d = {0: [1.5, 2.3, 4.5], 1: [5.6, 2.4, 4.4], 2: [3.5, 3.4, 5]}

def classify(z):
    for i in z:
        df = pd.DataFrame({i: z[i]})
    return df

classify(d)

Solution:

If you pass a dictionary to the DataFrame constructor, each key becomes a column name and the corresponding value supplies that column's data.
In your example:

df = pd.DataFrame(d)

Output:
     0    1    2
0  1.5  5.6  3.5
1  2.3  2.4  3.4
2  4.5  4.4  5.0

To get the desired output, you could do either of the following:

# 1: pass only the dictionary values to the DataFrame constructor (I'd prefer that one)
df = pd.DataFrame(d.values())

# 2: build the frame column-wise, then transpose it
df = pd.DataFrame(d).T
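
To see how the two options relate, here is a small sketch using the d from the question. It wraps d.values() in list(...), which is an assumption made here so the example does not depend on how a particular pandas version handles dictionary views; the variable names are illustrative only.

```python
import pandas as pd

d = {0: [1.5, 2.3, 4.5], 1: [5.6, 2.4, 4.4], 2: [3.5, 3.4, 5]}

# Option 1: each dictionary value becomes a row; the keys are not used,
# so the rows get the default 0, 1, 2, ... index
rows_from_values = pd.DataFrame(list(d.values()))

# Option 2: build the frame column-wise, then transpose it;
# the dictionary keys end up as the row index
rows_from_transpose = pd.DataFrame(d).T

print(rows_from_values)
print(rows_from_transpose)
```

With this particular d the two results look the same, because the keys happen to be 0, 1, 2. With other keys (strings, say), only the transposed version would carry them over as row labels.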

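I think you created the dict just for this question (it is not quite clear), but you can pass dicts or lists/tuples directly to the DataFrame constructor, as ddejohn already mentioned. In your code you don’t update an existing data frame; every iteration creates a new one and assigns it to df, so in the end df contains only the data of the last item assigned. For the same reason, indenting "return df" into the loop makes the function exit during the first iteration, which is why you then get only the "0" column.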
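
A minimal sketch of that behaviour, using the d from the question. The two function names are made up for illustration; they differ only in where the return statement sits.

```python
import pandas as pd

d = {0: [1.5, 2.3, 4.5], 1: [5.6, 2.4, 4.4], 2: [3.5, 3.4, 5]}

def return_after_loop(z):
    for i in z:
        # Each pass builds a brand-new one-column frame and rebinds df to it;
        # the frame from the previous pass is discarded
        df = pd.DataFrame({i: z[i]})
    return df  # runs once, after the last iteration

def return_inside_loop(z):
    for i in z:
        df = pd.DataFrame({i: z[i]})
        return df  # exits the function during the first iteration

print(list(return_after_loop(d).columns))   # [2]
print(list(return_inside_loop(d).columns))  # [0]
```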

Edit to your question:

Look here. It is highly recommended not to do it the way you want to.

Have a look at the official pandas DataFrame documentation. I think things will be much clearer after that. But since you asked: if you want to fill your df in a loop, I think this is the easiest way to go:

df = pd.DataFrame()
for k, v in d.items():
    df[k] = v
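
Put back into the shape of the question's function, this could look roughly as follows. It is only a sketch: classify_loop is a hypothetical name for the loop-based variant, and classify here simply hands the whole dictionary to the constructor.

```python
import pandas as pd

d = {0: [1.5, 2.3, 4.5], 1: [5.6, 2.4, 4.4], 2: [3.5, 3.4, 5]}

def classify(z):
    # Pass the whole dictionary at once: keys become column names
    return pd.DataFrame(z)

def classify_loop(z):
    df = pd.DataFrame()  # created once, before the loop
    for k, v in z.items():
        df[k] = v  # adds a column to the same frame on every pass
    return df

print(classify(d))
print(classify_loop(d))
```

Both functions return a frame with the columns 0, 1 and 2. The first is shorter and avoids growing a frame column by column.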