Pandas DataFrame Created from Dictionary vs Created from List

Is there a line or two of code that would make the DataFrame created from lists behave like the one created from a dictionary?

# DataFrame created from a dictionary; this works:
import pandas as pd

data = {'Salary': [30000, 40000, 50000, 85000, 75000],
        'Exp': [1, 3, 5, 10, 25],
        'Gender': ['M', 'F', 'M', 'F', 'M']}
df = pd.DataFrame(data)
print(df)
print()

new_df1 = df[df['Salary'] >= 50000]
print(new_df1)
print()

new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])
print(new_df2)


# DataFrame created from lists; the conditional filter and the sort below don't work:
data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

df = pd.DataFrame(data)
print(df)
print()

new_df1 = df[df['Salary'] >= 50000]  # doesn't work
print(new_df1)
print()

new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])  # ditto
print(new_df2)

Solution:

In your second snippet, the first sublist is treated as a row of data rather than as the column names.

Instead, pass the first sublist as the columns parameter of the DataFrame constructor, and the remaining sublists as the data:

df = pd.DataFrame(data[1:], columns=data[0])

Output:

   Salary  Exp Gender
0   30000    1      M
1   40000    3      F
2   50000    5      M
3   85000   10      F
4   75000   25      M
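
As a quick sanity check, the sketch below builds the frame both ways and compares them. The names `rows`, `as_dict`, `df_from_lists` and `df_from_dict` are made up for this illustration; the data is the same as in the question. With the header split off, the two frames are expected to compare as equal, and the numeric columns should come out with an integer dtype.

```python
import pandas as pd

# Same data as in the question, once as a list of lists and once as a dict
rows = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]
as_dict = {'Salary': [30000, 40000, 50000, 85000, 75000],
           'Exp': [1, 3, 5, 10, 25],
           'Gender': ['M', 'F', 'M', 'F', 'M']}

# First sublist becomes the header, the remaining sublists become the rows
df_from_lists = pd.DataFrame(rows[1:], columns=rows[0])
df_from_dict = pd.DataFrame(as_dict)

print(df_from_lists.dtypes)                 # dtype of each column
print(df_from_lists.equals(df_from_dict))   # same labels, dtypes and values?
```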

Why your code failed

Your code passed the first sublist as a row of data, so pandas assigned default integer column labels and there was no 'Salary' or 'Exp' column to select:

pd.DataFrame(data)

        0    1       2   # incorrect header
0  Salary  Exp  Gender   # this shouldn't be a data row
1   30000    1       M
2   40000    3       F
3   50000    5       M
4   85000   10       F
5   75000   25       M
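
To see the failure directly, here is a small sketch that inspects the frame built without a header. The name `bad` is only for illustration. Because the column labels are the integers 0, 1 and 2, looking up 'Salary' raises a KeyError, and the first two columns hold a mix of strings and numbers rather than a numeric dtype.

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

# No columns= argument, so the header sublist ends up as row 0
bad = pd.DataFrame(data)

print(list(bad.columns))   # default integer labels instead of names
print(bad[0].dtype)        # column 0 mixes 'Salary' with numbers

try:
    bad['Salary']          # there is no column with this label
except KeyError as exc:
    print('KeyError:', exc)
```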

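Full code:
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

# The first sublist supplies the column names; the rest are the rows
df = pd.DataFrame(data[1:], columns=data[0])
print(df)
print()

new_df1 = df[df['Salary'] >= 50000]  # works now: 'Salary' is a column label
print(new_df1)
print()

new_df2 = df.sort_values(['Exp'], axis=0, ascending=[False])  # ditto
print(new_df2)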

Output:

   Salary  Exp Gender
0   30000    1      M
1   40000    3      F
2   50000    5      M
3   85000   10      F
4   75000   25      M

   Salary  Exp Gender
2   50000    5      M
3   85000   10      F
4   75000   25      M

   Salary  Exp Gender
4   75000   25      M
3   85000   10      F
2   50000    5      M
1   40000    3      F
0   30000    1      M
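
If the DataFrame has already been built with the header as its first row, one possible repair (a sketch, not part of the original answer) is to promote that row to the column labels afterwards. The names `raw` and `fixed` are hypothetical. Note the extra `infer_objects()` call: the columns start out as mixed string/number columns, so they need to be re-inferred once the header row is gone.

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

raw = pd.DataFrame(data)                     # header wrongly stored as row 0

fixed = raw.iloc[1:].reset_index(drop=True)  # drop the header row, renumber from 0
fixed.columns = raw.iloc[0].tolist()         # use the old first row as column labels
fixed = fixed.infer_objects()                # let pandas re-infer numeric dtypes

print(fixed.dtypes)
print(fixed[fixed['Salary'] >= 50000])
```

Passing `columns=data[0]` at construction time is simpler and avoids the dtype detour, so this variant is mainly useful when the list data cannot be re-read.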