
python – drop duplicated index in place in a pandas dataframe

I have a list of dataframes:

all_df = [df1, df2, df3]

I would like to remove rows with duplicated indices in all dataframes in the list, such that the changes are reflected in the original dataframes df1, df2 and df3.
I tried to do

for df in all_df:
    df = df[~df.index.duplicated()]

But the changes are only applied in the list, not on the original dataframes.
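To illustrate the problem with toy, made-up data: assigning to `df` inside the loop only rebinds the loop variable to a new object; the original DataFrame is never modified.

```python
import pandas as pd

# Toy frame with a duplicated index label (illustrative data)
df1 = pd.DataFrame({'a': [1, 2, 3]}, index=[0, 0, 1])

for df in [df1]:
    # This rebinds the local name `df`; df1 itself is untouched
    df = df[~df.index.duplicated()]

print(len(df1))  # still 3 rows
```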


Essentially, I want to avoid doing the following:

df1 = df1[~df1.index.duplicated()]
df2 = df2[~df2.index.duplicated()]
df3 = df3[~df3.index.duplicated()]
all_df = [df1,df2,df3]

Solution:

You need to recreate the list of DataFrames:

all_df = [df[~df.index.duplicated()] for df in all_df]

Or:

for i, df in enumerate(all_df):
    all_df[i] = df[~df.index.duplicated()]

print(all_df[0])
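A minimal runnable check of the rebuilt list, using toy data (made-up values):

```python
import pandas as pd

# Illustrative frames with duplicated index labels
df1 = pd.DataFrame({'a': [1, 2, 3]}, index=[0, 0, 1])
df2 = pd.DataFrame({'b': [4, 5]}, index=[2, 2])

# Rebuild the list with deduplicated copies (keep='first' is the default)
all_df = [df[~df.index.duplicated()] for df in [df1, df2]]

print(all_df[0].index.tolist())  # [0, 1]
print(all_df[1].index.tolist())  # [2]
```

Note that `df1` and `df2` themselves are unchanged; the list simply holds the deduplicated copies.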

EDIT: If the names matter, use a dictionary of DataFrames instead. In-place modification of df1 and df2 still does not happen here; select the results by the dictionary's keys:

d = {'price': df1, 'volumes': df2}

d = {k: df[~df.index.duplicated()] for k, df in d.items()}

print(d['price'])
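The same pattern, demonstrated end to end with toy data (illustrative keys and values):

```python
import pandas as pd

# Dictionary of frames, each with possibly duplicated index labels
d = {'price': pd.DataFrame({'p': [10, 11]}, index=[0, 0]),
     'volumes': pd.DataFrame({'v': [5, 6]}, index=[1, 2])}

# Rebuild the dict; keys are preserved, values are deduplicated copies
d = {k: df[~df.index.duplicated()] for k, df in d.items()}

print(d['price'].index.tolist())    # [0]
print(d['volumes'].index.tolist())  # [1, 2]
```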