Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas apply same filtering on two dataframes

I have two pandas dataframes with the same indexes and lengths. However, they might contain nans at different places. See, example input below:

timestamp,y
2019-08-01 00:00:00,  772.00
2019-08-01 00:15:00,  648.00
2019-08-01 00:30:00,  nan
timestamp,y
2019-08-01 00:00:00,  612.00
2019-08-01 00:15:00,  nan
2019-08-01 00:30:00,  500.00

I want to do some calculations where I want to remove indexes in both the dataframes even if one of them has a NaN values. So, in those case only the first row is valid.

Is there a way to do this easily in pandas?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

You can merge to get the common indices, then subset:

keep = df1.merge(df2, on='timestamp').dropna(how='any')['timestamp']

new_df1 = df1[df1['timestamp'].isin(keep)]

new_df2 = df2[df2['timestamp'].isin(keep)]

Alternative, you can use python sets to get the intersection:

keep = set(df1.dropna()['timestamp']).intersection(df2.dropna()['timestamp'])

output:

# new_df1
             timestamp      y
0  2019-08-01 00:00:00  772.0

# new_df2
             timestamp      y
0  2019-08-01 00:00:00  612.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading