Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Drop all rows of a dataframe based on another dataframe

I have two dataframes, df_1 and df_2. Both of them have datetimeindex, starting in 2022-01-01 and goes until 2022-08-14. The first one, df_1, has hourly measurements, and the second one, df_2, has daily measurements.

df_1 = pd.DataFrame(np.random.rand(5424, 1),
                    columns=["Random"],
                    index=pd.date_range(start="20220101000000", end="20220814230000", freq='H'))
df_2 = pd.DataFrame(np.random.randint(0, 3, size=226),
                    columns=["Random"],
                    index=pd.date_range(start="20220101", end="20220814", freq='D'))

How could i drop all rows from df_1 in wich the same day of df_2 has a measurement different from zero? For example, if the first two days of January have 6 and 7 as measurements, i would need to drop all 48 hours of those days in df_1.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Use merge_asof with filter rows by another DataFrame:

np.random.seed(123)
df_1 = pd.DataFrame(np.random.rand(5424, 1),
                    columns=["Random"],
                    index=pd.date_range(start="20220101000000", 
                                        end="20220814230000", freq='H'))
df_2 = pd.DataFrame(np.random.randint(0, 3, size=226),
                    columns=["Random"],
                    index=pd.date_range(start="20220101", end="20220814", freq='D'))



df = pd.merge_asof(df_1, df_2, left_index=True, right_index=True)
df = df[df['Random_y'].ne(0)]
print (df)
                     Random_x  Random_y
2022-01-01 00:00:00  0.696469         1
2022-01-01 01:00:00  0.286139         1
2022-01-01 02:00:00  0.226851         1
2022-01-01 03:00:00  0.551315         1
2022-01-01 04:00:00  0.719469         1
                      ...       ...
2022-08-14 19:00:00  0.461419         2
2022-08-14 20:00:00  0.181182         2
2022-08-14 21:00:00  0.676527         2
2022-08-14 22:00:00  0.213839         2
2022-08-14 23:00:00  0.049919         2

[3696 rows x 2 columns]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading