Delete rows when two conditions are met in a pandas DataFrame with Python

I have the following DataFrame:

  column1  column2  columns3  column4
0       A        1         2      3.0
1       B        1         2      3.0
2       B        1         2      NaN
3       B        1         2      NaN

I’m trying to delete all rows that have the value "B" in column1 and a blank cell (or a NaN value) in column4.

This does not work:

for row in df.iterrows():
    if (df.column1.items() == "B"):
        if (df.column4.isnull()):
            df.drop()

And this does not work either:

for row in df.iterrows():
    if (df.column1.items() == "B") & (df.column4.isna()):
        df.drop()

I do not get an error when I run this, but the DataFrame is unchanged when I print it.

>Solution :

Use multiple conditions and boolean indexing:

out = df[df['column1'].ne('B') | df['column4'].notna()]

which, by De Morgan’s law, is equivalent to:

out = df[~(df['column1'].eq('B') & df['column4'].isna())]

Output:

  column1  column2  columns3  column4
0       A        1         2      3.0
1       B        1         2      3.0

Intermediates for the first approach:

  column1  column2  columns3  column4  col1 ≠ B  col4.notna()  (col1 ≠ B) OR col4.notna()
0       A        1         2      3.0      True          True                        True
1       B        1         2      3.0     False          True                        True
2       B        1         2      NaN     False         False                       False
3       B        1         2      NaN     False         False                       False

Intermediates for the second approach:

  column1  column2  columns3  column4  col1 == B  col4.isna()  (col1 == B) AND col4.isna()      ~
0       A        1         2      3.0      False        False                        False   True
1       B        1         2      3.0       True        False                        False   True
2       B        1         2      NaN       True         True                         True  False
3       B        1         2      NaN       True         True                         True  False
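
The question also mentions "blank" cells and asks to delete rows rather than select them. The sketch below covers both, under an assumption the sample data does not confirm: that a blank cell is stored as an empty or whitespace-only string. The frame `raw` and the names `col4`, `to_drop`, and `cleaned` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the example where one missing value is an empty string.
raw = pd.DataFrame({
    "column1": ["A", "B", "B", "B"],
    "column4": [3.0, 3.0, "", np.nan],
})

# Treat empty or whitespace-only strings as missing
# (an assumption about what "blank" means in the real data).
col4 = raw["column4"].replace(r"^\s*$", np.nan, regex=True)

# Rows to delete: column1 is "B" AND column4 is missing.
to_drop = raw["column1"].eq("B") & col4.isna()

# drop() needs the labels to remove and returns a new frame,
# so the result is assigned to a name.
cleaned = raw.drop(index=raw[to_drop].index)

print(cleaned)
```

Unlike the loops in the question, `drop` is called once with the labels of the matching rows, and its return value is kept.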