Home Drop rows if some columns have some value

Questions

Drop rows if some columns have some value

August 22, 2022

I have this df:

import pandas as pd
import numpy as np
d = {'name': ['bob', 'jake','Sem'], 'F1': [3, 4, np.nan], 'F2': [14, 40, 7], 'F3': 
[np.nan, 1, 55]}
df = pd.DataFrame(data=d)
print (df)

out>>>
   name   F1  F2    F3
0   bob  3.0  14   NaN
1  jake  4.0  40   1.0
2   Sem  NaN   7   NaN

I would like to delete all the rows that under at least 2 columns (between F1 F2 and F3) are NaN.
Like:

   name   F1  F2    F3
0   bob  3.0  14   NaN
1  jake  4.0  40   1.0

This is just an example, but I may have up to many columns (up to F100) and I may want to delete with other values instead of 2 out of 3 columns.
What is the best way to do this?

>Solution :

You can use the subset and thresh parameters of dropna:

df.dropna(subset=['F1', 'F2', 'F3'], thresh=2)

Example:

import pandas as pd
import numpy as np
d = {'name': ['bob', 'jake', 'Sem', 'Mario'],
     'F1': [3, 4, np.nan, np.nan],
     'F2': [14, 40, 7, 42],
     'F3': [np.nan, 1, 55, np.nan]}
df = pd.DataFrame(data=d)
print(df)

    name   F1  F2    F3
0    bob  3.0  14   NaN
1   jake  4.0  40   1.0
2    Sem  NaN   7  55.0
3  Mario  NaN  42   NaN

df2 = df.dropna(subset=['F1', 'F2', 'F3'], thresh=2)
print(df2)

   name   F1  F2    F3
0   bob  3.0  14   NaN
1  jake  4.0  40   1.0
2   Sem  NaN   7  55.0

Selecting the columns automatically:

cols = list(df.filter(regex=r'F\d+'))
df.dropna(subset=cols, thresh=2)

alternative without `dropna`

Using boolean indexing:

m = df.filter(regex=r'F\d+').isna().sum(1)

df2 = df[m.lt(2)]

values above threshold

drop if at least 2 values ≥ 4:

m = df.filter(regex=r'F\d+').gt(4).sum(1)

df2 = df[m.lt(2)]

dataframe

byMR

Published August 22, 2022

Add a comment

Django QuerySet filter by Range with DateTimeField

byMR

August 22, 2022

Questions

How to simplify when expression in kotlin

byMR

August 22, 2022

Questions

Is it possible to create a default implementation for 2 protocol implementations?

byMR

August 22, 2022

Questions

How can I use the lag function to skip over rows and provide number of previous occurrences

byMR

August 22, 2022

Questions

React text input inside modal not working

byMR

August 22, 2022

Questions

How do I remove undeline of TabRow in Jetpack Compose?

byMR

August 22, 2022

Drop rows if some columns have some value

MEDevel.com: Open-source for Healthcare and Education

>Solution :

alternative without `dropna`

values above threshold

Like this:

Leave a ReplyCancel reply

Read more

Django QuerySet filter by Range with DateTimeField

How to simplify when expression in kotlin

Is it possible to create a default implementation for 2 protocol implementations?

How can I use the lag function to skip over rows and provide number of previous occurrences

React text input inside modal not working

How do I remove undeline of TabRow in Jetpack Compose?

Keep Up to Date with the Most Important News

Drop rows if some columns have some value

MEDevel.com: Open-source for Healthcare and Education

>Solution :

alternative without dropna

values above threshold

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

Django QuerySet filter by Range with DateTimeField

How to simplify when expression in kotlin

Is it possible to create a default implementation for 2 protocol implementations?

How can I use the lag function to skip over rows and provide number of previous occurrences

React text input inside modal not working

How do I remove undeline of TabRow in Jetpack Compose?

Discover more from Dev solutions

alternative without `dropna`