I have a dataframe that has NaN or empty cells in a specific column, for example the column at positional index 2. Unfortunately, I don't have the column name to pass as subset; I only have the index. I want to delete the rows that have NaN or an empty cell in that column. On Stack Overflow there are many solutions, but they all use subset.
For example, this is the dataframe:
12 125 36 45 665
15 212 12 65 62
65 9 nan 98 84
21 54 78 5 654
211 65 58 26 65
…
Expected output:
12 125 36 45 665
15 212 12 65 62
21 54 78 5 654
211 65 58 26 65
>Solution :
To test the third column (positional index 2), select it with iloc and filter with boolean indexing. This works whether nan is the missing value np.nan or the string 'nan':
idx = 2
df1 = df[df.iloc[:, idx].notna() & df.iloc[:, idx].ne('nan')]
# if a missing entry may be an empty string, the string 'nan', or a real missing value (NaN/None)
#df1 = df[df.iloc[:, idx].notna() & ~df.iloc[:, idx].isin(['nan',''])]
print (df1)
0 1 2 3 4
0 12 125 36.0 45 665
1 15 212 12.0 65 62
3 21 54 78.0 5 654
4 211 65 58.0 26 65
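To try the stricter mask on data that actually contains placeholder strings, here is a minimal self-contained sketch. The frame is a hypothetical variant of the question's data (rows 3 and 4 are altered so that column 2 holds the string 'nan' and an empty string); it is not the asker's real data.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the question's data: column 2 holds a real NaN,
# the string 'nan', and an empty string.
df = pd.DataFrame([
    [12, 125, 36, 45, 665],
    [15, 212, 12, 65, 62],
    [65, 9, np.nan, 98, 84],
    [21, 54, 'nan', 5, 654],
    [211, 65, '', 26, 65],
])

idx = 2                  # positional index of the column to test
col = df.iloc[:, idx]    # select the column by position, not by name

# Keep rows whose value is not missing and is not a placeholder string.
mask = col.notna() & ~col.isin(['nan', ''])
df1 = df[mask]
print(df1)
```

Only the rows with a usable value in the third column should remain. Building the mask once from col avoids repeating df.iloc[:, idx].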
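If the nans are real missing values (NaN/None), dropna is enough; df.columns[[idx]] looks up the column label from its position so it can be passed as subset: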
df1 = df.dropna(subset=df.columns[[idx]])
print (df1)
0 1 2 3 4
0 12 125 36.0 45 665
1 15 212 12.0 65 62
3 21 54 78.0 5 654
4 211 65 58.0 26 65
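The same idea as a runnable sketch, assuming the nans are real missing values. The string column labels 'a' to 'e' are hypothetical, added only to show that the position (2) is translated into a label ('c') before dropna is called.

```python
import numpy as np
import pandas as pd

# The question's data, with hypothetical string labels so that the column
# position and the column label are clearly different things.
df = pd.DataFrame(
    [
        [12, 125, 36, 45, 665],
        [15, 212, 12, 65, 62],
        [65, 9, np.nan, 98, 84],
        [21, 54, 78, 5, 654],
        [211, 65, 58, 26, 65],
    ],
    columns=list('abcde'),
)

idx = 2
label = df.columns[idx]            # label of the column at position 2
df1 = df.dropna(subset=[label])    # drop rows where that column is NaN/None
print(df1)
```

This only removes real missing values; the strings 'nan' and '' are ordinary values to dropna, so for those the boolean mask is the safer choice.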