I’m looking how to add a column to df that holds the column name of the first column that is not NaN or Null. Example: column "first" based on columns ‘A’, ‘B’, ‘C’, ‘D’
Name A B C D First
Alex 100 Nan Null NaN A
Michael 300 400 Null Null A
Steve Null 100 100 100 B
John Null Null Nan Null Null
>Solution :
Convert Null, Nan values to misisng values and if necessary Name to index, then test DataFrame.notna and get first matched column by DataFrame.idxmax, last set NaN if all NaNs per row:
df = df.replace(['Null', 'Nan'], np.nan)
df = df.set_index('Name')
m = df.notna()
df['First'] = m.idxmax(axis=1).where(m.any(axis=1))
print (df)
A B C D First
Name
Alex 100 Nan NaN NaN A
Michael 300 400 NaN NaN A
Steve NaN 100 100 100 B
John NaN NaN NaN NaN NaN
If need original values without replace:
df = df.set_index('Name')
m = df.isin(['Null', 'Nan']) | df.isna()
df['First'] = (~m).idxmax(axis=1).mask(m.all(axis=1), 'Null')