Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Name of first column that is non NaN ir Null

I’m looking how to add a column to df that holds the column name of the first column that is not NaN or Null. Example: column "first" based on columns ‘A’, ‘B’, ‘C’, ‘D’

Name    A     B   C   D   First

Alex    100  Nan Null NaN   A

Michael 300  400 Null Null  A

Steve   Null 100 100  100   B

John    Null Null Nan Null Null

>Solution :

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Convert Null, Nan values to misisng values and if necessary Name to index, then test DataFrame.notna and get first matched column by DataFrame.idxmax, last set NaN if all NaNs per row:

df = df.replace(['Null', 'Nan'], np.nan)
df = df.set_index('Name')

m = df.notna()
df['First'] = m.idxmax(axis=1).where(m.any(axis=1))
print (df)
           A    B    C    D First
Name                             
Alex     100  Nan  NaN  NaN     A
Michael  300  400  NaN  NaN     A
Steve    NaN  100  100  100     B
John     NaN  NaN  NaN  NaN   NaN

If need original values without replace:

df = df.set_index('Name')
m = df.isin(['Null', 'Nan']) | df.isna()

df['First'] = (~m).idxmax(axis=1).mask(m.all(axis=1), 'Null')
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading