Create a pandas DataFrame column that joins the column names of non-NA values

How do I create a new column that joins, for each row, the names of the columns holding non-NA values?

  • Please note the duplicate index.

Code

import numpy as np
import pandas as pd

so_df = pd.DataFrame({"ma_1":[10,np.nan,13,15],
                      "ma_2":[10,11,np.nan,15],
                      "ma_3":[np.nan,11,np.nan,15]},index=[0,1,1,2])

Example DF

   ma_1     ma_2    ma_3
0   10.0    10.0    NaN
1   NaN     11.0    11.0
1   13.0    NaN     NaN
2   15.0    15.0    15.0

The desired output is a new column that joins the column names of the non-NA values in each row, as in the col_names example below.

so_df["col_names"] = ["ma_1, ma_2","ma_2, ma_3","ma_1","ma_1, ma_2, ma_3"]


    ma_1    ma_2    ma_3    col_names
0   10.0    10.0    NaN     ma_1, ma_2
1   NaN     11.0    11.0    ma_2, ma_3
1   13.0    NaN     NaN     ma_1
2   15.0    15.0    15.0    ma_1, ma_2, ma_3

Solution:

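Try with dot. Here df is the so_df from the question: df.notna() marks the non-NA cells, the dot product with the comma-suffixed column names concatenates the names of the marked columns in each row, and .str[:-1] drops the trailing comma.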

df['new'] = df.notna().dot(df.columns+',').str[:-1]
df
Out[77]: 
   ma_1  ma_2  ma_3             new
0  10.0  10.0   NaN       ma_1,ma_2
1   NaN  11.0  11.0       ma_2,ma_3
1  13.0   NaN   NaN            ma_1
2  15.0  15.0  15.0  ma_1,ma_2,ma_3
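
The output above joins the names with "," while the question asks for ", ". A minimal sketch of that adjustment, assuming the so_df frame from the question: the separator becomes ", " and the slice drops two trailing characters instead of one. The apply/join variant and the names mask and joined are illustrative additions, not part of the original answer; they are there only as a cross-check that tolerates the duplicate index.

```python
import numpy as np
import pandas as pd

# Frame from the question, including the duplicate index label 1.
so_df = pd.DataFrame(
    {"ma_1": [10, np.nan, 13, 15],
     "ma_2": [10, 11, np.nan, 15],
     "ma_3": [np.nan, 11, np.nan, 15]},
    index=[0, 1, 1, 2],
)

# Boolean mask: True where a value is present.
mask = so_df.notna()

# True * "name, " yields the name, False * "name, " yields "",
# so the dot product concatenates the names of the non-NA columns.
# The slice removes the trailing ", " (two characters).
so_df["col_names"] = mask.dot(mask.columns + ", ").str[:-2]

# Illustrative cross-check: a slower row-wise join over the same mask.
joined = mask.apply(lambda row: ", ".join(row.index[row.to_numpy()]), axis=1)

print(so_df)
```

The dot version is vectorised and is usually the better fit for large frames; the row-wise join is easier to read and to adapt, for example to a different separator. Both operate positionally on the rows, so the repeated index label does not cause misalignment here.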