Iterate through dataframe and create new column based if values in columns are not NaN (Pandas)

October 31, 2023

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'subsegment': ['corp', np.nan, 'terr'],
    'region': ['japan', np.nan, np.nan],
    'subregion': [np.nan, 'se', 'ne'],
    'segment': [np.nan, 'ent', 'comm']
})

I am trying to iterate through the DataFrame above and, for each row, build a new column "mode" from the headers of the columns whose values are not NaN. The headers are joined with "-", so the result depends on how many NaNs the row contains.

Original DF

subsegment	region	subregion	segment
corp	japan	NaN	NaN
NaN	NaN	se	ent
terr	NaN	ne	comm

Desired Output DF

subsegment	region	subregion	segment	mode
corp	japan	NaN	NaN	subsegment-region
NaN	NaN	se	ent	subregion-segment
terr	NaN	ne	comm	subsegment-subregion-segment

I have tried creating a separate, smaller DataFrame for each combination of non-null columns and then concatenating them, but this seems extremely inefficient.

df1 = df.loc[df['subsegment'].notna() & df['region'].notna()]
df2 = df.loc[df['region'].notna() & df['subregion'].notna() & df['segment'].notna()]
df3 = df.loc[df['subsegment'].notna() & df['subregion'].notna() & df['segment'].notna()]

pd.concat([df1, df2, df3])  # ..... and so on for the remaining combinations

Solution:

You can use the code below. It uses apply with a custom function that finds the columns that are not null in each row and joins their names with "-".

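from itertools import compress

def temp_func(x, cols):
    # True where the value in this row is not NaN
    list_a = x.notna().to_list()
    # Keep only the column names whose flag is True
    ret = list(compress(cols, list_a))
    return '-'.join(ret)

# Capture the column names before 'mode' is added
cols = list(df.columns)
df['mode'] = df.apply(lambda x: temp_func(x, cols), axis=1)
print(df)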

Output:

  subsegment region subregion segment                          mode
0       corp  japan       NaN     NaN             subsegment-region
1        NaN    NaN        se     ent             subregion-segment
2       terr    NaN        ne    comm  subsegment-subregion-segment
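
If the DataFrame is large, the row-wise apply above can be slow. As a rough sketch of an alternative (not part of the original answer), the not-null mask can be computed once for the whole frame and then used to pick the column names for each row. The names `value_cols` and `mask` are made up for this illustration, and the sketch assumes every column present when it runs should be considered.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'subsegment': ['corp', np.nan, 'terr'],
    'region': ['japan', np.nan, np.nan],
    'subregion': [np.nan, 'se', 'ne'],
    'segment': [np.nan, 'ent', 'comm']
})

# Column names as an array, captured before 'mode' is added (hypothetical name)
value_cols = df.columns.to_numpy()

# 2D boolean array: True wherever a cell is not NaN (hypothetical name)
mask = df.notna().to_numpy()

# For each row, select the names where the mask is True and join them with '-'
df['mode'] = ['-'.join(value_cols[row]) for row in mask]

print(df)
```

A row in which every value is NaN would get an empty string here, which is the same result the apply version gives.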