Suppose you have the following data frame:
ID Q1 Q2 Q3
0 1 0 1
1 0 0 1
2 1 1 1
3 0 1 1
I would like to return an array with column numbers wherever there is a 1 and add it as another column like this:
ID Array Q1 Q2 Q3
0 [0,2] 1 0 1
1 [2] 0 0 1
2 [0,1,2] 1 1 1
3 [1,2] 0 1 1
Thanks
>Solution :
I would use numpy.where:
a, b = np.where(df.filter(like='ID')==1)
# or
a, b = np.where(df.drop(columns='ID')==1)
df['Array'] = pd.Series(b).groupby(a).agg(list).set_axis(df.index)
Output:
ID Q1 Q2 Q3 Array
0 0 1 0 1 [0, 2]
1 1 0 0 1 [2]
2 2 1 1 1 [0, 1, 2]
3 3 0 1 1 [1, 2]
Pure pandas variant:
df2 = df.filter(like='Q')
df['Array'] = (df2.set_axis(range(df2.shape[1]), axis=1).stack()
.loc[lambda s: s==1].reset_index()
.groupby('level_0')['level_1'].agg(list)
)