I have a DataFrame like this:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': ['A','A','A','B','B','C','C','C','C'],
    'groupId': [11,35,46,11,26,25,39,50,55],
    'type': [1,1,1,1,1,2,2,2,2],
})
I want to turn each (id, type) group into a NumPy array that also contains the group's type value, and collect those arrays in a list. I tried:
df.groupby(['id','type'])['groupId'].apply(np.array).tolist()
This almost works, but I also want the type value at the very beginning of each array. The output I want is:
[
np.array([1,11,35,46]),
np.array([1,11,26]),
np.array([2,25,39,50,55])
]
I feel this should be easy, but I am stuck.
>Solution :
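Inside apply, each group's key is available as x.name. Because the grouping is on ['id','type'], x.name is a tuple (id, type), so x.name[1] is the type value. Put it first and unpack the group's values after it when building the array: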
a = df.groupby(['id','type'])['groupId'].apply(lambda x: np.array([x.name[1], *x])).tolist()
print(a)
[array([ 1, 11, 35, 46], dtype=int64),
array([ 1, 11, 26], dtype=int64),
array([ 2, 25, 39, 50, 55], dtype=int64)]
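If the lambda feels opaque, the same result can be sketched as an explicit loop over the groupby object. This is only an illustrative alternative, not a replacement for the answer above; the names grouped, keys, arrays, type_value and values are made up for this example. Iterating over the grouped Series yields (key, values) pairs, and the key is the same tuple that x.name holds inside apply:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'groupId': [11, 35, 46, 11, 26, 25, 39, 50, 55],
    'type': [1, 1, 1, 1, 1, 2, 2, 2, 2],
})

# Group on both columns and keep only the groupId values of each group.
grouped = df.groupby(['id', 'type'])['groupId']

# The group keys, in iteration order; these are the (id, type) tuples
# that x.name refers to inside apply.
keys = [key for key, _ in grouped]

# Prepend the type value (second element of the key) to each group's values.
arrays = [
    np.array([type_value, *values])
    for (_, type_value), values in grouped
]

print(keys)
print(arrays)
```

The loop avoids apply entirely, which can be easier to debug because each key and its values can be inspected directly.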