I have a stupid problem on pandas, and today I don’t find solution, I feel it’s simple but a little bit complicated to explain…
Quickly, I have this dataframe :
| id_client | name_client | purchases | |
|---|---|---|---|
| 0 | 26 | John | Alpha |
| 1 | 26 | John | Beta |
| 2 | 26 | John | Omega |
| 3 | 45 | Michael | Alpha |
| 4 | 45 | Michael | Beta |
| 5 | 45 | Michael | Omega |
| 6 | 78 | Charlie | Alpha |
| 7 | 78 | Charlie | Beta |
code to reproduce :
df = pd.DataFrame({'id_client':[26,26,26,45,45,45,78,78], 'name_client':["John","John","John","Michael","Michael","Michael","Charlie","Charlie"], 'purchases':["Alpha","Beta","Omega","Alpha","Beta","Omega","Alpha","Beta"]})
And I want this result :
| id_client | name_client | options | |
|---|---|---|---|
| 0 | 26 | John | Alpha, Beta, Omega |
| 1 | 45 | Michael | Alpha, Beta, Omega |
| 2 | 78 | Charlie | Alpha, Beta |
I try :
df.groupby(by=['id_client']).transform(lambda x : ', '.join(x))
But I Have this bad result :
| name_client | purchases | |
|---|---|---|
| 0 | John, John, John | Alpha, Beta, Omega |
| 1 | John, John, John | Alpha, Beta, Omega |
| 2 | John, John, John | Alpha, Beta, Omega |
| 3 | Michael, Michael, Michael | Alpha, Beta, Omega |
| 4 | Michael, Michael, Michael | Alpha, Beta, Omega |
| 5 | Michael, Michael, Michael | Alpha, Beta, Omega |
| 6 | Charlie, Charlie | Alpha, Beta |
| 7 | Charlie, Charlie | Alpha, Beta |
I tried some differents options with groupby() but I don’t find the right solution
Thanks you.
>Solution :
Try:
>>> df.groupby(["id_client","name_client"], as_index=False).agg(", ".join)
id_client name_client purchases
0 26 John Alpha, Beta, Omega
1 45 Michael Alpha, Beta, Omega
2 78 Charlie Alpha, Beta