Get values from the current row to the last row of each group in pandas groupby

Imagine the following pandas DataFrame:

+----+------+-------+
| ID | Name | Value |
+----+------+-------+
| 1  | John | 1     |
+----+------+-------+
| 1  | John | 4     |
+----+------+-------+
| 1  | John | 10    |
+----+------+-------+
| 1  | John | 50    |
+----+------+-------+
| 1  | Adam | 6     |
+----+------+-------+
| 1  | Adam | 3     |
+----+------+-------+
| 2  | Jen  | 9     |
+----+------+-------+
| 2  | Jen  | 6     |
+----+------+-------+

I want to apply a groupby and create a new column that stores, for each row, a list of the Value entries from the current row through the last row of its group.

Like this:

+----+------+-------+----------------+
| ID | Name | Value | NewCol         |
+----+------+-------+----------------+
| 1  | John | 1     | [1, 4, 10, 50] |
+----+------+-------+----------------+
| 1  | John | 4     | [4, 10, 50]    |
+----+------+-------+----------------+
| 1  | John | 10    | [10, 50]       |
+----+------+-------+----------------+
| 1  | John | 50    | [50]           |
+----+------+-------+----------------+
| 1  | Adam | 6     | [6, 3]         |
+----+------+-------+----------------+
| 1  | Adam | 3     | [3]            |
+----+------+-------+----------------+
| 2  | Jen  | 9     | [9, 6]         |
+----+------+-------+----------------+
| 2  | Jen  | 6     | [6]            |
+----+------+-------+----------------+

Is this possible using the pandas groupby function?

Solution:

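Use GroupBy.transform with a custom lambda function that returns, for each position in the group, the values from that position to the end of the group: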

# For each position i in the group, collect the values from i to the end of the group
f = lambda x: [x.iloc[i:].tolist() for i in range(len(x))]
df['new'] = df.groupby(['Name', 'ID'])['Value'].transform(f)
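To try this approach end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same transform. The helper name `tail_lists` is made up for illustration and is not part of pandas; it does the same job as the lambda above.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 1, 2, 2],
    "Name": ["John", "John", "John", "John", "Adam", "Adam", "Jen", "Jen"],
    "Value": [1, 4, 10, 50, 6, 3, 9, 6],
})


def tail_lists(group):
    """Hypothetical helper: for each row, list the values from that row to the end of its group."""
    values = group.tolist()
    return [values[i:] for i in range(len(values))]


# transform keeps the original row order, so the lists line up with the source rows
df["new"] = df.groupby(["Name", "ID"])["Value"].transform(tail_lists)
print(df)
```

Because each group of n rows produces lists of length n, n-1, ..., 1, memory use grows quadratically with group size, so this is probably best suited to small or moderately sized groups.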

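Or reverse the rows first and iterate over expanding windows, flipping each window back into the original order: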

f = lambda x: [y[::-1].tolist() for y in x.expanding()]
df['new'] = df.iloc[::-1].groupby(['Name', 'ID'])['Value'].transform(f)
print(df)
   ID  Name  Value             new
0   1  John      1  [1, 4, 10, 50]
1   1  John      4     [4, 10, 50]
2   1  John     10        [10, 50]
3   1  John     50            [50]
4   1  Adam      6          [6, 3]
5   1  Adam      3             [3]
6   2   Jen      9          [9, 6]
7   2   Jen      6             [6]
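To see why the variant based on `expanding()` gives the same lists, it may help to look at a single group in isolation. The sketch below assumes a pandas version in which `Expanding` objects can be iterated (1.1 or later, as far as I know) and uses John's values from the question; the variable names are illustrative only.

```python
import pandas as pd

# John's values from the question, as a stand-alone Series
s = pd.Series([1, 4, 10, 50])

# Iterating over expanding() on the reversed Series yields windows that grow
# backwards from the last value
windows = [w.tolist() for w in s.iloc[::-1].expanding()]

# Flipping each window restores the original order inside each list
tails = [w[::-1] for w in windows]

print(windows)  # [[50], [50, 10], [50, 10, 4], [50, 10, 4, 1]]
print(tails)    # [[50], [10, 50], [4, 10, 50], [1, 4, 10, 50]]
```

The lists come out starting from the last row, but in the answer the result of `transform` carries the reversed index, so assigning it to `df['new']` aligns each list with its original row.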