I have the following ordered Dataframe
Index ID Amount
1 A 10
2 A 15
3 A 17
4 A 12
5 A 10
6 B 20
7 B 15
...
What I want is to add a column indicating the median of all the precedent Amounts for the same IDs in the dataframe
The result must be the following
Index ID Amount (PastElements) MedianOfPastElements
1 A 10 ()
2 A 15 (10) 10
3 A 17 (10;15) 12.5
4 A 12 (10;15;17) 15
5 A 10 (10;12;15;17) 13.5
6 B 20 ()
7 B 15 (20) 20
...
I don’t have to keep the PastElements column in my result, I just added it to clarify my problem.
Does someone see any way to do so ? Thanks in advance
>Solution :
Use groupby.transform with shift and expanding.median:
df['MedianOfPastElements'] = (df.groupby('ID')['Amount']
.transform(lambda s: s.shift().expanding().median())
)
Output:
Index ID Amount MedianOfPastElements
0 1 A 10 NaN
1 2 A 15 10.0
2 3 A 17 12.5
3 4 A 12 15.0
4 5 A 10 13.5
5 6 B 20 NaN
6 7 B 15 20.0