I’m sure there is a super simple answer to this…
I have a dataframe like the one in the table below.
The simulation is run 3 times for 4 time steps t. At each time step, a task (0, 1, or 2) is chosen.
I want to find the percentage of simulations in which task 1 is chosen at each time step t, i.e. the share of task 1 across the 3 simulations for every t. I'm sure it's some sort of simple groupby().mean(), but I can't seem to get it. Any help would be appreciated!
| t | simulation | chosen_task |
|---|---|---|
| 0 | 0 | 1 |
| 1 | 0 | 2 |
| 2 | 0 | 0 |
| 3 | 0 | 1 |
| 0 | 1 | 0 |
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 1 |
| 0 | 2 | 0 |
| 1 | 2 | 1 |
| 2 | 2 | 2 |
| 3 | 2 | 0 |
>Solution :
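You can use pd.crosstab to count how often each chosen_task occurs at each time step. Passing normalize='index' divides every row by its total, so each row holds the fraction of simulations that chose each task at that t. Selecting column 1 then gives the fraction for task 1: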

    import pandas as pd

    # Rows are time steps, columns are tasks; each row is normalized to sum to 1
    pd.crosstab(df['t'], df['chosen_task'], normalize='index')[1]

    t
    0    0.333333
    1    0.666667
    2    0.333333
    3    0.666667
    Name: 1, dtype: float64
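
Since the question asks about groupby().mean(), here is a minimal sketch of that route next to the crosstab one. It rebuilds the dataframe from the table in the question; the variable names (`share_crosstab`, `share_groupby`, `percent`) are only illustrative. The idea is that comparing `chosen_task` to 1 gives a boolean column, and the mean of booleans within each t is the fraction of simulations that chose task 1.

```python
import pandas as pd

# Data copied from the table in the question
df = pd.DataFrame({
    "t": [0, 1, 2, 3] * 3,
    "simulation": [0] * 4 + [1] * 4 + [2] * 4,
    "chosen_task": [1, 2, 0, 1, 0, 1, 1, 1, 0, 1, 2, 0],
})

# Approach from the answer: row-normalized contingency table, column for task 1
share_crosstab = pd.crosstab(df["t"], df["chosen_task"], normalize="index")[1]

# groupby alternative: True where task 1 was chosen, averaged per time step
share_groupby = df["chosen_task"].eq(1).groupby(df["t"]).mean()

# Multiply by 100 if a percentage is wanted instead of a fraction
percent = share_groupby * 100
print(percent.round(1))
```

Both approaches should give the same fractions for this data. One difference to be aware of: if task 1 never appears anywhere in the data, the crosstab has no column 1 and the `[1]` lookup raises a KeyError, while the groupby version simply returns 0 for every t.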