I have been presented with a very small dataset that has the date of each time a user logs into a system, I have to use this data set to create a table where I can show for each log-in the cumulative monthly counts of logs and the overall cumulative counts of logs, this is the data set I have:
| date | user |
|---|---|
| 1/01/2022 | Mark |
| 2/01/2022 | Mark |
| 3/02/2022 | Mark |
| 4/02/2022 | Mark |
| 5/03/2022 | Mark |
| 6/03/2022 | Mark |
| 7/03/2022 | Mark |
| 8/03/2022 | Mark |
| 9/03/2022 | Mark |
and this is my desired output:
| row | date | user | monthly_track | acum_track |
|---|---|---|---|---|
| 1 | 1/01/2022 | Mark | 1 | 1 |
| 2 | 2/01/2022 | Mark | 2 | 2 |
| 3 | 3/02/2022 | Mark | 1 | 3 |
| 4 | 4/02/2022 | Mark | 2 | 4 |
| 5 | 5/03/2022 | Mark | 1 | 5 |
| 6 | 6/03/2022 | Mark | 2 | 6 |
| 7 | 7/03/2022 | Mark | 3 | 7 |
| 8 | 8/03/2022 | Mark | 4 | 8 |
| 9 | 9/03/2022 | Mark | 5 | 9 |
Why? Let’s look at the row number 5. This is the first time the user Mark has logged into the system during the month 3 (March) but it is the 5th overall login in the data set (for the purpose of learning there will only be one year (2022).
I have no idea as to how to get the monthly and overall count together. I can groupby user and sort by date to count how many times in total a user has logged in, but I know that in order to achive my desired output I will have to group by date and user and then make counts based on month but I will have to somehow group the data by user (only) to get the overall count and I dont think I could group twice the data.
>Solution :
First you need to convert date to actual datetime values with to_datetime. The rest is simple with groupby and cumcount:
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')
df['monthly_count'] = df.groupby([df['user'], df['date'].dt.year, df['date'].dt.month]).cumcount() + 1
df['acum_count'] = df.groupby('user').cumcount() + 1
Output:
>>> df
date user monthly_count acum_count
0 2022-01-01 Mark 1 1
1 2022-01-02 Mark 2 2
2 2022-02-03 Mark 1 3
3 2022-02-04 Mark 2 4
4 2022-03-05 Mark 1 5
5 2022-03-06 Mark 2 6
6 2022-03-07 Mark 3 7
7 2022-03-08 Mark 4 8
8 2022-03-09 Mark 5 9