Suppose I have a DataFrame like this:
ID-A  ID-B  ID-C  Time
1     A     X     2022/01/01 09:00:00
1     A     X     2022/01/01 09:10:00
1     A     Y     2022/01/02 10:15:00
2     B     Y     2022/01/01 11:45:00
2     C     Y     2022/01/01 01:00:00
2     C     Y     2022/01/01 12:00:00
I want to group by columns ID-A and ID-B and count the ID-C values in each group. The Time column should be replaced by two new columns: a start time (the minimum Time per group) and an end time (the maximum Time per group).
Desired DataFrame:
ID-A  ID-B  Value  start_time           end_time
1     A     3      2022/01/01 09:00:00  2022/01/02 10:15:00
2     B     1      2022/01/01 11:45:00  2022/01/01 11:45:00
2     C     2      2022/01/01 01:00:00  2022/01/01 12:00:00
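Solution:
Use groupby with named aggregation, where each keyword becomes an output column built from a (source column, function) pair: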
(df.groupby(['ID-A', 'ID-B'], as_index=False)
   .agg(Value=('ID-C', 'size'),        # number of rows in each group
        start_time=('Time', 'min'),    # earliest Time in each group
        end_time=('Time', 'max'),      # latest Time in each group
        )
)
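For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and runs the same aggregation. It assumes the Time column is stored as text, which the question does not actually say, so it is parsed with pd.to_datetime first; the variable name out is just a placeholder for the result.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID-A": [1, 1, 1, 2, 2, 2],
    "ID-B": ["A", "A", "A", "B", "C", "C"],
    "ID-C": ["X", "X", "Y", "Y", "Y", "Y"],
    "Time": [
        "2022/01/01 09:00:00",
        "2022/01/01 09:10:00",
        "2022/01/02 10:15:00",
        "2022/01/01 11:45:00",
        "2022/01/01 01:00:00",
        "2022/01/01 12:00:00",
    ],
})

# Assumption: Time arrives as strings. Parsing it means min/max compare
# real timestamps instead of comparing text.
df["Time"] = pd.to_datetime(df["Time"], format="%Y/%m/%d %H:%M:%S")

# Same named aggregation as in the answer above.
out = (
    df.groupby(["ID-A", "ID-B"], as_index=False)
      .agg(
          Value=("ID-C", "size"),      # number of rows in each group
          start_time=("Time", "min"),  # earliest Time in each group
          end_time=("Time", "max"),    # latest Time in each group
      )
)
print(out)
```

Note that 'size' counts every row in the group, including rows where ID-C is missing. If missing values should be skipped, 'count' is probably the better choice, and 'nunique' would count distinct ID-C values instead.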