I have a data frame with model and date columns, like below:
import pandas as pd
df = pd.DataFrame({'model':['A','B','C','D', 'E','F','G','I','J','K'],
                   'date':['2022-10-28 12:10:28 AM','2022-12-07 12:12:07 AM','2022-12-07 12:12:07 AM','2022-12-07 12:12:07 AM',
                           '2022-12-08 12:12:08 AM','2022-12-10 12:12:10 AM','2023-02-22 12:02:22 AM','2023-02-22 12:02:22 AM',
                           '2023-02-24 12:02:24 AM','2023-03-04 12:03:04 AM']})
I want to separate the 1st through the 15th of each month from the 16th through the end of the month (the 30th or 31st),
and put a number for each half-month group in a class column.
Is it possible?
>Solution :
You can use pd.cut:
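# Convert the date strings to datetimes first (the steps below need real dates)
df['date'] = pd.to_datetime(df['date'])
# Find begin and end dates that enclose your dates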
start = df['date'].min().date() - pd.offsets.MonthBegin(1)
# Roll the end forward to the next month start so the last half-month gets a closing bin edge
end = df['date'].max().date() + pd.offsets.MonthBegin(1)
# Create the range and bin values
bins = pd.date_range(start, end, freq='MS')
bins = sorted(bins.tolist() + list(bins + pd.DateOffset(days=15)))
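# right=False makes each bin [start, end), so midnight on the 16th falls in the second half
df['class'] = pd.factorize(pd.cut(df['date'], bins=bins, labels=False, right=False))[0]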
print(df)
# Output
  model                date  class
0     A 2022-10-28 00:10:28      0
1     B 2022-12-07 00:12:07      1
2     C 2022-12-07 00:12:07      1
3     D 2022-12-07 00:12:07      1
4     E 2022-12-08 00:12:08      1
5     F 2022-12-10 00:12:10      1
6     G 2023-02-22 00:02:22      2
7     I 2023-02-22 00:02:22      2
8     J 2023-02-24 00:02:24      2
9     K 2023-03-04 00:03:04      3
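
Note that pd.factorize numbers the groups in order of first appearance, so the result above relies on the rows already being sorted by date. As an alternative sketch (not the method used in the answer above), the half-month slot can be computed arithmetically from the year, month and day, then dense-ranked so the numbering follows the calendar regardless of row order. The sample rows and the half_month name are made up for illustration.

```python
import pandas as pd

# Small illustrative frame (hypothetical rows, already parsed as datetimes)
df = pd.DataFrame({
    'model': ['A', 'B', 'C', 'D', 'E'],
    'date': pd.to_datetime(['2022-10-28 00:10:28', '2022-12-07 00:12:07',
                            '2022-12-10 00:12:10', '2022-12-16 00:00:00',
                            '2023-03-31 23:59:59']),
})

d = df['date'].dt
# Two slots per month: +0 for days 1-15, +1 for day 16 onward
half_month = (d.year * 12 + d.month) * 2 + (d.day > 15).astype(int)
# Dense rank numbers the occupied slots 0, 1, 2, ... in chronological order
df['class'] = half_month.rank(method='dense').astype(int) - 1
print(df)
```

Here the two dates in the first half of December share a class, while midnight on December 16 starts a new one. No bin edges are needed, so a date late in the final month cannot fall outside the bins.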
