From a pair of dates, I would like to create a list of dates at monthly frequency, including the months of both dates indicated.
import pandas as pd
import datetime
# Option 1
pd.date_range(datetime(2022, 1, 13),datetime(2022, 4, 5), freq='M', inclusive='both')
# Option 2
pd.date_range("2022-01-13", "2022-04-05", freq='M', inclusive='both')
both return the list: DatetimeIndex(['2022-01-31', '2022-02-28', '2022-03-31'], dtype='datetime64[ns]', freq='M')
. However, I am expecting the outcome with a list of dates (4 long) with one date for each month: [january, february, mars, april]
If now we run:
pd.date_range("2022-01-13", "2022-04-05", freq='M', inclusive='right')
we still obtain the same result as before. It looks like inclusive
has no effect on the outcome.
Pandas version. 1.5.3
>Solution :
using MonthEnd
and Day
offsets
This is because 2022-04-05 is before your month end (2022-04-30).
You can use:
pd.date_range("2022-01-13", pd.Timestamp("2022-04-05")+pd.offsets.MonthEnd(),
freq='M', inclusive='both')
A more robust variant to also handle the case in which the input date is already the month end:
pd.date_range("2022-01-13",
pd.Timestamp("2022-04-05")-pd.offsets.Day()+pd.offsets.MonthEnd(),
freq='M', inclusive='both')
Output:
DatetimeIndex(['2022-01-31', '2022-02-28', '2022-03-31', '2022-04-30'],
dtype='datetime64[ns]', freq='M')
alternative: using Period
pd.date_range(pd.Period('2022-01-13', 'M').to_timestamp(),
pd.Period('2022-04-30', 'M').to_timestamp(how='end'),
freq='M', inclusive='both')
Intermediates:
pd.Period('2022-01-13', 'M').to_timestamp()
# Timestamp('2022-01-01 00:00:00')
pd.Period('2022-04-30', 'M').to_timestamp(how='end')
# Timestamp('2022-04-30 23:59:59.999999999')
or as periods: period_range
pd.period_range('2022-01-13', '2022-04-30', freq='M')
Output:
PeriodIndex(['2022-01', '2022-02', '2022-03', '2022-04'], dtype='period[M]')