How to check if the time-series belongs to last year using pandas?

I am building an app where users can upload their time-series data as CSV. The upload should always contain last year’s data (in 2022 the time series should cover 2021, in 2023 it should cover 2022, and so on), so I need to check whether the data is from last year.

Is there a way I can do this check using pandas while reading the csv (I read the csv by doing pd.read_csv(my_file))?

Sample of time-series

                 dates   values
0  2021-01-01 01:00:00  371.428
1  2021-01-01 02:00:00  390.194
2  2021-01-01 03:00:00  349.924
3  2021-01-01 04:00:00  342.886
4  2021-01-01 05:00:00  331.157
.
.
.
.
8779  2021-12-31 20:00:00  515.307
8780  2021-12-31 21:00:00  432.811
8781  2021-12-31 22:00:00  421.082
8782  2021-12-31 23:00:00  394.886
8783  2022-01-01 00:00:00  373.773

The last row will always be from the current year, at 00:00.

>Solution :

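I don't think so; the values have to be read first. Read the CSV into a DataFrame with the dates column parsed as datetimes, then compare the years from Series.dt.year against Timestamp.year minus 1, and use Series.all to test whether every value matches. The last row is excluded with iloc[:-1] because it always belongs to the current year: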

import pandas as pd
df = pd.read_csv(my_file, parse_dates=['dates'])
# True only if every row except the last one is dated in the previous year
test = df['dates'].dt.year.iloc[:-1].eq(pd.Timestamp('now').year - 1).all()
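
If it helps, the same check can be wrapped in a small function that also verifies the final row is midnight on 1 January of the current year, as described in the question. This is only a sketch: the names is_last_year, date_col and now are made up for illustration, and the now argument exists only so the check can be tried against a fixed date.

```python
import io

import pandas as pd


def is_last_year(df, date_col="dates", now=None):
    """Hypothetical helper: True if all rows but the last fall in the previous
    year and the last row is 00:00 on 1 January of the current year."""
    # Assumption: 'now' may be overridden so the check can be run for a fixed date.
    now = pd.Timestamp.now() if now is None else pd.Timestamp(now)
    if len(df) < 2:
        return False
    dates = pd.to_datetime(df[date_col])
    # Every row except the last must be dated in the previous year.
    body_ok = dates.iloc[:-1].dt.year.eq(now.year - 1).all()
    # The last row must be exactly midnight on 1 January of the current year.
    last_ok = dates.iloc[-1] == pd.Timestamp(year=now.year, month=1, day=1)
    return bool(body_ok and last_ok)


# A tiny stand-in for the uploaded file, using rows from the sample above.
sample = io.StringIO(
    "dates,values\n"
    "2021-01-01 01:00:00,371.428\n"
    "2021-12-31 23:00:00,394.886\n"
    "2022-01-01 00:00:00,373.773\n"
)
df = pd.read_csv(sample, parse_dates=["dates"])
result = is_last_year(df, now="2022-06-15")
```

In the app you would call is_last_year(df) without now, so that the current date is used.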
