A column in my DataFrame holds year-month values. I want to add a number of months to one of them to get a 'future' month, then compare that 'future' month with the current (calendar) month.
import pandas as pd
from io import StringIO
import numpy as np
from datetime import datetime
csvfile = StringIO(
"""Name Year - Month Score
Mike 2022-11 31
Mike 2022-09 136
""")
df = pd.read_csv(csvfile, sep = '\t', engine='python')
d_name_plus_month = {"Mike":2}
month_of_first_row = pd.to_datetime(df.iloc[[0]]['Year - Month']).values.astype("datetime64[M]")
plus_months = d_name_plus_month['Mike']
scheduled_month = month_of_first_row + int(plus_months)
# scheduled_month_in_string = scheduled_month.astype('str')
current_month = datetime.now().strftime("%Y") +'-' +datetime.now().strftime("%m") # it's string
current_month = np.array(current_month)
print (scheduled_month <= current_month)
# month_of_first_row: 2022-11
# scheduled_month: 2023-01
# current_month: 2023-02
# so "scheduled_month" is earlier than "current_month".
But it raises an error:
TypeError: '<=' not supported between instances of 'numpy.ndarray' and 'numpy.ndarray'
I've tried altering the lines to turn both values into strings for the comparison, but without success.
How can I correct the lines? Thank you.
>Solution :
I suggest using monthly periods, via Series.dt.to_period (or Timestamp.to_period for a single value), which make it easy to add or subtract months as integers:
d_name_plus_month = {"Mike":2}
month_of_first_row = pd.to_datetime(df['Year - Month'].iat[0]).to_period('m')
print (month_of_first_row)
2022-11
plus_months = d_name_plus_month['Mike']
scheduled_month = month_of_first_row + int(plus_months)
current_month = pd.Timestamp.now().to_period('m')
print (current_month)
2023-02
print (scheduled_month <= current_month)
True
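The same idea can be applied to the whole column at once with Series.dt.to_period, rather than to the first row only. The following is a minimal sketch, assuming the sample data from the question is rebuilt by hand as a DataFrame and that the same offset is added to every row; the names `months`, `scheduled` and `is_due` are only illustrative:

```python
import pandas as pd

# Sample data from the question, rebuilt directly as a DataFrame (assumption).
df = pd.DataFrame({
    "Name": ["Mike", "Mike"],
    "Year - Month": ["2022-11", "2022-09"],
    "Score": [31, 136],
})
d_name_plus_month = {"Mike": 2}

# Parse the strings and convert the whole column to monthly periods.
months = pd.to_datetime(df["Year - Month"], format="%Y-%m").dt.to_period("M")

# Adding an integer to a monthly period shifts it by that many months.
plus_months = int(d_name_plus_month["Mike"])
scheduled = months + plus_months

# The current calendar month as a period, so both sides have the same type.
current_month = pd.Timestamp.now().to_period("M")

# Element-wise comparison gives one boolean per row.
is_due = scheduled <= current_month
print(scheduled.astype(str).tolist())
print(is_due.tolist())
```

Because both sides of the comparison are monthly periods, no conversion to strings or NumPy arrays is needed.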