np.where() conversion failing

I’m trying to map an integer which represents Year-Month (‘Period’ column in code below) into a new column that represents the Year-Quarter (‘DIST_PERIOD’ column).

For instance,

202101, 202102, and 202103 become ‘20211’

202104, 202105, and 202106 become ‘20212’

etc.

My code below is running but not how I thought it would.

df['DIST_PERIOD'] = np.where((str(df['Period'])[4:] == '01') | (str(df['Period'])[4:] == '02') | (str(df['Period'])[4:] == '03'), str(df['Period'])[:4]+'1', df['DIST_PERIOD'])
df['DIST_PERIOD'] = np.where((str(df['Period'])[4:] == '04') | (str(df['Period'])[4:] == '05') | (str(df['Period'])[4:] == '06'), str(df['Period'])[:4]+'2', df['DIST_PERIOD'])
df['DIST_PERIOD'] = np.where((str(df['Period'])[4:] == '07') | (str(df['Period'])[4:] == '08') | (str(df['Period'])[4:] == '09'), str(df['Period'])[:4]+'3', df['DIST_PERIOD'])
df['DIST_PERIOD'] = np.where((str(df['Period'])[4:] == '10') | (str(df['Period'])[4:] == '11') | (str(df['Period'])[4:] == '12'), str(df['Period'])[:4]+'4', df['DIST_PERIOD'])

Not sure how to correct my str() so that I am correctly capturing the last two characters for each row.

>Solution :

A better way is to convert the column to datetime and then access and combine the datetime properties year and quarter.

month_year = pd.to_datetime(df['DIST_PERIOD'], format="%Y%m")
df['DIST_PERIOD'] = month_year.dt.year.astype(str) + month_year.dt.quarter.astype(str)

Leave a Reply