I have a pandas DataFrame with a column of dates stored as integers. Some of the values have a 13th or 14th month. I would like to replace these 13th and 14th months with the 12th month, and then convert the column to datetime format.
Original_date
20190101
20191301
20191401
New_date
20190101
20191201
20191201
I tried converting the column to strings and then replacing the month based on its position in the string, [4:6], but it didn’t work:
df.original_date.astype(str)
for string in df['original_date']:
    if string[4:6]=="13" or string[4:6]=="14":
        string.replace(string, string[:4]+ "12" + string[6:])
print(df['original_date'])
>Solution :
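The loop has no effect because astype(str) and string.replace both return new objects instead of modifying the column in place, and the results are never assigned back. You can use .str.replace with a regex instead:
df['New_date'] = df['Original_date'].astype(str).str.replace(r'(\d{4})(13|14)(\d{2})', r'\g<1>12\3', regex=True)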
print(df)
   Original_date  New_date
0       20190101  20190101
1       20191301  20191201
2       20191401  20191201
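The question also asks about converting the result to datetime. A minimal sketch of that last step, assuming every value is an 8-digit YYYYMMDD integer and that the day part is always valid for December (the sample DataFrame below is rebuilt from the question's data, and the intermediate name fixed is only illustrative):

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({"Original_date": [20190101, 20191301, 20191401]})

# Replace month 13 or 14 with 12, keeping the year (group 1) and day (group 3)
fixed = df["Original_date"].astype(str).str.replace(
    r"^(\d{4})(13|14)(\d{2})$", r"\g<1>12\3", regex=True
)

# Parse the cleaned 8-digit strings as dates
df["New_date"] = pd.to_datetime(fixed, format="%Y%m%d")
print(df)
```

If some values might still be unparseable after the replacement, passing errors="coerce" to pd.to_datetime turns them into NaT instead of raising.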