I have a dataframe which has the words Due Date written differently but it all means the same. The problem is in my master data(xls file), one due date has an extra space or doesnt and i cant change that.All i can change is my final output.
Sr no Due Date Due Date DueDate
1 1/2/22
2 1/5/22
3
4
5 ASAP
I just want that column 2 and 3 all combine under column one at the same location they were
Sr No. Due Date
1 1/2/22
2 1/5/22
3
4
5 ASAP
>Solution :
You can use filter with a regex to get similar names, then bfill and get the first. Finally join to original devoid of the found columns:
d = df.filter(regex=r'(?i)due\s*date')
df2 = (df
.drop(columns=list(d.columns))
.join(d.bfill(1).iloc[:,0])
)
Output:
Sr no Due Date
0 1 1/2/22
1 2 1/5/22
2 3 None
3 4 None
4 5 ASAP