I have a dataframe with date of births as
pl.DataFrame({'idx':[1,2,3,4,5,6],
'date_of_birth':['03/06/1990','3/06/1990','11/12/2000','01/02/2021','1/02/2021','3/06/1990']})
Here I would like to compare date of birth(Format: Month/Day/Year) of each row and tag yes if the months are equal such as 03 – 3, 01 -1.
There are dates as 03/06/1900, 3/06/1990, they are generally same. but here they are treated as different. How to figure out these kind scenarios ?
The expected output as:
>Solution :
Let’s convert date_of_birth to datetime then check if the month is satisfied
import numpy as np
import pandas as pd
import polars as pl
s = np.where(pd.to_datetime(df['date_of_birth']).month.isin([3, 1]), 'Yes', 'No')
out = df.with_columns([
pl.lit(s).alias('flag')
])
print(out)
┌─────┬───────────────┬──────┐
│ idx ┆ date_of_birth ┆ flag │
│ --- ┆ --- ┆ --- │
│ i64 ┆ str ┆ str │
╞═════╪═══════════════╪══════╡
│ 1 ┆ 03/06/1990 ┆ Yes │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 2 ┆ 3/06/1990 ┆ Yes │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 3 ┆ 11/12/2000 ┆ No │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 4 ┆ 01/02/2021 ┆ Yes │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 5 ┆ 1/02/2021 ┆ Yes │
├╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┤
│ 6 ┆ 3/06/1990 ┆ Yes │
└─────┴───────────────┴──────┘

