Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to filter data in Pandas by custom date?

There are some columns in Pandas dataframe:

name age mm yy
         01 23

How to filter rows where mmyy < current date using custom function (apply)?

mm and yy is as 8 non-null object

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I tried this way:

from datetime import datetime

# mm = datetime.strptime(mm, "%m").strftime("%m")
# mm = datetime.strptime(mm, "%b").strftime("%m")  #str

mm = '06'
yy = '22'

cdd, cmm, cyy = datetime.today().strftime("%d %m %Y").split()
yy = datetime.strptime(yy, '%y').strftime("%Y")
mm = datetime.strptime(mm, "%m").strftime("%m")
 
dt = datetime(year=int(yy), month=int(mm), day=1)
present = datetime.now()
print(dt <present)

So, how to wrap this into custom function and apply in filter Pandas?

>Solution :

Here’s a way to do what your question asks, assuming the mm and yy columns contain strings:

import pandas as pd
df = pd.DataFrame({'name':['Alice', 'Bob'], 'age':[20, 30], 'mm':['01', '04'], 'yy':['23', '22']})
print(df)

from datetime import datetime
now = datetime.today().date()
df = df[df.apply(lambda x: datetime.strptime(f"{x.yy}/{x.mm}/01", "%y/%m/%d").date() < now, axis=1)]
print(df)

Input

    name  age  mm  yy
0  Alice   20  01  23
1    Bob   30  04  22

Output

  name  age  mm  yy
1  Bob   30  04  22

UPDATE: Based on OP’s question in comments, this is how to do it if month is an abbreviated name like Jan for January (full docs on format codes here):

import pandas as pd
df = pd.DataFrame({'name':['Alice', 'Bob'], 'age':[20,30], 'mm':['Jan','Apr'], 'yy':['23','22']})
print(df)

from datetime import datetime
now = datetime.today().date()
df = df[df.apply(lambda x: datetime.strptime(f"{x.yy}/{x.mm}/01", "%y/%b/%d").date() < now, axis=1)]
print(df)

Input

    name  age   mm  yy
0  Alice   20  Jan  23
1    Bob   30  Apr  22

Output

  name  age   mm  yy
1  Bob   30  Apr  22
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading