Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python – Forward fill with non gap date values with a loop condition

I have a data frame as below.

Index Date Group Name
0   Oct-10  A   Sam
1   Oct-10  A   Sam
2   Dec-12  A   Sam
3   Dec-12  A   Sam
4   May-13  A   Peter
5   May-14  A   Peter
6   May-15  A   Chris
7   Apr-17  A   Chris
8   Apr-17  A   Sam
9   Jul-23  A   Sam
10  Dec-12  B   Peter
11  Dec-12  B   Peter
12  May-13  B   Peter
13  May-14  B   Peter
14  Apr-18  B   Sam
15  Jul-19  B   Sam
16  Jul-20  B   Sam
17  Jul-22  B   Sam
18  Jul-23  B   Sam

I want to forward fill the date values for every "Group" in such a way that every person start date should be forward filled.

Example : Group A, Sam started from Oct 2010, hence till from index 0 to 3, it should be Oct 2010 for him. Peter started on May 2013, so same May 2013 for index 4 & 5. However, again Sam comes in April 2017 so it should be April 2017 for index 8 & 9.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I am finding this is very difficult because Sam is appearing twice in group A. Your help will be highly appriciated to solve this. I am trying someting like below but not giving any results.

for every group in df["Group"]:
      df["date2"] = (df["Date"].groupby(["Name"]).transform(lambda x: x.ffill()))

the result I want is below

Index Date Group Name date2
0   Oct-10  A   Sam   Oct-10
1   Oct-10  A   Sam   Oct-10
2   Dec-12  A   Sam   Oct-10
3   Dec-12  A   Sam   Oct-10
4   May-13  A   Peter May-13
5   May-14  A   Peter May-13
6   May-15  A   Chris May-15
7   Apr-17  A   Chris May-15
8   Apr-17  A   Sam   Apr-17
9   Jul-23  A   Sam   Apr-17
10  Dec-12  B   Peter Dec-12
11  Dec-12  B   Peter Dec-12
12  May-13  B   Peter Dec-12
13  May-14  B   Peter Dec-12
14  Apr-18  B   Sam   Apr-18
15  Jul-19  B   Sam   Apr-18
16  Jul-20  B   Sam   Apr-18
17  Jul-22  B   Sam   Apr-18
18  Jul-23  B   Sam   Apr-18

>Solution :

You can compute a mask with shift and any, then use it to select the first values and ffill:

cols = ['Group', 'Name']
m = df[cols].ne(df[cols].shift()).any(axis=1)

df['date2'] = df['Date'].where(m).ffill()

Output:

    Index    Date Group   Name   date2
0       0  Oct-10     A    Sam  Oct-10
1       1  Oct-10     A    Sam  Oct-10
2       2  Dec-12     A    Sam  Oct-10
3       3  Dec-12     A    Sam  Oct-10
4       4  May-13     A  Peter  May-13
5       5  May-14     A  Peter  May-13
6       6  May-15     A  Chris  May-15
7       7  Apr-17     A  Chris  May-15
8       8  Apr-17     A    Sam  Apr-17
9       9  Jul-23     A    Sam  Apr-17
10     10  Dec-12     B  Peter  Dec-12
11     11  Dec-12     B  Peter  Dec-12
12     12  May-13     B  Peter  Dec-12
13     13  May-14     B  Peter  Dec-12
14     14  Apr-18     B    Sam  Apr-18
15     15  Jul-19     B    Sam  Apr-18
16     16  Jul-20     B    Sam  Apr-18
17     17  Jul-22     B    Sam  Apr-18
18     18  Jul-23     B    Sam  Apr-18

Intermediates:

    Index    Date Group   Name   date2 Group_shift Name_shift  Group_eq_shift  Name_eq_shift      m
0       0  Oct-10     A    Sam  Oct-10        None       None            True           True   True
1       1  Oct-10     A    Sam  Oct-10           A        Sam           False          False  False
2       2  Dec-12     A    Sam  Oct-10           A        Sam           False          False  False
3       3  Dec-12     A    Sam  Oct-10           A        Sam           False          False  False
4       4  May-13     A  Peter  May-13           A        Sam           False           True   True
5       5  May-14     A  Peter  May-13           A      Peter           False          False  False
6       6  May-15     A  Chris  May-15           A      Peter           False           True   True
7       7  Apr-17     A  Chris  May-15           A      Chris           False          False  False
8       8  Apr-17     A    Sam  Apr-17           A      Chris           False           True   True
9       9  Jul-23     A    Sam  Apr-17           A        Sam           False          False  False
10     10  Dec-12     B  Peter  Dec-12           A        Sam            True           True   True
11     11  Dec-12     B  Peter  Dec-12           B      Peter           False          False  False
12     12  May-13     B  Peter  Dec-12           B      Peter           False          False  False
13     13  May-14     B  Peter  Dec-12           B      Peter           False          False  False
14     14  Apr-18     B    Sam  Apr-18           B      Peter           False           True   True
15     15  Jul-19     B    Sam  Apr-18           B        Sam           False          False  False
16     16  Jul-20     B    Sam  Apr-18           B        Sam           False          False  False
17     17  Jul-22     B    Sam  Apr-18           B        Sam           False          False  False
18     18  Jul-23     B    Sam  Apr-18           B        Sam           False          False  False
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading