import pandas as pd
df = {'Date': ["2011-10-19",
"2013-01-14",
"2014-05-27",
"2014-06-23",
"2014-08-12",
"2014-09-22",
"2014-09-22",
"2014-09-22"
], 'Status': ["Pending",
"Pending",
"Complete",
"Pending",
"Complete",
"Pending",
"Pending",
"Pending"],
'Group': ["a",
"a",
"a",
"a",
"b",
"b",
"b",
"b"]}
df = pd.DataFrame(data=df)
df
I would like to create another variable based on the change in Status over time for each group such that they are considered a "completer" the next row after they have Status = "Complete"
For example I would like to create the "completer" column in the df2 table:
df2 = {'Date': ["2011-10-19",
"2013-01-14",
"2014-05-27",
"2014-06-23",
"2014-08-12",
"2014-09-22",
"2014-09-22",
"2014-09-22"
], 'Status': ["Pending",
"Pending",
"Complete",
"Pending",
"Complete",
"Pending",
"Pending",
"Pending"],
'Group': ["a",
"a",
"a",
"a",
"b",
"b",
"b",
"b"],
'Completer': ["Non-Completer",
"Non-Completer",
"Non-Completer",
"Completer",
"Non-Completer",
"Completer",
"Completer",
"Completer"]}
df2 = pd.DataFrame(data=df2)
df2
Thanks!
>Solution :
I was able to solve it in two steps.
Firstly I created a column which marks the first date that each group has "Completed", and added 1 to it so that we update values starting from the next row.
Secondly, since True / False translates to 1 / 0 respectively I used a groupby on Group and cummax, which would update all following rows per group to True.
Then lastly I just used replace and dropped the helping column.
df['first_date_per_group'] = df.index.isin(df.loc[df['Status'].eq('Complete')]['Group'].index + 1)
df = df.assign(Completer=df.groupby('Group')['first_date_per_group'].cummax()).replace(
{True:'Completer',False:'Non-Completer'}).drop('first_date_per_group',axis=1)
prints:
Date Status Group Completer
0 2011-10-19 Pending a Non-Completer
1 2013-01-14 Pending a Non-Completer
2 2014-05-27 Complete a Non-Completer
3 2014-06-23 Pending a Completer
4 2014-08-12 Complete b Non-Completer
5 2014-09-22 Pending b Completer
6 2014-09-22 Pending b Completer
7 2014-09-22 Pending b Completer