Consider the following pandas DataFrame:
| code | visit_time | flag | other | counter |
|---|---|---|---|---|
| 0 | NaT | True | X | 3 |
| 0 | 1 days 03:00:12 | False | Y | 1 |
| 0 | NaT | False | X | 3 |
| 0 | 0 days 05:00:00 | True | X | 2 |
| 1 | NaT | False | Z | 3 |
| 1 | NaT | True | X | 3 |
| 1 | 1 days 03:00:12 | False | Y | 1 |
| 2 | NaT | True | X | 3 |
| 2 | 5 days 10:01:12 | True | Y | 0 |
Only the columns code, visit_time and flag are needed to solve the problem.
Each row with a non-null visit_time is preceded by a row whose visit_time is NaT. Knowing this, I want to make the following modification to the DataFrame:
- Set the flag of each row with a non-null visit_time to the same value as the flag of its previous row.
Code used (from @Cameron Riddell):
out = df.assign(
    flag=df['flag'].mask(df['visit_time'].notnull(), df['flag'].shift())
)
print(out)
   code       visit_time   flag  other  counter
0     0              NaT   True      X        3
1     0  1 days 03:00:12   True      Y        1
2     0              NaT  False      X        3
3     0  0 days 05:00:00  False      X        2
4     1              NaT  False      Z        3
5     1              NaT   True      X        3
6     1  1 days 03:00:12   True      Y        1
7     2              NaT   True      X        3
8     2  5 days 10:01:12   True      Y        0
The problem is that I want to reuse this code in a function, so the name of the flag column will be stored in a variable, say name. If I use the following code, assign treats name as a literal keyword, and a new column called "name" is created in the DataFrame instead of flag being modified.
name = 'flag'
out = df.assign(
    name=df[name].mask(df['visit_time'].notnull(), df[name].shift())
)
How can I get the same behavior while modifying the column whose name is passed as a parameter? Thanks in advance for any help.
>Solution :
You can use ** to unpack a dictionary into keyword arguments, so the key can come from a variable:
name = 'flag'
out = df.assign(**{name: df[name].mask(df['visit_time'].notnull(), df[name].shift())})
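Since the goal is to reuse this in a function, here is a minimal sketch of how that could look. The function name copy_previous_flag and the time_col parameter are hypothetical names chosen for illustration, and the small DataFrame only reproduces the first four rows of the example.

```python
import pandas as pd


def copy_previous_flag(df, name, time_col="visit_time"):
    """Return a new DataFrame in which rows with a non-null `time_col`
    take the value of column `name` from the previous row.

    `copy_previous_flag` and `time_col` are illustrative names, not part
    of pandas or of the original question.
    """
    new_values = df[name].mask(df[time_col].notnull(), df[name].shift())
    # ** unpacks the dict, so the keyword passed to assign is the value
    # of `name` (e.g. 'flag'), not the literal word "name".
    return df.assign(**{name: new_values})


# First four rows of the example data (only the columns that matter).
df = pd.DataFrame({
    "code": [0, 0, 0, 0],
    "visit_time": pd.to_timedelta(
        [pd.NaT, "1 days 03:00:12", pd.NaT, "0 days 05:00:00"]
    ),
    "flag": [True, False, False, True],
})

out = copy_previous_flag(df, "flag")
print(out)
```

Because assign returns a new DataFrame, df itself is left untouched; reassign the result (df = copy_previous_flag(df, "flag")) if you want to overwrite it.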