Fill empty rows based on a condition in another column (pandas)

February 20, 2023

I’m struggling in Python (pandas) to find a way to fill the empty cells of one column based on another column, as in the following example:

|     email         |  run   | other cols .... 
| cris@gmail.com    | 12345  | 
| patty@gmail.com   | 134254 |
| rick@outlook.com  | 23232  |
| rick@outlook.com  |        |   
|                   | 134254 |
|                   | 134254 |
| cris@gmail.com    |        |

Because I have other columns, the rows aren’t duplicates. I would like to fill each empty cell using the value from another row that has the same information, like this:

|     email         |  run   | other cols .... 
| cris@gmail.com    | 12345  | 
| patty@gmail.com   | 134254 |
| rick@outlook.com  | 23232  |
| rick@outlook.com  | 23232  |     
| patty@gmail.com   | 134254 |
| patty@gmail.com   | 134254 |
| cris@gmail.com    | 12345  |

Could anyone help me, please?

Solution:

You can fill each column from the other with two groupby operations, taking the first non-null value in each group:

out = df.assign(run=df['run'].fillna(df.groupby('email')['run'].transform('first')),
                email=df['email'].fillna(df.groupby('run')['email'].transform('first'))
                )
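To try this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the question's table, assuming the empty cells are real NaN values and ignoring the other columns:

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; missing cells are NaN (an assumption).
df = pd.DataFrame({
    "email": ["cris@gmail.com", "patty@gmail.com", "rick@outlook.com",
              "rick@outlook.com", np.nan, np.nan, "cris@gmail.com"],
    "run": [12345, 134254, 23232, np.nan, 134254, 134254, np.nan],
})

# Both fills are computed from the original df, then assigned together.
out = df.assign(
    # Missing run: take the first non-null run seen for the same email.
    run=df["run"].fillna(df.groupby("email")["run"].transform("first")),
    # Missing email: take the first non-null email seen for the same run.
    email=df["email"].fillna(df.groupby("run")["email"].transform("first")),
)
print(out)
```

A row where both email and run are missing would stay empty, since there is nothing to match on.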

Using a helper function:

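def fill_from(target, group, df=df):
    # Fill missing values in `target` with the first non-null value from the same `group`.
    # Note: the default `df=df` is bound when the function is defined.
    return df[target].fillna(df.groupby(group)[target].transform('first'))

out = df.assign(run=fill_from('run', 'email'), email=fill_from('email', 'run'))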
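One caveat: fillna only acts on real missing values (NaN). If the "empty" cells are actually empty strings, which can happen depending on how the data was loaded, they need converting first. The sketch below assumes that situation; the frame `raw` and its string-typed `run` column are hypothetical sample data, and this variant of `fill_from` takes the DataFrame explicitly:

```python
import numpy as np
import pandas as pd

# Hypothetical data where empty cells are empty or whitespace-only strings.
raw = pd.DataFrame({
    "email": ["cris@gmail.com", "rick@outlook.com", "rick@outlook.com", "", "cris@gmail.com"],
    "run": ["12345", "23232", "", "23232", "  "],
})

# Turn empty or whitespace-only strings into NaN so fillna can see them.
df = raw.replace(r"^\s*$", np.nan, regex=True)

def fill_from(target, group, df):
    # Fill missing `target` values with the first non-null value in the same `group`.
    return df[target].fillna(df.groupby(group)[target].transform("first"))

out = df.assign(run=fill_from("run", "email", df),
                email=fill_from("email", "run", df))
print(out)
```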

Output:

              email       run  other cols
0    cris@gmail.com   12345.0         NaN
1   patty@gmail.com  134254.0         NaN
2  rick@outlook.com   23232.0         NaN
3  rick@outlook.com   23232.0         NaN
4   patty@gmail.com  134254.0         NaN
5   patty@gmail.com  134254.0         NaN
6    cris@gmail.com   12345.0         NaN
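The `.0` suffix in the run column appears because a numeric column containing NaN is stored as float. If whole numbers are preferred once the gaps are filled, one option is to cast to pandas' nullable `Int64` dtype. A small sketch, using a hypothetical two-row frame in place of the full result:

```python
import pandas as pd

# Hypothetical stand-in for the filled result; run is float after the fill.
out = pd.DataFrame({
    "email": ["cris@gmail.com", "rick@outlook.com"],
    "run": [12345.0, 23232.0],
})

# Nullable integer dtype: shows 12345 instead of 12345.0 and still allows missing values.
out["run"] = out["run"].astype("Int64")
print(out)
```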