Pandas replace values in columns based on condtion

I have a dataframe like below:

dummy_df_dict  = {'Email':['joblogs@gmail.com', 'joblogs@gmail.com', 'johnsmith@gmail.com', 'johnsmith@gmail.com'],
              'Transaction_Country': ['CA', 'No Country Listed', 'No Country Listed', 'DE'],
             'Country_name': ['Canada', 'No Country Listed', 'No Country Listed', 'Germany'],
                       'Continent':['North America', 'No Contient listed', 'No Contient listed', 'Europe']}
    Email   Transaction_Country Country_name    Continent
0   joblogs@gmail.com   CA  Canada  North America
1   joblogs@gmail.com   No Country Listed   No Country Listed   No Contient listed
2   johnsmith@gmail.com No Country Listed   No Country Listed   No Contient listed
3   johnsmith@gmail.com DE  Germany Europe

I’m just trying to replace the "No "XYZ" listed text with the actual country / continent based on the email above / below.

So for example, the 2nd row [1] instead of showing ‘Transaction_Country’ as ‘No Country Listed’ it instead would be replaced with the value ‘CA’. And then on the next row, for johnsmith it would be replaced with "DE".

Many thanks!



>Solution :

df = df.replace('No Country Listed', np.nan).replace('No Contient listed', np.nan)
df = df.sort_values(['Email', 'Transaction_Country']).groupby('Email')[df.columns].ffill()
print(df)

Output:

                 Email Transaction_Country Country_name      Continent
0    joblogs@gmail.com                  CA       Canada  North America
1    joblogs@gmail.com                  CA       Canada  North America
3  johnsmith@gmail.com                  DE      Germany         Europe
2  johnsmith@gmail.com                  DE      Germany         Europe

Looking at it again, I think just this works as well:

df = df.replace('No Country Listed', np.nan).replace('No Contient listed', np.nan)
df = df.sort_values(['Email', 'Transaction_Country']).ffill()

Leave a Reply