Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas replace values in columns based on condtion

I have a dataframe like below:

dummy_df_dict  = {'Email':['joblogs@gmail.com', 'joblogs@gmail.com', 'johnsmith@gmail.com', 'johnsmith@gmail.com'],
              'Transaction_Country': ['CA', 'No Country Listed', 'No Country Listed', 'DE'],
             'Country_name': ['Canada', 'No Country Listed', 'No Country Listed', 'Germany'],
                       'Continent':['North America', 'No Contient listed', 'No Contient listed', 'Europe']}
    Email   Transaction_Country Country_name    Continent
0   joblogs@gmail.com   CA  Canada  North America
1   joblogs@gmail.com   No Country Listed   No Country Listed   No Contient listed
2   johnsmith@gmail.com No Country Listed   No Country Listed   No Contient listed
3   johnsmith@gmail.com DE  Germany Europe

I’m just trying to replace the "No "XYZ" listed text with the actual country / continent based on the email above / below.

So for example, the 2nd row [1] instead of showing ‘Transaction_Country’ as ‘No Country Listed’ it instead would be replaced with the value ‘CA’. And then on the next row, for johnsmith it would be replaced with "DE".

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Many thanks!



>Solution :

df = df.replace('No Country Listed', np.nan).replace('No Contient listed', np.nan)
df = df.sort_values(['Email', 'Transaction_Country']).groupby('Email')[df.columns].ffill()
print(df)

Output:

                 Email Transaction_Country Country_name      Continent
0    joblogs@gmail.com                  CA       Canada  North America
1    joblogs@gmail.com                  CA       Canada  North America
3  johnsmith@gmail.com                  DE      Germany         Europe
2  johnsmith@gmail.com                  DE      Germany         Europe

Looking at it again, I think just this works as well:

df = df.replace('No Country Listed', np.nan).replace('No Contient listed', np.nan)
df = df.sort_values(['Email', 'Transaction_Country']).ffill()
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading