Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Filtering for different conditions in different columns pandas

Python 3.9 and Pandas 1.3.4

So here’s the df:

1   First Name  Last Name  fullname
2   Freddie     Mercury    Freddie Mercury
3   John        Lennon     John Lennon
4   David       Bowie      David Bowie
5                          John Doe
6   Joseph                 Joseph
7               Jovi       Jovi

My piece of code currently just finds the fullname column is just First Name + Last Name.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I’m currently trying to filter for blank entries in the First Name column, Last Name column, and any "John Does" in the fullname column.

Current code:

import pandas as pd

df = pd.read_csv('file.csv', dtype=str, header=0)

df2 = pd.DataFrame(df, columns=['First Name', 'Last Name', 'fullname'])

df['fullname'] = (df[['First Name', 'Last Name']].fillna('').agg(' '.join, axis=1).str.strip().replace('', 'John Doe'))

df_sort = df2.loc[df2['First Name'] == " "] | df2.loc[df2['Last Name'] == " "] | df2.loc[df2['fullname'] == "John Doe"]


df.to_csv('file.csv', index=False)
df_sort.to_csv('missing names.csv', index=False)

Currently I am having the missing names write to a new file and outputs only this:

First Name   Last Name   fullname

Everything is empty under.

I would like for the output to be:

First Name   Last Name   fullname
                         John Doe
Joseph
             Jovi

>Solution :

Replace possible missing values to empty string, compare and test if at least one value match in DataFrame.any:

df_sort = df2[df2[['First Name', 'Last Name']].fillna('').eq('').any(axis=1)]

Or if there are missing values use:

df_sort = df2[df2[['First Name', 'Last Name']].isna().any(axis=1)]
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading