Filtering a single column to two unique values

I have used .loc to reduce my DataFrame to two columns, ‘Worker’ and ‘Time Type’.

Example dataset

import pandas as pd

df = pd.DataFrame({'Worker': ['Sam', 'Ben', 'Tom'], 'Time Type': ['Full Time', 'Part Time', 'paert Tme']})
df

  Worker  Time Type
0    Sam  Full Time
1    Ben  Part Time
2    Tom  paert Tme

I now want to see only the rows where ‘Time Type’ is ‘Part Time’ or ‘Full Time’.

The code I’ve built thus far is:

df2 = df.loc[:, ['Worker', 'Time Type']]
df2[(df2['Time Type'] == 'Part time' | 'Full time')]

However, I am getting this error: TypeError: unsupported operand type(s) for |: 'str' and 'str'

Does anybody know an easy way to get around this?

Ideally I want to end up with two things:

  1. An output showing Full Time and Part Time employees.

  2. Another output showing anomalies outside these values, e.g. ‘Tom’ in row 2 has ‘paert Tme’, which is an anomaly and worth viewing as a separate output.

Any tips on best practice or approaches would be a great help. Thanks, folks.

Solution:

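Use the .isin() method, which is easier to write and nicer to read. The original expression fails because | binds more tightly than ==, so Python tries to evaluate 'Part time' | 'Full time' first, and | is not defined for two strings. Note that the match is case-sensitive, so the values must be spelled exactly as they appear in the column ('Part Time', not 'Part time').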

df[df['Time Type'].isin(['Full Time', 'Part Time'])]
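The question also asks for a second output with the anomalies. One way to get both, sketched here under the assumption that 'Full Time' and 'Part Time' are the only accepted values, is to store the .isin() result as a boolean mask and invert it with ~. The names valid_types, is_valid, expected and anomalies are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Worker': ['Sam', 'Ben', 'Tom'],
    'Time Type': ['Full Time', 'Part Time', 'paert Tme'],
})

# Assumed list of accepted values; adjust to your own data.
valid_types = ['Full Time', 'Part Time']

# Boolean mask: True where 'Time Type' is one of the accepted values.
is_valid = df['Time Type'].isin(valid_types)

expected = df[is_valid]     # rows with an accepted value
anomalies = df[~is_valid]   # everything else, via the inverted mask

# The same mask written with explicit comparisons. Each comparison
# needs its own parentheses because | binds more tightly than ==.
same_mask = (df['Time Type'] == 'Full Time') | (df['Time Type'] == 'Part Time')

print(expected)
print(anomalies)
```

With the example data, expected holds the rows for Sam and Ben, and anomalies holds the row for Tom. The mask approach keeps the two outputs consistent with each other, since every row lands in exactly one of them.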
