Home How to filter out multiple rows in a pandas.DataFrame based on multiple conditions for the same column

Questions

How to filter out multiple rows in a pandas.DataFrame based on multiple conditions for the same column

May 17, 2022

I have an exemplary pd.DataFrame containing codenames of software developed in different development studios:

df = pd.DataFrame({'project_id': [36423, 28564, 96648, 96648, 10042, 68277, 68277, 68277], 'codename': ['banana', 'apple', 'peach', 'peach', 'melon', 'pear', 'pear', 'pear'], 'studio': ['paris', 'amsterdam', 'frankfurt', 'paris', 'london', 'brussel', 'amsterdam', 'sofia']})

      id codename     studio
0  36423   banana      paris
1  28564    apple  amsterdam
2  96648    peach  frankfurt
3  96648    peach      paris
4  10042    melon     london
5  68277     pear    brussel
6  68277     pear  amsterdam
7  68277     pear      sofia

What would be the best way to filter out these rows which hold projects developed

in at least two different studios?
in two specific studios?

The results I am trying to achieve look like as follows:

Which projects are getting developed in at least two different studios:

   project_id codename     studio
0       96648    peach  frankfurt
1       96648    peach      paris
2       68277     pear    brussel
3       68277     pear  amsterdam
4       68277     pear      sofia

Which projects are getting developed in frankfurt AND paris?

   project_id codename     studio
0       96648    peach  frankfurt
1       96648    peach      paris

Using df.loc[df['studio'].isin(['frankfurt', 'paris'])] for instance does not work, as this function filters out all rows which contain either frankfurt or paris in the column studio. Is there a more elegant way than filtering the dataframe for frankfurt and paris and using the Series.intersection() method? I am running out of Ideas here.

Thanks in advance! 🙂

>Solution :

For the first question:

df[df.groupby('project_id')['studio'].transform('nunique').ge(2)]

output:

   project_id codename     studio
2       96648    peach  frankfurt
3       96648    peach      paris
5       68277     pear    brussel
6       68277     pear  amsterdam
7       68277     pear      sofia

For the second:

df[df.groupby('project_id')['studio']
     .transform(lambda x: set(x)=={'frankfurt', 'paris'})]
# if you want at least frankfurt+paris, use
# set(x)>={'frankfurt', 'paris'})

output:

   project_id codename     studio
2       96648    peach  frankfurt
3       96648    peach      paris

byMR

Published May 17, 2022

Add a comment

ggplot within function does not fully work

byMR

May 17, 2022

Questions

What kinds of expressions are allowed in a `#if` (the conditional inclusion preprocesssor directives)

byMR

May 17, 2022

Questions

How to define generic typescript type with keyof property?

byMR

May 17, 2022

Questions

How to copy all column names and data types from table in DBeaver?

byMR

May 17, 2022

Questions

Query Expression Join context with primitive list

byMR

May 17, 2022

Questions

Network Docker-Compose

byMR

May 17, 2022

How to filter out multiple rows in a pandas.DataFrame based on multiple conditions for the same column

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

ggplot within function does not fully work

What kinds of expressions are allowed in a `#if` (the conditional inclusion preprocesssor directives)

How to define generic typescript type with keyof property?

How to copy all column names and data types from table in DBeaver?

Query Expression Join context with primitive list

Network Docker-Compose

Keep Up to Date with the Most Important News

How to filter out multiple rows in a pandas.DataFrame based on multiple conditions for the same column

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

ggplot within function does not fully work

What kinds of expressions are allowed in a `#if` (the conditional inclusion preprocesssor directives)

How to define generic typescript type with keyof property?

How to copy all column names and data types from table in DBeaver?

Query Expression Join context with primitive list

Network Docker-Compose

Discover more from Dev solutions