Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to Filter Pandas Dataframe using ‘in’ and ‘not in’ like in SQL

Learn how to filter Pandas dataframe using ‘in’ and ‘not in’ operators, similar to SQL. Master efficient data filtering techniques in Pandas for data analysis and manipulation.
Python logo Python logo
Python logo

Introduction

Filtering data is an essential task in data analysis and manipulation. When working with Pandas, you may often encounter scenarios where you need to filter a dataframe based on specific values, similar to using the ‘in’ and ‘not in’ operators in SQL queries. In this blog post, we will explore how to achieve this in Pandas and perform efficient data filtering.

Filtering with ‘in’ operator

The ‘in’ operator allows you to filter a Pandas dataframe based on a set of values. Here’s an example:

import pandas as pd

# Create a sample dataframe
data = {'Name': ['John', 'Emma', 'Peter', 'Sarah'],
        'Age': [25, 30, 28, 32],
        'City': ['New York', 'London', 'Paris', 'Tokyo']}

df = pd.DataFrame(data)

# Filter the dataframe based on a list of names
names = ['Emma', 'Sarah']
filtered_df = df[df['Name'].isin(names)]

print(filtered_df)

In this example, we have a dataframe with columns ‘Name’, ‘Age’, and ‘City’. We use the ‘isin()’ method on the ‘Name’ column to filter rows where the name is either ‘Emma’ or ‘Sarah’. The resulting dataframe will only contain the matching rows.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Filtering with ‘not in’ operator

Similarly, you can use the ‘not in’ operator to exclude specific values from the dataframe. Here’s an example:

# Filter the dataframe to exclude rows with specific names
filtered_df = df[~df['Name'].isin(names)]

print(filtered_df)

In this case, we use the ‘~’ operator to negate the filtering condition. The resulting dataframe will exclude rows where the name is ‘Emma’ or ‘Sarah’.

Conclusion

Filtering a Pandas dataframe using the ‘in’ and ‘not in’ operators provides a powerful mechanism to extract relevant data based on specific criteria. By leveraging these operators, you can mimic the functionality of SQL queries and perform efficient data filtering in Pandas. Experiment with different filtering conditions and explore the vast capabilities of Pandas for data manipulation and analysis.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading