Filter variable not working properly in Pandas

I have a very large dataset and apply multiple filters across many columns. To keep the code readable, I assign the filters to variables. However, I noticed that after the values in the dataframe change, the filter does not seem to take the new values into account.

This is my dataframe:

import pandas as pd

data = {'id':[12, 84, 156, 228, 300, 372, 444, 516, 588, 660, 732],
       'age':['18-18', '22-22', '35-35', '33-33', '45-45', '40-40', '55-55', '60-60', '47-47', '25-25', '59-59'],
       'height':['175-177', '165-167', '175-178', '165-168', '175-179', '165-169', '175-180', '165-170', '175-181', '165-171', '175-182'],
       'weight':['65-70', '65-70', '80-85', '75-80', '90-95', '100-105', '80-85', '70-75', '70-75', '85-90', '90-95'],
       'education':['10-12', '11-13', '12-14', '13-15', '14-16', '15-17', '16-18', '17-19', '18-20', '19-21', '20-22'],
       'employment':['1-4', '8-11', '8-11', '4-7', '5-8', '5-8', '9-12', '15-18', '13-16', '12-15', '12-15'],
       'country':['France-EU', 'Austria-EU', 'Netherland-EU', 'Italy-EU', 'Texas-US', 'California-US', 'Washington-US', 'Poland-EU', 'Spain-EU', 'Greece-EU', 'New York-US'],
       'city':['Paris-FR', 'Vienna-AUS', 'Amsterdam-NL', 'Rome-ITA', 'Austin-TX', 'LA-CAL', 'Olympia-WAS', 'Warsaw-PL', 'Madrid-SPA', 'Athens-GR', 'Albany-NY']}

df = pd.DataFrame(data)

And I want to apply this filter:

df['weight'] = df['weight'].astype(str)
filter1 = (df['weight'].str.slice(stop=2) == '65') & (df['country'].str.slice(stop=2) == 'Au')

Initially, I get what I want using the filter:

df.loc[filter1]

Later, I change the filtered rows as follows:

df.loc[filter1,'weight'] = '100'

When I apply the filter again, I expect no rows, but it returns the same rows, even though the condition should now evaluate to False for them.

>Solution :

filter1 is a boolean Series computed once, when it is assigned, so it doesn't update to match values that you set afterwards. Build it again after your changes and you'll see that it works as expected:

def get_filter1(df):
    return df['weight'].str[:2].eq('65') & df['country'].str[:2].eq('Au')


print(df.loc[get_filter1(df)])

df.loc[get_filter1(df), 'weight'] = '100'

print(df.loc[get_filter1(df)])

Output:

   id    age   height weight education employment     country        city
1  84  22-22  165-167  65-70     11-13       8-11  Austria-EU  Vienna-AUS

Empty DataFrame
Columns: [id, age, height, weight, education, employment, country, city]
Index: []
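
To see why the stored filter goes stale, here is a minimal sketch on a small made-up frame (the three rows and the helper name is_match are assumptions for illustration, not part of the question's data). It compares the old mask with a freshly computed one, and also shows one possible alternative: passing a callable to .loc, so the condition is evaluated against the dataframe at selection time.

```python
import pandas as pd

# Hypothetical three-row frame with only the two columns the filter uses.
df = pd.DataFrame({
    "weight": ["65-70", "65-70", "80-85"],
    "country": ["France-EU", "Austria-EU", "Austria-EU"],
})


def is_match(d):
    """Build the boolean mask from the current contents of d."""
    return d["weight"].str[:2].eq("65") & d["country"].str[:2].eq("Au")


# The mask is an ordinary boolean Series: a snapshot of df at this moment.
filter1 = is_match(df)

# Overwrite the matching rows.
df.loc[filter1, "weight"] = "100"

# The stored mask still holds the old answers...
stale = filter1.tolist()
# ...while a freshly computed mask reflects the new values.
fresh = is_match(df).tolist()

# .loc also accepts a callable, which it calls with df when selecting,
# so the condition is never evaluated against outdated values.
remaining = df.loc[is_match]

print(stale)
print(fresh)
print(remaining)
```

With the callable form there is no stored mask to forget about, at the cost of recomputing the condition on every selection, which may matter on a very large dataset.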