How to remove some rows in a dataframe which are not in a list

January 28, 2022

I have a pandas DataFrame, let's call it df, that looks like this:

|Rank| Country         | All    | Agr    |Ind  |Dom   |
-------------------------------------------------------
|1   |Argentina        |2       |3       |1    |5     |
|4   |Chile            |3       |3       |4    |3     |
|3   |Colombia         |1       |2       |1    |4     |
|4   |Mexico           |3       |5       |4    |2     |
|3   |Panama           |2       |1       |5    |4     |
|2   |Peru             |3       |3       |4    |2     |

I want to remove the rows whose Country is not in the following list:

paises = ["Colombia", "Peru", "Chile"]

For that I tried this code:

df = df.drop(df["Country"]= paises)

But it did not work, because the column and the list do not have the same length.

I'm a beginner in Python. Can you help me?

>Solution :

Use pandas.Series.isin() on the Country column.

Create a boolean mask from the list of countries you don't want in your DataFrame. The expression below is True for every row whose Country is not in paises.

paises = ["Colombia", "Peru", "Chile"]
df[df['Country'].isin(paises)==False]

Assign the masked DataFrame back to df. The result excludes the countries in paises.

paises = ["Colombia", "Peru", "Chile"]
df = df[df['Country'].isin(paises)==False]

>>> df
   Rank    Country  All  Agr  Ind  Dom
0     1  Argentina    2    3    1    5
3     4     Mexico    3    5    4    2
4     3     Panama    2    1    5    4
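
The snippets above assume df already exists. As a minimal self-contained sketch, the table from the question can be rebuilt by hand (the values are copied from it; the names mask and excluded are only illustrative) and filtered the same way:

```python
import pandas as pd

# Rebuild the table from the question (values copied from it).
df = pd.DataFrame({
    "Rank": [1, 4, 3, 4, 3, 2],
    "Country": ["Argentina", "Chile", "Colombia", "Mexico", "Panama", "Peru"],
    "All": [2, 3, 1, 3, 2, 3],
    "Agr": [3, 3, 2, 5, 1, 3],
    "Ind": [1, 4, 1, 4, 5, 4],
    "Dom": [5, 3, 4, 2, 4, 2],
})
paises = ["Colombia", "Peru", "Chile"]

# Boolean Series: True where Country is one of paises.
mask = df["Country"].isin(paises)

# Keep the rows where the mask is False, i.e. Country is NOT in paises.
excluded = df[mask == False]
print(excluded)
```

The original row labels (0, 3, 4) are preserved because boolean indexing selects rows without renumbering them.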

You can also negate the mask with the ~ operator instead of comparing it to False: df = df[~df['Country'].isin(paises)]
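
Note that the question, as worded, asks to remove the rows that are not in paises, which would mean keeping Colombia, Peru and Chile. If that is the intent, a sketch under that reading simply uses the mask without negation. Only two columns are rebuilt here for brevity, and the name kept and the reset_index step are illustrative additions, not part of the original answer:

```python
import pandas as pd

# Reduced version of the question's table (two columns only).
df = pd.DataFrame({
    "Rank": [1, 4, 3, 4, 3, 2],
    "Country": ["Argentina", "Chile", "Colombia", "Mexico", "Panama", "Peru"],
})
paises = ["Colombia", "Peru", "Chile"]

# Keep only the rows whose Country IS in paises.
kept = df[df["Country"].isin(paises)]

# Optional: renumber the remaining rows from 0.
kept = kept.reset_index(drop=True)
print(kept)
```

The rows come back in the DataFrame's original order, not in the order of the list.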

You don’t even need the hand of Maradona for this one.