I have a pandas DataFrame, let's call it df, that looks like this:
|Rank| Country | All | Agr |Ind |Dom |
-------------------------------------------------------
|1 |Argentina |2 |3 |1 |5 |
|4 |Chile |3 |3 |4 |3 |
|3 |Colombia |1 |2 |1 |4 |
|4 |Mexico |3 |5 |4 |2 |
|3 |Panama |2 |1 |5 |4 |
|2 |Peru |3 |3 |4 |2 |
I want to remove the rows whose Country is not in the following list:
paises = ["Colombia", "Peru", "Chile"]
For that I tried this code:
df = df.drop(df["Country"]= paises)
But it did not work, because they do not have the same length.
I'm a rookie in Python. Can you help me?
>Solution :
Use pandas.Series.isin() (df['Country'] is a Series).
Create a boolean mask from the list of countries you don't want in your DataFrame.
paises = ["Colombia", "Peru", "Chile"]
df[df['Country'].isin(paises)==False]
Use the mask to select the rows that exclude the countries in paises, and assign the result back to df.
paises = ["Colombia", "Peru", "Chile"]
df = df[df['Country'].isin(paises)==False]
>>> df
Rank Country All Agr Ind Dom
0 1 Argentina 2 3 1 5
3 4 Mexico 3 5 4 2
4 3 Panama 2 1 5 4
You can also negate the isin() mask with the "~" operator instead of comparing it with False: df = df[~df['Country'].isin(paises)]
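As a rough end-to-end sketch, here is the table from the question rebuilt by hand, with the mask applied in both directions. The names mask, excluded and kept are only illustrative. Note that the question's wording can be read as wanting to keep the listed countries rather than drop them; if that is the case, index with the mask itself instead of its negation.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the table)
df = pd.DataFrame({
    "Rank": [1, 4, 3, 4, 3, 2],
    "Country": ["Argentina", "Chile", "Colombia", "Mexico", "Panama", "Peru"],
    "All": [2, 3, 1, 3, 2, 3],
    "Agr": [3, 3, 2, 5, 1, 3],
    "Ind": [1, 4, 1, 4, 5, 4],
    "Dom": [5, 3, 4, 2, 4, 2],
})
paises = ["Colombia", "Peru", "Chile"]

# Boolean Series: True where the row's Country appears in paises
mask = df["Country"].isin(paises)

excluded = df[~mask]  # rows whose Country is NOT in paises
kept = df[mask]       # rows whose Country IS in paises

print(excluded)
print(kept)
```

Both results keep the original index labels. If you want a fresh 0..n-1 index afterwards, call .reset_index(drop=True) on the result.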
You don’t even need the hand of Maradona for this one.