I have a pandas DataFrame in which one column holds only floats, while another holds either a list of floats, None, or a single float. I have made sure all numeric values are floats.
Ultimately, I want to use Series.isin() to check how many records of value_1 are in value_2, but it is not working for me. When I ran the code below:
df[~df['value_1'].isin(df['value_2'])]
This is what it returned, which I did not expect, since some values in value_1 are clearly in the value_2 lists:
value_1 value_2
0 88870.0 [88870.0]
1 150700.0 None
2 225000.0 [225000.0, 225000.0]
3 305000.0 [305606.0, 305000.0, 1067.5]
4 392000.0 [392000.0]
5 198400.0 396
What am I missing? Please help.
>Solution :
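Series.isin treats each entry of value_2 as a single object, so a float in value_1 is compared against whole lists rather than against the elements inside them, and nothing matches. Compare row by row instead, using boolean indexing with numpy.isin in a list comprehension: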
import numpy as np
# Test each row's value_1 against that same row's value_2 (a list, a scalar, or None)
out = df[[bool(np.isin(v1, v2)) for v1, v2 in zip(df['value_1'], df['value_2'])]]
Output:
value_1 value_2
0 88870.0 [88870.0]
2 225000.0 [225000.0, 225000.0]
3 305000.0 [305606.0, 305000.0, 1067.5]
4 392000.0 [392000.0]
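Since the question asks how many records match, here is a minimal end-to-end sketch of the same row-wise approach. The DataFrame is rebuilt from the printed output in the question, and the names mask, matched, unmatched and n_matched are only illustrative, so adapt them to your real data:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (an assumption about the real frame);
# value_2 mixes lists, None, and a bare scalar.
df = pd.DataFrame({
    "value_1": [88870.0, 150700.0, 225000.0, 305000.0, 392000.0, 198400.0],
    "value_2": [[88870.0], None, [225000.0, 225000.0],
                [305606.0, 305000.0, 1067.5], [392000.0], 396],
})

# Row-wise membership test, kept as a Series so it can be reused and negated.
mask = pd.Series(
    [bool(np.isin(v1, v2)) for v1, v2 in zip(df["value_1"], df["value_2"])],
    index=df.index,
)

matched = df[mask]           # rows whose value_1 appears in their own value_2
unmatched = df[~mask]        # the rows the question's ~ filter was after
n_matched = int(mask.sum())  # how many records match
```

Keeping the mask as a named Series means the matching rows, the non-matching rows, and the count all come from one pass over the data.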