I am trying to find dataframe values exist in another dataframe, I used isin function but it retrieve only values for the first column. ex: df1 has 5 columns, and df2 has 10 columns. The output I need all rows from df2 depend on column B1 value in df2.
Example
df1
| A1 | A2 | A3 |
|---|---|---|
| DC | loss | battery |
| Power | door | DC |
| battery | link | AC |
df2
| B1 | B2 | B3 | B4 | B5 | B6 |
|---|---|---|---|---|---|
| DC | Two | Three | One | Two | Three |
| battery | GG | 44 | kk | ii | pp |
| AC | Two | Three | One | Two | Three |
| circuit | Two | Three | One | Two | Three |
| door | SS | TT | BB | RR | EE |
| current | Two | Three | One | Two | Three |
The needed output
df3
| B1 | B2 | B3 | B4 | B5 | B6 |
|---|---|---|---|---|---|
| DC | Two | Three | One | Two | Three |
| battery | GG | 44 | kk | ii | pp |
| AC | Two | Three | One | Two | Three |
| door | SS | TT | BB | RR | EE |
I used the code :
df3_1 = df2[df2.B1.isin(df1.A1)]
df3_2 = df2[df2.B1.isin(df1.A2)]
df3_3 = df2[df2.B1.isin(df1.A3)]
only df3_1 has correct values ,df3_2 and df3_3 are empty. I don’t know the reason.
>Solution :
One reason should be traling spaces, try remove them, also convert values to lowercase:
s = df2.B1.str.strip().str.lower()
df3_2 = df2[s.isin(df1.A2.str.strip().str.lower())]
df3_3 = df2[s.isin(df1.A3.str.strip().str.lower())]