I have the dataframe as shown in the table below:
| Column A | Column B |
|---|---|
| 7 | 124 |
| 7 | nan |
| 7 | nan |
| 8 | nan |
| 8 | nan |
| 8 | nan |
| 9 | 124 |
| 9 | nan |
| 9 | nan |
I need to keep the rows where Column A is "7" and "9" since Column B contains "124". I don’t need all the rows with "8" in i Column A, since "124" is not present in Column B in these rows.
The dataframe should end up with all the instances where the value in Column A matches with "124" in Column B + the rest of the rows where values in Column A are identical with the value in Column A which matches "124" in Column B:
| Column A | Column B |
|---|---|
| 7 | 124 |
| 7 | nan |
| 7 | nan |
| 9 | 124 |
| 9 | nan |
| 9 | nan |
I have tried to filter the dataframe but can only manage to filter the single rows where Column B is 124 and not the following rows where Column A is identical.
>Solution :
You can first get all the values of column A, where B is 124 and afterwards filter your dataframe for these values of column A:
a_values = df[df['B']==124]['A']
df[df['A'].isin(a_values)]