I have a filtered dataframe called correct_df and a raw dataframe example_df.
example_df = pd.DataFrame({'Test': ['Test_1', 'Test_1', 'Test_1', 'Test_2', 'Test_2', 'Test_2', 'Test_3', 'Test_3', 'Test_3'], 'A': [1, 2, 3, 1, 2, 3, 1, 2, 3]})
other_df = pd.DataFrame({'Test': ['Test_1', 'Test_1', 'Test_3', 'Test_3'], 'A': [1, 2, 1, 3]})
Desired result:
I want the indexes of example_df where ‘Test’ and ‘A’ column values both match the combined row value of correct_df.
I have tried:
result = example_df.loc[ (example_df['Test'].isin(other_df['Test'])) & (example_df['A'].isin(other_df['A'])) ]
result
Test A
0 Test_1 1
1 Test_1 2
2 Test_1 3
6 Test_3 1
7 Test_3 2
8 Test_3 3
But as these two conditions are separated, the resulting value only applies the conditions on single columns without chaining them i.e. A then also B, not A and B. How do I get the .loc of both column conditions?
>Solution :
Use DataFrame.reset_index for avoid lost indices and then DataFrame.merge:
result = example_df.reset_index().merge(other_df, on=['Test','A'])
print (result)
index Test A
0 0 Test_1 1
1 1 Test_1 2
2 6 Test_3 1
3 8 Test_3 3
result = (example_df.reset_index()
.merge(other_df, on=['Test','A'])
.set_index('index')
.rename_axis(None))
print (result)
Test A
0 Test_1 1
1 Test_1 2
6 Test_3 1
8 Test_3 3
Another idea with MultiIndex with Index.isin and filtering in boolean indexing:
result = example_df[example_df.set_index(['Test','A']).index
.isin(other_df.set_index(['Test','A']).index)]
print (result)
Test A
0 Test_1 1
1 Test_1 2
6 Test_3 1
8 Test_3 3