I’m trying to eliminate all rows that match in col0
and col1
, but don’t have a pair of -1, 1
between rows (for example in the dataframe below there isn’t a a2, b1, -1
row). I was trying to come up with someway to do this, but was groupby
and getting multiindex and not getting anywhere…
# no a2, b1, -1
df = pd.DataFrame([
['a1', 'b1', -1, 0/1],
['a1', 'b1', 1, 1/1],
['a1', 'b2', -1, 2/1],
['a1', 'b2', 1, 1/2],
['a2', 'b1', 1, 1/3],
['a2', 'b2', -1, 2/3],
['a2', 'b2', 1, 4/1]
], columns=['col0', 'col1', 'col2', 'val'])
# desired output
# a1, b1, -1, 0.0
# a1, b1, 1, 1.0
# a1, b2, -1, 2.0
# a1, b2, 1, 0.5
# a2, b2, -1, 0.66667
# a2, b2, 1, 4.0
>Solution :
We can use groupby filter
to test if there are at least 1 (any
) of each value (-1 and 1) per group with Series.any
:
result_df = df.groupby(['col0', 'col1']).filter(
lambda x: x['col2'].eq(-1).any() and x['col2'].eq(1).any()
)
result_df
:
col0 col1 col2 val
0 a1 b1 -1 0.000000
1 a1 b1 1 1.000000
2 a1 b2 -1 2.000000
3 a1 b2 1 0.500000
5 a2 b2 -1 0.666667
6 a2 b2 1 4.000000