I have the following DataFrame:
import pandas as pd

df = pd.DataFrame(
    {'id': [1, 1, 1, 2, 2, 2, 3, 3, 3],
     'value': ['pot', 'pot', 'jebus', 'pot', 'jebus', 'pot', 'pot', 'jebus', 'jebus']})
I want to flag rows whose value is repeated within the same id, but only when a row is immediately followed by another row with the same value. So if I have 'pot' and then 'pot' again in the next row, I want to flag both as True.
Two things are worth noting. First, the check must be done per id: if 'pot' is in the last row of one id and in the first row of a different id, I don't want to flag it.
Second, the repeated value must be in the very next row: for 'pot', 'jebus', 'pot', nothing is flagged.
Wanted result:
s = [True, True, False, False, False, False, False, True, True]
>Solution :
Use Series.shift inside each id group to compare every value with the previous row (shift()) and the next row (shift(-1)); a row is flagged if it matches either neighbour:
s = df.groupby('id').transform(lambda x: (x.eq(x.shift()) | x.eq(x.shift(-1))))
   value
0   True
1   True
2  False
3  False
4  False
5  False
6  False
7   True
8   True
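If a plain boolean Series or a new column is preferred over a one-column DataFrame, the same idea can be written with the grouped shift applied directly to the 'value' column. This is only a sketch of one possible variant: the names `prev_same`, `next_same` and `flag` are made up for illustration and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "value": ["pot", "pot", "jebus", "pot", "jebus", "pot", "pot", "jebus", "jebus"],
})

# Group only the 'value' column so the shifts never cross an id boundary.
g = df.groupby("id")["value"]

# True where the previous row in the same id holds the same value.
prev_same = df["value"].eq(g.shift())
# True where the next row in the same id holds the same value.
next_same = df["value"].eq(g.shift(-1))

# A row is flagged if either neighbour matches (hypothetical column name).
df["flag"] = prev_same | next_same
print(df)
```

The first and last rows of each id are compared against NaN, which never equals anything, so they can only be flagged through their one real neighbour. This assumes the rows are already ordered the way "next row" is meant to be read.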