I have the following pandas dataframe:
df = pd.DataFrame({
'A': [1,1,1,1,2,2,2,1,1,3,3,3],
'B': [0,0,1,1,0,0,0,1,1,0,0,0],
});
df.index.names = ['Index']
df
A B
Index
0 1 0
1 1 0
2 1 1
3 1 1
4 2 0
5 2 0
6 2 0
7 1 1
8 1 1
9 3 0
10 3 0
11 3 0
I can group this dataframe into blocks of constant ‘A’ like so:
df = df.groupby(df['A'].diff().ne(0).cumsum()).apply(lambda x: x)
df.index.names = ['Block', 'Index']
df
A B
Block Index
1 0 1 0
1 1 0
2 1 1
3 1 1
2 4 2 0
5 2 0
6 2 0
3 7 1 1
8 1 1
4 9 3 0
10 3 0
11 3 0
How do I instead group this dataframe into blocks of constant ‘A’ AND constant ‘B’? My desired result is:
A B
Block Index
1 0 1 0
1 1 0
2 2 1 1
3 1 1
3 4 2 0
5 2 0
6 2 0
4 7 1 1
8 1 1
5 9 3 0
10 3 0
11 3 0
>Solution :
Use the same logic with any (df.diff().ne(0).any(axis=1).cumsum()) as grouper:
out = df.groupby(df.diff().ne(0).any(axis=1).cumsum(), group_keys=True).apply(lambda x: x)
out.index.names = ['Block', 'Index']
Or:
out = (df.assign(Block=df.diff().ne(0).any(axis=1).cumsum())
.groupby('Block', group_keys=True)
.apply(lambda x: x)
)
Output:
A B
Block Index
1 0 1 0
1 1 0
2 2 1 1
3 1 1
3 4 2 0
5 2 0
6 2 0
4 7 1 1
8 1 1
5 9 3 0
10 3 0
11 3 0