I have a dataframe. I want to group its rows by some columns, count how often each value of another column occurs within each group, take the maximum of those counts, and attach it to the dataframe as a new column.
I tried:
df["max_pred"] = df.groupby(['fid','prefix','pred_text1'],
sort=False)["pred_text1"].transform("max")
However, this returns the pred_text1 value itself rather than a count. What I want is the number of times the most frequent pred_text1 value is repeated.
For example:
A B C
a d b
a d b
a d b
a d a
a d a
b b c
b b c
b b d
If I group by A and B, count the occurrences of each value of C within each group, take the maximum count, and store it in a new column F, I expect:
A B F
a d 3
a d 3
a d 3
a d 3
a d 3
b b 2
b b 2
b b 2
>Solution :
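You can use groupby.transform with value_counts. Within each (A, B) group, value_counts counts every distinct value of C, max picks the largest count, and transform broadcasts that single number back to every row of the group: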
df['F'] = (df.groupby(['A', 'B'])['C']
.transform(lambda g: g.value_counts(sort=False).max())
)
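To see what the lambda computes for each group, here is a minimal sketch that rebuilds the example data from the question and collects the largest count per group in a plain dict. The name per_group_max is only illustrative and is not part of the solution itself.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "A": list("aaaaabbb"),
    "B": list("dddddbbb"),
    "C": list("bbbaaccd"),
})

# Iterate over the (A, B) groups of column C and record the largest
# value count in each group; this is the number that transform
# broadcasts back to every row of the group.
per_group_max = {}
for key, g in df.groupby(["A", "B"])["C"]:
    counts = g.value_counts(sort=False)
    per_group_max[key] = int(counts.max())

print(per_group_max)
```

For the group (a, d) the counts are 3 for b and 2 for a, so the maximum is 3. For (b, b) they are 2 for c and 1 for d, so the maximum is 2.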
Variant with collections.Counter:
from collections import Counter
df['F'] = (df.groupby(['A', 'B'])['C']
.transform(lambda g: max(Counter(g).values()))
)
Output:
A B C F
0 a d b 3
1 a d b 3
2 a d b 3
3 a d a 3
4 a d a 3
5 b b c 2
6 b b c 2
7 b b d 2
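If the lambda turns out to be slow on a large dataframe, a possible alternative is to use two built-in transforms instead: first the size of each (A, B, C) group, then the maximum of those sizes within each (A, B) group. This is a sketch under the assumption that the columns are named as in the example; whether it is actually faster depends on the data.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "A": list("aaaaabbb"),
    "B": list("dddddbbb"),
    "C": list("bbbaaccd"),
})

# Step 1: for every row, how many rows share the same (A, B, C) combination.
counts = df.groupby(["A", "B", "C"])["C"].transform("size")

# Step 2: within each (A, B) group, take the largest of those counts
# and broadcast it to every row of the group.
df["F"] = counts.groupby([df["A"], df["B"]]).transform("max")

print(df)
```

Both steps rely on named aggregations ("size" and "max") rather than a Python function, so no lambda is called per group.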